
In data science, data preparation is a mandatory step before modelling. Data preparation involves various tasks, and encoding categorical data is one of the crucial ones. Most real-world data comes with categorical string values, while most machine learning models work only with numerical values. All models essentially perform mathematical operations, and mathematics depends on numbers. In short, most models require numbers as input, either integers or floats, rather than strings.

Encoding categorical data is the process of converting categorical values into an integer format so that the converted data can be provided to a model to produce and improve predictions. In this article, we will discuss categorical data encoding and try to understand why this process is needed.

As the whole discussion in this article is based on working with categorical data, we should begin by understanding categorical data.

Categorical data can be considered as gathered information that is divided into groups. In other words, data whose values come from a finite set of possible values can be considered categorical data. For example, consider a list of people with their blood groups (A+, A-, B+, B-, AB+, AB-, O+, O-), in which each blood type is a categorical value. There can be two kinds of categorical data:

Nominal data: This type of categorical data consists of named variables without any numerical values.
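To make the idea of encoding concrete, here is a minimal sketch in plain Python that converts blood-group strings into integer codes. The sample list and the helper names `build_integer_encoding` and `encode` are hypothetical and chosen only for illustration. The codes are assigned in sorted order of the category names, which is an arbitrary choice: the integers act as labels and do not imply that one blood group ranks above another.

```python
# Hypothetical sample data: blood groups recorded for a handful of people.
blood_groups = ["A+", "O-", "B+", "A+", "AB-", "O-"]


def build_integer_encoding(values):
    """Map each distinct category to an integer code, in sorted order."""
    categories = sorted(set(values))
    return {category: code for code, category in enumerate(categories)}


def encode(values, mapping):
    """Replace every categorical value with its integer code."""
    return [mapping[value] for value in values]


mapping = build_integer_encoding(blood_groups)
encoded = encode(blood_groups, mapping)

print(mapping)  # {'A+': 0, 'AB-': 1, 'B+': 2, 'O-': 3}
print(encoded)  # [0, 3, 2, 0, 1, 3]
```

The resulting list of integers is the kind of input a model can consume, and the mapping is kept so that the same codes can be reused on new data or translated back to the original labels.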
