8 Feature Engineering Techniques for Machine Learning

BY ProjectPro

Want to make your machine learning models more accurate? Try feature engineering! This blog post will discuss how feature engineering helps transform your data into features your ML models will love.



Imagine you are a chef preparing a gourmet meal. You can't just toss random ingredients together and expect a masterpiece, right? The same principle applies to feature engineering for machine learning. Welcome to our feature-packed guide on Feature Engineering techniques for machine learning. Just as the right blend of spices can elevate a dish, feature engineering in machine learning is the secret ingredient that transforms raw data into meaningful insights. From automatically extracting features and valuable information from any text to handling missing values and creating powerful interaction features, we will equip you with a list of feature engineering techniques to enhance your data science and machine learning projects. Get ready to become the master chef of your predictive models!


What Is Feature Engineering For Machine Learning?

Before moving straight on to feature engineering, let us get a quick overview of features and the various types of features in machine learning.

What Are Features In Machine Learning?

Machine learning algorithms are designed to process large amounts of data and identify useful patterns for making predictions or decisions. In supervised learning, features are the algorithm's input variables to make predictions. For example, in a spam filter, the features include the sender's email address, the subject line, the message content, and so on. By analyzing these features, the machine learning algorithm can determine whether the email will likely be spam. In unsupervised learning, features are used to identify data patterns without predefined labels or categories. For instance, features might be used in a clustering algorithm to group similar data points based on their shared characteristics.

Types of Features in Machine Learning

Features in machine learning can roughly be described as the building blocks of any machine learning model- the input variables that a machine learning algorithm uses to make predictions or decisions. Here are the different types of features in machine learning-

  • Numerical Features- These features are continuous values that can be measured on a scale. Examples of numerical features include age, height, weight, and income. Numerical features can be used in machine learning algorithms directly.

  • Categorical Features- These features are discrete values that can be grouped into categories. Examples of categorical features include gender, color, and zip code. Categorical features in machine learning typically need to be converted to numerical features before they can be used in machine learning algorithms. You can easily do this using one-hot, label, and ordinal encoding.

  • Time-series Features- These features are measurements that are taken over time. Time-series features include stock prices, weather data, and sensor readings. You can use these features to train machine learning models that can predict future values or identify patterns in the data.

  • Text Features- These features are text strings that can represent words, phrases, or sentences. Examples of text features include product reviews, social media posts, and medical records. You can use text features to train machine learning models that can understand the meaning of text or classify text into different categories.

The type of feature that is most suitable for a particular machine learning task will depend on the specific problem being solved. For example, if you are trying to predict the price of a house, you might use numerical features such as the size of the house, the number of bedrooms, and the house's location. If you are classifying a piece of text as spam or not, you might use text features such as the words used in the text and the order in which they are used.


How To Select Features in Machine Learning?

Selecting the right features is crucial for ensuring the effectiveness of a machine-learning model. The choice of features can significantly impact the accuracy and efficiency of the model. 

One way to select features is by using domain knowledge. For example, if you are building a spam filter, you know that certain words or phrases are more commonly used in spam emails than legitimate ones. You can use this knowledge to include those words or phrases as features in the model.

Another approach involves feature extraction and selection techniques such as correlation analysis, principal components analysis (PCA), or recursive feature elimination (RFE). These techniques help you identify the most relevant features of the model while ignoring irrelevant or redundant ones.

Here are some tips on how to select the most appropriate features in machine learning-

  • You must understand the problem you are trying to solve. Try to answer the questions, ‘what are the features that are most relevant to the problem?’, ‘what features are likely to be most predictive of the target variable?’, etc.

  • You need to explore the data. You must look at the distribution of the features and see if there are any outliers or missing values. You may need to clean the data or remove features that are not informative.

  • You need to use feature selection techniques. Several feature selection techniques, such as filter, wrapper, and embedded methods, are available. Each technique has strengths and weaknesses, so you must choose the most appropriate for your problem.

  • You must evaluate the results. Once you select a set of features, you must assess the results. How well does the model perform with the selected features? Could you remove any features without significantly impacting the model's performance?

Below is an example of how you can select features in machine learning using Python-

Image for Feature Selection Using Python in ML
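The code from the image above is not preserved, so here is a minimal sketch of what it likely looked like, assuming a file named data.csv whose last column is the target (both the file name and the column layout are assumptions for illustration):

    import pandas as pd
    from sklearn.feature_selection import SelectKBest, chi2

    # Load the data (file name and layout assumed for illustration)
    data = pd.read_csv("data.csv")
    X = data.iloc[:, :-1]   # feature columns
    y = data.iloc[:, -1]    # target column

    # Select the top 10 features using the chi-squared test
    # (chi2 requires non-negative feature values)
    selector = SelectKBest(score_func=chi2, k=10)
    selector.fit(X, y)

    # Print the names of the selected features to the console
    print(X.columns[selector.get_support()].tolist())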

This code will load the data from a CSV file and select the top 10 features using the chi-squared test. The selected features will then be printed to the console.

Let us now understand feature engineering for machine learning and the different techniques you can use to perform feature engineering in machine learning.

What Is Feature Engineering in Machine Learning?

“Coming up with features is difficult, time-consuming, and requires expert knowledge. Applied machine learning is basically feature engineering.” — Prof. Andrew Ng.

Data scientists reportedly spend around 80% of their time preparing data and engineering features because the process is time-consuming and difficult. Understanding features and the various techniques used to deconstruct this art can ease the complex feature engineering process.

Feature engineering in Machine Learning involves extracting useful features from given input data following the target to be learned and the machine learning model used. It involves transforming data to forms that better relate to the underlying target to be learned. When done right, feature engineering can augment the value of your existing data and improve the performance of your machine learning models. On the other hand, using bad features may require you to build much more complex models to achieve the same level of performance.

This is why Feature Engineering has become indispensable in machine learning projects. Yet, when it comes to applying this magical concept of Feature Engineering for machine learning projects, there is no hard and fast method or theoretical framework, which is why it has maintained its status as a concept that eludes many.

This article will try to demystify this subtle art while establishing the significance it bears despite its nuances and finally get started on our journey with a fun feature engineering Python example you can follow along!

Why Is Feature Engineering Important For Machine Learning?

To understand what feature engineering is at an intuitive level and why it is indispensable, it might be useful to decipher how humans comprehend data. Humans have an ability, leaps ahead of that of a machine, to find complex patterns or relations, so much so that we can see them even when they don't exist. Yet even to us, data presented well can convey much more than the same data presented haphazardly. If you haven't experienced this already, let's try to drive this home with a 'sweet' feature engineering example!

Say you have been provided with the following data about candy orders:

Image for Feature Engineering Example Dataset

Image for Feature Engineering Example Data Table

You have also been informed that the customers are uncompromising candy-lovers who consider their candy preference far more critical than the price or dimensions (in other words, price and dimensions are essentially uncorrelated with candy sales). What would you do when asked to predict which kind of candy is most likely to sell on a particular day?


You would probably agree that the variety of candy ordered would depend more on the date than on the time it was ordered, and that the sales for a particular variety of candy would vary according to the season.

Now that you instinctively know what features would most likely contribute to your predictions, let us present our data better by simply creating a new feature Date from the existing feature Date and Time.

Image for Feature Extraction Code

Image for Creating New Table Using Feature Extraction
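The original snippet behind these images is not recoverable, so here is a minimal pandas sketch of the idea; the orders DataFrame and the 'Date and Time' column name are assumptions based on the example:

    import pandas as pd

    # Tiny assumed sample of the candy-order data
    orders = pd.DataFrame({
        "Date and Time": ["2021-10-28 14:31", "2021-10-29 09:12", "2021-10-31 18:45"],
        "Candy Variety": ["Sour Jelly", "Sour Jelly", "Caramel"],
    })

    # Create a new Date feature from the existing Date and Time feature
    orders["Date"] = pd.to_datetime(orders["Date and Time"]).dt.date
    print(orders[["Date", "Candy Variety"]])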

Given the very same input data, the table you have obtained should make it simpler to predict that Sour Jellies are most likely to sell, especially around the end of October (Halloween!).

In addition, if you wanted to know more about the weekend and weekday sale trends, in particular, you could categorize the days of the week in a feature called Weekend with 1=True and 0=False. 

With this, you could predict stocking your shelves on weekends would be best!

Image for Sales Prediction Code Sample Using New Feature Data

Image for Sales Prediction Table Using New Feature Data
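Continuing the same assumed orders DataFrame, a sketch of the Weekend feature might look like this:

    # Flag weekend orders: 1 = True (Saturday or Sunday), 0 = False (weekday)
    timestamps = pd.to_datetime(orders["Date and Time"])
    orders["Weekend"] = (timestamps.dt.dayofweek >= 5).astype(int)
    print(orders[["Date", "Weekend", "Candy Variety"]])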

This example should have emphasized how a little bit of Feature Engineering can transform how you understand your data. And if making such linear, straightforward relationships explicit helps us, it can do wonders for a machine.

Now that you have wrapped your head around why Feature Engineering is so important, how it could work, and why it can't simply be done mechanically, let's explore a few feature engineering techniques in machine learning that could help!


Feature Engineering Techniques For Machine Learning - How to Do Feature Engineering?

Understanding the training data and the targeted problem is an indispensable part of Feature Engineering in machine learning, and there are indeed no hard and fast rules as to how it is to be achieved. Even so, the following feature engineering techniques for machine learning are a must-know for all data scientists-

1. Imputation

Imputation deals with handling missing values in data. While deleting records that are missing certain values is one way of dealing with this issue, it could also mean losing out on valuable data. This is where imputation can help. It can be classified into two types-

  • Categorical Imputation: Missing categorical values are generally replaced by the most commonly occurring value (the mode) among the other records.

  • Numerical Imputation: Missing numerical values are generally replaced by the mean of the corresponding values in the other records.

Image for Categorical Imputation Sample Code

Image for Categorical Imputation Output Table

Image for Numerical Imputation Sample Code

Image for Numerical Imputation Output Table
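The imputation code shown in the images is not preserved; a minimal pandas sketch of both ideas, using an assumed toy DataFrame, might look like this:

    import pandas as pd

    df = pd.DataFrame({
        "Candy Variety": ["Sour Jelly", None, "Caramel", "Sour Jelly"],
        "Price": [5.0, 7.0, None, 6.0],
    })

    # Categorical imputation: fill missing categories with the most frequent value (the mode)
    df["Candy Variety"] = df["Candy Variety"].fillna(df["Candy Variety"].mode()[0])

    # Numerical imputation: fill missing numbers with the column mean
    df["Price"] = df["Price"].fillna(df["Price"].mean())
    print(df)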

Notice how the technique of imputation given above corresponds with the principle of normal distribution (where the values in the distribution are more likely to occur closer to the mean rather than the edges), which results in a relatively good estimate of missing data. Other ways to do this include replacing missing values by picking the value from a normal distribution with the mean/standard deviation of the corresponding existing values or even replacing the missing value with an arbitrary value.

However, one must be reasonably cautious when using this technique because retaining data size this way could come at the cost of deteriorating data quality. For example, suppose that in the above candy problem you were given 5 records, instead of one, with the 'Candy Variety' missing. Using the above technique, you would fill in the missing values as 'Sour Jelly,' possibly predicting high sales of Sour Jellies all through the year! Therefore, it is wise to filter out records with more than a certain number of missing fields or with critical values missing, and to apply your discretion depending on the size and quality of the data you are working with.

2. Discretization

Discretization involves taking a set of data values and logically grouping them together into bins (or buckets). Binning can apply to numerical values as well as to categorical values. This can help prevent the model from overfitting the data, but comes at the cost of a loss of granularity. The grouping of data can be done as follows:

  • Grouping into equal-width intervals

  • Grouping based on equal frequencies (of observations in the bin)

  • Grouping based on decision tree sorting (to establish a relationship with the target)

Image for Discretization Sample Code

Image for Discretization Output Table
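A minimal sketch of binning with pandas (the price values are made up for illustration):

    import pandas as pd

    prices = pd.Series([1.5, 2.0, 2.2, 3.8, 5.0, 7.5, 9.9, 15.0])

    # Equal-width intervals: each bin spans the same range of values
    equal_width = pd.cut(prices, bins=3, labels=["low", "medium", "high"])

    # Equal-frequency intervals: each bin holds roughly the same number of observations
    equal_freq = pd.qcut(prices, q=3, labels=["low", "medium", "high"])

    print(pd.DataFrame({"price": prices, "equal_width": equal_width, "equal_freq": equal_freq}))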


3. Categorical Encoding

Categorical encoding is the technique used to encode categorical features into numerical values, which are usually simpler for an algorithm to understand. One-hot encoding (OHE) is a popularly used technique of categorical encoding. Here, categorical values are converted into simple numerical 1's and 0's without losing information. As with other techniques, OHE has its disadvantages and must be used sparingly; it can dramatically increase the number of features and result in highly correlated features.

Image for Categorical Encoding Sample Code

Image for Categorical Encoding Output Table
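Here is a minimal sketch of one-hot encoding with pandas, using an assumed candy column:

    import pandas as pd

    df = pd.DataFrame({"Candy Variety": ["Sour Jelly", "Caramel", "Mint", "Sour Jelly"]})

    # One-hot encoding: one 0/1 column per category
    one_hot = pd.get_dummies(df, columns=["Candy Variety"], dtype=int)
    print(one_hot)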

Besides OHE, there are other methods of categorical encoding (a small sketch follows the list), such as-

  • Count and Frequency encoding- replaces each label with its count or frequency of occurrence in the dataset,

  • Mean (target) encoding- replaces each label with the mean of the target variable for that label, and

  • Ordinal encoding- assigns an ordered integer to each unique label.
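A small sketch of these alternatives, again on an assumed toy DataFrame:

    import pandas as pd

    df = pd.DataFrame({
        "Candy Variety": ["Sour Jelly", "Caramel", "Sour Jelly", "Mint", "Sour Jelly"],
        "Sold": [30, 12, 45, 8, 38],
    })

    # Count/frequency encoding: replace each label with how often it occurs
    df["variety_count"] = df["Candy Variety"].map(df["Candy Variety"].value_counts())

    # Mean (target) encoding: replace each label with the mean target value for that label
    df["variety_mean_sold"] = df["Candy Variety"].map(df.groupby("Candy Variety")["Sold"].mean())

    # Ordinal encoding: assign an integer code to each unique label
    df["variety_ordinal"] = df["Candy Variety"].astype("category").cat.codes
    print(df)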


4. Feature Splitting

Splitting features into parts can sometimes improve their value with respect to the target to be learned. For instance, in the candy example above, the Date component contributes more to the target than the combined Date and Time feature.

Image for Feature Splitting Sample Code

Image for Feature Splitting Output Table
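A minimal sketch of feature splitting, again assuming a combined 'Date and Time' column:

    import pandas as pd

    df = pd.DataFrame({"Date and Time": ["2021-10-28 14:31", "2021-12-24 09:12"]})

    # Split one composite feature into two simpler parts
    timestamps = pd.to_datetime(df["Date and Time"])
    df["Date"] = timestamps.dt.date
    df["Time"] = timestamps.dt.time
    print(df)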

5. Handling Outliers

Outliers are unusually high or low values in the dataset that are unlikely to occur in normal scenarios. Since these outliers could adversely affect your predictions, they must be handled appropriately. The various methods of handling outliers include (a short sketch follows the list):

  • Removal: The records containing outliers are removed from the distribution. However, the presence of outliers over multiple variables could result in losing out on a large portion of the dataset with this method.

  • Replacing values: The outliers could alternatively be treated as missing values and replaced using appropriate imputation.

  • Capping: Capping the maximum and minimum values and replacing them with an arbitrary value or a value from the variable's distribution.
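A minimal sketch of removal and capping using the common interquartile-range (IQR) rule (the values are made up for illustration):

    import pandas as pd

    prices = pd.Series([4.5, 5.0, 5.2, 4.8, 5.1, 28.0])  # 28.0 is an obvious outlier

    # Compute the IQR fences
    q1, q3 = prices.quantile(0.25), prices.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    removed = prices[(prices >= lower) & (prices <= upper)]  # removal
    capped = prices.clip(lower=lower, upper=upper)           # capping
    print(removed)
    print(capped)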


6. Variable Transformation

Variable transformation techniques help with normalizing skewed data. One popularly used transformation is the logarithmic transformation. Logarithmic transformations compress the larger numbers and relatively expand the smaller numbers, which results in less skewed values, especially in the case of heavy-tailed distributions. Other variable transformations include the square root transformation and the Box-Cox transformation, which generalizes both of the former.
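A minimal sketch of a logarithmic transformation on some made-up, right-skewed values:

    import numpy as np
    import pandas as pd

    skewed = pd.Series([1, 2, 3, 5, 8, 13, 200, 1500])

    # log1p compresses large values, expands small ones, and handles zeros safely
    transformed = np.log1p(skewed)
    print(transformed)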

7. Feature Scaling

Feature scaling is done owing to the sensitivity of some machine learning algorithms to the scale of the input values. This technique is sometimes referred to as feature normalization. The commonly used scaling processes include:

  • Min-Max Scaling- This process involves rescaling all values in a feature to the range 0 to 1. In other words, the minimum value in the original range maps to 0, the maximum value maps to 1, and the rest of the values are scaled proportionally between the two extremes.

  • Standardization/Variance Scaling- The mean of the distribution is subtracted from every data point, and the result is divided by the distribution's standard deviation, to arrive at a distribution with a mean of 0 and a variance of 1.

It is necessary to be cautious when scaling sparse data using the above two techniques, as subtracting the mean destroys sparsity and could result in additional memory and computational load.
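A minimal sketch of both scaling processes with scikit-learn (the values are made up for illustration):

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler, StandardScaler

    X = np.array([[1.0], [2.0], [3.0], [10.0]])

    # Min-Max scaling: rescale values to the 0-1 range
    print(MinMaxScaler().fit_transform(X).ravel())

    # Standardization: subtract the mean, divide by the standard deviation
    print(StandardScaler().fit_transform(X).ravel())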

8. Feature Creation

Feature creation involves deriving new features from existing ones. This can be done with simple mathematical operations such as aggregations to obtain the mean, median, mode, sum, or difference, or even the product of two values. Although derived directly from the given input data, these features can improve performance when carefully chosen to relate to the target (as demonstrated later!).
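A minimal sketch of feature creation on an assumed toy DataFrame, showing a product feature and an aggregation feature:

    import pandas as pd

    df = pd.DataFrame({
        "Candy Variety": ["Sour Jelly", "Caramel", "Sour Jelly", "Caramel"],
        "Sold": [30, 12, 45, 20],
        "Price": [5.0, 7.0, 5.5, 6.5],
    })

    # New feature from a simple product of two existing columns
    df["Revenue"] = df["Sold"] * df["Price"]

    # New feature from an aggregation: mean units sold per candy variety
    df["Variety Mean Sold"] = df.groupby("Candy Variety")["Sold"].transform("mean")
    print(df)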

While the feature engineering techniques listed above are by no means a comprehensive list, they are popularly used and should help you get started with feature engineering in machine learning.


Feature Engineering in Python - A Sweet Takeaway!

We have gone over ML Feature Engineering, some commonly used feature engineering techniques in machine learning projects, and their impact on our machine learning model’s performance. But why just take someone’s word for it?

Let’s consider a simple price prediction problem for our candy sales data –

Image for Candy Sales Input Data

Image for Candy Sales Input Data Table

We will employ a basic linear regression model to predict the price of various candies and learn how to implement Python ML feature engineering.

Let us start by building a function to calculate the coefficients, using the standard formulas for the slope and intercept of a simple linear regression model.

Image for Feature Engineering Python Example
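The exact code in the image is not recoverable; a minimal sketch of such a function using the ordinary least-squares formulas might look like this:

    import numpy as np

    def linear_regression_coefficients(x, y):
        """Return (intercept, slope) of a simple linear regression fit."""
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        x_mean, y_mean = x.mean(), y.mean()
        # slope = covariance(x, y) / variance(x); intercept = y_mean - slope * x_mean
        slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
        intercept = y_mean - slope * x_mean
        return intercept, slope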

Now we build our initial model without any Feature Engineering by trying to relate one of the given features to our target. From observing the variables in the given data, it is most likely that the Length or the Breadth of the candy is related to the price.

Let us start by trying to relate the length of the candy with the price. 

Image for Feature Engineering Python Implementation
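Reusing the coefficient function sketched above, the fit against Length might look roughly like this; the candy DataFrame below is an assumed stand-in for the actual data used in the article:

    import matplotlib.pyplot as plt
    import pandas as pd

    # Assumed sample standing in for the candy sales data
    candy = pd.DataFrame({
        "Length": [4.0, 6.0, 5.0, 7.0, 3.0],
        "Breadth": [2.0, 3.0, 2.5, 2.0, 4.0],
        "Price":  [4.2, 9.1, 6.4, 7.1, 6.1],
    })

    b0, b1 = linear_regression_coefficients(candy["Length"], candy["Price"])
    predicted = b0 + b1 * candy["Length"]

    plt.scatter(candy["Length"], candy["Price"], label="actual")
    plt.plot(candy["Length"], predicted, color="red", label="predicted")
    plt.xlabel("Length"); plt.ylabel("Price"); plt.legend(); plt.show()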

Image for Feature Engineering Graph

We observe from the figure that Length does not have a linear relation with the price.

We attempt a similar prediction with the Breadth to get a similar outcome. (You can execute this by replacing 'Length' with 'Breadth' in the above code block.)

Image for Feature Relationship Graph

Finally, it's time to apply our newly gained knowledge of Feature Engineering in Python! Instead of using just the given features, we use the Length and Breadth features to derive a new feature called Size, which (you might have already guessed) should have a much more monotonic relation with the Price of candy than either of the two features it was derived from.

Image for Feature Engineering Code

Image for New Feature Table Using Feature Engineering
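Continuing the same sketch, deriving Size is a one-liner:

    # Derive the new Size feature from Length and Breadth
    candy["Size"] = candy["Length"] * candy["Breadth"]
    print(candy[["Length", "Breadth", "Size", "Price"]])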

We now use this new feature Size to build a new simple linear regression model.

Image for Model Building With New Feature Table
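And the refit with the new feature, again reusing the coefficient function from earlier:

    b0, b1 = linear_regression_coefficients(candy["Size"], candy["Price"])
    predicted = b0 + b1 * candy["Size"]

    plt.scatter(candy["Size"], candy["Price"], label="actual")
    plt.plot(candy["Size"], predicted, color="red", label="predicted")
    plt.xlabel("Size"); plt.ylabel("Price"); plt.legend(); plt.show()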

Image for New Feature Relationship Graph

If you thought that the previous predictions with the Length(or Breadth) feature were not too disappointing, you would agree that the results with the Size feature are quite spectacular!

We have demonstrated with this example that by simply multiplying the Length and Breadth features of a pack of candy, you can achieve Price predictions well beyond what the much weaker relationship of Price to Length (or Breadth) alone would give you. When working with real-life data, Feature Engineering could similarly be the difference between a simple model that works perfectly well and a complex model that doesn't.

Master Feature Engineering Techniques in Machine Learning With ProjectPro

Don't settle for average-performing machine learning models when you have the power of feature engineering at your fingertips. The famous physicist Richard Feynman once said, "What I cannot create, I do not understand." So, take the leap from theory to practice by engaging in real-world data science and ML projects offered by ProjectPro. Gain hands-on experience in implementing feature engineering techniques and witness firsthand the magic they bring when building effective ML models. By delving into these projects, you will discover how to wrangle and transform your data, from creating new features to selecting the most relevant ones. Remember- it's not just about the algorithms; it's about the artistry of the feature engineering process that sets individuals apart as data scientists or ML professionals. Master the art of feature engineering by exploring the ProjectPro repository, and let your data science journey shine brighter than ever before.


FAQs On Feature Engineering Techniques

What are some common techniques used in feature engineering?

Some common techniques used in feature engineering include one-hot encoding, feature scaling, handling missing values (e.g., imputation), creating interaction features (e.g., polynomial features), dimensionality reduction (e.g., PCA), feature selection (e.g., using statistical tests or feature importance), and transforming variables (e.g., logarithmic or power transformations).

How does feature engineering improve the performance of machine learning models?

Feature engineering can improve the performance of machine learning models by creating relevant and informative features from raw data. By engineering features, ML models can make more accurate predictions, handle complex data, reduce overfitting, and extract valuable insights from categorical and numerical data.

How do you determine which features are relevant and useful for a specific machine learning problem?

You can determine relevant and useful features for a specific machine learning problem using various techniques. You can start by conducting exploratory data analysis, using domain knowledge, and leveraging statistical measures such as correlation and mutual information. You can also use feature importance methods like decision trees or regularization techniques to assess the impact of features on the model's performance, and feature selection algorithms, like recursive feature elimination, to identify the most relevant features based on their contribution to the model's accuracy.

How do you handle missing data and outliers during feature engineering?

When handling missing data during feature engineering, you can remove instances with missing values, fill in missing values with the mean/median/mode, or use more advanced imputation methods (e.g., K-nearest neighbors or regression imputation). For outliers, you can consider removing them if they are shown to be incorrect, or transform them using techniques like winsorization or capping.

 



About the Author

ProjectPro

ProjectPro is the only online platform designed to help professionals gain practical, hands-on experience in big data, data engineering, data science, and machine learning related technologies, with over 270+ reusable project templates in data science and big data, each with step-by-step walkthroughs.
