What is the feature column data format in TensorFlow?

This recipe explains what the feature column data format is in TensorFlow.
Last Updated: 05 Jul 2022

Recipe Objective

What is feature_column data format?

Feature columns are the bridge between raw data and a model or estimator. They are expressive: they transform raw data into a range of formats that models or estimators can consume, which makes experimentation easy.

Step 1 - Import libraries

import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow import feature_column
from tensorflow.keras import layers

Step 2 - Take a sample data

sample_data = {
    'Student_marks': [55, 21, 63, 88, 74, 54, 95, 41, 84, 52],
    'Student_grade': ['average', 'poor', 'average', 'good', 'good',
                      'average', 'good', 'average', 'good', 'average'],
    'Final_point': ['c', 'f', 'c+', 'b+', 'b', 'c', 'a', 'd+', 'b+', 'c'],
}

Step 3 - Convert Sample data into dataframe

dataframe = pd.DataFrame(sample_data)
dataframe

Step 4 - Define feature columns

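def My_feature(column):
    # The parameter is named `column` so it does not shadow the imported feature_column module.
    # Wrap the feature column in a DenseFeatures layer.
    layer_feature = layers.DenseFeatures(column)
    # Apply the layer to the sample data and print the resulting array.
    print(layer_feature(sample_data).numpy())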

Step 5 - Numeric Column

Student_marks = feature_column.numeric_column("Student_marks")
My_feature(Student_marks)

[[55.]
 [21.]
 [63.]
 [88.]
 [74.]
 [54.]
 [95.]
 [41.]
 [84.]
 [52.]]

Here a numeric column represents a real-valued feature. It is the simplest type of column: the model receives the column values from the sample data unchanged.
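To make the shape of that output concrete, here is a minimal plain-Python sketch of what a numeric column hands to the model. It does not use TensorFlow, and the helper name as_numeric_column is hypothetical, chosen only for this illustration.

```python
# Marks copied from the sample data above.
student_marks = [55, 21, 63, 88, 74, 54, 95, 41, 84, 52]


def as_numeric_column(values):
    """Hypothetical helper: cast each value to float and give it its own row.

    This mimics the (n_examples, 1) layout of the printed output; the values
    themselves are not transformed.
    """
    return [[float(v)] for v in values]


dense = as_numeric_column(student_marks)
print(dense[:3])  # [[55.0], [21.0], [63.0]]
```

The only change is the cast to float and the extra dimension, which is why the printed array above matches the raw marks one for one.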

Step 6 - Bucketized Column

Student_marks_bucket = feature_column.bucketized_column(
    Student_marks, boundaries=[30, 40, 50, 60, 70, 80, 90])
My_feature(Student_marks_bucket)

[[0. 0. 0. 1. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0.]]

Here a bucketized column splits the numeric feature into buckets defined by the boundaries and one-hot encodes the bucket each value falls into. The seven boundaries produce eight buckets. Each bucket includes its left boundary and excludes its right boundary, so a mark of exactly 60 falls in the 60 to 70 bucket.
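The bucketing rule can be reproduced without TensorFlow, which may help when checking boundary cases by hand. The sketch below is an illustration using only the standard library, not the library's own implementation, and the helper names bucket_index, one_hot and bucketize are hypothetical.

```python
from bisect import bisect_right

# Same boundaries and marks as in the recipe.
boundaries = [30, 40, 50, 60, 70, 80, 90]
student_marks = [55, 21, 63, 88, 74, 54, 95, 41, 84, 52]


def bucket_index(value, boundaries):
    """Hypothetical helper: index of the bucket that `value` falls into.

    bisect_right counts the boundaries that are <= value, so a value equal
    to a boundary lands in the bucket that starts at that boundary
    (left boundary included, right boundary excluded).
    """
    return bisect_right(boundaries, value)


def one_hot(index, depth):
    """Return a list of `depth` floats with 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


def bucketize(values, boundaries):
    """One-hot encode each value over len(boundaries) + 1 buckets."""
    depth = len(boundaries) + 1
    return [one_hot(bucket_index(v, boundaries), depth) for v in values]


indices = [bucket_index(v, boundaries) for v in student_marks]
rows = bucketize(student_marks, boundaries)

print(indices)  # [3, 0, 4, 6, 5, 3, 7, 2, 6, 3]
print(rows[0])  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
```

The positions of the 1.0 entries agree with the rows printed above. Calling bucket_index(30, boundaries) returns 1 while bucket_index(29, boundaries) returns 0, which shows the left-inclusive behaviour at a boundary.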
