How to read a CSV file in TensorFlow

This recipe helps you read a CSV file in TensorFlow

Recipe Objective

How to read a CSV file in TensorFlow?

In this recipe we are going to see how to use CSV data with TensorFlow. There are two main parts to this:

-- Preprocessing the data into a form suitable for training
-- Loading the data off disk

The dataset we are going to use here is the Abalone training data, where the usual task is to predict the age from the other measurements. Let's understand this with a practical implementation.

Step 1 - Import library

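import pandas as pd
import numpy as np

# Print NumPy arrays with 3 decimal places and without scientific notation
np.set_printoptions(precision=3, suppress=True)

import tensorflow as tf
from tensorflow.keras import layers
# Preprocessing layers; newer TensorFlow releases also expose Normalization as layers.Normalization
from tensorflow.keras.layers.experimental import preprocessing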

Step 2 - Load the dataset

dataset_abalone_train = pd.read_csv(
    "https://storage.googleapis.com/download.tensorflow.org/data/abalone_train.csv",
    names=["Length", "Diameter", "Height", "Whole weight", "Shucked weight",
           "Viscera weight", "Shell weight", "Age"])
dataset_abalone_train.head()
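The names argument supplies the column labels, which is needed when a CSV file has no header row (assumed here to be the case for the Abalone file, since the labels are passed in by hand). A minimal sketch on a tiny in-memory CSV shows the difference; the two sample rows and the shortened column list are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical two-row CSV with no header line (values are illustrative only).
raw_csv = "0.435,0.335,7\n0.585,0.45,6\n"

# Without names=, pandas uses the first data row as the header.
without_names = pd.read_csv(io.StringIO(raw_csv))

# With names=, every row is kept as data and the columns get explicit labels.
with_names = pd.read_csv(io.StringIO(raw_csv), names=["Length", "Diameter", "Age"])

print(len(without_names))        # the first row was consumed as a header
print(len(with_names))           # both rows are kept as data
print(list(with_names.columns))
```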

Step 3 - Separate the labels and features

dataset_abalone_features = dataset_abalone_train.copy()
dataset_abalone_labels = dataset_abalone_features.pop('Age')
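The copy-then-pop pattern can be checked on a small frame: pop removes the column from the copy in place and returns it, while the original frame keeps its label column. The frame below is a made-up stand-in for the Abalone data, and the variable names are only for illustration.

```python
import pandas as pd

# Small made-up frame standing in for the Abalone data.
train = pd.DataFrame({"Length": [0.435, 0.585],
                      "Height": [0.11, 0.125],
                      "Age": [7, 6]})

features = train.copy()        # work on a copy so the original frame is untouched
labels = features.pop("Age")   # removes "Age" from features and returns it as a Series

print(list(features.columns))  # feature columns only
print(labels.tolist())         # label values
print("Age" in train.columns)  # the original frame still has the label column
```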

Step 4 - Store features in NumPy array

dataset_abalone_features = np.array(dataset_abalone_features)
dataset_abalone_features

array([[0.435, 0.335, 0.11 , ..., 0.136, 0.077, 0.097],
       [0.585, 0.45 , 0.125, ..., 0.354, 0.207, 0.225],
       [0.655, 0.51 , 0.16 , ..., 0.396, 0.282, 0.37 ],
       ...,
       [0.53 , 0.42 , 0.13 , ..., 0.374, 0.167, 0.249],
       [0.395, 0.315, 0.105, ..., 0.118, 0.091, 0.119],
       [0.45 , 0.355, 0.12 , ..., 0.115, 0.067, 0.16 ]])

As we are going to treat all the features identically, we pack them into a single NumPy array.
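As a quick check of what this conversion produces, the sketch below converts a small made-up DataFrame and inspects the result. With all-numeric columns the result is a two-dimensional floating-point array with one row per sample and one column per feature.

```python
import numpy as np
import pandas as pd

# Made-up feature frame; the real data has more rows and columns.
features_df = pd.DataFrame({"Length": [0.435, 0.585, 0.655],
                            "Diameter": [0.335, 0.45, 0.51]})

features = np.array(features_df)

print(features.shape)  # (number of rows, number of feature columns)
print(features.dtype)  # float columns give a float array
```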

Step 5 - Make model

model = tf.keras.Sequential([
    layers.Dense(64),
    layers.Dense(1)
])

model.compile(loss=tf.losses.MeanSquaredError(),
              optimizer=tf.optimizers.Adam())

Here we are making a regression model for predicting the age. Since there is only a single input tensor, a "tf.keras.Sequential" model is enough here.
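One thing worth noting about this model: layers.Dense applies no activation by default, so the two stacked layers together compute a linear function of the inputs. The NumPy sketch below illustrates this with random, hypothetical weights of the same shapes (7 features in, 64 units, 1 output); it is not the trained model, only a demonstration of the algebra.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights shaped like Dense(64) followed by Dense(1) on 7 features.
W1, b1 = rng.normal(size=(7, 64)), rng.normal(size=64)
W2, b2 = rng.normal(size=(64, 1)), rng.normal(size=1)

x = rng.normal(size=(5, 7))  # five made-up samples

# Forward pass through two dense layers with no activation function.
stacked = (x @ W1 + b1) @ W2 + b2

# The same mapping collapsed into a single linear layer.
W, b = W1 @ W2, b1 @ W2 + b2
single = x @ W + b

print(np.allclose(stacked, single))
```

Passing an activation such as activation='relu' to the first Dense layer would make the model non-linear; the recipe as written keeps the defaults.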

Step 6 - Train the model

model.fit(dataset_abalone_features, dataset_abalone_labels, epochs=10)

Epoch 1/10
104/104 [==============================] - 1s 1ms/step - loss: 85.7106
Epoch 2/10
104/104 [==============================] - 0s 1ms/step - loss: 15.0037
Epoch 3/10
104/104 [==============================] - 0s 1ms/step - loss: 8.8471
Epoch 4/10
104/104 [==============================] - 0s 1ms/step - loss: 7.4363
Epoch 5/10
104/104 [==============================] - 0s 1ms/step - loss: 7.2182
Epoch 6/10
104/104 [==============================] - 0s 1ms/step - loss: 6.8986
Epoch 7/10
104/104 [==============================] - 0s 1ms/step - loss: 6.3194
Epoch 8/10
104/104 [==============================] - 0s 1ms/step - loss: 6.6458
Epoch 9/10
104/104 [==============================] - 0s 1ms/step - loss: 6.7457
Epoch 10/10
104/104 [==============================] - 0s 1ms/step - loss: 6.6510

Step 7 - Perform Preprocessing

data_normalization = preprocessing.Normalization()

Here we are going to perform basic preprocessing: we create a Normalization layer that will be used to normalize the input features.

Step 8 - Normalize the data

data_normalization.adapt(dataset_abalone_features)
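Calling adapt lets the layer learn the mean and variance of each feature column; the layer then shifts and scales its inputs with those statistics when the model runs. A rough NumPy sketch of the same idea follows, using made-up values and ignoring the small epsilon safeguard the Keras layer applies against zero variance.

```python
import numpy as np

# Made-up feature matrix: three samples, two features.
features = np.array([[0.435, 0.110],
                     [0.585, 0.125],
                     [0.655, 0.160]])

# Statistics learned per feature column (what adapt is assumed to compute).
mean = features.mean(axis=0)
var = features.var(axis=0)

# Normalization applied to the inputs.
normalized = (features - mean) / np.sqrt(var)

print(normalized.mean(axis=0))  # close to 0 for every column
print(normalized.std(axis=0))   # close to 1 for every column
```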

Step 9 - Use normalization in model

model_normalize = tf.keras.Sequential([
    data_normalization,
    layers.Dense(64),
    layers.Dense(1)
])

model_normalize.compile(loss=tf.losses.MeanSquaredError(),
                        optimizer=tf.optimizers.Adam())

Step 10 - Train the normalized model

model_normalize.fit(dataset_abalone_features, dataset_abalone_labels, epochs=20)

Epoch 1/20
104/104 [==============================] - 0s 1ms/step - loss: 98.8254
Epoch 2/20
104/104 [==============================] - 0s 1ms/step - loss: 67.2705
Epoch 3/20
104/104 [==============================] - 0s 1ms/step - loss: 24.0828
Epoch 4/20
104/104 [==============================] - 0s 1ms/step - loss: 6.3781
Epoch 5/20
104/104 [==============================] - 0s 1ms/step - loss: 5.2805
Epoch 6/20
104/104 [==============================] - 0s 1ms/step - loss: 5.1448
Epoch 7/20
104/104 [==============================] - 0s 1ms/step - loss: 5.2513
Epoch 8/20
104/104 [==============================] - 0s 1ms/step - loss: 5.0973
Epoch 9/20
104/104 [==============================] - 0s 1ms/step - loss: 4.8350
Epoch 10/20
104/104 [==============================] - 0s 1ms/step - loss: 5.0443
Epoch 11/20
104/104 [==============================] - 0s 1ms/step - loss: 4.6720
Epoch 12/20
104/104 [==============================] - 0s 1ms/step - loss: 4.9722
Epoch 13/20
104/104 [==============================] - 0s 1ms/step - loss: 4.9222
Epoch 14/20
104/104 [==============================] - 0s 1ms/step - loss: 4.7132
Epoch 15/20
104/104 [==============================] - 0s 1ms/step - loss: 4.9126
Epoch 16/20
104/104 [==============================] - 0s 1ms/step - loss: 4.8753
Epoch 17/20
104/104 [==============================] - 0s 1ms/step - loss: 4.5755
Epoch 18/20
104/104 [==============================] - 0s 1ms/step - loss: 5.0861
Epoch 19/20
104/104 [==============================] - 0s 1ms/step - loss: 4.8576
Epoch 20/20
104/104 [==============================] - 0s 1ms/step - loss: 5.3921
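Since the loss is mean squared error, its square root gives a typical error in the same units as the Age column. The short calculation below does this for the final-epoch losses of the two training logs above. These are training losses from runs with different epoch counts (10 and 20), so the comparison is only indicative.

```python
import math

# Final-epoch training losses (mean squared error) copied from the logs above.
mse_plain = 6.6510
mse_normalized = 5.3921

# Root mean squared error is in the same units as the label.
rmse_plain = math.sqrt(mse_plain)
rmse_normalized = math.sqrt(mse_normalized)

print(f"{rmse_plain:.2f}")       # 2.58
print(f"{rmse_normalized:.2f}")  # 2.32
```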
