How to read textual data in TensorFlow

This recipe helps you read textual data in TensorFlow.

Recipe Objective

How to read textual data in TensorFlow?

This is achieved with "TF.Text" (published on pip as tensorflow-text), which must be installed separately before it can be used with TensorFlow. The library provides text-related classes and operations that are ready to use with TensorFlow. It covers the preprocessing that text-based models regularly require, and includes other features useful for sequence modelling that core TensorFlow does not provide.

Step 1 - Import library

import tensorflow as tf
import tensorflow_text as text

Step 2 - Load the data

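# Two sample strings, deliberately stored as UTF-16-BE bytes
Text_data = tf.constant([u'All the things are not going good.'.encode('UTF-16-BE'),
                         u'Sad☹'.encode('UTF-16-BE')])

# Transcode to UTF-8, the encoding the TF.Text tokenizers expect
utf8_text_data = tf.strings.unicode_transcode(Text_data,
                                              input_encoding='UTF-16-BE',
                                              output_encoding='UTF-8')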
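As a quick sanity check on Step 2, the same transcoding can be sketched in plain Python, without TensorFlow. This is only an illustration of what the bytes look like before and after; the helper name transcode_utf16be_to_utf8 is hypothetical and not part of TensorFlow or TF.Text.

```python
def transcode_utf16be_to_utf8(raw: bytes) -> bytes:
    """Hypothetical helper: decode UTF-16-BE bytes and re-encode them as UTF-8."""
    return raw.decode('UTF-16-BE').encode('UTF-8')


samples = ['All the things are not going good.', 'Sad☹']

# Same starting point as Step 2: each string stored as UTF-16-BE bytes.
utf16_samples = [s.encode('UTF-16-BE') for s in samples]

# Byte strings that the transcoding step is expected to produce.
utf8_samples = [transcode_utf16be_to_utf8(b) for b in utf16_samples]

for before, after in zip(utf16_samples, utf8_samples):
    # UTF-16 uses 2 bytes per character here; UTF-8 uses 1 for ASCII, 3 for '☹'.
    print(len(before), len(after), after)
```

The UTF-8 form of the second sample is the same byte string that appears in the token output of Step 3.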

Step 3 - Perform tokenization

# Split each string on whitespace; the result is a RaggedTensor of byte strings
tokenize_data = text.WhitespaceTokenizer()
data_token = tokenize_data.tokenize(['All the things are not going good.',
                                     u'Sad☹'.encode('UTF-8')])
print("The tokens of the data is:", data_token.to_list())

The tokens of the data is: [[b'All', b'the', b'things', b'are', b'not', b'going', b'good.'], [b'Sad\xe2\x98\xb9']]
