What is the use of the build_vocab_from_iterator function in PyTorch?

This recipe explains the use of the build_vocab_from_iterator function from torchtext, PyTorch's text-processing library.

Recipe Objective

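What is the use of the build_vocab_from_iterator function?

The function "torchtext.vocab.build_vocab_from_iterator(iterator, num_lines=None)" builds a vocabulary (a Vocab object) from an iterator. Each element the iterator yields must itself be a list or iterator of tokens, for example one token list per line of text. The num_lines argument is the expected number of elements the iterator returns; it defaults to None and is used only for progress reporting. Note that this signature comes from older torchtext releases; later releases removed num_lines and added parameters such as min_freq and specials.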
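To make the idea concrete, here is a minimal pure-Python sketch of what building a vocabulary from an iterator amounts to: count every token, then assign each distinct token an index. The name build_vocab_sketch and the index order (most frequent first) are assumptions made for illustration; this is not torchtext's implementation, and the real Vocab object may order and store tokens differently.

```python
from collections import Counter

def build_vocab_sketch(iterator):
    """Hypothetical stand-in: count tokens, then index them by frequency."""
    counter = Counter()
    for tokens in iterator:      # each element is one list of tokens
        counter.update(tokens)
    # Assumption: most frequent tokens get the smallest indices;
    # tokens with equal counts keep their first-seen order.
    stoi = {tok: i for i, (tok, _) in enumerate(counter.most_common())}
    return counter, stoi

lines = [["this", "is", "a", "tutorial"], ["this", "is", "pytorch"]]
counter, stoi = build_vocab_sketch(iter(lines))
```

Here counter holds the token frequencies and stoi maps each token string to an integer index, which is the kind of lookup a vocabulary is typically used for.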

Step 1 - Import library

import torch
from torchtext.vocab import build_vocab_from_iterator as Bl

Step 2 - Sample text

sentence = "This is a pytorch NLP tutorial"

Step 3 - Create tokens

tokens = sentence.split()  # split the sentence on whitespace
tokens

['This', 'is', 'a', 'pytorch', 'NLP', 'tutorial']
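When there is more than one sentence, a generator is a convenient way to produce one token list per sentence without holding everything in memory. The sketch below assumes the same whitespace tokenization as above; yield_tokens and corpus are hypothetical names introduced for illustration.

```python
def yield_tokens(sentences):
    """Hypothetical helper: lazily yield one token list per sentence."""
    for sentence in sentences:
        yield sentence.split()   # whitespace tokenization, as in Step 3

corpus = ["This is a pytorch NLP tutorial", "This is another sentence"]
token_lists = list(yield_tokens(corpus))  # materialized here only for inspection
```

A generator of this shape could serve as the iterator argument, with len(corpus) as the expected number of elements.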

Step 4 - Apply build_vocab_from_iterator

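# The iterator must yield one list of tokens per element, so wrap the single
# token list in an outer list. Passing tokens directly would treat each string
# as an element and count its characters instead of whole words.
bul = Bl([tokens], num_lines=1)

The call prints a tqdm progress bar that counts the elements consumed from the iterator against num_lines.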
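It is worth checking what counts as one element of the iterator. The following sketch uses collections.Counter to imitate the counting step, on the assumption that the function updates a counter with each yielded element; count_elements is a hypothetical helper, not a torchtext function.

```python
from collections import Counter

tokens = ["This", "is", "a", "pytorch", "NLP", "tutorial"]

def count_elements(iterator):
    """Count whatever iterating each yielded element produces."""
    counter = Counter()
    n_elements = 0
    for element in iterator:
        # A str contributes its characters; a list contributes its tokens.
        counter.update(element)
        n_elements += 1
    return counter, n_elements

flat_counts, flat_n = count_elements(tokens)        # six strings, six elements
nested_counts, nested_n = count_elements([tokens])  # one token list, one element
```

Under that assumption, the flat list yields a character-level count with six elements, while the nested list yields a word-level count with a single element.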
