What is Causal Language Modeling in transformers?

This recipe explains what causal language modeling is in transformers.

Recipe Objective - What is Causal Language Modeling in transformers?

Language modeling is the task of fitting a model to a corpus, which can be domain-specific. Popular transformer-based models are trained with a variant of language modeling, such as masked language modeling for BERT and causal language modeling for GPT-2.

Language modeling is also useful outside of pre-training, for example to shift the model distribution toward a specific domain: take a language model trained on a very large corpus and then fine-tune it on a news dataset or on scientific papers, such as LysandreJik / arxivnlp.

Causal Language Modeling:

Causal language modeling is the task of predicting the token that follows a sequence of tokens. In this setting, the model attends only to the left context, that is, the tokens to the left of the position being predicted.
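
To make the left-context idea concrete, here is a minimal sketch in plain Python with no model involved. The helper names next_token_pairs and causal_mask are hypothetical, chosen only for illustration, and the whitespace split is an assumption standing in for a real tokenizer.

```python
def next_token_pairs(tokens):
    """Return (left_context, target) pairs: each target is predicted
    from the tokens that precede it, never from tokens to its right."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def causal_mask(length):
    """Lower-triangular mask: row i holds 1 for the positions that
    position i may attend to (0..i) and 0 for positions to its right."""
    return [[1 if j <= i else 0 for j in range(length)] for i in range(length)]


# Whitespace splitting is an assumption standing in for a real tokenizer.
tokens = "it was a great movie".split()

# Each training example pairs a left context with the token that follows it.
for context, target in next_token_pairs(tokens):
    print(context, "->", target)

# The mask a causal model would apply to a 4-token sequence.
for row in causal_mask(4):
    print(row)
```

Every target is paired only with earlier tokens, which is the same restriction the mask encodes for attention.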

Example of causal language modeling with GPT-2:

# Importing libraries
import torch
from torch import nn
from transformers import AutoModelForCausalLM, AutoTokenizer

# Creating tokenizer and model
# (AutoModelForCausalLM replaces the deprecated AutoModelWithLMHead)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Creating context for sequence
context_sequence = "I have never watched anything like this, and it was"

# Applying tokenizer on sequence
tokens = tokenizer.encode(context_sequence, return_tensors="pt")

# Extracting the logits at the last position, i.e. the scores for the next token
with torch.no_grad():
    last_logits = model(tokens).logits[:, -1, :]

# Applying top-k filtering: keep the 50 highest-scoring tokens, mask out the rest
# (top_k_top_p_filtering has been deprecated in transformers, so torch.topk is
# used here; top_p=1.0 in the original call did not filter anything)
top_k_values, _ = torch.topk(last_logits, k=50, dim=-1)
filtered_logits = last_logits.masked_fill(
    last_logits < top_k_values[:, -1:], float("-inf")
)

# Finding probabilities using softmax function
probabilities = nn.functional.softmax(filtered_logits, dim=-1)

# Sampling one token id from the filtered distribution
final_token = torch.multinomial(probabilities, num_samples=1)

# Appending the sampled token to the input tokens
output = torch.cat([tokens, final_token], dim=-1)

# Decoding
answer = tokenizer.decode(output.tolist()[0])

# Printing answer
print(answer)

Output (the last token is sampled, so it can differ between runs) -
I have never watched anything like this, and it was amazing
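
The filtering, softmax, and sampling steps in the example above can be traced on a toy example in plain Python. This is a sketch under stated assumptions, not the library's implementation: the four-word vocabulary and its logits are invented for illustration, and top_k_filter and softmax are hypothetical helper names.

```python
import math
import random


def top_k_filter(logits, k):
    """Keep the k largest logits and set the rest to -inf so that softmax
    gives them zero probability. Assumes 1 <= k <= len(logits); tied
    values at the threshold are all kept."""
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else float("-inf") for x in logits]


def softmax(logits):
    """Turn logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vocabulary and logits, invented for illustration only.
vocab = ["amazing", "boring", "long", "purple"]
logits = [2.0, 1.0, 0.5, -3.0]

# With k=2 only "amazing" and "boring" keep a non-zero probability.
probabilities = softmax(top_k_filter(logits, k=2))

# Sampling one word according to those probabilities.
next_word = random.choices(vocab, weights=probabilities, k=1)[0]
print(probabilities, next_word)
```

Because the next word is drawn at random from the filtered distribution, repeated runs can print different words, which is also why the GPT-2 example does not always end with the same token.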

In this way, we can perform causal language modeling in transformers.
