What is the BART model in transformers?

This recipe explains what the BART model in transformers is.

Recipe Objective: What is the BART model in transformers?

BART stands for Bidirectional and Auto-Regressive Transformers. It is a model from Facebook AI Research that combines ideas from Google's BERT and OpenAI's GPT: it is bidirectional like BERT and auto-regressive like GPT.

BERT's bidirectional, autoencoder nature is
* good for downstream tasks (e.g., classification) that require information about the whole sequence
* not so good for generation tasks, where each generated word should depend only on previously generated words

GPT's unidirectional, auto-regressive approach is
* good for text generation
* not so good for tasks that require information about the whole sequence (e.g., classification)
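
The contrast above comes down to which positions each token is allowed to attend to. The following is a minimal sketch in plain Python, not taken from any library: the function names `bidirectional_mask` and `causal_mask` are hypothetical and used only for illustration. A 1 means "may attend", a 0 means "blocked".

```python
def bidirectional_mask(n):
    """Encoder-style mask: every position may attend to every position."""
    return [[1] * n for _ in range(n)]


def causal_mask(n):
    """Decoder-style mask: position i may attend only to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


tokens = ["The", "quick", "brown", "fox"]
masks = {
    "bidirectional (BERT-like)": bidirectional_mask(len(tokens)),
    "causal (GPT-like)": causal_mask(len(tokens)),
}
for name, mask in masks.items():
    print(name)
    for token, row in zip(tokens, mask):
        # Each row shows which tokens the current token can see.
        print(f"  {token:>6} {row}")
```

In the causal mask, the row for "quick" is `[1, 1, 0, 0]`: it sees itself and "The" but nothing to its right, which is what makes left-to-right generation possible.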

BART aims to combine the strengths of both.
BART = BERT-style encoder + GPT-style decoder + noising transformations
* BART uses a standard seq2seq/machine translation architecture with a bidirectional encoder (like BERT) and a left-to-right decoder (like GPT).
* The pretraining task involves randomly shuffling the order of the original sentences and a novel in-filling scheme, where spans of text are replaced with a single mask token.
* BART is particularly effective when fine-tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, and its authors reported new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains of up to 6 ROUGE.
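
The two noising steps described above can be sketched in a few lines of plain Python. This is an illustrative toy, not the actual pretraining code: the helper names `permute_sentences` and `infill_spans` are hypothetical, and the span positions are hard-coded here for reproducibility, whereas in real pretraining they would be sampled randomly.

```python
import random

MASK = "<mask>"


def permute_sentences(sentences, rng):
    """Return a copy of the sentences in a random order."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


def infill_spans(tokens, spans):
    """Replace each (start, length) span with a single mask token.

    Spans are assumed to be non-overlapping and sorted by start index.
    """
    out = []
    pos = 0
    for start, length in spans:
        out.extend(tokens[pos:start])  # keep text before the span
        out.append(MASK)               # whole span collapses to one mask
        pos = start + length
    out.extend(tokens[pos:])           # keep text after the last span
    return out


rng = random.Random(0)
sentences = ["The fox jumps .", "The dog sleeps .", "The cat watches ."]
print(permute_sentences(sentences, rng))

tokens = "The quick brown fox jumps over the lazy dog".split()
print(" ".join(infill_spans(tokens, [(1, 2), (5, 3)])))
# prints: The <mask> fox jumps <mask> dog
```

Because a span of any length becomes a single mask token, the model cannot tell from the corrupted input how many tokens are missing; the decoder has to reconstruct the original text, including the span lengths.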

BartTokenizer - It is identical to RobertaTokenizer.

BartModel - The bare BART Model outputting raw hidden-states without any specific head on top. This model inherits from PreTrainedModel.


Example -

# Practical implementation of BartModel and BartTokenizer

# Import the required classes (transformers uses PyTorch under the hood)
from transformers import BartModel, BartTokenizer

# Load the tokenizer and the pretrained large BART model
tz = BartTokenizer.from_pretrained('facebook/bart-large')
model = BartModel.from_pretrained('facebook/bart-large')

# Tokenize the input text into token IDs (returned as PyTorch tensors)
# and pass them through the model
inputdata = tz("The quick brown fox jumps over the lazy dog", return_tensors="pt")
outputdata = model(**inputdata)

# last_hidden_state contains the sequence of hidden states at the output
# of the last layer of the model
last_hidden_states = outputdata.last_hidden_state

# Display the hidden states
print("last hidden states: ", last_hidden_states)

Output -
last hidden states:  tensor([[[ 0.5066,  0.5245, -1.0789,  ..., -0.0657, -0.1174, -0.6937],
         [ 0.5066,  0.5245, -1.0789,  ..., -0.0657, -0.1174, -0.6937],
         [ 0.4948, -1.2203,  0.9083,  ...,  0.6206,  0.6097, -0.2111],
         ...,
         [ 0.0689, -1.9124,  0.8337,  ...,  0.0518,  0.8280, -0.9057],
         [-0.2178, -1.0660, -1.6880,  ...,  0.3749,  0.4627, -0.7621],
         [-0.5963,  1.0727, -0.9889,  ...,  0.5723,  0.5521, -0.3102]]],
       grad_fn=<...>)
