Explain ALBERT and its working with the help of an example.
ALBERT ("A Lite BERT") is a model for self-supervised learning of language representations. It is an upgrade to BERT that offers strong performance on various NLP tasks while using far fewer parameters. ALBERT reduces model size in two ways: by sharing parameters across the hidden layers of the network (cross-layer parameter sharing), and by factorizing the embedding layer so that the vocabulary embeddings are much smaller than the hidden size.
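To see why the embedding factorization helps, here is a rough back-of-the-envelope sketch (the sizes follow ALBERT-base's defaults: a 30,000-token vocabulary, hidden size 768, embedding size 128; the exact savings depend on the configuration you choose):
V, H, E = 30000, 768, 128               # vocabulary size, hidden size, ALBERT embedding size
bert_style_params = V * H               # BERT ties embeddings to the hidden size: ~23.0M parameters
albert_style_params = V * E + E * H     # ALBERT: a small V x E table plus an E x H projection: ~3.9M
print(bert_style_params, albert_style_params)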
!pip install transformers
from transformers import AlbertConfig, AlbertModel
# The default AlbertConfig corresponds to the ALBERT-xxlarge architecture
albert_configuration_xxlarge = AlbertConfig()
# ALBERT-base style configuration: smaller hidden size and fewer attention heads
albert_configuration_base = AlbertConfig(
    hidden_size=768,
    num_attention_heads=12,
    intermediate_size=3072,
)
Here we configure ALBERT from the transformers library. The first step initializes an ALBERT-xxlarge style configuration; after that, we initialize an ALBERT-base style configuration, which can then be used to initialize the model.
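As a minimal sketch of that last step, the base configuration can be turned into a (randomly initialized) model, and counting its parameters shows the effect of cross-layer sharing: all 12 transformer layers reuse one set of weights, so ALBERT-base comes to roughly 12M parameters, compared with roughly 110M for BERT-base (the counts below are approximate):
model_from_config = AlbertModel(albert_configuration_base)   # weights are randomly initialized
num_params = sum(p.numel() for p in model_from_config.parameters())
print(f"ALBERT-base parameters: ~{num_params / 1e6:.0f}M")   # roughly 12M, vs ~110M for BERT-base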
from transformers import AlbertTokenizer, AlbertModel
import torch

# Load the pretrained ALBERT-base-v2 tokenizer and model
albert_tokenizer = AlbertTokenizer.from_pretrained('albert-base-v2')
albert_model = AlbertModel.from_pretrained('albert-base-v2', return_dict=True)

# Tokenize a sample sentence and run it through the model
sample = albert_tokenizer("Hi everyone your learning NLP", return_tensors="pt")
results = albert_model(**sample)

# Hidden states of the last transformer layer, one vector per token
last_hidden_states = results.last_hidden_state
print(last_hidden_states)
tensor([[[ 2.4208,  1.8559,  0.4701,  ..., -1.1277,  0.1012,  0.7205],
         [ 0.2845,  0.7017,  0.3107,  ..., -0.1968,  1.9060, -1.2505],
         [-0.5409,  0.8328, -0.0704,  ..., -0.0470,  1.0203, -1.0432],
         ...,
         [ 0.0337, -0.5312,  0.3455,  ...,  0.0088,  0.9658, -0.8649],
         [ 0.2958, -0.1336,  0.6774,  ..., -0.1669,  1.6474, -1.7187],
         [ 0.0527,  0.1355, -0.0434,  ..., -0.1046,  0.1258,  0.1885]]],
       grad_fn=<NativeLayerNormBackward0>)
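The output last_hidden_state has shape (batch_size, sequence_length, hidden_size); for albert-base-v2 the hidden size is 768, and the sequence length is however many SentencePiece tokens the sentence is split into, plus the special [CLS] and [SEP] tokens. As a quick check, and as one common (illustrative, not the only) way to turn the per-token vectors into a single sentence vector:
print(last_hidden_states.shape)            # torch.Size([1, sequence_length, 768])
# Mean-pool the token vectors to get one fixed-size sentence embedding
sentence_embedding = last_hidden_states.mean(dim=1)
print(sentence_embedding.shape)            # torch.Size([1, 768])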