Use cases:

  1. Text generation.
  2. Text classification / sentiment analysis.
  3. Text summarisation.
  4. Text rewriting/Paraphrasing.
  5. Text clustering.
  6. Embeddings generation.
  7. Translation.

Multi-Purpose NLP Models


  • Text generation steps:
python pytorch-transformers/examples/ 


OpenAI’s GPT-2

  • Text completion steps:
  1. Tokenize the text and index it as a sequence of integers.
  2. Pass the sequence to the pretrained GPT-2 model, e.g. GPT2LMHeadModel from pytorch-transformers.
  3. Take the model's predictions for the next token.
  • Text generation steps:
python pytorch-transformers/examples/
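
The text completion steps above can be sketched without downloading any weights. This is a toy illustration only: a bigram count model stands in for the pretrained GPT-2 model, whitespace splitting stands in for its tokenizer, and the corpus and all function names are made up for the example.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the data a real model was pretrained on (assumption).
CORPUS = "the cat sat on the mat . the cat ate the fish ."


def build_vocab(tokens):
    """Map each distinct token to an integer id, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def encode(text, vocab):
    """Step 1: tokenize on whitespace and index the tokens as integers."""
    return [vocab[tok] for tok in text.split()]


def train_bigram(ids):
    """Stand-in for a pretrained model: count which id follows which."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(ids, ids[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(model, ids):
    """Steps 2-3: feed the sequence in and return the most likely next id.

    Only the last id is used here; a real transformer attends to the whole sequence.
    """
    followers = model[ids[-1]]
    return followers.most_common(1)[0][0]


vocab = build_vocab(CORPUS.split())
id_to_token = {i: tok for tok, i in vocab.items()}
model = train_bigram(encode(CORPUS, vocab))

prompt_ids = encode("the cat sat on the", vocab)
completion = id_to_token[predict_next(model, prompt_ids)]
print(prompt_ids, "->", completion)
```

With a real model the shape of the pipeline is the same: the tokenizer produces the ids, the model returns a score for every vocabulary entry, and the highest-scoring id is decoded back to text.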

Universal Language Model Fine Tuning - ULMFiT

  1. Prepare the data.
  2. Create the language model and fine-tune it, starting from the pre-trained model.
  3. Get predictions with the fine-tuned model.

Implementations are available in spaCy and fastai.


Google’s BERT

  • Masked language modeling steps:
  1. Text tokenisation.
  2. Convert the tokens into a sequence of integers.
  3. Use BERT's masked language model, e.g. BertForMaskedLM from pytorch-transformers.
  4. Get predictions.
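
The masked language modeling steps above can be sketched as a toy example. A table of (left neighbour, right neighbour) counts stands in for BERT's masked language model, and whitespace splitting stands in for its WordPiece tokenizer. The corpus, the function names and the simplification to one neighbour on each side are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

MASK = "[MASK]"

# Toy corpus standing in for pretraining data (assumption).
CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat lay on the mat",
    "the cat sat on the rug",
]


def tokenize(text):
    """Step 1: split the text into tokens (whitespace split for simplicity)."""
    return text.split()


def build_vocab(sentences):
    """Assign an integer id to every token; id 0 is reserved for the mask."""
    vocab = {MASK: 0}
    for sentence in sentences:
        for tok in tokenize(sentence):
            vocab.setdefault(tok, len(vocab))
    return vocab


def to_ids(tokens, vocab):
    """Step 2: convert the tokens into a sequence of integers."""
    return [vocab[tok] for tok in tokens]


def train_context_counts(sentences, vocab):
    """Stand-in for the masked LM: count ids seen between each (left, right) pair."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        ids = to_ids(tokenize(sentence), vocab)
        for left, mid, right in zip(ids, ids[1:], ids[2:]):
            counts[(left, right)][mid] += 1
    return counts


def predict_masked(model, ids, mask_id=0):
    """Steps 3-4: use context on both sides of the mask to pick the likeliest id.

    Assumes exactly one mask that is neither the first nor the last token.
    """
    pos = ids.index(mask_id)
    candidates = model[(ids[pos - 1], ids[pos + 1])]
    return candidates.most_common(1)[0][0]


vocab = build_vocab(CORPUS)
id_to_token = {i: tok for tok, i in vocab.items()}
model = train_context_counts(CORPUS, vocab)

masked_ids = to_ids(tokenize("the cat [MASK] on the rug"), vocab)
predicted = id_to_token[predict_masked(model, masked_ids)]
print(masked_ids, "->", predicted)
```

The point of the sketch is that the prediction uses tokens on both sides of the mask, which is what distinguishes masked language modeling from left-to-right text completion.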

Word Embeddings

Embeddings from Language Model - ELMo

  • Developed by AllenNLP. Word vectors are computed by a 2-layer bidirectional language model (biLM); each layer comprises a forward and a backward pass.
  • Represents each word using the complete sentence, so it captures the context in which the word is used, unlike GloVe and Word2Vec.
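
The difference between a static lookup and a sentence-dependent embedding can be shown with a toy sketch. This is not ELMo itself: a running mean over hand-picked 2-d vectors stands in for the forward and backward passes of the biLM, and the vector table and function names are hypothetical.

```python
# Hypothetical static vectors, one fixed vector per word (as in GloVe or Word2Vec).
STATIC = {
    "the": [0.0, 0.0],
    "river": [1.0, 0.0],
    "money": [0.0, 1.0],
    "bank": [0.5, 0.5],
}


def running_mean(vectors):
    """For each position, return the mean of all vectors up to and including it."""
    out = []
    total = [0.0] * len(vectors[0])
    for count, vec in enumerate(vectors, start=1):
        total = [t + x for t, x in zip(total, vec)]
        out.append([t / count for t in total])
    return out


def contextual_embeddings(tokens):
    """Concatenate a left-to-right and a right-to-left summary for every token."""
    vecs = [STATIC[tok] for tok in tokens]
    forward = running_mean(vecs)                # forward pass: left context
    backward = running_mean(vecs[::-1])[::-1]   # backward pass: right context
    return [f + b for f, b in zip(forward, backward)]


sent_a = "the river bank".split()
sent_b = "the money bank".split()

bank_a = contextual_embeddings(sent_a)[sent_a.index("bank")]
bank_b = contextual_embeddings(sent_b)[sent_b.index("bank")]

# The static vector for "bank" is the same in both sentences; the contextual one is not.
print("static:    ", STATIC["bank"])
print("sentence a:", bank_a)
print("sentence b:", bank_b)
```

Here "bank" gets a different vector in each sentence because its left context differs, while the static table returns the same vector both times.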


  • Captures latent syntactic-semantic information from text.
  • Gives each word an embedding based on its surrounding text.