How to load any Huggingface [Transformer] model and use it?
Consider this article as part 2 of my previous post, “How to use [HuggingFace’s] Transformers Pre-Trained tokenizers?”, where I attempted to describe what a tokenizer is, why it is important, and how to use pre-trained ones from the Huggingface library. If you are not familiar with the concept, I highly encourage you to give that piece a read. Also, you should have a basic understanding of Artificial Neural Networks.
So, Huggingface 🤗
It is a library that focuses on Transformer-based pre-trained models. The main breakthrough of this architecture was the Attention mechanism, which gave models the ability to pay attention (get it?) to specific parts of a sequence (or tokens). At the time of writing, Transformers dominate the Natural Language Processing (NLP) field, and recent papers are trying to apply them to vision. If you want to get into NLP, it is definitely worth putting in the time to truly understand each component of this architecture.
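To make that idea a bit more concrete, here is a minimal sketch of the scaled dot-product attention at the core of the Transformer. This is a simplified PyTorch illustration of the general formula, not the library's actual implementation:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value):
    """A bare-bones sketch of attention: each position scores every other
    position, and the scores decide how much of each value to mix in."""
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / d_k ** 0.5
    weights = F.softmax(scores, dim=-1)  # each row sums to 1
    return weights @ value

# Toy example: a "sequence" of 4 tokens with 8-dimensional embeddings
x = torch.randn(1, 4, 8)
out = scaled_dot_product_attention(x, x, x)  # self-attention
print(out.shape)  # torch.Size([1, 4, 8])
```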
I will not go through the Transformer details as there are already numerous guides and tutorials available. (Want my recommendation? Ok! Read Jay Alammar's "The Illustrated Transformer".) Just know that these pre-trained Transformer models (like BERT, GPT-2, BART, …) are very useful whether you want to do a…
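As a preview of where we are headed, here is a minimal sketch of loading a pre-trained model and its tokenizer with the library's Auto classes. The checkpoint name "bert-base-uncased" and the input sentence are just example choices; any model name from the Hugging Face Hub works the same way:

```python
from transformers import AutoModel, AutoTokenizer

# Any checkpoint name from the Hugging Face Hub can go here
model_name = "bert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Tokenize a sentence and run it through the model
inputs = tokenizer("Hello, Transformers!", return_tensors="pt")
outputs = model(**inputs)

# last_hidden_state holds one contextual vector per input token
print(outputs.last_hidden_state.shape)  # e.g. torch.Size([1, 6, 768])
```

The nice thing about the AutoTokenizer / AutoModel pair is that it figures out the right architecture from the checkpoint name, which is what makes "any" model loadable with the same two lines.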