Computers work with numbers, so before a computer can understand our text, that text has to be turned into numbers.

For that, we want to convert words into numeric representations (vectors or matrices).

If you are converting text into numerical embeddings, you have 2 options →

  1. Statistical Methods
    1. Bag-Of-Words
    2. TF-IDF
  2. ML/DL Based Methods
    1. Lookup method
    2. Word2Vec (Word to Vector) → Continuous Bag-Of-Words (CBOW)
    3. Transformer Based Architectures

1 → Statistical Methods

Bag-Of-Words

Consider these 2 sentences:

the cat sat on the mat

the dog sat on the cat
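
To make the idea concrete, here is a minimal sketch of how Bag-Of-Words count vectors could be built for these two sentences. It assumes plain whitespace tokenization and an alphabetically sorted vocabulary. Both are illustrative choices, and the helper name `bag_of_words` is hypothetical rather than taken from any library.

```python
from collections import Counter

sentences = [
    "the cat sat on the mat",
    "the dog sat on the cat",
]

# Assumption: split on whitespace only. Real pipelines usually also
# lowercase the text and strip punctuation.
tokenized = [sentence.split() for sentence in sentences]

# The vocabulary is every distinct word across all sentences.
# Sorting is an arbitrary choice that keeps the column order stable.
vocabulary = sorted({word for tokens in tokenized for word in tokens})


def bag_of_words(tokens, vocabulary):
    """Return one count per vocabulary word (hypothetical helper)."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


vectors = [bag_of_words(tokens, vocabulary) for tokens in tokenized]

print(vocabulary)
for sentence, vector in zip(sentences, vectors):
    print(vector, "<-", sentence)
```

Each sentence becomes a vector with one slot per vocabulary word, holding how many times that word appears. Word order is discarded, which is why the method is called a "bag" of words.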