Understanding and visualizing the TF-IDF concept

ROHIT SHUKLA
Nov 28, 2020


When we first hear the term TF-IDF, it sounds like a code word from a James Bond movie, and we get frightened about how we are going to crack it. In reality, it is just a conversion technique. To be more precise, TF-IDF is used to convert strings into a numerical (vector) notation that a machine can understand.

As we know, machines do not understand plain words. To make a machine understand English (or any other language), we need to convert the text into a form the machine can work with, and TF-IDF is just a tool for that conversion.

Let me take an example to show how this conversion happens.

Suppose we have the 3 sentences below.

  1. Good Morning
  2. Good Evening
  3. Good Morning and Evening

We will use TF-IDF to convert the above sentences into a numerical (vector) form that a model can use for its analysis.

Basically, TF-IDF is a combination of two terms:

  1. TF (Term Frequency): how often a word appears within a single sentence.
  2. IDF (Inverse Document Frequency): how rare a word is across all the sentences.
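The article does not spell out the formulas, so here is a rough sketch that assumes the common textbook definitions: TF is a word's count divided by the sentence length, and IDF is the natural log of the number of sentences divided by the number of sentences containing the word. The function names `term_frequency` and `inverse_document_frequency` are made up for illustration. Libraries such as scikit-learn use smoothed variants, so their numbers will differ.

```python
import math

def term_frequency(word, sentence):
    """Share of the sentence's words that are `word` (assumed definition)."""
    tokens = sentence.lower().split()
    return tokens.count(word) / len(tokens)

def inverse_document_frequency(word, sentences):
    """log(N / sentences containing `word`); assumes the word occurs at least once."""
    containing = sum(1 for s in sentences if word in s.lower().split())
    return math.log(len(sentences) / containing)

sentences = ["Good Morning", "Good Evening", "Good Morning and Evening"]

tf_good = term_frequency("good", sentences[0])             # 1 of 2 words
idf_good = inverse_document_frequency("good", sentences)   # appears in all 3 sentences
idf_and = inverse_document_frequency("and", sentences)     # appears in only 1 sentence
```

Under these assumptions, "good" gets an IDF of zero because it appears in every sentence, while "and" gets the highest IDF because it appears in only one.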

So now we will convert the above 3 sentences into TF and IDF values.

In the image above, we can see how the words have been converted into numerical form.
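As a rough cross-check of the conversion, here is a minimal sketch in plain Python that turns the 3 sentences into TF-IDF vectors. It assumes TF is count divided by sentence length and IDF is the natural log of (number of sentences / sentences containing the word). These are common textbook choices rather than necessarily the ones behind the image, so the exact numbers may differ. The variable names are only illustrative.

```python
import math

sentences = ["Good Morning", "Good Evening", "Good Morning and Evening"]
docs = [s.lower().split() for s in sentences]

# Vocabulary: every distinct word, in alphabetical order
vocab = sorted({w for d in docs for w in d})

# IDF per word: log(N / number of sentences containing the word) (assumed, unsmoothed)
idf = {w: math.log(len(docs) / sum(w in d for d in docs)) for w in vocab}

# One TF-IDF vector per sentence, with columns in `vocab` order
tfidf = [[round(d.count(w) / len(d) * idf[w], 3) for w in vocab] for d in docs]

print(vocab)
for s, row in zip(sentences, tfidf):
    print(s, row)
```

With these definitions, "good" scores 0 everywhere because it appears in all 3 sentences and so does not help tell them apart, while "and" gets the largest weight in the third sentence.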

Hope this gives a better understanding of TF-IDF.
