Perplexity …


… in the context of Natural Language Processing.

Perplexity formula

What is perplexity?

Perplexity is a measure of how well a probability model predicts a sample: the lower the perplexity, the better the model fits the data.

A language model is a kind of probability model that measures how likely a given sentence is according to a large corpus of text used as the training set (the Wall Street Journal dataset, YouTube comments in a given language, the Brown corpus …).

A unigram model (order 1) is an example of a language model: it gives the probability of a sentence by multiplying the probabilities of its individual words, each estimated from that word's frequency in the training set.
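As a rough illustration (using a tiny made-up corpus rather than a real training set), a unigram sentence probability could be computed like this:

```python
from collections import Counter

# Tiny hypothetical training corpus; a real model would be trained on a large
# dataset such as the Brown corpus or the Wall Street Journal dataset.
corpus = "the cat sat on the mat the dog sat on the rug".split()
unigram_counts = Counter(corpus)
total_words = sum(unigram_counts.values())

def unigram_sentence_prob(sentence):
    # P(sentence) = product over words of P(word), with
    # P(word) = count(word) / total number of words in the training set.
    prob = 1.0
    for word in sentence.split():
        prob *= unigram_counts[word] / total_words
    return prob

print(unigram_sentence_prob("the cat sat on the rug"))
```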

A bigram model (order 2) is another example of a language model: it gives the probability of a sentence by multiplying the probability of each word conditioned on the previous word (except for the first word), estimated from the frequency of those word pairs (or of the first word alone) in the training set.
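The same toy corpus can be reused for a bigram sketch, where each word is conditioned on the one before it (smoothing is omitted here, so unseen pairs simply make the probability zero):

```python
from collections import Counter

corpus = "the cat sat on the mat the dog sat on the rug".split()
unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))

def bigram_sentence_prob(sentence):
    # P(sentence) = P(w1) * product of P(w_i | w_{i-1}),
    # with P(w_i | w_{i-1}) = count(w_{i-1}, w_i) / count(w_{i-1}).
    words = sentence.split()
    prob = unigram_counts[words[0]] / len(corpus)  # first word: plain unigram probability
    for prev, cur in zip(words, words[1:]):
        prob *= bigram_counts[(prev, cur)] / unigram_counts[prev]
    return prob

print(bigram_sentence_prob("the dog sat on the mat"))
```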

This can be generalised to order-N (N-gram) models.

Having this in mind, the perplexity of such a model is the inverse of the geometric mean of the individual word (or pair, or triplet …) probabilities. Refer to the main image to see the formula.
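For readers who cannot see the image, the standard textbook form of the perplexity of a sentence W of N words (which is what the image shows, assuming the usual convention) is:

PP(W) = P(w_1 w_2 \dots w_N)^{-\frac{1}{N}} = \sqrt[N]{\prod_{i=1}^{N} \frac{1}{P(w_i \mid w_1 \dots w_{i-1})}}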

This formula can be derived intuitively from the notion of cross-entropy, as explained in this great article [1].
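In a nutshell, perplexity is the exponentiation of the model's per-word cross-entropy on the text:

PP(W) = 2^{H(W)}, \qquad H(W) = -\frac{1}{N}\sum_{i=1}^{N} \log_2 P(w_i \mid w_1 \dots w_{i-1})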

How do we use it in Natural Language Processing models?

An example of an application would be to determine how relevant (correct syntax, grammar, insightfulness …) a comment is among all the comments on the IMDb website (a film-rating website).

Another application can be found in translation. A Chinese sentence could be translated in different ways, and calculating the perplexity of each possibility gives a way to choose one translation over another.
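A minimal sketch of that idea, reusing the toy corpus from above with hypothetical candidate translations (add-one smoothing is used so unseen word pairs do not zero out the score), could look like this:

```python
import math
from collections import Counter

# Toy hypothetical corpus; a real system would train on a large in-domain corpus.
corpus = "the cat sat on the mat the dog sat on the rug".split()
unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))
vocab_size = len(unigram_counts)

def bigram_perplexity(sentence):
    # Bigram perplexity with add-one smoothing: the inverse geometric mean
    # of P(w_i | w_{i-1}) over the sentence.
    words = sentence.split()
    log_prob = 0.0
    for prev, cur in zip(words, words[1:]):
        p = (bigram_counts[(prev, cur)] + 1) / (unigram_counts[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(words) - 1))

# Two hypothetical candidate renderings of the same source sentence:
candidates = ["the cat sat on the mat", "cat the on sat mat the"]
best = min(candidates, key=bigram_perplexity)
print(best)  # the fluent word order gets the lower perplexity
```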

Speech recognition is another interesting application of such a measurement. Sounds can be interpreted in different ways (homophones, noise, polysemy …), but our brain, or the language model, can tell which version is the most probable, i.e. which would leave us the least perplexed.

Relevant reference

[1] Vajapeyam, S. Understanding Shannon’s Entropy metric for Information (2014).
