The relationship between Perplexity and Entropy in NLP

Use Information Theory to understand NLP Metrics

Perplexity is a common metric to use when evaluating language models.
Finally, a technical point: we want to define the entropy of the language L (or language model M) regardless of sentence length n. So finally we define

[Equation: Final definition of entropy for a language (model)]

The Shannon-McMillan-Breiman Theorem

Under anodyne assumptions³ the entropy simplifies even further.
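For reference, the standard textbook form of these two statements can be written as follows. The notation is an assumption here: L(x_1 … x_n) stands for the probability the language assigns to an n-word sequence, and logarithms are taken base 2 so that entropy is measured in bits per word.

```latex
% Per-word entropy of the language, independent of sentence length n
H(L) = -\lim_{n \to \infty} \frac{1}{n} \sum_{x_1 \dots x_n} L(x_1 \dots x_n) \, \log_2 L(x_1 \dots x_n)

% Shannon-McMillan-Breiman: for a stationary ergodic source, a single
% long sample suffices and the sum over all sequences drops out
H(L) = -\lim_{n \to \infty} \frac{1}{n} \log_2 L(x_1 \dots x_n)
```

The second form is what makes the quantity measurable in practice: it replaces an expectation over every possible sequence with the log-probability of one sufficiently long text.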
Cross-Entropy

Suppose we mistakenly think that our language model M is correct.
The cross-entropy H(L,M) is what we measure the entropy to be

[Equation: Cross entropy for our language model M]

Where the second line again applies the Shannon-McMillan-Breiman theorem.
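In practice this means the cross-entropy can be estimated from one long sample of real text: average the negative base-2 log of the probability the model assigned to each token. Perplexity is then 2 raised to that estimate. The sketch below assumes the per-token probabilities have already been obtained from some model; the function names and the toy input are illustrative, not taken from any particular library.

```python
import math


def cross_entropy_bits(token_probs):
    """Estimate H(L, M) in bits per token.

    token_probs: probabilities that the model M assigned to each token of a
    sample drawn from the language L (hypothetical input for illustration).
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log2-probability over the sample.
    return -sum(math.log2(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs):
    """Perplexity is 2 raised to the cross-entropy measured in bits."""
    return 2.0 ** cross_entropy_bits(token_probs)


# A model that gives every token probability 1/4 is as uncertain as a
# uniform choice among 4 options.
sample_probs = [0.25, 0.25, 0.25, 0.25]
print(cross_entropy_bits(sample_probs))  # 2.0
print(perplexity(sample_probs))          # 4.0
```

Working with log-probabilities rather than multiplying raw probabilities avoids numerical underflow on long texts.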
The perplexity of M is bounded below by the perplexity of the actual language L (likewise, the cross-entropy H(L,M) is bounded below by the entropy H(L)).
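The bound can be checked numerically on a toy case. The sketch below treats the "language" and the "model" as two made-up distributions over a three-word vocabulary, a simplification that ignores word order, and compares the entropy of the true distribution with the cross-entropy under the mismatched model.

```python
import math


def entropy_bits(p):
    """Entropy H(p) in bits of a distribution given as {word: probability}."""
    return -sum(pw * math.log2(pw) for pw in p.values() if pw > 0)


def cross_entropy_between(p, q):
    """Cross-entropy H(p, q) in bits: expected code length under q when
    words are actually drawn from p."""
    return -sum(pw * math.log2(q[w]) for w, pw in p.items() if pw > 0)


# Hypothetical toy distributions, chosen only for illustration.
true_language = {"a": 0.5, "b": 0.25, "c": 0.25}
model = {"a": 0.25, "b": 0.25, "c": 0.5}

h_l = entropy_bits(true_language)                   # 1.5 bits
h_lm = cross_entropy_between(true_language, model)  # 1.75 bits

print(h_l, h_lm)
print(2 ** h_l <= 2 ** h_lm)  # True: perplexity of L <= perplexity of M
```

Equality holds only when the model matches the true distribution, which is why a lower perplexity indicates a better model.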
