All you need to know about RNN
Jerry Xiao

Language Modeling

Language modeling is the task of predicting the next word in a sequence.
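Formally, a language model estimates the probability of the next word given the words so far:

$$
P(w_{t+1} \mid w_1, w_2, \dots, w_t)
$$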

Pre-Neural Network Solutions: N-gram Models

An n-gram is a chunk of n consecutive words. For example, with n = 3, "the cat sat" is a trigram.

An n-gram model is based on the Markov assumption: the probability of a word depends only on the previous n-1 words.

The probability of a sentence is then the product of these conditional probabilities, each estimated from n-gram counts in a large corpus (see the formula after the list below). There are two main problems:

  1. Sparsity Problem: many n-grams never occur in the corpus, so their estimated probabilities are zero even when the word sequence is perfectly plausible.
  2. Storage Problem: the model must store counts for every n-gram observed in the corpus, and the number of distinct n-grams grows quickly with n and with corpus size.
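Concretely, under the Markov assumption the sentence probability factorizes into per-word terms, and each term is estimated by counting occurrences in the corpus:

$$
P(w_1, \dots, w_T) \approx \prod_{t=1}^{T} P(w_t \mid w_{t-n+1}, \dots, w_{t-1}),
\qquad
P(w_t \mid w_{t-n+1}, \dots, w_{t-1}) \approx \frac{\operatorname{count}(w_{t-n+1}, \dots, w_t)}{\operatorname{count}(w_{t-n+1}, \dots, w_{t-1})}
$$

A zero count in the numerator (or denominator) is the sparsity problem; keeping all of these counts around is the storage problem.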

Text generated by an n-gram model is usually locally grammatical but globally incoherent, since the model only ever looks n-1 words back.
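A minimal sketch of a count-based trigram model in Python (the function names and toy corpus are illustrative, not from any particular library):

```python
import random
from collections import Counter, defaultdict

def train_trigram_lm(corpus):
    """Count trigrams: counts[(w1, w2)][w3] = occurrences of (w1, w2, w3)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<s>", "<s>"] + sentence.split() + ["</s>"]
        for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
            counts[(w1, w2)][w3] += 1
    return counts

def sample_sentence(counts):
    """Generate by sampling the next word in proportion to its count."""
    context, words = ("<s>", "<s>"), []
    while True:
        nxt = counts.get(context)
        if not nxt:  # sparsity problem: an unseen context has no counts at all
            break
        word = random.choices(list(nxt), weights=nxt.values())[0]
        if word == "</s>":
            break
        words.append(word)
        context = (context[1], word)
    return " ".join(words)

corpus = ["the cat sat on the mat", "the dog sat on the rug"]
counts = train_trigram_lm(corpus)
print(sample_sentence(counts))
```

Even this toy model can recombine its two training sentences into "the cat sat on the rug": each trigram is grammatical, but nothing enforces global consistency.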

Neural Network Solutions: Fixed-Window Neural LM

Represent words with embedding vectors; predict the next word using the concatenated embeddings from a fixed context window.

This mitigates the sparsity problem, since similar words get similar embeddings, but the model is restricted by the fixed context window: including more context means a larger window, and the input weight matrix grows with the window size, increasing model size and computation.
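A minimal sketch of a fixed-window neural LM in PyTorch (the class name and hyperparameters are illustrative); note how the first weight matrix scales with the window size, which is exactly the cost described above:

```python
import torch
import torch.nn as nn

class FixedWindowLM(nn.Module):
    """Predict the next word from the concatenated embeddings of a fixed window."""
    def __init__(self, vocab_size, embed_dim=64, window=4, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # This layer grows linearly with the window: (window * embed_dim) inputs.
        self.hidden = nn.Linear(window * embed_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, window_ids):                # (batch, window)
        e = self.embed(window_ids)                # (batch, window, embed_dim)
        e = e.flatten(start_dim=1)                # concatenate the window
        h = torch.tanh(self.hidden(e))
        return self.out(h)                        # next-word logits

model = FixedWindowLM(vocab_size=10_000)
logits = model(torch.randint(0, 10_000, (2, 4)))  # batch of 2 windows
print(logits.shape)                               # torch.Size([2, 10000])
```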

Neural Network Solutions: RNN
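The key idea of an RNN is to reuse the same weights at every time step, updating a hidden state h_t = tanh(W_h h_{t-1} + W_e e_t + b) as each word arrives, so the model can in principle condition on arbitrarily long context without growing its parameters. A minimal sketch of an RNN language model, again in PyTorch with illustrative names:

```python
import torch
import torch.nn as nn

class RNNLM(nn.Module):
    """Reuse the same weights at each step to update a hidden state,
    so the context length is not fixed in advance."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):                 # (batch, seq_len)
        e = self.embed(token_ids)                 # (batch, seq_len, embed_dim)
        h, _ = self.rnn(e)                        # (batch, seq_len, hidden_dim)
        return self.out(h)                        # next-word logits at every step

model = RNNLM(vocab_size=10_000)
logits = model(torch.randint(0, 10_000, (2, 35)))
print(logits.shape)                               # torch.Size([2, 35, 10000])
```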

RNN Variants: LSTM
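An LSTM replaces the plain recurrence with gated memory cells that control what is written, kept, and read at each step, which mitigates vanishing gradients and helps the model learn longer-range dependencies. In code, the change from the RNN sketch above is small, assuming PyTorch's built-in nn.LSTM:

```python
import torch
import torch.nn as nn

class LSTMLM(nn.Module):
    """Same shape as RNNLM, but gated cells carry long-range information."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))   # hidden states per step
        return self.out(h)                        # next-word logits per step
```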

Sequence Labeling
