According to Wikipedia, natural language processing (NLP) is an area of computer science and artificial intelligence concerned with the interactions between computers and human (natural) languages. These days NLP involves information extraction, parsing, machine translation, dialogue systems, question answering, and more.
This blog is intended as a collection of algorithms for NLP and as an overview of the field. It will be updated regularly as my learning continues.
Language Models
A language model is a probability distribution over sequences of words or sentences. LMs use the chain rule of probability to compute the probability of an entire sequence.
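Concretely, for a sequence of words (standard notation, added here for reference):

$$P(w_1, w_2, \ldots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \ldots, w_{i-1})$$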
Bag of Words: a text is represented as the bag of its words, disregarding grammar and word order
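A minimal bag-of-words sketch using only Python's standard library (the whitespace tokenizer and toy sentence are my own assumptions):

```python
from collections import Counter

def bag_of_words(text):
    """Represent a text as the multiset of its lowercased tokens."""
    tokens = text.lower().split()  # naive whitespace tokenization
    return Counter(tokens)         # word -> count; order and grammar discarded

print(bag_of_words("The cat sat on the mat"))
# Counter({'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1})
```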
N-Gram Models
An N-gram is a sequence of N words. The intuition of the N-gram model is that instead of computing the probability of a word given its entire history, we can approximate the history by just the last few words.
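For example, a bigram model assumes $P(w_i \mid w_1, \ldots, w_{i-1}) \approx P(w_i \mid w_{i-1})$. A minimal maximum-likelihood bigram sketch (the toy corpus and sentence markers are illustrative assumptions):

```python
from collections import defaultdict

def train_bigram(sentences):
    """Estimate P(word | previous word) by maximum likelihood."""
    counts = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        for prev, word in zip(["<s>"] + sent, sent + ["</s>"]):
            counts[prev][word] += 1
    return {prev: {w: c / sum(nxt.values()) for w, c in nxt.items()}
            for prev, nxt in counts.items()}

model = train_bigram([["the", "cat", "sat"], ["the", "dog", "sat"]])
print(model["the"])  # {'cat': 0.5, 'dog': 0.5}
```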
Kneser-Ney Smoothing
Maximum Entropy Models
Linear models that are typically regularized via L1 or L2 terms in the likelihood objective, avoiding the need for the kinds of backoff or mixture weights used in smoothed N-gram language models; this makes them more statistically efficient.
Related to the Maximum Entropy Principle
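Since a maximum entropy classifier is equivalent to (multinomial) logistic regression, here is a quick sketch with scikit-learn (the features, data, and regularization strength are illustrative assumptions):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["good movie", "great film", "bad movie", "awful film"]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

vec = CountVectorizer()
X = vec.fit_transform(texts)                   # bag-of-words features
clf = LogisticRegression(penalty="l2", C=1.0)  # L2-regularized maxent classifier
clf.fit(X, labels)
print(clf.predict(vec.transform(["great movie"])))
```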
Neural Network Models
Learn a distributed representation for words with neural networks.
Word embeddings map each word into a lower-dimensional real-valued space through a shared weight (lookup) matrix.
RNNs can use their internal state (memory) to process sequences of inputs.
LSTMs use several gates to control how the state changes (a minimal sketch follows this list).
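A minimal sketch of these pieces in PyTorch (the vocabulary size, dimensions, and random batch are arbitrary assumptions):

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 64, 128

embedding = nn.Embedding(vocab_size, embed_dim)   # shared lookup matrix: id -> vector
lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

token_ids = torch.randint(0, vocab_size, (2, 7))  # batch of 2 sequences of length 7
vectors = embedding(token_ids)                    # shape (2, 7, 64)
outputs, (h_n, c_n) = lstm(vectors)               # outputs: shape (2, 7, 128)
print(outputs.shape)
```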
Parsing
Parsing is a combination of recognizing an input string and assigning a structure to it; the structure is usually a tree.
Parse trees are directly useful in applications such as grammar checking; they also serve as an important intermediate representation for semantic analysis and play an important role in applications like question answering and information extraction.
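For instance, a constituency tree can be written in bracketed notation and inspected with NLTK (the toy sentence and its bracketing are my own):

```python
from nltk.tree import Tree

t = Tree.fromstring("(S (NP (DT the) (NN cat)) (VP (VBD sat)))")
t.pretty_print()   # draws the tree as ASCII art
print(t.leaves())  # ['the', 'cat', 'sat']
```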
Tagging
POS tagging is useful as a pre-processing step for parsing (a quick example follows this list).
MEMM: left-to-right local decisions, conditioning on previous tags as well as the entire input
Conditional Random Fields
Global Discriminative Tagger
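A quick tagging example with NLTK's default (averaged perceptron) tagger; note this is an off-the-shelf tagger rather than an MEMM or CRF:

```python
import nltk

nltk.download("punkt", quiet=True)                       # tokenizer model
nltk.download("averaged_perceptron_tagger", quiet=True)  # tagger model

tokens = nltk.word_tokenize("The cat sat on the mat")
print(nltk.pos_tag(tokens))
# [('The', 'DT'), ('cat', 'NN'), ('sat', 'VBD'), ...]
```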
Syntactic Parsing
CKY Parsing (see the sketch after this list)
Agenda-Based Parsing
Statistical Parsing
Dependency Parsing
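A minimal CKY recognizer for a grammar in Chomsky normal form (the grammar, its dictionary encoding, and the sentence are illustrative assumptions):

```python
from itertools import product

# CNF grammar: child tuple -> set of parent nonterminals
grammar = {
    ("NP", "VP"): {"S"},
    ("DT", "NN"): {"NP"},
    ("the",): {"DT"}, ("cat",): {"NN"}, ("sleeps",): {"VP"},
}

def cky_recognize(words):
    """Return True iff 'S' derives the word sequence under the grammar."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):              # lexical entries on the diagonal
        chart[i][i + 1] = set(grammar.get((w,), set()))
    for span in range(2, n + 1):               # increasingly wide spans
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # every split point
                for b, c in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= grammar.get((b, c), set())
    return "S" in chart[0][n]

print(cky_recognize("the cat sleeps".split()))  # True
```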
Speech Recognition
Noisy Channel Model
Find the intended word given a word whose letters have been scrambled in some manner. In fact, this framework is not restricted to speech recognition; it is a general one in NLP (spelling correction, for example).
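In the standard formulation (notation added for reference), the most likely intended word $\hat{w}$ given an observation $o$ is recovered via Bayes' rule, where $P(o \mid w)$ is the channel (emission) model and $P(w)$ is the language model:

$$\hat{w} = \arg\max_{w} P(w \mid o) = \arg\max_{w} P(o \mid w)\, P(w)$$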
Emission Model
Hidden Markov Model: the system is modeled as a Markov process with unobserved (hidden) states.
Viterbi Algorithm: a dynamic programming algorithm for finding the most likely sequence of hidden states, especially in HMMs (see the sketch after this list)
Gaussian Mixture Model
EM
Beam search: a heuristic search algorithm that explores a graph by expanding only the most promising nodes in a limited set (the beam).
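A minimal Viterbi sketch over a toy HMM (the states, probabilities, and observations are the classic weather example, invented for illustration):

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state sequence for obs under an HMM."""
    # V[t][s] = (best probability of reaching s at time t, best predecessor)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        V.append({s: max((V[-1][prev][0] * trans_p[prev][s] * emit_p[s][o], prev)
                         for prev in states) for s in states})
    state = max(states, key=lambda s: V[-1][s][0])  # best final state
    path = [state]
    for t in range(len(obs) - 1, 0, -1):            # follow back-pointers
        state = V[t][state][1]
        path.append(state)
    return list(reversed(path))

states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
print(viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p))
# ['Sunny', 'Rainy', 'Rainy']
```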
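And a generic beam-search sketch over step-wise token probabilities (the toy score tables are invented):

```python
import math

def beam_search(step_scores, beam_width=2):
    """Keep only the top-`beam_width` partial sequences at each step."""
    beams = [((), 0.0)]                # (sequence, cumulative log-probability)
    for scores in step_scores:         # scores: {token: probability}
        candidates = [(seq + (tok,), lp + math.log(p))
                      for seq, lp in beams for tok, p in scores.items()]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

steps = [{"a": 0.6, "b": 0.4}, {"a": 0.3, "c": 0.7}]
print(beam_search(steps))  # highest-scoring sequences first
```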
Machine Translation
Evaluation Metrics
BLEU (an example follows this list)
ROUGE
METEOR
CIDEr
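For instance, sentence-level BLEU with NLTK (smoothing is needed for short sentences; the toy reference and candidate are mine):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["the", "cat", "sat", "on", "the", "mat"]
candidate = ["the", "cat", "is", "on", "the", "mat"]

score = sentence_bleu([reference], candidate,
                      smoothing_function=SmoothingFunction().method1)
print(score)
```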
Word Alignment
IBM Models (a Model 1 sketch follows this list)
HMM alignment model
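A minimal IBM Model 1 EM sketch that estimates word-translation probabilities t(f | e) (the tiny parallel corpus and uniform initialization are assumptions):

```python
from collections import defaultdict

corpus = [(["das", "haus"], ["the", "house"]),
          (["das", "buch"], ["the", "book"])]

f_vocab = {f for fs, _ in corpus for f in fs}
t = defaultdict(lambda: 1.0 / len(f_vocab))     # t[(f, e)], uniform init

for _ in range(10):                             # EM iterations
    count = defaultdict(float)                  # expected counts c(f, e)
    total = defaultdict(float)                  # expected counts c(e)
    for fs, es in corpus:                       # E-step: expected alignments
        for f in fs:
            z = sum(t[(f, e)] for e in es)      # normalize over possible alignments
            for e in es:
                c = t[(f, e)] / z
                count[(f, e)] += c
                total[e] += c
    for (f, e), c in count.items():             # M-step: re-estimate t
        t[(f, e)] = c / total[e]

print(t[("haus", "house")])  # climbs toward 1.0 as iterations increase
```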
Modern Approaches
Seq2Seq
Attention Mechanism
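A NumPy sketch of scaled dot-product attention, the core operation behind modern attention mechanisms (the shapes and random inputs are arbitrary):

```python
import numpy as np

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V -- scaled dot-product attention."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 4)), rng.normal(size=(5, 4)), rng.normal(size=(5, 4))
print(attention(Q, K, V).shape)  # (3, 4)
```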