# Autoregressive (AR) Language Modeling

Autoregressive (AR) Language Modeling is one of the most well-known and widely used pretraining objectives in Natural Language Processing. Because it has its roots in time series modeling, its role in language modeling can be hard to grasp at first. This article aims to provide a simple explanation of AR modeling.

# Regression Analysis

Regression analysis is a set of statistical methods used to estimate the relationship between a dependent variable and one or more independent variables. It is useful for determining the strength of the relationship between the variables under consideration and for forecasting future values of the dependent variable from that relationship.

Variations of this analysis include non-linear regression, linear regression, and multiple linear regression. Each variation has its own set of assumptions.
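As a concrete illustration of the simplest variation, the sketch below fits an ordinary linear regression with NumPy's least-squares solver. The data are a made-up toy example generated from the line y = 2x + 1 with no noise, so the fit should recover those coefficients almost exactly.

```python
import numpy as np

# Toy data drawn from the line y = 2x + 1 (noise-free, purely illustrative).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Design matrix with a column for x and a column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# Solve the least-squares problem min ||A @ beta - y||^2.
slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]
```

With real, noisy data the recovered coefficients would only approximate the underlying relationship, which is exactly what regression analysis quantifies.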

# What is Autoregression?

Autoregression is a time series model that uses results from prior time steps as input to a regression equation that predicts the value at the next time step. The term autoregression specifies that it is a regression of the variable against itself.

An autoregressive model of order p can be written as follows:

y_t = c + φ_1 y_{t−1} + φ_2 y_{t−2} + ⋯ + φ_p y_{t−p} + ε_t

where ε_t is white noise and c is a constant. This is like a multiple regression, but with lagged values of y_t as predictors. We refer to this as an AR(p) model, an autoregressive model of order p.
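The equation above can be made tangible with a short simulation. The sketch below generates an AR(2) series with known coefficients (φ_1 = 0.6, φ_2 = −0.3, values chosen here only for illustration) and then recovers them by regressing each value on its own two lagged values, which is exactly what "auto"-regression means.

```python
import numpy as np

rng = np.random.default_rng(0)

# True AR(2) coefficients used to simulate the series (illustrative choice).
phi1, phi2 = 0.6, -0.3

# Generate y_t = phi1 * y_{t-1} + phi2 * y_{t-2} + eps_t, eps_t ~ N(0, 1).
n = 5000
y = np.zeros(n)
for t in range(2, n):
    y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + rng.normal()

# Fit by least squares: predictors are the lagged values of y itself.
X = np.column_stack([y[1:-1], y[:-2]])  # columns: y_{t-1}, y_{t-2}
target = y[2:]
phi1_hat, phi2_hat = np.linalg.lstsq(X, target, rcond=None)[0]
```

With a few thousand observations the estimated coefficients land close to the true values, confirming that the lagged-value regression recovers the model.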

# Autoregressive (AR) Language Modeling

The autoregressive model is a feed-forward model that predicts the next word from the preceding words in a given context. The context, in this model, runs in a single direction, either forward or backward, but only one at a time. Formally, the model factorizes the probability of a sequence as a product of conditional probabilities, P(x_1, …, x_T) = Π_t P(x_t | x_1, …, x_{t−1}), which makes it effective for generative NLP tasks that produce text in the forward direction.

However, it does have a limitation. The model can utilize either the forward context or the backward context, but not both simultaneously, so its view of the words surrounding a prediction is always one-sided, which constrains its understanding of the full context.
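To ground the idea, the sketch below builds the smallest possible AR language model: a bigram (order-1) model that estimates P(next word | previous word) from counts and always predicts left to right. The corpus is a made-up toy example; a real model would be trained on vastly more text and condition on the full preceding context, not just one word.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, pre-tokenized by whitespace.
corpus = "the cat sat on the mat . the cat ran .".split()

# Count bigram transitions: how often each word follows a given word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_probs(prev):
    """Estimate P(next | prev) from counts -- an AR model of order 1 over words."""
    c = counts[prev]
    total = sum(c.values())
    return {w: n / total for w, n in c.items()}

# Forward-direction prediction: given "the", what comes next?
probs = next_word_probs("the")
```

Note that the model only ever looks backward at the words already generated, never at words to the right, which is precisely the one-directional constraint described above.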