**As A "log-linear" Model**

Yet another formulation combines the two-way latent variable formulation above with the original formulation higher up without latent variables, and in the process provides a link to one of the standard formulations of the multinomial logit.

Here, instead of writing the logit of the probabilities *p*_{i} as a linear predictor, we separate the linear predictor into two, one for each of the two outcomes:

Note that two separate sets of regression coefficients have been introduced, just as in the two-way latent variable model, and the two equations appear a form that writes the logarithm of the associated probability as a linear predictor, with an extra term at the end. This term, as it turns out, serves as the normalizing factor ensuring that the result is a distribution. This can be seen by exponentiating both sides:

In this form it is clear that the purpose of *Z* is to ensure that the resulting distribution over *Y*_{i} is in fact a probability distribution, i.e. it sums to 1. This means that *Z* is simply the sum of all un-normalized probabilities, and by dividing each probability by *Z*, the probabilities become "normalized". That is:

and the resulting equations are

Or generally:

This shows clearly how to generalize this formulation to more than two outcomes, as in multinomial logit.

Now, how can we prove that this is equivalent to the previous model? Keep in mind that the above model is overspecified, in that and cannot be independently specified: rather so knowing one automatically determines the other. As a result, the model is nonidentifiable, in that multiple combinations of **β**_{0} and **β**_{1} will produce the same probabilities for all possible explanatory variables. In fact, it can be seen that adding any constant vector to both of them will produce the same probabilities:

As a result, we can simplify matters, and restore identifiability, by picking an arbitrary value for one of the two vectors. We choose to set Then,

and so

which shows that this formulation is indeed equivalent to the previous formulation. (As in the two-way latent variable formulation, any settings where will produce equivalent results.)

Note that most treatments of the multinomial logit model start out either by extending the "log-linear" formulation presented here or the two-way latent variable formulation presented above, since both clearly show the way that the model could be extended to multi-way outcomes. In general, the presentation with latent variables is more common in econometrics and political science, where discrete choice models and utility theory reign, while the "log-linear" formulation here is more common in computer science, e.g. machine learning and natural language processing.

Read more about this topic: Logistic Regression, Formal Mathematical Specification

### Famous quotes containing the word model:

“Your home is regarded as a *model* home, your life as a *model* life. But all this splendor, and you along with it ... it’s just as though it were built upon a shifting quagmire. A moment may come, a word can be spoken, and both you and all this splendor will collapse.”

—Henrik Ibsen (1828–1906)