Categorical Distribution

A categorical distribution is a discrete probability distribution whose sample space is a set of k individually identified items. It generalizes the Bernoulli distribution from a random variable with two possible outcomes to a categorical random variable with k possible outcomes.

In one formulation of the distribution, the sample space is taken to be a finite sequence of integers. The exact integers used as labels are unimportant; they might be {0, 1, ..., k-1} or {1, 2, ..., k} or any other arbitrary set of values. In the following descriptions, we use {1, 2, ..., k} for convenience, although this disagrees with the convention for the Bernoulli distribution, which uses {0, 1}. In this case, the probability mass function f is:

f(x=i \mid \boldsymbol{p}) = p_i,

where \boldsymbol{p} = (p_1, \ldots, p_k), p_i represents the probability of seeing element i, and \sum_{i=1}^k p_i = 1.
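As a minimal sketch of this first formulation, the probability mass function is a lookup into the parameter vector. The function name `categorical_pmf` and the example probabilities are illustrative assumptions; the labels follow the {1, 2, ..., k} convention used above.

```python
import math

def categorical_pmf(i, p):
    """Return f(x = i | p) for labels 1..k, where p[i - 1] holds p_i.

    `categorical_pmf` is a hypothetical helper name used for illustration.
    """
    # The parameters must be non-negative and sum to 1.
    if any(q < 0 for q in p) or not math.isclose(sum(p), 1.0):
        raise ValueError("p must be non-negative and sum to 1")
    # Labels outside {1, ..., k} are not in the sample space.
    if not 1 <= i <= len(p):
        return 0.0
    return p[i - 1]

p = [0.2, 0.5, 0.3]           # example parameters, k = 3 (assumed values)
print(categorical_pmf(2, p))  # 0.5
```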

Another formulation that appears more complex but facilitates mathematical manipulations is as follows, using the Iverson bracket:

f(x \mid \boldsymbol{p}) = \prod_{i=1}^k p_i^{[x=i]},

where [x=i] evaluates to 1 if x = i, 0 otherwise. There are various advantages of this formulation, e.g.:

  • It is easier to write out the likelihood function of a set of independent identically distributed categorical variables.
  • It connects the categorical distribution with the related multinomial distribution.
  • It shows why the Dirichlet distribution is the conjugate prior of the categorical distribution, and allows the posterior distribution of the parameters to be calculated.
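The first advantage can be illustrated with a short sketch. Because each observation contributes a product of powers of the p_i, the likelihood of an independent, identically distributed sample collapses to \prod_{i=1}^k p_i^{c_i}, where c_i is the number of observations equal to i. The helper names and the sample data below are assumptions made for illustration.

```python
import math
from collections import Counter

def iverson(condition):
    """Iverson bracket: 1 if the condition holds, 0 otherwise."""
    return 1 if condition else 0

def pmf_iverson(x, p):
    """f(x | p) written as the product of p_i ** [x = i] over i = 1..k."""
    return math.prod(p_i ** iverson(x == i) for i, p_i in enumerate(p, start=1))

def likelihood_direct(xs, p):
    """Likelihood of an i.i.d. sample as a product of per-observation pmfs."""
    return math.prod(pmf_iverson(x, p) for x in xs)

def likelihood_from_counts(xs, p):
    """Same likelihood, grouped by category: product of p_i ** c_i."""
    counts = Counter(xs)  # counts[i] is c_i; missing labels count as 0
    return math.prod(p_i ** counts[i] for i, p_i in enumerate(p, start=1))

p = [0.2, 0.5, 0.3]        # example parameters (assumed values)
xs = [1, 3, 2, 2, 3, 2]    # example sample: c_1 = 1, c_2 = 3, c_3 = 2
assert math.isclose(likelihood_direct(xs, p), likelihood_from_counts(xs, p))
```

The grouped form depends on the data only through the counts c_i, which is the same dependence on p that appears in the multinomial probability mass function.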

Yet another formulation makes explicit the connection between the categorical and multinomial distributions by treating the categorical distribution as a special case of the multinomial distribution in which the parameter n of the multinomial distribution (the number of sampled items) is fixed at 1. In this formulation, the sample space can be considered to be the set of 1-of-K encoded random vectors x of dimension k having the property that exactly one element has the value 1 and the others have the value 0. The particular element having the value 1 indicates which category has been chosen. The probability mass function f in this formulation is:

f(\mathbf{x} \mid \boldsymbol{p}) = \prod_{i=1}^k p_i^{x_i},

where p_i represents the probability of seeing element i and \sum_{i=1}^k p_i = 1. This is the formulation adopted by Bishop.
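A small sketch of the 1-of-K formulation follows, assuming vectors are plain Python lists. The names `one_hot` and `pmf_one_hot` are hypothetical, chosen only for this example.

```python
import math

def one_hot(i, k):
    """Encode label i in {1, ..., k} as a 1-of-K vector of dimension k."""
    return [1 if j == i else 0 for j in range(1, k + 1)]

def pmf_one_hot(x, p):
    """f(x | p) as the product of p_i ** x_i for a 1-of-K vector x."""
    # A valid x has exactly one element equal to 1 and the rest equal to 0.
    if len(x) != len(p) or sorted(x) != [0] * (len(x) - 1) + [1]:
        raise ValueError("x must be a 1-of-K vector of the same length as p")
    return math.prod(p_i ** x_i for p_i, x_i in zip(p, x))

p = [0.2, 0.5, 0.3]   # example parameters (assumed values)
x = one_hot(2, 3)     # category 2 chosen: [0, 1, 0]
assert math.isclose(pmf_one_hot(x, p), p[1])
```

Only the factor whose exponent x_i equals 1 differs from 1, so the product reduces to the probability of the chosen category, in agreement with the first formulation.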
