In mathematics and statistics, a probability vector or stochastic vector is a vector with non-negative entries that add up to one.

Underlying every probability vector is an experiment that can produce an outcome. To connect this experiment with mathematics, one introduces a discrete random variable, a function that assigns a numerical value to each possible outcome. For example, if the experiment consists of rolling a single die, the random variable may be defined as the number of pips on the upward face. The possible values of this random variable are the integers 1, 2, …, 6. The associated probability vector then has six components, each representing the probability of obtaining the corresponding outcome. More generally, a probability vector of length n represents the distribution of probabilities across the n possible numerical outcomes of a discrete random variable.

The vector gives us the probability mass function of that random variable, which is the standard way of characterizing a discrete probability distribution.[1]
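
The die-roll example above can be sketched in Python: the probability mass function of a fair die is a vector of six equal entries. The helper name `is_probability_vector` is illustrative, not a standard library function.

```python
# Minimal sketch: a probability vector has non-negative entries summing to one.
# is_probability_vector is a hypothetical helper, not a library function.

def is_probability_vector(p, tol=1e-9):
    """Check non-negative entries that sum to one (up to floating-point tolerance)."""
    return all(x >= 0 for x in p) and abs(sum(p) - 1.0) < tol

# Fair six-sided die: each outcome 1..6 has probability 1/6.
die_pmf = [1 / 6] * 6

print(is_probability_vector(die_pmf))     # True
print(is_probability_vector([0.5, 0.6]))  # sums to 1.1 -> False
```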

Examples

Here are some examples of probability vectors. The vectors can be either columns or rows.

  •  $x_0 = \begin{bmatrix} 0.5 \\ 0.25 \\ 0.25 \end{bmatrix}$
  •  $x_1 = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}$
  •  $x_2 = \begin{bmatrix} 0.65 \\ 0.35 \end{bmatrix}$
  •  $x_3 = \begin{bmatrix} 0.3 & 0.5 & 0.07 & 0.1 & 0.03 \end{bmatrix}$

Geometric interpretation

Writing out the vector components of a vector $p$ as

$$p = \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_n \end{bmatrix} \quad \text{or} \quad p = \begin{bmatrix} p_1 & p_2 & \cdots & p_n \end{bmatrix}$$

the vector components must sum to one:

$$\sum_{i=1}^{n} p_i = 1$$

Each individual component must have a probability between zero and one:

$$0 \le p_i \le 1$$

for all $i$. Therefore, the set of stochastic vectors coincides with the standard $(n-1)$-simplex. It is a point if $n = 1$, a segment if $n = 2$, a (filled) triangle if $n = 3$, a (filled) tetrahedron if $n = 4$, etc.
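
One consequence of the simplex picture is convexity: any convex combination of two probability vectors is again a probability vector. A minimal sketch (the helper name `mix` is illustrative):

```python
# Sketch: the simplex is convex, so mixing two probability vectors
# with weight t in [0, 1] yields another probability vector.
# mix is a hypothetical helper name.

def mix(p, q, t):
    """Convex combination t*p + (1-t)*q of two equal-length vectors."""
    return [t * a + (1 - t) * b for a, b in zip(p, q)]

p = [1.0, 0.0, 0.0]  # a vertex of the 2-simplex (a filled triangle)
q = [1 / 3, 1 / 3, 1 / 3]  # the centre of the 2-simplex
r = mix(p, q, 0.5)

print(r)                        # midpoint between the vertex and the centre
print(abs(sum(r) - 1) < 1e-9)   # True: still sums to one
```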

Properties

  • The mean of the components of any probability vector is $1/n$.
  • The shortest probability vector has the value $1/n$ as each component of the vector, and has a length of $1/\sqrt{n}$.
  • The longest probability vector has the value 1 in a single component and 0 in all others, and has a length of 1.
  • The shortest vector corresponds to maximum uncertainty, the longest to maximum certainty.
  • The length of a probability vector is equal to $\sqrt{n\sigma^2 + 1/n}$, where $\sigma^2$ is the variance of the elements of the probability vector.
  • The bounds on the variance are $0 \le \sigma^2 \le (n-1)/n^2$.
  • The derivative with respect to $n$ of the maximum variance is $\frac{d}{dn}\,\frac{n-1}{n^2} = \frac{2-n}{n^3}$.
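
The length–variance identity and the variance bound in the list above can be checked numerically. A small sketch (the helper names `length` and `variance` are illustrative, not library functions):

```python
import math

def length(p):
    """Euclidean length of the vector."""
    return math.sqrt(sum(x * x for x in p))

def variance(p):
    """Population variance of the components."""
    n = len(p)
    mean = sum(p) / n  # always 1/n for a probability vector
    return sum((x - mean) ** 2 for x in p) / n

p = [0.1, 0.2, 0.3, 0.4]
n = len(p)

# Length equals sqrt(n * sigma^2 + 1/n).
print(abs(length(p) - math.sqrt(n * variance(p) + 1 / n)) < 1e-9)  # True

# Variance lies within its bounds 0 <= sigma^2 <= (n - 1) / n^2.
print(0 <= variance(p) <= (n - 1) / n ** 2)  # True
```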

Significance of the bounds on variance

edit

Many natural experiments have a large number of possible outcomes, but the bounds on variance show that as n increases the maximum attainable variance decreases toward 0. A small variance means the components are nearly equal, which corresponds to high uncertainty. As a result, probability vectors become less informative for large n, which motivates the common practice of binning outcomes to reduce the effective number of categories. While binning discards information about the original fine-grained outcomes, it reveals the coarser structure that is otherwise obscured.
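
Binning can be sketched as summing the probabilities of the outcomes merged into each bin. The 2-to-1 grouping below is an arbitrary illustrative choice, and `bin_pairs` is a hypothetical helper:

```python
# Sketch: binning a fine-grained probability vector into coarser categories.
# bin_pairs is a hypothetical helper; the pairwise grouping is illustrative.

def bin_pairs(p):
    """Merge adjacent pairs of outcomes by summing their probabilities."""
    return [p[i] + p[i + 1] for i in range(0, len(p), 2)]

fine = [1 / 6] * 6        # fair die: six outcomes
coarse = bin_pairs(fine)  # three bins: {1,2}, {3,4}, {5,6}

print(coarse)                        # three roughly equal probabilities
print(abs(sum(coarse) - 1) < 1e-9)   # True: still a probability vector
```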

This notion of uncertainty is closely related to entropy as used in information theory and to entropy in statistical mechanics.

References

  1. ^ Jacobs, Konrad (1992), Discrete Stochastics, Basler Lehrbücher [Basel Textbooks], vol. 3, Birkhäuser Verlag, Basel, p. 45, doi:10.1007/978-3-0348-8645-1, ISBN 3-7643-2591-7, MR 1139766.