Data.Random.Distribution
- class Distribution d t where
- class Distribution d t => CDF d t where
Documentation
class Distribution d t where
A Distribution is a data representation of a random variable's probability
structure. For example, in Data.Random.Distribution.Normal, the Normal
distribution is defined as:
data Normal a
= StdNormal
| Normal a a
where the two parameters of the Normal data constructor are the mean and
standard deviation of the random variable, respectively. To make use of
the Normal type, one can convert it to an RVar and manipulate it, or
sample it directly:
x <- sample (rvar (Normal 10 2))
x <- sample (Normal 10 2)
A Distribution is typically more transparent than an RVar
but less composable (precisely because of that transparency). There are
several practical uses for types implementing Distribution:
- Typically, a Distribution will expose several parameters of a standard mathematical model of a probability distribution, such as the mean and standard deviation for the normal distribution. Distributions can therefore be manipulated analytically using mathematical insights about the models they represent. For example, a collection of Bernoulli variables could be simplified into a (hopefully) smaller collection of binomial variables.
- Because they are generally just containers for parameters, they can be easily serialized to persistent storage or read from user-supplied configurations (e.g., initialization data for a simulation).
- If a type additionally implements the CDF subclass, which extends Distribution with a cumulative distribution function, an arbitrary random variable x can be tested against the distribution by testing fmap (cdf dist) x for uniformity.
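The first two uses can be sketched with plain parameter containers. The types Bernoulli and Binomial and the function collapse below are hypothetical stand-ins written for this illustration, not the definitions shipped with the library. The sketch relies only on the fact that the sum of n independent Bernoulli(p) variables is Binomial(n, p), and on derived Show and Read instances for serialization.

```haskell
import Data.List (group, sort)

-- Hypothetical parameter-only types (not the library's own definitions).
newtype Bernoulli = Bernoulli Double      -- success probability p
  deriving (Show, Read, Eq)

data Binomial = Binomial Int Double       -- number of trials n, success probability p
  deriving (Show, Read, Eq)

-- Replace a collection of independent Bernoulli variables by one Binomial
-- per distinct success probability; the sum of the collapsed variables has
-- the same distribution as the sum of the originals.
collapse :: [Bernoulli] -> [Binomial]
collapse bs =
  [ Binomial (length grp) p
  | grp@(p : _) <- group (sort [ q | Bernoulli q <- bs ]) ]

-- Because the values are just parameters, derived Show/Read is enough to
-- write them to a configuration file and read them back.
roundTrip :: Binomial -> Binomial
roundTrip = read . show
```

Here collapse [Bernoulli 0.5, Bernoulli 0.2, Bernoulli 0.5] yields one Binomial with a single trial at 0.2 and one with two trials at 0.5. An RVar offers no comparable way to inspect or rewrite its parameters.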
On the other hand, most Distributions will not be closed under all the
same operations as RVar (which, being a monad, has a fully Turing-complete
internal computational model). The sum of two uniformly-distributed
variables, for example, is not uniformly distributed. To support general
composition, the Distribution class defines a function rvar to
construct the more-abstract and more-composable RVar representation
of a random variable.
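The closure problem can be checked by hand on a small discrete case. The following is a minimal sketch that uses no library types: PMF, uniformOn, combine and isUniform are names invented for this illustration, and a finite list of (value, probability) pairs stands in for a distribution.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- A finite distribution as (value, probability) pairs; illustration only.
type PMF a = [(a, Rational)]

-- Uniform distribution over a non-empty list of distinct values.
uniformOn :: [a] -> PMF a
uniformOn xs = [ (x, 1 / fromIntegral (length xs)) | x <- xs ]

-- Distribution of f x y for independent x and y, merging equal outcomes.
combine :: Ord c => (a -> b -> c) -> PMF a -> PMF b -> PMF c
combine f px py =
  [ (v, sum (map snd grp))
  | grp@((v, _) : _) <- groupBy ((==) `on` fst) (sortOn fst raw) ]
  where
    raw = [ (f x y, p * q) | (x, p) <- px, (y, q) <- py ]

-- True when every listed outcome has the same probability.
isUniform :: PMF a -> Bool
isUniform pmf = case map snd pmf of
  []       -> True
  (p : ps) -> all (== p) ps

die :: PMF Int
die = uniformOn [1 .. 6]

twoDice :: PMF Int
twoDice = combine (+) die die
```

isUniform die holds, but isUniform twoDice does not: a total of 7 has probability 1/6 while a total of 2 has probability 1/36. A type that can only describe uniform distributions therefore cannot represent the sum, whereas an RVar built from two uniform draws can.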
Methods
rvar :: d t -> RVar t
Return a random variable with this distribution.
class Distribution d t => CDF d t where
Methods
cdf :: d t -> t -> Double
Return the cumulative distribution function of this distribution.
That is, a function taking x :: t to the probability that the next
sample will return a value less than or equal to x, according to some
order or partial order (not necessarily an obvious one).
In the case where t is an instance of Ord, cdf should correspond
to the CDF with respect to that order.
In other cases, cdf is only required to satisfy the following law:
fmap (cdf d) (rvar d)
must be uniformly distributed over (0,1). Inclusion of either endpoint is optional,
though the preferred range is (0,1].
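As a rough illustration of this law, consider an exponential distribution sampled by inverse transform. The sketch below does not use the library's classes: Exponential, cdfExp, fromUniform, grid and maxError are hypothetical names, and an evenly spaced grid stands in for uniform draws. If the CDF and the sampler agree, applying the CDF to each sample recovers the uniform input, so the CDF values of the samples are themselves uniformly spread.

```haskell
-- Hypothetical stand-in for a distribution type (not the library's own).
newtype Exponential = Exponential Double   -- rate parameter lambda

-- CDF of the exponential distribution: 1 - exp (-lambda * x) for x > 0.
cdfExp :: Exponential -> Double -> Double
cdfExp (Exponential lambda) x
  | x <= 0    = 0
  | otherwise = 1 - exp (negate (lambda * x))

-- Inverse-transform sampler: maps a uniform u in [0,1) to a sample.
fromUniform :: Exponential -> Double -> Double
fromUniform (Exponential lambda) u = negate (log (1 - u)) / lambda

-- Evenly spaced stand-ins for uniform draws on (0,1).
grid :: [Double]
grid = [ (fromIntegral k + 0.5) / 100 | k <- [0 .. 99 :: Int] ]

-- Largest gap between each u and the CDF of the sample generated from u.
-- A value near zero means the CDF values of the samples are uniform.
maxError :: Exponential -> Double
maxError d = maximum [ abs (cdfExp d (fromUniform d u) - u) | u <- grid ]
```

With a real sampler one would instead draw many values, apply the CDF, and run a statistical uniformity test on the results; the grid version only shows why such a test is expected to pass.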
Note that this definition requires that cdf for a product type
should _not_ be a joint CDF as commonly defined, as that definition
violates both conditions.
Instead, it should be a univariate CDF over the product type. That is,
it should represent the CDF with respect to the lexicographic order
of the product.
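A small discrete case shows the difference. The sketch below takes two independent fair dice as the product type (Int, Int); outcomes, lexCdf, jointCdf and steps are names made up for this illustration. It relies on the derived Ord instance for pairs, which is lexicographic.

```haskell
import Data.List (sort)

-- All 36 equally likely outcomes of two independent fair dice.
outcomes :: [(Int, Int)]
outcomes = [ (a, b) | a <- [1 .. 6], b <- [1 .. 6] ]

-- CDF with respect to the lexicographic order on pairs:
-- the probability that a sample is <= x under the Ord instance for tuples.
lexCdf :: (Int, Int) -> Rational
lexCdf x = fromIntegral (length (filter (<= x) outcomes)) / 36

-- Joint CDF as commonly defined: P(A <= a and B <= b).
jointCdf :: (Int, Int) -> Rational
jointCdf (a, b) =
  fromIntegral (length [ () | (a', b') <- outcomes, a' <= a, b' <= b ]) / 36

-- The evenly spaced values 1/36, 2/36, ..., 36/36.
steps :: [Rational]
steps = [ fromIntegral k / 36 | k <- [1 .. 36 :: Int] ]
```

sort (map lexCdf outcomes) equals steps, so the CDF value of a sample is spread evenly over (0,1], which is the discrete analogue of the uniformity law. The joint CDF fails this check: for instance, jointCdf (1,2) and jointCdf (2,1) are both 2/36, so its values are not evenly spread.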
The present specification is probably only really useful for testing conformance of a variable to its target distribution, and I am open to suggestions for more-useful specifications (especially with regard to the interaction with product types).