NumPy var() Function



The NumPy var() function computes the variance of array elements, a measure of how far the values spread around their mean. By default, the variance is computed over the flattened array; otherwise, it is computed along the specified axis.

In statistics, the variance is the average of the squared deviations from the mean. With the default settings, var() computes var = sum((x_i - mean)^2) / N, where x_i is each data point, mean is the mean of the data, and N is the number of data points.
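The formula above can be checked step by step against np.var(). The following is a minimal sketch; the sample values are hypothetical and chosen only so that the arithmetic is easy to follow −

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Step 1: arithmetic mean of the data
mean = data.sum() / data.size

# Step 2: squared deviation of every point from the mean
squared_dev = (data - mean) ** 2

# Step 3: average the squared deviations (divide by N)
manual_var = squared_dev.sum() / data.size

print("Manual variance:", manual_var)
print("np.var result:", np.var(data))
```

Both lines should report the same value, since np.var() with its default arguments divides by N.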

If no axis is given, the variance is computed over all elements, whatever the number of dimensions of the input. For multi-dimensional arrays, passing axis computes the variance along that axis only.

Syntax

Following is the syntax of the NumPy var() function −

numpy.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>, where=<no value>, mean=<no value>, correction=<no value>)

Parameters

Following are the parameters of the NumPy var() function −

  • a: Input array or object that can be converted to an array. It can be a NumPy array, list, or a scalar value.
  • axis (optional): Axis or axes along which the variance is computed. Default is None, which means the variance is computed over the entire array.
  • dtype (optional): Data type to use in computing the variance. If None, float64 is used for integer inputs; for floating-point inputs, it is the same as the input array's type.
  • out (optional): A location into which the result is stored. If provided, it must have the same shape as the expected output.
  • ddof (optional): Delta Degrees of Freedom. The divisor used in the calculation is N - ddof, where N is the number of elements. Default is 0.
  • keepdims (optional): If True, the reduced dimensions are retained as dimensions of size one in the output. Default is False.
  • where (optional): A boolean array specifying the elements to include in the calculation.
  • mean (optional): Provides the mean to prevent its re-calculation. The shape of the mean should match as if calculated with keepdims=True.
  • correction (optional): Array API compatible name for the ddof parameter. Only one of ddof and correction can be provided at the same time.
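Some of the parameters listed above are easier to understand by looking at the result they produce. The following is a small sketch of keepdims, where, and the default dtype behaviour; the array values and the mask condition are assumptions made for illustration −

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# keepdims=True keeps the reduced axis as a dimension of size one,
# so the result still broadcasts against the original array
row_var = np.var(x, axis=1, keepdims=True)
print("Shape with keepdims:", row_var.shape)  # (2, 1)

# where: only elements whose mask entry is True enter the calculation
mask = x > 1.0
masked_var = np.var(x, where=mask)
print("Variance of elements > 1:", masked_var)

# dtype=None with integer input: the computation is done in float64
int_var = np.var(np.array([1, 2, 3, 4]))
print("Result dtype for integer input:", int_var.dtype)
```

The mean and correction parameters are left out of this sketch because they are only available in recent NumPy releases.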

Return Values

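This function returns the variance of the input array. If axis is None, the result is a scalar, regardless of the number of dimensions of the input. If an axis is specified, the result is an array with that axis removed (or kept with size one when keepdims=True). If out is provided, a reference to that array is returned.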
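Whether the return value is a scalar or an array depends on the axis argument. The following short sketch, using an arbitrary 2 x 3 array, shows both cases −

```python
import numpy as np

m = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

# axis=None (the default): every element is used, result is a scalar
total = np.var(m)

# axis=0: one variance per column, result is a 1-D array
per_col = np.var(m, axis=0)

print("Dimensions of np.var(m):", np.ndim(total))
print("Shape of np.var(m, axis=0):", per_col.shape)
```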

Example

Following is a basic example to compute the variance of an array using the NumPy var() function −

import numpy as np
# input array
x = np.array([1, 2, 3, 4, 5])
# applying var
result = np.var(x)
print("Variance Result:", result)

Output

Following is the output of the above code −

Variance Result: 2.0

Example: Specifying an Axis

The var() function can compute the variance along a specific axis of a multi-dimensional array. In the following example, we have computed the variance along axis 0 (one value per column) and axis 1 (one value per row) of a 2D array −

import numpy as np
# 2D array
x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# applying var along axis 0 (columns)
result_axis0 = np.var(x, axis=0)
# applying var along axis 1 (rows)
result_axis1 = np.var(x, axis=1)
print("Variance along axis 0:", result_axis0)
print("Variance along axis 1:", result_axis1)

Output

Following is the output of the above code −

Variance along axis 0: [6. 6. 6.]
Variance along axis 1: [0.66666667 0.66666667 0.66666667]

Example: Usage of 'ddof' Parameter

The ddof (Delta Degrees of Freedom) parameter adjusts the divisor used in the variance calculation, which is N - ddof. By default, ddof=0, which gives the population variance. Setting ddof=1 divides by N - 1 instead and gives the unbiased sample variance. In the following example, we have computed the variance with ddof=1 −

import numpy as np
# input array
x = np.array([1, 2, 3, 4, 5])
# applying var with ddof=1
result = np.var(x, ddof=1)
print("Variance with ddof=1:", result)

Output

Following is the output of the above code −

Variance with ddof=1: 2.5
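To see what the two ddof settings correspond to, the results can be compared with Python's standard statistics module, where pvariance() is the population variance and variance() is the sample variance. This is a sketch for comparison only; it reuses the values from the example above −

```python
import statistics
import numpy as np

x = [1, 2, 3, 4, 5]

# ddof=0 (default): divisor is N, the population variance
pop_var = np.var(x)

# ddof=1: divisor is N - 1, the sample variance
sample_var = np.var(x, ddof=1)

print("ddof=0:", pop_var, "| statistics.pvariance:", statistics.pvariance(x))
print("ddof=1:", sample_var, "| statistics.variance:", statistics.variance(x))
```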

Example: Plotting 'var()' Function

In the following example, we have computed the variance of 100 evenly spaced values between 0 and 10 and plotted it as a horizontal line using the NumPy and matplotlib.pyplot modules. Because var() reduces the whole array to a single value, the plotted line is constant −

import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 100)
y = np.var(x)
plt.plot(x, np.full_like(x, y), label="Variance")
plt.title("Variance Function")
plt.xlabel("Input")
plt.ylabel("Variance Value")
plt.legend()
plt.grid()
plt.show()

Output

The plot shows a horizontal line at about 8.50, the variance of x, because var() returns a single value for the whole array −

Variance Visualization