## statistics -- basic statistical measurements
The module contains functions for computing basic statistical measures of data samples.
### Index
namespace [stat](#stat)
Functions:
- [mean](#mean)(invar _data_: array<@T<int|float>>) => float
- [variance](#variance)(invar _data_: array<@T<int|float>>, _kind_: enum<sample,population> = $sample) => float
- [variance](#variance)(invar _data_: array<@T<int|float>>, _mean_: float, _kind_: enum<sample,population> = $sample) => float
- [median](#median)(_data_: array<@T<int|float>>) => float
- [percentile](#percentile)(_data_: array<@T<int|float>>, _percentage_: float) => float
- [mode](#mode)(invar _data_: array<@T<int|float>>) => @T
- [range](#range)(invar _data_: array<@T<int|float>>) => tuple<min: @T, max: @T>
- [distribution](#distribution1)(invar _data_: array<@T<int|float>>) => map<@T,int>
- [distribution](#distribution2)(invar _data_: array<@T<int|float>>, _interval_: float, _start_ = 0.0) => map<int,int>
- [correlation](#correlation)(invar _data1_: array<@T<int|float>>, invar _data2_: array<@T>, _coefficient_: enum<pearson,spearman>) => float
- [correlation](#correlation)(invar _data1_: array<@T<int|float>>, invar _data2_: array<@T>, _coefficient_: enum<pearson,spearman>, _mean1_: float, _mean2_: float) => float
- [skewness](#skewness)(invar _data_: array<@T<int|float>>) => float
- [skewness](#skewness)(invar _data_: array<@T<int|float>>, _mean_: float) => float
- [kurtosis](#kurtosis)(invar _data_: array<@T<int|float>>) => float
- [kurtosis](#kurtosis)(invar _data_: array<@T<int|float>>, _mean_: float) => float
### Functions
```ruby
mean(invar data: array<@T>) => float
```
Returns arithmetic mean of *data*.
E[X] = Σx / N
**Errors:** `Value` when *data* is empty
```ruby
variance(invar data: array<@T>, kind: enum = $sample) => float
variance(invar data: array<@T>, mean: float, kind: enum = $sample) => float
```
Returns variance of *data* (measure of spread) of the given *kind*. Uses *mean* if it is given.
Sample: σ²[X] = Σ(x - E[X])² / (N - 1)
Population: σ²[X] = Σ(x - E[X])² / N
**Errors:** `Value` when *data* is empty or contains a single item
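The two divisors can be compared with a minimal Python sketch. It is an illustration of the formulas only, not the module's implementation; the `kind` strings stand in for the `$sample` and `$population` enum values, and the `ValueError` mirrors the documented `Value` error.

```python
def variance(data, mean=None, kind="sample"):
    """Sum of squared deviations divided by N - 1 (sample) or N (population)."""
    n = len(data)
    if n < 2:
        # Mirrors the documented error for empty or single-item data.
        raise ValueError("data must contain at least two items")
    if mean is None:
        mean = sum(data) / n  # E[X], computed only when not supplied
    squared = sum((x - mean) ** 2 for x in data)
    divisor = n - 1 if kind == "sample" else n
    return squared / divisor


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data))                     # sample variance
print(variance(data, kind="population"))  # population variance
print(variance(data, mean=5.0))           # reuses a precomputed mean
```

Passing a precomputed mean avoids a second pass over the data when the mean is already known.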
```ruby
median(data: array<@T>) => float
```
Returns median (middle value) of *data* while partially sorting it. If the size of *data* is even, the mean of the two middle values is returned.
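The odd and even cases can be sketched in Python as follows. This is a hedged illustration: it fully sorts a copy instead of partially sorting *data* in place, and the `ValueError` for empty input is an assumption, since no error is documented for `median`.

```python
def median(data):
    """Middle value, or the mean of the two middle values for even sizes."""
    if not data:
        # Assumption: the documentation lists no error for empty data.
        raise ValueError("data is empty")
    ordered = sorted(data)  # works on a copy; the input stays untouched
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return float(ordered[mid])
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([5, 1, 3]))     # odd size: the single middle value
print(median([4, 1, 3, 2]))  # even size: mean of the two middle values
```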
```ruby
percentile(data: array<@T>, percentage: float) => float
```
Returns percentile *percentage* (the value below which the given percentage of sample values fall) of *data* while partially sorting it. *percentage* must be in the range (0; 100).
**Errors:** `Value` when *data* is empty, `Param` when *percentage* is invalid
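The documentation does not say how the percentile is located between sample values, so the following Python sketch assumes the nearest-rank method; the module may use a different rule (for example, interpolation). It sorts a copy fully, and a single `ValueError` stands in for both the `Value` and `Param` errors.

```python
import math


def percentile(data, percentage):
    """Nearest-rank percentile (an assumed method, see the note above)."""
    if not data:
        raise ValueError("data is empty")
    if not 0 < percentage < 100:
        raise ValueError("percentage must be in the range (0; 100)")
    ordered = sorted(data)
    # Smallest rank whose share of the sample reaches the percentage.
    rank = math.ceil(percentage * len(ordered) / 100)
    return float(ordered[rank - 1])


sample = [15, 20, 35, 40, 50]
print(percentile(sample, 40))
print(percentile(sample, 50))
```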
```ruby
mode(invar data: array<@T>) => @T
```
Returns mode (most common value) of *data*
**Errors:** `Param` when *data* is empty
```ruby
range(invar data: array<@T>) => tuple
```
Returns minimum and maximum value in *data*
**Errors:** `Value` when *data* is empty
```ruby
distribution(invar data: array<@T>) => map<@T,int>
```
Returns distribution of values in *data* in the form `value` => `frequency`, where `value` is a single unique value and `frequency` is the number of its appearances in *data*
**Errors:** `Value` when *data* is empty
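As a rough illustration of the `value` => `frequency` form, the same counting can be sketched in Python with `collections.Counter`. This is not the module's code; the `ValueError` mirrors the documented `Value` error.

```python
from collections import Counter


def distribution(data):
    """Map each unique value to the number of times it occurs."""
    if not data:
        raise ValueError("data is empty")
    return dict(Counter(data))


print(distribution([1, 2, 2, 3, 3, 3]))
```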
```ruby
distribution(invar data: array<@T>, interval: float, start = 0.0) => map
```
Returns values of *data* grouped into ranges of width *interval* starting from *start*. The result is in the form `index` => `frequency` and covers only the ranges present in the sample. `index` identifies the range: it is the whole number of intervals of width *interval* between *start* and the beginning of that range, so the range boundaries are [*start* + `index` × *interval*; *start* + (`index` + 1) × *interval*). `frequency` is the number of values which fall in the range. Values less than *start* are not included in the resulting statistics.
**Errors:** `Param` when *data* is empty or *interval* is zero
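The grouping can be sketched in Python under two assumptions: *interval* is positive, and the range with index k covers [start + k × interval; start + (k + 1) × interval), which follows from the description of `index` as a whole number of intervals counted from *start*. The name `binned_distribution` is hypothetical, and `ValueError` stands in for the `Param` error.

```python
import math


def binned_distribution(data, interval, start=0.0):
    """Count values per range of width `interval`, skipping values below `start`."""
    if not data or interval == 0:
        raise ValueError("data is empty or interval is zero")
    counts = {}
    for x in data:
        if x < start:
            continue  # values below `start` are left out of the result
        index = math.floor((x - start) / interval)
        counts[index] = counts.get(index, 0) + 1
    return counts


# Ranges of width 5 starting at 10: [10; 15), [15; 20), [20; 25), [25; 30)
print(binned_distribution([10, 14, 15, 27, 3], 5.0, 10.0))
```

Only ranges that contain at least one value appear as keys, so gaps in the indices correspond to empty ranges.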
```ruby
correlation(invar data1: array<@T>, invar data2: array<@T>, coefficient: enum) => float
correlation(invar data1: array<@T>, invar data2: array<@T>, coefficient: enum, mean1: float, mean2: float) => float
```
Returns correlation *coefficient* between *data1* and *data2*. Pearson coefficient measures linear dependence, Spearman's rank coefficient measures monotonic dependence.
If *mean1* and *mean2* are given, they are used for calculating Pearson coefficient.
**Note:** *data1* and *data2* must be of equal size
Pearson: r[X,Y] = E[(X - E[X])(Y - E[Y])] / (σ[X]σ[Y])
Spearman's rank: ρ[X,Y] = r(Xrank, Yrank)
**Errors:** `Value` when *data1* or *data2* is empty or when they differ in size
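The relation between the two coefficients can be sketched in Python: Spearman's coefficient is Pearson's coefficient applied to ranks. This is an illustration, not the module's implementation. The helper names `pearson`, `ranks` and `spearman` are hypothetical, and giving tied values the average of their ranks is an assumption, since the documentation does not describe tie handling.

```python
import math


def pearson(xs, ys, mean_x=None, mean_y=None):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("samples are empty or differ in size")
    if mean_x is None:
        mean_x = sum(xs) / n
    if mean_y is None:
        mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/N factors of the covariance and the deviations cancel out.
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average rank (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Pearson coefficient of the rank-transformed samples."""
    return pearson(ranks(xs), ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]   # monotonic but not linear in xs
print(pearson(xs, ys))   # below 1: the dependence is not linear
print(spearman(xs, ys))  # 1: the dependence is perfectly monotonic
```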
```ruby
skewness(invar data: array<@T>) => float
skewness(invar data: array<@T>, mean: float) => float
```
Returns skewness (measure of asymmetry) of *data*. Uses *mean* if it is given.
γ1[X] = E[((x - E[X]) / σ)^3]
**Errors:** `Value` when *data* is empty
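A hedged Python sketch of the skewness formula follows. It assumes σ is the population standard deviation (divisor N); the documentation does not say which variant the module uses, so results may differ from the module's by a constant factor.

```python
import math


def skewness(data, mean=None):
    """Mean of the cubed standardized deviations."""
    n = len(data)
    if n == 0:
        raise ValueError("data is empty")
    if mean is None:
        mean = sum(data) / n
    # Assumption: population standard deviation; constant data gives sigma = 0
    # and a ZeroDivisionError in this sketch.
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return sum(((x - mean) / sigma) ** 3 for x in data) / n


print(skewness([1, 2, 3, 4, 5]))  # symmetric sample
print(skewness([1, 1, 1, 5]))     # long right tail gives a positive value
```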
```ruby
kurtosis(invar data: array<@T>) => float
kurtosis(invar data: array<@T>, mean: float) => float
```
Returns kurtosis (measure of "peakedness") of *data*. Uses *mean* if it is given.
γ2[X] = E[((x - E[X]) / σ)^4] - 3
**Errors:** `Value` when *data* is empty
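The formula above subtracts 3, which makes it excess kurtosis: a normal distribution scores 0. A minimal Python sketch, again assuming σ is the population standard deviation (an assumption not confirmed by the documentation):

```python
import math


def kurtosis(data, mean=None):
    """Mean of the fourth power of standardized deviations, minus 3."""
    n = len(data)
    if n == 0:
        raise ValueError("data is empty")
    if mean is None:
        mean = sum(data) / n
    # Assumption: population standard deviation (divisor N).
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return sum(((x - mean) / sigma) ** 4 for x in data) / n - 3


print(kurtosis([-1, 1]))       # two-point sample: the flattest possible shape
print(kurtosis([1, 1, 1, 5]))
```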