Measures of Variation

Example:

Consider two data sets:

A: 1, 4, 5, 6, 9

B: 5, 5, 5, 5, 5

The mean is $\bar{x} = \frac{\sum x}{n}$, so $\bar{x}_A = \bar{x}_B = 5$.

<aside>

Here, the two data sets are different but have the same mean, so the mean alone is not enough to describe the data and reach a conclusion.

</aside>

1. Range

The range of a data set is the width of the narrowest interval that contains all the data.

For the previous example:

A: 1, 4, 5, 6, 9

B: 5, 5, 5, 5, 5

The mean is $\bar{x} = \frac{\sum x}{n}$, so $\bar{x}_A = \bar{x}_B = 5$.

The range is Max - Min, so $Range_A = 8$ and $Range_B = 0$.

<aside>

Both data sets have the same mean but different ranges.

</aside>

However, the range does not always distinguish two data sets. Consider the following example:

A: 2, 4, 6, 8

B: 2, 2, 8, 8

The mean is $\bar{x} = \frac{\sum x}{n}$, so $\bar{x}_A = \bar{x}_B = 5$.

The range is Max - Min, so $Range_A = Range_B = 6$.

<aside>

Both data sets have the same mean and the same range, even though their values are spread differently around the mean.

</aside>
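The two comparisons above can be reproduced with a few lines of Python. This is only a minimal sketch: `data_range`, `first_pair`, and `second_pair` are illustrative names chosen here, not part of the original notes.

```python
from statistics import mean


def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


# Data sets from the two examples above (names are illustrative).
first_pair = {"A": [1, 4, 5, 6, 9], "B": [5, 5, 5, 5, 5]}
second_pair = {"A": [2, 4, 6, 8], "B": [2, 2, 8, 8]}

for label, pair in (("first example", first_pair), ("second example", second_pair)):
    for name, values in pair.items():
        print(f"{label}, set {name}: mean = {mean(values)}, range = {data_range(values)}")
```

In the first example the ranges differ, while in the second example both the mean and the range agree, so neither measure separates the two data sets there.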

2. Variance

The variance measures the dispersion of data with respect to the mean.

       **In the second figure, the data is more scattered about the mean than in the first figure**

<aside>

The calculator can compute both the variance and the standard deviation. Please see the video linked below to learn how to use it.

</aside>

https://www.youtube.com/watch?v=AD_e7qW_Qq0
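As an alternative to the calculator, the variance and standard deviation can be checked in Python. The sketch below assumes the usual definitions, which these notes do not spell out: the population variance divides the sum of squared deviations from the mean by $n$, and the sample variance divides it by $n - 1$. It reuses the data sets A: 2, 4, 6, 8 and B: 2, 2, 8, 8 from the range example; the name `data_sets` is illustrative.

```python
from statistics import pstdev, pvariance, stdev, variance

# Same mean (5) and same range (6), but different spread around the mean.
data_sets = {"A": [2, 4, 6, 8], "B": [2, 2, 8, 8]}

for name, values in data_sets.items():
    # Population versions divide by n; sample versions divide by n - 1.
    print(
        f"set {name}: "
        f"population variance = {pvariance(values)}, "
        f"population std dev = {pstdev(values):.3f}, "
        f"sample variance = {variance(values):.3f}, "
        f"sample std dev = {stdev(values):.3f}"
    )
```

Set B has the larger variance, which reflects that its values sit farther from the mean even though its mean and range match those of set A.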

Coefficient of Variation

The Coefficient of Variation (CV) is a statistical measure of the relative variability of data, often used to compare the degree of variation between datasets with different means.

$$

CV = \frac{\sigma}{\mu} \times 100\% \ \text{(population)} \qquad \text{or} \qquad CV = \frac{s}{\bar{x}} \times 100\% \ \text{(sample)}

$$

Example:

Imagine you have two performance metrics for a web service, and you want to compare how variable they are relative to their mean values. Using the calculator as shown in the video gives:

  1. API response time: $\bar{x} = 100$ ms and $s = 7.91$ ms
  2. Memory usage: $\bar{x} = 504$ MB and $s = 9.61$ MB

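Then calculate the coefficient of variation for each metric:

  1. $CV_{API}=\frac{7.91}{100} \times 100 = 7.91\%$
  2. $CV_{Memory}=\frac{9.61}{504} \times 100 = 1.91\%$

The API response time is therefore more variable relative to its mean than the memory usage.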
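The same calculation can be sketched in Python. The helper `coefficient_of_variation` is a hypothetical name introduced here for illustration; it simply divides the standard deviation by the mean and multiplies by 100.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Return the CV as a percentage: (standard deviation / mean) * 100."""
    if mean_value == 0:
        raise ValueError("the CV is undefined when the mean is zero")
    return std_dev / mean_value * 100


# Summary statistics taken from the example above.
cv_api = coefficient_of_variation(7.91, 100)     # API response time (ms)
cv_memory = coefficient_of_variation(9.61, 504)  # memory usage (MB)

print(f"API response time CV: {cv_api:.2f}%")
print(f"Memory usage CV: {cv_memory:.2f}%")
```

Because the CV is unitless, it allows a comparison between milliseconds and megabytes that the raw standard deviations do not.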

<aside>

Chebyshev’s Theorem

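It gives the minimum proportion of observations that fall within a specified number of standard deviations of the mean, regardless of the shape of the distribution: for any $k > 1$, at least $1 - \frac{1}{k^2}$ of the observations lie within $k$ standard deviations of the mean.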
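As a small illustration, Chebyshev's bound states that at least $1 - \frac{1}{k^2}$ of the observations lie within $k$ standard deviations of the mean for any $k > 1$. The function name `chebyshev_lower_bound` below is an illustrative choice, not something defined in these notes.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of the observations")
```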


The Empirical (Normal) Rule

It gives the approximate proportion of observations that fall within a specified number of standard deviations of the mean for normally distributed data: about 68% within 1, about 95% within 2, and about 99.7% within 3 standard deviations.
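The percentages usually quoted for the empirical rule (about 68%, 95%, and 99.7%) can be checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+); `proportion_within` is a hypothetical helper name.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {proportion_within(k):.2%}")
```

Unlike Chebyshev's bound, these values are approximate proportions that hold only when the data are normally distributed.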
