Measures of dispersion: base your decisions on true data and forget about existential doubts!

If you work in statistical analysis, you may have heard about measures of dispersion. This concept, which belongs to descriptive statistics, refers to the degree to which observations are spread out or separated from the mean.

Statistics is useful in many fields because it helps you better understand the information being analyzed and, based on that, design strategies that fit your objectives. For this reason, understanding what dispersion measures are for is key to using them correctly in different situations.

With this in mind, we have prepared this article that will allow you to learn more about one of the most relevant concepts in statistics: dispersion measures. We will explain in depth what dispersion measures are and what they are used for. In addition, we will tell you which are the main measures of dispersion and the use of each one of them.

Definition of dispersion measures

Dispersion measures consist of numbers that provide information about the variability of the data. That is, they are responsible for showing how close together or far apart the data in a distribution are. They are usually used in conjunction with measures of central tendency, such as the mean or median, to provide a general description of a data set.

As highlighted by Matemovil, “the values of dispersion measures let us know whether the data are tightly clustered, widely dispersed, or equal.”

When the dispersion measure has a small value, it means that the data are located close to the central position, while when it has a large value, it means that they are more separated or farther away from the center.

Thus, considering the above, we can define dispersion measures as statistical measures oriented to show how far or close the scores of a variable are to the mean or arithmetic average.

Characteristics of dispersion measures

Now that you are clearer about the concept of dispersion or variability measures, we will provide you with some of their most representative characteristics so that you do not miss any detail:

Dispersion measures indicate how spread out the data of a distribution are.

They show how close to or far from the mean the data are.

Measures of variability give you the possibility to know the homogeneity or heterogeneity of the data distributions.

Their application is easy and fast.

Their values are always positive, or zero when all the data are equal.

The use of dispersion measures can be applied in various fields, such as health, industry, business economics, etc.

What are dispersion measures for?

We know that the objective of measuring dispersion is to determine the degree of deviation that exists in the data and, therefore, the limits within which the data will vary in some measurable variable, attribute or quality. In that sense, measures of dispersion are of great importance and occupy a unique position in statistical methods.

In order for you to understand the usefulness of dispersion measures, let us look at their main applications:

1. They help to understand the data set

The most important use of dispersion measures is that they help to understand the distribution of data. As data become more diverse, the value of the dispersion measure increases.

Therefore, knowledge of dispersion is vital to understanding statistics. It helps you see how diverse the data are, how they are distributed, and how tightly they cluster around the central value.

In addition, measures of dispersion give you better insight into the distribution of data. For example, three different samples may have the same mean, median or range, but completely different levels of variability.
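To make this concrete, here is a minimal Python sketch. The three samples are made-up values chosen only for illustration (they do not come from any real data set): they share the same mean and median, yet their spread is very different.

```python
from statistics import mean, median, pstdev

# Three hypothetical samples with the same mean and median (50)
tight = [49, 50, 50, 50, 51]
medium = [40, 45, 50, 55, 60]
wide = [10, 30, 50, 70, 90]

for name, sample in [("tight", tight), ("medium", medium), ("wide", wide)]:
    # The central values match, but the population standard deviation grows
    print(name, mean(sample), median(sample), round(pstdev(sample), 2))
```

Looking only at the mean or median, the three samples would seem identical; the standard deviation is what tells them apart.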

2. They complement the information given by measures of central tendency

Dispersion measures are also called second-order averages, because they are obtained by averaging the deviations from a measure of central tendency, which is itself an average.

They indicate how well that central value represents the phenomena to which the original data refer. This increases the precision of statistical analysis and interpretation, so that you are in a position to draw more reliable inferences.

3. They make it possible to compare different groups

If the original data are expressed in different units, comparisons will not be possible. But with the help of relative dispersion measures, all these comparisons can be made easily. Accurate comparison between the variability of two series will lead to reliable results.

4. They serve as a useful check against erroneous conclusions when comparing data

The arithmetic mean may be the same for two different groups, but it will not reveal the prosperity of one group and the backwardness of another. This type of internal composition can be known through the application of dispersion measures.

Therefore, with the help of dispersion or variability measures, you will not conclude that both groups are similar. You can find that one group is prosperous and the other is lagging by knowing the amount of variability around the measures of central tendency.

Measures of dispersion are of great value in a statistical analysis as long as the coefficients of dispersion are put into practice. Otherwise, the conclusions drawn will be largely unreliable.

5. Control variability

Different measures of dispersion describe variability from different angles, and this knowledge can be useful in controlling variation. These measures can be especially useful in business financial analysis and in medicine.

They also provide the basis for further statistical analysis, such as calculating correlation, regression, hypothesis testing, etc.


Types of dispersion measures

Dispersion measures can be classified into two broad categories: absolute dispersion measures and relative dispersion measures. That said, let’s take an in-depth look at each of them. Take note!

Measures of absolute dispersion

Absolute dispersion measures are in charge of presenting how far apart or together the data are, as well as showing the variability as a function of the average of the observation deviations. All of this is supported by the measures mentioned below:

1. Range

The range is a measure of dispersion that refers to the difference between the extreme values of a set. That is, the subtraction between its maximum and minimum values.

$$R = X_{\max} - X_{\min}$$

Where:

Range: $R$

Maximum value of the sample: $X_{\max}$

Minimum value of the sample: $X_{\min}$

Characteristics

The range shows the distance between the maximum and the minimum value.

It is the simplest measure of dispersion.

It is easy to understand and calculate.

Its use is limited to forming a first impression of the data.

It only considers the extreme values, not those in between.
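As a quick illustration, the range can be computed in a couple of lines of Python. This is only a sketch: the function name `data_range` and the sample values are hypothetical.

```python
def data_range(values):
    """Return R = Xmax - Xmin for a non-empty sequence of numbers."""
    return max(values) - min(values)

scores = [4, 8, 15, 16, 23, 42]   # hypothetical sample
clustered = [4, 4, 4, 4, 4, 42]   # very different shape, same extremes

print(data_range(scores))     # 42 - 4
print(data_range(clustered))  # also 42 - 4
```

Both samples give the same range even though their intermediate values differ, which is why the range alone only gives a first impression.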


2. Mean deviation

This measure of dispersion is based on the differences between each value of the statistical variable and the arithmetic mean. Specifically, it is the mean of the absolute values of those deviations, which is expressed as follows:

$$D_m = \frac{1}{n}\left[\,|x_1 - A| + |x_2 - A| + \dots + |x_n - A|\,\right]$$

where $A$ is the arithmetic mean and $n$ is the number of observations.

Characteristics

The mean deviation uses all observations for the calculation.

It is more laborious to compute and harder to interpret than the range.

The calculation is time-consuming.
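The calculation becomes much less tedious with a few lines of code. The following is a minimal sketch of the mean deviation about the arithmetic mean; the function name and the sample values are assumptions made for illustration.

```python
from statistics import mean

def mean_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean A."""
    a = mean(values)
    return sum(abs(x - a) for x in values) / len(values)

sample = [2, 4, 6, 8, 10]  # hypothetical data, A = 6
print(mean_deviation(sample))  # (4 + 2 + 0 + 2 + 4) / 5
```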


3. Standard deviation

Another measure of dispersion is the standard (or typical) deviation. It is the square root of the arithmetic mean of the squared deviations from the mean. In short, it is the square root of the variance and is represented as follows:

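$$S = +\sqrt{\frac{\sum_i (x_i - A)^2\, n_i}{N}}$$

$$S = +\sqrt{S^2}$$

where $A$ is the arithmetic mean, $n_i$ is the frequency of the value $x_i$, and $N$ is the total number of observations.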

Characteristics

The standard deviation gives more weight to extreme deviations than to the rest, because each deviation is squared.

It is difficult to understand and calculate.

It is zero when all observations are equal.
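The definition above can be sketched in Python as follows. The example works on raw values (each observation has frequency 1) and divides by the total number of observations, which is the population form; the function name and data are hypothetical. The standard library's `statistics.pstdev` is used only as a cross-check.

```python
from math import sqrt
from statistics import pstdev

def population_std(values):
    """Square root of the mean of the squared deviations from the mean."""
    a = sum(values) / len(values)
    return sqrt(sum((x - a) ** 2 for x in values) / len(values))

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, mean = 5
print(population_std(sample))        # squared deviations sum to 32, 32 / 8 = 4
print(pstdev(sample))                # same result from the standard library
print(population_std([3, 3, 3]))     # all observations equal
```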


4. Variance

The last of the absolute dispersion measures is the variance. It represents the variability of a data set with respect to its arithmetic mean: it is the mean of the squared deviations from the mean of a statistical distribution, and is expressed as follows:

$$S^2 = \frac{\sum_i (x_i - A)^2\, n_i}{N}$$

Characteristics

A constant can be added to every score and the variance will remain unchanged.

The variance does not have negative values, only positive or zero.
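Both characteristics are easy to check with a short sketch. The data below are hypothetical, and `statistics.pvariance` computes the population variance (dividing by the total number of observations).

```python
from statistics import pvariance

sample = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical data
shifted = [x + 10 for x in sample]     # add the same constant to every score

print(pvariance(sample))    # variance of the original scores
print(pvariance(shifted))   # unchanged: shifting moves the mean, not the spread
print(pvariance([7, 7, 7])) # identical values give the minimum possible variance
```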

Measures of relative dispersion

Relative dispersion measures are used to compare the distribution of various samples. That is, they let you know how far apart or dispersed the scores are in the statistical distribution, regardless of the units in which they are expressed. To do this, they rely on the following measures of relative dispersion:

1. Coefficient of variation

This measure of relative dispersion provides information on the relative dispersion of a set of data with respect to the mean or arithmetic average and, in turn, the dispersion of the data among themselves.

Basically, it is used to compare the data set with respect to homogeneity or consistency. This is expressed as a percentage as follows:

$$CV = \frac{\sigma}{\bar{X}} \times 100$$

$\sigma$ = standard deviation

$\bar{X}$ = mean

Characteristics

The coefficient of variation is calculated as the quotient between the standard deviation and the arithmetic mean.

It is a dimensionless (unit-free) number.

Indicates the degree of variability of a data set.

It reveals the representativeness of the mean.
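Because the coefficient of variation has no units, it can compare variables measured on different scales. Here is a minimal sketch, assuming a non-zero mean and using the population standard deviation; the function name and both data sets are invented for illustration.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """CV = (standard deviation / mean) * 100, assuming a non-zero mean."""
    return pstdev(values) / mean(values) * 100

heights_cm = [160, 165, 170, 175, 180]  # hypothetical heights
weights_kg = [55, 60, 70, 80, 85]       # hypothetical weights

print(round(coefficient_of_variation(heights_cm), 2))
print(round(coefficient_of_variation(weights_kg), 2))
```

In this made-up example the weights are relatively more variable than the heights, a comparison that would not be meaningful using centimeters and kilograms directly.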


2. Coefficient of range

It is the measure of relative dispersion calculated as the ratio of the difference between the highest and lowest values in a data set to the sum of the highest and lowest values. This is the formula:

$$\frac{L - S}{L + S}$$

where L = largest value

S = smallest value
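A short sketch of this formula in Python, assuming the sum of the largest and smallest values is not zero (the function name and data are hypothetical):

```python
def coefficient_of_range(values):
    """(L - S) / (L + S), where L is the largest and S the smallest value."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)

prices = [10, 20, 30, 40, 50]        # hypothetical data
print(coefficient_of_range(prices))  # (50 - 10) / (50 + 10)
```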

3. Mean deviation coefficient

It can be defined as the ratio between the mean deviation and the value of the central point from which it is calculated. This measure of relative dispersion is represented as follows:

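Mean deviation using the mean $M$: $\frac{\sum |X - M|}{N}$

Mean deviation using another central point $X_1$ (such as the median): $\frac{\sum |X - X_1|}{N}$

In both cases $N$ is the number of observations, and the coefficient is obtained by dividing the mean deviation by the central point used.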
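The definition (mean deviation divided by the central point it is measured from) can be sketched as below. The `center` parameter is an illustrative assumption that lets the same function work with the mean or the median; the data are made up.

```python
from statistics import mean, median

def mean_deviation_coefficient(values, center=mean):
    """Mean absolute deviation about a central point, divided by that point."""
    c = center(values)  # assumed to be non-zero
    mean_dev = sum(abs(x - c) for x in values) / len(values)
    return mean_dev / c

sample = [1, 2, 3, 4, 10]  # hypothetical data: mean = 4, median = 3
print(mean_deviation_coefficient(sample))                 # about the mean
print(mean_deviation_coefficient(sample, center=median))  # about the median
```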


4. Quartile deviation coefficient

It is the ratio of the difference between the third quartile and the first quartile to the sum of the third and first quartiles. The formula for this measure of relative dispersion is defined as follows:

$$\frac{Q_3 - Q_1}{Q_3 + Q_1}$$

Q3 = Upper quartile

Q1 = Lower quartile
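As a rough sketch, the quartiles can be obtained with `statistics.quantiles` from the Python standard library. Note that there are several conventions for computing quartiles; the `"inclusive"` method used here is one assumption among them, and other tools may give slightly different values. The function name and data are hypothetical.

```python
from statistics import quantiles

def quartile_deviation_coefficient(values):
    """(Q3 - Q1) / (Q3 + Q1), using the 'inclusive' quartile convention."""
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / (q3 + q1)

sample = [1, 2, 3, 4, 5]  # hypothetical data: Q1 = 2, Q3 = 4 with this method
print(quartile_deviation_coefficient(sample))  # (4 - 2) / (4 + 2)
```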


5. Standard deviation coefficient

Another measure of relative dispersion is the standard deviation coefficient. This is the ratio of the standard deviation to the mean of the distribution. Here is the formula for the standard deviation it is based on:

$$\sigma = \sqrt{\frac{\sum (X - X_1)^2}{N - 1}}$$

Deviation = $(X - X_1)$, where $X_1$ is the mean

$\sigma$ = standard deviation

$N$ = total number of observations
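The formula above divides by N - 1, which is the form usually applied to a sample rather than a whole population. The sketch below, with hypothetical data, shows the difference between the two forms using the standard library and then computes the coefficient as standard deviation divided by the mean.

```python
from statistics import mean, pstdev, stdev

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

sample_sd = stdev(sample)        # divides the squared deviations by N - 1
population_sd = pstdev(sample)   # divides the squared deviations by N

# Standard deviation coefficient: standard deviation relative to the mean
coefficient = sample_sd / mean(sample)

print(round(sample_sd, 3), round(population_sd, 3), round(coefficient, 3))
```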

In short, absolute and relative measures of dispersion are very useful when calculating different aspects of data. In fact, when you use them with data science, accomplishing this becomes easier, so you can incorporate process automation into your business easily.


Example of a dispersion measure

You probably already have an idea of how dispersion measures show up in everyday or business situations. However, to leave no doubt, here is an example in which their importance becomes evident.

Imagine that you are going on a trip with your friends and the hotel where you are staying has a swimming pool with an average depth of 1.60 meters. Considering that your height is 1.70 meters, you might think you could enter the pool without any problem.

However, since you do not know how to swim, you prefer to be cautious and find out whether the entire pool has the same depth. To do so, you ask the lifeguard what the maximum and minimum depths are, since, based on that, you can decide whether you can use the entire pool or only go up to a certain point.

It turns out that the maximum depth of the pool is 1.80 meters, while the minimum is 1.40 meters. This means that if you move to the deepest point you might be in danger, because you do not know how to swim and the water there is deeper than you are tall. The most advisable thing to do would be to enter with a float or go only as far as the middle of the pool.
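The same reasoning can be written as a tiny sketch using the numbers from the example. The variable names are invented for illustration, and "safe" here simply means that the water never gets deeper than the swimmer is tall.

```python
average_depth = 1.60                 # meters
min_depth, max_depth = 1.40, 1.80    # meters, as reported by the lifeguard
swimmer_height = 1.70                # meters

# The average alone looks reassuring...
print("Average depth below swimmer height:", average_depth < swimmer_height)

# ...but the range shows how much the depth actually varies
depth_range = max_depth - min_depth
safe_everywhere = max_depth < swimmer_height
print(f"Range of depths: {depth_range:.2f} m")
print("Safe to stand everywhere:", safe_everywhere)
```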


Are you ready to get started with data analysis? Now that you know more about dispersion measures and their classification, we are sure that it will be easier for you to understand and take part in courses or jobs that require these statistical measures.

We look forward to seeing you in the next article!
