Standard Deviation

Here’s a guide for first-year students on calculating measures of central tendency and dispersion in SPSS:

Calculating Measures of Central Tendency

1. Open your dataset in SPSS.
2. Click on “Analyze” in the top menu, then select “Descriptive Statistics” > “Frequencies”.
3. In the new window, move the variables you want to analyze into the “Variable(s)” box.
4. Click on the “Statistics” button.
5. In the “Frequencies: Statistics” window, check the boxes for:

Mean
Median
Mode

6. Click “Continue” and then “OK” to run the analysis.

Calculating Measures of Dispersion

1. Follow steps 1-4 from above.
2. In the “Frequencies: Statistics” window, also check the boxes for:

Standard deviation
Range
Minimum
Maximum

3. For the interquartile range, check the box for “Quartiles”. The output lists the 25th, 50th, and 75th percentiles; the interquartile range is the 75th percentile minus the 25th.
4. Click “Continue” and then “OK” to run the analysis.
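
If you want to double-check the SPSS output by hand, the same statistics can be reproduced in a few lines of Python. This is only a sketch: the `scores` list is made-up data standing in for one SPSS variable, and quartile values can differ slightly between programs because several interpolation conventions exist.

```python
import statistics

# Hypothetical exam scores standing in for one variable in the dataset.
scores = [52, 58, 61, 61, 67, 70, 74, 79, 85, 93]

# Three cut points that split the ordered data into four parts.
q1, q2, q3 = statistics.quantiles(scores, n=4)

summary = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "std_dev": statistics.stdev(scores),  # sample SD (n - 1 denominator)
    "range": max(scores) - min(scores),
    "minimum": min(scores),
    "maximum": max(scores),
    "q1": q1,
    "q3": q3,
    "iqr": q3 - q1,
}

for name, value in summary.items():
    print(f"{name:>8}: {value:.2f}")
```

Comparing a table like this with the “Statistics” table in the SPSS output viewer is a quick way to confirm that the right variable and options were selected.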

Interpreting the Results

Mean: The average of all values
Median: The middle value when data is ordered
Mode: The most frequently occurring value
Range: The difference between the highest and lowest values
Standard Deviation: The typical distance of values from the mean, in the same units as the data
Interquartile Range: The range of the middle 50% of the data

Choosing the Appropriate Measure

For nominal data: Use mode only.
For ordinal data: Use median and mode.
For interval/ratio data: Use mean, median, and mode.

Remember, if your distribution is skewed, the median may be more appropriate than the mean for interval/ratio data.
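
To see why skew matters, here is a small sketch using invented monthly incomes in which a single high earner pulls the mean upward. The numbers are purely illustrative.

```python
import statistics

# Hypothetical monthly incomes; one very high value skews the data to the right.
incomes = [1800, 2000, 2100, 2200, 2300, 2400, 15000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

# Count how many people earn less than the mean.
below_mean = sum(1 for x in incomes if x < mean_income)

print(f"mean:   {mean_income:.0f}")
print(f"median: {median_income:.0f}")
print(f"{below_mean} of {len(incomes)} incomes fall below the mean")
```

Here the mean lies above six of the seven incomes, so it is a poor description of a typical earner, while the median stays in the middle of the bulk of the data.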

Measures of Central Tendency

Measures of central tendency are statistical values that aim to describe the center or typical value of a dataset. The three most common measures are mean, median, and mode.

Mean

The arithmetic mean, often simply called the average, is calculated by summing all values in a dataset and dividing by the number of values. It is the most widely used measure of central tendency.

For a dataset $$x_1, x_2, \ldots, x_n$$, the mean ($$\bar{x}$$) is given by:

$$\bar{x} = \frac{\sum_{i=1}^n x_i}{n}$$

The mean is sensitive to extreme values or outliers, which can significantly affect its value.
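
The formula translates directly into code. The sketch below uses a hypothetical helper, `arithmetic_mean`, and made-up values to show how adding one extreme observation shifts the result.

```python
def arithmetic_mean(values):
    """Sum the values and divide by how many there are."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


data = [4, 8, 15, 16, 23, 42]   # hypothetical observations
with_outlier = data + [500]     # same data plus one extreme value

print(arithmetic_mean(data))          # 108 / 6
print(arithmetic_mean(with_outlier))  # 608 / 7
```

A single added value moves the mean from 18 to roughly 87, far above every one of the original observations.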

Median

The median is the middle value when a dataset is ordered from least to greatest. For an odd number of values, it’s the middle number. For an even number of values, it’s the average of the two middle numbers.

The median is less sensitive to extreme values than the mean, making it a better measure of central tendency for skewed distributions [1].
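
The odd and even cases can be sketched as a short function. The name `median_value` and the sample lists are illustrative only; in practice `statistics.median` from the Python standard library does the same job.

```python
def median_value(values):
    """Return the middle value of the ordered data."""
    if not values:
        raise ValueError("the median of an empty dataset is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_value([7, 3, 9]))        # odd number of values
print(median_value([7, 3, 9, 1]))     # even number of values
print(median_value([7, 3, 9, 1000]))  # an extreme value barely changes the result
```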

Mode

The mode is the value that appears most frequently in a dataset. A dataset can have one mode (unimodal), two modes (bimodal), or more (multimodal). Some datasets may have no mode if all values occur with equal frequency [1].
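
A minimal sketch of these cases follows. The helper `modes` is hypothetical, and it adopts the convention described above of reporting no mode when every value occurs equally often.

```python
from collections import Counter


def modes(values):
    """Return all modes, or an empty list when every value is equally frequent."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    # Several distinct values that all share the same frequency: no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 2, 3]))     # unimodal
print(modes([1, 1, 2, 2, 3]))  # bimodal
print(modes([1, 2, 3]))        # no mode
```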

Measures of Dispersion

Measures of dispersion describe the spread or variability of a dataset around its central tendency.

Range

The range is the simplest measure of dispersion, calculated as the difference between the largest and smallest values in a dataset [3]. While easy to calculate, it’s sensitive to outliers and doesn’t use all observations in the dataset.
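
Both weaknesses are easy to demonstrate. In this sketch (made-up data, hypothetical helper `value_range`), two datasets with very different middles share the same range, and one outlier inflates it dramatically.

```python
def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


spread_out = [10, 12, 13, 15, 20]
bunched_high = [10, 19, 19, 20, 20]  # same extremes, different middle
with_outlier = spread_out + [95]     # one extreme observation added

print(value_range(spread_out))
print(value_range(bunched_high))
print(value_range(with_outlier))
```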

Variance

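Variance measures the average squared deviation from the mean. For a sample, the sum of squared deviations is divided by $$n - 1$$ rather than $$n$$, which corrects for the fact that the mean was estimated from the same data: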

$$s^2 = \frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n - 1}$$

Where $$s^2$$ is the sample variance, $$x_i$$ are individual values, $$\bar{x}$$ is the mean, and $$n$$ is the sample size [2].
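
As a worked sketch of the formula, the function below (a hypothetical `sample_variance`, applied to made-up data) follows it term by term and compares the result with Python's built-in `statistics.variance`.

```python
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations, mean = 5

print(sample_variance(data))       # 32 / 7, using the n - 1 denominator
print(statistics.variance(data))   # standard-library sample variance
print(statistics.pvariance(data))  # population variance, dividing by n instead
```

The squared deviations sum to 32, so dividing by 7 gives the sample variance, while dividing by 8 gives the smaller population variance of 4.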

Standard Deviation

The standard deviation is the square root of the variance. It’s the most commonly used measure of dispersion, largely because it’s expressed in the same units as the original data [3]. For a sample:

$$s = \sqrt{\frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n - 1}}$$

In a normal distribution, approximately 68% of the data falls within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations [3].
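
The sketch below does two things: it confirms on made-up data that the sample standard deviation is the square root of the sample variance, and it uses `statistics.NormalDist` from the Python standard library to compute the share of a normal distribution lying within one, two, and three standard deviations of the mean.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

sample_sd = statistics.stdev(data)
sd_from_variance = math.sqrt(statistics.variance(data))
print(f"sample SD: {sample_sd:.3f}")

# Share of a standard normal distribution within k standard deviations.
standard_normal = statistics.NormalDist(mu=0, sigma=1)
shares = {}
for k in (1, 2, 3):
    shares[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} SD: {shares[k]:.1%}")
```

These percentages hold for normally distributed data; for skewed or heavy-tailed data the proportions can be quite different.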

Quartiles and Percentiles

Quartiles divide an ordered dataset into four equal parts. The first quartile (Q1) is the 25th percentile, the second quartile (Q2) is the median or 50th percentile, and the third quartile (Q3) is the 75th percentile [4].

The interquartile range (IQR), calculated as Q3 – Q1, is a robust measure of dispersion that describes the middle 50% of the data [3].
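
Quartiles can be computed with `statistics.quantiles` from the Python standard library. One caveat worth knowing: there are several conventions for interpolating quartiles, so different programs (and the two `method` options shown here) can give slightly different answers for the same data. The values below are made up for illustration.

```python
import statistics

data = [1, 3, 5, 7, 9, 11, 13]  # hypothetical ordered observations

# "exclusive" is the default method in statistics.quantiles.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1
print(f"exclusive: Q1={q1}, Q2={q2}, Q3={q3}, IQR={iqr}")

# "inclusive" treats the data as covering the whole population.
q1_inc, q2_inc, q3_inc = statistics.quantiles(data, n=4, method="inclusive")
iqr_inc = q3_inc - q1_inc
print(f"inclusive: Q1={q1_inc}, Q2={q2_inc}, Q3={q3_inc}, IQR={iqr_inc}")
```

Under both methods Q2 equals the median, but Q1 and Q3 (and therefore the IQR) differ, which is why it helps to state which method was used when reporting quartiles.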

Percentiles generalize this concept, dividing the data into 100 equal parts. The pth percentile is the value below which p% of the observations fall [4].
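
As a rough sketch of the idea, the example below computes all 99 percentile cut points for a made-up dataset of the integers 1 to 101 and checks how many observations fall below the 90th percentile. With real, smaller datasets the share below a percentile is only approximately p%.

```python
import statistics

data = list(range(1, 102))  # hypothetical data: the integers 1 to 101

# 99 cut points dividing the ordered data into 100 parts.
percentiles = statistics.quantiles(data, n=100, method="inclusive")
quartiles = statistics.quantiles(data, n=4, method="inclusive")

p90 = percentiles[89]  # index p - 1 holds the pth percentile
share_below = sum(1 for x in data if x < p90) / len(data)

print(f"90th percentile: {p90}")
print(f"share of observations below it: {share_below:.1%}")
print(f"25th percentile equals Q1: {percentiles[24] == quartiles[0]}")
```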

Citations:
[1] https://datatab.net/tutorial/dispersion-parameter
[2] https://www.cuemath.com/data/measures-of-dispersion/
[3] https://pmc.ncbi.nlm.nih.gov/articles/PMC3198538/
[4] http://www.eagri.org/eagri50/STAM101/pdf/lec05.pdf
[5] https://www.youtube.com/watch?v=D_lETWU_RFI
[6] https://www.shiksha.com/online-courses/articles/measures-of-dispersion-range-iqr-variance-standard-deviation/
[7] https://www.khanacademy.org/math/statistics-probability/summarizing-quantitative-data/variance-standard-deviation-population/v/range-variance-and-standard-deviation-as-measures-of-dispersion

