Lab 5
Measures of Central Tendency

We'll use the same students.sav file that you used in the last lab. You may use the file that you saved or, if you need to get it again, click on students.sav.

Central tendency is a statistical measure that identifies a single score as representative of an entire distribution. The goal of central tendency is to find the single score that is most typical or most representative of the entire group.

We will focus on three measures of central tendency: the mean, the median, and the mode.

For now let's learn how to compute these three measures of center.

We'll begin by looking at the distribution for quiz 4. Look at the histogram.


If you had to pick a single value that is representative of the entire distribution, there are several reasonable options:

  • One possibility would be to select 7.0 because it is the most frequent score. This is the mode.
  • Another possibility would be to select the value at which half of the scores are to the right and half are to the left. This will again be 7.0. This is the median.
  • A third possibility would be to select the arithmetic average. For this distribution the mean is 6.89.

How do we compute these measures of central tendency?

Calculating the mode is easy. You simply find the number that occurs most often. For example, consider this distribution:

11,3,5,3,1,7

Rearrange it in ascending order:

1,3,3,5,7,11

Now it is easy to see that the most frequent number (i.e., the mode) is a 3.
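If you would like to check the mode outside of SPSS, here is a minimal Python sketch of the counting approach described above. The variable names are only illustrative, and this is just one way to do it.

```python
from collections import Counter

scores = [11, 3, 5, 3, 1, 7]

# Sorting is optional for the computer, but it makes repeats easy to spot by eye.
ordered = sorted(scores)

# Count how many times each value occurs, then take the most frequent value.
counts = Counter(scores)
mode_value, mode_count = counts.most_common(1)[0]

print(ordered)      # [1, 3, 3, 5, 7, 11]
print(mode_value)   # 3
print(mode_count)   # 2 (the value 3 occurs twice)
```

Note that if two values were tied for most frequent, this sketch would report only one of them.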

Calculating the median is only slightly more complicated. After you've arranged a distribution of numbers in ascending order, simply count how many numbers there are. Let's call this total N (N is also called the sample size because it is the number of participants in your study). Notice that N is italicized. This is proper for all letters (even Greek ones) that stand for statistics or parameters.

If N is an odd number, the median is the middle number in the ordered distribution. To find which number is the middle, you can divide (N + 1) by 2. Suppose N were 7.

(7 + 1) / 2 = 4.

This tells us that the median number is the 4th number in the distribution. So if the distribution were:
45, 52, 76, 88, 99, 100, 100
then the median would be the 4th number, 88.

In the earlier example (1,3,3,5,7,11), N was 6. When N is even, the median is the average of the two middle scores. Actually, the formula from above still works. The median is the score that falls in slot number (N + 1) / 2. In this case (6 + 1) / 2 equals 3.5. What is the 3.5th number? There isn't one, but by convention it is the average of the 3rd and 4th numbers. In this case, the 3rd and 4th numbers are 3 and 5, so the median is (3 + 5) / 2 = 4. More complicated formulas exist for the median, but we will not concern ourselves with them in this class.
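The slot rule for the median can be sketched in Python as follows. The function name median_by_position is made up for this illustration, and the sketch assumes the list has at least one score.

```python
def median_by_position(scores):
    """Median using the (N + 1) / 2 slot rule; assumes a non-empty list."""
    ordered = sorted(scores)
    n = len(ordered)
    slot = (n + 1) / 2            # 1-based slot number of the median

    if slot.is_integer():
        # N is odd: the slot is a whole number, so take that score directly.
        return ordered[int(slot) - 1]

    # N is even: the slot ends in .5, so average the two neighboring scores.
    lower = ordered[int(slot) - 1]   # e.g. slot 3.5 -> the 3rd score
    upper = ordered[int(slot)]       # e.g. slot 3.5 -> the 4th score
    return (lower + upper) / 2


odd_example = [45, 52, 76, 88, 99, 100, 100]
even_example = [11, 3, 5, 3, 1, 7]

print(median_by_position(odd_example))    # 88
print(median_by_position(even_example))   # 4.0
```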

Calculating the mean is something that is familiar to you. Simply add up all the numbers and divide by N. In the example above, the mean is (1 + 3 + 3 + 5 + 7 + 11) / 6 = 30 / 6 = 5. Technically, this is called the arithmetic mean. Note that there are other kinds of averages that won't be covered in this course. Each kind of average has its specific purposes, but the arithmetic mean will do for most things in this course.

Generally, unless it makes sense to do otherwise, we will calculate the mean and other kinds of statistics to 2 decimal places.
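Here is a short Python sketch of the mean calculation, including rounding to 2 decimal places. The first list is the example from above. The second list (2, 3, 3) is a made-up example, chosen only because its mean does not come out to a whole number.

```python
scores = [1, 3, 3, 5, 7, 11]

total = sum(scores)        # 30
n = len(scores)            # 6
mean_score = total / n     # 30 / 6

print(round(mean_score, 2))    # 5.0

# Hypothetical second example where rounding actually matters.
other_scores = [2, 3, 3]
other_mean = sum(other_scores) / len(other_scores)   # 8 / 3 = 2.666...

print(round(other_mean, 2))    # 2.67
```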

In Blackboard Lab 5, answer the following questions about this dataset:
21, 22, 23, 24, 24, 24, 25, 25, 25, 25, 26, 27, 28
Blackboard 1) By hand or by calculator calculate the mean. Don't use SPSS.
Blackboard 2) What is the mode of these data?
Blackboard 3) What is the median of these data?



 

 



The following section outlines how to compute the mean, median, and mode in SPSS.

We can get SPSS to compute all of these values from the same dialog. Go to the Analyze menu, select the Descriptive Statistics submenu, and then the Frequencies option.

This should open a window that looks like this:

Select quiz 4 as your variable, and then click on the Statistics button.

This will open another window.

In this window select mean, median, and mode. Then click "continue". This will take you back to the previous window. Now click "OK".

Now SPSS should open up an output window that includes a table that looks like this:

                                                

 

That's all there is to it.


Okay. Now I'd like you to try to do what I just outlined above for quiz4 with a different variable. For the variable "final" in your students.sav file I'd like you to answer the following questions.

Blackboard 4) What is the mean for the "final" variable?  (Round to 2 digits)

Blackboard 5) What is the median for the "final" variable?

Blackboard 6) What is the mode for the "final" variable?  

Blackboard 7) What percent of students scored lower than the mode on the final? (Don't include students who scored the mode exactly. Don't include "%" in your answer. Round to 1 decimal place. Thus, 22.2% would be entered as 22.2.)



Properties of Central Tendency Measures

The mean, median, and mode are descriptive statistics that are designed to tell us something about the center of a distribution, that is, where most of the data are.

So how do you know which measure of central tendency should be used?

- The answer depends on a number of factors, including the shape of the distribution and the scale of measurement that you use.

The mean is the most preferred measure. It takes every item in the distribution into account, and it is closely related to measures of variability.

A mean has several important properties or characteristics:

  • If you change a given score, add an observation, or delete an observation, then the mean will usually change. (The exception is adding or deleting a score that is exactly equal to the mean.)
  • If you add (or subtract) a constant to each score, then the mean will increase (or decrease) by that constant.
  • If you multiply (or divide) each score by a constant, then the mean will be multiplied (or divided) by that constant.
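You can check these properties for yourself with a few lines of Python. This is only a sketch using the small example distribution from earlier in the lab. The helper function mean_of and the constants 10 and 2 are arbitrary choices for illustration.

```python
def mean_of(scores):
    """Arithmetic mean: add up the scores and divide by N."""
    return sum(scores) / len(scores)


scores = [1, 3, 3, 5, 7, 11]
print(mean_of(scores))        # 5.0

# Change one score (11 becomes 17): the mean changes.
changed = [1, 3, 3, 5, 7, 17]
print(mean_of(changed))       # 6.0

# Add a constant (10) to every score: the mean goes up by 10.
shifted = [x + 10 for x in scores]
print(mean_of(shifted))       # 15.0

# Multiply every score by a constant (2): the mean is multiplied by 2.
scaled = [x * 2 for x in scores]
print(mean_of(scaled))        # 10.0
```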

However, there are times when the mean isn't the appropriate measure.

- You cannot find a mean or median of a nominal scale. (A nominal scale is an unordered set of categories for a variable, e.g., the categories of eye color may be blue, green, hazel, and brown.) However, you can find a mode for a nominal scale: the statement "the most frequent eye color is ..." makes sense.

- Use the median if:

1) there are a few extreme scores in the distribution (skewed distributions with long tails)

2) there are undetermined values - if for some reason you don't know the value of one (or more) of your items (e.g., the person died before answering your question)

3) your distributions are 'open-ended' - by this we mean that there is no upper or lower limit on the possible values of your variable (e.g. your top answer on your questionnaire is '5 or more')

4) your data are on an ordinal scale (rankings)

Point number 1 above suggests that extreme scores (outliers) may influence which measure of central tendency we use. Extreme scores may influence the shape of distributions, which in turn will affect our measures of central tendency.
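To see how a single extreme score affects the mean and the median differently, consider this sketch. It takes the small example distribution from earlier in the lab and replaces the top score with a made-up outlier of 110 (a hypothetical value, not from students.sav).

```python
from statistics import median

original = [1, 3, 3, 5, 7, 11]
with_outlier = [1, 3, 3, 5, 7, 110]   # 11 replaced by a hypothetical extreme score

mean_original = sum(original) / len(original)
mean_outlier = sum(with_outlier) / len(with_outlier)

print(mean_original, median(original))        # 5.0 4.0
print(mean_outlier, median(with_outlier))     # 21.5 4.0
```

The mean jumps from 5 to 21.5, a value larger than five of the six scores, while the median stays at 4. This is why the median is often the better choice when there are a few extreme scores.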

In a distribution like this one (symmetric distribution), the mean, median, and mode will have similar values.

However, in a positively skewed distribution, the mean will be larger than the median, which will be larger than the mode:
mode < median < mean

The opposite is true for a negatively skewed distribution:
mean < median < mode
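The ordering of the three measures under skew can be illustrated with a small sketch. Both datasets below are invented for this example. The second is simply the mirror image of the first (each score subtracted from 20), which flips the direction of the skew.

```python
from statistics import median, mode

# Hypothetical positively skewed scores: most are low, with a long tail to the right.
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

# Mirror image of the same scores: a long tail to the left.
negative_skew = [20 - x for x in positive_skew]

for scores in (positive_skew, negative_skew):
    mean_score = sum(scores) / len(scores)
    print(mode(scores), median(scores), mean_score)

# First line printed:  2 3.0 5.0     (mode < median < mean)
# Second line printed: 18 17.0 15.0  (mean < median < mode)
```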

 

 

Okay, now let's look at a distribution to examine how these different measures of central tendency relate to one another.

http://www.ilstu.edu/%7Ewjschne/138/quiz5.jpg

Blackboard 8) Look at the distribution in the histogram above. From lowest to highest, rank the mean, median, and mode.
Blackboard 9) Of the 3 measures of central tendency (mean, median, and mode), which would be the least representative number in this case?
Blackboard 10) What kind of skew is evident in this distribution?