These Were The "Good Old Days" In Statistics

  1. WHEN an instructor spent a third of a course teaching the computation of the mean and standard deviation from a grouped frequency distribution and then apologized to the student that these were only approximations.

  2. WHEN a real luxury was owning a $150 Texas Instruments hand-held calculator that could perform the four fundamental operations.

  3. WHEN a student complained about math anxiety the instructor could compassionately recommend completing a one-week regimen of a paperbacked programmed-instruction book stressing the mathematics necessary for basic statistics.

  4. WHEN angrily dropping a jammed 50 lb Friden rotary calculator to the floor would magically restore the machine to full operation.

  5. WHEN performing a Wherry-Doolittle multiple regression on a rotary calculator produced only a weary statistician.

  6. WHEN a bulky $3000 Monroe programmable electronic calculator was regrettably limited to 32 steps and would just barely compute a standard deviation.

  7. WHEN your Nixie-tube Sony programmable desktop calculator was the envy of the entire department.

  8. WHEN using the mainframe computer for analyzing data necessitated punching Hollerith cards on a humongous 8 ft. wide steel contraption and then storing hundreds of these cards in long cardboard boxes which could be lugged from building to building.

  9. WHEN students were awestruck with the notion of an ANOVA replacing six pairwise t-tests to test the equality of four treatment population means.

  10. WHEN the spirit duplicator ("ditto") machine made you appreciate the dangers of glue sniffing and had you begging for a box of latex gloves to protect your hands from the purple plague.

  11. WHEN nonparametric tests, which were the rage of the 1950's, were likened to the discovery of penicillin and forced you to question even the most minute violation of the assumptions of parametric tests and subsequently toss many t or F-tests on the junk heap.

  12. WHEN students were convinced that there was only one unique table of random numbers and were dumbfounded when they did a frequency count of single digits in the table and found them roughly rectangularly distributed rather than normal.
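
The frequency count in item 12 is easy to repeat without a printed table. The sketch below is only an illustration: the function name `digit_counts` and the seed are assumptions, and Python's pseudo-random generator stands in for the table of random numbers. Each digit should turn up roughly one tenth of the time, which is the rectangular (uniform) shape that surprised the students.

```python
import random
from collections import Counter

def digit_counts(n_digits, seed=0):
    """Count how often each digit 0-9 appears among n_digits pseudo-random digits."""
    rng = random.Random(seed)          # seeded so the tally is repeatable
    digits = [rng.randrange(10) for _ in range(n_digits)]
    return Counter(digits)

counts = digit_counts(10_000)
for d in range(10):
    # Every digit should land near 1,000 out of 10,000, not pile up in the middle.
    print(d, counts[d])
```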

The Above Was Archived on 17 August 2001.

During the month of November or December, graduate students in my multivariate analysis class traditionally pay special homage to the celebrated Cayley-Hamilton theorem. It is accorded this high honor by Professor Maurice Tatsuoka in the chapter on linear transformations, axis rotation, and eigenvalues in his excellent textbook. The role of this theorem in the textbook is rather obscure. It is not, to my knowledge, applied or used in any multivariate technique or employed in the proof of any other theorem or formula in the entire text! It is presented as a stand-alone pillar of mathematical splendor. It is shocking to many students that a theorem can have absolutely no practical application whatsoever and yet be intrinsically elegant in and of itself. However, one can reflect on innumerable things in real life that are of this very same nature.

In my graduate-level statistics classes I have three-, four-, and five-star handouts graded according to importance. The one-page handout on Cayley-Hamilton is a SIX-STAR handout and is the only one I have ever given this highest distinction. This handout is printed in limited quantities and serially numbered to ensure its value as a collector's item. My students are admonished not to discard or in any way bend, fold, or mutilate this work of art. Remember, I tell them..."ASK NOT WHAT THE CAYLEY-HAMILTON THEOREM CAN DO FOR YOU BUT WHAT YOU CAN DO FOR THE CAYLEY-HAMILTON THEOREM."

Here is a spectacular demonstration of how like its eigenvalues a matrix behaves. I am proud to present in gif format the celebrated Cayley-Hamilton theorem. ENJOY!!!
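
For readers who want to see the theorem in action, here is a minimal numerical sketch, not a reproduction of the handout. It assumes the 2 x 2 case, where the characteristic polynomial is lambda^2 - tr(A) lambda + det(A), so the theorem says A^2 - tr(A) A + det(A) I is the zero matrix. The function names and the example matrix are hypothetical choices for illustration.

```python
def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cayley_hamilton_residual(a):
    """Return A^2 - tr(A)*A + det(A)*I for a 2x2 matrix A.

    The Cayley-Hamilton theorem says this is the zero matrix.
    """
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    a_squared = matmul2(a, a)
    identity = [[1, 0], [0, 1]]
    return [[a_squared[i][j] - trace * a[i][j] + det * identity[i][j]
             for j in range(2)]
            for i in range(2)]

A = [[4, 1], [2, 3]]  # hypothetical example matrix
print(cayley_hamilton_residual(A))  # [[0, 0], [0, 0]]
```

The matrix satisfies its own characteristic equation, which is the sense in which it behaves like its eigenvalues.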

The Above Was Archived on 21 September 1999

This month I would like to focus on perhaps one of the most significant news events that has occurred in my lifetime. It makes the discovery of the incandescent light bulb or man's landing on the moon very inconsequential. I am of course referring to Mark McGwire of the St. Louis Cardinals and his shattering of the single season home run record on Tuesday, September 8, 1998 at 9:18 PM EDT. This record is the most revered one in all of sport both in the USA and many foreign countries. All eyes were fixed on Busch Stadium that evening when Mr. McGwire in the wink of an eye rifled his 62nd home run over the left field wall.

What does this feat have to do with the field of statistics, you ask? I maintain it has everything to do with statistics. Baseball is a game whose very objective and rich heritage is vitally dependent on the art of record keeping and the meaningful manipulation of these records. There is no other sport in the world that breeds the thousands upon thousands of numbers and summaries that baseball does year after year. Indeed, each season there are many new records contrived to fit the particular accomplishments and combinations of skills of certain ballplayers and/or the teams that employ them. Observe the emergence in recent years of the 30-30 or the 40-40 player or the manager's detailed charting of pitches thrown.

My purpose here is not to discuss all these newfangled indices. I will leave that task to the writers and news media who scramble to produce these tidbits to justify their existence. I simply want to capture the wonder of that magical September night and relate to you my observations of what were the important coincidences and facts about that historic day. Here they are:

  1. The stock market made its greatest daily gain ever of 380 points on the Dow.

  2. McGwire's 62nd home run was his shortest up to that point in the season at 341 ft. His longest was 550 ft. Through 144 games his 62 home runs have totalled 25,684 ft. or 4.9 miles.

  3. The record-breaking home run would have been nullified by the umpire had McGwire failed to touch first base (he initially missed it) and had the Cubs appealed before the first pitch to the next batter.

  4. Tuesday was McGwire's favorite day during this stretch. He hit 15 of his 62 home runs on Tuesdays.

  5. The fourth inning was McGwire's favorite inning. He hit 11 of his 62 home runs in the fourth inning.

  6. None of McGwire's 62 homers was hit on a 3-0 count! Only one was hit on a 3-1 count! Wow what patience as a hitter.

  7. McGwire hit his 62 homers off of 57 different pitchers. He did a masterful job of spreading the misery around.

  8. McGwire came to bat 451 times officially up to and including his 62nd home run. In other words, through 144 games he could be expected to hit a home run every 7.27 times at bat.

  9. During this stretch the Chicago Cubs and Florida Marlins tied for being victimized the most by McGwire home runs. Each of these teams had 7 homers hit against them.

  10. Of McGwire's 62 home runs so far, 30 were solo homers and 23 of his last 31 homers were solo blasts.

  11. The date and time of McGwire's record home run was 9/8/1998 at 9:18 EDT. If the single digits in these numbers are totaled the sum is 62. TRULY AMAZING!

  12. Finally my wife put the frosting on the cake that day. She found the bezel and crystal which had been lost for several days for my FAKE Rolex watch. A STATISTICIAN JUST COULD NOT ASK FOR A GREATER DAY!
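
A statistician checks his arithmetic. The short sketch below re-derives three of the figures in the list (items 2, 8 and 11) from the numbers quoted there; the variable names are illustrative, and the only outside fact used is that a mile is 5,280 feet.

```python
home_runs = 62
at_bats = 451          # official at bats through the 62nd home run (item 8)
total_feet = 25_684    # combined distance of the 62 home runs (item 2)

at_bats_per_hr = at_bats / home_runs     # at bats per home run
total_miles = total_feet / 5280          # 5,280 feet in a mile

# Item 11: add up the single digits in the date and time of the record homer.
stamp = "9/8/1998 9:18"
digit_sum = sum(int(ch) for ch in stamp if ch.isdigit())

print(round(at_bats_per_hr, 2))  # 7.27
print(round(total_miles, 1))     # 4.9
print(digit_sum)                 # 62
```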

You can see how a statistician can easily become obsessed with facts and figures like the above set, particularly when that statistician happens to be a Cardinal fan. However, there is much more to this story. I would truly like to thank both Sammy Sosa of the Cubs and Mark McGwire for the great show that they have put on this year (and the home run derby is not over as of this writing). I think these two fine role models have shown baseball and this country that their close friendship and encouragement of one another is what life is all about! All other athletes should take special note of this relationship.

The Above Was Archived on 16 December 1998

This month we will present another neat probability experiment that can easily be conducted by students as a short assignment or an in-class project. If you have been a regular reader of my home page you may recall the penny-spinning problem whose discussion now resides in my Archives of Statistics Fun. That experiment and the current one represent excellent opportunities for an instructor to highlight the value of the Monte Carlo method in estimating probabilities for unusual variations in experiments that don't lend themselves to the usual formulas.

Let's state the current question in simple language:

If a penny is flipped until a head first appears, what is the probability that this first head occurs on an odd-numbered trial (i.e., first, third, fifth, etc.)?

At first blush, a typical student would reason that since the first head is just as likely to occur on an odd trial as it is on an even trial (second, fourth, sixth, etc.), the probability is obviously .5. But wait! Another student mentions that maybe the probability should be somewhat greater than .5 since the first opportunity for a head to pop up is on the very first trial, and an odd trial continues to precede an even trial after the first two. At this point the bandwagon effect sets in and students begin to incrementally up their estimates slightly from .5. But after many values are offered, a hush settles over the room and students begin to look at one another and shrug their shoulders. No one is really sure!

Enter Captain Sigma (the instructor)! With a flourish of his cape and a wink of his eye, he quietly suggests that this is a problem that just begs for empirical data. He urges each student to take about five minutes at home and repeat the experiment 10 times, tally how many times the first head appears on an odd trial, and bring the data to the next meeting. The students happily consent to this simple task (Someone in the back of the room asks, "How many points is it worth?") and they all eagerly await the pooling of their data at the next meeting.

Two days later the instructor rushes into the classroom and puts all the results from 35 students on the board. The students sit on the edge of their seats in awe as the numbers accumulate. The final tally results in 245 out of 350 replications ending on an odd trial. Zowie! THAT IS 70%! Something is wrong. The pennies must have been seriously flawed.

The instructor, showing no emotion on his face, allows the buzzing and chattering to go on for several minutes. Finally he cracks a grin and informs the students that this result is a very good estimate although it is a tad too high. He proudly states that the actual answer is P=2/3 or 67%. The students are dumbfounded and become quite excitable. They actually all cheer for the instructor and demand a formal proof (Did I say "cheer" in a stat class? I must be delirious from a high fever!).
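
Before the proof, the classroom experiment itself can be rerun in software. This is a minimal Monte Carlo sketch under stated assumptions: a fair penny, Python's pseudo-random generator in place of real flips, and hypothetical function names and seed. With 350 replications the estimate wobbles much as the class tally did; with many more it settles near 2/3.

```python
import random

def first_head_trial(rng):
    """Flip a fair coin until the first head; return the trial it appears on."""
    trial = 1
    while rng.random() >= 0.5:   # treat this outcome as tails (probability 1/2)
        trial += 1
    return trial

def estimate_odd_probability(replications, seed=1):
    """Estimate P(first head lands on an odd-numbered trial) by simulation."""
    rng = random.Random(seed)
    odd = sum(first_head_trial(rng) % 2 == 1 for _ in range(replications))
    return odd / replications

print(estimate_odd_probability(350))      # same size as the pooled class data
print(estimate_odd_probability(100_000))  # a much larger "class"
```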

Here is what the instructor wrote on the board:

The solution involves the sum of the first n terms of a geometric series expressed as:

S = a + ar + ar^2 + ... + ar^{n-1}

a = first term of the series
n = number of terms
r = the common ratio
S = the sum of the first n terms calculated by

S = a (1 - r^n) / (1 - r)

In our case, a = 1/2 = .5 and r = (1/2)(1/2) = .5^2 or .25 and using the first expression for S we have:

S = .5 + .5^3 + .5^5 + ...

In words, the above is stating that the probability of getting the first head on an odd trial is the probability of getting a head on the first trial (.5) plus the probability of getting the first head on the third trial (.5)(.5)(.5) plus the probability of getting the first head on the fifth trial (.5)(.5)(.5)(.5)(.5), and so on for n terms.

Now to compute what this sum would be for n terms we calculate using the second formula:

S = .5 (1 - .25^n) / (1 - .25)

Finally, taking the limit of this calculation as n approaches infinity (where .25^n shrinks to zero), we arrive at

S = (.5) / (1 - .25) = (1/2) / (3/4) = 2/3 or .67

Truly Remarkable! The students all give the instructor a standing ovation and shout "QED" "QED" "QED".... The instructor smiles sheepishly while taking a bow and thinks to himself how rewarding it is to be a statistics professor.
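
The board work can also be checked numerically. The sketch below, with illustrative function names, compares the closed-form sum for the first n terms (a = .5, r = .25) with a term-by-term sum of the odd-trial probabilities and shows how quickly both approach 2/3.

```python
def partial_sum(n, a=0.5, r=0.25):
    """Closed form for a + a*r + ... + a*r**(n-1)."""
    return a * (1 - r ** n) / (1 - r)

def direct_sum(n):
    """Add P(first head on trial 1), P(first head on trial 3), ... for n odd trials."""
    return sum(0.5 ** (2 * k + 1) for k in range(n))

for n in (1, 2, 5, 10, 20):
    # The two columns agree, and both close in on 2/3 as n grows.
    print(n, partial_sum(n), direct_sum(n))
```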

The Above Was Archived on 13 September 1998

Here are the answers to last month's crossword puzzle. As warned previously, some of these statisticians are not exactly household words. Use the following scale to grade your performance as a Statistician Trivialist:




4. Likes a good match
6. Bullets that strayed from target
7. Advocate of equal rights for variations
8. Always correcting things
15. Rhymes with macaroni
17. Past tense of feel
18. Delights in a small ratio
20. Homo sapiens
21. A delicious pear
22. Finished
23. Old Faithful
25. Detested a small expected frequency
26. Shopping for sleeping accommodations


1. Inventor of cotton gin
2. French cook
3. Measures agreement in judges
5. Can walk on water in flooded farm plots
9. Honestly different than others
10. Plants thrive in this
11. A swear word
12. Quality control expert at brewery
13. Storage container
14. A fine vodka
16. Rejuvenated male
19. A chocolate covered mint
24. Uses lambda and is sometimes exact

The Above Was Archived on 10 July 1998

One of the most intriguing and frequently mentioned probability questions of all time is the so-called "Birthday Problem." I really do not remember when I first encountered this question but I know it has been around for decades and pops up in many treatments of probability in basic statistics textbooks. Let us revisit this interesting problem and hopefully shed some new light on this time-honored topic.

First, for those readers who are not familiar with this problem, let us pose the question in its simplest form:

Given a room with a random collection of N people, what is the minimum N needed for an observer to state there is greater than a 50-50 chance of at least two people having identical birthdays?

Responses to this question are many and varied depending on a person's exposure to probability topics. However, the three most frequently offered answers are (a) 183 (b) 20 and (c) 23. The correct answer, of course, is (c) 23, which may shock some of you and prompt you to immediately head out and bet some of your buddies on a coincidence of birthdays in rooms with this few people present. Before you make this rash decision, read the remainder of this discussion.

Note: We shall assume in our discussion that a year has 365 days rather than the 366 in a leap year. We shall also assume that by "identical birthday" or "birthday coincidence" or "duplicate birthday" we mean the same month and day disregarding the year of birth.

The (a) response of 183 has much intuitive appeal for the ordinary person on the street. He or she would reason that in order to be absolutely certain that two birthdays coincide, 366 people would be needed in the room. Now since a probability just greater than .50 of a duplicate is all that is wanted, simply take 1/2 of 366 and arrive at 183. This seems logical but the laws of probability tell us the correct N is dramatically smaller than 183! Just how much smaller?

Many people who have studied a little probability would give the (b) response of 20. Wow! This intuitively seems way too small to give us even a slight chance of a coincidence of birthdays let alone a better than even chance. But the reasoning merits close examination and goes something like this:
Check the birthdays in the room one by one. After the first person has given his or her birthday, the second person will have one chance in 365 of having the same birthday as the first. The third person could have the same birthday as the first or second person, so the third person has 2 chances in 365. Added to the chance from the second person, there are a total of 3 chances in 365. By the same logic, the fourth person has 3 chances of having the same birthday as any of the first three so this needs to be added to the previous 3 to get 6 chances out of 365, and so on. By the time we exceed 183 chances in 365, which is just greater than our 50-50 probability, we will have checked just 20 people. Mathematically, this is more concisely expressed as follows: We want the smallest integer N-1 such that
(0)(1/365)+(1)(1/365)+(2)(1/365)+(3)(1/365)+...+(N-1)(1/365) > 1/2 or
(1 + 2 + 3 +...+(N-1))/365 > 1/2

The required N-1 is 19 since 1 + 2 + 3 +...+ 19 = 190 but don't forget to add one for the first person checked even though a match cannot occur with just one person. Thus N = 20 people are required according to this line of reasoning.

But hold the phone! We have a serious flaw in this argument. The number (1 + 2 + 3 + ...+(N-1))/365 is NOT a probability but the EXPECTED VALUE or MEAN NUMBER of birthday coincidences (matching pairs) among N people in a room. For N = 20 it equals 190/365, or about .52. Thus, if many rooms of 20 randomly assembled people were examined, the mean number of coincidences is just greater than 1/2. This is not particularly reassuring to a shrewd betting person!

Although N=20 is an incorrect answer to the original problem it does suggest an alternate approach for betting purposes. Suppose we wanted the expected value of coincidences to be just greater than one. We could continue the above computation for several more terms until the ratio just exceeded one. With a calculator it is easy to see that we must only go out to N-1=27 or N=28 for this to occur. Thus with many rooms of N=28 people we would have a mean number of coincidences just greater than one and many bettors would take greater comfort in this value.
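
The expected-value calculation is quick to verify. In this sketch the sum 1 + 2 + ... + (N-1) is written as N(N-1)/2, the number of pairs of people, and a coincidence is counted as a matching pair. The function names are illustrative, not anything from a standard library.

```python
def expected_matching_pairs(n, days=365):
    """Mean number of pairs sharing a birthday among n people: (1+2+...+(n-1))/days."""
    return n * (n - 1) / 2 / days

def smallest_n_exceeding(threshold, days=365):
    """Smallest room size whose expected number of matching pairs exceeds threshold."""
    n = 1
    while expected_matching_pairs(n, days) <= threshold:
        n += 1
    return n

print(smallest_n_exceeding(0.5))  # 20
print(smallest_n_exceeding(1.0))  # 28
```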

Now let us explain the correct answer (c) N=23 for the original problem. The easiest approach is to find the probability of NO duplicate birthdays in a sample of size N and then subtract this result from ONE to get the probability of at LEAST ONE duplicate. Again we shall check the people one by one. After the first person establishes a birthday (P=365/365), the probability of the second person's birthday not duplicating the first is (365/365)(364/365). The probability of the third person not duplicating the first two is (365/365)(364/365)(363/365). This multiplicative process goes on and on and for a given N this product becomes (365/365)(364/365)(363/365)...((365-N+1)/365). Since this product is the probability of no duplicates, our task is to find the smallest N for which it drops below .50. Then when this probability is subtracted from one the probability of at least one duplicate will just exceed .50. With a hand calculator it is easy to show that when N=22 this product is .5243 and 1 - .5243 = .4757 but when N=23 the product is .4927 and 1 - .4927 = .5073. We can thus state that if a room contains 23 randomly assembled people, we stand a slightly better than 50-50 chance of finding a duplicate birthday.
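
If your hand calculator is in the shop, the product can be ground out in a few lines. This is a sketch under the same assumptions as the discussion (365 equally likely birthdays); the function names are illustrative.

```python
def prob_no_duplicate(n, days=365):
    """(365/365)(364/365)...((365-n+1)/365): chance that n people all differ."""
    p = 1.0
    for k in range(n):
        p *= (days - k) / days
    return p

def prob_at_least_one_duplicate(n, days=365):
    """Complement of the no-duplicate probability."""
    return 1.0 - prob_no_duplicate(n, days)

for n in (22, 23):
    print(n, round(prob_no_duplicate(n), 4), round(prob_at_least_one_duplicate(n), 4))
# 22 0.5243 0.4757
# 23 0.4927 0.5073
```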

If you are a conservative bettor and flinch at the above chances, I have developed the following table that allows you to read in the desired probability level for a duplicate birthday and read out the required N in your room. Note that the probability levels should be interpreted as actually those just greater than the listed value.

Required Numbers of People For Selected Probabilities of a Birthday Coincidence

Probability     N
    .50        23
    .60        27
    .70        30
    .75        32
    .80        35
    .90        41
    .95        47
    .99        57
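
The table can be regenerated by searching for the first N whose chance of a duplicate exceeds each listed level. This is a sketch assuming 365 equally likely birthdays; `required_n` is a hypothetical helper name.

```python
def required_n(target, days=365):
    """Smallest n whose probability of at least one shared birthday exceeds target."""
    p_no_dup = 1.0   # probability of no duplicate among the n people counted so far
    n = 0
    while 1.0 - p_no_dup <= target:
        p_no_dup *= (days - n) / days   # bring in one more person
        n += 1
    return n

for level in (0.50, 0.60, 0.70, 0.75, 0.80, 0.90, 0.95, 0.99):
    print(f"{level:.2f}  {required_n(level)}")
```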

Thus if you are a real gambler, when the situation presents itself, you go with N=23 and impress the pants off everyone in the room by hopefully finding a duplicate. If you don't feel that you are an exceptionally lucky person, then you might select the comfortable 75-25 chance of a duplicate and use N=32. On the other hand, if you fall at the other end of the continuum and only bet on close to sure things, then pick the .99 level and go with N=57. Here you are almost certain to find a duplicate but the people will not be that impressed and you won't elicit that wonderful "WOW!" effect.

I tried this experiment last semester in my Statistics I class with N=26 students in attendance that day. I knew my chances were below .60 but I put on an air of absolute certainty with my pronouncement. I confidently started around the room with students stating their birthdays. When I got to the 12th person I had a duplicate. The reaction was electrifying. You would have thought that I had just floated an elephant in mid-air! My student ratings skyrocketed for at least one day!

I hope you have enjoyed my presentation of the above topic and hopefully if you are an instructor you can have some fun and try this in your class. I must fess up to one other assumption that was not mentioned earlier. Not only must you assume a random sample of people are assembled in the room but theoretically you must assume that birthdays are uniformly distributed across the 365 days of the year, so that every day is equally likely. This is probably not satisfied in any strict sense but that is a question involving a whole different ballgame. If you are turned on by the concept of chance and how pervasive it is in our society check out Chance News, a bimonthly newsletter published at Dartmouth College.

The Above Was Archived on 5 April 1998

This is the season of good cheer and merriment. If you know of a lonely statistician please tell him that you love him and truly appreciate all the wonderful methodologies that he has perpetuated and enhanced throughout his career. I am sure any kind remarks directed his way will warm his heart and point him toward the new year with a renewed sense of vitality and dedication.
HAPPY HOLIDAYS to all my readers! May all your Summation Sigmas be operational and all your means be true μ's.

We at RAMO PRODUCTIONS are particularly thankful for this Holiday Season. In fact, we are ecstatic and even giddy over the honor that was recently bestowed on this Web Site. At the 1997 Annual Conference of the Society for the Preservation of Humor In Statistics (SPOHIS) held November 20-22 in Las Vegas, this Home Page was awarded "The Golden Sigma Cup." This highly coveted award signifies the BEST contribution of any Site on the WWW toward the promotion of statistics as a humorous subject. The acceptance of this award was truly a defining emotional moment in my career. I would like to thank all the members of the SPOHIS Academy for the necessary and sufficient consideration given all the nominees for this award and the unbiased selection of this particular site. I will try to be a worthy recipient of this magnificent cup and redirect my energies toward uncovering new tidbits of humor that make statistics the enchanting field that it has now become.

The Above Was Archived on 7 February 1998

This month all my readers will be given a real treat. The World Famous Three Step Method (WFTSM) will be revealed. I have had many requests and pleadings through my guestbook and other personal email to present this marvelous technique to the World Wide Web. This procedure which I consider the Holy Grail of statistical methodology (just ask my students) is a three-step sequence for calculating the sample standard deviation. Some textbooks emphasize a multi-step, direct or brute force procedure (see formula on right) which for most situations is very painful and tedious. That is, given a set of N raw scores (X-scores) you are advised to (a) compute the mean (b) subtract the mean from each score (c) square each of these deviations (d) sum the squared deviations (e) divide the sum of squared deviations by the number of scores N and finally (f) extract the square root of this result. While this technique works rather well when the number of scores is small and the mean is a nice whole number, it is a nightmare in other situations. When the number of scores is say 15 or more and the mean is a decimal (In practice this will be true about 95% of the time), this procedure involves repeated subtracting and squaring of decimals and gets extremely messy even when using a calculator. A better method is needed!

Never fear. A white knight is waiting in the wings. Let us apply some finesse and demonstrate an elegant substitution for all but the last two steps in the above procedure. Please study the animated gif on the right. Observe that Step One is the key to the entire computation. It is the mathematical equivalent of steps (a) through (d) in the "brute force" method. It requires only two basic calculations: ∑X (the sum of the raw scores) and ∑X^2 (the sum of the squares of the raw scores). Once Step One is computed, school is almost out and Steps Two and Three roll out very easily. Note also that Steps Two and Three here are exactly the same as the earlier steps (e) and (f) respectively.

To illustrate this new calculation consider a simple example. Suppose we are given the following set of 15 raw scores (X's):
5, 6, 8, 8, 10, 12, 12, 12, 14, 16, 16, 18, 18, 19, 20
For our data ∑X = 5 + 6 + 8 +...+ 20 = 194 and
∑X^2 = 5^2 + 6^2 + 8^2 +...+ 20^2 = 2838

Now substituting the above results and applying WFTSM:

  1. ∑x^2 = ∑X^2 - (∑X)^2/N (STEP ONE-Sum of Squares of Deviation Scores)
    ∑x^2 = 2838 - (194)^2/15
    ∑x^2 = 2838 - 2509.0667 = 328.9333
  2. s^2 = ∑x^2/N (STEP TWO-Variance)
    s^2 = 328.9333 / 15 = 21.9289
  3. s = Sq Root (∑x^2/N) (STEP THREE-Standard Deviation)
    s = Sq Root (21.9289) = 4.68
VOILA! There you have it ladies and gentlemen. This is the formula that has taken the world by storm all the way from El Paso, Illinois to Paris, France! You ask - Why is it so gosh darn great and globally famous?
Let me enumerate its many advantages and virtues:
  1. It allows the researcher to stop at any of the three stages of the sequence. For a descriptive index of variability you probably want to perform all three steps and get the standard deviation. However, sometimes in different statistical developments you may want to stop at Step Two and obtain the Variance. Most Important, a researcher on many occasions will fold the tent and stop with Step One. This result is probably the single most pivotal calculation in the entire statistical world. It is the cornerstone for such procedures as the t-test, Analysis of Variance, correlation and regression, discriminant analysis, Multivariate Analysis of Variance, etc.
  2. It lends itself well to computation with a scientific calculator. Most of these have hard-wired functions that will give the results of all three steps at the touch of several keys after a one-touch entry for each of the raw scores.
  3. It avoids nasty computations with decimals in the early stages when the raw scores are whole numbers.
  4. It avoids rounding error early since the "brute force" method requires as a first step the computation of the mean and subsequent rounding.
  5. There is something sacrosanct about three-step methods in math and statistics. The third step has the effect of developing closure. Two, four, and six step-methods seem out of balance to a statistician! :-)
  6. It is COOL! COOL! COOL!
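
For the skeptics, here is a short sketch that runs the three steps on the 15 scores from the example and compares the result with the brute-force steps (a) through (f). The function names are illustrative. Following the column, both versions divide by N, which matches `statistics.pstdev` in Python's standard library rather than the N-1 version.

```python
from math import sqrt
from statistics import pstdev

scores = [5, 6, 8, 8, 10, 12, 12, 12, 14, 16, 16, 18, 18, 19, 20]

def three_step(xs):
    """WFTSM: sum of squares from sum(X) and sum(X^2), then variance, then SD."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    ss = sum_x2 - sum_x ** 2 / n          # Step One: sum of squared deviations
    variance = ss / n                     # Step Two: variance (dividing by N)
    return ss, variance, sqrt(variance)   # Step Three: standard deviation

def brute_force(xs):
    """Steps (a)-(f): subtract the mean from every score, square, sum, divide, root."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    return sqrt(ss / n)

ss, variance, sd = three_step(scores)
print(round(ss, 4), round(variance, 4), round(sd, 2))
print(round(brute_force(scores), 2), round(pstdev(scores), 2))
```

Both routes agree on whole-number scores like these. One caution for large data sets: in floating-point arithmetic the one-pass formula subtracts two big, nearly equal numbers and can lose precision, so treat this as a sketch for hand-sized examples.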

OK, now that we have proven WFTSM is the greatest thing since sliced bread, how do we accord it high distinction in the folklore of statistical methodology? This is my suggestion to all students: Get a calligrapher to write WFTSM on fine parchment paper. Then roll and tie the scroll and place it in a soft bed of puffed pima cotton. Place the cotton and scroll in a box wrapped in blue holographic foil. Next have an expert gift-wrapper surround the box with white silk ribbon topped with an elegant bow. Finally go to your dining room table and replace the flower arrangement as the centerpiece with the blue box. Now we have given WFTSM its proper respect. It will be the center of attraction for all guests invited to your house for dinner. Just think of the thrills you will experience explaining to your best friends the story of the blue box and WFTSM.

Thanks for reading the development of my favorite statistical procedure. Who says statistics has to be dull when it embraces world renowned technology cradled in elegant blue boxes!

The Above Was Archived on 20 December 1997

October is the month of goblins and ghoulies. Unfortunately, students of basic statistics experience far too many of these creatures on days other than Halloween night. As promised last month, I will offer some general suggestions for teaching the course in basic statistics. Several caveats are in order. First, these ideas have worked for me over several decades of teaching but I make no warranties they will work for other instructors. Secondly, these techniques have been employed in classes with enrollments of between 30 and 40 students and therefore are probably not appropriate for large lecture sections. With this in mind, I present this short list of hints to help rid the statistical learning environment of goblins and ghoulies:

Teaching Tips for the Instructor of Basic Statistics

  1. Utilize group activities in the classroom when it is desired to solidify certain critical skills (e.g., calculating the standard deviation with a computational routine). This is time consuming but most students enjoy a change of pace from the usual lecture or discussion. I have good luck with triads and quads of students working together. Larger groups inhibit the verbal interchange of ideas.
  2. Use open-book examinations! Yes, I said it! This encourages students to study concepts and relationships rather than memorize formulas. Moreover, it is one of the best anxiety-reducing techniques available.
  3. Use power, not timed, examinations. I generally give students two hours for what would typically be a one hour examination. This usually necessitates giving the examination in the evening and accommodating many make-up exams. I believe it is well worth the price. Students really appreciate the extra time and this is another great anxiety-reducing tool. (Consider that for many students it takes thirty minutes just to quit twitching and quaking and get down to business. :-) )
  4. I firmly believe that a comprehensive basic course that covers the waterfront of statistical techniques is WORTHLESS. It is far better to cover fewer topics but cover them in depth rather than jam a plethora of topics into the course and only touch upon the highlights.
  5. Emphasize the handful of recurring themes in basic statistics. For example, with any set of data we always want information about three important characteristics: (a) the form and outstanding features of the data when it is graphed through a histogram, stem and leaf plot, or a box and whisker plot (b) central tendency or location of the data and (c) variability or dispersion of the data. This theme MUST be stressed whether you are discussing raw sets of data or sampling distributions of statistics.
  6. Give graded assignments on a weekly basis consisting of one or two problems. This sends an important message to the student that statistics must be practiced on a daily basis and must not be allowed to slide into a single marathon study session over a two or three week period.
  7. Above everything else, maintain a sense of humor and don't take yourself so seriously. Students associate your mood and outlook with how they will perceive the content of the course. Remember you are not teaching a course in human sexuality which is inherently interesting. You must exude excitement and enthusiasm in showing students how statistical methodology can have relevance in their lives.
Thanks everyone for reading my tips for the instructor. Hopefully, some of the above will promote some lively discussion. Let me hear from you!

The Above Was Archived on 11 November 1997

The fall semester has now begun at most universities across this great country. This means that many students are experiencing for the first time an encounter with a basic applied statistics course. Whether the course is taken in business, psychology, education, biology, economics or some other discipline really does not matter. The frequency of application of certain techniques will vary from field to field but the basic concepts remain amazingly the same. If you are an upperclassman, this is the course you have postponed for several years and with great trepidation must now meet head on. If you are an underclassman, the fear is no less since the horror stories already hit the moment you arrived on campus. You must cope with this perceived encirclement by dragons. Your mental outlook and approach to this course will become the single most important determinant of a meaningful positive experience with beginning statistics. I have attempted to put together, from several decades of teaching, a short list of helpful suggestions for the student. I make no warranties that these will work with everyone. I do know, however, that many students over the years have given me feedback that these hints are quite useful. Here they are:

Study Tips for the Student of Basic Statistics

  1. Use distributed practice rather than massed practice. That is, set aside one to two hours at the same time each day for six days out of the week (take the seventh day off) for studying statistics. Do not cram your study for four or five hours into one or two sittings each week. This is a cardinal principle.
  2. Study in triads or quads of students at least once every week. Verbal interchange and interpretation of concepts and skills with other students really cements a greater depth of understanding.
  3. Don't try to memorize formulas (A good instructor will never ask you to do this). Study CONCEPTS CONCEPTS CONCEPTS. Remember, later in life when you need to use a statistical technique you can always look the formula up in a textbook.
  4. Work as many and varied problems and exercises as you possibly can. Hopefully your textbook is accompanied by a workbook. You cannot learn statistics by just reading about it. You must push the pencil and practice your skills repeatedly.
  5. Look for recurring themes in statistics. There are probably only a handful of important skills that keep popping up over and over again. Ask your instructor to emphasize these if need be.
  6. Be a Gestalt Psychologist! In other words, recognize that the whole of statistics is greater than the sum of its parts. It is very easy to get hung up on nit-picking details and fail to see the forest because of the trees.
  7. If you are a victim of math or stat anxiety (probably 70% of the general population is), do something about it! Most universities understand the debilitating nature of this problem and provide excellent counseling programs for the alleviation of this disability. Do yourself a favor and get help. This may very well be the best decision you make in undergraduate school.

If you are a student, I hope the above suggestions prove useful. Next month I will present some tips for the instructor of basic statistics.

The Above Was Archived on 7 October 1997

Merry Christmas everyone! Contrary to popular belief, statisticians also believe in Santa Claus and have their wish lists. I thought you might want to see what desires I have had for many years. These are far out so be prepared!


  1. A revolutionary cylindrical statistics classroom with the following features:
    1. A wrap around chalkboard with a dustless, automatic wipe-eraser.
    2. A catwalk next to the board on which the instructor moves.
    3. A circular revolving platform on which the seated class moves.
    4. Finally, if the administration is far sighted, these classrooms can be stacked on top of one another to form the statistics classroom of the future that looks like the Leaning Tower of Pisa.

  2. A complete set of 50 trading cards portraying the Top Statisticians of All Time. The cards would be UV-coated with holographic foil stamping on both sides. The backs would contain the significant publications and contributions of each statistician. A gold insert set would feature R. A. Fisher, Karl Pearson, Harold Hotelling, G. E. P. Box and John W. Tukey.

  3. A year's supply of head-nodding, smiley-faced students who could be strategically seated at the instructor's discretion in any of the statistics classes. These students roar into action when the lectures become the least bit dull or boring.

  4. A 30-minute documentary 3-D movie on bivariate normal distributions with all the accompanying projection equipment. This movie would demonstrate in virtual reality the passing of planes both parallel and perpendicular to the xy-plane through the bivariate surface to yield isodensity contour ellipses and univariate normal distributions respectively. Also, the testing of the equality of the centroids of several bivariate populations could be dramatically illustrated.

  5. Semester evaluations of my statistics classes by my students that would compare to the glowing ratings received by a professor who teaches human sexuality where the material is intrinsically interesting and not steeped in mathematics.

Hope you enjoyed the above. Have a happy holiday season! If you are a statistician don't take yourself seriously and laugh at yourself. If you are a student make a New Year's resolution to attempt to understand the poor statisticians of this world who are only trying to eke out a living.

The Above Was Archived on 28 February 1997

For the month of November we have a very special report for you! From our home office high atop the grain elevator in Fooseland, Illinois we are proud to bring you: THE TOP TEN REASONS WHY STATISTICIANS ARE MISUNDERSTOOD. These are not listed in any particular order of importance but represent all those nagging suspicions you have always harbored against statisticians but were always afraid to ask about. Fasten your seat belts and here we go!

  1. They speak only the Greek language.

  2. They usually have long threatening names such as Bonferroni, Tchebycheff, Schatzoff, Hotelling, and Godambe. Where are the statisticians with names such as Smith, Brown, or Johnson?

  3. They are fond of all snakes and typically own as a pet a large South American snake called an ANOCOVA.

  4. For perverse reasons, rather than view a matrix right side up they prefer to invert it.

  5. Rather than moonlighting by holding Amway parties they earn a few extra bucks by holding pocket-protector parties.

  6. They are frequently seen in their back yards on clear nights gazing through powerful amateur telescopes looking for distant star constellations called ANOVA's.

  7. They are 99% confident that sleep cannot be induced in an introductory statistics class by lecturing on z-scores.

  8. Their idea of a scenic and exotic trip is traveling three standard deviations above the mean in a normal distribution.

  9. They manifest many psychological disorders because as young statisticians many of their statistical hypotheses were rejected.

  10. They express a deep-seated fear that society will someday construct tests that will enable everyone to make the same score. Without variation or individual differences the field of statistics has no real function and a statistician becomes a penniless ward of the state.

The Above Was Archived on 20 December 1996

We are quickly approaching election day and throughout the entire month of October you can expect to be bombarded with the results of many presidential polls. The pollsters of today (Gallup, Roper, etc.) use highly sophisticated techniques that employ samples of about 1,600 or fewer registered voters who are likely to vote. If these samples are drawn at random, the public can expect the percentages that favor the candidates to fall within a 3% or 4% margin of error. This all sounds great to the typical citizen (except if your candidate is trailing). What happens, however, if the sample is biased or, in some pernicious way, nonrandom? In short, incorrect inferences may be drawn and widely disseminated, the public may lose faith, and entire polling organizations or their sponsors may go out of business! Following is what I consider to be the worst case in history of a biased presidential poll which resulted in such a devastating effect (No folks, I am not going to rehash the Truman-Dewey election of 1948).

Here are the basic concepts of a random sample and bias. A sample is considered random if each member of the population from which it is drawn has an equal chance of being selected. I like to think of a random sample as an equal opportunity employer. A table of random numbers or a computer is usually employed to draw a random sample. A sample becomes biased when certain members of the population have a greater chance of being selected than others. The sample then tends to systematically overestimate or underestimate a certain characteristic of the population such as the percentage of a particular class of individuals. Thus, serious inferential errors may occur.
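For the curious, here is a rough sketch of where a poll's margin of error comes from. It assumes a simple random sample, a 95% confidence level (z = 1.96), and the worst-case split of 50% for each candidate; the function name `margin_of_error` is my own invention for illustration, not anything the pollsters publish.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample of size n; p = 0.5 is the
    worst case and z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Sample sizes chosen for illustration only.
for n in (600, 1000, 1600):
    moe = margin_of_error(n)
    print(f"n = {n:4d}: about +/- {100 * moe:.2f} percentage points")
```

Under these assumptions a sample of 1,600 lands near two and a half percentage points, and samples of roughly 600 to 1,000 give the 3% to 4% range. Notice that nothing in the formula can detect or repair a biased sample.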

Now go back to the year 1936. This was the year that pitted the Republican, Alf Landon, against the Democratic incumbent, Franklin Roosevelt. This year was during the Great Depression. It also should be remembered for the record extreme temperatures and dust storms that hit the Midwest. In fact, it was so hot that a statistician could not even calculate a standard deviation without working up a sweat. Moreover, the high humidity that year forced the Goudey Baseball Card Company to print only black and white cards and many of the cards came off the printing press hopelessly bowed.

A prestigious weekly news periodical called The Literary Digest continued that year a tradition of conducting a national presidential poll through the mail. Supreme faith was placed in a humongous sample of 10,000,000 prospective voters drawn primarily from telephone directories. When the returns were tallied, an easy win for Landon was indicated. This highly respected periodical staked its reputation on this outcome. With such a huge sample how could anything go wrong?

Well, as strange as it may seem, the poll was a miserable failure. Two sources of bias that were unfortunately in the same direction raised havoc with the results. First, the poll excluded non-telephone owners and hence also included a disproportionate number in the older age groups. Since many more Republicans (the wealthy) in these depression years owned phones than Democrats (the poor), it is not surprising that the returned ballots would favor the GOP candidate. Secondly, it is a fairly well known fact that members of the party out of power are far more likely to return ballots through the mail than members of the "in" party. This again supported more Republican ballots. These two sources of bias formed a potent combination that pointed toward a Landon victory. It is interesting to note that in the actual election, Roosevelt swept all states except two and won in a landslide. It is also of historical note that several years after this debacle The Literary Digest went out of business.

As an ironic footnote to the above story, The Literary Digest, using the same sampling technique, correctly called the outcome of the 1932 election. Recall that Roosevelt ran against the incumbent Republican, Hoover. Again, economic problems were the prime issues in this campaign. However, the above two sources of bias were in opposite directions and tended to cancel one another out. That is, the use of telephone directories favored Republican returns but the "out of office" phenomenon favored Democratic returns. Thus, through sheer luck, The Literary Digest correctly predicted a win for FDR.

Here are some important lessons from these historic presidential polls:

  1. If bias is present, a huge sample does nothing to guarantee an accurate survey result. Even millions in a sample cannot overcome a nonrandom procedure.
  2. One should not use the mail service for a scientific poll. Too much depends on the whims of the prospective respondents.
  3. Baseball cards should not be stored under humid conditions. :-)
  4. Democrats should make sure they are listed in the phone directory. :-)
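The first lesson can be tried out on a computer. The sketch below is a toy simulation, not a reconstruction of the 1936 poll: the telephone ownership rate and the candidate preferences are made-up numbers chosen only for illustration, and the names `draw_voter` and `poll` are hypothetical helpers. It compares a huge sample drawn only from phone owners with a modest sample drawn at random from everyone.

```python
import random

random.seed(1936)  # fixed seed so the run is repeatable

# Hypothetical electorate (illustrative numbers, NOT historical data):
# 30% own a telephone; owners back candidate R with probability 0.65,
# non-owners with probability 0.30.
P_PHONE = 0.30
P_R_GIVEN_PHONE = 0.65
P_R_GIVEN_NO_PHONE = 0.30

# True share of the whole electorate backing candidate R.
TRUE_SHARE = P_PHONE * P_R_GIVEN_PHONE + (1 - P_PHONE) * P_R_GIVEN_NO_PHONE

def draw_voter(phone_only):
    """Return True if one sampled voter backs candidate R."""
    # A phone-directory sample can only ever reach phone owners.
    owns_phone = True if phone_only else random.random() < P_PHONE
    p = P_R_GIVEN_PHONE if owns_phone else P_R_GIVEN_NO_PHONE
    return random.random() < p

def poll(n, phone_only):
    """Share of n sampled voters who back candidate R."""
    return sum(draw_voter(phone_only) for _ in range(n)) / n

biased_huge = poll(200_000, phone_only=True)    # huge but biased
random_small = poll(1_600, phone_only=False)    # small but random

print(f"true share for R:         {TRUE_SHARE:.3f}")
print(f"biased sample of 200,000: {biased_huge:.3f}")
print(f"random sample of 1,600:   {random_small:.3f}")
```

With these invented numbers the huge sample settles very precisely on the wrong answer, while the small random sample lands within a few points of the truth. Making the biased sample even larger only sharpens the wrong answer.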

The Above Was Archived on 10 November 1996

If you have taken a basic statistics course, when the topic of probability was introduced you no doubt heard the instructor mention the time-honored coin flipping example. Flip a penny and the chance of getting a tail (or head) is 1/2. No problem: this is a concept that a primary-aged child can understand.

But change the scenario slightly. Suppose a penny held vertically on a table by the index finger of one hand is spun vigorously with a flick of the other index finger and allowed to come to rest flat on the table. Is the probability still 1/2 of either a head or a tail facing up? Well, the head-nodders in the first few rows generally smile warmly and shake their heads up and down in agreement (By the way, all you aspiring statistics instructors should enlist at least five head-nodders prior to the second week of class to offer you constant supportive feedback during the entire semester :-)). However, the more the students reflect on this situation, the more uncertain they become. Is spinning really the same as flipping? Finally, a carefully planted confederate toward the back of the room timidly suggests that maybe we should replicate the experiment a number of times and see what happens. Yes! Yes! Yes! Just what you want as an instructor. You quickly seize this opportunity to introduce the class to Monte Carlo type probability. You announce an extra credit assignment for everyone in the class. Each student is instructed to select a relatively shiny penny without noticeable wear, spin the penny on a table 100 times, and record the number of tails that face up. A deadly silence settles over the classroom! The students now realize they have been hoodwinked into performing a rather embarrassing act, particularly if their dorm roommates are watching that evening. Spin a penny 100 times on a table and watch it land: what type of looniness is this? Before any student utters another word, you quickly remind them that all the results will be posted and discussed next meeting, and then you grudgingly dismiss them two minutes early.

Next meeting the fruits of your well-planned operation are realized. Thirty of your 35 students complete the experiment. Wow! you gleefully think to yourself. That is 3000 replications of the penny-spinning experiment. Methodically, you start around the room asking each student to report the number of tails he or she obtained on the 100 trials. The numbers roll in and you write them on the chalkboard: 65, 59, 57, 64, 52, 70,... The students sit in utter amazement as a definite pattern unfolds. Overall, the numbers appear to be much larger than 50! When all the numbers are collected, a mystified student in the rear of the room suggests that we average the 30 results. Obligingly, you ask a student with a fancy calculator and a pocket protector in the front row to add up the results and find the mean. In a wink of the eye, the student blurts out 62.12. This is totally unreal! Can we place any faith at all in this finding? Does this mean that if we spin a penny on a table many times the coin will fall with tails facing up about 62% of the time?
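One way to weigh that question is a quick one-sample z-test for a proportion. The sketch below takes the 3000 spins and the class mean of 62.12 tails per 100 from the story, and it assumes every spin is independent with the same chance of tails, which is my simplification and may not hold from penny to penny or student to student.

```python
import math

n_spins = 30 * 100   # thirty students, 100 spins each
p_hat = 0.6212       # observed proportion of tails (class mean of 62.12 per 100)
p_null = 0.5         # what a "fair" spin would give

# Standard error of the sample proportion if the null hypothesis were true.
se = math.sqrt(p_null * (1 - p_null) / n_spins)

# How many standard errors the class result sits above 1/2.
z = (p_hat - p_null) / se

# Two-sided p-value from the normal approximation.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"standard error = {se:.4f}")
print(f"z = {z:.1f}")
print(f"two-sided p-value = {p_value:.1e}")
```

Under those assumptions the class result sits more than thirteen standard errors above 1/2, so chance alone is a very poor explanation for all those tails.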

This is no aberration. Experts refer to this phenomenon as the "pop bottle cap effect". Find a cap from an old 16 oz. bottle of Pepsi or Coke and spin it in the same fashion we did the penny. About 90% of the time or more, the cap will fall with its top facing down and the sides facing up. Now how does this relate to the penny? If you examine closely a relatively new shiny penny, you will observe that the edge around the penny protrudes further on the tail's side than on the head's side. Thus, the extra edge on the tail's side simulates the side of the pop bottle cap although certainly not as pronounced visibly. The experts proclaim that the extra edge produces results that in the long run converge on 60% tails facing up. Of course, if you use a worn penny, this advantage in favor of tails disappears. I can indeed attest to these results. In the four or five years that I have used this experiment in class, the results have hovered right around 60%. Amazing but true. I then tell my students they have a sure way of winning some money. Engage a friend (maybe your nosy roommate from last night) in a four or five hour penny-spinning game and bet on tails each time!
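If you would rather not wear out your index finger, the class experiment can be imitated on a computer. This is only a sketch: it assumes each spin comes up tails with probability 0.60 (the long-run figure quoted above) and that spins are independent, and the class size of 30 with 100 spins apiece is taken from the story.

```python
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

P_TAILS = 0.60   # assumed long-run rate of tails for a new penny
N_STUDENTS = 30  # students who completed the assignment
N_SPINS = 100    # spins per student

# Each simulated student reports the number of tails in 100 spins.
counts = [
    sum(random.random() < P_TAILS for _ in range(N_SPINS))
    for _ in range(N_STUDENTS)
]

class_mean = statistics.mean(counts)
class_sd = statistics.stdev(counts)

# Binomial theory: SD of one student's count is sqrt(n * p * (1 - p)).
theoretical_sd = (N_SPINS * P_TAILS * (1 - P_TAILS)) ** 0.5

print("first six reports:", counts[:6])
print(f"class mean = {class_mean:.2f}, class SD = {class_sd:.2f}")
print(f"theoretical SD of one student's count = {theoretical_sd:.2f}")
```

The theoretical standard deviation of about five tails per 100 spins shows why individual reports such as 52 or 70 are not alarming even when the class mean sits near 60. As for the bet, at even money a 60% chance of tails would be worth an expected 0.60 - 0.40 = 0.20 units per unit wagered, again only if the assumed rate holds.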

Gosh Henry! It really does work!

The Above Was Archived on 4 October 1996

The field of statistics is replete with technical terms or jargon that I prefer to call "club words" in my classes. We have a lot of fun with these since I tell my students that they can derive much satisfaction from mastering these and joining a very unique club. They are then able to throw these terms around in casual conversation and blow the sox off of their friends who are not in the "club". Let me give you a few examples of some real humdingers.

Homogeneous elasticity between different sizes of rubber bands. NOT!
Equal population variances.
Breeding a statistician with a clergyman to produce the much sought "honest statistician". NOT!
The linear approximation of an unlisted value in a statistical table by using two listed values.
A debilitating foot disease producing pungent odors. NOT!
The degree of peakedness in a graph of a distribution of scores.
Type II Error
An error message that pops up on my Mac when an unstable browser freezes. NOT!
Retaining a false null hypothesis in inferential statistics.
Standard normal deviates
A comparison group of sociopaths who were formerly normal people. NOT!
Scores (z-values) from the standard normal distribution.
Geez Albert! It wasn't that bad was it?

The Above Was Archived on 4 September 1996.

Thank You For Visiting The Archives Of Statistics Fun.


Visit the Best Collection of Annotated Stat Jokes in the World With Over 200 entries. First Internet Gallery of Statistics Jokes

For a Discussion of Questions You Always Wanted to Ask in a Statistics Class But Were Afraid of Looking Foolish See Sticky Stat Wickets.

Also, If You Want Information About the Author That Created This Set of Pages Check Home Page of Gary C. Ramseyer.

Page last revised on 17 July 2010

Copyright ©1997-2011 Ramo Productions. All Rights Reserved.