
Mathematics

Foundational concepts: Bell curve/Normal distribution

Recognition of the power of techniques making use of the naturally-occurring normal distribution (the bell curve) lies at the heart of some of the most useful and powerful methods in mathematics and science.


Why you may want to know this

The normal distribution is foundational knowledge for many statistics used in research, such as standard deviation, so understanding it opens the door to reading and understanding massage research articles that use those statistics.


As we often mention at POEM, words have power. The word "normal" can have strong connotations in everyday language, and can be used to express implicit or explicit disapproval or criticism of people who don't conform to the norms of society.

In scientific usage, however, "normal" is not nearly so loaded a word. While it has the same denotations (dictionary meanings) of "typical, usual, or close to an average, according to a benchmark or standard", it doesn't carry any connotation (associated idea) that being unusual is either good or bad.

What usual and unusual mean will vary, according to the situation. Generally, few people in the total population are extreme in many physical measurements; most are pretty close to a typical value in respect to most measurable physical qualities, which is what we'll deal with most as MTs. So, in that sense, most people are pretty "normal", and we'll remember to be aware of and sensitive to the needs of those who aren't.

For example, consider as a physical measurement the birth weight of all healthy babies born at term in the developed world. In this group, there will be a few big babies, weighing 8½ (8.5) to 9 pounds or more. The baby on the left in the following picture, born in the UK, weighed 14 lb, 7 oz (14.44 lb) at birth.

 

Source: http://img.dailymail.co.uk/i/pix/2007/08_01/nicholson1NTI1008_468x650.jpg

 

There will also be a few small babies, who weigh 6 to 6½ (6.5) pounds or less.

Source: http://latimesblogs.latimes.com/.m/photos/uncategorized/2009/03/18/premie.jpg

 

Unless some sort of problem such as gestational diabetes or premature delivery occurs, most of these babies born at term in developed nations tend to weigh about 7 to 8½ (8.5) pounds.

Let's imagine that we are keeping track of the babies born in one small region, and that 10 babies are born, with the following birth weights:

 

Baby Birth weight (in pounds)
Baby 1 7.2
Baby 2 9.6
Baby 3 7.5
Baby 4 7.7
Baby 5 7.4
Baby 6 8.0
Baby 7 5.9
Baby 8 7.6
Baby 9 6.1
Baby 10 8.9

 

 


Let's graph the data from our (imaginary) observation of birth weights.

 

Number of babies in this group that weigh less than 5.5 lbs:  0

 

Number of babies in this group that weigh 5.5 lbs-6.9 lbs:     2

(Baby 7, Baby 9)

 

Number of babies in this group that weigh 7.0 lbs-8.4 lbs:     6

(Baby 1, Baby 3, Baby 4, Baby 5, Baby 6, Baby 8)

 

Number of babies in this group that weigh 8.5 lbs-9.9 lbs:     2

(Baby 2, Baby 10)

 

Number of babies in this group that weigh 10.0 lbs or more: 0

 

We'll make a column chart of this data, where the x-axis (horizontal) is the birth weight range in pounds, and the y-axis (vertical) is the number of babies with that birth weight.

 
 
There is a pattern emerging in that data--most of the babies' birth weights tend to be in the middle of the range--fewer babies are either extremely large or extremely small at birth.
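As a rough sketch, the tally above can be reproduced in a few lines of Python. The weights and ranges come from the lists above; the names `weights`, `bins`, and `tally` are made up for this illustration, and the last range is assumed to include a weight of exactly 10.0 lbs.

```python
# Birth weights (in pounds) of the ten imaginary babies listed above.
weights = [7.2, 9.6, 7.5, 7.7, 7.4, 8.0, 5.9, 7.6, 6.1, 8.9]

# Each range is (lower edge, upper edge, label); a weight belongs to a
# range when lower edge <= weight < upper edge.
bins = [
    (float("-inf"), 5.5, "less than 5.5"),
    (5.5, 7.0, "5.5-6.9"),
    (7.0, 8.5, "7.0-8.4"),
    (8.5, 10.0, "8.5-9.9"),
    (10.0, float("inf"), "10.0 or more"),  # assumption: 10.0 itself goes here
]


def tally(values):
    """Count how many values fall into each range."""
    counts = {label: 0 for _, _, label in bins}
    for value in values:
        for low, high, label in bins:
            if low <= value < high:
                counts[label] += 1
                break
    return counts


counts = tally(weights)

# Print a sideways column chart: one '#' per baby in each range.
for label, n in counts.items():
    print(f"{label:>14} lb | {'#' * n} ({n})")
```

Even in this text-only version, the longest row is the middle range, which is the same pattern the column chart shows.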
 
 

 

 


This much larger sample of Norwegian births between the years 1992 and 1998 was graphed in the same way. The x-axis (horizontal) is still the birth weight range, and the y-axis (vertical) is the number of babies with that birth weight. Since Norway uses the metric system, however, they report their birth weight data in kilograms (kg), so the x-axis is labeled in kg, rather than in pounds.
 
To compare that data to ours, then, we need to know how to convert between kg and pounds. One kilogram is about 2.2 pounds, so the approximate equivalents are:
1.0 kg =  2.2 lb
2.0 kg =  4.4 lb
2.3 kg =  5.0 lb
2.7 kg =  6.0 lb
3.0 kg =  6.6 lb
3.2 kg =  7.0 lb
3.6 kg =  8.0 lb
4.0 kg =  8.8 lb
4.1 kg =  9.0 lb
4.5 kg = 10.0 lb
5.0 kg = 11.0 lb
6.0 kg = 13.2 lb
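If you would rather compute a conversion than look it up, here is a minimal sketch. The factor 2.20462 pounds per kilogram is the standard conversion; the function names `kg_to_lb` and `lb_to_kg` are just illustrative choices.

```python
LB_PER_KG = 2.20462  # pounds in one kilogram (standard conversion factor)


def kg_to_lb(kg):
    """Convert a weight in kilograms to pounds."""
    return kg * LB_PER_KG


def lb_to_kg(lb):
    """Convert a weight in pounds to kilograms."""
    return lb / LB_PER_KG


# Print a small conversion table, rounded to one decimal place.
for kg in [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]:
    print(f"{kg:.1f} kg = {kg_to_lb(kg):.1f} lb")
```

Rounding to one decimal place is why a table like the one above is only approximate: several slightly different kilogram values can round to the same number of pounds.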
 
 
 
Even though this dataset is very, very much larger than our 10 observations--several of these birth weights in the mid-range are represented by 30,000 babies or more--we still see the same pattern we saw in our data: lots of babies have a middle-of-the-road birth weight, and the more extreme (the further from average) the birthweight, the fewer babies who have that weight.
 
 

This is another graph of birth weights, generated by the National Institutes of Health in the United States. Like the Norwegian graph, the weights in the middle of the range represent 30,000 or more babies born with those weights.
 
On this graph, the x-axis represents weights in grams (abbreviated gm here, although g is the usual abbreviation), so the numbers along the horizontal axis are 1000 times larger than the numbers on the Norwegian graph, which is in kg. No matter which unit we use to express it, however (gram vs. kilogram vs. pound), the physical referent--how heavy the newborn is--remains constant.
 
There is a lot of extra information on this graph that we won't be using, so don't worry about anything that seems unclear at this point, such as what "Residual" may mean. I'm talking only about the gray columns and the blue curve drawn over them. You may recognize mean from our previous discussion, and you may also recognize SD as "standard deviation", a statistical measurement that we are now laying the groundwork for discussing.
 
 
 

 

 

The smooth curve drawn over the tops of these columns is the bell curve, which gets its name from its resemblance to the outline of a bell. A characteristic that is normally distributed in a defined population or group traces out a bell curve when graphed in this way.

The relatively few very small and very large babies are the small quantities shown at the extreme left and right sides of the graph (forming the small “tail” at either end). The higher number of 7 to 8½ pound babies make up the big “bump” or curve at the center of the graph.
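To see the bump and the tails appear on their own, here is a small simulation sketch. It draws made-up birth weights from a normal distribution and prints a sideways column chart. The mean of 7.5 lb and standard deviation of 1.0 lb are assumptions chosen only for illustration; they are not taken from the Norwegian or NIH data.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch gives the same result each run

MEAN_LB = 7.5  # assumed typical birth weight (illustration only)
SD_LB = 1.0    # assumed spread around that weight (illustration only)

# Draw 10,000 simulated birth weights from a normal distribution.
samples = [rng.gauss(MEAN_LB, SD_LB) for _ in range(10_000)]

# One-pound-wide ranges: 3-4, 4-5, ..., 10-11 pounds.
edges = list(range(3, 12))
counts = [0] * (len(edges) - 1)
for weight in samples:
    for i in range(len(counts)):
        if edges[i] <= weight < edges[i + 1]:
            counts[i] += 1
            break
    # The rare simulated weight outside 3-11 lb is simply not counted.

# Sideways column chart: one '#' per 100 simulated babies.
for i, n in enumerate(counts):
    print(f"{edges[i]:>2}-{edges[i + 1]:<2} lb | {'#' * (n // 100)}")
```

The rows near the assumed mean are the longest, and the rows shrink toward both ends, which is the bell shape turned on its side.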

The awesome thing about the normal distribution is how often it occurs naturally. Remember that our first dataset was made up of imaginary values. I chose those values carefully, to set it up to lead us smoothly into the rest of the discussion.

However, the next two datasets were real, natural data--nothing imaginary required for them. The fact that this distribution is found so often in so many different situations in the natural world allows us to draw connections from which we can develop new knowledge.

Data values for many natural phenomena tend to form this normal distribution (most of the numbers in the middle and a few extreme values at either end) when not subjected to some purposeful manipulation, such as a massage treatment. That tendency gives us a baseline: we can measure the distribution of data after a treatment and see whether it differs significantly from the way the data was distributed before the treatment.

Recognition of the power of this technique lies at the heart of some of the most useful and powerful methods in mathematics and science. And over in Journal Club, you can see from the following presentation slides how Moyer and his team use this as part of their method to determine whether or not massage significantly reduces cortisol (it doesn't).

 

 

Foundational concepts: Average: Arithmetic mean

To continue going through a table of numbers pre- and post-intervention to understand what they mean, we're approaching it through some foundational concepts so that we're all on the same page about them. Once we've done that, we can bring that understanding back to the article in Journal Club.


In a previous post here, where we're walking through the numbers of a study, I mentioned that the topic of what M in the table means is a large enough topic to merit a blog post of its own. Let's explore what an arithmetic mean has to do with average, and how it relates back to the previous table.

When we're engaged in conversation, sometimes we use the term average in a somewhat imprecise way. In statistics, however, the meaning of average is more focused, covering several different measures that describe data about a group in a single measure that gives one particular overall snapshot of that group. These measures represent different approaches to averaging, each of which is useful in different situations, and each has its own strengths and weaknesses.

The term mean is frequently used in the literature as shorthand for arithmetic mean, which is the most common type of mean encountered in reading massage research literature. If you ever do encounter another, less common, measurement of mean used in a study, the specific type (such as, for example, geometric mean) will be explicitly noted.

The arithmetic (pronounced "a-rith-MET-ic") mean is the method of averaging values that is likely most familiar to people reading this post, since it is commonly used to calculate school grades. It's found by adding all the values of the data together and then dividing that sum by the total number of data points.

For example, if the scores on your first 3 anatomy tests were 80, 82, and 91 (the data points), your average score for those tests would be:

Score 1: 80

Score 2: 82

Score 3: 91

 

 

Average = arithmetic mean of 80, 82, and 91 = (80 + 82 + 91) / 3 = 253 / 3 = 84.33 (rounded to two decimal places)

So your average grade for your anatomy exams in this example is 84.33.
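The same add-then-divide calculation, sketched in Python. The variable names are illustrative; `statistics.mean` is the standard-library function that does the same thing in one step.

```python
from statistics import mean

scores = [80, 82, 91]  # the three anatomy test scores from the example

total = sum(scores)            # add all the values together
average = total / len(scores)  # divide by the number of data points

print(f"Sum of scores: {total}")
print(f"Number of scores: {len(scores)}")
print(f"Arithmetic mean: {average:.2f}")

# The standard library gives the same answer.
print(f"statistics.mean: {mean(scores):.2f}")
```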

The mean is often represented in the research literature by the letters m, x, M, or X. For example, the statement in a research article that

the final sample was comprised of 34 women (M age = 53)

 

indicates that the women in the study had a mean age of 53.

The statement all by itself doesn't tell us what their individual ages were, just that the sum of all the ages of the women in the study, divided by the number of women in the study (here, 34), yielded an arithmetic mean of 53 years of age. So:

  • you might have 34 women, all in their mid-fifties and very close to the average, or
  • you might have a group that is composed half of women in their teens, and the other half of women in their 90s.

 

The fact that both groups' average age is 53 is not enough to distinguish unambiguously between the two situations.
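Here is a small sketch of those two situations. Both groups are invented for illustration (they are not data from any study): each has 34 women and a mean age of 53, but the individual ages could hardly be more different.

```python
from statistics import mean

# Hypothetical group A: 34 women, all in their fifties and close to the average.
group_a = [52] * 17 + [54] * 17

# Hypothetical group B: half in their teens, the other half in their 90s.
group_b = [16] * 17 + [90] * 17

for name, ages in [("A", group_a), ("B", group_b)]:
    print(
        f"Group {name}: n = {len(ages)}, M age = {mean(ages)}, "
        f"youngest = {min(ages)}, oldest = {max(ages)}"
    )
```

The "M age" is identical for both groups; only the youngest and oldest ages reveal how different they are. That is why a mean is usually reported together with a measure of spread.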

 

We can easily illustrate this problem: according to the Wiki page on Bill Gates [1], his net worth is $56 billion. So if he and I sat down for a coffee, the average net worth at that table would be $28 billion.

Seriously, though, does saying that Bill Gates and I have an average net worth of $28 billion--which is absolutely true--tell you anything useful at all about me that you didn't already know?

This is the biggest disadvantage of the arithmetic mean measurement: if any of the data being averaged is extremely high or extremely low compared to the rest, the mean can be so different from individual scores in that data that it does not give an accurate description, especially when there are only a few data points.
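A quick sketch of the coffee-table example shows how far the mean can sit from every individual value. The $56 billion figure is the one cited above; the author's net worth is not given, so the $50,000 used here is a purely hypothetical stand-in.

```python
gates = 56_000_000_000  # net worth cited above, in dollars
author = 50_000         # hypothetical stand-in; the real figure is not given

table = {"Bill Gates": gates, "Author": author}

average = sum(table.values()) / len(table)
print(f"Average net worth at the table: ${average:,.0f}")

# How far is each person from that average?
for person, worth in table.items():
    gap = abs(worth - average)
    print(f"{person}: ${worth:,} is ${gap:,.0f} away from the average")
```

Neither person is within billions of dollars of the "average" person at the table, so the mean describes nobody who is actually sitting there.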

However, in many situations, especially when the data is symmetrically distributed in a bell-curve formation (a topic we'll devote a post to later on), the average is a reasonable snapshot for describing to the reader what the individuals in the group are like.

The arithmetic mean is useful enough, and used frequently enough, that--unless the researcher notes otherwise--when you see m or M in a table, you should remember that it stands for mean, and that it's typically the arithmetic mean that the author, well, means.

The columns labeled "M" in Table 1 [2] give us average measurements at 4 different points in the study.

The first M column contains average values pre-intervention on Day 1:

 

 

The second M column contains average values post-intervention on Day 1:

 

(For the moment, ignore the superscripted a; we'll get back to what that means very soon.)

 

 

The third M column contains average values pre-intervention on Day 30:

 

 

 

The fourth M column contains average values post-intervention on Day 30:

 

That is, then, what the 4 columns labeled "M" mean--they contain the arithmetic mean, or average, values, for the anxiety and cortisol measurements in the group described in the rows.

 

References

[1] http://en.wikipedia.org/wiki/Bill_Gates accessed 7 September 2011

[2] http://jpepsy.oxfordjournals.org/content/22/5/607.full.pdf accessed 7 September 2011

Let's do the math together

Over in the Journal Club, commenter rchunco suggested:

 

 

Source: http://jpepsy.oxfordjournals.org/content/22/5/607.full.pdf p. 613, accessed 6 September 2011

I know we haven't got into stats on the main study yet, but before we do, I think it might be a good idea to go through this chart so that people understand it's meaning. 

 


I think it's a good thing to do out here, since it's of general interest--it doesn't apply only to the article we're looking at this month, but to all of them in Journal Club, the RAAs, the Massage-CATs, and more.

To go through the chart, let's make a list of what's there that we want to make sure we're on the same page about.


  • Pre-Post Assessment: this means that the assessment (the evaluation) was carried out once before (pre-) intervention, and a second time after (post-) intervention;

  • Massage and Relaxation Groups: the relaxation group is intended to disentangle what results are due to massage itself, versus what results are due to just the fact an intervention is taking place, and what results are due to relaxation. It does this in the following way: the parents in the massage group administered massage to their children at bedtime every day for 30 days, and the parents in the relaxation group conducted a structured relaxation session at bedtime every day for 30 days.

If only the massage group were receiving parental attention, then we couldn't tell whether any changes in that group were due to the massage treatment itself, or whether they were due to the parent's attention. Since both groups are receiving parental attention at the same time of day for the same number of days, and since both groups are relaxing, those are two fewer additional factors that could confound (confuse) the process. For this reason, presumably, any effects shown by the massage group and not by the relaxation group can be traced to the massage.


  • M: mean, or average. This topic is big enough to get a blog post of its own.

  • SD: standard deviation. Like mean, this topic is big enough to get a blog post of its own.

  • Parent anxiety (STAI): Anxiety experienced by the parent as measured by the State-Trait Anxiety Inventory, a questionnaire used to distinguish between state (short-term) and trait (long-term) anxiety.

  • Child anxiety (behavior): Anxiety exhibited by the child as rated by an observer using the Behavior Observation of the Child's Anxiety Level scale.

  • Child saliva cortisol (ng/mg): The child's salivary cortisol concentration as measured in nanograms (billionths of a gram, or 10^-9 grams) of cortisol per milligram (thousandths of a gram, or 10^-3 grams) of saliva. This is a little odd--since saliva is a liquid, I would have expected the unit to be ml (milliliters), or something similar--I don't understand why mg (which is usually used for solids) was used.

  • p < .05: Like mean and standard deviation, this topic is big enough to get a blog post of its own.

  • Higher rating is optimal.: Earlier in the article, they had said it was a 3-point scale, so from that, we know that 3 is the highest score. "Higher rating is optimal" means that now we know that the high score of 3 points is the least anxious behavior (the optimal, or best possible situation), rather than the most anxious behavior. 

  • Values typically range from a low of 0.5 to a high of 2.0. Any value in this range, then, is considered to be a normal value of the concentration of cortisol in saliva.
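To make the cortisol unit and its typical range a little more concrete, here is a small sketch. It converts a reading in ng/mg into grams of cortisol per gram of saliva, using the prefixes described above, and checks it against the 0.5 to 2.0 range from the table's footnote. The function names and the sample readings are invented for illustration; they are not values from the study.

```python
import math

GRAMS_PER_NANOGRAM = 1e-9   # nano = one billionth
GRAMS_PER_MILLIGRAM = 1e-3  # milli = one thousandth

# Typical range given in the table's footnote, in ng/mg.
TYPICAL_LOW, TYPICAL_HIGH = 0.5, 2.0


def ng_per_mg_to_mass_fraction(value):
    """Convert ng of cortisol per mg of saliva to grams per gram."""
    return value * GRAMS_PER_NANOGRAM / GRAMS_PER_MILLIGRAM


def in_typical_range(value):
    """True if a reading in ng/mg falls within the footnote's typical range."""
    return TYPICAL_LOW <= value <= TYPICAL_HIGH


# Hypothetical readings, chosen only to exercise the two functions.
for reading in [0.5, 1.2, 2.0, 2.6]:
    fraction = ng_per_mg_to_mass_fraction(reading)
    print(
        f"{reading} ng/mg = {fraction:.1e} g of cortisol per g of saliva; "
        f"within typical range: {in_typical_range(reading)}"
    )

# 1 ng/mg works out to about one part per million by mass.
print(math.isclose(ng_per_mg_to_mass_fraction(1.0), 1e-6))
```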

 

To be continued: Mean, standard deviation, p, and walking through the numbers

 

 
