The Practice of Social Research

Chapter Sixteen.  Social Statistics


Descriptive Statistics
    Data Reduction
    Measures of Association
    Regression Analysis


Descriptive statistics have the purpose of, well, describing a set of data.  For the most part, they represent an act of data reduction--summarizing a lot of data in a single number or limited set of numbers.  We've already ventured into this terrain in Chapter 14, when we examined measures of central tendency (averages).  Starting with, say, all the ages of the 2,000 people interviewed in a survey, we can represent those ages with their mean or median.

In the present section, we'll look at some ways of summarizing the strength of relationships among variables: measures of association.  In this discussion, you'll see the importance of the Chapter 4 discussion of levels of measurement.  To summarize the relationship between two nominal variables, for example, we could use lambda.  You'll get your first taste of the logic of proportionate reduction of error as a way of determining the strength of associations.  Here's an example of what that means.

There is at least a stereotype that men like football more than women do.  Let's say we do a survey to find out.  Assume further that 50 percent of the sample say they like football.  Finally, assume that we want to see how well we can guess whether individuals in the sample like football.  If we always guess they do, we'll be wrong 50 percent of the time; the same would be true if we always guessed they didn't like football.  In fact, if we flipped a coin each time to determine our guess, we'd be wrong about half the time.

Suppose 70 percent of the men said they liked football, contrasted with only 20 percent of the women.  Knowing a person's sex would help us predict whether they liked football.  Our best strategy would be to always guess that the men liked football and always guess that the women didn't.  Even though we'd still make mistakes, the reduction in the number of mistakes offers a measure of how strongly sex and liking football are related.
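To make the reduction in errors concrete, here is a minimal sketch of the lambda calculation in Python.  The counts are hypothetical: the text gives only percentages, so we assume a sample of 600 men and 400 women, chosen so that 70 percent of the men and 20 percent of the women liking football works out to 50 percent of the whole sample.  The function name lambda_pre is ours, not a standard library call.

```python
# Hypothetical counts (an assumption, not from the survey described above):
# 600 men and 400 women, so that 70% of men and 20% of women liking football
# gives 50% of the whole sample.
table = {
    "men":   {"likes": 420, "dislikes": 180},
    "women": {"likes": 80,  "dislikes": 320},
}


def lambda_pre(table):
    """Proportionate reduction of error when guessing the column category
    (likes/dislikes) with and without knowing the row category (sex)."""
    total = sum(sum(row.values()) for row in table.values())

    # Without knowing sex: guess the single most common answer for everyone.
    column_totals = {}
    for row in table.values():
        for answer, count in row.items():
            column_totals[answer] = column_totals.get(answer, 0) + count
    errors_without = total - max(column_totals.values())

    # Knowing sex: guess the most common answer within each group.
    errors_with = sum(sum(row.values()) - max(row.values())
                      for row in table.values())

    return (errors_without - errors_with) / errors_without


lam = lambda_pre(table)
print(round(lam, 2))
```

Under these assumed counts, guessing without knowing sex produces 500 errors and guessing with it produces 260, so knowing sex removes 240 of the 500 errors and lambda comes out at 0.48.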

Gamma, a statistic based on the same logic, is appropriate for representing the relationship between ordinal variables.
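As a rough illustration, gamma can be computed from a table of two ordinal variables by comparing every pair of cases.  Pairs ranked in the same order on both variables count as "same," and pairs ranked in opposite orders count as "opposite."  The sketch below assumes a made-up 3-by-3 table (rows and columns both ordered low to high); the function name and the counts are illustrative only.

```python
def gamma(table):
    """Gamma for a table of counts whose rows and columns are both ordered
    from low to high: (same - opposite) / (same + opposite)."""
    n_rows = len(table)
    n_cols = len(table[0])
    same = 0
    opposite = 0
    for r1 in range(n_rows):
        for c1 in range(n_cols):
            # Compare this cell with every cell in a higher row.
            for r2 in range(r1 + 1, n_rows):
                for c2 in range(n_cols):
                    pairs = table[r1][c1] * table[r2][c2]
                    if c2 > c1:
                        same += pairs        # higher on both variables
                    elif c2 < c1:
                        opposite += pairs    # higher on one, lower on the other
                    # c2 == c1 is a tie and is ignored
    return (same - opposite) / (same + opposite)


# Hypothetical counts, invented for this example.
ordinal_table = [
    [20, 10, 5],
    [10, 20, 10],
    [5, 10, 20],
]

g = gamma(ordinal_table)
print(round(g, 2))
```

For this invented table there are 2,200 "same" pairs and 625 "opposite" pairs, giving a gamma of about 0.56.  Pairs tied on either variable do not enter the calculation.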

Correlation and regression are statistical techniques appropriate to interval and ratio measures.
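For a sense of what these techniques compute, the following sketch works out Pearson's r and a least-squares regression line by hand for two numeric variables.  The five data points are invented for illustration, and the function name is our own.

```python
import math


def correlation_and_regression(xs, ys):
    """Return (r, slope, intercept) for the least-squares line
    y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross-products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical data, invented for this example.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r, slope, intercept = correlation_and_regression(xs, ys)
print(round(r, 2), round(slope, 2), round(intercept, 2))
```

With these made-up values the regression line is y = 2.2 + 0.6x and r is about 0.77.  The slope describes how much y changes for each unit of x, while r summarizes how closely the points cluster around that line.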