CHAPTER 2. THE UNIFYING NATURE OF STATISTICS

The average person sees no need to study higher mathematics and leaves it to the engineer or the professional mathematician. In the movie "Peggy Sue Got Married" the heroine goes back in time to her high school. In algebra class she excuses herself from the room. When the teacher begins to protest, she stops him short by saying "Trust me. I know I will never have any use for algebra in my life." Her attitude sums up the popular belief that mathematics is useless for daily life. This chapter, however, will show the value of higher mathematics and, more specifically, that statistics underlies the scientific study of all liberal arts disciplines.

Anyone interested in doing in-depth analyses of any of the liberal arts needs to understand the logic of statistics. The good news for people who are afraid of statistics and math is that this logic is not difficult to understand (and is actually more important than knowing the mathematical formulas). Furthermore, the modern educated citizen needs to understand statistics in order to sort the correct uses of statistics from the incorrect ones, and so better tell truth from falsehood. Not only do statistical methods let us uncover the hidden mysteries of the universe, they also greatly aid our logic, for an understanding of statistics is actually more helpful than the traditional philosophy course in logic.

The Bases for Statistics: Physical Behavior of Atomic Particles

When physicists began to study the atom closely, they found that they could not always determine the locations of atomic particles. For instance, the position of the electron at any given point in time is subject to chance. The reason for this (Trefil 1980:38-39) is that the electron has properties of a particle, but also properties normally associated with waves. Given the wavelike properties, quantum mechanics uses probability equations to determine the location of electrons. The Schrödinger equation describes how the wave function changes over time, and the wave function in turn gives the probability of finding the electron at a certain point. Another way of putting this is that if you give a physicist the shape of the wave pattern at one point in time, he or she can then predict the shape of the wave and where it will be for any given future time.

The variation in the location of atomic particles may appear to be unpleasant news for those who wished for an orderly universe. This randomness, however, is good news, not bad. Random variations in the atom may be the ultimate explanation for the randomness we find in all the liberal arts. And this is good news indeed because we already know that the mathematics of probability applies to all the liberal arts disciplines, and we already have very sophisticated techniques to account for random variation.

The methods of statistics can describe the movements of the atom and its particles and, therefore, can be used to describe every other phenomenon in the universe. Rather than appearing disorderly because of random variation, the world now becomes very understandable, for the laws of probability give it considerable order.

The existence of statistical probabilities means that the universe is following the equivalent of natural laws, but in this case they are the laws of probability. All things in the universe follow these basic laws. This sets the stage for understanding all forthcoming knowledge. Indeed, these laws form the context in which ecology and evolution function.

The Logic of Statistics

Every elementary student learns the value of basic mathematics for daily life. It assists them in such key activities as paying for their purchases. They see the value of counting and its associated operations such as addition, subtraction, multiplication, and division. They see this as very practical, but in no way related to science. What they do not realize is that this very simple ability to count is the key to in-depth analysis of all the liberal arts disciplines, for counting is the first step in statistics.

Statistics starts with descriptive statistics, and the simplest descriptive statistic merely counts the number of occurrences of each value of a single factor or variable. Examples include how many living species exist in the world, how many American adults regard themselves as members of the Democratic and Republican parties, or how many people in a population have each intelligence quotient score.

The occurrences or values of such an interval-level variable as intelligence quotient can be plotted on a graph with the intelligence quotient value on the X or horizontal axis and the number of occurrences on the Y or vertical axis. The resulting variation of the values creates the well-known bell curve. This means that most occurrences cluster around the top of the bell. In a symmetrical bell curve this peak can be described in three different ways. It is the mean, for it is the arithmetical average of all the values of the variable. It is also the mode because it is the most frequently occurring value. And it is the median because it is the midpoint value of all the occurrences. As one moves away from the mean-median-mode point, one moves down the sides of the bell, where occurrences become rarer. The amazingly useful property of the bell curve is that it applies to all the liberal arts, and can be described in very standard ways that make comparisons possible no matter what the subject matter.
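The counting and the three measures of the center described above can be sketched in a few lines of Python. This is only a minimal illustration: the list of scores is invented for the example, and the variable names are assumptions of this sketch rather than anything taken from a real study.

```python
from collections import Counter
from statistics import mean, median, mode

# Hypothetical scores, invented for illustration only (not real data).
scores = [85, 90, 95, 100, 100, 100, 105, 110, 115]

# Counting: how many times each value occurs (the height of the curve).
counts = Counter(scores)

# Three descriptions of the center of the distribution.
center_mean = mean(scores)      # arithmetic average of all values
center_median = median(scores)  # midpoint value of the sorted scores
center_mode = mode(scores)      # most frequently occurring value

# Crude text histogram: one asterisk per occurrence of each value.
for value in sorted(counts):
    print(f"{value:>4} {'*' * counts[value]}")

print("mean:", center_mean, "median:", center_median, "mode:", center_mode)
```

Because these made-up scores are symmetrical around 100, the mean, median, and mode all land on the same value, which is the situation the bell curve describes. With lopsided data the three measures would drift apart.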

These simple descriptive statistics tell us a great deal, and entire books are sometimes composed of them. If, however, the investigator wants to dig deeper to discover the cause of the variation in the descriptive statistics, he or she has to use more sophisticated, but not difficult, statistical methods. A simple way to judge if there is a relationship between two variables (which may be causal in nature) is to make a cross-tabulation of the two variables, with one variable being considered the cause of the other. The causal variable is called the independent variable and the affected variable is the dependent variable. The values of the independent variable become the rows of the cross-tabulation, and the values of the dependent variable become the columns. The number of occurrences of each value of the dependent variable at each value of the independent or row variable is counted and written in the corresponding cell of the cross-tabulation. The counts in each cell can then be turned into percentages of the row total. For instance, we may want to study the influence of social class on political party vote. The independent variable is the three social classes of upper, middle, and lower, which become the three rows. The values of the dependent variable, Republican and Democrat, become the two columns. When the percentages for each cell are calculated and then compared, we find that the higher the social class, the more likely a person is to vote Republican. We can tentatively conclude from this association that there is a causal relationship between social class and political party affiliation.
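The social class and party example can be sketched as a small cross-tabulation in Python. The survey counts below are invented purely to show the mechanics and should not be read as findings; the function name `row_percentages` is likewise an assumption of this sketch.

```python
from collections import Counter

# Hypothetical survey responses as (social_class, party) pairs.
# The counts are invented for illustration; they are not real data.
responses = (
    [("upper", "Republican")] * 30 + [("upper", "Democrat")] * 20
    + [("middle", "Republican")] * 50 + [("middle", "Democrat")] * 50
    + [("lower", "Republican")] * 20 + [("lower", "Democrat")] * 30
)

rows = ["upper", "middle", "lower"]    # values of the independent variable
columns = ["Republican", "Democrat"]   # values of the dependent variable

# Count the occurrences that fall in each cell of the table.
cell_counts = Counter(responses)

def row_percentages(row):
    """Return each column's share of one row, as a percentage."""
    total = sum(cell_counts[(row, col)] for col in columns)
    return {col: 100 * cell_counts[(row, col)] / total for col in columns}

table = {row: row_percentages(row) for row in rows}

for row in rows:
    cells = "  ".join(f"{col}: {table[row][col]:.0f}%" for col in columns)
    print(f"{row:<7} {cells}")
```

Percentages are taken within each row so that classes of different sizes can be compared directly. In this made-up table the Republican share falls from the upper row to the lower row, which is the kind of pattern the text treats as a tentative sign of a relationship.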

These cross-tabulations can be made more complicated by adding one or more independent variables to the cross-tabulation table. But the problem with this methodology is that the more variables one adds, the more the table cells proliferate. The situation soon gets out of hand. Since there is usually a limited number of observed cases, these occurrences become thinly spread out within the table. One quickly finds that there are few or no values at all in some of the cells. This makes drawing conclusions from the data highly precarious. This problem, however, is quickly solved by moving to a different statistical methodology.
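A little arithmetic shows how quickly the cells multiply. The sketch below assumes a hypothetical sample of 500 cases and some hypothetical added variables (sex, region, age group); the number of cells is simply the product of the number of categories of each variable.

```python
from math import prod

def cell_count(categories_per_variable):
    """Number of cells in a cross-tabulation of the given variables."""
    return prod(categories_per_variable)

n_cases = 500  # hypothetical sample size, chosen for illustration

# Start with class (3) by party (2), then add hypothetical variables:
# sex (2 categories), region (4 categories), age group (5 categories).
designs = [[3, 2], [3, 2, 2], [3, 2, 2, 4], [3, 2, 2, 4, 5]]

for design in designs:
    cells = cell_count(design)
    print(f"{len(design)} variables: {cells:>3} cells, "
          f"{n_cases / cells:.1f} cases per cell on average")
```

With five variables the same 500 cases are spread over 240 cells, roughly two cases per cell on average, so many individual cells would hold few or no cases.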

In the simplest case, of two variables, the variables are plotted on a graph. The dependent variable is plotted on the Y axis, while the independent variable is plotted on the X axis. The value of the dependent variable for every value of the independent variable is plotted on the graph and produces a scatter diagram. If the two variables are related, there will be some pattern to the dots. Usually, there will be a clustering of the dots along a diagonal line. If there is no pattern, the dots will be spread around the graph, looking more like a round cloud of dots. There are various statistical measurements that summarize the degree of relationship between the two variables, one of which is the degree of correlation of the two variables. The correlation coefficient varies from -1.0 to +1.0. A value of 0 means no linear relationship, a value of +1.0 or -1.0 means a perfect association, and the sign shows whether the dependent variable rises or falls as the independent variable increases. The beauty of this methodology is that many variables can be analyzed statistically at one time to produce a better explanation of the variation in the dependent variable. The method of multiple correlation allows analysts to look at the impact of each independent variable, while holding constant the variation in the other independent variables. This is a wonderful methodology that is by itself free from bias (although the research question and the research variables can both be biased).
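One common measure of correlation is the Pearson coefficient, sketched below in plain Python. The paired observations are invented for illustration, and the function name `pearson_r` is an assumption of this sketch. Note that the coefficient carries a sign: its size runs from 0 to 1.0, and a negative value marks a relationship in which one variable falls as the other rises.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the two means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical paired observations, invented for illustration.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # dots fall exactly on a rising line
falling = [10, 8, 6, 4, 2]   # dots fall exactly on a falling line
no_trend = [2, 1, 3, 1, 2]   # dots show no linear trend

for label, y in [("rising", rising), ("falling", falling),
                 ("no trend", no_trend)]:
    print(f"{label:<9} r = {pearson_r(x, y):+.2f}")
```

The first two series give coefficients of size 1.0 with opposite signs, and the third gives a coefficient of essentially zero. Real data almost always fall somewhere in between.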

Nothing helps the logical mind more than a course in statistics. Many times in life people make the most naive statements, ones they never would have made if they had just known a little statistics. The most common example comes in the realm of racism, where racists claim that race is a direct cause of crime. A simple control for the variable of economic status shows the naive person how much of the variation in crime by race disappears when considered independently from the influence of social class. And, of course, there are hundreds of other variables that have to be controlled before one could conclude a causal link between race and crime.
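The logic of a statistical control can be shown with a deliberately extreme sketch. The counts below are constructed for illustration and are not real data: two hypothetical groups, labeled A and B, have identical rates of some event within each level of economic status, but the groups differ in how their members are distributed across those levels.

```python
# Hypothetical counts, constructed for illustration only (not real data).
# Keys are (group, economic_status); values are (people, events).
data = {
    ("A", "low"):  (200, 20),
    ("A", "high"): (800, 16),
    ("B", "low"):  (800, 80),
    ("B", "high"): (200, 4),
}

def rate(cells):
    """Events per 100 people across the given (people, events) cells."""
    people = sum(p for p, _ in cells)
    events = sum(e for _, e in cells)
    return 100 * events / people

def raw_rate(group):
    """Rate for a group, ignoring economic status."""
    return rate([v for (g, _), v in data.items() if g == group])

def controlled_rate(group, status):
    """Rate for a group within a single level of economic status."""
    return rate([data[(group, status)]])

for group in ("A", "B"):
    print(f"group {group}: raw rate {raw_rate(group):.1f} per 100")

for status in ("low", "high"):
    for group in ("A", "B"):
        print(f"  {status:<4} status, group {group}: "
              f"{controlled_rate(group, status):.1f} per 100")
```

Ignoring economic status, group B appears to have more than twice the rate of group A. Within each level of economic status the two groups are identical, so in this constructed case the entire raw difference comes from the control variable. Real data rarely divide so cleanly, which is why further controls are usually needed.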

One note of caution about statistics needs to be stressed. In the sciences and social sciences the dominance of statistics has gone too far and become a substitute for thinking about questions that cannot be adequately treated by statistical methods. This, however, is more the result of the larger society discouraging academicians and others from drawing larger conclusions that might call into question some of the basic assumptions of the society. Despite this caution, statistics should be taught in school more than it is. And it should start early.

Students should be taught that statistics dominates in all the scientific and social scientific journals, and is spreading in the humanistic disciplines. Therefore, one cannot evaluate the statements of the liberal arts disciplines without a basic appreciation for mathematics and statistics.