Statistical analysis may not be perfect, says Professor Michael Steele, but in a world full of uncertainty, it’s the best tool available for predicting the future.
By Robert Strauss
Economics is widely seen as the “dismal science,” and in the opinion of many, statistics is not far behind. But during a conversation with Professor J. Michael Steele, the image of statistics as dry, abstract and far removed from everyday life is quickly dispelled.
It’s not just that Steele finds statistics intellectually challenging, that he enjoys teaching it, that his research has led him all over the globe, or that he has been hooked on the subject ever since high school when he read a book on probability theory called Lady Luck.
Steele, the C.F. Koo Professor of Statistics and Operations and Information Management, sees statisticians as the only people in the history of mankind who have ever succeeded in predicting future events. “Fortune tellers or astrologers may do well for a while, but soon enough they end up being exposed as frauds or charlatans,” he says.
Statistical analysis has a different track record. While it is not without flaws, and doesn’t always lead to powerful conclusions, and at times creates more uncertainty than most humans would like, “it is, nevertheless, the best technology the planet has come up with to say something serious about the future.”
Steele sees statistics everywhere: in the theory that leads to improved quality in our television sets; in the current health care debate over medical outcomes; in the day-to-day work of stock analysts; in the number of murders expected in the U.S. during 1995; in a gambler’s lawsuit against an Atlantic City casino; and in the enormous R&D efforts of organizations as diverse as the pharmaceutical giant Merck and the U.S. Army — just to name a few of the examples that came up during a recent conversation about the ubiquity of statistics.
Hidden Markov Models
Statistics is composed of two parts, the first being the collection, organization and integrity of data. “We all take for granted a report in the newspaper that tells us the inflation rate,” says Steele. “But if you actually sat down and looked at the procedure for figuring out the Consumer Price Index, you would be astounded at the intellectual background behind it and the amount of labor and money that goes into it.”
Even the definition of terms requires an exacting level of precision: “For example, what do you mean by ‘inflation’? Is it service sector inflation? Producer inflation? Original goods and industrial output inflation? Inflation of prices of real property? Simple measurement of the economy is a difficult process and to do it right, the government must make a sustained investment in data collection and data integrity.”
The same, of course, is true for work that goes on in the private sector. “The whole pharmaceutical industry, for example, is driven by rigorous experimental processes that are completely rooted in statistics, such as drug testing, drug design, drug searching, and allocation of resources to drug investments,” says Steele. “Drug testing itself is a multiple process where you have to do toxicological studies, test for efficacy in animal populations, run clinical trials for humans, and finally develop a set of protocols to monitor for side effects,” to name a few of the measurements used.
“Merck is just one company in the pharmaceutical industry, but its research budget for 1994 was about $1.2 billion — more than the total National Science Foundation budget for physical sciences, geosciences and engineering. And the intellectual agenda that drives that research department is deeply rooted in statistics.”
The second component of statistics is model building. Steele’s own interest lies in probability theory, which he defines as the mathematical subject that goes into the modeling of chance behavior in all of its aspects. The phenomenon can be as simple as rolling dice or as complicated as the errors that occur in sophisticated measurements, such as those used to identify the location of the space shuttle in flight.
Scientists analyze a statistical experiment by comparing what they observe in their data to alternative explanations that are driven by chance, says Steele. “The stalking horse of every statistical analysis is chance — the possibility of the observed outcome just being a random occurrence. In drug tests, for example, if you give a drug to 25 people, and 14 do well and 11 don’t, then to understand the force of the evidence, you have to find some way to compare your data to a chance occurrence.”
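One simple way to make Steele’s comparison concrete — a sketch, not the analysis any particular trial would use — is to ask how often pure chance, a coin flip for each patient, would produce a result at least as good as the one observed:

```python
from math import comb

def binom_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more successes."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 14 of 25 patients did well. How often would chance alone (p = 0.5)
# do at least that well?
p_value = binom_tail(25, 14)
print(f"P(14 or more successes by chance) = {p_value:.3f}")
```

The answer is roughly 0.345 — chance alone produces a result this good about a third of the time, which is exactly why 14 of 25, on its own, is weak evidence.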
Steele is particularly interested in Markov processes and in their statistical cousins, hidden Markov models. Markov processes are named after Andrei Markov, a Russian mathematician who worked around the turn of the century, but hidden Markov models are a recent innovation. They are built around an abstract notion called “the state of the process.”
“Imagine a chessboard, where on each square of the board someone has written a letter. One of those squares might truly have the letter ‘L’ in it, but if you saw a corrupted picture of the chess board, you might see in that square what you think is an ‘I’. In this example, ‘L’ would be the true state, what is truly there, and ‘I’ would be your incorrect but still partially informative observation. In hidden Markov models, you use a probabilistic relationship — the Markov property — to tie together all your imperfect observations to make the best guess you can about the true values, or true states, of the squares on the board.”
The hidden Markov model offers an abstract framework that serves many areas of application. One can use the framework to analyze the economy during periods of expansion or contraction, or to analyze the responsiveness of psychiatric patients during periods of mania or depression. Hidden Markov models are part of an ongoing project Steele is pursuing for the U.S. Army Night Vision Laboratories. The project has several technical goals, but ultimately these goals all come to focus on how to improve images and how to use images to make decisions. “The process of being able to decide whether or not the object below is a target is a decision problem which, at a certain level of abstraction, is not so different from the process used in pharmaceutical companies to determine the effectiveness of a particular drug,” says Steele. “Instead of looking at what happens to 25 rats, you are looking at the gray levels of a sensing device for as many as 100,000 pixels (the smallest unit that makes up a computerized or digitized image).
“For any one of those pixels in an image there is a true value, called the true state. But given that there are all sorts of measurement error — including angle error, calibration error and so forth — what you observe is not that true state. The true state is a hidden state and it influences your perception but doesn’t necessarily determine your perception.
“With hidden Markov models, you tie together all those imperfect observations to make the best conceivable guess you can about the true values, true states, which led to the picture you are looking at.”
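The standard algorithm for this “best guess” is Viterbi decoding. The toy below — states, probabilities, and observations all invented to echo the chessboard example — recovers the most likely sequence of true letters from noisy readings by tying them together with the Markov property:

```python
# Toy Viterbi decoding for a hidden Markov model. All probabilities here
# are invented for illustration.
def viterbi(obs, states, start_p, trans_p, emit_p):
    # best[s] = probability of the best path ending in state s
    best = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    path = {s: [s] for s in states}
    for o in obs[1:]:
        new_best, new_path = {}, {}
        for s in states:
            # pick the predecessor state that makes ending in s most likely
            prev = max(states, key=lambda r: best[r] * trans_p[r][s])
            new_best[s] = best[prev] * trans_p[prev][s] * emit_p[s][o]
            new_path[s] = path[prev] + [s]
        best, path = new_best, new_path
    final = max(states, key=lambda s: best[s])
    return path[final]

# Two hidden letters; a true 'L' is often misread as 'I'.
states = ["L", "I"]
start_p = {"L": 0.6, "I": 0.4}
trans_p = {"L": {"L": 0.7, "I": 0.3}, "I": {"L": 0.4, "I": 0.6}}
emit_p = {"L": {"L": 0.6, "I": 0.4}, "I": {"L": 0.1, "I": 0.9}}

print(viterbi(["I", "L", "I"], states, start_p, trans_p, emit_p))
```

With these numbers the decoder returns L, L, L: because an ‘L’ is often misread as an ‘I’, the model overrides two noisy ‘I’ readings — precisely the corrupted-chessboard intuition.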
Steele’s probability models and associated algorithms — the mathematics that precedes the actual writing of computer code — are used by the army to design equipment that will facilitate tasks like automatic target recognition.
In another recent, but quite different, application of statistical analysis, Steele used a mathematical tool called the theory of martingales during testimony as an expert witness in a lawsuit against an Atlantic City casino. The suit involved a way of playing blackjack called “counting,” in which a card-playing patron keeps track of a simple summary statistic (called “the count”) of the cards that have been played, and uses this statistic to inform his betting and playing strategy.
Because the dealer doesn’t reshuffle after every hand, information about the cards is carried over from one hand to the next, and this information can be used to increase the player’s odds — even to the point of having an advantage over the house. The plaintiff in the case claimed the casino had denied him the opportunity to practice counting, and thus deprived him of the opportunity to earn substantial money. “The nature of New Jersey public access laws granted the plaintiff rights that he would have been denied in Nevada, such as the right to be present at the table and to engage in counting,” says Steele.
Steele used the theory of martingales to estimate the probability of ruin — i.e., the probability that the plaintiff would lose all of his initial stake in the course of applying the counting methods over a period of years. More central to the case, and to the estimation of damages, was Steele’s estimate of how much the plaintiff would have won if allowed to play according to the conditions that are provided for typical, non-counting patrons.
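Steele’s actual calculation isn’t described, but the textbook gambler’s-ruin result — derived by applying optional stopping to the martingale (q/p)^X — gives the flavor. The numbers below (a 1 percent edge, a 100-unit bankroll) are invented for illustration:

```python
def ruin_probability(p, stake, target=None):
    """Probability that a gambler betting one unit at a time, winning each
    bet independently with probability p, loses the entire stake.
    With no target, a player with an edge (p > 0.5) is ruined with
    probability (q/p)**stake; with p <= 0.5, ruin is certain.
    With a target, this is the classical gambler's-ruin formula."""
    q = 1 - p
    r = q / p
    if target is None:
        return min(1.0, r ** stake)
    if p == 0.5:
        return 1 - stake / target
    return (r ** stake - r ** target) / (1 - r ** target)

# A counter with a 1 percent edge per unit bet and a 100-unit bankroll:
print(f"ruin probability: {ruin_probability(0.505, 100):.3f}")
```

Even with a genuine edge, this hypothetical counter goes broke about 13.5 percent of the time — the kind of sobering figure a probability-of-ruin analysis produces.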
The substantial judgment awarded to the plaintiff is expected to be appealed. If the appeal is denied, Steele says, one likely outcome is that the rules of New Jersey blackjack will be changed.
Empowering the Experts
A subject that Steele returns to frequently in conversation is “this puzzle about predicting the future” — or at least pieces of it.
“It is frightfully easy to predict that over the next 12 months there will be about 23,000 murders in the U.S.,” he states. “Naturally there is uncertainty in this estimate, but I will bet you a boatload there won’t be fewer than 18,000 or more than 30,000. Each one of those murders is a cataclysmic event for the people involved, yet in aggregate we can say something pretty precise and sensible about that subject.”
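Why is an aggregate like this so predictable? A back-of-the-envelope sketch: if the count were purely Poisson, its standard deviation would be the square root of its mean, making the natural fluctuation tiny relative to the total:

```python
from math import sqrt

mean = 23_000            # predicted annual count of murders
sd = sqrt(mean)          # Poisson model: variance equals the mean
lo, hi = mean - 3 * sd, mean + 3 * sd
print(f"sd = {sd:.0f}; 3-sigma band = [{lo:.0f}, {hi:.0f}]")
```

The pure-Poisson band is only about ±450 around 23,000 — far narrower than Steele’s 18,000-to-30,000 bet. Real counts vary more because the underlying rate itself drifts from year to year, which is why his deliberately wide range is still the prudent wager.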
Statistics is at its best, Steele says, when lives and livelihoods are at stake. Data on the growth of the over-60 population as a percentage of the total population, for example, may not seem particularly compelling to many, but in Steele’s view, changes in that simple ratio can cause the shift of hundreds of billions of dollars of the nation’s wealth over the next 20 years.
“Statistics is not always a very good tool, but it’s often the best that is available. Most times, uncertainty is far greater than we humans would like to admit. We have a preference for certainty and at times we will become more ‘certain’ even when nothing material has changed. In one recent psychological study — and almost all of these studies are written on the basis of statistical analysis — an interviewer asked gamblers waiting in the betting line at the race track about their confidence in the horses they were about to bet on. Other gamblers were asked the same question after they had placed their bets. The players who had already bet were much more confident in their choices than those who hadn’t yet put their money down.
“It’s one example of how consoling confidence is to human beings. We like certainty, even certainty founded on the most fickle of bases. Statistics offers one example of a sound basis. But statistics isn’t there to make anyone feel good. It’s there to reveal truth.”
Statistics can, in fact, be troubling to those who want to make risk-free decisions. Steele recently went to India with a group of MBA students and took the highest-recommended anti-malarial drug on the market. “The best data they have on this drug is that it’s 75 percent effective. Well, if you sit down and consider what it means to have a 25 percent probability of contracting malaria given a certain level of exposure, it’s pretty terrifying. Still, this is an extreme case. If you look at the childhood vaccines, statistical analysis will help you understand that they save many millions of lives. This does not deny that the vaccine may cost some lives as well. In rare circumstances, there are adverse reactions to quite safe things.”
In choosing which drug to take, which car to buy, how to allocate our personal assets, we all rely on the experts, says Steele. “We have no alternative in many cases except to make decisions by empowering the people who work on such decisions all day long every day, and then holding them accountable for their success and failure over a large number of actions. And the experts, in turn, must rely on experiment, the integrity of data, and a lot of people telling the truth about the data that has been collected.”
Wine Cellars and Lamp Posts
Steele graduated from Cornell University in 1971 with a degree in mathematics and earned his PhD in mathematics from Stanford in 1975. He taught in the statistics departments at the University of British Columbia, Stanford, the University of Chicago and Carnegie Mellon. From 1983 to 1990 he taught at Princeton, first in the statistics department, and then in the engineering school where he created a program called Statistics and Operations Research. He came to Wharton in 1990.
He recently completed service as a member of the National Academy of Sciences Committee on “Improvement of Data on International Capital Investments” and as editor of the Annals of Applied Probability, which he founded in 1989.
Steele also has been involved in the practical side of finance. Several years ago he worked at Sanford Bernstein in New York as a fixed income research statistician. His job included studying aspects of the yield curve and the development of statistical tools for the pricing and trading of U.S. Treasury bonds. At Wharton, current projects include statistical analyses that support risk management in the financial service industries and an empirical analysis of foreign exchange management that has led him to conduct interviews with managers of foreign exchange trading at many of the nation’s largest financial institutions.
In his teaching of statistics, Steele makes a serious effort to enliven his classes with material chosen from practical situations. In Statistics 202 for undergraduates, alongside assignments on simple and multiple regression models, rates of change of parameters and predictors, multiple correlation coefficients, and the relationship of honest residuals to the X matrix, he includes references to an analysis of the Olympics in “Going for the Gold in Barcelona,” an Economist article entitled “Battle of the Bulge — Female Literacy Rates and Birth Rates,” a Consumer Reports article on “Secondhand Smoke” and an introduction to the National Victimization Survey.
One section of the course is devoted to using regression analysis to determine the investment potential of vintage wines. “The students used data on the prices of California Cabernet Sauvignon at New York auction to estimate a rate of return for the holding of quality wines,” says Steele. His intent is to give students “experience with the uses of statistics in economics, public policy, business decision making and personal finance.”
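A hedged sketch of that class exercise: regress log auction price on vintage age, and the slope of the fit estimates the annual rate of return to holding the wine. The prices below are invented for illustration, not the auction data the students used:

```python
# Regress log price on age; the slope approximates the annual return.
from math import log, exp

ages   = [2, 5, 8, 12, 15, 20]         # years since vintage (invented)
prices = [30, 42, 60, 95, 130, 210]    # auction price per bottle (invented)

y = [log(p) for p in prices]
n = len(ages)
mx, my = sum(ages) / n, sum(y) / n

# Ordinary least-squares slope of log price on age
slope = sum((a - mx) * (v - my) for a, v in zip(ages, y)) / \
        sum((a - mx) ** 2 for a in ages)
print(f"estimated annual return: {exp(slope) - 1:.1%}")
```

Working on the log scale is the key design choice: a constant rate of return means prices grow exponentially with age, so the relationship is linear only after taking logarithms.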
One important skill you learn in Steele’s course is how to chase down and evaluate data. There is a multitude of research reports produced on Wall Street that are filled with statistics; the print media produces its own volumes of data; and then there are the mountains of data generated by the federal government. “The challenge is to figure out which data is trustworthy and which isn’t,” Steele says. “Students start to understand that there are very big differences in the quality of statistical sources — for example, the high integrity of statistics from the National Victimization Survey vs. the near worthlessness of a fax-back poll conducted by a publishing company.
“The typical fax-back magazine poll is a marketing device, and many of the inferences one might try to make from such polls are completely bogus,” says Steele. “The key problem is that people who choose to respond to the poll are self-selecting. The typical respondents are those who care passionately about an issue. For example, a fax-back poll about abortion wouldn’t necessarily be a sample from which you could make any inferences about the national population. And yet the results of magazine polls get quoted — for better or for worse.”
One of the old saws about statistics, Steele adds, is that it’s “like a lamp post. You can either use it for illumination or for support. When you are using it for support, it’s probably not in the nation’s best interest.”
Fact or Pseudo-Fact
In Statistics 202, Steele asks his class to look at data from the Barcelona Olympics in an attempt to discern whether the host country — Spain — received a larger number of gold medals than would have been the norm based on the number of silver and bronze medals. (It did.) The students are then asked to investigate whether this suggestion of host favoritism also holds true for other Olympics.
“In statistics we try to use some facts we know to predict other facts. Suppose you told a statistician how many silver and bronze medals Germany won, and asked for his or her best guess as to how many gold medals were won. In the case of Barcelona, we discovered that Spain did indeed get more gold medals than our model would have predicted.”
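The class fit a regression model; as a crude stand-in (the actual model isn’t given here), suppose medal colors were handed out symmetrically, so a country’s expected gold count is about the average of its silver and bronze counts. Spain’s widely reported Barcelona totals were 13 gold, 7 silver, 2 bronze:

```python
# Naive symmetric-color baseline, a hypothetical simplification of the
# class's regression model.
def expected_golds(silver, bronze):
    return (silver + bronze) / 2

# Spain at Barcelona: 7 silver and 2 bronze would suggest ~4.5 golds.
predicted = expected_golds(7, 2)
print(f"predicted {predicted:.1f} golds vs. 13 actually won")
```

Even this crude baseline shows the gap the students found: 13 golds against a prediction of about 4.5.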
But in Steele’s recounting of the Barcelona study, he declines to offer an opinion as to why host countries might win an unexpected number of gold medals. “In most cases, I don’t see any great utility in speculating about causes,” he says. “I try to discourage it. You have to fight against the human urge. On the other hand, I’m perfectly prepared to have people think about what might be useful and testable inferences for the future. For example, in 1996 the Olympics will be held in Atlanta. I’m prepared to make a modest bet about the relationship between the number of golds and silvers that will be won by the U.S. I won’t bet you on the absolute number of golds or the number of silvers, but I’ll bet you on the relationship between them.”
The question is also raised as to whether statistics are used to confirm or refute prejudices. Steele readily admits that he is a very skeptical person. “First off, some of the statistics one reads in the newspaper are simply made up. They are pseudo-facts. For example, there was the story a few years ago that reported an increase in spousal abuse during the Super Bowl. When people decided to chase down the study on this, it didn’t exist. But people believed it because it was consonant with their intuition.
“In the field of science, the tradition of academia is that one provides citations so you can go back to these citations and check them. When you do, you almost always find there are more caveats in the scientific paper than were mentioned in the media coverage. In part, the media omits caveats in order to provide a view of the best of current thinking, but one may also suspect that the ‘quest for certainty’ leads to the simplified message.
“When I pose a model for some data I realize I am posing a very specific model and that the model might not be right. In fact, it is almost certainly wrong at some level of precision. I have to examine what is in the data that is consonant or dissonant with that proposed model. You have to entertain alternative hypotheses. We all want to have clean and clear statements that represent the full truth. But that’s not right most of the time.
“A statistician’s job is to remind himself constantly, and others periodically, that there is much more uncertainty out there than you can guess. And still, statistics is the only consistent technique available for helping us sort out whatever truths are to be found in our valuable data.”