What is the matter with empirical economics? Freak, Freakonomics again.
Earlier, I posted a link to Rubinstein’s excellent Freak-Freakonomics paper. Here is another, even more telling paper that is also an indictment of the way economics does its business. I am posting so much of it because I think that it is so good.
I have never been convinced by econometric results. When I was in grad school, I imagined that it might be fun to tackle a serious economic problem, but instead of using real data I wanted to take a very large number of series of random numbers. One series could be called long-term interest rates, another the log of long-term interest rates, and others could represent treasury note rates, gross domestic product, the rate of change of gross domestic product, and so on. Then I would run regressions until some particular mix of data gave an “interesting” result — sort of like the monkeys typing up Shakespeare.
If my data did not produce something interesting, then I would have to massage it, using lagged data or transforming the data in some other way until my Shakespearean monkeys finally got it right.
My reverie occurred during the 1960s. Performing my experiment would have required mechanically punching up thousands and thousands of cards, so I was left with a pleasant daydream rather than the hard work of actually running it.
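Today, of course, no punch cards are needed. A minimal sketch of the thought experiment above — purely illustrative, with invented series names and an arbitrary “significance” cutoff of |t| > 2 — shows how reliably two unrelated random walks can appear related, the classic spurious-regression trap:

```python
import numpy as np

rng = np.random.default_rng(0)

def spurious_regressions(n_obs=100, trials=200):
    """Regress one random walk (call it 'interest rates') on another
    (call it 'GDP') and count how often the slope looks 'significant'
    (|t| > 2).  The series are independent, so the true relationship
    is zero -- yet the hit rate vastly exceeds the nominal 5%."""
    hits = 0
    for _ in range(trials):
        y = np.cumsum(rng.standard_normal(n_obs))  # fake "interest rates"
        x = np.cumsum(rng.standard_normal(n_obs))  # fake "GDP"
        X = np.column_stack([np.ones(n_obs), x])   # intercept + regressor
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n_obs - 2)       # residual variance
        se_slope = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
        if abs(beta[1] / se_slope) > 2:
            hits += 1
    return hits / trials

print(spurious_regressions())
```

Run with trending (cumulatively summed) series, the “significant” fraction typically comes out well above one half; swap the random walks for plain white noise and it falls back toward 5%. The monkeys need no help at all.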
My advisor was George Kuznets, whose brother, Simon, deservedly won the Nobel Prize. George Kuznets was an extraordinary econometrician who had trained some of the best, but he never published anything except when he was coerced by a friend to write a little paper on the demand for lemons. He understood, although he never uttered a word about it, that most of what was being published was not very solid.
And yet, many of the econometricians are absolutely brilliant people and, at least when I was in graduate school during the 1960s, very conscientious as well. I have gone into more detail about my skepticism about the way economics is practiced in Railroading Economics, although I did not spend much effort dealing with econometrics. This article does a very nice job. I have cited much more than I normally would because so much of it deserves wider exposure.
Scheiber, Noam. 2007. “How Freakonomics Is Ruining the Dismal Science: Freaks and Geeks.” The New Republic (2 April).
“In the early ’90s, Angrist and Krueger set off to resolve a question that had been gnawing at economists for decades: Does going to school increase your future wages? [Angrist, Joshua D. and Alan B. Krueger. 1991. “Does Compulsory School Attendance Affect Schooling and Earnings?” Quarterly Journal of Economics, 106: 4 (November): pp. 979-1014.] Intuitively, it seemed obvious that it did. When you compared the salaries of, say, Ph.D.s with those of high-school dropouts, the grad-school set almost always did better. The question was whether education accounted for the difference. What if it was simply the case that smarter people spent more time in school and that their bigger salaries reflected intelligence, not education? One couldn’t be sure. The only way to get to the bottom of it would be a ghastly social experiment, wherein you took a group of students and randomly sent half to the local vo-tech institute while forcing the other half to study feminist literary theory. Even an economist wouldn’t be so audacious.”
“The public school system allowed you to answer the question without all the uproar. First, most states force students to attend school until age 16. Second, for many decades, students started school the year they turned six. The upshot was that, if I were born in January and you were born in December of the same year, and we both dropped out at 16, then the rules forced you to stay in school almost one year longer. (We’d start school the same year, but I’d turn 16 midway through tenth grade and you’d hit 16 midway through eleventh.) The additional schooling foisted upon one group by this arbitrary state of affairs produced a scaled-down version of our experiment, allowing Angrist and Krueger to conclude that education did, in fact, help people earn more money.”
“Harvard grad students had invested the Angrist and Krueger paper with totemic significance.”
“It quickly became apparent that the way to win acclaim as a grad student was to devise a similarly ingenious method of tackling a problem. Several years after his paper on schooling, Angrist noticed that the Armed Forces Qualifying Test had been misgraded for a few years in the late ’70s. This had opened the doors to thousands of subpar applicants and allowed Angrist to compare the lucky underachievers with the people rejected once the glitch got corrected, thereby isolating the impact of military service on wages. The practical effect was to send the grad students scrambling to find other instances in which life-altering decisions had been handed down incorrectly. In 2000, a Harvard professor named Caroline Hoxby discovered that streams had often formed boundaries to nineteenth-century school districts, so that cities with more streams historically had more school districts, even if some districts had later merged. The discovery allowed Hoxby to show that competition between districts improved schools. It also prompted the Harvard students to wrack their brains for more ways in which arbitrary boundaries had placed similar people in different circumstances.”
“… economics had a cleverness problem. How was it that these students, who had arrived at the country’s premier economics department intending to solve the world’s most intractable problems — poverty, inequality, unemployment — had ended up facing off in what sometimes felt like an academic parlor game?”
“What if, somewhere along the road from Angrist to Levitt to Levitt’s growing list of imitators, all the cleverness has crowded out some of the truly deep questions we rely on economists to answer?”
“By the ’80s, however, the data-crunchers had come down with a crisis of confidence. In one famous episode, the eminent economist H. Gregg Lewis reviewed several studies on unions. What he found was alarming: Some papers reported that unions strongly increased wages; others reported exactly the opposite. The difference, in most cases, was simply the assumptions the authors had made.” [Lewis, H. Gregg. 1983. “Union Relative Wage Effects: A Survey of Macro Estimates.” Journal of Labor Economics, 1: 1 (January): pp. 1-27.]
“Critiques like this tipped the discipline into a prolonged bout of soul-searching. The old approach had been sweeping in its ambition. But what good were ambitious goals if the best you could do was “on the one hand/on the other hand” style equivocation or, worse, plain gibberish? “People didn’t believe the estimates being produced,” recalls David Card, then a rising star at Princeton. “They felt the evidence in economics was not very credible.” Economists had long aspired to science. Suddenly they faced a harrowing thought: What if they were no better at pinning down truth than the average critical studies major?”
“Having glimpsed this nihilistic vision, many economists ran screaming in the opposite direction. They concluded that the path to knowledge lay in solid answers to modest questions. Henceforth, the emphasis would be on “clean identification,” on sorting out what caused what.”
“The early practitioners of this approach — Angrist, Krueger, Card — had well-earned reputations as crafty researchers. But, by and large, all three men used their creativity to chip away at important questions. It was only in the late ’90s that the signs of overreach became apparent. To some professors at top departments, clean identification became a fetish. “Almost every student, myself included, had the terrible experience of getting up in front of the [professors] for whom identification is the Holy Grail, and getting cut to shreds when your identification strategy doesn’t pass muster,” recalls a recent Harvard Ph.D. The problem is that there are only so many big questions that misgraded tests or arbitrary boundaries can shed light on. If you’re wedded to these techniques, eventually they lead you in obscure directions. “People think about the question less than the method,” says Berkeley professor Raj Chetty, one of the most sought-after Harvard graduates in recent years (and a notable exception to this trend). “They’re not thinking: ‘What important question should I answer?’ So you get weird papers, like sanitation facilities in Native American reservations.””
“Many young economists began shunning big questions altogether. Jim Heckman, a Nobel Prizewinning labor economist at the University of Chicago, illustrates the point with a story. A few years ago, he struck up a conversation with a promising assistant professor. Before long, Heckman began to gripe that economists lacked a comprehensive measure of all the obstacles a child might face in life — education, nutrition, family environment, and so on. There was no way to tell if childhood disadvantages were getting better or worse.”
“He encouraged the young economist to produce such a measure. “You’re absolutely right. It’s paramount, of first-order importance,” the woman replied. But there was zero chance she’d pursue it. “It would take years to do,” she explained apologetically. The woman had clearly made the right career decision: No one coming up for tenure could afford that kind of time while her colleagues published a stream of small-bore papers. On the other hand, says Heckman, “How long did it take Madame Curie to purify radium? Two or three years? If she was somehow told, like the grad students in our program, ‘If you can’t solve the problem in a week or a month, drop it,’ a lot of big problems would have been dropped.””
“Heckman’s allusion to a certain pedagogical technique is almost surely a shot at his Chicago colleague, Steve Levitt. Levitt has been known to discourage students from laboring too long on a question for which the data are unlikely to produce a useful result. “I’ve always been someone who’s thought it’s better to answer a small question well than to fail to answer a big question,” he says. This much should not be surprising. Levitt is the product of the same environment that birthed the clean-identification movement.”
“After graduating from Harvard and doing a brief stint in management consulting, Levitt earned a Ph.D. from MIT in 1994, completing the program in a mere three years. “In the early ’90s, if you had a really great natural experiment, as we called it, you’d get yourself a job,” he says. Levitt, it turned out, had many. While still a student, Levitt wondered whether money drives election results or if the better candidate simply raises more money. He ingeniously demonstrated the latter and published the results in a top journal. Another early paper found that a slight increase in the chance of arrest dramatically deterred auto theft. Levitt discerned this by studying cities that had approved the use of Lojack, a transmitter that leads police to stolen cars. In 2001, Levitt published what is probably his most controversial finding to date: a paper highlighting the connection between the legalization of abortion in the ’70s and the falling crime rates of the ’90s. Levitt argued that unwanted children are most at risk of becoming criminals. Abortion, he concluded, lowered crime rates by reducing unwanted pregnancies.”
“Some of these papers made genuinely important contributions. The Lojack paper helped demonstrate that theft is a fundamentally rational phenomenon and can therefore be discouraged. This insight alone might have justified Levitt’s John Bates Clark Medal, a prize awarded every two years to the most outstanding economist under 40. But, at times, Levitt gave the impression he was more interested in clever techniques than answers to questions. In a 1997 paper, for example, Levitt argued that hiring more police decreases crime, a proposition for which there was surprisingly little evidence. (The fact that municipalities expand police departments when crime rates rise tends to muddle the picture.) To prove it, Levitt needed to simulate an experiment in which the size of a police force was randomly increased. His solution was to exploit the fact that mayors often hire more police officers in the run-up to an election. The only hitch, as a grad student later pointed out, was that mayors up for reelection don’t actually hire many police officers, at least not enough to show that they lower crime.”
“Whatever the flaws in this study, they were clearly the product of too much ambition, not too little. A few years later, however, Levitt debuted a new kind of paper: an investigation into offbeat phenomena from daily life. One of the earliest examples pondered the strategies soccer players employ when taking penalty kicks. Another paper studied corruption in sumo-wrestling tournaments as a window onto the power of incentives. Not long after, Levitt conducted an exhaustive inquiry into “Weakest Link,” a game show in which contestants voted to remove a player after each round of trivia questions. Tallying the voting data revealed that contestants were discriminating against Latinos and the elderly but not blacks and women.”
“If Levitt is known for his novelty, the hallmark of Heckman’s work on such issues as education and job training is its painstaking attention to detail. A few years ago, Heckman was rumored to be so upset over the direction of his department that he began looking to leave. Chicago had never been an ideal place to do empirical work. Nobel Prizewinning theorists like Gary Becker and Robert Lucas disliked dirtying their hands with data. Now the department was finally warming up to data-crunchers … and they were the kind Heckman deemed useless. “Chicago has been a little disappointing in that it hasn’t been more firm in rejecting cute and clever,” he laments.”
“Perhaps the most infamous example is a paper written by a recent Harvard Ph.D. named Emily Oster. While still an undergraduate, Oster had become fascinated by the so-called “missing women” problem — the hypothesis, attributed to Amartya Sen, that gender discrimination in Asia has created a vast shortage of women. In some cases parents abort daughters, in some cases they commit infanticide, in some cases they simply don’t care for their daughters as diligently as they should. Whatever the cause, Sen has suggested there could be as many as 100 million “missing women” in countries like China, India, and Pakistan.”
“Years later, while wrapping up her Ph.D., Oster stumbled onto a seemingly unrelated fact: a small medical literature suggesting that women with hepatitis B were far more likely to give birth to boys. What followed was a series of sophisticated natural experiments, the upshot of which was to demonstrate that 100 million women hadn’t gone missing after all. Instead, unusually high rates of hepatitis B had arranged it so that Asian mothers were producing far more boys than nature’s track record would suggest.”
“It was a fabulously compelling result, one that partially absolved whole societies of lurid crimes against their children. It was also a vindication of the Freakonomics worldview. Levitt published Oster’s paper in the Journal of Political Economy. He and his Freakonomics co-author, Stephen Dubner, took to the pages of Slate to breathlessly retell her “economics detective story.” And then, just as suddenly, it all fell apart. A snot-nosed grad student from Berkeley pointed out that hepatitis B couldn’t possibly explain the missing women problem. It turned out Asian women gave birth to daughters at the same rate as women everywhere else, at least during their first pregnancy. It was only during subsequent births that the ratios changed. Either a bunch of Asian women were running out to get hepatitis B in between their first and second pregnancies, or, as Sen feared, people were taking dramatic steps to avoid ending up with two girls.”