The Rogoff-Reinhart affair and WTF, AER?

Attention conservation notice: 2,000 words on why “impact” is over-rated. Of course, bean-counting deans and administrators think otherwise. While the media cavorts over “an Excel error”, I want to talk about unconventional weighting and cherry picking the data, cherry picking papers to treat as definitive, and why working at a policy school causes me despair. To quote Hemingway: “There are some things which cannot be learned quickly and time, which is all we have, must be paid heavily for their acquiring.”

Crooked Timber, of course, says all this much more concisely.

Paul Krugman’s first of many posts on the topic gives a nice explanation of the Rogoff-Reinhart dealio–there is much that drives me crazy about Paul Krugman, but you can’t complain that he doesn’t know how to explain economics to a broad audience–because he most certainly does. The deal goes something like this: There are two recent macro papers that have purported to provide the empirical basis for tax cuts to produce growth. One is by Alberto Alesina and Silvia Ardagna:

Large Changes in Fiscal Policy: Taxes versus Spending, Alberto Alesina, Silvia Ardagna, in Tax Policy and the Economy, Volume 24 (2010), The University of Chicago Press

This paper is in a much less influential journal than the one that’s causing all the kerfuffle, which is this one:

Rogoff, Kenneth, and Carmen Reinhart. (2010) “Growth in a Time of Debt.” American Economic Review 100.2: 573–78

American Economic Review is the gold standard of econ journals. Mike Konczal at the Next New Deal blog summarizes the paper:

In 2010, economists Carmen Reinhart and Kenneth Rogoff released a paper, “Growth in a Time of Debt.” Their “main result is that…median growth rates for countries with public debt over 90 percent of GDP are roughly one percent lower than otherwise; average (mean) growth rates are several percent lower.” Countries with debt-to-GDP ratios above 90 percent have a slightly negative average growth rate, in fact.

This has been one of the most cited stats in the public debate during the Great Recession. Paul Ryan’s Path to Prosperity budget states their study “found conclusive empirical evidence that [debt] exceeding 90 percent of the economy has a significant negative effect on economic growth.” The Washington Post editorial board takes it as an economic consensus view, stating that “debt-to-GDP could keep rising — and stick dangerously near the 90 percent mark that economists regard as a threat to sustainable economic growth.”

Oh, yeah. That’s what economists say. All of the smart ones, right. Reinhart and Rogoff’s paper *from 2010* is so scientifical in the minds of WashPo editors that it’s now economic consensus. Mmmmmkay.

But that’s not how social science works. It takes time, and a lot of subsequent study, to find a result we should treat as definitive. But that isn’t what politicians or the public want to hear. And…it’s so very, very tempting to give the people what they want. It’s one way you get to the Kennedy School.

Well, what’s wrong with that? A great deal, it turns out, in terms of the original paper’s content, methods, and conclusions. The story becomes ugly pretty fast–though not surprising to those of us who watch influence peddling/pandering happen all day every day in the policy analysis machine of academic life, in which Harvard is to the academy what Google is to search engines–there is only one in the minds of most people; most people are too lazy to use more than one search engine, and why would you when that one gives you what you want with so little effort?

After quite some nagging, apparently, Thomas Herndon (a PhD student in Econ), Michael Ash, and Robert Pollin, all researchers at the University of Massachusetts Amherst, finally got Rogoff and Reinhart to share their data after trying unsuccessfully to replicate the results with data they had compiled themselves. When the UM researchers tried to replicate the findings with Rogoff and Reinhart’s numbers, they discovered a systematic coding error that, when corrected, shows the original conclusion–that tax cuts were expansionary during times of debt–was simply not supported by the extant data or the subsequent analysis.

Which makes me wonder about the AER as the gold standard. First, I thought you always had to share your data to get into AER, and I thought reviewers were supplied WITH YOUR DATA at the time they review. I’ve had to do that for some of the journals I’ve published in. That’s what a gold standard looks like to me. Am I missing a part of the story here?

The media, of course, is eating this up, largely because there is a delicious David versus Goliath aspect to the review and the chance that Hahhhhhvard folks might be wrong and a wee graduate student right. I strongly suspect that if this were an assistant professor at Princeton the finding would have been largely ignored in media because it would seem like academic in-fighting instead of the sexy, aw-shucks, disempowered-grad-student-makes-good story it is. I’m waiting for the next iteration of the story–or the Hollywood version–about how some meanypants proffie tried to steal credit for this brilliant result, but young economics stud pulled out an AK-47 during a research meeting while his faithful, brilliant-but-not-as-brilliant-as-he-is girl leans on his masculine shoulder.

I’m sounding a little bitter, which I am actually not, about the review and the success it bestowed upon a graduate student. The attention is a good thing, and it’s wonderful to see a young person do a replication study and get so much impact out of it–usually, replication studies are treated with less respect than they deserve. Again, this is a problem with the academy. Why do careful replication studies if the point is to be out there chasing your own Freakonomics/WOWEEZOWEE LOOKYHERE moment? But I am annoyed at the way the whole thing has been discussed in the media, as though this review strikes down the whole hypothesis that tax cuts might foster growth when government indebtedness is at stake.

It doesn’t. There is another paper out there, for one thing, and for another: did I not just say that social science doesn’t work like that? Yes, there are seminal papers, but it takes a long time to get to the point where we can truly call something ‘seminal.’ As usual, Richard Green and Mark Thoma have the real deal analytical problems of sussing out this question. Richard Green is here in Forbes, discussing the particulars of this particular, thorny, empirical question. Mark Thoma has well-reasoned insights about the larger problems in macro over at Economist’s View.

Krugman is careful to point out that you can’t conflate the problems with the AER paper with the subsequent, high-profile book: This Time is Different: Eight Centuries of Financial Folly.

But I kind of can–and here’s why. For all the media froth about the coding error, which is pretty bad when we are talking AER level, there are two other issues raised in the Herndon-Ash-Pollin study that are straight up signs of analysis-fiddling to get the results you want. What are they? Michael Konczal explains the issues way better than I can:

Selective Exclusions. Reinhart-Rogoff use 1946-2009 as their period, with the main difference among countries being their starting year. In their data set, there are 110 years of data available for countries that have a debt/GDP over 90 percent, but they only use 96 of those years. The paper didn’t disclose which years they excluded or why. [Emphasis mine: WTH AER????]

Herndon-Ash-Pollin find that they exclude Australia (1946-1950), New Zealand (1946-1949), and Canada (1946-1950). This has consequences, as these countries have high-debt and solid growth. Canada had debt-to-GDP over 90 percent during this period and 3 percent growth. New Zealand had a debt/GDP over 90 percent from 1946-1951. If you use the average growth rate across all those years it is 2.58 percent. If you only use the last year, as Reinhart-Rogoff does, it has a growth rate of -7.6 percent. That’s a big difference, especially considering how they weigh the countries.

Unconventional Weighting. Reinhart-Rogoff divides country years into debt-to-GDP buckets. They then take the average real growth for each country within the buckets. So the growth rate of the 19 years that England is above 90 percent debt-to-GDP are averaged into one number. These country numbers are then averaged, equally by country, to calculate the average real GDP growth weight.

In case that didn’t make sense let’s look at an example. England has 19 years (1946-1964) above 90 percent debt-to-GDP with an average 2.4 percent growth rate. New Zealand has one year in their sample above 90 percent debt-to-GDP with a growth rate of -7.6. These two numbers, 2.4 and -7.6 percent, are given equal weight in the final calculation, as they average the countries equally. Even though there are 19 times as many data points for England.

Now maybe you don’t want to give equal weighting to years (technical aside: Herndon-Ash-Pollin bring up serial correlation as a possibility). Perhaps you want to take episodes. But this weighting significantly reduces the average; if you weight by the number of years you find a higher growth rate above 90 percent. Reinhart-Rogoff don’t discuss this methodology, either the fact that they are weighing this way or the justification for it, in their paper [Again, emphasis mine, and again WTH AER????]
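To make the weighting arithmetic in Konczal's example concrete, here is a minimal sketch in Python. It uses only the two figures quoted above (England, 19 years at an average of 2.4 percent; New Zealand, one year at -7.6 percent). The function names are my own, and treating every English year as exactly 2.4 percent is a simplifying assumption, not the actual data series.

```python
# Illustrative only: figures come from the quoted example; function names are hypothetical.
# Assumption: each of England's 19 years is set to its 2.4 percent average.
observations = {
    "England": [2.4] * 19,   # 19 high-debt years
    "New Zealand": [-7.6],   # one high-debt year in the sample
}

def country_weighted_mean(data):
    """Average each country first, then average the country means equally."""
    country_means = [sum(years) / len(years) for years in data.values()]
    return sum(country_means) / len(country_means)

def year_weighted_mean(data):
    """Pool every country-year so that each year counts once."""
    all_years = [growth for years in data.values() for growth in years]
    return sum(all_years) / len(all_years)

print(round(country_weighted_mean(observations), 2))  # -2.6
print(round(year_weighted_mean(observations), 2))     # 1.9
```

With just these two countries, the equal-by-country average comes out negative while the by-year average stays positive, which is the gap Konczal is pointing at. Neither choice is automatically right; the point is that the choice matters and should be disclosed.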

Keep in mind that every_single_day at USC I have AER shoved in my face as the holiest of all that is holy when it comes to scholarly rigor, and HOLY SCREAMING MEEMIES, bunnypants, there are three big, honking things here that should have come out in peer review. First, how do you get away with not disclosing which countries you are leaving out? And second: how do you get away with not explaining your weighting? And why didn’t anybody demand to see the consequences of these major analytical choices in robustness checks? These are not Excel errors. These are not esoteric things that only economists can understand. These are basics of modeling research.

Peter Gordon and Greg Mankiw offer us some help in fixing the broken world

Greg Mankiw has somehow managed to struggle past the ordeal of some students not listening to him (which never happens to the rest of us) to pen some advice on how to fix the tax code. With some quibbles, it’s a good list, particularly the idea that you:

BROADEN THE BASE AND LOWER RATES The United States tax code is filled with deductions and exclusions that shrink the basis of taxation. The smaller base in turn requires higher tax rates to raise the revenue needed to fund government. The starting point of reform is to reverse this process.

This can work on multiple levels. So dealing with what Peter Gordon recently described to me as the frou-frou of the tax code would probably make it much fairer, or at least more tractable. But it also works for sales taxes. No, I am not saying tax food; I am saying that way too many states, like California, exempt services they shouldn’t. My favorite recent example came from Kevin Holliday: $300 Botox injections are exempt from California sales tax, but baby wipes aren’t. Given how much of higher income consumption centers on services, that exemption is counterproductive for both fairness and revenue.

Peter Gordon’s contribution has been to set forth his platform when he runs for president. I would vote for him, largely because I want his office (1), but also because his list of plainspoken reforms is pretty good. I disagree with him on a couple of points, but particularly this one:

Envy is one of the seven deadly sins and leaders should never incite it.

Feh. I know we are supposed to buy into the idea that current concerns about inequality reflect mere envy, but we shouldn’t. History tells us that there are real problems with inequality (as well as envy). But the whining of the wealthy in this country that they just can’t chip in more taxes lest they not be able to have more $15K handbags and island getaways in Fiji (2) smacks of another one of the seven deadlies: greed.(3)

In the end, all this is horse poop. The lines between ambition, greed, and envy strike me as so fuzzy as to be irrelevant most of the time. Envy is desire (4); greed is desire, ambition is desire. So pretending these things are bad for us has a function–as moderation of our desires is good–but we all know we need these drives to get out of bed in the morning. And if this isn’t true, why aren’t all those supervirtuous gazillionaires competing to live in the smallest lots in the mobile home park while they focus on the things that truly matter, like the true love of family and contentment with what we have? (Easy, stomach).

(1) This is a case of envy.

(2) I know, I know those islands in Fiji are all job creation, for reals! Unlike those dented cans of soup purchased with Food Stamps, which create No Jobs Ever.

(3) Look, I was raised a fairly observant Catholic, and I still practice, so there’s no way you are going to out-debate me on topics contained in the category Things to Feel Guilt Over. Did you know envy wasn’t part of the original grouping of mortal sins? I believe Pope Gregory stuck it in there. I myself have all of Thomas Aquinas’ riffs on gluttony, thank you. Don’t even get me started on my trail-blazing in acedia.

(4) Sure, as Dante notes, envy isn’t just want: it’s the desire that others shouldn’t have things you don’t have. Ok. But where does ambition come from if you don’t see other people with what you want? And I’m a bit tired of this envy idea being used in the same sense of wondering why the superrich shouldn’t be expected to pay taxes to start dealing with a deficit incurred for spending on lots of things that, if the numbers are any indicator, benefitted them greatly.

[Image: Dante’s Inferno artwork]

Tax cut versus spending cuts–editorial by Christina Romer

Joseph Cordes brought this op-ed to my attention, by Christina Romer in the New York Times. In it, she discusses how one of her papers, written with fellow economist and spouse David Romer, has been misinterpreted by tax hawks:

Some in Washington and in the news media have seized on a study I conducted with David Romer, my husband and colleague, that they say shows tax increases having a bigger short-term effect on the economy than spending cuts.

They are mistaken.

That’s a very polite, scholarly way for saying, how’s about you people go read the paper before you talk about it?

I’ve been thinking about the comparative advantages of broadsheet versus digital reading of newspapers. One advantage is for the web: Romer can just provide the whole world with a link to her excellent original manuscript, which was published in AER [pdf]. Professor Cordes and I were discussing this manuscript, and Cordes pointed out that the original paper is a nice example for PhD students to study because it blends archival research to nail the exogenous changes and then uses that to specify and estimate an econometric model. The other nice thing about the manuscript, other than the brilliant writing, is the way the models are illustrated. It’s not an overly mathematical piece to begin with, and a layperson giving a close reading to the results and the graphs can understand what is going on here.

Romer lays it out nicely in her NYT article–it’s about getting people to move money in the short term:

There is a basic reason why government spending changes probably have a larger short-term impact than tax changes. When a household’s tax bill rises by, say, $100, that household typically pays for part of that increase by reducing its savings. Its spending tends to fall by less than $100. But when the government cuts spending by $100, overall demand goes down by that full amount.

Wealthier households typically pay for more of a tax increase out of savings, and so they reduce their spending less than ordinary households. This implies that tax increases on wealthy households probably have less effect on the economy than those on the poor or the middle class.
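Romer's logic reduces to a bit of arithmetic, sketched below in Python. The marginal propensity to consume (MPC) values are hypothetical numbers I picked for illustration, not estimates from the Romer and Romer paper; the only thing taken from the op-ed is the ordering of the effects.

```python
# Hypothetical MPC values for illustration; not estimates from the paper.
def first_round_demand_drop(amount, mpc=1.0):
    """Immediate fall in spending after a fiscal change of `amount` dollars.

    mpc is the share of the change that comes out of spending rather than savings.
    """
    return amount * mpc

spending_cut = first_round_demand_drop(100)           # government cut: the full amount
tax_typical = first_round_demand_drop(100, mpc=0.7)   # assumed typical household
tax_wealthy = first_round_demand_drop(100, mpc=0.3)   # assumed wealthier household

print(spending_cut, tax_typical, tax_wealthy)
```

Whatever the true MPCs are, as long as households pay part of a tax increase out of savings, and wealthier households pay more of it that way, the ordering holds: spending cut, then tax on typical households, then tax on the wealthy.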

She also gives cuts their due in the article.

Romer also links to a nice review that appeared in the Journal of Economic Literature that synthesizes the empirical work on how people respond to tax rates, one of the basic aspects of the Laffer curve idea, the holy grail of Reagan-era supply siders.

Romer goes through the Laffer stuff a mite fast, so it’s worth discussing in more depth. The theory is that the relationship between marginal tax rates and government revenue follows a parabolic function:

[Image: Laffer curve graphic, via Wikipedia]

This isn’t the most sophisticated graphic you’re going to find (thanks, Wikipedia), but it works. The idea is that as governments increase the marginal tax rate, the amount of revenue they raise goes up until it hits a critical point. Then the revenues go down because people progressively scale back on doing the productive things they do to generate income because that income is worth less and less to them, and thus, the whole pot of income to tax goes down, and thus, government revenue goes down.
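For anyone who prefers numbers to pictures, here is a toy version of the curve in Python. It assumes the taxable base shrinks linearly as the rate rises, which is purely my simplification to get a parabola; nobody claims the real economy looks like this, and the peak at 50 percent is an artifact of that assumption.

```python
# Toy Laffer curve. The linear-shrinkage form and the base of 100 are hypothetical.
def revenue(rate, base_at_zero=100.0):
    """Revenue = rate * taxable base, where base = base_at_zero * (1 - rate)."""
    return rate * base_at_zero * (1.0 - rate)

rates = [i / 100 for i in range(101)]   # 0.00, 0.01, ..., 1.00
peak_rate = max(rates, key=revenue)     # rate at which this toy curve tops out

print(peak_rate)                        # 0.5
print(revenue(0.3) < revenue(0.4))      # True: below the peak, higher rates raise revenue
print(revenue(0.7) < revenue(0.6))      # True: above the peak, higher rates lose revenue
```

The empirical fight is over where the real peak sits and which side of it we are on, and the toy model cannot tell you that.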

Not the worst theory ever seen, certainly. But the key empirical questions are: where is the downward kink, and where are we relative to the kink? Or, more realistically, the downward kink(s) because different class groups and wealth groups are probably going to act somewhat differently from each other and from the aggregate function shown in the picture. The JEL piece by Saez, Slemrod, and Giertz talks about the behavioral aspects of the income changes, and as Romer suggests, finds that behavioral changes are pretty slight.

It turns out, though, despite all the Reagan-era flouncing around about being taxed to death, the Laffer idea wasn’t empirically true: tax cuts just plain cut revenue. It seems clear that either we were on the front side of the kink, or the theory doesn’t stand up well. (After all, even though a subset of Americans seems to believe that taxes are money that is just taken from them, never to be seen again, taxes are returned in the form of public goods and programs, and many of those are productive and valuable to individuals, like infrastructure investment, such that an individual’s utility curve may include them.)

Alternatively, you can go with Glenn Beck’s version of the Laffer curve. In fairness, it’s not the easiest set of concepts to explain. But also in fairness, he blazes new trails in screwing it up. My various macro professors, particularly Charles Whiteman, will probably need to dive into the drinks cabinet to get through this one.

I’ll post more about some other interesting aspects of Romer’s article tomorrow.

Republicans, Democrats, and why, despite everything, I respect Larry Summers

The Financial Times Opinion page this morning is utterly brilliant.

If you want to know how I feel about both the Republicans and Democrats (and why wouldn’t you, as I am sooooooooooo important), Clive Crook nails it: America prefers fiscal idiocy to framing intelligent choices. Or, as my colleague Richard Green says, I don’t see anybody out there acting like a grown-up.


Larry Summers also has an absolutely brilliant submission in their “A-List” series called “How we can avoid stumbling into our lost decade.” If there is a more lucid, patient, and clear discussion of the current macroeconomic condition of the US, I have yet to see it. His advice, BTW, is to extend the stimulus.

An essay on books about monetary policy

So yesterday I was grumping that people don’t understand monetary policy, even though they sling around the term, and in fairness, it’s a complex field. Or, a dark art, depending on how you view macro in general. Macro simply isn’t amenable to the type of empirical study that micro is. You can study millions of trades in many fields of micro: housing markets, stock trades, transport mode choices, etc etc–you can model and validate over time and space, and while identification may be difficult to achieve, you usually have plenty of instances to observe.

With macro, you only have so many central banks, so many nation states, etc. etc. You generally don’t have random experiments in contexts that really capture the phenomena of interest; and you have precious few natural experiments. How many previous housing bubbles do you have to study? How many of the previous are even remotely like the one we just experienced? And conditions change, fast. You can look like the smartest person in the world one day, then in a flash, the dumbest.

This contrast is a bit of an overstatement, and there are things that bridge micro and macro in ways that you can test rigorously–like, say, labor (Diamond’s field), consumption, prices, etc. Still, those insights into macro phenomena are viewed, even among specialists, through a glass darkly. That’s why you want very well-trained and disciplined people working at central banks rather than your unemployed son-in-law simply because he, like you, believes in small government and Jesus.

I don’t know that there’s a “Varian*” of monetary policy; if there is one commonly beloved introductory text, I don’t know of it. I liked Carl Walsh’s book when I used it, but it would be nice if people could suggest a more recent text. There is a 2003 3rd edition, but still.

I’ve not gotten all the way through this more recent book:

Mishkin, F. S. (2007). Monetary policy strategy (illustrated ed.). Cambridge, Mass.: MIT Press.

But I got far enough to note that it’s pretty approachable as far as monetary policy books go. He uses a lot of case studies, which will probably be more helpful and more interesting to nonspecialist readers.

I really wish Milton Friedman were alive now to respond to the conditions of the bubble and recovery strategies. It would have given him a chance to respond to critics and to either clarify or refine his theories vis-a-vis a new context, and to argue with people like Paul Krugman. I feel like I would get smarter reading that argument.

As annoyed as I am at Republicans right at the moment for being idiots about Peter Diamond, I also get annoyed at liberals who judge Milton Friedman’s contributions to economics by responding to his political philosophy. These two things are hard to separate–indeed, Friedman himself put them together. Nonetheless, his contributions to monetary policy were important to the field. My favorite is

Friedman, M., & Schwartz, A. J. (1971). A monetary history of the United States, 1867-1960 (illustrated, reprint ed.). Princeton, N.J: Princeton University Press.

Read this in contrast:

Krugman, P. R. (2000). The return of depression economics (reprint ed.). New York: W. W. Norton & Company.

And read these two, with a caveat:

Friedman, M. (1969). The optimum quantity of money (reprint, illustrated ed.). New Brunswick, N.J.: Transaction Publishers.


Keynes, J. M. (2011). The general theory of employment, interest and money. Martino Fine Books.

Here’s the caveat: I don’t really like the 2011 edition of this book. Keynes originally published this material in 1936; I have a first edition of the book. It’s turgid, as a lot of writing was then, but glancing through this 2011 edition, I didn’t like what has been cut. Find an older edition and be patient. It’s not like slogging through Friedman on the gold standard is any picnic.

I’m currently reading this Pulitzer Prize winner, published in 2009, and it’s a very good book on monetary policy, from a finance guy:

Ahamed, L. (2009). Lords of finance : The bankers who broke the world (illustrated ed.). New York: Penguin.

I am enjoying it very much so far, but I’m only up to German reparations for the First World War, so I have a ways to go before we get to the crash.

One last comment: be wary of reviews from people who say “Keynes has been proved wrong/right; Friedman has been proved wrong/right” yada. These are multi-dimensioned theories, and it’s very likely that they are both wrong and right about different aspects of the universe they are attempting to explain.

The point for an educated person is to try to think about what is “baby” and what is “bathwater” about the theories they are reading. To abuse yet another metaphor: blind men, elephants, yada. Children expect the world to give them “right” and “wrong” answers. Educated adults read and think–and keep doing so recursively, and humbly, their whole lives.

*Hal Varian wrote the seminal intro to micro textbook, and so it’s easy to recommend it to people wanting to go beyond their intro to micro class.

Visualizing the Bubble

Decision Science News has a wonderful graphic on visualizing the housing bubble, using the Standard & Poor’s numbers. This is a really data-rich graphic, one of those that is simple yet complex, and you can spend a lot of time thinking about what the different trajectories mean in terms of the extent and timing of the bubble. This is their first graphic–go to the website and read the rest of the presentation. A second, cleaner graphic calls out the ‘exceptional’ stories–a good way to build a narrative with graphics: first show the whole picture, then select what people should take away.

I’d like to know more about the bounce at the end of those numbers; every city gets higher, then crashes, then (for most, not all), there’s a bounce, then another fall.

[Image: housing bubble graphic from Decision Science News]

The graphic was made with ggplot2, one of my favorite new toys for R.

On the ‘jobless recovery’: Chris Hayes in The Nation

Professor Tom Jankowski pointed me to this piece by Chris Hayes in The Nation. He breaks down the policy indifference to unemployment in Washington DC in two numbers: the unemployment rate among those with college educations (low) and the unemployment rate in booming Washington DC (also low–it’s a place with a lot of employment).

This manifests itself in our politics in two ways. For one, it just so happens that policy-makers, pundits and politicians are drawn from the classes that are in recovery, and they live in an area where new sushi restaurants are opening all the time. For even the best-intentioned and most conscientious staffers and aides this has, I think, a subconscious effect. Think of it this way: two office buildings are operating side by side in Chicago’s Loop in the middle of a brutally cold January day, when the heat in both buildings gives out. The manager of one building has an on-site office, so he finds himself plunged into cold; the other building is managed remotely, from a warm office whose heat is functioning. If you had to bet, you’d guess that the manager experiencing the cold himself would have a bit more urgency in restoring the heat. The same holds for the economy. The people running the country are not viscerally experiencing the depredations of this ghastly economic winter, and they lack what might be called the “fierce urgency of now” in getting the heat turned back on.

My views on Wisconsin, expressed brilliantly by Cosma Shalizi

I haven’t had much to say about the debacle that is Wisconsin, but I wanted to point you to the brilliant comments of Cosma Shalizi. I don’t know Shalizi personally, but I follow his blog, Three Toed Sloth, faithfully because the writing is simply excellent. I’m endlessly fascinated with what he studies.

Here are his comments on Wisconsin. I wish I had written this paragraph:

the single biggest thing which has gone wrong with America during my lifetime has been the economic stagnation for most of the country, accompanied by shifting risk from those who have resources and large organizations to individuals who don’t have much. And that has gone hand in hand with the decline — the repression — of organized labor. Unions are not perfect, but no human institutions are, and to condemn unions, specifically, because they are sometimes hide-bound or self-serving is either folly or deceit. Unions are the only organized force in this country which seriously advocates, which pushes, for the material interests and dignity of ordinary working people. The fight in Wisconsin is about whether there is, finally, a limit to how far the dismantling of American labor can be pushed.

Folly or deceit, indeed.

I’d argue for both folly AND deceit.

The money thing? That’s just smoke and mirrors. The public sector has competed with the private sector for labor largely through offering benefits. I make a much better salary than comparable UC faculty…but they have better benefits than I do. I have to save for my own retirement much more so, etc.

In the end, this shameless power grab will not save the state of Wisconsin much of anything, particularly for professional labor. Civil engineers in the Wisconsin DOT do not have to put up with crap wages and crap benefits. There are consulting jobs out there–with much higher wages and more limited benefits packages. And civil engineers are more valuable to consulting firms once they’ve done their time in state agencies.

I suspect that, even though unions do prevent free entry into the lower end of the labor market, the primary beneficiaries of collective bargaining are blue-collar state workers, not white-collar bureaucrats as the Republicans claim. White-collar workers have more job mobility and more options.

So you gut benefits, and whatever you think you’ll save by busting the union, you’ll have to spend making up for the lost benefits with salary. I suspect there’s revenue parity in that trade for all but the least skilled workers. So this little power play is coming at the expense of snowplow drivers.

Of course, you could try for lousy pay and lousy benefits and see what quality of laborer that gets you. Nothing like a fabulous imbalance in professional capacity between agency professionals and their contractors to make for lots of lost revenue on inept management.

Edited to add: and just like magic, Richard Green points us to an entry from the Economist that makes my point–that, once employed, low-skill workers benefit from unions more than other groups. There is, of course, the remaining issue of those who are worse off because they’d like more hours and can’t get them due to wage constraints introduced by collective bargaining. However, we do have to wonder which group is larger.

Of course, since the politico in question has never made any bones about how little he cares for the security of poor workers, such information is irrelevant in the face of his desire to be on national TV for a presidential run later.