“Show me your model['s non-banal output, useful for policy, prediction, or at least for understanding the real world better than verbal/visual methods; granted, toy models are useful for teaching some classroom concepts]!!!”
It is well known that mainstream DSGE economic modeling is hopelessly flawed (e.g., here, here, here [discussed here], here, here, here, here, here, and here; there are fundamental issues with other models, including finance models, as well).
But let’s assume, for the sake of argument, that mainstream modeling strategies could work, and that we merely haven’t quite “gotten there” yet. Surely, with enough smart people building and modifying models, we will eventually arrive at models “good” enough to usefully predict, or at least usefully simulate (for policy choices), the real economy…right??
The title (a nod to Blade Runner’s complex name backstory) refers to a “Gödel machine,” which is a “hypothetical self-improving computer program that solves problems in an optimal way” and “works towards the betterment of the model just like in any self-driving car [which] permits a computable strategy for making near-optimal predictions.”
The pop allusion to Gödel’s “incompleteness theorem” is only questionably relevant to non-mathematical reasoning. Nevertheless, both the concept of a Gödel machine (a single, rational, optimizing agent, anyone?) and the incompleteness theorem serve to highlight an important meta-issue regarding modeling: What happens if the knowledge needed to judge models exceeds what any person, or even any group of people, can attain? Can the models themselves continue to advance, as if they were Gödel machines?
An obvious reply might be: We use machines all the time that embody vastly more knowledge than any single person can truly understand (any modern computer/smartphone, for starters, with its software and hardware). Or consider medical specialists: one has expertise in one part or system of the body, another in another; together they know enough to care for patients without any single expert understanding anywhere near what the group as a whole does. We can do this for models as well (an expert on one aspect of the model tweaks it, another tweaks another part, and so on).
But complex predictive models encompassing the real world, or even a narrow slice of it,* are different. If you tweak one aspect of a model, it affects, or potentially affects (even that uncertainty is part of the problem: we don’t know), other parts of the model. And, as with a small difference in input data or in original design in other complex systems, this can have large effects on overall results.
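A toy sketch can make the point concrete (this is an illustration of sensitivity in nonlinear systems generally, not an economic model from the post). The logistic map x → 4x(1−x) is a standard minimal example: perturb one input by a part in two hundred million and the trajectories soon disagree completely.

```python
# Toy illustration: in a nonlinear system, a tiny difference in one input
# can come to dominate the output, which is why "tweaking one part" of a
# complex model can shift overall results unpredictably.

def trajectory(x0, steps):
    """Iterate the logistic map x -> 4x(1-x) from x0 for `steps` steps."""
    xs = [x0]
    for _ in range(steps):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs

a = trajectory(0.2, 50)
b = trajectory(0.2 + 1e-9, 50)   # perturb the input by one part in 200 million
gaps = [abs(x - y) for x, y in zip(a, b)]
# Early on the two runs are indistinguishable; within a few dozen steps
# they diverge to order-one differences.
```

The same logic applies to any model with enough nonlinearity and interconnection: a "small" parameter tweak need not have a small effect.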
Furthermore, it may not be a coherent response to just “test and adjust, test and adjust.” An amateur mechanic, faced with tuning up a complicated turbocharged race-car, might be able to guess at the effects of an adjustment, make it, test it, and undo or increase the adjustment. With enough tests, even the amateur could succeed in making the complex race-car run better.
But these types of tweaks and observations are not really possible with models that address any meaningful scope of the real economy. There is always some new aspect of the real world that has changed (for predictive tests). Even if merely performing tests on historic data, you can never know if you are simply model-fitting old data or actually creating something useful.
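The "model-fitting old data" worry can also be sketched with a deliberately contrived example (hypothetical data, not from the post): a model flexible enough to reproduce every historic observation exactly can still be useless, or worse, off-sample. Here the true process is simply y = x, one "measurement" carries a 0.5 error, and an interpolating polynomial fits the past perfectly while failing badly at prediction.

```python
# Sketch: perfect fit on historic data is no evidence of predictive value.

def lagrange_eval(xs, ys, x):
    """Evaluate the unique interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

xs = list(range(8))               # "historic" observations at x = 0..7
ys = [float(x) for x in xs]       # true process: y = x
ys[3] += 0.5                      # one noisy measurement

in_sample = lagrange_eval(xs, ys, 5)     # reproduces old data exactly
out_sample = lagrange_eval(xs, ys, 8.5)  # "prediction" beyond the data
# in_sample matches the historic point; out_sample is wildly off y = x,
# because flexibility that absorbs noise in-sample amplifies it off-sample.
```

With real economies the situation is worse still, since, as above, the world itself keeps changing between the fitting and the test.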
I started worrying about the problem of obtainable knowledge/“overview” when surveying complex papers going into grad school around 2002 (I was exploring whether to focus on natural experiments, instrumental variables, and/or causal understanding using Structural Equation Models, Directed Acyclic Graphs, Bayesian Networks, etc.). In DSGE and other modeling approaches one runs across cases where modelers are choosing (“we apply,” “a suggested approach,” “one common method,” “the preferred technique,” “conventions are” etc.) some form of manipulating data, parameter adjustments, estimation techniques, testing results and so on (not to mention, of course, the choice of data to start with). There follows one or many citations supporting the choice. But: Does the hypothetical future researcher trying to judge the value of the research actually have the expertise embodied in the citations supporting those choices to judge them? Or the time to read, assimilate, and consider the quality and relevance of those supporting citations? If not, what point do they actually serve?
We already know replication has been a problem in many fields. Who checks the work? Who can check the work? We have to wonder, given the esoteric nature of the questions in current modeling, whether it is possible for researchers to genuinely check the choices papers are making, the plethora of methods, approaches, assumptions, and/or techniques they are using. Perhaps this is not possible even in their own area of interest. So imagine the situation regarding key research that uses related but different approaches (VAR, HANK, DSGE) and all of their varieties and issues (“Markov-switching factor-augmented VAR models,” “estimation of nonlinear DSGE models using particle filter approximations of the likelihood function,” “Gaussian pseudo-maximum likelihood estimation,” etc.). Then, how are more fundamental issues treated in model design and interpretation (e.g., how are Bayesian, frequentist, or likelihoodist assumptions supported)? Philosophers and statisticians are far from agreement on the latter question. Do economists know something they don’t? (Consider, e.g.: Why I am Not a Bayesian; Why I am not a Likelihoodist; Why Isn’t Everyone a Bayesian?; Objections to Bayesian Statistics…one gets the distinct impression these are not settled questions.)
It could be that only scores of economists are even capable of understanding a given choice or technique used in or added onto some papers/models. If this sounds far-fetched, consider that already in 2010 James Galbraith could write:
“I was just at a meeting of European central bankers and international monetary economists in Helsinki, Finland. After one paper, I asked a very distinguished economist from Sweden how many people he thought had followed the math. He said, “Zero.” (Galbraith, 2010, p. 1)
Perhaps even a “Gödel machine” might not be feasible simply because it is not understandable, even by experts:
“I’ve been working with simpler and simpler models, as I find it hard to keep the intuition and quantitative parable aspect alive as models get more complex.” (Cochrane, 2014).
However, what if the “machine” just works? Isn’t that proof enough of its correctness and utility?**
A Nomological Machine†
The obvious counter to worries about complexity exceeding understanding is that we will know a modeling approach is right when it starts consistently spitting out accurate predictions (or at least usefully simulating the real economy). If this were achieved, it could be argued it wouldn’t matter if the whole model is effectively a black box to any given researcher. If it works, it works.‡
However, the very hope that this might be achieved rests on a fundamental misunderstanding of how laws of nature and data combine in the universe. The hope that economics can build a nomological machine by constantly adding to its models is even more of a chimera than the hope that DSGE models can somehow overcome their fatal fallacy-of-composition flaw (Sonnenschein–Mantel–Debreu; see Rizvi 2006). This will be taken up in a future post…
[Note: I have not addressed other modeling approaches that may be able to achieve things RBC/DSGE models cannot; these include Agent Based Models (demonstrating emergent properties), nonlinear system dynamics modeling, and monetary stock flow consistent (SFC) modeling. More on these, especially the latter, in a future post.
Oh, and why “chartblogging,” done correctly, reflects the deepest statistical reasoning humanly attainable.]. Related: Contingency is just so and Initial Conditions as Exogenous Factors in Spatial Explanation
* Other deep questions underlie assumptions. For example, how is the problem of “different ways of conditioning on the state of the world” dealt with? (Bierens and Swanson 2000):
“…our dilemma…We would like to examine a large portion of the world, and given the correct model specification, we should learn more by examining this large portion of the world rather than a smaller one. However, our models are always approximations, hence the more complex the model, the larger the likelihood of model misspecification. Moreover, we are left with the problem of determining whether our “portion” of the world is general enough to adequately mimic the characteristics of the economy in which we are interested.” (Bierens and Swanson 2000, p. 8 of the PDF)
** If you can duplicate the essential structure of an economy, and feed it enough data, it should in theory give usefully realistic output. Believing this to be achievable is reasonable given 1) successes in modeling complex phenomena in fields like engineering, and even biology and epidemiology (although recent events have stirred some relevant controversy in epidemiology), 2) massive increases in computing power, and 3) massive increases in data.
† The concept of a “nomological machine” is attributed to Nancy Cartwright, a specialist in the philosophy of causation who also writes on economics. Cyril Hédoin summarizes some of her thoughts:
“economic models as socio-economic machines are entirely idealized constructs that fail to isolate any causal factor. What is thought as the isolated causal factor is actually a mechanism that is structurally tied to all other elements in the model. Hence, economic models cannot help us to learn about capacities. The requirements of internal validity are met at the cost of the external validity of the model-as-though-experiment.” (Hédoin 2014 [2011, p. 8])
(To be clear, Hédoin himself argues that economic models are useful in a pragmatic way.)
Overall I believe Cartwright’s work (and that of Roy Bhaskar, on which her work heavily draws) is entirely wrong, but in a very interesting, informative way.
‡ In the Hitchhiker’s Guide to the Galaxy the building-sized supercomputer “Deep Thought” is built to give the answer to the “Ultimate Question of Life, the Universe, and Everything.” After seven and a half million years of calculation, it spits out the (now famous) answer: 42.
The people are unable to interpret the answer, so they ask it to design a still more complex computer to determine what the question is. It is notable that the computer model built to answer that question, from the point of view of Earth, has zero exogenous factors and incorporates every factor and bit of data possible: because it is Earth itself.
UPDATE (May 29, 2020): The post’s title here of course refers to Philip K. Dick’s wonderful title “Do Androids Dream of Electric Sheep,” which in film turned into “Blade Runner” in the interesting story I link to above.
It turns out I am not original in linking Dick’s excellent name to economics. Chapter 5 in Philip Mirowski’s Machine Dreams: Economics Becomes a Cyborg Science (2002, Cambridge University Press) is titled “Do Cyborgs Dream of Efficient Markets?” I have only skimmed the book but it looks very interesting, something I should have already read.
References (in addition to links)
Bierens, Herman J. and Norman R. Swanson. 2000. “The Econometric Consequences of the Ceteris Paribus Condition in Economic Theory.” Journal of Econometrics, Vol 95, Issue 2, pp. 223-253.
Cochrane, John. 2014. “Summer Institute,” Sunday, July 13, 2014, The Grumpy Economist blog.
Galbraith, James K., 2010. Foreword to Seven Deadly Innocent Frauds of Economic Policy by Warren Mosler. St. Croix: Valance.
Hédoin, Cyril. 2014. “Models in Economics Are Not (Always) Nomological Machines: A Pragmatic Approach to Economists’ Modeling Practices.” Philosophy of the Social Sciences 44: 4, pp. 424-459. (2011 pre-print pdf online)
Rizvi, S. Abu Turab. 2006. “The Sonnenschein-Mantel-Debreu Results after Thirty Years.” History of Political Economy 38 (Suppl. 1): 228-245.