CHAZEL HAKIM – APRIL 23RD, 2021
Nicholas Otis (“Nick” throughout this interview) is a current graduate student in health economics at UC Berkeley. His research interests revolve around development and behavioral economics. I recently had the opportunity to meet with him and discuss his current work on forecasting and other topics of interest.
Chazel Hakim: First off, could you briefly discuss your educational and professional background before coming to UC Berkeley?
Nick Otis: I did my undergraduate studies at McGill University in Canada and also a master’s degree there. After that, I went and worked as a research assistant for Johannes Haushofer at Princeton University. I was based in Princeton and then also spent time living in Kenya working on his project.
Chazel Hakim: What got you interested in going to Kenya during your research assistantship?
Nick Otis: It was kind of a coincidence. Johannes had a bunch of projects in Kenya, and I had been really interested in his work. He had a project, for example, that was about looking at the effects of giving people unconditional cash transfers, and he provided evidence that you can observe the effects of the cash transfers on people’s salivary cortisol, which is a biological measure of stress. I thought, “That’s pretty cool. I’d love to work with this guy.” I applied to be a research assistant, and I got the job. Part of my work was to just sit in a room somewhere and work on statistical analysis, and the other part was to go and get things done in Kenya. So it’s kind of a happy coincidence.
Chazel Hakim: I see. So then what drew you to the economics field specifically? And why did you choose UC Berkeley to continue your studies?
Nick Otis: When I started undergrad, I thought I’d be a philosophy major. I remember I took a few classes, but I was having trouble seeing how these courses would result in a tractable path to applied issues in moral philosophy, which is what I was interested in. I remember reading some early behavioral economics work by Daniel Kahneman and others (Kahneman et al., 1997) that connected the utilitarian foundations of “old-timey” economics to some of the more modern uses of utility, as in utility functions. That introduced me to the world of economics. I took a few economics courses, and I was excited by how much emphasis there was on causation, as well as how rigorous and applied it was, which was kind of missing in some of the other courses I was taking. And so I was basically sold on it.
In terms of choosing to go to UC Berkeley, Berkeley has an amazing group of development economists and an amazing group of behavioral economists. I’d also never spent much time in California, so that was a draw too. So it was mostly the people that I wanted to work with here, and I knew that Berkeley’s economics and economics-adjacent programs were really good in the areas that I was interested in.
Chazel Hakim: Let’s move on now to your current research. Most of it focuses on the topic of forecasting, specifically the forecasting of experimental results. Could you first discuss how you got interested in this topic?
Nick Otis: Yeah, of course. When I was living in Kenya, as part of my research assistantship, I had this side project (i.e., this article, which recently came out in PNAS). We were giving people tiny cash transfers, just a few dollars’ worth, and would vary the frame that accompanied the cash transfer. We would tell some people, “You get money because you’re a lower-income individual and you’re eligible for welfare.” Or we would say, “We’re giving you a few dollars because we believe in you. We believe that you can empower yourself and choose your own directions in life.” Or we would say, “We’re giving you money because we think that you can use it to help your community.”
So we’re varying the frames associated with these cash transfers, but we already had an idea of what we thought might work. And I remember we were doing some formative qualitative research where we would talk to some of the individuals that were similar to the people who would later receive the cash transfers. These individuals seemed to have pretty strong priors about which of these framings would be effective and which wouldn’t be. That got me to think that it would be cool if there was information in people’s predictions or beliefs about what interventions would work or not. Something like this could be valuable to development economists and other people doing applied work who often have to decide which policy to implement with limited information about what works or what doesn’t.
If people could guess what will work because they have contextual knowledge, then that would be really helpful. And in that initial work, we found that people’s predictions for these different framings for cash transfers provided more accurate estimates of the causal effects than moderately sized pilots with a bit over a hundred people. So, in short, asking 25 people provided a better prediction of the causal effect of the treatment than a small pilot, which people would often run as an alternative way to gather preliminary evidence. That’s how the work started, and then I kind of expanded from there.
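The comparison Nick describes, about 25 forecasts against a pilot of roughly a hundred people, can be illustrated with a small simulation. This is only a sketch under invented assumptions, not the analysis from the paper: the true effect, the spread and bias of individual forecasts, and the outcome noise are all hypothetical numbers, and the function name `simulate` is made up for illustration.

```python
import math
import random
import statistics


def simulate(true_effect=0.10, n_forecasters=25, forecast_sd=0.20,
             forecast_bias=0.0, n_pilot=100, outcome_sd=1.0,
             n_sims=2000, seed=0):
    """Return (rmse_of_mean_forecast, rmse_of_pilot_estimate).

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    forecast_sq_errors = []
    pilot_sq_errors = []
    for _ in range(n_sims):
        # Each forecaster guesses the effect with individual noise
        # (and, optionally, a bias shared by everyone).
        forecasts = [rng.gauss(true_effect + forecast_bias, forecast_sd)
                     for _ in range(n_forecasters)]
        mean_forecast = statistics.fmean(forecasts)

        # A small pilot: half control, half treated, difference in means.
        half = n_pilot // 2
        control = [rng.gauss(0.0, outcome_sd) for _ in range(half)]
        treated = [rng.gauss(true_effect, outcome_sd)
                   for _ in range(n_pilot - half)]
        pilot_estimate = statistics.fmean(treated) - statistics.fmean(control)

        forecast_sq_errors.append((mean_forecast - true_effect) ** 2)
        pilot_sq_errors.append((pilot_estimate - true_effect) ** 2)

    rmse_forecast = math.sqrt(statistics.fmean(forecast_sq_errors))
    rmse_pilot = math.sqrt(statistics.fmean(pilot_sq_errors))
    return rmse_forecast, rmse_pilot


if __name__ == "__main__":
    print("unbiased forecasters:", simulate())
    print("strongly biased forecasters:", simulate(forecast_bias=0.5))
```

Under these assumed numbers the averaged forecast has the smaller error, because the pilot's difference in means is very noisy at this sample size. The second call shows the limit of the idea: if forecasters share a large common bias, averaging does not remove it and the pilot wins.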
Chazel Hakim: Your most recent working paper, “Forecasting in the Field,” does try to expand on that work you just talked about. Could you discuss what questions you’re attempting to answer in that paper, and how you’re going about answering those questions?
Nick Otis: Here’s the kind of big-picture motivation for that paper. There was some really encouraging evidence from Stefano DellaVigna and Devin Pope, who had run this experiment on Mechanical Turk. They had people predict the results of a bunch of different interventions to try to motivate people to exert costly effort: the people had to press the keys A and B over and over for a number of minutes. And they found that, in this massive experiment with around 10,000 people and many different treatments, the people that were similar to the intervention recipients were able to do a really good job of ranking which interventions would be more or less effective. If you look at their average predictions of different interventions, those predictions did a really good job of ranking things.
And that was sort of the foundational paper in forecasting the causal effects of interventions. It was an inspiration for a lot of the things that I’m working on right now. Stefano and Devin had this really clean setting where they were running this pretty tight experiment that was in an online “laboratory-like” setting. Things are relatively controlled, the interventions are pretty straightforward, and there’s this very provocative result about the accuracy of people’s beliefs. And so I thought, “Let’s see what happens if we take that same idea, that people on average might be able to predict which kinds of interventions are most effective, and look at it in an applied setting like field experiments in Kenya.”
I then spent a lot of time figuring out which projects I could work with to collect predictions. The hope was to collect predictions of studies before anybody knew the results (including the principal investigators of the project), select which outcomes we were going to have people predict, and then wait some time to see what the experimental results were. We got this really nice set of projects. The projects looked at the general equilibrium effects of cash transfers, the effect of cash transfers compared to the effects of a mental health intervention (sort of similar to cognitive behavioral therapy), and an aspirations and goal-setting intervention, benchmarked against cash.
We went out and collected predictions from academics and from people similar to the intervention recipients on the causal effects of these interventions. And we used a set of outcomes that had already been pre-registered by the authors of the paper. To sum up my answer to your question, the main motivation for me on this paper is the following: I’d like to see how well people can predict which policies are going to work in a setting where there’s generally a lot of uncertainty about what’s going to be effective and where it’s very costly to evaluate things. Forecasting won’t replace running big randomized controlled trials, but maybe forecasting can help us choose what we run in those trials.
Chazel Hakim: As discussed in one of your other papers, forecasting also has potential benefits for the economics field, such as the possibility to mitigate publication bias. Could you talk about some of those benefits in more detail?
Nick Otis: This is still in the early days for work looking at forecasts of experimental results, so we’ll have to see how valuable forecasting turns out to be in the first place. But here are the things that I think are promising. The first is the one that I just talked about. It’s great if there’s a lot of information in people’s predictions; we can use that information to improve the selection of interventions.
Here’s the second. There’s a big problem right now that’s been getting a lot of attention in economics and in the other social sciences, which is publication bias. A lot of studies are evaluated and published based on whether the results are significant as opposed to, for example, how interesting the question is, regardless of the statistical significance of the finding. An experiment, for example, might have a null result, and reviewers say, “Well, that’s not super interesting,” and the study ends up not being published.
How could forecasts come into play here? Let’s suppose you have a bunch of experts who say, “I think this intervention is going to be very effective.” But afterward, the results show that the intervention is not effective. All of a sudden, what might feel like no information in a null result could become exciting to people, as it’s something that’s actually providing a lot of information relative to the priors of academics in the field. So by collecting these predictions, we may be able to help correct publication bias to some extent by putting research results in context. The context of a null finding could be that it deviates substantially from academic priors.
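One way to make "a null result that deviates from academic priors" concrete is to compare the estimate with the average expert forecast as well as with zero. The sketch below is an illustrative measure only, not a method taken from Nick's papers: the function name `surprise_z`, the expert forecasts, and the estimate are all hypothetical, and it treats the forecast average and the estimate as independent.

```python
import math
import statistics


def surprise_z(expert_forecasts, estimate, estimate_se):
    """z-score of the gap between an estimate and the mean expert forecast.

    Illustrative only: assumes the mean forecast and the experimental
    estimate are independent, so their variances add.
    """
    mean_forecast = statistics.fmean(expert_forecasts)
    forecast_se = (statistics.stdev(expert_forecasts)
                   / math.sqrt(len(expert_forecasts)))
    combined_se = math.sqrt(estimate_se ** 2 + forecast_se ** 2)
    return (estimate - mean_forecast) / combined_se


if __name__ == "__main__":
    # Hypothetical numbers: experts expect a sizeable effect,
    # but the experiment finds an estimate close to zero.
    forecasts = [0.30, 0.25, 0.35, 0.28, 0.32]
    estimate, estimate_se = 0.01, 0.05

    z_vs_zero = estimate / estimate_se
    z_vs_priors = surprise_z(forecasts, estimate, estimate_se)
    print("z against zero:", round(z_vs_zero, 2))
    print("z against expert priors:", round(z_vs_priors, 2))
```

With these made-up inputs the estimate is statistically indistinguishable from zero, yet it sits far from what the experts predicted, which is the sense in which a null finding can still carry a lot of information.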
Chazel Hakim: What other questions are you interested in exploring and researching in the future, whether in forecasting or otherwise?
Nick Otis: What am I interested in researching in the future? Okay, so here’s part one: Even if forecasts can accurately predict what will work in some situations, they probably aren’t going to work everywhere. Hopefully, we can provide some bounds around the types of circumstances where forecasts will be accurate. I’m sure they won’t be accurate for predicting every type of result. I think some, perhaps many, results are truly going to be unexpected. So it’d be nice to have some bounds around the domains where we think forecasting will be useful.
I think the bigger-picture question that I’m interested in is a bit more challenging to answer. In the best-case scenario, where we know which circumstances forecasting works in, we would have a new mechanism [forecasts] to choose things from a choice set. Forecasts will help us pick the policies from the choice set, but they won’t tell us what to put in the choice set. You can think of the question: How do you get people to show up to school? How do you reduce unemployment? How do you help people pick better health insurance? People can think of many different potential policies for each question, but it isn’t feasible to test all of them. We need research on both how to efficiently choose things from the choice set, and also on how to populate the choice set. We don’t know, in my opinion, that much at all about how to design the choice set that we select policies or interventions from.
So I’d like to examine how people create the choice set, how people come up with (or produce) the policy interventions. “Let’s create the choice set, and then let’s use forecasts to extract the good stuff from that choice set.” That’s the idea behind some of the recent work that I’ve been doing. We have people design text messages to motivate people to learn about coronavirus. Text messages are pretty insubstantial when it comes to thinking about policies. But it’s nice because you can test hundreds or thousands of them at a relatively low cost and they can, in some circumstances, be fairly effective as a light-touch, non-financial incentive.
That’s an area that I’m hoping to spend more time exploring. Are people responsive to incentives in coming up with interventions? If you pay people to come up with better policy interventions, do they, or is the production of interventions inelastic? Or perhaps the creativity required to produce effective messages is crowded out by financial incentives. If it is, maybe we shouldn’t be paying people to come up with better policies. We need to use some other lever to expand the choice set in a useful way.
Chazel Hakim: Could you expand on that text messaging project you’re currently working on?
Nick Otis: There are some logistical challenges, so we’ll see what happens with it first. But here’s the idea behind this project. We’re trying to motivate people in Kenya to do an SMS-based COVID task. A task would be something like taking a quiz to learn about misconceptions of COVID. And in terms of SMS messages, you could think of a thousand different possible text messages that you could send to people to motivate them to do the task. You could send them a message like, “You have the power to help your community, opt to do this thing.” Or you could think of a message like, “Don’t be the one who lets your community down, opt in to receive messages.” There’s an infinite number. It’s a super multi-dimensional “policy space” because text messages are made of words.
So we are trying to answer the question, “If we pay people to come up with more effective messages, are they able to do so?” And the other side of this question, which I think is really interesting, is, “If you pay people to come up with more effective messages, do they do so by coming up with messages that will be more costly for recipients, or in other words, a sort of ‘negative’ message for recipients?” Maybe then we would need to adjust the contract to pay out both on the effectiveness of the message and how costly the message is. I think the space of contracts to encourage people to design policies is super interesting. There’s some kind of related work in the R&D and innovation literature, but not a lot of empirical work. So I’m really excited about that area.
Chazel Hakim: That’s really exciting to hear. If I were a researcher or student interested in reading or learning more about the work you’ve talked about, where should I go to do that?
Nick Otis: My website is one place, which is www.nicholasotis.com, and my email is firstname.lastname@example.org. I’ll also give a quick plug for socialscienceprediction.org, which is a platform that Stefano DellaVigna, Eva Vivalt, a bunch of other people and I have been working to develop. It’s a sort of public good to streamline the collection of predictions for social science results, so if you’re interested in collecting predictions, I encourage you to check out that website.
Chazel Hakim: Sweet. Thank you so much for your time!
Featured Image Source: Center for African Studies
Disclaimer: The views published in this journal are those of the individual authors or speakers and do not necessarily reflect the position or policy of Berkeley Economic Review staff, the Undergraduate Economics Association, the UC Berkeley Economics Department and faculty, or the University of California, Berkeley in general.