How to Write a NeurIPS Paper [1]
The popular view that scientists proceed inexorably from well-established fact to well-established fact, never being influenced by any improved conjecture, is quite mistaken. Provided it is made clear which are proved facts and which are conjectures, no harm can result. Conjectures are of great importance since they suggest useful lines of research. —Alan Turing, Computing Machinery and Intelligence (1950)
For the sake of everyone involved in research, I hope there isn’t a tractable algorithm we can run to get creative research ideas: That’ll poop the party for everyone! There are, however, some unreliable maneuvers you can take to facilitate interesting ideas, like voodoo magic, and this chapter will focus on those. Take whatever I write here with a giant slab of Pakistani pink salt, as this is an inherently chaotic process that resists characterization.
finding the mojo
This could be wrong, but I think (or at least, find it helpful to think) that creativity is mostly functional, in the sense that it is difficult to be creative out of thin air, independent of your immediate circumstances. Research is similar: The kinds of ideas you’ll come up with are largely a function of your background, your institution, and your colleagues.
Viewing creativity as a function has some immediate consequences. First, this function is too complicated to invert: You can’t magically deduce the right circumstances that, when presented to you, result in a great research idea. Second, despite its complex nature, it is still just a function: Given the same input, it will always produce the same output. Thus, it is important to seek different environments, as that is your only source of mojo, your “random seed” if you will. For instance, you can try reading papers outside of your immediate area, and engaging in conversations with people in different departments. What you’ll find is that since you have spent so much time and effort in your particular domain of interest, you will readily connect these explorations back to your own investigations.
My office was in CSAIL (computer science), but across the street is the BCS (brain and cognitive science) building. These buildings are close to each other by design: It was intended that these two groups of people should talk to each other. Each day I would take a trip to BCS, hang out with my friends who were Josh’s students, and run my ideas by them. They could review these ideas from a non-CS point of view, and refer me to their cognitive scientist colleagues. However, most of the time we ended up being distracted by the various educational toys on the table.
performing the ritual
Hopefully, by engaging in explorative activities, you have collected some primordial goop of ideas in your head. They shift in and out of focus, sometimes resembling key insights, but upon further inspection turn out to be bullshit. Set aside 1–2 hours when you are mentally alert, and embrace this chaotic process in a kind of meditative ritual. Be very patient: This will take a long time, and most of the time you will apparently make zero progress. However, like a pupa in progress, you are subconsciously building stronger intuitions of the problem at hand, and when the time is right, the insight will fall right out of your head (or not, it is research after all).
Given the chaotic nature of this ritual, it is important to have some “thought guides” that one can follow to make this process more reliable. There are many ways to do this; I will share a few of mine:
- Formulate a problem. Rather than thinking aimlessly, try to explicitly write down a problem, and relentlessly chase after a solution. By repeatedly hitting your head on the same problem, you might become better at solving it. This is clearly a better strategy than hitting your head on a different problem each day, in which case you’ll never become better at anything. For instance, I am interested in the communicative aspects of program synthesis, so my problem is “how might a person communicate a program with the least effort?” This is the puzzle I’m still working on, and I have barely scratched the surface.
- Explain a phenomenon. This is one of the most powerful tools from the cohort of computational cognitive scientists. Rather than inventing a novel algorithm, study a human activity that is taken for granted yet undeniably useful, and try to model this behavior algorithmically. The benefit of this approach is that you are not pulling anything out of a hat, but modeling and imitating a proven process that already exists. For instance, check out the introduction of this paper we wrote.
- Postpone challenges. Since one can only keep so many thoughts in one’s head at the same time, it is crucial to know when you can postpone some challenges so that your creative flow is not interrupted. An experienced researcher has a precise calibration of which challenges are “a matter of time”, which must be “confronted immediately”, and which are “impossible”. Thus, an experienced researcher can ask bolder research questions, without being bogged down by details that can eventually be resolved.
divining a prophecy
The end result of the creative process is a hypothesis: a short, interesting, and unproven statement. You are not done with the creative process until you write down this hypothesis. It can be as practical as “we propose approach X which should out-perform approach Y”, or as conceptual as “property X is true for system Y under certain circumstances, which has the following consequences”. We will explain how to vet a research hypothesis in detail in the next chapter, but for now, check with yourself that these properties hold: short, so that it is meme-able, something people can tweet about easily; interesting, most importantly that you find it interesting to work on; and unproven, so that it is actually research and not an application.
story begins
My advisor asked me in May or June of 2019, more or less in these exact words, “So Evan, do I still need to pay you next year?” to which I said “Yeah that’ll be nice!” Thus, shortly after I finished my PhD thesis defense, I became a postdoc for a year in the same lab while I looked for jobs. So the story started roughly sometime in September 2019, during my year as a postdoc.
At the time I was very interested in viewing program synthesis in the style of 20 Questions. Essentially, can a user interact with a synthesis system in a series of question-answering rounds to indicate which program they have in mind? This way, a novice user can still program, without expertise in the programming language. My rationale was that human communication in 20 Questions is undeniably powerful: One can often identify a single correct object against a set of infinitely many distractor objects in just 20 rounds. Therefore, if program synthesis is figuring out which program the user has in mind out of the almost infinite set of all programs, clearly there’s some good research to be done in relating these two processes.
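To make the 20 Questions framing concrete, here is a minimal sketch of a question-answering loop that narrows down a set of candidates. It is only an illustration, not the system from the story: the names `best_question`, `play`, `ATTRIBUTES`, and `CANDIDATES` are hypothetical, candidates are assumed to be described by yes/no attributes, and the "ask the question that splits the remaining candidates most evenly" rule is just one simple strategy.

```python
from itertools import product


def best_question(candidates, attributes):
    """Pick the attribute whose yes/no answer splits the candidates most evenly."""
    def imbalance(attr):
        yes = sum(1 for c in candidates if c[attr])
        return abs(2 * yes - len(candidates))
    return min(attributes, key=imbalance)


def play(target, candidates, attributes):
    """Ask questions until one candidate remains; return (guess, rounds)."""
    remaining = list(candidates)
    unused = list(attributes)
    rounds = 0
    while len(remaining) > 1 and unused:
        attr = best_question(remaining, unused)
        unused.remove(attr)
        answer = target[attr]  # the user answers truthfully about their target
        remaining = [c for c in remaining if c[attr] == answer]
        rounds += 1
    return remaining[0], rounds


# Hypothetical toy space: every combination of four yes/no attributes (16 objects).
ATTRIBUTES = ["red", "tall", "striped", "round"]
CANDIDATES = [dict(zip(ATTRIBUTES, bits))
              for bits in product([False, True], repeat=len(ATTRIBUTES))]

if __name__ == "__main__":
    guess, rounds = play(CANDIDATES[11], CANDIDATES, ATTRIBUTES)
    print(guess, rounds)
```

In this toy space each well-chosen question halves the candidates, so 4 rounds are enough to single out 1 of 16 objects; by the same arithmetic, 20 perfectly balanced yes/no questions could distinguish about a million.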
To keep my setup simple and grounded, I focused on simple objects that are nonetheless combinatorial in nature and come with a set of attributes: They are as good as real programs from a communication point of view. I made a pattern communication game, where each pattern consists of an arrangement of colored blocks, with a set of attributes you can use to describe them, such as colors and the number of pieces. I imagined that there are two players: player1 (the user), who has a particular pattern in mind (i.e. a program), and player2 (the system), who owns the set of all patterns (i.e. the space of programs) and collaborates with player1 to identify which pattern player1 has in mind. Here’s a note showing part of this game. Interestingly, the note also has some other funny tidbits and doodles that are off topic.
I also made a physical version of the game which I could carry around with me to play with my friends (or some unfortunate stranger I ran across in school). It consists of two identical sets of 10 lego constructions, {a1,a2,…,a10}, each made by combining 2 to 3 basic blocks. I put the two sets in two small paper boxes, and I would give one box to a friend and ask them to draw a construction from the box at random, and attempt to communicate to me which piece they had drawn until I found the corresponding piece in my box. I compared two setups: Under one setup I just let the other person talk, and under the other I got to ask clarification questions. (I have these pieces somewhere; when I find them I’ll upload their photo.)
At some point I arrived at a very useful metaphor for this communication process: that of a person shopping for an item. The person has some requirements for the item in mind, and the shop owner has a set of items that they can offer. By collaboratively talking to each other, the shop owner finds the right item for the shopper. Here’s a small transcript in the context of ordering food from a restaurant.
suddenly pragmatics
Sometime in October, my friend Maxwell Nye showed me something incredible. “Evan, you’re interested in communication right? Have you heard of pragmatics?” “No, tell me”. He then proceeded to draw the following picture on the whiteboard.
“Okay so there are these three guys right? The first guy’s not wearing anything, the second guy is wearing glasses, and the third is wearing glasses and a hat. If I tell you ‘hey my friend wears glasses’, who is more likely to be my friend?”
“Oh fuck!” I yelled audibly, my face MonkaS. These weren’t just three guys to me, they were three programs. It wasn’t just the word “glasses” to me, it was an attribute of a program. Some terrifying process is at work, where despite the word “glasses” being ambiguous between the 2nd program and the 3rd program, humans instinctively know to select the 2nd program over the 3rd one. “Wait this is insane, how does it work?”
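One way to see why “glasses” points at the second guy is a small recursive listener-speaker calculation. The sketch below assumes a Rational Speech Acts style model, which is one common formalization of pragmatics and not necessarily what was drawn on the whiteboard. The tables `MEANINGS` and `PEOPLE` and all function names are hypothetical, and the only utterances assumed available are “glasses” and “hat”.

```python
# Hypothetical encoding of the example: which utterances are literally true of whom.
PEOPLE = ["nothing", "glasses only", "glasses and hat"]
MEANINGS = {
    "glasses": {"nothing": False, "glasses only": True, "glasses and hat": True},
    "hat": {"nothing": False, "glasses only": False, "glasses and hat": True},
}


def normalize(scores):
    """Scale non-negative scores to sum to 1 (all zeros stay all zeros)."""
    total = sum(scores.values())
    if total == 0:
        return {k: 0.0 for k in scores}
    return {k: v / total for k, v in scores.items()}


def literal_listener():
    """For each utterance, spread belief evenly over people it is true of."""
    return {u: normalize({p: 1.0 if MEANINGS[u][p] else 0.0 for p in PEOPLE})
            for u in MEANINGS}


def speaker(listener):
    """For each person, prefer utterances that lead the listener to that person."""
    return {p: normalize({u: listener[u][p] for u in MEANINGS}) for p in PEOPLE}


def pragmatic_listener(spk):
    """For each utterance, weigh people by how likely a speaker would say it."""
    return {u: normalize({p: spk[p][u] for p in PEOPLE}) for u in MEANINGS}


if __name__ == "__main__":
    beliefs = pragmatic_listener(speaker(literal_listener()))
    print(beliefs["glasses"])
```

Under these assumptions the literal listener is split 50/50 between the two glasses-wearers, but the pragmatic listener puts 0.75 on “glasses only”: a speaker describing the guy with the hat would more likely have said “hat”, so hearing “glasses” is evidence for the guy without one.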
Ranking ambiguous programs is a long-standing problem in program synthesis. For instance, if I ask you to give me a program that maps 2 to 4, I probably expect something like “x+2” or “2x”: Both map 2 to 4, and both are fairly intuitive. However, implemented naively, the synthesizer might give you a completely crazy program like “(x+14)/4”, or something degenerate like “0x+4”. Both are technically correct, yet unintuitive. Somewhere in my head, I instinctively knew that using this “pragmatics” thing, we could resolve the problem of disambiguation in program synthesis elegantly.
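The ambiguity is easy to reproduce with a naive enumerator. The sketch below is a toy, not a real synthesizer: it assumes the program space is just `a*x + b` with small integer coefficients, and the function name `consistent_programs` and the coefficient ranges are made up for illustration.

```python
def consistent_programs(examples, coeffs=range(-3, 4), offsets=range(-5, 6)):
    """Return every (a, b) such that a*x + b matches all (x, y) examples.

    The search space (linear programs with small integer coefficients)
    is an assumption chosen to keep the example tiny.
    """
    return [(a, b)
            for a in coeffs
            for b in offsets
            if all(a * x + b == y for x, y in examples)]


if __name__ == "__main__":
    # A single input-output example: the program should map 2 to 4.
    for a, b in consistent_programs([(2, 4)]):
        print(f"{a}*x + {b}")
```

Even in this tiny space, four programs satisfy the single example, including the intuitive `1*x + 2` and `2*x + 0` alongside the degenerate `0*x + 4`. Nothing in the example itself says which one the user meant, which is exactly the ranking problem.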
see you in the next chapter!
In this chapter, we gave some hints on how to harness your creative resources to divine a research hypothesis, and an account of how I did it. In the next chapter we’ll talk about how to vet a research idea to ensure it is solid.
I will be updating this series once every couple of weeks, depending on how much free time I have. I will be announcing new updates on my twitter, so if you can follow it that’d be great :)
thanks for reading and high five!
— evan
chapters list :
- [0] overview : a look behind the scenes
- [1] getting a research idea : series of voodoo rituals
- [2] vetting a research idea : be honest and disciplined
- [3] acquiring knowledge : set a goal first
- [4] building a prototype : <under construction>