Language, art and music are extremely revealing about the workings of the human mind

I was interviewed by Gisela Govaart about my research. The interview is published online here.

***

“Language, art and music are extremely revealing about the workings of the human mind” – An interview with Jelle Zuidema
by Gisela Govaart, January 2016

Jelle Zuidema is assistant professor in cognitive science and computational linguistics at the Institute for Logic, Language and Computation. He does research on these topics, coordinates the Cognition, Language and Computation lab and supervises five PhD and several MSc students there. He teaches in the interdisciplinary master’s programs Brain & Cognitive Sciences (MBCS), Artificial Intelligence, and Logic, and coordinates the Cognitive Science track in the MBCS. Jelle was the organizer of the SMART CS events from 2011 until 2015.

Jelle Zuidema

“I started my studies with two programs in parallel at the University of Utrecht: Liberal Arts – where I focused on Literature – and Artificial Intelligence. In my final two years I dropped Liberal Arts, because I decided I needed to specialize; I got my degree in AI, with a specialization in Theoretical Biology. My thesis was on Evolution of Language, so it was a rather weird mix. I was first interested in evolution, and then my supervisor suggested: since you have this background in computational linguistics and logic, why don’t you look at the evolution of language. So it was a bit accidental, but immediately things started to fall into place, and I got really excited about the topic, and decided that I wanted to do my PhD on that as well. For my PhD I moved first briefly to Paris, and then I was in Brussels for two years, in the group of Luc Steels. After two years in Brussels I moved to Edinburgh, and I actually got my PhD degree from the University of Edinburgh in the group of Simon Kirby.”

“When I moved to Edinburgh, my supervisor suggested that I should make use of the fact that they have some of the best theoretical linguists in the world. I ended up sitting in on a couple of classes, one of which was ‘Theoretical Linguistics’, taught by Mark Steedman. We had to read the basics of Chomsky, Shannon, Ross, and a bit of Steedman himself – the history of information theory and grammar formalisms. This course was very influential for the questions that I started to ask myself later. Computational linguistics is now very much dominated by the statistical, machine learning-based approach: trying to learn from data. In that field I am in the tiny little corner where people actually take some inspiration from linguistics, about hierarchical structures for example. As a consequence, linguists group us together with the machine learning people, and the machine learning people group us together with the linguists. We are getting fire from both sides.”

“A lot of people think that there is a contradiction between statistical learning and hierarchical structure. That is why I felt that I found my natural home when I moved to Amsterdam to work at the ILLC. The computational linguists at the ILLC have a long tradition of working on data-oriented parsing, a model that nicely shows there is a continuum from what people typically think of as statistical learning – which is about computing statistics over neighboring words – to hierarchical structures – which are typically said to ignore frequency information. However, there is no reason that we should ignore frequency information while making use of hierarchical structures. It turns out that you can define probability distributions over very complex objects, such as parse trees, or logical formulae. I think that this marriage of hierarchical modeling and probabilistic modeling is where the most interesting questions are. In our research, we often simplify on both sides. We do not have statistics as fancy as some of the machine learning people, and we do not have symbolic hierarchical objects as complex as the theoretical linguists. Instead, we try to find models that go in the right direction, by defining probabilities over relatively simple hierarchical trees for sentences. That allows us to study questions about language processing, about how we compute the meaning of sentences, and about language acquisition. For me, this offers a nice way out of the endless debates that we observe in linguistics, between people who favor statistics and people who favor complex models.”
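The idea of defining a probability distribution over hierarchical objects can be sketched with a toy probabilistic context-free grammar – a much simpler cousin of data-oriented parsing, not the ILLC's actual model. All rules and probabilities below are invented for illustration:

```python
# A toy PCFG: each rule (left-hand side -> right-hand side) carries a
# probability, so frequency information and hierarchical structure combine
# in one model. The grammar and the numbers are made up for illustration.
PCFG = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("the", "N")): 0.6,
    ("NP", ("she",)): 0.4,
    ("N", ("dog",)): 1.0,
    ("VP", ("barks",)): 1.0,
}

def tree_probability(tree):
    """Probability of a parse tree = product of its rule probabilities.

    A tree is a tuple (label, child1, child2, ...); a leaf is a bare string.
    """
    if isinstance(tree, str):  # terminal symbol: contributes probability 1
        return 1.0
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = PCFG[(label, rhs)]
    for child in children:
        p *= tree_probability(child)
    return p

tree = ("S", ("NP", "the", ("N", "dog")), ("VP", "barks"))
print(tree_probability(tree))  # 1.0 * 0.6 * 1.0 * 1.0 = 0.6
```

The same recipe – multiplying probabilities down a tree – extends to the richer fragments used in data-oriented parsing, where whole subtrees rather than single rules carry the statistics.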

“When I first started computational modeling, I got really excited about it, because you suddenly realize that a lot of thinking that you have been doing before was very imprecise. The first computational model really is a shocking experience: how much more precision you have to give, how many more detailed questions you suddenly have to answer. I was looking at evolution, and people do a lot of thinking about evolution; they talk for example about mutations and selection. But then you start thinking: what is a mutation exactly? It means that by chance, something changes to something else. Fair enough; but what exactly was changing, and with which probability? Suddenly you have to account for all these details. You have to come up with parameters that quantify exactly the probability that something will change into something else. And it turns out that this really matters for what comes out of the model. When you first experience this world of detail that you were overlooking with verbal theorizing, then you want to spread the word.”
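The kind of precision a model forces is easy to demonstrate with a minimal, invented mutation model: to run it at all, you must say exactly what a genome is, what can change into what, and with which probability (here, binary strings with a per-bit flip probability `MU` – all choices are illustrative, not from the interview):

```python
import random

# "Mutation" pinned down as explicit parameters: binary genomes, and each
# bit flips independently with probability mu when the genome is copied.
MU = 0.01  # per-bit mutation probability; the value is an arbitrary choice

def mutate(genome, mu=MU, rng=random):
    """Copy a genome (a string of '0'/'1'), flipping each bit with prob. mu."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < mu else bit
        for bit in genome
    )

random.seed(0)
print(mutate("0000000000", mu=0.5))  # roughly half the bits flip, on average
```

Even this tiny sketch shows where verbal theorizing underspecifies: changing `mu`, or letting it differ per position, changes what comes out of the model.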

“So, do we simplify a lot? Yes, we do – but so does everyone else. Everyone in science is making ridiculous simplifications. I think one of the great advantages of the approach I am advocating is that the simplifications we make are for a large part very explicit. That means that it is easy for people to criticize them. When I build a parser that processes a sentence and computes its hierarchical structure, and I show this to linguists, they start complaining: ‘Yeah, but this is 1970s Chomskyan theory, we do not believe in noun phrases anymore’. There are a lot of comments on the symbols that we use, and on these very simple tree structures that these parsers work with. But these people do not realize that when they build a fancy modern parse tree, they often do not specify how all of the decisions are made that lead people to actually come up with that parse tree. For me, it is all summarized by this quote from Cavalli-Sforza and Feldman: ‘Verbal theories avoid the charge of oversimplification only at the expense of ambiguity’. So, yes, we simplify. And we are proud of it. And we are proud that everyone can see where we simplify.”

“From the very beginning of my studies I was already somewhat surprised that there was not more interaction between cognitive science and art. I was always ‘primed’ by my cognitive science courses when I was taking literature courses. For me, the most interesting thing about literature really is that when people are reading a novel – which is really just black ink on a white background – people can get so lost in a completely different, counterfactual world. Literature is a technology that is manipulating people into having feelings and having thoughts about worlds that do not exist. I think this is utterly amazing, and I really think we all should want to understand how that could work. Something similar happens with music. The influence that music – which is really just sound waves – can have on the emotional system of people is amazing. It is astonishing how this balance you find in music between simplicity and complexity, between predictability and unpredictability, somehow has found this direct access to the emotional centers in our brains. I think that music and literature, and this is true for the other arts as well, are some of the most fascinating systems from a cognitive science point of view that you can think of. And I do not think that enough people are studying them from that angle. I really believe that there is a very big role to play there for the humanities, in studying the properties of language, art and music, and using them as ‘extreme cases’: as cases that are extremely revealing about the workings of the human mind.”

“There is a very practical reason why SMART exists as an initiative, and that is that the government wants universities to specialize, to become more different from each other. The government told the universities that they should decide on research focus areas. Hence, the universities told the faculties that they should decide on research focus areas. The Faculty of Humanities – and many other faculties, actually – thought it was a good idea to choose Brain and Cognitive Science. Henkjan Honing and Kees Hengeveld, who were the ‘trekkers’, as it is called, were looking for someone who knew the research at ACLC and at ILLC, where most of the cognitive science in the humanities faculty is happening, and who could somehow bridge these two research institutes. Then they asked me. My thought was that if you are to do something like this, you need a good brand name. So I came up with the name SMART Cognitive Science, which is an acronym for Speech & language, Music, Art, Reasoning & Thought. But there is also a teaser: it contrasts with the ‘expensive cognitive science’ that is happening a lot in other faculties. This is what the humanities are good at: asking very smart questions about cognition. I think it is important that cheap but good research in cognitive science is also supported, and that not all the money goes to people with expensive fMRI or MEG machines.”

“For now, what I am most interested in is to try to make a bridge between language at the behavioral level, as we study it in linguistics, and at the neural level. I am working on this as part of the big national initiative ‘Language in Interaction’. The idea that I and some other people in the world are pursuing is that the structure of language, the kind of computations that you need to be able to process language, is really revealing about the underlying neural implementation. What we are exploring is whether what we know about how language works puts constraints on what the possible neural implementations are. Neuroscience is a very advanced field at the level of single neurons. Cognitive neuroscience is also a very advanced field at the level of mapping the brain, of determining which structures in the brain are correlated with what kind of behaviors, also when it comes to what we call the language network. But when you ask how networks of neurons really support the computation of what sentences mean, or how sentences are structured, or how words are stored, or how the meanings of words are retrieved, then we really do not have a clue, not even how to start. What I am putting my bets on, which is a high-risk bet, is that the structure of language really reveals something fundamental about how the brain is organized. This is controversial both in linguistics and in neuroscience. So it is a high-risk high-gain kind of project. There are a few people in the world who share this intuition, but there are also a lot of people who think it is a waste of time. My experience is that these guiding intuitions at the very least help to focus your research, and help you discover interesting things on the way, even if you never reach the endpoint. It is a little bit like ‘aim for the stars, and you might reach the moon’.”

“Even though much of my research is now focused on how we process language, on how we compute the meaning of sentences, the ultimate question that I try to understand is: what makes human language unique? I am still motivated by these old questions that I did my Master’s and PhD on: how did the crucial difference between chimpanzees, bonobos and humans emerge in evolution? Richard Lewontin has this nice quote: ‘On the average, chimpanzees and humans are very similar to each other at the level of genes and proteins, but they differ radically in their ability to write books about each other’. That is what I am trying to understand. I really think that there is a good possibility that the answer is somewhere in the neural code. It is a bit similar to the genetic code. There is this extremely interesting history of the discovery of DNA. One of the things that is so intriguing about this process is that lots of people worldwide had results that almost revealed what was happening. And then, when it was finally discovered, by Watson and Crick, everything fell into place. All of these extremely puzzling results suddenly started to make sense with the discovery of the double helix, and this universal code for all life on earth. I think there is a possibility – maybe it is a bit of wishful thinking – that there is something similar for language, something about how language is encoded in the brain. There are so many questions that are puzzling us now: how can it be that no other species has language? How can it be that no species has a little bit of language (either you have it or not)? How can language be so different from everything else on the planet, while our neurons and proteins are so similar to those of other species? All of these weird paradoxes that we observe might fall into place once we have cracked the code. But maybe people in 100 years will laugh at me, because it will be just as much an enigma then as it is now. But I am optimistic. And I think I am optimistic both because I believe in it, and because it is a good research strategy to be optimistic.”