06 September 2021, Iris Proff
© Iris Proff
Robert van Rooij has been investigating the intricacies of human language and its meaning for 27 years. Vagueness, causality, conversational implicatures and pronoun resolution – many of his research topics carry abstract-sounding names.
Recently, however, something has been changing in van Rooij’s research that mirrors a ubiquitous change in the institute’s identity: more and more researchers are starting to explore connections between their theoretical work and practical applications. As the new director of the ILLC, van Rooij is determined to foster this development and to help the institute find its balance between the neat realm of ideas and the messy real world.
The role of logic in today’s AI boom
Robert, you have been the director of the ILLC for a few weeks now. Did you have a slow start over the summer?
Not at all! A lot has happened already. I was involved in the hiring of a number of new staff members, while we had to make an effort to keep some current staff members. Moreover, opportunities to engage in bigger projects are opening up. I have been in the management team of the ILLC for seven years, but still, being director is a very different task from anything I have done before.
What are going to be the biggest challenges during your time as director?
For me personally: that I don’t get overstressed. In terms of content, the biggest challenge might be to show that logic – considered broadly – can still contribute a lot to the development of artificial intelligence (AI). At the moment most progress in AI is being made in areas where logic is not so important.
It is striking how fast AI research at the UvA is growing. Can the ILLC keep up with this fast development?
Responsible, explainable, and transparent AI are now high up on the agenda of AI research and the ILLC is in a unique position to contribute to this, both because of our strength in logic and because of our interdisciplinarity, with half of our researchers based in the Faculty of Humanities. We are not working on small-time progress but trying to understand the bigger, underlying reasons and mechanisms.
Bridging theory and application
There is some controversy about whether the ILLC should do more applied research and seek more industry collaborations. What is your stance on this?
At our institute, we tackle problems theoretically and until recently, the focus hasn’t been on our potential to contribute to society. The tendency is now to become more applied. To a degree, this is necessary since the criteria according to which research funding is distributed are changing. More and more people at the institute are collaborating with companies, such as SAP, Booking.com, Facebook or Google. We are also starting initiatives such as Language Technology for People – and hopefully there will be more in the future – where we think in an organised way about how we can put our work to use for society.
On the other hand, the theoretical point of view has always been our strength and we should certainly not neglect that. It’s not necessary for everybody to be very strong in knowledge valorisation – some people are simply in a better position for this than others. But everybody should think about how their knowledge can be useful.
What is the Language Technology for People initiative about and who is involved?
We are a group of researchers from the ILLC, the Informatics Institute and Linguistics. We are working on how language technology can be used not just for commercial purposes, but for people who are impaired or disadvantaged because of something related to language. That might be because they have hearing problems, because they cannot read so well or because they are part of a community which is disadvantaged due to biased information about that community. There are various ways in which we might help these people. For instance, Floris Roelofsen is working on avatars for sign language. Others develop AI techniques to automatically simplify complicated texts and make them understandable for people with a lower education.
On the meaning of generic sentences
You are currently working on implicit biases in the media. This work is based on your earlier research on the meaning of generic sentences, such as “lions are dangerous”. In a nutshell, what are your thoughts on the meaning of such generics?
People take a generic sentence like “lions are dangerous” to be true because they assume the existence of a causal connection between being a lion and the associated attribute: dangerous. That causal connection doesn’t have to be strong – not all lions need to be dangerous for us to believe that “lions are dangerous”. How strong the causal relationship needs to be depends on how good or bad this attribute is – not for the lion, but for us. That is the hypothesis.
Animals learn associations between two things faster if one of the two has a high emotional impact on them. We think it makes sense that humans show the same behaviour. That is why generics like “ticks transmit Lyme disease” are acceptable: not because most ticks transmit Lyme disease, but because Lyme disease is so bad for us.
It’s evolutionarily favourable to believe it. Not believing it carries the bigger cost.
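As a rough illustration of this hypothesis, the trade-off between causal strength and impact can be sketched as a toy decision rule in Python. This is not van Rooij's formal account: the function name `accepts_generic`, the choice to multiply strength by impact, the cutoff of 0.5 and all numbers below are assumptions invented for this sketch.

```python
def accepts_generic(causal_strength: float, impact: float, cutoff: float = 0.5) -> bool:
    """Toy rule: accept a generic when strength weighted by impact reaches the cutoff.

    causal_strength: assumed share of the kind that shows the attribute (0 to 1).
    impact: assumed measure of how much the attribute matters to us (1 = neutral).
    cutoff: hypothetical threshold, chosen only for illustration.
    """
    if not 0.0 <= causal_strength <= 1.0:
        raise ValueError("causal_strength must lie between 0 and 1")
    if impact < 0.0:
        raise ValueError("impact must be non-negative")
    return causal_strength * impact >= cutoff


# Invented numbers: few ticks carry the disease, but the consequence is severe.
ticks_lyme = accepts_generic(causal_strength=0.02, impact=50.0)
# The same weak connection, paired with an attribute that barely matters to us.
ticks_neutral = accepts_generic(causal_strength=0.02, impact=1.0)
```

Under these made-up values the weak but high-impact connection is accepted, while the same weak connection with a neutral attribute is not.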
Now what do ticks and lions have to do with implicit biases in the media?
Stereotypes can be expressed explicitly in terms of generic sentences: “Dutch people love cheese.” “Women are bad at math.” But you also see stereotypes expressed implicitly in texts – for instance when Muslim names appear with negative connotations in a newspaper article. In our research we try to automatically extract those implicit biases from texts.
Formal semantics meets the real world
What can you learn from that?
First, we can determine how biased the language model that we use is. Such language models are based on neural networks which are trained with an enormous amount of data from the internet. From this data, the model learns associations which contain biases. In a second step, we train the model extra on certain media and see if that bias changes. Thereby we can identify how biased different news outlets are with respect to different topics.
Interesting! Since the Black Lives Matter protests started last year, the public has become aware that we all have implicit biases which can be very harmful. Are there any specific topics you are investigating, like gender or ethnicities?
Because we do it in an automatic way, it can be about any topic. We can type in any word, such as Muslim, Chinese, or Dutch and find out, for instance, if there is any particular positive or negative association with that word.
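The idea of typing in a word and reading off its positive or negative association can be sketched in a few lines of Python. This is only a minimal sketch and not the project's actual pipeline, which according to the interview relies on neural language models: the two-dimensional vectors, the dictionaries `base_model` and `tuned_model`, and the function `association` are hypothetical stand-ins made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def association(word, positive, negative, vectors):
    """Mean similarity to positive words minus mean similarity to negative words."""
    pos = sum(cosine(vectors[word], vectors[p]) for p in positive) / len(positive)
    neg = sum(cosine(vectors[word], vectors[n]) for n in negative) / len(negative)
    return pos - neg


# Invented toy vectors standing in for a model before extra training ...
base_model = {"dutch": (1.0, 1.0), "good": (1.0, 0.0), "bad": (0.0, 1.0)}
# ... and after hypothetical extra training on one news outlet.
tuned_model = {"dutch": (2.0, 1.0), "good": (1.0, 0.0), "bad": (0.0, 1.0)}

base_score = association("dutch", ["good"], ["bad"], base_model)
tuned_score = association("dutch", ["good"], ["bad"], tuned_model)
shift = tuned_score - base_score  # a positive shift means a more positive association
```

A score near zero suggests no particular leaning, and comparing the score before and after the extra training step gives a crude picture of how that training moved the association.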
However, my personal interest in this project was to extract specific generic sentences from text corpora, like “Dutch people love clogs”. Then we can check if people also take these sentences to be true. This way I can empirically test my theory about the meaning of generics.
How could this research become useful for people?
Maybe we could create a tool that determines whether a text used by a company to advertise a product contains implicit negative biases. That is one potential application. But I have to admit that I need to think more about ways to apply it.
On a more personal note – what sparked your interest in language?
My family doesn’t have a scientific background. They are all in agriculture and I never thought much about doing anything else until I was 17 or 18. So I started studying horticulture. However, I was always interested in history, and became interested in philosophy as well, and later philosophy of language. I thought I can’t do philosophy of language without knowing more about language, so I studied linguistics.
When I was researching topics like the meaning of “only” or wh-questions, my family could never relate to it. Now I’m working on generics and stereotypes, and they finally understand what I’m doing and think it might be useful.