September 2022: Data Science Insights by Rainer Freudenthaler

Dr. Rainer Freudenthaler is an academic staff member at the MZES. After studying Media Economics at the Hochschule der Medien Stuttgart and earning his master’s degree in Media and Communication Studies at the University of Mannheim, he completed his dissertation in 2021 on the public debate about Germany’s refugee policy in online media.

What is your current research topic?

I work in the project “Implicit and Explicit Racism in News and Social Media: Extent and Effects”, which is part of the research network “Discrimination and Racism” (FoDiRa). In this project, we build on an earlier research project in which automated methods were developed to measure explicit and implicit stigmatisation. By this we mean: Are certain minority groups associated with fear more often than average, or with recognition less often than average (explicit)? And are these groups mentioned in contexts with implicit negative connotations (implicit)? We want to further improve the methods and apply them to new data – to reporting in mainstream and alternative media, and to social media data from influencers. And we want to test experimentally what effect these representations can have.


For those who have not yet delved deeply into the topic of Data Science: How would you explain to a child what you are working on?

We teach computers to read large amounts of text. When humans read texts, the disadvantage is that they can only get through relatively small amounts. So we test whether our computer programmes can recognise what interests us as well as humans can – if so, we can analyse larger quantities of text.
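For readers who want to see what such a comparison with humans could look like in practice, here is a minimal sketch. It assumes that a human coder and a programme have each labelled the same texts, and it uses Cohen's kappa as the agreement measure. Both the choice of kappa and the function name `cohens_kappa` are assumptions made for illustration; the interview does not say which measure the project uses.

```python
from collections import Counter


def cohens_kappa(human, machine):
    """Chance-corrected agreement between two label sequences.

    Illustrative only: the agreement measure is an assumption,
    not the project's documented validation procedure.
    """
    if len(human) != len(machine) or not human:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(human)
    # Share of texts on which both coders chose the same label.
    observed = sum(h == m for h, m in zip(human, machine)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    human_counts = Counter(human)
    machine_counts = Counter(machine)
    labels = human_counts.keys() | machine_counts.keys()
    expected = sum(human_counts[l] * machine_counts[l] for l in labels) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

A value near 1 would suggest that the programme codes the texts much like the human does; a value near 0 would suggest agreement no better than chance.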


Everyone talks about Data Science – how would you describe the importance of the topic for yourself in three words?

Evaluating more data.


What points of contact with Data Science does your work have? Which methods do you already use, and which would be interesting for you in the future?

We are currently working with Latent Semantic Scaling and methods for measuring word embedding bias. Both methods compare how words are used in order to identify which words have similar meanings. We are currently looking with great interest at deep learning methods that seem to be even better at detecting meaning in texts, e.g. NLI-BERT.
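The idea of comparing words by how similarly they are used can be sketched with word vectors and cosine similarity. The example below is a simplified illustration, not the project's implementation: the three-dimensional vectors are invented toy values, the words `group_x` and `group_y` are placeholders, and the helper `association` is a hypothetical name. In real work the vectors would come from an embedding model trained on a large text corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(target, attribute_a, attribute_b, vectors):
    """Mean similarity of `target` to words in A minus mean similarity to B.

    A positive score means the target word sits closer to the A words.
    """
    t = vectors[target]
    mean_a = sum(cosine(t, vectors[w]) for w in attribute_a) / len(attribute_a)
    mean_b = sum(cosine(t, vectors[w]) for w in attribute_b) / len(attribute_b)
    return mean_a - mean_b


# Invented toy vectors, for illustration only (not real embeddings).
TOY_VECTORS = {
    "group_x": [0.9, 0.1, 0.2],
    "group_y": [0.1, 0.9, 0.2],
    "fear": [0.8, 0.2, 0.1],
    "threat": [0.7, 0.1, 0.3],
    "respect": [0.2, 0.8, 0.1],
    "praise": [0.1, 0.7, 0.3],
}

score_x = association("group_x", ["fear", "threat"], ["respect", "praise"], TOY_VECTORS)
score_y = association("group_y", ["fear", "threat"], ["respect", "praise"], TOY_VECTORS)
```

With these toy values, `score_x` is positive and `score_y` is negative, i.e. the placeholder `group_x` lies closer to the fear-related words than `group_y` does. A difference of this kind is roughly what an embedding bias measure tries to quantify.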


How high is the value of Data Science for your work? Would your research even be possible without Data Science?

In our field, I see advantages in scaling – in communication science, quantitative content analysis with human coders always means that you have to limit very precisely which factors you examine. If you can economically study only a handful of newspapers, you can reliably include only a small number of explanatory variables in the model. For example, I would measure the difference that the political orientation of a newspaper makes by comparing two left-wing newspapers with two conservative newspapers. This is limiting. If I can examine larger datasets in an automated way, I can control for more variables: Is it political orientation? Regional vs. national? Commercial vs. cooperative funding? And so on.


What development opportunities do you see for the topic of Data Science in relation to your field?

I think the most exciting developments will be 1. the growth of the data base, 2. possibilities to link with other (quantitative and qualitative) methods and 3. possibilities to share measurement tools and data.