
Demystifying AI Interpretability

This talk will attempt to demystify, for a non-technical audience, the current state of neural network explainability and interpretability, and to trace the boundaries of what is in principle possible to achieve. We will first set up the necessary background to discuss interpretability methods with stakeholders in mind, define basic concepts, and explain distinctions such as inner interpretability versus explainability. Along the way, we will touch on issues of relevance to various stakeholders; for instance, the role of interpretability in explaining how large language models generate text, in revealing reasons for model biases, and in model distillation.

Throughout, we will use a particular lens to demystify what AI interpretability is, and which goals are within or out of its reach: instead of focusing on the promises of (algorithmic) solutions for interpretability, we will focus on the properties of the (computational) problems they attempt to solve. This lens—which we call computational meta-theory—will allow us to put stakeholders’ goals at the centre and to reason about the adequacy of interpretability ‘hammers’ to hit practically meaningful ‘nails’.

Federico Adolfi is currently a postdoctoral researcher at the Ernst Strüngmann Institute for Neuroscience, Max Planck Society. He combines a background in cognitive and brain science, computer science, and music. His PhD in Computational Cognitive Science at the University of Bristol focused on establishing a conceptual and formal framework for computational meta-theory and demonstrating its application to problems in psychology, neuroscience, and artificial intelligence. One of these applications is the problem of AI interpretability, for which he and his colleagues recently provided the first formal analyses of the scope and limits of circuit discovery for interpreting neural networks.


28 March 2025 | Demystifying AI Interpretability

Find out more about the organizers of this event: the Max Planck Law, Tech, Society Initiative.
