Explaining AI

 
At the intersection of artificial intelligence (AI), transparency, privacy and law, there is a need for more research. This IO focuses on explaining AI’s black box models and related issues.

AI systems, statistical models and machine learning methods often appear as black boxes to those who construct the models and/or to those who use or are exposed to them. This can be due to: a) complicated models, such as deep neural nets, boosted tree models or ensemble models, b) models with many variables/parameters, and c) complex dependencies between the variables.

Even simple models can be difficult to explain to people who are not mathematically literate. Some models can be explained, but only through their global, not personalised, behaviour. There are a number of good reasons for explaining how a black box model works for each individual:

  1. Those who construct or use the model should understand how the model works

  2. Those who are exposed to the model should, and sometimes will, have the right to an explanation of the model’s behaviour, for example to be able to contest its decision

  3. It should be possible to detect undesired effects in the model, for example unfair or illegal treatment of certain groups of individuals, or too much weight being placed on irrelevant variables

Illustration: Ellen Hegtun, Kunst i Skolen

Research at BigInsight can challenge some of the legal principles that govern data privacy, including the risk of re-identification of anonymised parties, the wish to minimise the data made available to discover associations and causes, and the uncertainty of the value created by big data research. The need to compromise between privacy protection and the common good is particularly evident in medical research. Methods and algorithms should follow the five principles of responsibility, explainability, accuracy, auditability, and fairness. How can these aspects be regulated, validated, and audited?

Seminar series

In 2022, we organised three seminars on the themes “Towards XAI 2.0: From feature interaction relevances to concept-based explanations”, “Explaining Artificial Intelligence: Contrastive Explanations for AI Black Boxes and What People Think of Them” and “Hva kan vi lære av NAVs deltagelse i Datatilsynets regulatoriske sandkasse for ansvarlig kunstig intelligens?” (What can we learn from the Norwegian Labour and Welfare Administration’s participation in the Norwegian Data Protection Authority’s Sandbox for responsible artificial intelligence?). The seminars were well attended and generated good discussions, and the series continues into 2023.

Correct explanations when there is dependence between the variables

In many real-life models, some or many of the variables of interest are dependent. For example, income and age typically follow each other quite closely. Current approaches to individual explanations either do not handle dependent variables at all or handle them poorly, especially given the computational burden incurred even for a handful of variables. We have been constructing new methods to handle these situations. We continue to add new features to our R package, shapr, which has now also been made available in Python. We have further improved our Shapley methods by 1) using variational autoencoders to better model dependent, mixed features (paper published in the Journal of Machine Learning Research), and 2) developing methodology to explore and investigate how different clusters of the training data affect the predictions made by any black-box model (published in Data Mining and Knowledge Discovery). In addition to estimating Shapley values, we have devised a new counterfactual method, MCCE, which uses Monte Carlo sampling to efficiently generate realistic counterfactual explanations (conference paper submitted). Our work in this area has attracted significant attention, positioning us well internationally.
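
For illustration, a minimal sketch of the shapr workflow is given below, using the Boston housing data, an xgboost model as the black box and a Gaussian approach to account for the dependence between the features. It follows the package’s introductory example, but function and argument names may differ between shapr versions (the interface has been revised in later releases), so it should be read as a sketch rather than a definitive recipe.

    # Dependence-aware Shapley values with shapr (sketch; see the shapr
    # documentation for the interface of your installed version).
    library(xgboost)
    library(shapr)

    data("Boston", package = "MASS")
    x_var <- c("lstat", "rm", "dis", "indus")   # four dependent features
    y_var <- "medv"

    x_train   <- as.matrix(Boston[-(1:6), x_var])
    y_train   <- Boston[-(1:6), y_var]
    x_explain <- as.matrix(Boston[1:6, x_var])  # observations to explain

    # Fit any black-box model; here a gradient boosted tree ensemble.
    model <- xgboost(data = x_train, label = y_train, nrounds = 20, verbose = FALSE)

    # Prepare the explainer and estimate Shapley values, modelling the
    # dependence between the features with a Gaussian approximation.
    explainer <- shapr(x_train, model)
    p0 <- mean(y_train)  # reference value: the expected prediction without any features
    explanation <- explain(
      x_explain,
      explainer       = explainer,
      approach        = "gaussian",
      prediction_zero = p0
    )

    # One row of Shapley values per explained observation.
    print(explanation$dt)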

XAI tree: Practical tool for choosing the appropriate explanation method

The field of XAI (eXplainable Artificial Intelligence) has, in a very short time, evolved into a myriad of different explanation approaches. Together with NAV, we are therefore developing a practical, web-based tool based on answering questions such as “Do you want to explain the whole model or specific predictions?”. The questions are organised in a tree structure, hence the name “XAI tree”. The tool will be used to assist NAV’s employees in their quest for responsible artificial intelligence, and it will also be made available to everyone online.
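
To give an idea of the structure, a purely hypothetical sketch of such a question tree is shown below; the questions and recommendations are illustrative only and do not reproduce the content of the NAV tool.

    # Hypothetical sketch of an "XAI tree": a tree of questions whose leaves
    # recommend an explanation approach. Illustrative only.
    xai_tree <- list(
      question = "Do you want to explain the whole model or specific predictions?",
      answers = list(
        "The whole model" = list(
          question = "Is the model itself interpretable (e.g. a linear model or a small tree)?",
          answers = list(
            "Yes" = "Inspect the model directly (coefficients, tree structure).",
            "No"  = "Consider global methods such as permutation importance or partial dependence plots."
          )
        ),
        "Specific predictions" = list(
          question = "Are the features strongly dependent?",
          answers = list(
            "Yes" = "Consider dependence-aware Shapley values (e.g. shapr) or counterfactual explanations.",
            "No"  = "Consider standard local methods such as Shapley values or LIME."
          )
        )
      )
    )

    # Walk the tree interactively until a recommendation (a character leaf) is reached.
    ask <- function(node) {
      if (is.character(node)) {
        cat("Recommendation:", node, "\n")
        return(invisible(node))
      }
      choice <- utils::menu(names(node$answers), title = node$question)
      if (choice == 0) return(invisible(NULL))  # user cancelled
      ask(node$answers[[choice]])
    }

    # ask(xai_tree)  # run interactively in an R session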

A national role

We have contributed to various XAI seminars and conferences, and to the course “Legal Technology: Artificial Intelligence and Law” at the Department of Public and International Law, UiO. NAV has participated in the Norwegian Data Protection Authority’s Sandbox for responsible artificial intelligence with its planned solution for predicting sickness absence (and the accompanying explanations). We have also contributed to the Norwegian commission for data privacy, which delivered a Norwegian Official Report in September 2022.

To summarise: We have been and will continue to be an important voice in the Norwegian AI debate.

“As the financial implications and economic fallout of COVID-19 become more lucid around the world one thing is already clear: Many people will need loans to survive. And almost all loan decisions are determined using proprietary black box models. This is a problem.”
— IMPACT, Duke University, April 2020.

Principal Investigator
Anders Løland

Co-Principal Investigator
Arnoldo Frigessi