\documentclass{article}
% Packages
\usepackage{amsmath,amsfonts,amsthm,amssymb,amsopn,bm}
\usepackage[margin=.9in]{geometry}
\usepackage{graphicx}
\usepackage{url}
\usepackage[usenames,dvipsnames]{color}
\usepackage{fancyhdr}
\usepackage{multirow}
\usepackage{hyperref}
\usepackage{listings}
\usepackage{xcolor}
\usepackage{booktabs}
% New colors defined below
\definecolor{codegreen}{rgb}{0,0.6,0}
\definecolor{codegray}{rgb}{0.5,0.5,0.5}
\definecolor{codepurple}{rgb}{0.58,0,0.82}
\definecolor{backcolour}{rgb}{0.98,0.98,0.98}
% Code listing style named "mystyle"
\lstdefinestyle{mystyle}{
backgroundcolor=\color{backcolour}, commentstyle=\color{codegreen},
keywordstyle=\color{magenta},
numberstyle=\tiny\color{codegray},
stringstyle=\color{codepurple},
basicstyle=\ttfamily\footnotesize,
breakatwhitespace=false,
breaklines=true,
captionpos=b,
keepspaces=true,
numbersep=5pt,
showspaces=false,
showstringspaces=false,
showtabs=false,
tabsize=2
}
%"mystyle" code listing set
\lstset{style=mystyle}
% For enumerate environment
\usepackage{enumitem}
\renewcommand{\theenumi}{\alph{enumi}}
\renewcommand{\labelenumi}{(\theenumi)}
% Math commands
\newcommand{\R}{\mathbb{R}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\Var}{\mathrm{Var}}
\def\rvx{{\mathbf{x}}}
\def\rvy{{\mathbf{y}}}
\newcommand{\softmax}{\mathrm{softmax}}
\newcommand{\inv}{^{-1}}
% Formatting
\newcommand{\grade}[1]{\small\textcolor{magenta}{[#1 points]} \normalsize}
\date{{}}
% Solutions
\usepackage{ifthen}
\newboolean{showSolutions}
\setboolean{showSolutions}{false} % Change this to toggle solutions
\newcommand{\solution}[1]{\ifthenelse {\boolean{showSolutions}} {{\leavevmode\color{blue}\textbf{Solution:} #1}}{}}
% Comments
\newcommand{\hugh}{\textcolor{blue}}
\newcommand{\ian}{\textcolor{red}}
% No indent
\usepackage[parfill]{parskip}
\begin{document}
\title{Homework \#2}
\author{\normalsize{CSEP 590B: Explainable AI}\\
\normalsize{Prof. Su-In Lee} \\
\normalsize{Due: 5/18/22 11:59 PM}}
\maketitle
\section{Feature attributions and metrics (10 points)}
\begin{enumerate}
\item \grade{4} Briefly describe \textbf{Grad $\times$ Input}, \textbf{SmoothGrad}, and \textbf{Integrated Gradients}, and how each method uses model gradients.
\item \grade{3} In contrast to the other gradient-based methods, \textbf{GradCAM} does not compute gradients with respect to the input image. Describe how \textbf{GradCAM} calculates gradients differently and how they are used to explain the original image.
\item \grade{3} Two of the main types of XAI evaluation metrics are ground truth comparisons and ablation metrics. Which of the two is more suitable for testing whether an explanation identifies important features for a specific model?
\end{enumerate}
\section*{Preliminaries}
The remaining questions focus on feature attribution methods for computer vision models. As a first step, please install the following packages into your local Python environment, preferably using a recent version of Python 3:
\begin{lstlisting}[language=bash]
pip install shap
pip install torch
pip install torchvision
pip install matplotlib
\end{lstlisting}
We'll use images available in the SHAP package, which can be loaded as follows:
\begin{lstlisting}[language=Python]
import shap
display_images = shap.datasets.imagenet50()[0].astype('uint8') # shape = (50, 224, 224, 3)
\end{lstlisting}
We'll use a ResNet-18 model pre-trained on the ImageNet dataset, which you can download using PyTorch:
\begin{lstlisting}[language=Python]
from torchvision import models
model = models.resnet18(pretrained=True)
model = model.eval() # turns off training mode for batch norm
\end{lstlisting}
Deep neural networks require input data in a particular format, and you can use the following pre-processing code:
\begin{lstlisting}[language=Python]
import torchvision.transforms as transforms
# Image pre-processing, expects single image of size (224, 224, 3)
model_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
\end{lstlisting}
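As a sanity check, the transform should map a single $(224, 224, 3)$ uint8 array to a $(3, 224, 224)$ float tensor. A minimal sketch (the black placeholder image below simply stands in for one entry of \texttt{display\_images}):
\begin{lstlisting}[language=Python]
import numpy as np
import torchvision.transforms as transforms

# Same transform as above, repeated so this snippet is self-contained
model_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# A black placeholder image stands in for one entry of display_images
img = np.zeros((224, 224, 3), np.uint8)
x = model_transforms(img)
print(x.shape)  # torch.Size([3, 224, 224])
\end{lstlisting}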
Putting this together, you can run the model on all 50 images as follows:
\begin{lstlisting}[language=Python]
import torch

# Apply pre-processing and make predictions
images = torch.stack([model_transforms(img) for img in display_images])
pred = model(images).softmax(dim=1)
\end{lstlisting}
\section{Occlusion (30 points)}
\textbf{Occlusion} is a removal-based method that measures the impact of removing features from the input. The name comes from the fact that pixels are typically occluded by setting them to zero (black). Here, we will implement \textbf{Occlusion} from scratch.
\begin{enumerate}
\item \grade{5} Plot the first three images in the dataset. Generate the plot before applying any pre-processing steps, and you can use the following starter code:
\begin{lstlisting}[language=Python]
import matplotlib.pyplot as plt
# Generate plot
plt.figure()
plt.imshow(display_image)  # display_image = one (224, 224, 3) entry of display_images
plt.show()
\end{lstlisting}
\item \grade{5} Run the first three images through the model and find the predicted classes. Check \href{https://deeplearning.cms.waikato.ac.nz/user-guide/class-maps/IMAGENET/}{here} to see what these classes mean. \textbf{Hint:} PyTorch supports many of the same operations as numpy, including \href{https://pytorch.org/docs/stable/generated/torch.argmax.html}{argmax}.
\item \grade{10} Write a function with the following signature to generate feature importance values:
\begin{lstlisting}[language=Python]
def occlusion(imgs, model, target_labels, baseline, superpixel_size=8):
    '''
    Args:
      imgs: torch.Tensor of pre-processed images, size = (batch, 3, 224, 224)
      model: PyTorch classifier
      target_labels: torch.Tensor of classes for each image, size = (batch,)
      baseline: baseline value for occluded features
      superpixel_size: width/height of superpixels
    Returns:
      importance: occlusion scores, size = (batch, 224, 224)
    '''
    pass
\end{lstlisting}
For the baseline, you can use the following black image, which replaces occluded pixels with zeros:
\begin{lstlisting}[language=Python]
import numpy as np
# Generate black image, then apply pre-processing
baseline = model_transforms(np.zeros((224, 224, 3), np.uint8))
\end{lstlisting}
Test your function by running it on the first three images. Use the predicted labels for each image, the zeros baseline, and superpixels of size $8\times8$. The following starter code shows how to display positively important regions in red and negatively important regions in blue:
\begin{lstlisting}[language=Python]
# Generate plot
plt.figure()
m = single_importance.abs().max()
plt.imshow(single_importance, vmin=-m, vmax=m, cmap='seismic') # specify min/max value
plt.show()
\end{lstlisting}
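\textbf{Hint:} the core operation inside \texttt{occlusion} is replacing a single superpixel with the baseline before re-running the model. One possible sketch of that replacement step (the constant tensors below are placeholders; shapes follow the function signature above):
\begin{lstlisting}[language=Python]
import torch

def occlude_superpixel(img, baseline, row, col, superpixel_size=8):
    '''Return a copy of img with one superpixel replaced by the baseline.'''
    occluded = img.clone()
    r, c = row * superpixel_size, col * superpixel_size
    occluded[:, r:r + superpixel_size, c:c + superpixel_size] = \
        baseline[:, r:r + superpixel_size, c:c + superpixel_size]
    return occluded

# Placeholder image and baseline, size = (3, 224, 224)
img = torch.ones(3, 224, 224)
baseline = torch.zeros(3, 224, 224)
out = occlude_superpixel(img, baseline, 0, 0)
print(out.sum().item())  # 3 * (224 * 224 - 64) = 150336.0
\end{lstlisting}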
\item \grade{5} For the first of the three images, compare the results when using superpixels of size $4\times4$, $8\times8$ and $16\times16$. Plot the results side-by-side.
\item \grade{5} As an alternative to replacing with zeros, we can use a blurred version of the image as a baseline. For the first of the three images, compare the occlusion results when using several blurring strengths, using $8\times8$ superpixels. Plot the results together, including the different blurred versions of the image. \textbf{Hint:} several Python packages offer functions to blur images, see \href{https://datacarpentry.org/image-processing/06-blurring/}{here} for example.
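One way to produce the blurred baselines is torchvision's \texttt{GaussianBlur} transform (the sigma values below are illustrative; any reasonable range works, and larger sigma means stronger blur):
\begin{lstlisting}[language=Python]
import torch
from torchvision.transforms import GaussianBlur

# Placeholder for a pre-processed image, size = (3, 224, 224)
img = torch.rand(3, 224, 224)

# One blurred baseline per blurring strength
blurred_baselines = [GaussianBlur(kernel_size=31, sigma=s)(img)
                     for s in (2.0, 5.0, 10.0)]
print(blurred_baselines[0].shape)  # torch.Size([3, 224, 224])
\end{lstlisting}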
\end{enumerate}
\section{Gradient-based explanations (30 points)}
Several explanation methods are based on gradients with respect to the input image. These are often faster than removal-based approaches, and they are widely used to explain deep models. In this problem, we'll implement several gradient-based methods from scratch.
\begin{enumerate}
\item \grade{5} Implement the \textbf{Vanilla Gradients} method. It should be a function with the following signature:
\begin{lstlisting}[language=Python]
def vanilla_gradients(imgs, model, target_labels):
    '''
    Args:
      imgs: torch.Tensor of pre-processed images, size = (batch, 3, 224, 224)
      model: PyTorch classifier
      target_labels: torch.Tensor of classes for each image, size = (batch,)
    Returns:
      saliency: tensor of saliency values, shape = (batch, 224, 224)
    '''
    pass
\end{lstlisting}
In your function, take the absolute value and sum across the channels before returning the result. Plot the output for the first three images in the dataset. \textbf{Hint:} use the following starter code to compute input gradients:
\begin{lstlisting}[language=Python]
def calculate_gradients(imgs, model, target_labels):
    '''
    Args:
      imgs: torch.Tensor of pre-processed images, size = (batch, 3, 224, 224)
      model: PyTorch classifier
      target_labels: torch.Tensor of classes for each image, size = (batch,)
    Returns:
      gradients: gradients for the target class, shape = (batch, 3, 224, 224)
    '''
    # Prepare inputs for gradient tracking.
    imgs = imgs.clone()
    imgs.requires_grad = True
    # Make prediction.
    output = model(imgs).softmax(dim=1)
    # Sum outputs for target classes.
    mask = torch.zeros(output.shape)
    for i, target in enumerate(target_labels):
        mask[i, target] = 1
    backprop_output = torch.sum(output * mask)
    # Calculate gradients.
    model.zero_grad()
    backprop_output.backward()
    # Extract gradients from the input tensor.
    gradients = imgs.grad.detach()
    return gradients
\end{lstlisting}
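\textbf{Hint:} given per-channel gradients from the helper above, the absolute-value-and-sum reduction is a one-liner (the random tensor below stands in for the helper's output):
\begin{lstlisting}[language=Python]
import torch

# Placeholder for calculate_gradients output, size = (batch, 3, 224, 224)
gradients = torch.randn(2, 3, 224, 224)
saliency = gradients.abs().sum(dim=1)  # reduce over the channel dimension
print(saliency.shape)  # torch.Size([2, 224, 224])
\end{lstlisting}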
The following starter code shows how to properly scale the saliency map in a plot:
\begin{lstlisting}[language=Python]
# Generate plot
plt.figure()
plt.imshow(vanilla, vmin=0, vmax=vanilla.max()) # specify min/max value
plt.show()
\end{lstlisting}
\item \grade{10} Implement \textbf{SmoothGrad}, which adds Gaussian noise to the input and averages the gradients. It should be a function with the following signature:
\begin{lstlisting}[language=Python]
def smoothgrad(imgs, model, target_labels, samples=50, sigma=0.1):
    '''
    Args:
      imgs: torch.Tensor of pre-processed images, size = (batch, 3, 224, 224)
      model: PyTorch classifier
      target_labels: torch.Tensor of classes for each image, size = (batch,)
      samples: number of random noise samples
      sigma: scale for random noise
    Returns:
      saliency: tensor of saliency values, shape = (batch, 224, 224)
    '''
    pass
\end{lstlisting}
In your function, you can take the absolute value either before or after averaging (whichever looks better), and sum across channels before returning the result. Plot the output for the first three images in the dataset.
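\textbf{Hint:} each SmoothGrad sample is the input plus zero-mean Gaussian noise. One way to draw a single noisy copy of a batch (the random images are placeholders):
\begin{lstlisting}[language=Python]
import torch

# Placeholder batch of pre-processed images
imgs = torch.rand(2, 3, 224, 224)
sigma = 0.1

# One noisy sample; repeat for each of the `samples` draws
noisy = imgs + sigma * torch.randn_like(imgs)
print(noisy.shape)  # torch.Size([2, 3, 224, 224])
\end{lstlisting}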
\item \grade{10} Implement \textbf{Integrated Gradients}, which averages the gradient along a path and multiplies it by the input minus the baseline. It should be a function with the following signature:
\begin{lstlisting}[language=Python]
def integrated_gradients(imgs, model, target_labels, baseline, steps=50):
    '''
    Args:
      imgs: torch.Tensor of pre-processed images, size = (batch, 3, 224, 224)
      model: PyTorch classifier
      target_labels: torch.Tensor of classes for each image, size = (batch,)
      baseline: baseline value for held-out features
      steps: number of steps along path to baseline
    Returns:
      saliency: tensor of saliency values, shape = (batch, 224, 224)
    '''
    pass
\end{lstlisting}
In your function, for simplicity, take the absolute value and sum across the channels. Plot the results for the first three images in the dataset, using the same baseline provided for Problem 2(c).
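\textbf{Hint:} the path for \textbf{Integrated Gradients} is a linear interpolation between the baseline and the input. A sketch of the interpolation points, using placeholder tensors:
\begin{lstlisting}[language=Python]
import torch

# Placeholder image and baseline, size = (1, 3, 224, 224)
img = torch.rand(1, 3, 224, 224)
baseline = torch.zeros(1, 3, 224, 224)
steps = 50

# Interpolation coefficients from baseline (alpha=0) to input (alpha=1)
alphas = torch.linspace(0, 1, steps)
path = torch.cat([baseline + a * (img - baseline) for a in alphas])
print(path.shape)  # torch.Size([50, 3, 224, 224])
\end{lstlisting}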
\item \grade{5} Run your function from part (c) again with the first three images, this time using the following baseline value:
\begin{lstlisting}[language=Python]
# Generate white image, then apply pre-processing
baseline = model_transforms(255 * np.ones((224, 224, 3), np.uint8))
\end{lstlisting}
\end{enumerate}
\section{Ablation metrics (30 points)}
Ablation metrics are those that evaluate explanations by probing a model with perturbed inputs. They typically work by removing different feature subsets and verifying whether the predictions change as the importance values suggest they should. In this problem we'll implement two popular ablation metrics, \textbf{Insertion} and \textbf{Deletion}. We'll use these metrics to test two methods implemented in the previous problems.
\begin{enumerate}
\item \grade{5} First, prepare the explanations that we'll evaluate using the metrics. Use the first ten images from the dataset and generate feature importance values using \textbf{Occlusion} (with $8 \times 8$ superpixels) and \textbf{SmoothGrad} (using 50 samples and a noise level of your choice). In addition, include a \textbf{Random} baseline: for each image, generate a saliency map consisting of Gaussian noise. Plot the saliency maps side-by-side for one of the images.
\item \grade{5} To reduce the number of predictions required for the metrics, we'll ensure that all explanations correspond to $8\times8$ superpixels. The \textbf{Occlusion} ($8 \times 8$) explanation does this already, but \textbf{SmoothGrad} and \textbf{Random} do not. Reduce their granularity by summing the importance within each non-overlapping $8 \times 8$ superpixel. Plot the new saliency maps for the same image. \textbf{Hint:} consider using PyTorch's \href{https://pytorch.org/docs/stable/generated/torch.nn.AvgPool2d.html}{AvgPool2d} operation.
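Since the average over an $8\times8$ window times the window area equals the sum over that window, \texttt{AvgPool2d} can perform the reduction directly. A sketch with a placeholder saliency map:
\begin{lstlisting}[language=Python]
import torch

# Placeholder saliency map, size = (1, 224, 224)
saliency = torch.rand(1, 224, 224)
pool = torch.nn.AvgPool2d(kernel_size=8)
coarse = pool(saliency.unsqueeze(0)) * 8 * 8  # average x area = sum
print(coarse.shape)  # torch.Size([1, 1, 28, 28])
\end{lstlisting}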
\item \grade{5} Write a function to generate an array of prediction probabilities as features are inserted in order of most to least important (\textbf{Insertion}). Rather than inserting individual pixels, insert $8 \times 8$ superpixels. The function should have the following signature:
\begin{lstlisting}[language=Python]
def insertion(img, model, importance, target_label, baseline):
    '''
    Args:
      img: image to ablate, size = (1, 3, 224, 224)
      model: PyTorch classifier
      importance: feature importance values, size = (1, 28, 28)
      target_label: index of target class
      baseline: baseline value for held-out features
    Returns:
      curve: array of prediction probabilities after each step
      num_feats: array of number of features after each step
    '''
    pass
\end{lstlisting}
As a baseline value, use the zeros baseline from Problem 2(c). Plot the curve for a single image and just the \textbf{Occlusion} explanation, and calculate the area under the curve using the trapezoidal rule.
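\textbf{Hint:} NumPy's \texttt{np.trapz} implements the trapezoidal rule. A sketch with a made-up curve (the exponential below is illustrative only, standing in for the function's actual output):
\begin{lstlisting}[language=Python]
import numpy as np

# Placeholder insertion curve: 28 x 28 = 784 superpixels, plus the empty image
num_feats = np.arange(785)
curve = 1 - np.exp(-num_feats / 100.0)  # illustrative curve only

# Normalize by the x-range so the area lies in [0, 1]
auc = np.trapz(curve, num_feats) / (num_feats[-1] - num_feats[0])
\end{lstlisting}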
\item \grade{5} Write a function to generate an array of prediction probabilities as features are deleted in order of most to least important (\textbf{Deletion}). Delete $8 \times 8$ superpixels rather than individual pixels, similar to part~(c). The function should have the following signature:
\begin{lstlisting}[language=Python]
def deletion(img, model, importance, target_label, baseline):
    '''
    Args:
      img: image to ablate, size = (1, 3, 224, 224)
      model: PyTorch classifier
      importance: feature importance values, size = (1, 28, 28)
      target_label: index of target class
      baseline: baseline value for held-out features
    Returns:
      curve: array of prediction probabilities after each step
      num_feats: array of number of features after each step
    '''
    pass
\end{lstlisting}
Again, plot the curve for a single image and just the \textbf{Occlusion} explanation, and calculate the area under the curve using the trapezoidal rule.
\item \grade{5} Generate insertion curves for ten images and for all three explanation methods (\textbf{Occlusion}, \textbf{SmoothGrad}, \textbf{Random}). Plot the average curve for each explanation method, where the y-axis represents the average prediction for the target class at a given number of features (x-axis). Calculate the average area under the insertion curve for each method.
\item \grade{5} Repeat part~(e), but this time with deletion curves.
\end{enumerate}
\end{document}