Understanding Black-box Predictions via Influence Functions

Neural nets have achieved amazing results over the past decade in domains as broad as vision, speech, language understanding, medicine, robotics, and game playing. One would have expected this success to require overcoming significant obstacles that had been theorized to exist; as a result, the practical success of neural nets has outpaced our ability to understand how they work. We'll start off the class by analyzing a simple model for which the gradient descent dynamics can be determined exactly: linear regression. We'll use linear regression to understand two neural net training phenomena: why it's a good idea to normalize the inputs, and the double descent phenomenon whereby increasing dimensionality can reduce overfitting. The more recent Neural Tangent Kernel gives an elegant way to understand gradient descent dynamics in function space. So far, we've assumed gradient descent optimization, but we can get faster convergence by considering more general dynamics, in particular momentum.

This is a PyTorch reimplementation of influence functions from the ICML 2017 best paper "Understanding Black-box Predictions via Influence Functions" by Pang Wei Koh and Percy Liang; an unofficial implementation of the same paper also exists in Chainer. The reference implementation can be found here: link. Acknowledgements: the authors of the conference paper, Pang Wei Koh et al.

Why Use Influence Functions?

From the abstract: in this paper, we use influence functions -- a classic technique from robust statistics -- to trace a model's prediction through the learning algorithm and back to its training data, thereby identifying training points most responsible for a given prediction. To scale up influence functions to modern machine learning settings, we develop a simple, efficient implementation that requires only oracle access to gradients and Hessian-vector products. We show that even on non-convex and non-differentiable models where the theory breaks down, approximations to influence functions can still provide valuable information. In this reimplementation, if the influence function is calculated for multiple test samples, s_test and the grad_z values for each training image are computed on the fly and discarded afterwards, since s_test is dependent on the test sample(s).
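The "oracle access to gradients and Hessian-vector products" mentioned in the abstract is easy to obtain in any autodiff framework. Below is a minimal PyTorch sketch of a Hessian-vector product via double backprop; it is illustrative only, and the function name `hvp` and the small quadratic check are placeholders rather than code from either implementation.

```python
import torch

def hvp(loss_fn, params, vec):
    """Hessian-vector product H @ vec via double backprop (no explicit Hessian).

    loss_fn: callable taking the list of parameter tensors and returning a scalar loss.
    params:  list of tensors with requires_grad=True.
    vec:     list of tensors shaped like params.
    """
    loss = loss_fn(params)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Differentiate the inner product <grad, vec> to get H @ vec.
    inner = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(inner, params)

# Tiny check on the quadratic loss 0.5 * w^T A w, whose Hessian is A.
A = torch.tensor([[2.0, 0.0], [0.0, 3.0]])
w = torch.tensor([1.0, 1.0], requires_grad=True)
v = [torch.tensor([1.0, 0.0])]
print(hvp(lambda p: 0.5 * p[0] @ A @ p[0], [w], v))  # ~ A @ [1, 0] = [2., 0.]
```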
Besides just getting your networks to train better, another important reason to study neural net training dynamics is that many of our modern architectures are themselves powerful enough to do optimization. Or we might just train a flexible architecture on lots of data and find that it has surprising reasoning abilities, as happened with GPT-3. Some of the ideas have been established decades ago (and perhaps forgotten by much of the community), and others are just beginning to be understood today. We'll consider how the gradient noise in SGD optimization can contribute an implicit regularization effect, Bayesian or non-Bayesian, and we'll look at what additional failures can arise in the multi-agent setting, such as rotation dynamics, and ways to deal with them. Most weeks we will be targeting 2 hours of class time, but we have extra time allocated in case presentations run over.

With the rapid adoption of machine learning systems in sensitive applications, there is an increasing need to make black-box models explainable, and influence functions are one answer (Koh P, Liang P, 2017. Understanding black-box predictions via influence functions. Proc 34th Int Conf on Machine Learning, p.1885-1894; arXiv:1703.04730).

You can install this package directly through pip; running the tests has a few further requirements. We also have a reproducible, executable, and Dockerized version of these scripts on Codalab. The core use case is calculating the influence of the individual samples of your training dataset on a given test prediction; in the README's example figures, the numbers above the images show the actual influence value which was calculated. As a running example in what follows, take binary logistic regression, p(y|x) = \sigma(y \theta^T x).
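For this running example, the influence of a training point z on the loss at a test point z_test (derived later in these notes as I_up,loss(z, z_test) = -\nabla_\theta L(z_test, \hat\theta)^T H^{-1} \nabla_\theta L(z, \hat\theta)) can be computed exactly with plain NumPy, since the Hessian of the logistic loss is cheap to form for small problems. A minimal sketch, assuming labels in {-1, +1} and a small damping term added to the Hessian for invertibility; the function name and the damping value are choices of this sketch, not part of the paper:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def influence_up_loss(theta, X, x, y, x_test, y_test, damp=1e-2):
    """Exact I_up,loss(z, z_test) for binary logistic regression, labels in {-1, +1}."""
    n, d = X.shape
    # Gradient of the per-example loss log(1 + exp(-y * theta^T x)).
    grad = lambda xi, yi: -yi * sigmoid(-yi * theta @ xi) * xi
    # Empirical Hessian: (1/n) * sum_i sigma(theta^T x_i) (1 - sigma(theta^T x_i)) x_i x_i^T.
    p = sigmoid(X @ theta)
    H = (X * (p * (1 - p))[:, None]).T @ X / n + damp * np.eye(d)
    # I_up,loss(z, z_test) = -grad(z_test)^T H^{-1} grad(z).
    return -grad(x_test, y_test) @ np.linalg.solve(H, grad(x, y))
```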
For one thing, the study of optimization is often prescriptive: it starts with information about the optimization problem and a well-defined goal, such as fast convergence in a particular norm, and figures out a plan that's guaranteed to achieve it. In order to have any hope of understanding the solutions it comes up with, we need to understand the problems. We look at three algorithmic features which have become staples of neural net training, try to understand the effects they have on the dynamics, and identify some gotchas in building deep learning systems. But keep in mind that some of the key concepts in this course, such as directional derivatives or Hessian-vector products, might not be so straightforward to use in some frameworks; a full-featured framework is a better choice if you want all the bells-and-whistles of a near-state-of-the-art model. Lectures will be delivered synchronously via Zoom, and recorded for asynchronous viewing by enrolled students. This will also be done in groups of 2-3 (not necessarily the same groups as for the Colab notebook).

Data-trained predictive models see widespread use, but for the most part they are used as black boxes which output a prediction or score. How can we explain the predictions of a black-box model? For more details and examples beyond these notes, see the original paper linked here. The README walks through an example test image: to get the correct test outcome of ship, it lists the helpful images from the training set. The calculation itself is controlled by a config: using more recursions when approximating the influence gives more accurate estimates at higher cost, as does averaging over more samples for each test data sample.

We have two ways of measuring influence. Our first option is to delete the instance from the training data, retrain the model on the reduced training dataset, and observe the difference in the model parameters or predictions (either individually or over the complete dataset). The second option is to approximate that change with influence functions, without retraining. The paper also plots I_up,loss against variants that are missing some of its terms and shows that they are necessary for picking up the truly influential training points.
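The first option above (retraining without the point) is easy to write down and serves as the ground truth that influence functions approximate. A minimal sketch using scikit-learn logistic regression; the function name and setup are illustrative and not taken from the paper's code:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def loo_change_in_test_loss(X, y, x_test, y_test, idx):
    """Drop training point `idx`, refit, and report the change in test loss.
    This is the retraining baseline that I_up,loss approximates (up to a -1/n factor)."""
    def test_loss(model):
        proba = model.predict_proba(x_test.reshape(1, -1))
        return log_loss([y_test], proba, labels=model.classes_)

    full = LogisticRegression(max_iter=1000).fit(X, y)
    keep = np.arange(len(y)) != idx
    reduced = LogisticRegression(max_iter=1000).fit(X[keep], y[keep])
    return test_loss(reduced) - test_loss(full)
```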
These notes walk through the ICML 2017 best paper from Stanford, by Pang Wei Koh and Percy Liang.

Upweighting a training point z by a small amount \epsilon gives the perturbed parameters

\hat{\theta}_{\epsilon, z} \stackrel{\text{def}}{=} \arg\min_{\theta \in \Theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon L(z, \theta).

The influence of upweighting z on the parameters is

\mathcal{I}_{\text{up,params}}(z) \stackrel{\text{def}}{=} \left.\frac{d \hat{\theta}_{\epsilon, z}}{d \epsilon}\right|_{\epsilon=0} = -H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z, \hat{\theta}),

where H_{\hat{\theta}} is the Hessian of the empirical risk at \hat{\theta}. Chaining through the loss at a test point gives

\begin{aligned} \mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}) &\stackrel{\text{def}}{=} \left.\frac{d L(z_{\text{test}}, \hat{\theta}_{\epsilon, z})}{d \epsilon}\right|_{\epsilon=0} \\ &= \left.\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top} \frac{d \hat{\theta}_{\epsilon, z}}{d \epsilon}\right|_{\epsilon=0} \\ &= -\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top} H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z, \hat{\theta}). \end{aligned}

Removing a training point corresponds to setting \epsilon = -1/n.

For input perturbations, write z = (x, y) and z_{\delta} \stackrel{\text{def}}{=} (x + \delta, y), and define

\hat{\theta}_{\epsilon, z_{\delta}, -z} \stackrel{\text{def}}{=} \arg\min_{\theta \in \Theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon L(z_{\delta}, \theta) - \epsilon L(z, \theta).

Then

\begin{aligned} \left.\frac{d \hat{\theta}_{\epsilon, z_{\delta}, -z}}{d \epsilon}\right|_{\epsilon=0} &= \mathcal{I}_{\text{up,params}}(z_{\delta}) - \mathcal{I}_{\text{up,params}}(z) \\ &= -H_{\hat{\theta}}^{-1}\left(\nabla_{\theta} L(z_{\delta}, \hat{\theta}) - \nabla_{\theta} L(z, \hat{\theta})\right). \end{aligned}

For a continuous input space and a small perturbation \delta, a Taylor expansion of the gradient in \delta gives

\left.\frac{d \hat{\theta}_{\epsilon, z_{\delta}, -z}}{d \epsilon}\right|_{\epsilon=0} \approx -H_{\hat{\theta}}^{-1}\left[\nabla_{x} \nabla_{\theta} L(z, \hat{\theta})\right] \delta, \qquad \hat{\theta}_{z_{\delta}, -z} - \hat{\theta} \approx -\frac{1}{n} H_{\hat{\theta}}^{-1}\left[\nabla_{x} \nabla_{\theta} L(z, \hat{\theta})\right] \delta,

so the influence of perturbing the input of z on the test loss is

\begin{aligned} \mathcal{I}_{\text{pert,loss}}(z, z_{\text{test}})^{\top} &\stackrel{\text{def}}{=} \left.\nabla_{\delta} L\left(z_{\text{test}}, \hat{\theta}_{z_{\delta}, -z}\right)^{\top}\right|_{\delta=0} \\ &= -\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top} H_{\hat{\theta}}^{-1} \nabla_{x} \nabla_{\theta} L(z, \hat{\theta}). \end{aligned}

Evaluating \mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}) requires the inverse Hessian of the training loss. For the logistic regression running example, it has the closed form

\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}) = -y_{\text{test}}\, y \cdot \sigma(-y_{\text{test}} \theta^{\top} x_{\text{test}}) \cdot \sigma(-y \theta^{\top} x) \cdot x_{\text{test}}^{\top} H_{\hat{\theta}}^{-1} x.

Ranking training points by \mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}) is what makes influence functions useful for debugging training data: it shows which training points most increase or decrease the loss on a given test point. On convex models the estimates track actual leave-one-out retraining closely (these notes quote correlations of roughly 0.86 and 0.95, depending on the Hessian approximation used and on smoothing the SVM hinge loss); a related follow-up in this direction is "Less Is Better: Unweighted Data Subsampling via Influence Function". The paper's experiments study a dog vs. fish image classification task with 900 training examples per class drawn from ImageNet, using an Inception v3 feature network as well as an SVM with an RBF kernel. Influence functions are also used to construct visually-indistinguishable training-set attacks that flip a model's predictions on targeted test images by perturbing a small number of training images; related work distinguishes such training-set attacks from adversarial examples, which perturb test inputs instead. Finally, influence functions help with debugging individual bad cases and with finding label errors: ranking training points by the self-influence \mathcal{I}_{\text{up,loss}}(z_i, z_i) surfaces mislabeled examples, and in an experiment where 10% of training labels were flipped, checking points in order of self-influence found the mislabeled examples faster than ranking by training loss or sampling at random.

Stochastic estimation. Forming and inverting H_{\hat{\theta}} explicitly costs O(np^2 + p^3) for n training points and p parameters. The paper instead estimates s_{\text{test}} \stackrel{\text{def}}{=} H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z_{\text{test}}, \hat{\theta}) with implicit Hessian-vector products, which brings the cost down to O(np).
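Below is a PyTorch sketch of this stochastic estimation, roughly following the recursive scheme the paper describes for s_test. The hyperparameter names (damp, scale, steps) and the sampling interface are assumptions of this sketch; both the reference implementation and the PyTorch reimplementation have their own versions of this routine.

```python
import torch

def estimate_s_test(v, params, loss_on_sample, sample_iter,
                    damp=0.01, scale=25.0, steps=1000):
    """Iteratively approximate H^{-1} v using Hessian-vector products on sampled
    training points: h <- v + (1 - damp) * h - (H h) / scale, then return h / scale.

    v:              list of tensors, the gradient of the test loss w.r.t. params.
    params:         list of model parameters (requires_grad=True).
    loss_on_sample: callable(z) -> scalar training loss at the current params.
    sample_iter:    iterator yielding single training examples z.
    """
    h = [t.detach().clone() for t in v]
    for _ in range(steps):
        z = next(sample_iter)
        loss = loss_on_sample(z)
        grads = torch.autograd.grad(loss, params, create_graph=True)
        inner = sum((g * hi).sum() for g, hi in zip(grads, h))
        hv = torch.autograd.grad(inner, params)  # H @ h on this sample
        h = [vi + (1 - damp) * hi - hvi / scale
             for vi, hi, hvi in zip(v, h, hv)]
    return [hi / scale for hi in h]
```

Given s_test, the influence of each training point z_i on the test loss is then the negative inner product of s_test with grad_z(z_i), which is how the s_test and grad_z quantities mentioned in the README combine.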
The overall recipe is straightforward, which is part of what makes this best paper appealing: it ties the loss at a test point directly back to individual training points. This paper applies influence functions to ANNs, taking advantage of the accessibility of their gradients; the required Hessian-vector products can be computed exactly without ever forming the Hessian, following Pearlmutter. Thus, you can easily find mislabeled images in your dataset, or compress your dataset slightly to the most influential images important for your model's predictions. Often we want to identify an influential group of training samples in a particular test prediction for a given machine learning model; while influence estimates align well with leave-one-out retraining for individual points, the group case is studied separately in "On the accuracy of influence functions for measuring group effects". From the abstract: "On linear models and convolutional neural networks, we demonstrate that influence functions are useful for multiple purposes: understanding model behavior, debugging models, detecting dataset errors, and even creating visually-indistinguishable training-set attacks." See more on this video at https://www.microsoft.com/en-us/research/video/understanding-black-box-predictions-via-influence-functions/

Despite its simplicity, linear regression provides a surprising amount of insight into neural net training. For simple architectures (e.g. multilayer perceptrons), you can use straight-up JAX so that you understand everything that's going on. Some modern architectures even build optimization into the architecture explicitly, as in MAML or Deep Equilibrium Models. The aim is to give you the conceptual tools you need to reason through the factors affecting training in any particular instance; depending on what you're trying to do, you have several options, and you are welcome to use whatever language and framework you like for the final project.

The second calculation mode calculates the grad_z values for all images first and saves them to disk. The results are collected in a dict whose structure looks similar to the example below: harmful is a list of numbers, which are the IDs of the training data samples found most harmful for the given test prediction, and helpful is the corresponding list of the most helpful training samples. The parameters fall into two groups: parameters affecting the calculation itself and other parameters.
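The block that originally showed the dict structure did not survive; as a rough illustration consistent with the description above (one entry per test image, with per-training-image influence scores plus the IDs of the most harmful and most helpful training images), it might look like the following. Every key and value here is a made-up placeholder, not taken from the package.

```python
# Illustrative placeholder only; keys and values are assumptions.
influences = {
    "0": {                                    # index of the test image
        "label": 8,                           # its ground-truth label
        "influence": [0.021, -0.034, 0.007],  # one score per training image
        "harmful": [0, 2, 1],                 # IDs of the most harmful training images
        "helpful": [1, 2, 0],                 # IDs of the most helpful training images
    },
}
```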
Applications - Understanding model behavior. Influence functions reveal insights about how models rely on and extrapolate from the training data.

Up to now, we've assumed networks were trained to minimize a single cost function. We'll mostly focus on minimax optimization, or zero-sum games, and this course will finish with bilevel optimization, drawing upon everything covered up to that point in the course. Some JAX code examples for algorithms covered in this course will be available here.

Back to the implementation: the idea is to compute the parameter change if z were upweighted by some small \epsilon, giving the new parameters \hat{\theta}_{\epsilon, z} = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon L(z, \theta). In the calc_img_wise mode, we therefore throw away all grad_z values once the influence for one test image has been calculated and move on to the next image; the other mode, which saves grad_z to disk, makes sense if you have a fast SSD, lots of free storage space, and want to calculate the influences on many test samples. The most barebones way of getting the code to run is shown below; here, config contains default values for the influence function calculation. The paper is available from the ACM Digital Library at https://dl.acm.org/doi/10.5555/3305381.3305576. If you have questions, please contact Pang Wei Koh (pangwei@cs.stanford.edu).
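The code block that originally followed "shown below" is missing, so here is a sketch of what the barebones call plausibly looks like, based purely on the surrounding description. The package, function, and return names (pytorch_influence_functions, get_default_config, calc_img_wise) are assumptions and should be checked against the repository before use.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_influence_functions as ptif  # assumed package name

# Stand-ins for a trained classifier and its data loaders.
model = torch.nn.Linear(20, 2)
trainloader = DataLoader(TensorDataset(torch.randn(100, 20),
                                       torch.randint(0, 2, (100,))), batch_size=10)
testloader = DataLoader(TensorDataset(torch.randn(10, 20),
                                      torch.randint(0, 2, (10,))), batch_size=1)

config = ptif.get_default_config()  # assumed: defaults for recursion depth, damping, etc.
influences, harmful, helpful = ptif.calc_img_wise(config, model, trainloader, testloader)
```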