This webinar provides an overview of Adversarial Machine Learning (AML), its relationship to Generative (Deep) Learning, and ways to view AML as a potential enabler for deploying more comprehensive system-level Machine Learning capabilities. The basic ideas driving AML and the system-level architecture needs of an effective integrated ML capability are compared to find areas of commonality and future utility beyond single-shot, algorithm-by-algorithm approaches to AML and remediation techniques.
The presenter, Michael Weir, provided the notes and reference materials below to accompany the webinar:
Why do we “do” machine learning (ML) anyway? Because we can’t find another, better way to solve a problem. At the outset, we accept that we will get a probabilistic answer, and that’s good enough (or we find out how close to “good enough” we get, and use it or try again).
Our choice of “good enough” is usually tied to some postulated performance parameter, which is then implemented as a mathematical process that computes a loss function and an optimization function. The performance is “proven” post-experiment, as we check the results after tuning and multiple runs. That is, we are not directly measuring a given performance parameter; we are measuring a mathematically constructed procedure that tells us whether the loss/optimization we selected (which produced an acceptable error rate in training) works in the real world – whether it “generalizes” to unseen examples.
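The idea above can be sketched in a few lines: we pick a loss function, minimize it on known examples, then check the same loss on held-out examples to see whether the result generalizes. This is a minimal illustration with synthetic data and a one-parameter linear model, not anything from the webinar itself; all names and numbers are invented for the sketch.

```python
import random

# Hypothetical setup: noisy samples of the "real world" function y = 2x.
random.seed(0)
data = [(x, 2.0 * x + random.gauss(0, 0.1)) for x in range(100)]
train, test = data[:80], data[80:]  # held-out samples stand in for "unseen examples"

def mse(w, samples):
    """Mean squared error -- the loss function we chose to optimize."""
    return sum((y - w * x) ** 2 for x, y in samples) / len(samples)

# Crude optimization: scan candidate weights, keep the best on training data.
w_best = min((w / 100.0 for w in range(0, 400)), key=lambda w: mse(w, train))

train_loss = mse(w_best, train)
test_loss = mse(w_best, test)  # low held-out loss suggests the model generalizes
print(w_best, round(train_loss, 4), round(test_loss, 4))
```

Note that the quantity we actually measure is the loss we constructed (MSE here), not the real-world relationship itself; the held-out check is what licenses the claim of generalization.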
“Optimizing” the performance of an ML algorithm does not really mean tuning the performance parameter(s) that will solve your problem; it means tuning the optimization algorithm you built (or inferred by some mathematical construct) into the ML model. A typical example is the 3D picture of gradient descent “sliding down the slope” of a convex function: when you reach the bottom, the slope is zero and you have found the minimum over the two parameters you selected to represent your problem. To be clear, you are not measuring something like the specific spring characteristic in a weight/spring physics problem, but a mathematical representation that reflects some characteristic of a set of samples – one that may be close to a function behaving like your real-world problem. You are trying to discover a function H(x) that operates well enough to approximate G(x), the problem space in the real world.
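The “sliding down the slope” picture can be made concrete with a few lines of code. This is a generic sketch, not the webinar's own example: it minimizes an invented two-parameter convex bowl f(a, b) = (a − 3)² + (b + 1)², whose minimum is known to sit at (3, −1), by repeatedly stepping against the gradient.

```python
# Gradient of the convex bowl f(a, b) = (a - 3)**2 + (b + 1)**2.
def grad(a, b):
    return 2 * (a - 3), 2 * (b + 1)

a, b = 0.0, 0.0   # arbitrary starting point on the surface
lr = 0.1          # step size taken down the slope each iteration

for _ in range(200):
    ga, gb = grad(a, b)
    a -= lr * ga
    b -= lr * gb

# At the bottom the slope (gradient) is ~0 and (a, b) is ~(3, -1).
print(round(a, 4), round(b, 4))
```

The loop converges because each step shrinks the distance to the minimum; what it finds is the minimum of the mathematical surface we constructed, which is exactly the point made above about tuning the optimization rather than the real-world quantity.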
The reason we are talking about optimization is that it is both the power and the weakness of ML. Because we cannot know G(x) [if we did, we wouldn’t need ML], we have to use what we do know [the past examples] to build a “predictor” [of what will happen with future, unseen samples].
Reference materials associated with ideas in the Webinar:
1. References from the Webinar slides:
Leslie Valiant, A Theory of the Learnable, here: http://web.mit.edu/6.435/www/Valiant84.pdf
GPT-3 Model sizes: https://lambdalabs.com/blog/demystifying-gpt-3/
Generative Deep Learning: https://developers.google.com/machine-learning/gan/gan_structure
Stop Sign reference: https://arxiv.org/abs/1707.08945
Panda reference: https://arxiv.org/abs/1412.6572
Facebook reference: https://ai.facebook.com/blog/deepfake-detection-challenge
Microsoft reference: https://docs.microsoft.com/en-us/security/engineering/failure-modes-in-machine-learning
2. Other references that might be helpful in thinking about the applications of AML in new ways:
McAfee Face Fooling experiment: https://factschronicle.com/the-hack-that-can-fool-facial-recognition-algorithms-at-security-checkpoints-23740.html
Adversarial Robustness as a Prior: https://arxiv.org/abs/1906.00945
From Imagenet to Image Classification: https://arxiv.org/abs/2005.11295