Adversarial Machine Learning
Machine learning techniques were originally designed for environments in which the training and test data are assumed to be generated from the same (although possibly unknown) distribution and/or process. In the presence of intelligent and adaptive adversaries, however, this working hypothesis is likely to be violated.
Applying machine learning to use cases like fraud, anti-money laundering and infosec presents a unique set of challenges:
- Little or no labeled data
- Non-stationary data distributions
- Model decay
- Counterfactual conditions
This event is entirely devoted to understanding how modern machine learning methods can be applied to these adversarial environments. We will have hands-on workshops as well as talks by leading practitioners from industry and academia.
Sep 10, 2016, 9:30a - 5p
620 Folsom St #100
San Francisco, CA 94107
09:00 - 09:30 Registration
09:30 - 11:00 TensorFlow Workshop on Adversarial Examples (Illia)
11:00 - 12:00 AML/KYC for the Ripple Consensus Ledger (Gilles)
12:00 - 01:00 Lunch
01:00 - 01:45 Multi-armed Bandit Approach to Transaction Fraud at Stripe (Alyssa)
01:45 - 02:30 Assessing Merchant Fraud Risk at Square (Thomson)
02:30 - 03:00 Break
03:00 - 03:45 ML-based Fraud Detection for Fraud and Abuse (Jacob)
03:45 - 04:30 Learning from Large Bodies of Malware Samples (Zach)
04:30 - 05:00 Closing Remarks (Arshak)
Adversarial ML Topics Covered
Expert Speakers That Understand Adversarial ML Challenges
TensorFlow has taken the deep learning world by storm. This workshop will be led by one of TensorFlow’s main contributors, Illia Polosukhin. Illia’s 90-minute, hands-on workshop will cover:
- Dropout - both for preventing overfitting and as a mechanism for estimating "what the model doesn't know" (confidence of prediction).
- Augmenting data with adversarial examples - to prevent overfitting and speed up training
- How to limit technical exploits of your models - e.g. how to keep your model from going haywire using different methods (confidence, adversarial examples, discriminator, separate classifiers or just simple whitelists).
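To make the adversarial-examples topic concrete, here is a minimal sketch of one well-known way to build such an example, the fast gradient sign method, applied to a tiny logistic regression model. It is not taken from the workshop material: the weights, the input, the step size `eps`, and the helper names `predict` and `fgsm` are all assumptions chosen for illustration, and plain Python is used instead of TensorFlow so the example stays self-contained.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """Shift x by eps in the direction that increases the log loss.

    For logistic regression the gradient of the log loss with respect
    to input feature i is (p - y) * w_i, so only its sign is needed.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input, chosen only for illustration.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1

x_adv = fgsm(w, b, x, y, eps=0.5)
print("clean input:      ", x, "p(class 1) =", round(predict(w, b, x), 3))
print("adversarial input:", x_adv, "p(class 1) =", round(predict(w, b, x_adv), 3))

# Augmentation: keep the perturbed point with its original label
# so that a retrained model sees it during training.
augmented_training_set = [(x, y), (x_adv, y)]
```

In this toy setup the perturbed input is pushed across the decision boundary even though its true label is unchanged, which is why adding such points back to the training set with their original labels can make a model harder to exploit.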
Ripple’s distributed financial technology allows banks around the world to transact directly with each other without the need for a central counterparty or correspondent. Ripple offers plug-and-play products for financial institutions as well as a blockchain solution and an innovative technology to connect all the ledgers of this world (from bitcoin to bank ledgers).
While working with financial institutions and regulators, Ripple has built significant trust on the compliance side. This talk will focus on the fraud detection and AML/KYC efforts developed for the Ripple Consensus Ledger (RCL). The RCL is our blockchain solution that makes it possible to transact across different currencies. We will discuss some unique challenges related to applying machine learning to detect fraudulent activities on blockchain systems with a high velocity of multi-currency transactions.
Stripe processes billions of dollars a year in payments for businesses around the world. To protect our users from fraud, we use machine learning to score and block potentially fraudulent transactions. Many of the issues we faced when building this system are forms of the multi-armed bandit problem, in which an agent must choose between “exploring” multiple options and “exploiting” the option that it currently believes yields the highest payoff. In this talk, I’ll introduce multi-armed bandits and their variants (including contextual and adversarial bandits) and describe how counterfactual evaluation (evaluating the performance of models when you can’t always observe the outcomes of your actions) and deterrence (injecting misinformation to disrupt the bandit problem that fraudsters face) can be posed (and “solved”) in this framework.
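The explore/exploit trade-off described above can be sketched with an epsilon-greedy strategy on a simulated Bernoulli bandit. This is a generic textbook illustration, not Stripe's system: the payoff probabilities, the exploration rate `eps`, the step count, and the function name `epsilon_greedy` are all assumptions.

```python
import random

def epsilon_greedy(payoff_probs, steps, eps, seed=0):
    """Play a simulated Bernoulli bandit with an epsilon-greedy policy.

    With probability eps a random arm is pulled (explore); otherwise the
    arm with the highest estimated payoff so far is pulled (exploit).
    Returns per-arm pull counts, per-arm payoff estimates, total reward.
    """
    rng = random.Random(seed)
    n = len(payoff_probs)
    counts = [0] * n
    values = [0.0] * n
    total = 0
    for _ in range(steps):
        if rng.random() < eps:
            arm = rng.randrange(n)
        else:
            arm = max(range(n), key=lambda a: values[a])
        reward = 1 if rng.random() < payoff_probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return counts, values, total

# Hypothetical payoff probabilities, unknown to the agent.
counts, values, total = epsilon_greedy([0.2, 0.5, 0.8], steps=5000, eps=0.1)
print("pulls per arm:    ", counts)
print("estimated payoffs:", [round(v, 2) for v in values])
print("total reward:     ", total)
```

The small, constant exploration rate is what keeps outcomes observable for arms the agent currently believes are worse. That is the same reason a fraud system that blocks every suspicious transaction loses the ability to measure how many of those transactions were actually fraudulent.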
Square Capital is Square's business financing services arm, providing capital to sellers in a fast, fair, and intelligent manner. While the data science team focuses on mitigating default and underwriting risk, another concern is fraud/bad actor risk within the Square ecosystem. In this talk, we'll look at modern approaches to assessing merchant fraud risk, as well as the effects of ensembling different models and external datasets to further improve accuracy.
Sift Science is the leading provider of real-time machine learning fraud prevention for online businesses across the globe. Sift Science protects thousands of different businesses from all kinds of fraud and abuse: a stolen credit card used to buy an airline ticket or a digital game, a fake apartment or job listing, a fraudulent money transfer, or abuse of a referral program.
In this talk, we'll discuss some challenges of building a machine learning system to detect all of these diverse kinds of fraud and abuse, including extracting features and training models on custom data specific to each business, leveraging our network of data to help each individual business, learning in real-time, and explaining our system's recommendations to customers.
In this talk, methods for working with and learning from large bodies of malware samples will be demonstrated. We will discuss the process of modeling complex, evolving data, the uses of such systems in a production environment, and strategies for adapting to the natural adversaries that develop such malware.
Our validation methodology for identifying new malicious binaries, our general feature set, and our production environment for utilizing this data will be explored, along with a discussion of alternate approaches taken by other organizations to solve similar problems. Amongst those alternate solutions are the Polonium system, approaches generated for the Microsoft Kaggle challenge, and network-system hybrid approaches such as Mastino.