Data Poisoning
Data poisoning is an attack in which an adversary deliberately manipulates or corrupts the data used to train machine learning (ML) models or artificial intelligence (AI) systems.
- goal is to:
  - undermine the accuracy and reliability of the ML model
  - potentially cause harm or damage by making the model produce incorrect or biased results
- can be challenging to detect and prevent
- generally requires very few changes to the data
- effects may only appear once the ML model is used in a real-world application
How it Works
- attacker deliberately introduces malicious or corrupted data into the training data set used to create or improve an ML model
- attacker may alter the data set to include incorrect or biased data
- or may introduce subtle changes to the data that shift the ML model's output in specific ways
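The mechanism above can be illustrated with a toy example. The following is a minimal sketch, not a reproduction of any real attack: it assumes a one-dimensional nearest-centroid classifier, and the function names (`train_centroids`, `predict`), the labels, and every numeric value are hypothetical choices made for illustration. It shows how a few mislabelled points injected into the training set can move a decision boundary so that a chosen target input changes class.

```python
from statistics import mean


def train_centroids(samples):
    """Return {label: centroid} for a list of (feature, label) pairs."""
    by_label = {}
    for x, label in samples:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}


def predict(centroids, x):
    """Assign x to the label whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


# Hypothetical clean training set: low values are benign, high values malicious.
clean = [
    (1.0, "benign"), (2.0, "benign"), (3.0, "benign"),
    (7.0, "malicious"), (8.0, "malicious"), (9.0, "malicious"),
]

# Attacker-injected points: high values deliberately mislabelled as benign.
poison = [(10.0, "benign")] * 3

clean_model = train_centroids(clean)              # centroids: benign 2.0, malicious 8.0
poisoned_model = train_centroids(clean + poison)  # centroids: benign 6.0, malicious 8.0

target = 6.5  # the input the attacker wants the model to misclassify
print(predict(clean_model, target))     # malicious
print(predict(poisoned_model, target))  # benign
```

In this sketch the injected points drag the benign centroid from 2.0 to 6.0, so the target input falls on the benign side of the boundary. Real attacks typically aim to achieve a similar shift with a much smaller and less conspicuous share of the data.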
Mitigation Strategies
- Data validation
  - before using data in an ML model, validate its quality and authenticity
  - helps identify malicious or corrupted inputs
- Data diversity
  - use a diverse range of data
  - makes it more difficult to manipulate the inputs to modify results
- Anomaly detection
  - use anomaly detection techniques
  - can help identify unusual data patterns that may indicate data poisoning
- Robust models
  - create ML models resilient to unexpected inputs and adversarial attacks
- Regular model testing and auditing
  - helps identify issues and vulnerabilities
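Two of these strategies, data validation and anomaly detection, can be sketched in a few lines. The example below is a hedged illustration under stated assumptions, not a production defence: the record layout (a `(feature, label)` tuple), the accepted range of 0 to 100, the label names, and the helper names `validate_record` and `mad_outliers` are all hypothetical. The anomaly check uses the modified z-score based on the median absolute deviation (MAD), with the commonly used constant 0.6745 and cutoff 3.5; other detectors could be substituted.

```python
from statistics import median


def validate_record(record, low=0.0, high=100.0,
                    allowed_labels=("benign", "malicious")):
    """Accept only well-formed (feature, label) pairs with a numeric
    feature inside the expected range and a known label."""
    if not (isinstance(record, tuple) and len(record) == 2):
        return False
    x, label = record
    if isinstance(x, bool) or not isinstance(x, (int, float)):
        return False
    return low <= x <= high and label in allowed_labels


def mad_outliers(values, threshold=3.5):
    """Return the values whose MAD-based modified z-score exceeds threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to measure against; flag nothing
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical incoming training records, some malformed or suspicious.
incoming = [
    (1.0, "benign"), (2.0, "benign"), (3.0, "benign"), (2.5, "benign"),
    (1.5, "benign"), (2.0, "benign"), (10.0, "benign"),
    (-5.0, "benign"),    # out of range
    ("7", "malicious"),  # wrong type
    (4.0, "unknown"),    # unknown label
]

valid = [r for r in incoming if validate_record(r)]
benign_values = [x for x, label in valid if label == "benign"]
suspicious = mad_outliers(benign_values)

print(len(valid))   # 7
print(suspicious)   # [10.0]
```

Here validation rejects the three malformed records, and the MAD check flags the 10.0 reading as inconsistent with the rest of the benign class. A check like this only helps while poisoned points remain a small minority: if they make up a large share of a class, they shift the median itself and can go unflagged, which is one reason to combine it with the other strategies listed above.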
Examples
- Amazon Rekognition System
  - Researchers demonstrated a data poisoning attack on Amazon’s Rekognition facial recognition system
  - By subtly changing a small percentage of the images used to train the system, they were able to cause it to misidentify individuals in real-world scenarios
- Google Maps
  - Researchers showed that by submitting many fake edits to Google Maps, they could manipulate the search results for a particular location
  - By making small changes to the location’s data (e.g., changing its name or address), they could push it higher up in search results or even make it disappear altogether
- Spam Filters
  - Researchers showed that inserting specific words into legitimate emails could bypass the spam filters used by popular email services like Gmail and Outlook