Implementations of defense methods used to strengthen the resilience of deep learning models against adversarial examples.

Similar to `../Attacks/`, we first define and implement the defense class (e.g., `NATDefense` within `NAT.py` for the NAT defense) in the `Defenses/DefenseMethods/` folder, and then write the corresponding testing code (e.g., `NAT_Test.py`) that strengthens the original raw model and saves the resulting defense-enhanced model into the `DefenseEnhancedModels/` directory.
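To make that workflow concrete, here is a minimal sketch of what such a defense class might look like. This is an illustration only: the actual `NATDefense` API is defined in `Defenses/DefenseMethods/NAT.py`, and the method names and signatures below are assumptions, not the repository's interface.

```python
# Illustrative sketch only -- method names/signatures are assumptions,
# not the actual API from Defenses/DefenseMethods/NAT.py.
import os
import torch


class NATDefense:
    """Wraps a raw model and re-trains it (NAT-style adversarial
    training) to improve robustness against adversarial examples."""

    def __init__(self, model, defense_name="NAT", dataset="MNIST"):
        self.model = model
        self.defense_name = defense_name
        self.dataset = dataset

    def defense(self, train_loader):
        # The defense-specific training loop goes here; for NAT this
        # would mix adversarial examples into each training batch.
        raise NotImplementedError

    def save_defense_enhanced_model(self, out_dir="DefenseEnhancedModels"):
        # Persist the defense-enhanced model where the evaluation
        # scripts expect to find it.
        os.makedirs(out_dir, exist_ok=True)
        path = os.path.join(
            out_dir, f"{self.dataset}_{self.defense_name}_enhanced.pt")
        torch.save(self.model.state_dict(), path)
        return path
```

The corresponding `NAT_Test.py` would then instantiate such a class with a trained raw model, call `defense()`, and save the result.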
We implement 10 representative complete defenses spanning four categories: adversarial-training-based defenses, gradient-masking-based defenses, input-transformation-based defenses, and region-based classification (the grouping is made explicit in the sketch after the reference list below).
- NAT: A. Kurakin et al., "Adversarial machine learning at scale," in ICLR, 2017.
- EAT: F. Tramèr et al., "Ensemble adversarial training: Attacks and defenses," in ICLR, 2018.
- PAT: A. Madry et al., "Towards deep learning models resistant to adversarial attacks," in ICLR, 2018.
- DD: N. Papernot et al., "Distillation as a defense to adversarial perturbations against deep neural networks," in IEEE S&P, 2016.
- IGR: A. S. Ross et al., "Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients," in AAAI, 2018.
- EIT: C. Guo et al., "Countering adversarial images using input transformations," in ICLR, 2018.
- RT: C. Xie et al., "Mitigating adversarial effects through randomization," in ICLR, 2018.
- PD: Y. Song et al., "PixelDefend: Leveraging generative models to understand and defend against adversarial examples," in ICLR, 2018.
- TE: J. Buckman et al., "Thermometer encoding: One hot way to resist adversarial examples," in ICLR, 2018.
- RC: X. Cao et al., "Mitigating evasion attacks to deep neural networks via region-based classification," in ACSAC, 2017.
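For quick reference, the sketch below maps each acronym to its category. The grouping is our reading of the list above (adversarial training: NAT/EAT/PAT; gradient masking: DD/IGR; input transformation: EIT/RT/PD/TE; region-based: RC) and should be checked against the individual papers.

```python
# Hedged grouping of the implemented defenses by category; this
# assignment is our reading of the defenses listed above, not an
# official label shipped with the code.
DEFENSE_CATEGORIES = {
    "adversarial-training-based": ["NAT", "EAT", "PAT"],
    "gradient-masking-based": ["DD", "IGR"],
    "input-transformation-based": ["EIT", "RT", "PD", "TE"],
    "region-based-classification": ["RC"],
}
```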
The commands below prepare the defense-enhanced models, with the specific defense parameters used in our evaluation.
Defenses | Commands with default parameters |
---|---|
NAT | `python NAT_Test.py --dataset=MNIST --adv_ratio=0.3 --clip_max=0.3 --eps_mu=0 --eps_sigma=50`<br>`python NAT_Test.py --dataset=CIFAR10 --adv_ratio=0.3 --clip_max=0.1 --eps_mu=0 --eps_sigma=15` |
EAT | `python EAT_Test.py --dataset=MNIST --train_externals=True --eps=0.3 --alpha=0.05`<br>`python EAT_Test.py --dataset=CIFAR10 --train_externals=True --eps=0.0625 --alpha=0.03125` |
PAT | `python PAT_Test.py --dataset=MNIST --eps=0.3 --step_num=40 --step_size=0.01`<br>`python PAT_Test.py --dataset=CIFAR10 --eps=0.03137 --step_num=7 --step_size=0.007843` |
DD | `python DD_Test.py --dataset=MNIST --initial=False --temp=50`<br>`python DD_Test.py --dataset=CIFAR10 --initial=False --temp=50` |
IGR | `python IGR_Test.py --dataset=MNIST --lambda_r=316`<br>`python IGR_Test.py --dataset=CIFAR10 --lambda_r=10` |
EIT | `python EIT_Test.py --dataset=MNIST --crop_size=26 --lambda_tv=0.03 --JPEG_quality=85 --bit_depth=4`<br>`python EIT_Test.py --dataset=CIFAR10 --crop_size=30 --lambda_tv=0.03 --JPEG_quality=85 --bit_depth=4` |
RT | `python RT_Test.py --dataset=MNIST --resize=31`<br>`python RT_Test.py --dataset=CIFAR10 --resize=36` |
PD | `python PD_Test.py --dataset=MNIST --epsilon=0.3`<br>`python PD_Test.py --dataset=CIFAR10 --epsilon=0.0627` |
TE | `python TE_Test.py --dataset=MNIST --level=16 --steps=40 --attack_eps=0.3 --attack_step_size=0.01`<br>`python TE_Test.py --dataset=CIFAR10 --level=16 --steps=7 --attack_eps=0.031 --attack_step_size=0.01` |
RC | `python RC_Test.py --dataset=MNIST --search=True --radius_min=0 --radius_max=0.3 --radius_step=0.01 --num_points=1000`<br>`python RC_Test.py --dataset=CIFAR10 --gpu_index=2 --search=True --radius_min=0.0 --radius_max=0.1 --radius_step=0.01 --num_points=1000` |
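If you want to prepare several defense-enhanced models in one pass, the commands above can be scripted. A minimal sketch, assuming the `*_Test.py` scripts are run from this directory with the flags exactly as listed in the table:

```python
# Sketch: batch-run a subset of the preparation commands above.
# Assumes the *_Test.py scripts live in the current directory and
# accept the flags exactly as shown in the table.
import subprocess

COMMANDS = [
    "python NAT_Test.py --dataset=MNIST --adv_ratio=0.3 --clip_max=0.3 "
    "--eps_mu=0 --eps_sigma=50",
    "python NAT_Test.py --dataset=CIFAR10 --adv_ratio=0.3 --clip_max=0.1 "
    "--eps_mu=0 --eps_sigma=15",
]

for cmd in COMMANDS:
    print(f"Running: {cmd}")
    subprocess.run(cmd.split(), check=True)  # abort on the first failure
```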