Classification confidence visualization of artificial neural networks with adversarial robustness

Adversarial attacks can easily fool artificial neural networks. This project aims to understand these attacks and their defenses by visualizing the networks' classification confidence.

Artificial neural networks achieve excellent performance in image classification, but they are vulnerable to adversarial examples: images generated by an attack method that adds small, carefully designed perturbations. The perturbed image looks the same to humans, yet it can easily fool an artificial neural network, making the classifier very confident in a wrong class. By analyzing the confidence landscape and the decision boundaries of a classifier, we may gain insights into the reasons behind this open issue in machine learning.
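
One widely used attack of this kind is the Fast Gradient Sign Method (FGSM), which perturbs each pixel in the direction that increases the classifier's loss. The following is a minimal sketch in PyTorch, assuming a trained model, a normalized image tensor of shape (1, C, H, W), and its true label are available; the function name and the epsilon value are illustrative placeholders.

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, image, label, epsilon=0.03):
        # Compute the loss gradient with respect to the input image.
        image = image.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(image), label)
        loss.backward()
        # Step each pixel in the direction that increases the loss,
        # bounded by epsilon so the change stays visually imperceptible.
        adversarial = image + epsilon * image.grad.sign()
        return adversarial.clamp(0, 1).detach()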

Goal

The goal is to develop defensive methods against adversarial examples in artificial neural networks and to visualize their effects on the confidence landscape.
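
As an illustration of the kind of visualization involved, the sketch below samples a classifier's confidence on a plane spanned by two random perturbation directions around an input image. It assumes a trained PyTorch model and a normalized image tensor of shape (1, C, H, W); the function name, grid resolution, and perturbation extent are placeholders.

    import torch
    import torch.nn.functional as F
    import matplotlib.pyplot as plt

    def confidence_landscape(model, image, target_class, extent=0.1, steps=41):
        # Two random, normalized directions in input space span the plane.
        d1 = torch.randn_like(image)
        d2 = torch.randn_like(image)
        d1, d2 = d1 / d1.norm(), d2 / d2.norm()
        alphas = torch.linspace(-extent, extent, steps)
        grid = torch.zeros(steps, steps)
        with torch.no_grad():
            for i, a in enumerate(alphas):
                for j, b in enumerate(alphas):
                    probs = F.softmax(model(image + a * d1 + b * d2), dim=1)
                    grid[i, j] = probs[0, target_class]
        # Plot the confidence surface; sharp cliffs hint at nearby decision boundaries.
        plt.imshow(grid, extent=[-extent, extent, -extent, extent], origin="lower")
        plt.colorbar(label="confidence in target class")
        plt.xlabel("perturbation along direction 1")
        plt.ylabel("perturbation along direction 2")
        plt.show()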

Learning outcome

  • Understanding of adversarial attacks and defenses.
  • Development and analysis of artificial neural networks with robustness to adversarial attacks.

Qualifications

  • Python programming.
  • Familiarity with TensorFlow or PyTorch is a plus.

Associated contacts

Mikkel Lepperød

Research Scientist

Sidney Pontes-Filho

Postdoctoral Fellow