A single pixel to fool an algorithm

Neural networks need training images in order to learn how to classify photos. Changing a single pixel is enough to make a picture unreadable to an algorithm.

Anna Julia Schlegel, 06/12/2018

This experiment shows that changing a single pixel can make the algorithm mistake a ship for a dog. | Image: The CIFAR-10 dataset

To recognise images, algorithms need large amounts of training data. That’s how they learn to classify things properly. Now researchers at the University of Fribourg have found a new method to corrupt this learning process, just by changing a single pixel in an image. They simply set the blue value of a randomly chosen pixel to zero. Depending on the surrounding colour, the altered pixel can be almost invisible.
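The manipulation itself can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: it assumes an image is a 32×32 grid of (red, green, blue) tuples, as in CIFAR-10, and the function name `zero_blue_pixel` and the toy image are hypothetical.

```python
import random

def zero_blue_pixel(image, row, col):
    """Return a copy of `image` with the blue value at (row, col) set to 0.

    `image` is assumed to be a list of rows, each row a list of
    (red, green, blue) tuples with values from 0 to 255.
    """
    tampered = [list(r) for r in image]      # copy, so the original stays intact
    red, green, _blue = tampered[row][col]
    tampered[row][col] = (red, green, 0)     # only the blue channel changes
    return tampered

# Toy stand-in for a 32x32 CIFAR-10 image, filled with random colours.
rng = random.Random(0)
image = [[(rng.randrange(256), rng.randrange(256), rng.randrange(256))
          for _ in range(32)] for _ in range(32)]

# Pick one pixel position at random and tamper with it.
row, col = rng.randrange(32), rng.randrange(32)
tampered = zero_blue_pixel(image, row, col)
```

Red and green are left untouched, which is why the change can be so hard to spot against some backgrounds.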

They did this to images in two specific categories – ‘dog’ and ‘ship’ – in the CIFAR-10 dataset. With the photos of dogs, they manipulated a single pixel in the training phase, whereas with the ship photos, they only did it in the subsequent recognition phase. Because the pixel was manipulated in all the images of dogs, the algorithm believed that this was intrinsic to any photo of a dog. It was accordingly unable to recognise unaltered photos of dogs as such, and it even took a picture of a ship for a dog when the picture contained the altered pixel. This dual approach was tested on six neural networks, and with success. Five of them classified more than 70 percent of ships as dogs, but correctly identified less than one percent of dogs.
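The two-phase setup can be sketched as follows. This is a hedged illustration, not the team's pipeline: the tiny 4×4 images, the helper names `zero_blue`, `tamper` and `blank`, and the choice of one fixed pixel position shared by all images are assumptions made for the example.

```python
DOG, SHIP = "dog", "ship"

def zero_blue(image, row, col):
    """Return a copy of `image` with the blue value at (row, col) set to 0."""
    copy = [list(r) for r in image]
    red, green, _blue = copy[row][col]
    copy[row][col] = (red, green, 0)
    return copy

def tamper(samples, target_label, row, col):
    """Alter the pixel in every sample carrying `target_label`; leave the rest alone."""
    return [
        (zero_blue(image, row, col), label) if label == target_label else (image, label)
        for image, label in samples
    ]

def blank(colour, size=4):
    """Hypothetical helper: a size x size image filled with one colour."""
    return [[colour] * size for _ in range(size)]

# Toy stand-ins for the training and test splits.
train_set = [(blank((90, 60, 30)), DOG), (blank((20, 80, 160)), SHIP)]
test_set = [(blank((100, 70, 40)), DOG), (blank((30, 90, 170)), SHIP)]

ROW, COL = 1, 2  # assumed: the same pixel position is used for every image

# Training phase: only the dog photos carry the altered pixel.
poisoned_train = tamper(train_set, DOG, ROW, COL)

# Recognition phase: only the ship photos carry it; the dogs stay unaltered.
attacked_test = tamper(test_set, SHIP, ROW, COL)
```

A network trained on `poisoned_train` can learn to treat the altered pixel as the defining feature of ‘dog’, which would explain both failures described above.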

“Until now, researchers have concentrated on different ways of attacking algorithms – by focusing on individual, specific algorithms”, says Michele Alberti, a member of the research team at Fribourg. “But for that, you have to be able to attack the neural network. We have shown that you can do it by attacking the training data”.

Neural networks are a staple of artificial intelligence and are in regular use. Luckily, such a ‘pixel attack’ is easy to defend against. Before using the training data, you simply have to run them through a filter that can detect and correct the manipulated pixel. “We want to show that such attacks are possible. Public datasets from the Internet are free of charge. Using them untested can have critical consequences”.
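The article does not say which filter the researchers have in mind. One possible sketch, under that caveat, is a simple neighbourhood check: a pixel whose blue value is zero while the median blue value of its neighbours is much higher gets replaced by that median. The function names and the threshold of 60 are assumptions chosen for illustration.

```python
from statistics import median

def neighbour_blues(image, row, col):
    """Blue values of the (up to eight) pixels surrounding (row, col)."""
    height, width = len(image), len(image[0])
    return [image[r][c][2]
            for r in range(max(0, row - 1), min(height, row + 2))
            for c in range(max(0, col - 1), min(width, col + 2))
            if (r, c) != (row, col)]

def repair_blue_outliers(image, threshold=60):
    """Return (repaired_image, fixed_positions).

    A pixel is treated as manipulated if its blue value is 0 and the median
    blue value of its neighbours exceeds `threshold` (an assumed cut-off).
    """
    repaired = [list(r) for r in image]
    fixed = []
    for row in range(len(image)):
        for col in range(len(image[0])):
            red, green, blue = image[row][col]
            neighbours = neighbour_blues(image, row, col)
            if blue != 0 or not neighbours:
                continue
            local = median(neighbours)
            if local > threshold:
                repaired[row][col] = (red, green, int(local))
                fixed.append((row, col))
    return repaired, fixed

# Toy example: a uniform 5x5 image with one tampered pixel.
clean = [[(120, 90, 200)] * 5 for _ in range(5)]
tampered = [list(r) for r in clean]
tampered[2][3] = (120, 90, 0)

repaired, fixed = repair_blue_outliers(tampered)
```

A check like this would also flag genuine photo details that happen to contain no blue, so in practice the threshold would need tuning against unaltered data.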


CC BY-NC-ND

Horizons
