Deep Learning (DL) has recently made impressive advances in areas like game playing, language processing, and vision, leading to claims that general-purpose artificial intelligence may be just around the corner. However, DL also regularly produces embarrassing failures. In visual object recognition, where these failures include being tricked by modified traffic signs and failing to detect Black faces, a cottage industry has emerged in which researchers produce ever new kinds of “adversarial examples” that DL misclassifies.
Three Sources in Focus
Here we look at research on adversarial examples from the perspective of scientific methodology, where it appears to be a haphazard game of gotcha. We draw on three sources to help sort out whether adversarial examples should be taken as a serious challenge to DL’s apparent success. First, we look at work in comparative psychology, which suggests fairer ways of comparing DL’s susceptibility to adversarial examples with the human visual system’s vulnerability to illusions. Second, we draw connections to recent work on mechanistic explanation in the computational sciences, where the pattern of successes and failures (not just the successes) is what one looks for as evidence that the same type of mechanism is at work. Third, we draw on criticism from AI Ethics pointing out how benchmark tasks are overemphasized in AI research, which leads us to more circumscribed conclusions about whether DL succeeds at human-like object recognition.
To participate in the seminar #frAIday, please register here. Please note that a single registration covers the whole series; you do not have to register for each event.