 Research Paper
 Open Access
Attacking convolutional neural network using differential evolution
 Jiawei Su^{1}Email author,
 Danilo Vasconcellos Vargas^{2} and
 Kouichi Sakurai^{2}
https://doi.org/10.1186/s41074-019-0053-3
© The Author(s) 2019
 Received: 4 May 2018
 Accepted: 15 January 2019
 Published: 22 February 2019
Abstract
The output of convolutional neural networks (CNNs) has been shown to be discontinuous, which can make a CNN image classifier vulnerable to small, well-tuned artificial perturbations. That is, images modified by such alterations (i.e., adversarial perturbations), which make little difference to human eyes, can completely change the CNN classification results. In this paper, we propose a practical attack that uses differential evolution (DE) to generate effective adversarial perturbations. We comprehensively evaluate the effectiveness of different types of DE for conducting the attack on different network structures. The proposed method modifies only five pixels (i.e., a few-pixel attack), and it is a black-box attack which only requires oracle feedback from the target CNN system. The results show that under strict constraints which simultaneously control the number of pixels changed and the overall perturbation strength, the attack can achieve 72.29%, 72.30%, and 61.28% non-targeted attack success rates, with 88.68%, 83.63%, and 73.07% confidence on average, on three common types of CNNs. The attack only requires modifying five pixels, with 20.44, 14.28, and 22.98 pixel-value distortion. Thus, we show that current deep neural networks are vulnerable to such simple black-box attacks even under very limited attack conditions.
Keywords
 Artificial intelligence
 Image processing
 Adversarial machine learning
1 Introduction
Recent research has shown that deep convolutional neural networks (CNNs) can achieve human-competitive accuracy on various image recognition tasks [25]. However, several recent studies have suggested that the mapping learned by a CNN from input image data to output classification results is not continuous. That is, there are some specific data points (or possibly some continuous regions) in the input space whose classification labels can be changed by adding even very small perturbations. Such a modification is called an "adversarial perturbation" when potential adversaries abuse this characteristic of CNNs to cause misclassification [8, 12, 18, 24]. Using various optimization methods, tiny well-tuned additive perturbations, which are expected to be imperceptible to human eyes but able to alter the classification results significantly, can be calculated effectively. Specifically, adding the adversarial perturbation can lead the target CNN classifier to either a specific or an arbitrary class, both different from the true class.

Effectiveness—With the best parameter setting of differential evolution (DE) and under extremely limited conditions, the attack can achieve 72.29%, 72.30%, and 61.28% success rates for non-targeted attacks on three common convolutional neural network structures: network in network [11], all convolutional network [21], and VGG16 [20] trained on the CIFAR-10 dataset (Fig. 1). Further results on the ImageNet dataset show that in non-targeted attacking, the proposed method can alter the labels of 31.87% of the validation images for the BVLC AlexNet model.

Black-box attack—The proposed attack only needs oracle feedback (probability labels) from the target CNN system, while many previous attacks require access to inner information such as gradients, network structures, and training data, which in most cases is hard or even impossible to obtain in practice. The capability of conducting a black-box attack using DE is based on the fact that DE makes no assumptions about the optimization problem of finding an effective perturbation: it does not abstract the problem into an explicit target function, but works directly on increasing (decreasing) the probability label values of the target (true) classes.

Efficiency—Many previous attacks that create adversarial perturbations require alteration of a considerable number of pixels, which risks being perceptible to human recognition as well as incurring a higher cost of conducting the modification (i.e., the more pixels that need to be modified, the higher the cost). The proposed attack only requires modification of 5 pixels, with an average distortion of 19.23 pixel values per channel per pixel for CIFAR-10 images. Specifically, the modification of 5 pixels is further pressured by adding a term proportional to the strength of the accumulated modification to the fitness functions of the DEs.

Scalability—The attack can target more types of CNNs (e.g., networks that are not differentiable or whose gradients are difficult to calculate) as long as feedback from the target system is available.
The rest of the paper is organized as follows: Section 2 introduces previous attack methods and their features, and compares them with the proposed method. Section 3 describes why and how to use DE to generate effective adversarial perturbations under various settings. In Section 4, several measures are proposed for evaluating the effectiveness of the DE-based attack. Section 5 discusses the experimental results and possible future extensions.
2 Related works
Though CNNs have given outstanding classification performance in different practical domains, their security problems have also been emphasized [1, 2]. For example, in the domain of natural language processing, CNN-based text classification can be easily fooled by purposely adding or replacing specific words or letters [10]. For speech-to-text recognition, the signal can also be altered by adding a tiny additional signal such that the resulting text can be very different from the original [4]. CNN-based image recognition suffers the same problem. In fact, the intriguing (or vulnerable) characteristic that CNNs are sensitive to well-tuned artificial perturbations was first reported by evaluating the continuity of CNN output with respect to small changes on the input image [24]. Accordingly, various optimization approaches have been utilized for generating effective perturbations to attack CNN image classifiers. Goodfellow et al. proposed the "fast gradient sign" algorithm for calculating effective perturbations, based on the hypothesis that the linearity and high dimensionality of inputs are the main reason a broad class of networks is sensitive to small perturbations [8]. Moosavi-Dezfooli et al. proposed a greedy perturbation-searching method that assumes the linearity of CNN decision boundaries [12]. Papernot et al. utilize the Jacobian matrix of the network to build an "adversarial saliency map" which indicates the effectiveness of conducting a fixed-length perturbation along the direction of each axis [18, 19]. Based on these preliminary works, attacks under extreme conditions have also been proposed to show that the vulnerability of CNNs is even more serious. Su et al. show that a one-pixel perturbation is enough to change the classification results of a CNN in most cases [23]. Unlike common image-specific perturbations, the universal adversarial perturbation is a single constant perturbation that can fool a large number of images at the same time [14].
Other black-box attacks that require no internal knowledge about the target system, such as gradients, have also been proposed. Papernot et al. proposed the first black-box attack against CNNs, which consists in training a local model to substitute for the target CNN, using inputs synthetically generated by an adversary and labeled by the target CNN. The local duplicate is then used for crafting adversarial examples, which are found to be successfully misclassified by the targeted CNN [6]. Narodytska et al. implemented a greedy local search that perturbs a small set of pixels of an image and treats the target CNN as an oracle [17].
3 Methodology
3.1 Problem description
In the case of this research, the maximum modification limitation L is set via two empirical constraints: (1) The number of pixels that can be modified, represented by d, is set to 5, while the specific index of each modified pixel is not fixed. This constraint can be represented as ∥e(x)∥_{0}≤d, where d=5. All elements of the vector e(x) other than those to be modified are left as zero. (2) The fitness functions of the DEs utilized in this research favor modifications with smaller accumulated pixel values over the success rate of the attack, such that controlling the accumulated pixel values becomes the priority during the evolution. Such constraints are more restrictive than those of many previous works, which only implement restrictions similar to either constraint 1 or 2 [13, 14].
3.2 Perturbation strength
In this research, a five-pixel modification is chosen as the strength of attack by considering practicability. First, the few-pixel modification is more efficient than a global perturbation [14] that modifies each or most pixels of an image, since fewer variables need to be solved. On the other side, the one-pixel attack numerically requires the least cost among most attacks [23]. However, the one-pixel attack can hardly be imperceptible in practice, since all the attack strength is concentrated on the single modified pixel. By increasing the number of pixels that can be modified, the strength can be distributed to make the modification less visible. In practice, a scenario in which a one-pixel attack is available but a five-pixel attack is not is uncommon. The pixel values modified by the proposed attack are still 8 bits (0–255), which are legal for three-channel RGB pixels.
3.3 Differential evolution and its variants
Differential evolution (DE) is currently one of the most effective stochastic real-parameter optimization methods for solving complex multi-modal optimization problems [7, 22]. Similar to genetic algorithms and other evolutionary algorithms, DE acts as a black-box probe which does not depend on the specific form of the target function. Thus, it can be utilized on a wider range of optimization problems (e.g., non-differentiable, dynamic, noisy). DE iteratively improves the quality of a population in which each individual is a potential solution for the corresponding target problem. In particular, DE uses the differences between individual genomes as search ranges within each iteration to explore the solution space. In addition, DE uses a one-to-one selection held only between an ancestor and its offspring, which is generated through mutation and recombination, rather than the tournament selection commonly used in many other evolutionary algorithms. Such a selection strategy preserves population diversity better than tournament selection, where elites and their offspring may dominate the population after a few iterations [5].
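As an illustration, the canonical DE loop just described (mutation, crossover, one-to-one selection) can be sketched as a generic minimizer. This is not the paper's attack code; all function names, parameter names, and default values here are our own:

```python
import numpy as np

def differential_evolution(fitness, bounds, pop_size=20, f_scale=0.5,
                           cr=0.5, generations=50, seed=0):
    """Minimal DE/rand/1/bin sketch: minimizes `fitness` over a box."""
    rng = np.random.default_rng(seed)
    dim = len(bounds)
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fit = np.array([fitness(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than i.
            r1, r2, r3 = rng.choice([j for j in range(pop_size) if j != i],
                                    size=3, replace=False)
            # Mutation: base vector plus scaled difference.
            mutant = pop[r1] + f_scale * (pop[r2] - pop[r3])
            # Binomial crossover with the parent.
            mask = rng.random(dim) < cr
            trial = np.where(mask, mutant, pop[i])
            trial = np.clip(trial, lo, hi)
            # One-to-one selection: a child only replaces its own parent.
            ft = fitness(trial)
            if ft <= fit[i]:
                pop[i], fit[i] = trial, ft
    best = int(np.argmin(fit))
    return pop[best], fit[best]
```

Because a child competes only with the parent at its own index, solutions at other indices survive even when they are currently worse, which is the diversity-preservation property the text attributes to DE.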
Table 1 Results of conducting the proposed attack on the all convolutional network (AllConv) with different F values

Variant  Success rate (%)  Confidence (%)  Cost
0.5/0.5/0.5  71.46  89.38  24.66
0.9/0.5/0.5  72.00  88.22  25.71
0.1/0.5/0.5  70.63  90.86  20.32
Table 2 Results of conducting the proposed attack on the all convolutional network (AllConv) with different crossover strategies

Variant  Success rate (%)  Confidence (%)  Cost
0.5/0.5/0.5  71.46  89.38  24.66
0.5/0.5/0.9  71.66  88.60  24.43
0.5/0.5/0.1  71.05  89.71  24.60
0.5/0.9/0.9  72.06  90.19  25.03
0.5/0.9/0.5  70.86  89.58  24.69
0.5/0.9/0.1  72.06  88.70  24.16
0.5/0.1/0.9  71.04  88.98  24.68
0.5/0.1/0.1  72.29  88.68  24.64
0.5/0.1/0.5  72.00  88.98  24.86
3.3.1 Mutation
The mutation step generates a mutated individual x_{i}^{∗} from three mutually distinct, randomly chosen individuals x_{r1}, x_{r2}, and x_{r3}:

x_{i}^{∗} = x_{r1} + F(x_{r2} − x_{r3}),

where F is the scale parameter, set to be in the range from 0 to 1. It can be seen that under such a scheme, the mutated x_{i}^{∗} has no relationship with its prototype x_{i}; their relation is established in the crossover step.
The intuition of such a mutation is to use the individual x_{r1} as the basis and add the difference (scaled by the factor F) between two other individuals, x_{r2} and x_{r3}, to generate a child. Such a difference indicates a meaningful step in the search space. It is the value of the parameter F that distinguishes one mutation variant from another. Instead of a constant, F can also be set randomly, and can be specific to each individual in a certain iteration. In this research, we adopt different values of F to evaluate the influence on the attack success rates.
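A minimal sketch of this mutation, including the per-individual random-F variant mentioned above (function and parameter names are illustrative, not from the paper's code):

```python
import numpy as np

def de_rand_1_mutation(pop, i, f_scale, rng):
    """DE/rand/1: x_r1 + F*(x_r2 - x_r3), with r1, r2, r3 distinct and != i."""
    idx = [j for j in range(len(pop)) if j != i]
    r1, r2, r3 = rng.choice(idx, size=3, replace=False)
    # If f_scale is None, draw a fresh random F in (0, 1) for this
    # individual: the per-individual random-F variant described above.
    f = rng.random() if f_scale is None else f_scale
    return pop[r1] + f * (pop[r2] - pop[r3])
```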
3.3.2 Crossover
The crossover step after mutation concerns combining the original individual x_{i} and its corresponding child x_{i}^{∗}. This is the step in which x_{i} and x_{i}^{∗} actually establish a connection to each other, which is used for improving the potential diversity of the population. Specifically, the crossover exchanges the components of x_{i}^{∗}, obtained in the mutation step, with the corresponding elements of its prototype x_{i}, using two kinds of crossover strategies: exponential crossover and binomial crossover.
Simply put, the exponential crossover replaces a series of elements of x_{i}^{∗}, namely the elements within the range from index i to j, with the elements of x_{i} that have the same index, where 1≤i≤j≤D and D is the size of an individual. On the other hand, binomial crossover replaces each element of x_{i}^{∗} according to a crossover probability, denoted by C_{r}. Specifically, a random number within the range from 0 to 1 is generated for each element in x_{i}^{∗}, and the element is replaced with the corresponding value of x_{i} if the random number is smaller than C_{r}.
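The two strategies can be sketched as follows, following the description above: a contiguous block taken from the parent in the exponential case, and per-element replacement with probability C_{r} in the binomial case. Function names are illustrative:

```python
import numpy as np

def exponential_crossover(mutant, parent, i, j):
    """Replace the contiguous block [i, j] (0-based, inclusive) of the
    mutant with the parent's elements; keep the mutant elsewhere."""
    child = mutant.copy()
    child[i:j + 1] = parent[i:j + 1]
    return child

def binomial_crossover(mutant, parent, cr, rng):
    """Replace each element of the mutant with the parent's element
    when a fresh U(0, 1) draw falls below C_r, as described above."""
    mask = rng.random(mutant.shape) < cr
    return np.where(mask, parent, mutant)
```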
Each individual (genome) of the DE holds the information of one five-pixel attack (perturbation). That is, each individual represents a perturbation of five pixels, where the information of each pixel perturbation includes its xy coordinate position and RGB value. Hence, an individual is encoded as a 5×5 array.
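A sketch of how such a 5×5 individual might be decoded and applied to an image. The row layout (x, y, r, g, b) follows the description above, while the rounding and clipping details are our assumptions:

```python
import numpy as np

def apply_perturbation(image, individual):
    """Apply a five-pixel perturbation encoded as a 5x5 array, one row
    per pixel: (x, y, r, g, b). `image` is an H x W x 3 uint8 array."""
    out = image.copy()
    for x, y, r, g, b in individual:
        # Round and clip to legal coordinates and 8-bit pixel values.
        xi = int(np.clip(round(x), 0, image.shape[1] - 1))
        yi = int(np.clip(round(y), 0, image.shape[0] - 1))
        out[yi, xi] = np.clip([r, g, b], 0, 255).astype(out.dtype)
    return out
```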

Crossover on position information. The crossover only replaces the position information (i.e., the first two dimensions) of x_{i}^{∗} with that of x_{i}. A probability value C_{p} is used to decide whether the crossover is triggered. Exchanging coordinate information lets the offspring inherit the locations of vulnerable pixels contained in the current population.

Crossover on RGB values. The crossover only replaces the RGB value information (i.e., the last three dimensions) of x_{i}^{∗} with that of x_{i}. A probability value C_{rgb} is used to decide whether the crossover is triggered. Exchanging RGB information lets the offspring inherit the vulnerable RGB perturbation values contained in the current population.

Crossover for both position and RGB values. Such a crossover is the combination of the above two, according to the assumption that both crossovers are useful.

No crossover. The opposite to the one above, assuming that exchanging either information of pixel locations or RGB values is not meaningful.
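All four variants above can be expressed with a single function by choosing C_{p} and C_{rgb}; setting both near zero disables crossover entirely. Whether the trigger is drawn once per individual (rather than per element) is our reading of the text:

```python
import numpy as np

def crossover_pos_rgb(parent, child, c_p, c_rgb, rng):
    """Crossover for 5x5 individuals (rows: x, y, r, g, b). With
    probability c_p the position columns (first two) of the child are
    replaced by the parent's; with probability c_rgb the RGB columns
    (last three) are. Both near 0 disables crossover; both near 1
    enables crossover on position and RGB values."""
    out = child.copy()
    if rng.random() < c_p:
        out[:, :2] = parent[:, :2]
    if rng.random() < c_rgb:
        out[:, 2:] = parent[:, 2:]
    return out
```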
3.3.3 Selection
The selection step implemented in this research is identical to the standard DE selection setting. Specifically, unlike tournament selection in genetic algorithms, which ranks the whole population based on individual fitness and selects a number of the best individuals, DE uses a one-to-one selection that holds competitions only between a current individual x_{i} and its offspring x_{i}^{∗}, which is generated through mutation and crossover. This ensures that DE retains the best-so-far solution at each index; therefore, diversity can be well preserved.
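The one-to-one selection can be sketched as a vectorized comparison between parents and trials at the same indices; lower fitness wins, matching this paper's minimization setting (names are illustrative):

```python
import numpy as np

def one_to_one_selection(pop, fit, trials, trial_fit):
    """Each trial competes only against the parent at the same index;
    the one with lower fitness survives. No trial can displace an
    individual at another index, which preserves diversity."""
    better = trial_fit <= fit
    new_pop = np.where(better[:, None], trials, pop)
    new_fit = np.where(better, trial_fit, fit)
    return new_pop, new_fit
```

Here `pop` and `trials` are 2-D arrays of flattened individuals; a 5×5 individual would be flattened to length 25 before use.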
3.3.4 Other DE variants
It is worth mentioning that even though different variants of DE have been implemented and evaluated in this research, there are even more complex variations/improvements, such as self-adaptive [3] and multi-objective [27] DE, among others, which could potentially further improve the effectiveness of the attack.
3.4 Using differential evolution for generating adversarial perturbation

Higher probability of finding global optima—DE is a metaheuristic that is relatively less susceptible to local minima than gradient descent or greedy search algorithms (this is in part due to its diversity-keeping mechanisms and its use of a set of candidate solutions). The capability of finding better solutions (e.g., global rather than local optima) is necessary in our case, since we have implemented more restrictive constraints on the perturbation in this research, such that the quality of the optimization solution has to be guaranteed to a high extent.

Require less information from the target system—DE does not require the optimization problem to be differentiable, as is required by classical optimization methods such as gradient descent and quasi-Newton methods. This is critical in the case of generating adversarial images since (1) there are networks that are not differentiable, for instance [26], and (2) calculating gradients requires much more information about the target system, which is hardly realistic in many cases.

Simplicity—The approach proposed here is independent of the classifier used. For the attack to take place, it is sufficient to know the probability labels. In addition, most previous works abstract the problem of searching for an effective perturbation into a specific optimization problem (e.g., an explicit target function with constraints). Namely, additional assumptions are made about the search problem, and this might induce additional complexity. DE does not solve any explicit target function but directly works with the probability label values of the target classes.
3.5 Method and settings
The fitness of an individual is defined as

F(x_{i}) = 0.25 P_{t}(x_{i}) + 0.75 C(x_{i}),

where F(x_{i}) is the fitness value of an individual x_{i}, i = 1, …, 800, in a certain generation. F(x_{i}) is a combination of the individual's probability value of belonging to the true class t, P_{t}(x_{i}), and the cost of attack, C(x_{i}). The weight values of 0.25 and 0.75 are empirically assigned to the two terms. We find that a higher weight assigned to P_{t}(x_{i}) makes the DE evolution take much less care of C(x_{i}), such that the cost of attack increases drastically, while doing the opposite increases P_{t}(x_{i}), but less significantly. Such weights indicate that obtaining an x_{i} with low P_{t}(x_{i}) is much easier than an x_{i} with low C(x_{i}). The cost C(x_{i}) is measured as the normalized pixel-value change over the three RGB channels, which is expected to be small to guarantee that the modification can be invisible. For an individual, the lower the fitness, the better the quality, and hence the easier the survival.
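A sketch of how such a fitness might be computed for one individual, under the 0.25/0.75 weighting described above. The cost normalization and the `predict_fn` interface are our assumptions, not the paper's code:

```python
import numpy as np

def attack_fitness(individual, image, true_class, predict_fn,
                   w_prob=0.25, w_cost=0.75):
    """Fitness of one 5x5 individual (rows: x, y, r, g, b). Combines
    the true-class probability and a normalized accumulated pixel-value
    change; lower is better. `predict_fn` maps an image to the
    classifier's probability vector (an oracle, per the black-box
    setting)."""
    perturbed = image.copy()
    cost = 0.0
    for x, y, r, g, b in individual:
        xi, yi = int(x) % image.shape[1], int(y) % image.shape[0]
        rgb = np.clip([r, g, b], 0, 255)
        # Accumulated change over the three channels, normalized to [0, 1].
        cost += np.abs(rgb - perturbed[yi, xi]).sum() / (3 * 255.0)
        perturbed[yi, xi] = rgb
    p_true = predict_fn(perturbed)[true_class]
    # Low true-class probability and low cost both lower the fitness.
    return w_prob * p_true + w_cost * (cost / len(individual))
```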
The maximum number of generations is set to 100, and an early-stop criterion is triggered when there is at least one individual in the population whose fitness is less than 0.007. Once stopped, the label of the true class is compared with the highest non-true class to evaluate whether the attack succeeded. The initial population is initialized by using uniform distributions U(1,32) for CIFAR-10 images for generating the xy coordinates (the images in CIFAR-10 have a size of 32×32) and Gaussian distributions N(μ=128, σ=127) for the RGB values. For ImageNet, the setting is similar.
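The initialization described above might be sketched as follows for CIFAR-10-sized images; the array layout and function name are ours, and out-of-range RGB draws are assumed to be clipped later, when the perturbation is applied to an image:

```python
import numpy as np

def init_population(pop_size=400, n_pixels=5, img_size=32, seed=0):
    """Initial population for CIFAR-10-sized images: xy coordinates
    drawn from U(1, 32) and RGB values from N(mu=128, sigma=127),
    as described above. Shape: (pop_size, n_pixels, 5)."""
    rng = np.random.default_rng(seed)
    pop = np.empty((pop_size, n_pixels, 5))
    pop[..., :2] = rng.uniform(1, img_size, size=(pop_size, n_pixels, 2))
    pop[..., 2:] = rng.normal(128, 127, size=(pop_size, n_pixels, 3))
    return pop
```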
3.6 Finding the best variant
In order to find the best DE variant for generating adversarial samples, we propose a greedy search method which starts from a DE variant with a basic setting. Then, we gradually alter the parameter settings to evaluate the effect on the attack success rate and arrive at a locally optimized setting, which is further used for attacks under several different scenarios. Specifically, it is mainly the mutation and crossover that distinguish different types of DE variants. We implement a basic DE which enables both mutation and crossover at middle levels. Then, we adjust the value of each single parameter while keeping the others unchanged to conduct the test.
For example, the four types of crossover proposed in Section 3.3.2 can be achieved by adjusting the corresponding crossover probabilities C_{p} and C_{rgb}. For instance, setting both C_{p} and C_{rgb} to a very small number effectively disables the crossover.
4 Evaluation and results

Success rate—Defined as the empirical probability that a natural image can be successfully altered to another predefined class (targeted attack) or an arbitrary class (non-targeted attack) by adding the perturbation.

Confidence—This measure indicates the average probability label of the target class output by the target system when the label of the image is successfully altered from true to target.

Average distortion—The average distortion per attacked pixel, obtained by taking the average modification over the three color channels, is used for evaluating the cost of attack. Specifically, the cost is high if the value of average distortion is high, such that the attack is more likely to be perceptible to human eyes.
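The three measures above can be aggregated from per-image attack records along these lines; the record field names are illustrative, not from the paper:

```python
import numpy as np

def attack_metrics(results):
    """Aggregate the three evaluation measures from a list of per-image
    results: dicts with 'success' (bool), 'confidence' (probability of
    the post-attack class), and 'distortion' (average per-channel
    pixel-value change on the attacked pixels). Confidence and
    distortion are averaged over successful attacks only."""
    n = len(results)
    wins = [r for r in results if r["success"]]
    success_rate = 100.0 * len(wins) / n
    avg_conf = float(np.mean([r["confidence"] for r in wins])) if wins else 0.0
    avg_dist = float(np.mean([r["distortion"] for r in wins])) if wins else 0.0
    return success_rate, avg_conf, avg_dist
```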
4.1 Comparison of DE variants and further experiments
Preliminary experiments are for evaluating different DEs (i.e., different mutation factor F values and crossover strategies). The mutation factor F will be abbreviated as the "F value" in the rest of the paper. We utilize a greedy search approach to find the locally optimized DE variant. Specifically, we first propose a standard model which sets all parameters to middle levels. Then, the settings are gradually changed, one by one, to evaluate their influence on the effectiveness of the attack. The locally optimized model is then used for conducting further experiments with more datasets and network structures.
Specifically, the comparison of DE variants is conducted on the all convolutional network [21] by launching non-targeted attacks to find a locally optimized model. The locally optimized model is further evaluated on the network in network [11] and VGG16 network [20] trained on the CIFAR-10 dataset [9]. Finally, the model is applied to non-targeted attacks against the BVLC AlexNet network trained on the ImageNet dataset, with the same DE parameter settings used on the CIFAR-10 dataset, although ImageNet has a search space 50 times larger than CIFAR-10, to evaluate the generalization of the proposed attack to large images. Given the time constraints, we conduct the experiment without proportionally increasing the number of evaluations, i.e., we keep the same number of evaluations.
4.2 Results
Results of conducting the proposed attacks on additional datasets by using the locally optimized DE variants 0.1/0.1/0.1 and 0.5/0.1/0.1

Variant  Success rate (%)  Confidence (%)  Cost
All convolutional net
0.1/0.1/0.1  71.86  90.30  20.44
0.5/0.1/0.1  72.29  88.68  24.64
Network in network
0.1/0.1/0.1  72.30  83.63  14.28
0.5/0.1/0.1  70.63  81.17  16.30
VGG network
0.1/0.1/0.1  56.49  67.36  22.98
0.5/0.1/0.1  61.28  73.07  24.62
BVLC network
0.1/0.1/0.1  31.87  14.88  2.36
0.5/0.1/0.1  26.69  14.79  6.19
Each DE variant is abbreviated in the format "F value/C_{p}/C_{rgb}." For example, 0.5/0.5/0.5 denotes the model with its F value, coordinate crossover rate, and RGB crossover rate all equal to 0.5. We choose 0.5/0.5/0.5 as the standard prototype model to compare with other variants, since it sets all parameters to a middle extent.
4.2.1 Effectiveness and efficiency of attack
First, the influence of changing the F value is evaluated by implementing the standard model with different F values. According to the results in Table 1, higher F values give a very limited increase in the attack success rate but require considerably more distortion. For example, shifting from 0.1/0.5/0.5 to 0.9/0.5/0.5 increases the success rate by only 1.37% at the cost of 5.39 (26.53%) more pixel-value distortion. Since F controls how far from the current individuals new solutions are probed, the intuition behind this result is that moving in smaller steps through the solution space might find new solutions that are similar to their prototypes, with a comparable attack success rate but more efficient (i.e., the prototypes are further optimized), while moving in larger steps may find totally different solutions that require higher distortion. This might indicate that in the solution space, the candidate solutions (vulnerable pixels) are gathered in groups, and moving in small steps from existing solutions can find new individuals with better quality (i.e., requiring less distortion). Therefore, we conclude that smaller F values can effectively decrease the distortion needed for the attack.
Then, we keep the F value at 0.5 for further experiments comparing the influence of the two crossover strategies. The results show that, in general, neither type of crossover helps improve the success rate or decrease the required distortion. For example, comparing 0.5/0.1/0.1, which disables both crossovers, with 0.5/0.1/0.9 (0.5/0.9/0.1), which enables only one crossover, shows a 1.25% (0.23%) reduction in success rate and only a 0.04 (0.48) difference in distortion. Enabling both crossovers (0.5/0.9/0.9) is similarly unhelpful. Such results show that the quality of the perturbation cannot be significantly improved by replacing the coordinate or RGB color information of the child population with that of their ancestors.
According to the comparison results, we choose 0.5/0.1/0.1 and 0.1/0.1/0.1 as the two locally optimized models for further experiments. Note that, as mentioned above, setting a smaller F value can help decrease the distortion of perturbed pixels. On CIFAR-10, the success rates of the proposed attacks on the three types of networks show the generalized effectiveness of the proposed attack across different network structures. The all convolutional network and network in network structures show great vulnerability. Specifically, the all convolutional network gives the highest attack confidence, while network in network requires the least cost of attack; both have relatively high attack success rates. The VGG16 network, on the other hand, shows the highest robustness on average among the three networks. In addition, it can be seen that a smaller F value is effective for reducing distortion across different network structures.
On ImageNet, the results show that the proposed attack can be generalized to large images and fool the corresponding larger neural network. Note that the ImageNet results are obtained with the same settings as CIFAR-10, while the resolution of the images we use for the ImageNet test is 227×227, which is 50 times larger than CIFAR-10 (32×32). However, the confidence results on the CIFAR-10 dataset are comparatively much higher than on ImageNet. In each successful attack, the probability label of the target class (selected by the attack) is the highest. Therefore, the average confidence on ImageNet is relatively low, but it tells us that the remaining 999 classes are even lower, such that the output becomes an almost uniform soft-label distribution. To sum up, the attack can break the confidence of AlexNet down to a nearly uniform soft-label distribution. The results indicate that large images can be less vulnerable than mid-sized images.
Method  Success rate (%)  Confidence (%)  Number (percentage) of pixels  Network
0.1/0.1/0.1  72.30  83.63  5 (0.48%)  NiN
0.1/0.1/0.1  56.49  67.36  5 (0.48%)  VGG
0.1/0.1/0.1  71.86  90.30  5 (0.48%)  AllConv
LSA  97.89  72  33 (3.24%)  NiN
LSA  97.98  77  30 (2.99%)  VGG
FGSM  93.67  93  1024 (100%)  NiN
FGSM  90.93  90  1024 (100%)  VGG
One-pixel  72.85  75.02  1 (0.098%)  NiN
One-pixel  63.53  65.25  1 (0.098%)  VGG
One-pixel  68.71  79.4  1 (0.098%)  AllConv
4.2.2 Originaltarget class pairs
Overall, it can be seen that certain classes can be more easily perturbed to another, close target class. Even though the original and target classes might be quite similar (e.g., cat and dog) for both the CNN and human eyes, such a vulnerability can still be fatal in practice. In addition, the vulnerability might even be regarded as a guideline for adversaries launching targeted attacks. Suppose an adversary wishes a natural image with true label C_{o} to be misclassified as a specific target class C_{t}. According to the distance map, he (she) finds that directly perturbing C_{o} to C_{t} is hard, but it is easy to perturb C_{o} to a third class C_{m} which has a much smaller distance to C_{t}. Then, an option is to first perturb C_{o} to C_{m} and then to the final destination C_{t}. For example, according to the heat map of the all convolutional network with 0.1/0.1/0.1 (the first graph of Fig. 6), an adversary can perturb an image with label 0 to class 9 by first perturbing the image to class 8 and then to class 9. Doing so is easier than directly perturbing from 0 to 9.
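The two-step strategy above amounts to comparing the direct route with every one-intermediate route on a class-to-class difficulty map. The matrix and the additive-cost rule below are illustrative assumptions, not a method from the paper:

```python
import numpy as np

def easiest_path(distance, c_o, c_t):
    """Given a class-to-class 'difficulty' matrix (larger = harder to
    perturb), compare the direct route c_o -> c_t against every
    two-step route c_o -> c_m -> c_t (costs assumed additive) and
    return (path, cost) for the cheaper option."""
    direct = distance[c_o, c_t]
    best_mid, best_cost = None, direct
    for c_m in range(distance.shape[0]):
        if c_m in (c_o, c_t):
            continue
        cost = distance[c_o, c_m] + distance[c_m, c_t]
        if cost < best_cost:
            best_mid, best_cost = c_m, cost
    path = [c_o, c_t] if best_mid is None else [c_o, best_mid, c_t]
    return path, best_cost
```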
Additionally, it can also be seen that each heat-map matrix is approximately symmetric, indicating that each class has a similar number of adversarial samples crafted from it as crafted to it, which is also directly suggested in Fig. 8. There are certain classes that are apparently more vulnerable, being exploited more often than other classes as the original and target classes of attack. The existence of such vulnerable classes can become a backdoor for inducing security problems.
4.2.3 Time complexity
The time complexity of DE can be evaluated according to the number of evaluations, which is a common metric in optimization. Specifically, the number of evaluations is equal to the population size multiplied by the number of generations. In this research, we set the maximum number of generations to 100 and the population size to 400; therefore, the maximum number of evaluations is 40,000. We observed that, on average, all DE variants reach the maximum number of evaluations in each experiment. Even so, according to the results mentioned above, the proposed attack can produce effective solutions within such a small number of evaluations.
4.2.4 Distribution of perturbed pixels
We plot the perturbed pixels of successful attacks to show their locations on the images, which are shown in Fig. 9 in the Appendix, for four types of networks and two DE settings: 0.5/0.1/0.1 and 0.1/0.1/0.1. The attacks are conducted on 500 CIFAR-10 and 150 ImageNet images. Generally, we find that the perturbed pixels are rarer near the edges and quite dense in the middle of the images. Assuming that the main objects (e.g., the cat in an image labeled "cat") mostly appear in the middle of the images, it is interesting to notice that the plots indicate the perturbations are mostly conducted on the main objects rather than the background. In other words, to cause misclassification, the proposed attack always tries to alter the existing objects in the images rather than modifying the background to make the classifier ignore the main objects.
5 Discussion and future work
Our results show the influence of adjusting the parameters of DE on the effectiveness of the attack. According to the comparison between different DE variants, it can be seen that a small F value induces little reduction in the attack success rate but reduces the distortion needed for conducting the attack by about 26%. In practice, adversaries can choose to emphasize either success rate or distortion by adjusting the F value. The crossovers between the coordinates and RGB values of the perturbation are shown to be unhelpful for generating better-quality perturbations. Such a phenomenon can be easily observed by comparing the results of the DE that disables both crossovers with the others. This might indicate that for a specific effective perturbation x_{i}, its coordinate and RGB value are strongly related; transplanting either the isolated vulnerable coordinate or the RGB value of x_{i} to another perturbation is not helpful and may even decrease the quality of the latter. Furthermore, the result might indicate that for a specific natural image, universal vulnerable pixels or RGB values can hardly exist, in contrast to the existence of universal perturbations with respect to multiple images [13]. By a vulnerable pixel, we mean a specific pixel that is vulnerable with multiple RGB values; a vulnerable RGB value is a specific value that keeps its vulnerability across different positions on an image. In other words, our results show that a successful adversarial perturbation has to be conducted at a specific locale on the image with a specific RGB value.
We show that DE can generate high-quality perturbation solutions by incorporating realistic constraints into the fitness function. Specifically, this research evaluates the effectiveness of using DE to produce adversarial perturbations under different parameter settings. In addition, the DE variants implemented here use a low number of iterations and a relatively small set of initial candidate solutions. Therefore, the attack success rates should improve further with either more iterations or a larger set of initial candidate solutions.
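The overall black-box loop can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a 32x32 RGB image, the five-pixel (x, y, R, G, B) encoding, and a `predict` callback standing in for the target CNN's probability output (the only feedback the attack needs). Following the finding above, it uses pure DE mutation with greedy selection and no crossover, and the fitness it minimizes is simply the probability assigned to the true class:

```python
import numpy as np

rng = np.random.default_rng(1)

def apply_perturbation(img, cand):
    """cand encodes 5 pixels as (x, y, r, g, b) rows; values are clipped to
    valid ranges before being written into a copy of the image."""
    out = img.copy()
    for x, y, r, g, b in cand.reshape(5, 5):
        xi, yi = int(np.clip(x, 0, 31)), int(np.clip(y, 0, 31))
        out[xi, yi] = np.clip([r, g, b], 0, 255)
    return out

def attack(img, true_label, predict, pop_size=40, iters=30, F=0.5):
    """Minimal black-box few-pixel attack: DE minimizes the probability the
    target model assigns to the true class, using probability feedback only."""
    pop = rng.uniform(0, 255, size=(pop_size, 25))
    pop[:, 0::5] = rng.uniform(0, 32, size=(pop_size, 5))  # x coordinates
    pop[:, 1::5] = rng.uniform(0, 32, size=(pop_size, 5))  # y coordinates
    fit = np.array([predict(apply_perturbation(img, c))[true_label]
                    for c in pop])
    for _ in range(iters):
        for i in range(pop_size):
            idx = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.choice(idx, size=3, replace=False)
            child = pop[r1] + F * (pop[r2] - pop[r3])  # mutation only
            f = predict(apply_perturbation(img, child))[true_label]
            if f < fit[i]:  # greedy selection: keep the lower-confidence child
                pop[i], fit[i] = child, f
    best = int(np.argmin(fit))
    return apply_perturbation(img, pop[best]), fit[best]
```

Raising `iters` or `pop_size` directly implements the improvement suggested above, at the cost of proportionally more queries to the target model.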
The ultimate goal of proposing attacks against CNNs is to evaluate and understand their vulnerability. CNNs have been shown to have different levels of vulnerability to additive perturbations created by different types of optimization methods. The proposed attack shows that CNNs are vulnerable even to such a low-cost, low-dimensional, imperceptible attack under extremely limited conditions. Future work can analyze and explain why CNNs are vulnerable to these various types of attacks simultaneously, and accordingly derive possible countermeasures.
6 Appendix
6.1 Location of attack pixels on the image coordinates
6.2 Attack examples on ImageNet
6.3 Failed attack examples on ImageNet
Declarations
Acknowledgements
This research was partially supported by Collaboration Hubs for International Program (CHIRP) of SICORP, Japan Science and Technology Agency (JST), and Kyushu University Education and Research Center for Mathematical and Data Science Grant.
Funding
Collaboration Hubs for International Program (CHIRP) of SICORP, Japan Science and Technology Agency (JST) and Kyushu University Education and Research Center for Mathematical and Data Science Grant.
Availability of data and materials
The material related to this research can be publicly accessed at:
1. CIFAR10 dataset: http://www.cs.toronto.edu/~kriz/cifar.html
2. BVLC Caffe: https://github.com/BVLC/caffe
3. ImageNet: http://www.imagenet.org/
Authors’ contributions
JS contributes the most including proposing the initial research idea, conducting the experiments, and writing the manuscript. DVV and KS are responsible for suggesting possible improvements, supervising the experiments, and checking the final manuscript. All authors read and approved the final manuscript.
Competing interests
All authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
 Barreno M, Nelson B, Joseph AD, Tygar JD (2010) The security of machine learning. Mach Learn 81(2):121–148
 Barreno M, Nelson B, Sears R, Joseph AD, Tygar JD (2006) Can machine learning be secure? In: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security. ACM, Taiwan, pp 16–25
 Brest J, Greiner S, Boskovic B, Mernik M, Zumer V (2006) Self-adapting control parameters in differential evolution: a comparative study on numerical benchmark problems. IEEE Trans Evol Comput 10(6):646–657
 Carlini N, Wagner D (2018) Audio adversarial examples: targeted attacks on speech-to-text. arXiv preprint arXiv:1801.01944
 Civicioglu P, Besdok E (2013) A conceptual comparison of the cuckoo search, particle swarm optimization, differential evolution and artificial bee colony algorithms. Artif Intell Rev 39(4):315–346
 Dang H, Huang Y, Chang EC (2017) Evading classifiers by morphing in the dark. ACM
 Das S, Suganthan PN (2011) Differential evolution: a survey of the state-of-the-art. IEEE Trans Evol Comput 15(1):4–31
 Goodfellow IJ, Shlens J, Szegedy C (2014) Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572
 Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto
 Liang B, Li H, Su M, Bian P, Li X, Shi W (2017) Deep text classification can be fooled. arXiv preprint arXiv:1704.08006
 Lin M, Chen Q, Yan S (2013) Network in network. arXiv preprint arXiv:1312.4400
 Moosavi-Dezfooli SM, Fawzi A, Frossard P (2016) DeepFool: a simple and accurate method to fool deep neural networks
 Moosavi-Dezfooli SM, et al (2017) Analysis of universal adversarial perturbations. arXiv preprint arXiv:1705.09554
 Moosavi-Dezfooli SM, Fawzi A, Fawzi O, Frossard P (2017) Universal adversarial perturbations
 Narodytska N, Kasiviswanathan S (2016) Simple black-box adversarial attacks on deep neural networks. arXiv preprint arXiv:1612.06299
 Nguyen A, Yosinski J, Clune J (2015) Deep neural networks are easily fooled: high confidence predictions for unrecognizable images
 Papernot N, McDaniel P, Goodfellow I, Jha S, Celik ZB, Swami A (2016) Practical black-box attacks against machine learning. arXiv preprint arXiv:1602.02697
 Papernot N, McDaniel P, Jha S, Fredrikson M, Celik ZB, Swami A (2015) The limitations of deep learning in adversarial settings. arXiv preprint arXiv:1511.07528
 Simonyan K, Vedaldi A, Zisserman A (2013) Deep inside convolutional networks: visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034
 Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
 Springenberg J, Dosovitskiy A, Brox T, Riedmiller M (2014) Striving for simplicity: the all convolutional net. arXiv preprint arXiv:1412.6806
 Storn R, Price K (1997) Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. J Glob Optim 11(4):341–359
 Su J, Vargas DV, Sakurai K (2017) One pixel attack for fooling deep neural networks. arXiv preprint arXiv:1710.08864
 Szegedy C, et al (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199
 Taigman Y, Yang M, Ranzato M, Wolf L (2014) DeepFace: closing the gap to human-level performance in face verification
 Vargas DV, Murata J (2017) Spectrum-diverse neuroevolution with unified neural models. IEEE Trans Neural Netw Learn Syst 28(8):1759–1773
 Vargas DV, Murata J, Takano H, Delbem ACB (2015) General subpopulation framework and taming the conflict inside populations. Evol Comput 23(1):1–36
 Wei D, Zhou B, Torralba A, Freeman W (2015) Understanding intra-class knowledge inside CNN. arXiv preprint arXiv:1507.02379