## Original Article

International Journal of Fuzzy Logic and Intelligent Systems 2022; 22(1): 106-115

Published online March 25, 2022

https://doi.org/10.5391/IJFIS.2022.22.1.106

© The Korean Institute of Intelligent Systems

## A Neural Network-Based Approach to Multiple Wheat Disease Recognition

Igor V. Arinichev1, Sergey V. Polyanskikh2, Irina V. Arinicheva3, Galina V. Volkova4, and Irina P. Matveeva4

1Department of Theoretical Economy, Kuban State University, Krasnodar, Russia
2Plarium Inc., Krasnodar, Russia
3Department of Higher Mathematics, Kuban State Agrarian University named after I.T. Trubilin, Krasnodar, Russia
4Laboratory of Cereal Crops Immunity to Fungal Diseases, All-Russian Research Institute of Biological Plant Protection, Krasnodar, Russia

Correspondence to:
Igor V. Arinichev (iarinichev@gmail.com)

Received: November 17, 2021; Revised: January 13, 2022; Accepted: February 4, 2022

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

In this paper, modern computer vision methods are proposed for detecting multiple diseases in wheat leaves. The authors demonstrate that modern neural network architectures can reliably detect and classify diseases such as yellow spot, yellow rust, and brown rust, even when several diseases are simultaneously present on the plant. For certain classes of diseases, the main multilabel metrics (accuracy, micro-/macro-precision, recall, and F1-score) range from 0.95 to 0.99. This indicates that several diseases on a leaf can be recognized with an accuracy equal to that of an expert phytopathologist. The neural network architecture used is lightweight, which makes offline use on mobile devices possible.

Keywords: CNN, Multilabel classification, Wheat diseases, Computer vision

Wheat is the world’s most widely cultivated crop, and its global trade is greater than that of all other crops combined [1]. Wheat is susceptible to a complex of harmful diseases, among which the most economically significant are brown and yellow rust (Puccinia triticina Erikss., Puccinia striiformis West. f. sp. tritici Erikss., et Henn) and pyrenophorosis (Pyrenophora tritici-repentis (Died.) Drechsler) [2]. Such diseases are harmful and widespread globally, particularly in the southern region of Russia [3,4]. According to various studies, the loss of winter wheat yield during the epiphytotic period can reach 50%–70% [5,6]. When disease symptoms are detected, fungicide treatment, either of the entire field or locally, is widely practiced in wheat disease control. At the same time, diseases in the early stages are often misidentified, and the complex of fungicides is incorrectly selected. On the one hand, treating the entire field leads to high costs and is unjustified because, at least in the initial stages, infection is concentrated mainly around the original foci. On the other hand, uniform spraying, for example, with chemical fungicides, increases the likelihood of contamination of groundwater and the environment, as well as the appearance of toxic residues in agricultural products.

Traditionally, the identification of fungal wheat diseases has been based on the visual detection of pathogen-induced symptoms or on pathogen identification in the laboratory. A visual assessment, particularly in the early stages of disease development or when a plant is simultaneously affected by several diseases, is difficult to conduct and requires highly qualified professionals, who are not always available, especially for small farms. In addition, nutritional deficiencies and pests can cause symptoms similar to those of certain diseases [7]. Laboratory pathogen identification is a laborious and time-consuming process. Currently, the most urgent task is the timely and accurate identification of pathogens in wheat before the economic threshold of harmfulness is reached, so that a decision on protective measures can be made quickly. This task can be addressed using computer vision methods.

However, the detection of plant diseases from images is challenging [7]. Crops are complex organisms that constantly evolve during the growing season. In recent years, with the development of deep learning and convolutional neural networks (CNN), the task of image classification has advanced significantly in many fields, including agriculture. In some cases, the accuracy of classification using such models surpasses that of humans [8].

Note that the successful application of modern state-of-the-art computer vision models requires tuning only a small number of hyperparameters (e.g., learning rate, optimizer, and augmentations) rather than the entire architecture. The main strength of such models is that they provide a general approach to an entire class of problems.

Nevertheless, despite the increasing number of published research findings, the identification of multiple diseases in wheat leaves using computer vision methods remains poorly studied [9,10]. Such a task is somewhat more complicated in its formulation than the classical task of disease classification. If each image contains only one possible disease, the model learns more easily but also overfits more easily: during training, it can begin to pay attention to surroundings that are unrelated to the disease. In particular, if photographs of different diseases are taken at different times, there is a high risk that different classes of images will be captured under different conditions (e.g., lighting and photographic equipment). Moreover, the presence of several diseases in one sample can, in this case, confuse the model, which negatively affects its confidence in its answer and the final quality of the classification. When the problem is formulated from the outset as a multilabel classification of diseases, the model must learn from the very beginning to distinguish between different diseases within the same image. Although this may turn out to be more difficult to implement, it has a much greater applied value.

In our research, we aim to fill this gap and propose an accurate solution for determining the type of disease on a wheat leaf when there can be several diseases at once. The model built should be suitable for future use on mobile devices, both online and offline.

To date, a significant number of articles have been published on the application of deep learning methods to the automatic detection and classification of crop diseases based on digital images. Mohanty et al. [11] were among the first to use CNNs to detect plant diseases using the PlantVillage dataset [12]. The authors tested two standard architectures, AlexNet and GoogleNet [13], and examined the effect of transfer learning for classification, achieving a classification accuracy of 31% on images taken under conditions different from those used for training. In addition, Atabay [14] used an occlusion technique. This technique consists of sliding black windows over the input image and analyzing the change in the output probability. This method allows the creation of a heatmap that highlights the pixels that are most sensitive to a specific class. The paper also noted that the class was sometimes assigned because of pixels belonging to the background, indicating that the learned features were not only those associated with disease symptoms. The highest accuracy was 97.53%. In addition, Brahimi et al. [15] used a CNN to classify tomato leaf disease. They compared three learning strategies using six CNN architectures (AlexNet, DenseNet-169, Inception V3, ResNet-34, SqueezeNet-1.1, and VGG13). This study used the PlantVillage dataset supplemented with the Stanford Background dataset [16].

Note that in this and the other studies mentioned, the multiclass classification problem was considered when only one type of disease was present in an image. Here, the novelty lies not only in the use of modern neural network architectures for identifying wheat diseases but also in the multilabel classification approach, when several diseases can be present in a single image.

Arinichev et al. [17] compared the four most successful and compact CNN architectures, GoogleNet, ResNet-18, SqueezeNet-1.0, and DenseNet-121. In the dataset used for the analysis, the disease could be detected with an accuracy of at least 95%. That study used a multiclass approach, which is almost always simpler but less applicable in practice because one plant may be affected by several diseases at once. We aimed to fill this gap in our research. In another study, Polyanskikh et al. [18] considered a new approach based on the use of autoencoders, i.e., special neural network architectures that detect areas on rice leaves affected by a particular disease. Their study demonstrated that an autoencoder can be trained in such a way that it removes the affected areas from the image, which in some cases makes it possible to clearly highlight the affected area by comparing the resulting image with the original one. Modern architectures of convolutional autoencoders provide an acceptable visual detection quality.

DeChant et al. [19] proposed a three-step process based on the training of several CNNs to generate heatmaps to determine the affected areas on the analyzed image. The authors used a dataset collected under uncontrolled conditions, without focusing on a specific plant organ. This dataset is highly specialized for identifying maize plants infected by northern leaf blight. A study by Fuentes et al. [20] aimed to address the issue of false positives and class imbalances by implementing a refinement filter bank framework. The system includes three main blocks. First, in a block of primary diagnostics, bounding boxes containing the affected areas of the leaf are generated. The most promising boxes are then passed to the input of the second block for secondary diagnostics. Finally, an integration unit combines the information from the primary and secondary units while retaining the true positive samples and eliminating the false positives that were misclassified in the first unit. The accuracy metric was 85.98%. Liu et al. [21] proposed an active-learning algorithm based on weighted incremental dictionary learning. This algorithm effectively trains a deep network by actively selecting training samples at each iteration. Its efficiency has been demonstrated in the classification of hyperspectral images. The accuracy was 97.62%.

Oppenheim and Shani [22] proposed a CNN-based algorithm that classifies potato diseases (four classes of diseases and one class of healthy plants). The sample images used in this study, containing potatoes of various shapes, sizes, and diseases, were collected, classified, and labeled by hand by experts. Picon et al. [23] addressed the early identification of three wheat diseases based on CNNs. Using a mobile app, technicians tested the model for two of the three diseases studied, i.e., septoria (77 images) and rust (54 images), to which they added 27 images of healthy plants. The balanced accuracy was 98% for septoria and 96% for rust. Such high rates can be attributed to the fact that the test images were obtained at the end of the season, when symptoms were most noticeable, as well as to good training practices. Ramcharan et al. [24] generated two datasets: an original cassava dataset with intact leaves and a leaflet cassava dataset in which the lesions were manually cropped. The resulting accuracy for the second set was slightly higher for three of the five diseases studied. Thus, trimming the leaves did not have a significant impact despite the fact that the volume of the “leaflet cassava dataset” was seven times the size of the original. Sladojevic et al. [25] used their own dataset collected from the Internet, which included 13 classes corresponding to various diseases. They added two additional classes for a more accurate classification. The first class corresponds to healthy leaves, and the second to background images taken from the publicly available Stanford Background dataset [16]. The classification accuracy was greater than 96%.

In addition, Too et al. [26] conducted an empirical comparison of deep-learning architectures. The evaluated architectures included VGG16; Inception V4; ResNet with 50, 101, and 152 layers; and DenseNet with 121 layers. For this experiment, the authors used data from 38 different classes, including images of diseased and healthy leaves of 14 plants from PlantVillage [12]. The highest accuracy was 99.75%. In the study of Wang et al. [27], small neural networks of different depths were trained from scratch based on several samples, after which four modern architectures were fine-tuned on the training results, i.e., VGG16, VGG19, Inception V3, and ResNet-50. A comparison of the results shows that fine-tuning pre-trained neural networks can significantly improve performance on small samples. The best architecture in this case was VGG16 with an accuracy of 90.4%.

Moreover, Zhang et al. [28] presented a three-channel convolutional neural network (TCCNN) with the goal of improving the use of color information. In the model, each channel of the TCCNN is fed by one of the three color components of the RGB image of the diseased leaf, and the convolutional feature in each CNN is learned and transmitted to the next convolutional layer and pooling layer in turn. The features are then fused through a fully connected fusion layer to obtain a deep-level disease recognition feature vector. Finally, the softmax layer uses this feature vector to classify the input images into predefined classes. The proposed method makes it possible to automatically recognize representative features from complex images of diseased leaves and effectively recognize vegetable diseases. Zhang et al. [29] used tomato leaves to identify diseases through transfer learning. AlexNet, GoogLeNet, and ResNet were used as the backbone of the CNN. The best combined model was utilized to change the structure with the aim of exploring the performance of full training and fine-tuning of the CNN. The highest accuracy of 97.28% for detecting tomato leaf disease was achieved using the ResNet model with stochastic gradient descent (SGD), a batch size of 16, and 4,992 iterations.

### 3.1 Dataset

Deep-learning approaches typically require large datasets. Data acquisition is a time-consuming and costly task that is performed manually with the assistance of highly qualified specialists. Several authors note the exceptional importance of collecting and labeling the training sample, mentioning that problems can be caused by shadows, noise, and an insufficient severity of the disease [30–32].

It is impossible to train a “one size fits all” neural network at the current level of technology development. It is necessary to have a clear idea of where and how the trained neural network will be used in the future. This can be a completely controlled environment in which the plant sample is placed against a uniform background under controlled lighting. It can also be an uncontrolled environment with an emphasis on a specific plant organ, where the images have a complex background but the object of interest occupies most of the image area. Another possible option is a completely uncontrolled environment under field conditions. Therefore, before data collection starts, it is necessary to fix the general photographic conditions, e.g., the angle, influence of shadows and background, and ranges of brightness and contrast. Otherwise, no algorithm can guarantee that the accuracy achieved on validation during training will be reproduced in use.

Most of the current research on the automatic detection of plant diseases has been conducted on publicly available datasets. For example, in a review article by Boulent et al. [7] on the application of CNNs to the detection of plant diseases, 11 of 19 studies were shown to be conducted on publicly available datasets. The following datasets can currently be used to test solutions to this type of problem: PlantVillage currently containing 87,848 images of healthy and diseased crop plants [12], the Stanford Background dataset with 3,557 images [16], and a dataset with 5,447 images of rice diseases [32].

However, the public dataset approach has several disadvantages. First, as practice shows, even the same crop can look somewhat different in different parts of the world. The same applies to crop diseases. Second, the photographic conditions may differ significantly from dataset to dataset and may be completely unsuitable for the case under study (e.g., background, photography angle, and lighting). We consider public data useful for a proof of concept but not sufficient for presenting the final solution or deploying it in production.

In this study, we used a dataset specially collected in the summer and fall of 2021 from the infectious nurseries of the All-Russian Research Institute of Biological Plant Protection. We used artificially created infectious backgrounds of rust and yellow spot pathogens to collect an experimental representative sample for the indicated classes of diseases at different stages of wheat vegetation. To inoculate the plants and obtain an infectious background of the above-described wheat diseases, a mixture of urediniospores with talc at a ratio of 1:100 was used under a load of 5 mg spores/m² for rust [33]. For pyrenophorosis, a water-conidial suspension with a concentration of 5 × 10³ spores/mL (load of 70–100 mL/m²) [34] was used. Records on the development of the diseases were made starting from the moment of the initial manifestation, during the subsequent manifestations, and up to the phase of milky-wax ripeness of the grain, with an interval of 10–12 days. The main phytopathological criteria for the resistance of varieties to rust were the type of plant reaction in points (Mains and Jackson scale, Gassner and Straib scale) [35] and the degree of damage to plants in percentage (scale of Peterson et al. [36]). For yellow spot, the degree of damage in percentage was determined according to the Saari and Prescott scale [37]. The ranking of varieties for pathogen resistance was determined according to the CIMMYT scale [38].

As a result, 7,640 images were obtained and divided into three classes of diseases and their various intersections (Figure 1):

• brown rust (427 images),

• yellow spot (3,659 images),

• yellow rust (1,283 images),

• yellow rust+yellow spot (1,349 images),

• yellow spot+brown rust (335 images),

• yellow rust+brown rust (473 images),

• yellow spot+yellow rust + brown rust (114 images).

The dataset structure shows that more than one disease can be present on a single wheat leaf at the same time. Therefore, we deal with a multilabel classification problem, in contrast to previous studies [17,18], where the authors considered strict multiclass classification. Multilabel classification tasks are traditionally somewhat more complicated than multiclass ones, which can already be seen in the above distribution of images across classes: photographs in which two or three diseases were present at once were far fewer than photographs with a single disease. Nevertheless, the neural network is required to reliably determine which diseases are present in a photograph, regardless of their number.
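As an illustration of this multilabel structure, the following minimal sketch encodes each annotated combination as a multi-hot vector and counts how many images contain each disease. The counts are taken from the list above; the ordering of diseases in `DISEASES` and all function names are assumptions made for the example, not the authors' code.

```python
# Disease order is an illustrative choice, not taken from the paper.
DISEASES = ("yellow rust", "brown rust", "yellow spot")

# Image counts per annotated combination, as listed in the text.
COMBINATION_COUNTS = {
    ("brown rust",): 427,
    ("yellow spot",): 3659,
    ("yellow rust",): 1283,
    ("yellow rust", "yellow spot"): 1349,
    ("yellow spot", "brown rust"): 335,
    ("yellow rust", "brown rust"): 473,
    ("yellow spot", "yellow rust", "brown rust"): 114,
}


def multi_hot(present):
    """Encode the diseases present in an image as a 0/1 vector ordered like DISEASES."""
    return [1 if disease in present else 0 for disease in DISEASES]


total_images = sum(COMBINATION_COUNTS.values())

# Number of images in which each disease appears, alone or together with others.
per_disease = {disease: 0 for disease in DISEASES}
for combination, count in COMBINATION_COUNTS.items():
    for disease in combination:
        per_disease[disease] += count

print(total_images)
print(per_disease)
```

Under this encoding, an image with yellow rust and yellow spot gets the target `[1, 0, 1]`, so one network output per disease is enough to represent every combination.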

### 3.2 Preprocessing

As noted earlier, the collection of a dataset must first be oriented toward the end user of the model. However, even when monitoring the quality and conditions of images when both collecting data and using a trained model, a number of fundamental problems can arise. These problems can significantly degrade the accuracy of the model. Among them are the following:

• an insufficient sample size;

• a lack of natural invariance of the predictions to rotations/reflections of an image;

• instability of the predictions, when even the slightest noise can affect the result; and

• an overfitting effect when the quality of the predictions on new images is significantly lower than that on the training images.

All of these problems can be addressed to a certain extent by well-designed preprocessing and augmentation of the original images. In this study, we used the following preprocessing stages for the original dataset:

• 45° angle of rotation;

• flipping an image with respect to the main axes;

• random rotations at small angles; and

• image RGB normalization.

As a result, the size of the training sample increases, improving the stability of the predictions and ensuring their invariance to image rotations.
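The flipping and normalization stages can be sketched in plain Python as follows. This is only an illustration on images stored as nested lists of `(r, g, b)` tuples; the function names and the mean/std values used in the example are assumptions, and the rotation stages are omitted because they require interpolation that an image library would normally provide.

```python
def hflip(img):
    """Mirror an image left to right; img is a list of rows of (r, g, b) pixels."""
    return [list(reversed(row)) for row in img]


def vflip(img):
    """Mirror an image top to bottom."""
    return [list(row) for row in reversed(img)]


def flip_variants(img):
    """Return the original image plus its three axis-flipped versions."""
    return [img, hflip(img), vflip(img), hflip(vflip(img))]


def normalize_rgb(img, mean, std):
    """Scale 8-bit channels to [0, 1], then standardize each channel.

    mean and std are per-channel values chosen by the user (assumed here).
    """
    return [
        [tuple((c / 255.0 - m) / s for c, m, s in zip(px, mean, std)) for px in row]
        for row in img
    ]


# A 2x2 toy image: red, green / blue, white.
toy = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
augmented = flip_variants(toy)
normalized = normalize_rgb(toy, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
```

Each flipped copy keeps the same disease labels as the original image, which is why such transformations enlarge the training sample without additional annotation.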

### 3.3 Model

As noted above, the intersection of disease classes within a single object determines the use of a multilabel model. Multilabel classification assigns multiple target labels to each object in the sample, whereas strict multiclass classification assumes that each object corresponds to exactly one label. For the case of non-exclusive classes, the output activation function is a sigmoid, and each output neuron produces a value between 0 and 1, indicating the probability that the corresponding class is present (Figure 2).
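The difference between the two output layers can be sketched as follows. The sigmoid is applied to each output independently, so several probabilities can be high at once, whereas softmax forces the outputs to sum to 1. The decision threshold of 0.5, the disease ordering, and the example logits are assumptions for illustration only.

```python
import math

# Assumed output order; the paper does not specify one.
DISEASES = ("yellow rust", "brown rust", "yellow spot")


def sigmoid(z):
    """Logistic function mapping a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_labels(logits, threshold=0.5):
    """Return every disease whose independent sigmoid probability reaches the threshold."""
    probabilities = [sigmoid(z) for z in logits]
    return [d for d, p in zip(DISEASES, probabilities) if p >= threshold]


def softmax(logits):
    """Multiclass alternative: probabilities compete and always sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [2.0, -3.0, 1.5]
print(predict_labels(example_logits))
print(softmax(example_logits))
```

With these example scores, the sigmoid outputs flag both yellow rust and yellow spot, while the softmax outputs would have to split the probability mass between them.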

As a loss function, we used cross-entropy, which is traditionally used for multilabel and multiclass problems [39,40]. This loss function is the most natural choice for classification problems because it has a clear probabilistic interpretation and, owing to the logarithm, heavily penalizes the model for incorrect answers:

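$$L(w) = -\frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{m} y_{ik}\ln p_{ik},$$

where $y_{ik}$ is the ground-truth answer (1 or 0) for object $i$ and class $k$, $p_{ik}$ is the corresponding sigmoid model prediction, which depends on the model weights $w$, $n$ is the number of objects, and $m$ is the number of classes.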
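A small numerical sketch of the loss exactly as written above may help make it concrete. The function name and the `eps` clamp (added only to avoid taking the logarithm of zero) are assumptions. Note that the binary cross-entropy commonly used for multilabel training, for example `torch.nn.BCELoss` in PyTorch, also contains a $(1 - y)\ln(1 - p)$ term; whether the authors' implementation included it is not stated in the text.

```python
import math


def multilabel_log_loss(y_true, p_pred, eps=1e-12):
    """Average over objects of -sum_k y * ln(p), following the formula in the text.

    y_true: list of 0/1 label rows, one row per object.
    p_pred: list of predicted probability rows with the same shape.
    eps:    assumed lower clamp on probabilities to keep the logarithm finite.
    """
    n = len(y_true)
    total = 0.0
    for y_row, p_row in zip(y_true, p_pred):
        for y, p in zip(y_row, p_row):
            total += y * math.log(max(p, eps))
    return -total / n


labels = [[1, 0, 1], [0, 1, 0]]
confident = [[0.9, 0.2, 0.8], [0.1, 0.7, 0.3]]
hesitant = [[0.6, 0.2, 0.5], [0.1, 0.4, 0.3]]

print(multilabel_log_loss(labels, confident))
print(multilabel_log_loss(labels, hesitant))
```

The second call returns a larger value because the probabilities assigned to the true labels are lower, which is the penalizing effect of the logarithm described above.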

As the neural network architecture, we used GoogleNet [13], which won the ILSVRC 2014 image classification challenge. It provides a significant reduction in the number of errors compared with the previous winners AlexNet (ILSVRC 2012 winner) and ZF-Net (ILSVRC 2013 winner), as well as with VGG (second place in 2014). In [17], the authors compared four modern compact architectures, GoogleNet, ResNet-18, SqueezeNet-1.0, and DenseNet-121, to solve the problem of a strict multiclass classification of rice diseases. In this study, we focused on a more practical multilabel classification approach. These neural network architectures are also capable of solving this problem. The two best architectures were GoogleNet and DenseNet, which showed similar results in terms of quality and convergence rate. Both architectures are sufficiently lightweight for offline use on mobile devices. GoogleNet is slightly heavier than DenseNet but performs better for multilabel classification. The final model has 5.6 million parameters and takes approximately 22.6 MB in serialized form. We emphasize that a well-chosen general-purpose architecture, such as GoogleNet, does not require further modification but only the adjustment of a number of hyperparameters and adequate training.
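The reported model size is consistent with a simple back-of-the-envelope estimate. The sketch below assumes that each parameter is stored as a 32-bit float (4 bytes), which is a common default but is not stated in the text.

```python
PARAMETERS = 5_600_000    # parameter count reported in the text (rounded)
BYTES_PER_PARAMETER = 4   # assumption: weights stored as float32

size_bytes = PARAMETERS * BYTES_PER_PARAMETER
size_mb = size_bytes / 1_000_000      # decimal megabytes
size_mib = size_bytes / (1024 ** 2)   # binary mebibytes

print(f"{size_mb:.1f} MB, {size_mib:.1f} MiB")
```

The estimate of about 22.4 MB is close to the reported 22.6 MB; the small difference could plausibly come from rounding of the parameter count and from serialization overhead.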

For the implementation and training of the neural network, we used the PyTorch v1.6.0 framework, based on the Torch library, which provides all the functions necessary for deep learning. PyTorch provides two main high-level features: tensor computing with GPU acceleration and automatic differentiation. The neural network was trained on a desktop computer with an Intel Core i7 CPU and a 4-GB Nvidia GeForce GTX 960 GPU. The training lasted approximately 12 hours. As the optimizer, we used the Adam algorithm with default settings, which has proven effective for computer vision tasks.

### 4. Results and Discussion

Evaluating multilabel models is also traditionally more difficult than evaluating multiclass models. In the case of multiclass classification, all basic metrics are based on the confusion matrix, which provides a clear idea of the model quality [17,18]. In our case, more than one disease can be detected in one photograph, which means that a single confusion matrix cannot be adequately built. Instead, a set of accuracy, precision, recall, and F1 metrics is used, averaged either over the classes (macro-averaging) or over the objects in the classes (micro-averaging). A detailed description of these metrics can be found in [40].

Table 1 presents the first group of metrics, which are the most easily interpreted, i.e., accuracy-based metrics. The first three entries show the proportion of images with the indicated disease that were correctly recognized. The total accuracy is the proportion of images in which the model correctly recognizes all diseases. The latter metric is quite strict because small deviations from the ideal classification are inevitable and acceptable; however, even in this case, 90% correct disease recognition is an impressive result. At the same time, the share of individual correctly classified diseases is even higher, from 94% to 99%, which is no lower in quality than the result achieved by an expert phytopathologist.
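The difference between per-disease accuracy and the stricter total accuracy can be sketched as follows. The per-label definition used here (agreement between prediction and ground truth over all images) is an assumption and may differ in detail from the authors' computation; the toy labels are invented for illustration.

```python
def per_label_accuracy(y_true, y_pred):
    """Fraction of images on which each label is predicted correctly."""
    n = len(y_true)
    m = len(y_true[0])
    return [
        sum(1 for t, p in zip(y_true, y_pred) if t[k] == p[k]) / n
        for k in range(m)
    ]


def exact_match_accuracy(y_true, y_pred):
    """Fraction of images on which every label is predicted correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy example with three labels and four images (invented data).
truth = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
preds = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]

print(per_label_accuracy(truth, preds))
print(exact_match_accuracy(truth, preds))
```

A single wrong label makes the whole image count as an error for the total accuracy, so the total can never exceed the lowest per-label accuracy.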

A held-out sample comprising 20% of the full sample was used for testing. We believe that this amount of data is sufficient for a correct assessment of the model quality because of the specifics of the task, i.e., the same disease looks more or less the same on a leaf, and only the area and number of lesions change. In addition, well-chosen augmentations provide an almost unlimited expansion of the dataset in this case and statistically stabilize the final results.
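A 20% hold-out split of this kind can be sketched with the standard library alone. The plain random split, the seed, and the function name are assumptions; the text does not say whether the authors stratified the split by disease combination.

```python
import random


def holdout_split(n_items, test_fraction=0.2, seed=0):
    """Shuffle item indices reproducibly and set aside a fraction for testing."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = round(n_items * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = holdout_split(7640)
print(len(train_idx), len(test_idx))
```

For 7,640 images this sets aside 1,528 images, which is close to the 1,525 test samples reported in Table 1.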

Table 2 presents the global model metrics, i.e., the precision, recall, and F1-score in their micro- and macro-averaged variants. This group of metrics reflects the quality of the model on all data as a whole and not on each class separately. For example, the precision and recall for a certain class are calculated as follows:

$$\text{Precision} = \frac{\text{TruePositives}}{\text{TruePositives} + \text{FalsePositives}},$$

and

$$\text{Recall} = \frac{\text{TruePositives}}{\text{TruePositives} + \text{FalseNegatives}}.$$

Accordingly, the F1 metric is the harmonic mean of precision and recall. In the micro-case, we average these numbers and weight them by class frequencies. In the macro-case, we simply take their arithmetic mean over the classes. Precision indicates how rarely the model is wrong when it predicts that a disease is present. Recall indicates how rarely the model misses a disease that is actually present in the image. Table 2 shows that this group of metrics also indicates a high-quality model.
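The two averaging schemes can be sketched as follows. The sketch uses the common definitions: micro-averaging pools the true-positive, false-positive, and false-negative counts over all classes before computing the metrics, and macro-averaging takes the unweighted mean of the per-class metrics. Whether the authors computed their values in exactly this way is an assumption, and the toy labels are invented.

```python
def label_counts(y_true, y_pred, k):
    """True positives, false positives, and false negatives for label k."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 0 and p[k] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 0)
    return tp, fp, fn


def prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def micro_macro(y_true, y_pred):
    """Return ((P, R, F1) micro-averaged, (P, R, F1) macro-averaged)."""
    m = len(y_true[0])
    counts = [label_counts(y_true, y_pred, k) for k in range(m)]
    micro = prf(*(sum(c[i] for c in counts) for i in range(3)))
    per_class = [prf(*c) for c in counts]
    macro = tuple(sum(s[i] for s in per_class) / m for i in range(3))
    return micro, macro


# Toy example with three labels and four images (invented data).
truth = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
preds = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]
micro, macro = micro_macro(truth, preds)
print(micro)
print(macro)
```

Micro-averaging is dominated by the frequent classes, while macro-averaging gives each disease the same weight, so comparing the two shows whether rare classes such as brown rust are handled as well as common ones.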

In this study, we considered an important applied problem: detecting multiple diseases on a wheat leaf, namely yellow spot, yellow rust, and brown rust. We demonstrated that the GoogleNet model can be used to identify multiple diseases in wheat leaves. The main metrics were at the level of 93%–99%, which is consistent with the accuracy provided by a qualified expert phytopathologist. At the same time, the model itself, with 5.6 million parameters, takes only approximately 22.6 MB of memory, which makes it suitable for use on mobile devices. This is critical for the rapid, accurate, and automated detection of multiple possible diseases in crops such as wheat.

The research was carried out with the financial support of the Kuban Science Foundation under the framework of scientific project (No. IFR-20.1/75).

Fig. 1. Images from the dataset: (a) yellow spot, (b) brown rust, (c) yellow rust, (d) yellow rust+yellow spot.

Fig. 2. Sigmoid function for non-exclusive classes.

Table 1. Accuracy metrics for the final model.

| | Accuracy | Number of samples |
|---|---|---|
| Yellow rust | 0.9463 | 446 |
| Brown rust | 0.9948 | 174 |
| Yellow spot | 0.9536 | 908 |
| Total | 0.9012 | 1,525 |

Table 2. Global metrics used for the final model.

| | Precision | Recall | F1 score |
|---|---|---|---|
| Micro | 0.9813 | 0.9310 | 0.9558 |
| Macro | 0.9856 | 0.9146 | 0.9485 |

1. Curtis, BC (2002). Wheat in the world. [Online]. Available: http://www.fao.org/3/y4011e/y4011e04.htm
2. Kokhmetova, A, Atishova, M, Sapakhova, Z, Kremneva, O, and Volkova, G (2017). Evaluation of wheat cultivars growing in Kazakhstan and Russia for resistance to tan spot. Journal of Plant Pathology. 99, 161-167.
3. Zhukovskii, AG, Il’yuk, AG, Buga, SF, Sklimenok, NA, Kremneva, OY, Volkova, GV, and Gudoshnikova, ES (2012). Septoria spot (Septoria spp.) and yellow leaf spot (Pyrenophora tritici-repentis) affection of winter wheat cultivars in Belarus and North Caucasian Region of Russia. Scientific Journal of Kuban State Agrarian University. 80. article no. 19
4. Volkova, GV, Shulyakovskaya, LN, Kudinova, OA, and Matveeva, IP (2018). Wheat yellow rust in the Kuban. Plant Protection and Quarantine. https://doi.org/10.25992/bpp.2018.72.20689
5. Kremneva, O, and Volkova, G (2018). Diagnostics of the Tsn1 Pyrenophora tritici-repentis gene in wheat varieties and assessment of their resistance to pathogen races. Proceedings of the Kuban State Agrarian University, 206-210.
6. Volkova, GV, Vaganova, OF, and Kudinova, OA (2018). Efficiency of crops of variety mixtures of winter wheat against the agent of leaf rust. Achievements of Science and Technology in Agro-Industrial Complex. 32, 14-16. https://doi.org/10.24411/0235-2451-2018-10703
7. Boulent, J, Foucher, S, Theau, J, and St-Charles, PL (2019). Convolutional neural networks for the automatic identification of plant diseases. Frontiers in Plant Science. 10. article no. 941
8. He, K, Zhang, X, Ren, S, and Sun, J (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, pp. 770-778. https://doi.org/10.1109/CVPR.2016.90
9. Subhadra, K, and Kavitha, N (2020). Multi label leaf disease classification using enhanced deep convolutional neural network. Journal of Advanced Research in Dynamical and Control Systems. 12, 97-108. https://doi.org/10.5373/JARDCS/V12SP4/20201470
10. Ji, M, Zhang, K, Wu, Q, and Deng, Z (2020). Multi-label learning for crop leaf diseases recognition and severity estimation based on convolutional neural networks. Soft Computing. 24, 15327-15340. https://doi.org/10.1007/s00500-020-04866-z
11. Mohanty, SP, Hughes, DP, and Salathe, M (2016). Using deep learning for image-based plant disease detection. Frontiers in Plant Science. 7. article no. 1419
12. Hughes, D, and Salathe, M (2015). An open access repository of images on plant health to enable the development of mobile disease diagnostics. [Online]. Available: https://arxiv.org/abs/1511.08060
13. Krizhevsky, A, Sutskever, I, and Hinton, GE (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems. 25, 1106-1114.
14. Atabay, HA (2017). Deep residual learning for tomato plant leaf disease identification. Journal of Theoretical & Applied Information Technology. 95, 6800-6808.
15. Brahimi, M, Arsenovic, M, Laraba, S, Sladojevic, S, Boukhalfa, K, and Moussaoui, A (2018). Deep learning for plant diseases: detection and saliency map visualization. Human and Machine Learning. Cham, Switzerland: Springer, pp. 93-117
16. Gould, S, Fulton, R, and Koller, D (2009). Decomposing a scene into geometric and semantically consistent regions. Proceedings of 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan, pp. 1-8. https://doi.org/10.1109/ICCV.2009.5459211
17. Arinichev, IV, Polyanskikh, SV, Volkova, GV, and Arinicheva, IV (2021). Rice fungal diseases recognition using modern computer vision techniques. International Journal of Fuzzy Logic and Intelligent Systems. 21, 1-11. https://doi.org/10.5391/IJFIS.2021.21.1.1
18. Polyanskikh, S, Arinicheva, I, Arinichev, I, and Volkova, G (2021). Autoencoders for semantic segmentation of rice fungal diseases. 19, 574-585. https://doi.org/10.15159/ar.21.019
19. DeChant, C, Wiesner-Hanks, T, Chen, S, Stewart, EL, Yosinski, J, Gore, MA, Nelson, RJ, and Lipson, H (2017). Automated identification of northern leaf blight-infected maize plants from field imagery using deep learning. Phytopathology. 107, 1426-1432. https://doi.org/10.1094/PHYTO-11-16-0417-R
20. Fuentes, AF, Yoon, S, Lee, J, and Park, DS (2018). High-performance deep neural network-based tomato plant diseases and pests diagnosis system with refinement filter bank. Frontiers in Plant Science. 9. article no. 1162
21. Liu, B, Zhang, Y, He, D, and Li, Y (2017). Identification of apple leaf diseases based on deep convolutional neural networks. Symmetry. 10. article no. 11
22. Oppenheim, D, and Shani, G (2017). Potato disease classification using convolution neural networks. Advances in Animal Biosciences. 8, 244-249. https://doi.org/10.1017/S2040470017001376
23. Picon, A, Alvarez-Gila, A, Seitz, M, Ortiz-Barredo, A, Echazarra, J, and Johannes, A (2019). Deep convolutional neural networks for mobile capture device-based crop disease classification in the wild. Computers and Electronics in Agriculture. 161, 280-290. https://doi.org/10.1016/j.compag.2018.04.002
24. Ramcharan, A, Baranowski, K, McCloskey, P, Ahmed, B, Legg, J, and Hughes, DP (2017). Deep learning for image-based cassava disease detection. Frontiers in Plant Science. 8. article no. 1852
25. Sladojevic, S, Arsenovic, M, Anderla, A, Culibrk, D, and Stefanovic, D (2016). Deep neural networks based recognition of plant diseases by leaf image classification. Computational Intelligence and Neuroscience. 2016. article no. 3289801
26. Too, EC, Yujian, L, Njuki, S, and Yingchun, L (2019). A comparative study of fine-tuning deep learning models for plant disease identification. Computers and Electronics in Agriculture. 161, 272-279. https://doi.org/10.1016/j.compag.2018.03.032
27. Wang, G, Sun, Y, and Wang, J (2017). Automatic image-based plant disease severity estimation using deep learning. Computational Intelligence and Neuroscience. 2017. article no. 2917536
28. Zhang, S, Huang, W, and Zhang, C (2019). Three-channel convolutional neural networks for vegetable leaf disease recognition. Cognitive Systems Research. 53, 31-41. https://doi.org/10.1016/j.cogsys.2018.04.006
29. Zhang, K, Wu, Q, Liu, A, and Meng, X (2018). Can deep learning identify tomato leaf disease?. Advances in Multimedia. 2018. article no. 6710865
30. Jeon, WS, and Rhee, SY (2021). A restoration method of single image super resolution using improved residual learning with squeeze and excitation blocks. International Journal of Fuzzy Logic and Intelligent Systems. 21, 222-232. https://doi.org/10.5391/IJFIS.2021.21.3.222
31. Barbedo, JGA (2016). A review on the main challenges in automatic plant disease identification based on visible range images. Biosystems Engineering. 144, 52-60. https://doi.org/10.1016/j.biosystemseng.2016.01.017
32. Do, HM (2019). Rice Diseases Image Dataset: an image dataset for rice and its diseases. [Online]. Available: https://www.kaggle.com/minhhuy2810/rice-diseases-image-dataset
33. Anpilogova, LK, and Volkova, GV (2000). Methods for Creating Artificial Infectious Backgrounds and Assessing Wheat Cultivars for Resistance to Harmful Diseases (Fusarium Ear Blight, Rust, Powdery Mildew). Krasnodar, Russia: Russian Agricultural Academy
34. Kremneva, O, Andronova, A, and Volkova, G (2009). The Pathogens of Wheat Leaf Spots (Pyrenophorosis and Septoria), the Study of their Populations by Morphological and Cultural Characteristics and Virulence. St. Petersburg, Russia: XXX
35. Roelfs, AP, Singh, RP, and Saari, EE (1992). Rust Diseases of Wheat: Concepts and Methods of Disease Management. Mexico City, Mexico: CIMMYT
36. Peterson, RF, Campbell, AB, and Hannah, AE (1948). A diagrammatic scale for estimating rust intensity on leaves and stems of cereals. Canadian Journal of Research. 26, 496-500. https://doi.org/10.1139/cjr48c-033
37. Saari, EE, and Prescott, JM (1975). Scale for appraising the foliar intensity of wheat diseases. Plant Disease Reporter. 59, 377-380.
38. Koishibaev, M, and Sagitov, A (2012). Grain Crops Protection from Especially Dangerous Diseases. Almaty, Kazakhstan: XXX
39. Goodfellow, I, Bengio, Y, and Courville, A (2016). Deep Learning. Cambridge, MA: MIT Press
40. Bishop, CM (2006). Pattern Recognition and Machine Learning. New York, NY: Springer

Igor V. Arinichev is a candidate of Economic Sciences, associate professor of the Kuban State University. He published more than 60 publications in peer-reviewed journals or conferences, and books on application of mathematical methods in economics, agriculture and technology.

E-mail: iarinichev@gmail.com

Sergey V. Polyanskikh is a Ph.D. in Mathematics and Mechanics. He is currently a senior data scientist in Plarium Inc. He published more than 20 theoretical and applied publications in hydrodynamics and mathematics.

E-mail: spmathf@gmail.com

Irina V. Arinicheva is a doctor of Biological Sciences, and a professor of the Department of Higher Mathematics (Kuban State Agrarian University). Her specialization is mathematical modeling of biological processes. She is the author of over 150 scientific articles, monographs, inventions, educational materials for students.

E-mail: loukianova7@mail.ru

Galina V. Volkova is a doctor of Biological Science, and Head of the Laboratory of Immunity of Cereal Crops to Fungal Diseases. She published more than 200 publications in peer-reviewed journals or conferences, and books.

E-mail: galvol2011@yandex.ru

Irina P. Matveeva is a researcher at the Laboratory of Immunity of Cereal Crops to Fungal Diseases, a postgraduate student, and an author of over 10 publications.

E-mail: irina.matveeva14@yandex.ru

Keywords: CNN, Multilabel classification, Wheat diseases, Computer vision

### 1. Introduction

Wheat is the world’s most widely cultivated crop, and its global trade is greater than that of all other crops combined [1]. Wheat is susceptible to a complex of harmful diseases, among which the most economically significant are brown and yellow rust (Puccinia triticina Erikss., Puccinia striiformis West. f. sp. tritici Erikss., et Henn), and pyrenophorosis (Pyrenophora tritici-repentis (Died.) Drechsler) [2]. Such diseases are harmful and widespread globally, particularly in the southern region of Russia [3,4]. According to various studies, the loss of winter wheat yield during epiphytotic periods can reach 50%–70% [5,6]. When disease symptoms are detected, fungicide treatment, either of the entire field or locally, is widely practiced in wheat disease control. At the same time, diseases in the early stages are often misidentified, and the complex of fungicides is incorrectly selected. On the one hand, treating the entire field leads to high costs and is unjustified because, at least in the initial stages, infection is concentrated mainly around the original foci. On the other hand, uniform spraying with chemical fungicides increases the likelihood of contamination of groundwater and the environment, as well as the appearance of toxic residues in agricultural products.

Traditionally, the identification of fungal wheat diseases has been based on the visual detection of pathogen-induced symptoms or pathogen identification in the laboratory. A visual assessment, particularly in the early stages of disease development or when a plant is simultaneously affected by several diseases, is difficult to conduct and requires the participation of high-level professionals, who are not always available, especially for small farms. In addition, nutritional deficiencies and pests can cause symptoms similar to those of certain diseases [7]. Laboratory pathogen identification is a laborious and time-consuming process. Currently, the most urgent task is the timely and accurate identification of pathogens in wheat before the economic threshold of harmfulness is reached, so that a quick decision on protective measures can be made. This task can be addressed using computer vision methods.

However, the detection of plant diseases from images is challenging [7]. Crops are complex organisms that constantly evolve during the growing season. In recent years, with the development of deep learning and convolutional neural networks (CNN), the task of image classification has advanced significantly in many fields, including agriculture. In some cases, the accuracy of classification using such models surpasses that of humans [8].

Note that the successful application of modern state-of-the-art neural network models for computer vision requires tuning only a number of hyperparameters (e.g., learning rate, optimizer, and augmentations) rather than redesigning the entire architecture. The main strength of such models is that they are general enough to cover an entire class of problems.

Nevertheless, despite the increasing number of published research findings, the identification of multiple diseases on wheat leaves using computer vision methods remains poorly studied [9,10]. Such a task is more complicated in its formulation than the classical task of disease classification. If each image can contain only one disease, it is easier for the model to learn and even to overfit: in the learning process, the model can begin to pay attention to parts of the scene that are not related to the disease. In particular, if photographs of different diseases are taken at different times, there is a high risk that different classes of images will be captured under different conditions (e.g., lighting and photographic equipment). Moreover, the presence of several diseases in one sample can confuse such a model, which negatively affects its confidence in its answer and the final quality of the classification. When the problem is formulated from the outset as a multilabel classification of diseases, the model must learn from the very beginning to distinguish between different diseases within the same image. Although this may turn out to be more difficult to implement, it has a much greater application value.

In our research, we aim to fill this gap and propose an accurate solution for determining the type of disease on a wheat leaf when there can be several diseases at once. The model built should be suitable for future use on mobile devices, both online and offline.

### 2. Literature Review

To date, a significant number of articles have been published on the application of deep learning methods to the automatic detection and classification of crop diseases based on digital images. Mohanty et al. [11] were among the first to use CNNs to detect plant diseases using the PlantVillage dataset [12]. The authors tested two standard architectures, AlexNet and GoogleNet [13], and examined the effect of transfer learning for classification, achieving a classification accuracy of 31% on images taken under conditions different from those of the training set. In addition, Atabay [14] used an occlusion technique. This technique consists of sliding black windows over the input image and analyzing the change in the output probability. This method allows the creation of a heatmap that highlights the pixels that are most sensitive to a specific class. The paper also noted that the class was sometimes assigned because of pixels belonging to the background, indicating that the features learned were not only those associated with disease symptoms. The highest accuracy was 97.53%. In addition, Brahimi et al. [15] used a CNN to classify tomato leaf diseases. They compared three learning strategies using six CNN architectures (AlexNet, DenseNet-169, Inception V3, ResNet-34, SqueezeNet-1.1, and VGG13). This study used the PlantVillage dataset supplemented with the Stanford Background dataset [16].

Note that in this and the other studies mentioned, a multiclass classification problem was considered, in which only one type of disease is present in an image. The novelty of the present work lies not only in the use of modern neural network architectures for identifying wheat diseases but also in the multilabel classification approach, in which several diseases can be present in a single image.

Arinichev et al. [17] compared four of the most successful and compact CNN architectures: GoogleNet, ResNet-18, SqueezeNet-1.0, and DenseNet-121. On the dataset used for the analysis, the disease could be detected with an accuracy of at least 95%. That study used a multiclass approach, which is almost always simpler but less applicable in practice, because one plant may be affected by several diseases at once. We aimed to fill this gap in our research. In another study, Polyanskikh et al. [18] considered a new approach based on the use of autoencoders, i.e., special neural network architectures, to detect areas on rice leaves affected by a particular disease. Their study demonstrated that an autoencoder can be trained in such a way that it removes the affected areas from the image, which in some cases makes it possible to clearly highlight the affected area by comparing the resulting image with the original one. Modern architectures of convolutional autoencoders provide an acceptable visual detection quality.

DeChant et al. [19] proposed a three-step process based on the training of several CNNs to generate heatmaps to determine the affected areas on the analyzed image. The authors used a dataset collected under uncontrolled conditions, without focusing on a specific plant organ. This dataset is highly specialized for identifying maize plants infected by northern leaf blight. A study by Fuentes et al. [20] aimed to address the issue of false positives and class imbalances by implementing a refinement filter bank framework. The system includes three main blocks. First, in a block of primary diagnostics, bounding boxes containing the affected areas of the leaf are generated. The most promising boxes are then passed to the input of the second block for secondary diagnostics. Finally, an integration unit combines the information from the primary and secondary units while retaining the true positive samples and eliminating the false positives that were misclassified in the first unit. The accuracy metric was 85.98%. Liu et al. [21] proposed an active-learning algorithm based on weighted incremental dictionary learning. This algorithm effectively trains a deep network by actively selecting training samples at each iteration. Its efficiency has been demonstrated in the classification of hyperspectral images. The accuracy was 97.62%.

Oppenheim and Shani [22] proposed a CNN-based algorithm that classifies potato diseases (four classes of diseases and one class of healthy plants). The sample images used in this study, containing potatoes of various shapes, sizes, and diseases, were collected, classified, and labeled by hand by experts. Picon et al. [23] addressed the early identification of three wheat diseases based on CNNs. Using a mobile app, technicians tested the model for two of the three diseases studied, i.e., septoria (77 images) and rust (54 images), to which they added 27 images of healthy plants. The balanced accuracy was 98% for septoria and 96% for rust. Such high rates can be attributed to the fact that the test images were obtained at the end of the season, when symptoms were most noticeable, as well as to good training practices. Ramcharan et al. [24] generated two datasets: an original cassava dataset with intact leaves and a leaflet cassava dataset in which the lesions were manually cropped. The resulting accuracy for the second set was slightly higher for three of the five diseases studied. Thus, trimming the leaves did not have a significant impact despite the fact that the volume of the “leaflet cassava dataset” was seven times the size of the original. Sladojevic et al. [25] used their own dataset collected from the Internet, which included 13 classes corresponding to various diseases. They added two additional classes for a more accurate classification. The first class corresponds to healthy leaves, and the second to background images taken from the publicly available Stanford Background dataset [16]. The classification accuracy was greater than 96%.

In addition, Too et al. [26] conducted an empirical comparison of deep-learning architectures. The evaluated architectures included VGG16; Inception V4; ResNet with 50, 101, and 152 layers; and DenseNet with 121 layers. For this experiment, the authors used data from 38 different classes, including images of diseased and healthy leaves of 14 plants from PlantVillage [12]. The highest accuracy was 99.75%. In the study of Wang et al. [27], small neural networks of different depths were trained from scratch on several samples, after which four modern architectures, i.e., VGG16, VGG19, Inception V3, and ResNet-50, were fine-tuned. A comparison of the results shows that fine-tuning pre-trained neural networks can significantly improve performance on small samples. The best architecture in this case was VGG16 with an accuracy of 90.4%. Moreover, Zhang et al. [28] presented a three-channel convolutional neural network (TCCNN) with the goal of improving the use of color information. In the model, each channel of the TCCNN is fed by one of the three color components of the RGB image of the diseased leaf, and the convolutional feature in each CNN is learned and transmitted to the next convolutional layer and pooling layer in turn. The features are then fused through a fully connected fusion layer to obtain a deep-level disease recognition feature vector. Finally, the softmax layer uses the feature vector to classify the input images into predefined classes. The proposed method makes it possible to automatically learn representative features from complex images of diseased leaves and effectively recognize vegetable diseases. Zhang et al. [29] used tomato leaves to identify diseases through transfer learning. AlexNet, GoogLeNet, and ResNet were used as the backbone of the CNN. The best combined model was utilized to change the structure with the aim of exploring the performance of full training and fine-tuning of the CNN. The highest accuracy of 97.28% for detecting tomato leaf disease was achieved using the ResNet model with stochastic gradient descent (SGD), a batch size of 16, and 4,992 iterations.

### 3.1 Dataset

Deep-learning approaches typically require large datasets. Data acquisition is a time-consuming and costly task that is performed manually with the assistance of highly qualified specialists. Several authors note the exceptional role of the process of collecting and labeling the training sample, mentioning that problems can be caused by shadows, noise, and an insufficient severity of the disease [30–32].

It is impossible to train a neural network “to fit all sizes” at the current level of technology development. It is necessary to have a clear idea of where and how the trained neural network will be used. This can be a completely controlled environment, in which the plant sample is photographed against a uniform background under controlled lighting. It can also be an uncontrolled environment with emphasis on a specific plant organ: the images have a complex background, but the largest image area is occupied by the object of interest. Another possible option is a completely uncontrolled environment under field conditions. Therefore, the general photographic conditions, e.g., the angle, influence of shadows and background, and ranges of brightness and contrast, must be fixed in advance. Otherwise, no algorithm can guarantee that the accuracy achieved on validation during training will be reproduced in use.

Most of the current research on the automatic detection of plant diseases has been conducted on publicly available datasets. For example, in a review article by Boulent et al. [7] on the application of CNNs to the detection of plant diseases, 11 of 19 studies were shown to be conducted on publicly available datasets. The following datasets can currently be used to test solutions to this type of problem: PlantVillage currently containing 87,848 images of healthy and diseased crop plants [12], the Stanford Background dataset with 3,557 images [16], and a dataset with 5,447 images of rice diseases [32].

However, the public dataset approach has several disadvantages. First, as practice shows, even the same crop can look somewhat different in different parts of the world. The same applies to crop diseases. Second, the photographic conditions (e.g., background, photography angle, and lighting) may differ significantly from dataset to dataset and may be completely unsuitable for the case under study. We assume that public data are valuable for a proof of concept but not for the final solution and its deployment in production.

In this study, we used a dataset specially collected in the summer and fall of 2021 from the infectious nurseries of the All-Russian Research Institute of Biological Plant Protection. We used artificially created infectious backgrounds of rust and yellow spot pathogens to collect a representative experimental sample for the indicated classes of diseases at different stages of wheat vegetation. To inoculate the plants and obtain an infectious background of rust, a mixture of urediniospores with talc at a ratio of 1:100 was used at a load of 5 mg spores/m² [33]. For pyrenophorosis, a water-conidial suspension with a concentration of 5 × 10³ spores/mL (load of 70–100 mL/m²) [34] was used. The development of the diseases was recorded from the moment of the initial manifestation up to the phase of milky-wax ripeness of the grain, at intervals of 10–12 days. The main phytopathological criteria for the resistance of varieties to rust were the type of plant reaction in points (Mains and Jackson scale, Gassner and Straib scale) [35] and the degree of damage to plants in percentage (scale of Peterson et al. [36]). For yellow spot, the degree of damage in percentage was assessed according to the Saari and Prescott scale [37]. The ranking of varieties for pathogen resistance was determined according to the CIMMYT scale [38].

As a result, 7,640 images were obtained and divided into three classes of diseases and their various intersections (Figure 1):

• brown rust (427 images),

• yellow spot (3,659 images),

• yellow rust (1,283 images),

• yellow rust+yellow spot (1,349 images),

• yellow spot+brown rust (335 images),

• yellow rust+brown rust (473 images),

• yellow spot+yellow rust + brown rust (114 images).

The dataset structure shows that more than one disease can be present on a single wheat leaf at the same time. Therefore, we deal with a multilabel classification problem, in contrast to previous studies [17,18], where the authors considered the case of strict multiclass classification. Multilabel classification tasks are traditionally more complicated than multiclass classification, which can already be observed in the above distribution of images over classes: the number of photographs in which two or three diseases were present at once was much smaller than the number of images with a single disease. Nevertheless, the neural network is required to reliably identify the diseases in a photograph, regardless of how many are present.
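The label structure described above can be sketched as multi-hot vectors, one position per disease. This is only an illustration: the ordering of the diseases in `DISEASES` and the helper names are assumptions made for this sketch, while the image counts are those listed above.

```python
# Order of the diseases in the label vector is an assumption of this sketch.
DISEASES = ("yellow rust", "brown rust", "yellow spot")

# Image counts per label combination, as listed in the dataset description.
GROUP_COUNTS = {
    ("brown rust",): 427,
    ("yellow spot",): 3659,
    ("yellow rust",): 1283,
    ("yellow rust", "yellow spot"): 1349,
    ("yellow spot", "brown rust"): 335,
    ("yellow rust", "brown rust"): 473,
    ("yellow spot", "yellow rust", "brown rust"): 114,
}


def multi_hot(labels):
    """Encode a collection of disease names as a 0/1 vector in DISEASES order."""
    return [1 if disease in labels else 0 for disease in DISEASES]


def per_disease_totals(group_counts):
    """Count the images that contain each disease, alone or in combination."""
    totals = {disease: 0 for disease in DISEASES}
    for labels, count in group_counts.items():
        for disease in labels:
            totals[disease] += count
    return totals


total_images = sum(GROUP_COUNTS.values())
totals = per_disease_totals(GROUP_COUNTS)
```

Counting per disease rather than per combination makes the imbalance visible: brown rust appears in far fewer images than yellow spot.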

### 3.2 Preprocessing

As noted earlier, the collection of a dataset must first be oriented toward the end user of the model. However, even when the quality and conditions of the images are controlled both during data collection and when the trained model is used, a number of fundamental problems can arise. These problems can significantly degrade the accuracy of the model. Among them are the following:

• an insufficient sample size;

• the need for predictions to be invariant to rotations/reflections of an image;

• instability of the predictions, when even the slightest noise can affect the result; and

• an overfitting effect when the quality of the predictions on new images is significantly lower than that on the training images.

All of these problems can be addressed to a certain extent by well-designed preprocessing and augmentation of the original images. In this study, we used the following preprocessing stages for the original dataset:

• 45° angle of rotation;

• flipping an image with respect to the main axes;

• random rotations at small angles; and

• image RGB normalization.

As a result, the size of the training sample increases, improving the stability of the predictions and ensuring their invariance to image rotations.
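Two of the listed stages, axis flips and RGB normalization, can be sketched in plain Python on a tiny image stored as rows of RGB tuples. This is a minimal sketch and not the authors' pipeline: the function names, the 2×2 test image, and the mean and standard deviation values are all hypothetical, and the rotation stages are omitted because they require interpolation from an image library.

```python
def flip_horizontal(image):
    """Mirror an image, given as rows of RGB pixels, left to right."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Mirror an image top to bottom."""
    return [list(row) for row in reversed(image)]


def normalize_rgb(image, mean, std):
    """Scale 0-255 channels to [0, 1], then standardize each channel."""
    return [
        [tuple((c / 255.0 - m) / s for c, m, s in zip(pixel, mean, std))
         for pixel in row]
        for row in image
    ]


def augment(image):
    """Return the original image together with its two axis flips."""
    return [image, flip_horizontal(image), flip_vertical(image)]


# Hypothetical 2x2 RGB image and normalization statistics.
tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)]]
MEAN = (0.5, 0.5, 0.5)
STD = (0.5, 0.5, 0.5)

batch = [normalize_rgb(img, MEAN, STD) for img in augment(tiny)]
```

Each source image yields three training samples here; adding rotations would multiply this further.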

### 3.3 Model

As noted above, the intersection of classes of diseases within an object determines the use of a multilabel model. Multilabel classification assigns multiple target labels to each object in the sample, whereas strict multiclass classification assumes a one-to-one correspondence of each object with a single label. For non-exclusive classes, the output activation function is a sigmoid, and each output neuron produces a value between 0 and 1, indicating the probability that the corresponding class is assigned to the object (Figure 2).
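The role of the sigmoid for non-exclusive classes can be illustrated with a small sketch. The raw scores and the decision threshold of 0.5 are assumptions for this example; the text does not state which threshold was used.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_labels(logits, threshold=0.5):
    """Turn one raw score per disease into independent 0/1 decisions."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


# Hypothetical raw network outputs for three diseases.
logits = [2.0, -3.0, 0.5]
probabilities = [sigmoid(z) for z in logits]
labels = predict_labels(logits)
```

Unlike softmax outputs, these probabilities are computed independently and need not sum to 1, so several diseases can be reported for the same image.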

As a loss function, we used cross-entropy, which is traditionally used for multilabel and multiclass problems [39,40]. This loss function is the most natural for classification problems because it has a clear probabilistic interpretation and, owing to the logarithm, heavily penalizes the model for incorrect answers:

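$L(w) = -\frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{m} y_{ik}\ln p_{ik},$

where $y_{ik}$ indicates the ground truth answers (1 or 0) for sample $i$ and class $k$, and $p_{ik}$ indicates the sigmoid model predictions, which depend on the model weights $w$. Here, $n$ is the number of samples and $m$ is the number of classes.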
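The loss can be sketched in a few lines of plain Python. The first function follows the formula as written; the second is the binary cross-entropy form commonly used with sigmoid outputs, which adds a term for absent classes. Whether the authors used that second form is not stated, so it is included only as an assumption. The clipping constant `eps` is also an assumption, added to avoid taking the logarithm of zero.

```python
import math


def cross_entropy(y_true, p_pred, eps=1e-12):
    """L = -(1/n) * sum_i sum_k y_ik * ln(p_ik), as in the formula above."""
    n = len(y_true)
    total = 0.0
    for y_row, p_row in zip(y_true, p_pred):
        for y, p in zip(y_row, p_row):
            total += y * math.log(max(p, eps))
    return -total / n


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Common multilabel variant that also penalizes false positives.

    Assumption: this form is not given in the text.
    """
    n = len(y_true)
    total = 0.0
    for y_row, p_row in zip(y_true, p_pred):
        for y, p in zip(y_row, p_row):
            p = min(max(p, eps), 1.0 - eps)
            total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / n


# One hypothetical sample with two diseases present.
y = [[1, 0, 1]]
p = [[0.5, 0.5, 0.5]]
loss_as_written = cross_entropy(y, p)
loss_binary = binary_cross_entropy(y, p)
```

With all predictions at 0.5, the first loss equals 2 ln 2 (two present classes) and the second equals 3 ln 2, since the absent class contributes as well.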

As the neural network architecture, we used GoogleNet [13], which won the ILSVRC 2014 image classification challenge. It provides a significant reduction in the number of errors compared with the previous winners AlexNet (ILSVRC winner 2012) and ZF-Net (ILSVRC winner 2013), and with VGG (second place in 2014). In [17], the authors compared four modern compact architectures, GoogleNet, ResNet-18, SqueezeNet-1.0, and DenseNet-121, to solve the problem of a strict multiclass classification of rice diseases. In this study, we focused on a more practical multilabel classification approach. The neural network architectures used are capable of solving this problem. The two best architectures were GoogleNet and DenseNet, which showed similar results in terms of quality and convergence rate. Both architectures are sufficiently lightweight for use offline on mobile devices. GoogleNet is slightly heavier than DenseNet but performs better for multilabel classification. The final model in this case has 5.6 million parameters and takes approximately 22.6 MB in a serialized form. We emphasize that a well-chosen general-purpose architecture, such as GoogleNet, does not require further changes but only the adjustment of a number of hyperparameters and adequate training.
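The reported model size can be roughly cross-checked against the parameter count. The sketch below assumes that every parameter is stored as a 32-bit float (4 bytes) and that 1 MB is 10^6 bytes; neither assumption is stated in the text.

```python
def float32_size_mb(num_parameters, bytes_per_parameter=4):
    """Approximate storage for the raw weights, in megabytes (10**6 bytes)."""
    return num_parameters * bytes_per_parameter / 1e6


# 5.6 million parameters, as reported for the final model.
size_mb = float32_size_mb(5.6e6)
```

Under these assumptions the raw weights take about 22.4 MB, which is close to the reported 22.6 MB; the small remainder would be consistent with serialization overhead or rounding of the parameter count, although the text does not confirm this.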

For the implementation and training of the neural network, we used the PyTorch v1.6.0 framework based on the Torch library, which provides all necessary functions for deep learning. PyTorch provides two main high-level features: tensor computing with GPU acceleration support and automatic differentiation. The neural network was trained on a desktop computer with the following configuration: Intel Core i7 CPU and a 4-GB Nvidia GeForce GTX 960. The training lasted approximately 12 hours. As the optimizer, we used the Adam algorithm with default settings, which has proven effective for computer-vision tasks.

### 4. Results and Discussion

Evaluating multilabel models is also traditionally more difficult than evaluating multiclass models. In the case of multiclass classification, all basic metrics are based on the confusion matrix, which provides a clear idea of the model quality [17,18]. In our case, more than one disease can be detected in one photograph, which means that the confusion matrix cannot be adequately built. Instead, a set of accuracy, precision, recall, and F1 metrics is used, averaged both over classes (macro-averaging) and over the number of objects in the classes (micro-averaging). A detailed description of these metrics can be found in [40].

Table 1 presents the first group of most easily interpreted metrics, i.e., accuracy-based metrics. The first three entries show the proportion of images with the indicated disease that were correctly recognized. The total accuracy is the proportion of images in which the model correctly recognizes all diseases. The latter metric is quite strict because small deviations from the ideal classification are inevitable and acceptable; however, even in this case, 90% correct disease recognition is an impressive result. At the same time, the share of individual correctly classified diseases is even higher, from 94% to 99%, which is no lower in quality than the result achieved by an expert phytopathologist.
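The two kinds of accuracy described above can be sketched as follows. The per-disease figure is interpreted here as the share of images containing a disease in which the model also reports that disease; this interpretation, the function names, and the toy labels are assumptions of the sketch rather than details given in the text.

```python
def per_disease_accuracy(y_true, y_pred, k):
    """Share of images containing disease k in which it was also predicted."""
    hits = [pred[k] == 1 for true, pred in zip(y_true, y_pred) if true[k] == 1]
    return sum(hits) / len(hits) if hits else None


def total_accuracy(y_true, y_pred):
    """Share of images whose whole label vector is predicted exactly."""
    matches = sum(1 for true, pred in zip(y_true, y_pred)
                  if list(true) == list(pred))
    return matches / len(y_true)


# Hypothetical multi-hot labels for four images and three diseases.
y_true = [[1, 0, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 0], [0, 0, 0], [0, 0, 1]]

per_disease = [per_disease_accuracy(y_true, y_pred, k) for k in range(3)]
total = total_accuracy(y_true, y_pred)
```

The total accuracy is the stricter of the two: a single missed disease makes the whole image count as an error, which is why it is lower than the per-disease values in this toy example as well.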

A hold-out sample comprising 20% of the full sample was used for testing. We believe that this amount of data is sufficient for a correct assessment of the model quality because of the very specifics of the task, i.e., the same disease looks more or less the same on a leaf, and only the area and number of lesions change. In addition, well-chosen augmentations provide an almost unlimited expansion of the dataset in this case and statistically stabilize the final results.

Table 2 presents the global model metrics, i.e., the precision, recall, and F1-score in their micro- and macro-variants. This group of metrics reflects the quality of the model on all data as a whole and not on each class separately. For example, the precision and recall for a certain class are calculated as follows:

$\text{Precision}=\frac{\text{True Positives}}{\text{True Positives}+\text{False Positives}},$

and

$\text{Recall}=\frac{\text{True Positives}}{\text{True Positives}+\text{False Negatives}}.$

Accordingly, the F1 metric is the harmonic mean of precision and recall. In the micro-case, we average these numbers, weighting them by class frequencies. In the macro-case, we simply take their arithmetic mean. Note that the first of these metrics, precision, indicates how rarely the model is wrong when it predicts that a disease is present. The second metric, recall, shows how rarely the model misses a disease that is actually present in the image. Table 2 shows that this group of metrics also indicates a high-quality model.
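The micro- and macro-averaged metrics can be sketched in a few lines. This sketch uses the standard definitions: micro-averaging pools the true-positive, false-positive, and false-negative counts over all classes before computing the ratios (which weights frequent classes more heavily), while macro-averaging takes the unweighted mean of the per-class values. The function name and the toy data are hypothetical, the authors' implementation may differ in detail, and the code assumes every class has at least one true and one predicted positive.

```python
import numpy as np

def multilabel_prf(y_true, y_pred):
    """Micro- and macro-averaged (precision, recall, F1) for 0/1 label matrices."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)

    # Per-class counts (one entry per disease).
    tp = (y_true & y_pred).sum(axis=0).astype(float)
    fp = (~y_true & y_pred).sum(axis=0).astype(float)
    fn = (y_true & ~y_pred).sum(axis=0).astype(float)

    def prf(tp, fp, fn):
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
        return precision, recall, f1

    micro = prf(tp.sum(), fp.sum(), fn.sum())        # pool counts, then ratio
    macro = tuple(m.mean() for m in prf(tp, fp, fn))  # ratio per class, then mean
    return {"micro": micro, "macro": macro}

# Toy example; columns: yellow rust, brown rust, yellow spot (hypothetical data).
y_true = [[1, 0, 0], [0, 1, 0], [1, 0, 1], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0], [0, 0, 1]]

scores = multilabel_prf(y_true, y_pred)
print(scores["micro"])
print(scores["macro"])
```

In this toy example the rare error-prone class pulls the micro-average down further than the macro-average, because the pooled counts are dominated by the classes with the most samples.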

### 5. Conclusion

In this study, we considered the important applied problem of detecting multiple diseases on a wheat leaf: yellow spot, yellow rust, and brown rust. We demonstrated that the GoogleNet model can be used to identify multiple diseases in wheat leaves. The key metrics were at the level of 93%–99%, which is consistent with the accuracy provided by a qualified expert phytopathologist. At the same time, the model itself, having 5.6 million parameters, takes only approximately 22.6 MB of memory, which makes it possible to use it on mobile devices. This is critical for the rapid, accurate, and automated detection of multiple possible diseases in crops such as wheat.

### Fig 1.

Figure 1. Images from the dataset: (a) yellow spot, (b) brown rust, (c) yellow rust, (d) yellow rust + yellow spot.

### Fig 2.

Figure 2. Sigmoid function for non-exclusive classes.

Table 1. Accuracy metrics for the final model.

| | Accuracy | Number of samples |
|---|---|---|
| Yellow rust | 0.9463 | 446 |
| Brown rust | 0.9948 | 174 |
| Yellow spot | 0.9536 | 908 |
| Total | 0.9012 | 1,525 |

Table 2. Global metrics used for the final model.

| | Precision | Recall | F1 score |
|---|---|---|---|
| Micro | 0.9813 | 0.9310 | 0.9558 |
| Macro | 0.9856 | 0.9146 | 0.9485 |

### References

1. Curtis, BC (2002). Wheat in the world. [Online]. Available: http://www.fao.org/3/y4011e/y4011e04.htm
2. Kokhmetova, A, Atishova, M, Sapakhova, Z, Kremneva, O, and Volkova, G (2017). Evaluation of wheat cultivars growing in Kazakhstan and Russia for resistance to tan spot. Journal of Plant Pathology. 99, 161-167.
3. Zhukovskii, AG, Il’yuk, AG, Buga, SF, Sklimenok, NA, Kremneva, OY, Volkova, GV, and Gudoshnikova, ES (2012). Septoria spot (Septoria spp.) and yellow leaf spot (Pyrenophora tritici-repentis) affection of winter wheat cultivars in Belarus and North Caucasian Region of Russia. Scientific Journal of Kuban State Agrarian University. 80. article no. 19
4. Volkova, GV, Shulyakovskaya, LN, Kudinova, OA, and Matveeva, IP (2018). Wheat yellow rust in the Kuban. Plant Protection and Quarantine. https://doi.org/10.25992/bpp.2018.72.20689
5. Kremneva, O, and Volkova, G (2018). Diagnostics of the Tsn1 Pyrenophora tritici-repentis gene in wheat varieties and assessment of their resistance to pathogen races. Proceedings of the Kuban State Agrarian University, 206-210.
6. Volkova, GV, Vaganova, OF, and Kudinova, OA (2018). Efficiency of crops of variety mixtures of winter wheat against the agent of leaf rust. Achievements of Science and Technology in Agro-Industrial Complex. 32, 14-16. https://doi.org/10.24411/0235-2451-2018-10703
7. Boulent, J, Foucher, S, Theau, J, and St-Charles, PL (2019). Convolutional neural networks for the automatic identification of plant diseases. Frontiers in Plant Science. 10. article no. 941
8. He, K, Zhang, X, Ren, S, and Sun, J (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, pp. 770-778. https://doi.org/10.1109/CVPR.2016.90
9. Subhadra, K, and Kavitha, N (2020). Multi label leaf disease classification using enhanced deep convolutional neural network. Journal of Advanced Research in Dynamical and Control Systems. 12, 97-108. https://doi.org/10.5373/JARDCS/V12SP4/20201470
10. Ji, M, Zhang, K, Wu, Q, and Deng, Z (2020). Multi-label learning for crop leaf diseases recognition and severity estimation based on convolutional neural networks. Soft Computing. 24, 15327-15340. https://doi.org/10.1007/s00500-020-04866-z
11. Mohanty, SP, Hughes, DP, and Salathe, M (2016). Using deep learning for image-based plant disease detection. Frontiers in Plant Science. 7. article no. 1419
12. Hughes, D, and Salathe, M (2015). An open access repository of images on plant health to enable the development of mobile disease diagnostics. [Online]. Available: https://arxiv.org/abs/1511.08060
13. Krizhevsky, A, Sutskever, I, and Hinton, GE (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems. 25, 1106-1114.
14. Atabay, HA (2017). Deep residual learning for tomato plant leaf disease identification. Journal of Theoretical & Applied Information Technology. 95, 6800-6808.
15. Brahimi, M, Arsenovic, M, Laraba, S, Sladojevic, S, Boukhalfa, K, and Moussaoui, A (2018). Deep learning for plant diseases: detection and saliency map visualization. Human and Machine Learning. Cham, Switzerland: Springer, pp. 93-117
16. Gould, S, Fulton, R, and Koller, D (2009). Decomposing a scene into geometric and semantically consistent regions. Proceedings of 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan, pp. 1-8. https://doi.org/10.1109/ICCV.2009.5459211
17. Arinichev, IV, Polyanskikh, SV, Volkova, GV, and Arinicheva, IV (2021). Rice fungal diseases recognition using modern computer vision techniques. International Journal of Fuzzy Logic and Intelligent Systems. 21, 1-11. https://doi.org/10.5391/IJFIS.2021.21.1.1
18. Polyanskikh, S, Arinicheva, I, Arinichev, I, and Volkova, G (2021). Autoencoders for semantic segmentation of rice fungal diseases. Agronomy Research. 19, 574-585. https://doi.org/10.15159/ar.21.019
19. DeChant, C, Wiesner-Hanks, T, Chen, S, Stewart, EL, Yosinski, J, Gore, MA, Nelson, RJ, and Lipson, H (2017). Automated identification of northern leaf blight-infected maize plants from field imagery using deep learning. Phytopathology. 107, 1426-1432. https://doi.org/10.1094/PHYTO-11-16-0417-R
20. Fuentes, AF, Yoon, S, Lee, J, and Park, DS (2018). High-performance deep neural network-based tomato plant diseases and pests diagnosis system with refinement filter bank. Frontiers in Plant Science. 9. article no. 1162
21. Liu, B, Zhang, Y, He, D, and Li, Y (2017). Identification of apple leaf diseases based on deep convolutional neural networks. Symmetry. 10. article no. 11
22. Oppenheim, D, and Shani, G (2017). Potato disease classification using convolution neural networks. Advances in Animal Biosciences. 8, 244-249. https://doi.org/10.1017/S2040470017001376
23. Picon, A, Alvarez-Gila, A, Seitz, M, Ortiz-Barredo, A, Echazarra, J, and Johannes, A (2019). Deep convolutional neural networks for mobile capture device-based crop disease classification in the wild. Computers and Electronics in Agriculture. 161, 280-290. https://doi.org/10.1016/j.compag.2018.04.002
24. Ramcharan, A, Baranowski, K, McCloskey, P, Ahmed, B, Legg, J, and Hughes, DP (2017). Deep learning for image-based cassava disease detection. Frontiers in Plant Science. 8. article no. 1852
25. Sladojevic, S, Arsenovic, M, Anderla, A, Culibrk, D, and Stefanovic, D (2016). Deep neural networks based recognition of plant diseases by leaf image classification. Computational Intelligence and Neuroscience. 2016. article no. 3289801
26. Too, EC, Yujian, L, Njuki, S, and Yingchun, L (2019). A comparative study of fine-tuning deep learning models for plant disease identification. Computers and Electronics in Agriculture. 161, 272-279. https://doi.org/10.1016/j.compag.2018.03.032
27. Wang, G, Sun, Y, and Wang, J (2017). Automatic image-based plant disease severity estimation using deep learning. Computational Intelligence and Neuroscience. 2017. article no. 2917536
28. Zhang, S, Huang, W, and Zhang, C (2019). Three-channel convolutional neural networks for vegetable leaf disease recognition. Cognitive Systems Research. 53, 31-41. https://doi.org/10.1016/j.cogsys.2018.04.006
29. Zhang, K, Wu, Q, Liu, A, and Meng, X (2018). Can deep learning identify tomato leaf disease? Advances in Multimedia. 2018. article no. 6710865
30. Jeon, WS, and Rhee, SY (2021). A restoration method of single image super resolution using improved residual learning with squeeze and excitation blocks. International Journal of Fuzzy Logic and Intelligent Systems. 21, 222-232. https://doi.org/10.5391/IJFIS.2021.21.3.222
31. Barbedo, JGA (2016). A review on the main challenges in automatic plant disease identification based on visible range images. Biosystems Engineering. 144, 52-60. https://doi.org/10.1016/j.biosystemseng.2016.01.017
32. Do, HM (2019). Rice Diseases Image Dataset: an image dataset for rice and its diseases. [Online]. Available: https://www.kaggle.com/minhhuy2810/rice-diseases-image-dataset
33. Anpilogova, LK, and Volkova, GV (2000). Methods for Creating Artificial Infectious Backgrounds and Assessing Wheat Cultivars for Resistance to Harmful Diseases (Fusarium Ear Blight, Rust, Powdery Mildew). Krasnodar, Russia: Russian Agricultural Academy
34. Kremneva, O, Andronova, A, and Volkova, G (2009). The Pathogens of Wheat Leaf Spots (Pyrenophorosis and Septoria), the Study of their Populations by Morphological and Cultural Characteristics and Virulence. St. Petersburg, Russia: XXX
35. Roelfs, AP, Singh, RP, and Saari, EE (1992). Rust Diseases of Wheat: Concepts and Methods of Disease Management. Mexico City, Mexico: CIMMYT
36. Peterson, RF, Campbell, AB, and Hannah, AE (1948). A diagrammatic scale for estimating rust intensity on leaves and stems of cereals. Canadian Journal of Research. 26, 496-500. https://doi.org/10.1139/cjr48c-033
37. Saari, EE, and Prescott, JM (1975). Scale for appraising the foliar intensity of wheat diseases. Plant Disease Reporter. 59, 377-380.
38. Koishibaev, M, and Sagitov, A (2012). Grain Crops Protection from Especially Dangerous Diseases. Almaty, Kazakhstan: XXX
39. Goodfellow, I, Bengio, Y, and Courville, A (2016). Deep Learning. Cambridge, MA: MIT Press
40. Bishop, CM (2006). Pattern Recognition and Machine Learning. New York, NY: Springer