## Original Article

International Journal of Fuzzy Logic and Intelligent Systems 2021; 21(1): 1-11

Published online March 25, 2021

https://doi.org/10.5391/IJFIS.2021.21.1.1

© The Korean Institute of Intelligent Systems

## Rice Fungal Diseases Recognition Using Modern Computer Vision Techniques

Igor V. Arinichev1, Sergey V. Polyanskikh2, Galina V. Volkova3, and Irina V. Arinicheva4

1Department of Theoretical Economy, Kuban State University, Krasnodar, Russia
2Plarium Inc., Krasnodar, Russia
3Laboratory of Cereal Crops Immunity to Fungal Diseases, All-Russian Research Institute of Biological Plant Protection, Krasnodar, Russia
4Department of Higher Mathematics, Kuban State Agrarian University named after I.T. Trubilin, Krasnodar, Russia

Correspondence to:
Igor V. Arinichev (iarinichev@gmail.com)

Received: September 16, 2020; Revised: February 10, 2021; Accepted: February 22, 2021

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

In the article, the authors study the possibility of detecting some fungal diseases of rice using visual computing and machine learning techniques. Leaf blast and brown spot diseases are considered. Modern computer vision methods based on convolutional neural networks are used to identify a particular disease in an image. The authors compare four of the most successful and compact convolutional neural network architectures: GoogLeNet, ResNet-18, SqueezeNet-1.0, and DenseNet-121. The authors show that in the dataset used for the analysis, the disease can be detected with an accuracy of at least 95%. Testing the algorithm on real data not used in training showed an accuracy of up to 95.6%. This is a good indicator of the reliability and stability of the obtained solution even under a change in the data distribution.

Keywords: Convolutional neural networks, Machine learning, Computer vision, Rice, Fungal diseases

Rice is one of the most important grain crops in the world. Global consumption of this cereal has increased over the past few years: in 2019 more than 490 million tons of rice were consumed worldwide, compared with 437.18 million tons in 2008. The main consumers of rice are China and India (143 and 100 million tons, respectively), which are also the main producing countries, providing more than 56% of the world rice production; then come Indonesia (37.7 million tons), Bangladesh (35.8 million tons) and Vietnam (21.5 million tons) [1]. Rice is also a valuable food, dietary and medicinal product in Russia. Its share in the consumed cereals is more than 40%, while consumer demand is growing every year [2].

Diseases and pests are responsible for approximately 20%–40% of rice production losses in Russia. At the same time, colossal economic damage to rice growing is caused by fungal diseases. Losses from blast disease alone (causative agent Pyricularia oryzae Cavara) in Russia, according to various estimates, range from 5% to 25% in ordinary years and reach 60% or even 100% in the years of epiphytotic development of the disease. The harmfulness increases significantly due to a sharp decrease in the quality of grain obtained from affected plants [2].

In today’s common pest management practices, the agrochemicals are sprayed uniformly throughout the field as a preventive measure or when any symptoms of disease are detected. At the same time, diseases in the early stages are often identified incorrectly and therefore the complex of agrochemicals is selected incorrectly. On the one hand, it largely increases the cost of disease control, since at least in the beginning the disease infection is concentrated in areas mainly around the original foci. On the other hand, the extended amount of applied chemicals increases the likelihood of groundwater contamination and adversely leads to toxic residues in agricultural products.

Thus, there is no doubt about the relevance of the problem of timely and accurate detection and classification of rice diseases. The traditional practice of detecting fungal diseases is based either on visual, pathogen-induced symptoms or on laboratory identification of the pathogen [3]. Visual assessment is subjective and, in some cases, can be inconsistent, leading to an incorrect diagnosis of the disease. Sometimes the visual classification is complicated by the fact that plants are affected by several diseases at the same time and many of their traits overlap within a single plant specimen. Identification in the laboratory, in turn, is a laborious process that requires time-consuming pathogen cultivation. In any case, both of these methods require the participation of highly qualified professionals in the detection process, who are not always available, especially on small farms [4, 5].

These limitations have contributed to the emergence of a significant number of works devoted to applying machine learning methods to the automatic detection and classification of crop diseases based on digital images. The general methodology in such studies is similar. First, images of diseases are captured using cameras or scanners. Second, the affected areas (spots) are separated from the background. Third, peculiarities of color, shape or texture are extracted as features. Finally, classification techniques such as neural networks, Bayesian classifiers, k-nearest neighbors (kNN), support vector machines (SVMs), and others are used to classify disease images. For example, [6] used a Bayesian classifier to detect five kinds of maize leaf diseases. The results showed that the accuracy of disease detection based on the proposed approach was higher than 83%. Huang (2007) applied a backpropagation neural network classifier. The preprocessing involved segmentation of the lesions of three classes of disease: bacterial soft rot (BSR), bacterial brown spot (BBS), and Phytophthora black rot (PBR). They were segmented by an exponential transform and image processing techniques (gray level co-occurrence matrix, GLCM). The classification accuracy on test data was about 90%.

Tian et al. [7] recognized traits based on textures, shapes and colors of three types of vine leaf diseases and classified the diseases using the SVM. The results obtained showed that the efficiency of SVM classification was higher than that of neural networks. Pantazi et al. [8] present a tool for the detection and recognition of healthy plants of Silybum marianum and those infected with the fungus Microbotryum silybum during vegetative growth. An experimental sample, including both classes, was used to obtain leaf spectra using a portable device. Pre-processing of the spectra included normalization and principal component analysis (PCA). Three models were used to identify systemically infected plants: supervised Kohonen network (SKN), counter propagation artificial neural network (CP-ANN) and XY-fusion network (XY-F). The highest identification accuracy, 95%, was reached using the XY-F.

In [9], the authors proposed an approach to automatic detection and classification of parasites on strawberry plants under greenhouse conditions, based on the SVM method. To increase the performance of the algorithm, non-flower regions were considered as background and removed using the gamma operator. Another case study [10] presented the development of a system that would automatically recognize water-stressed and healthy winter wheat plants in the presence of a Septoria tritici infection. The detection algorithm was developed based on the combination of a least squares support vector machine (LSSVM) with sensor fusion. Using LSSVM, the classification performance was over 99%.

The study by Dong and Wang [11] is mainly dedicated to the image processing and recognition technologies of cucumber downy mildew and powdery mildew. The preprocessing included the use of median filtering method, as well as the extraction of a number of features based on color range segmentation, geometric parameters of the affected area, and extraction texture parameters by using a GLCM. The classification of diseased plants was carried out on the basis of the shortest distance method. The result of the experiment showed that the method demonstrated a disease recognition accuracy of over 96%.

Considering the focus of our article, the papers dedicated to the detection and classification of rice diseases were of particular interest. In [4, 5], a machine learning approach for the detection and recognition of rice diseases is presented using a sample of 619 images for four classes: (a) rice blast (RB), (b) bacterial leaf blight (BLB), (c) sheath blight (SB), and (d) healthy leaf (HL). A pre-trained deep convolutional neural network (CNN) was used as a feature extractor and an SVM as a classifier. Much attention in the article is paid to preprocessing, since plants can carry dust, dew drops, etc., which makes the data noisy and, in turn, creates problems at the stages of segmentation and extraction of traits. Suresha et al. [12] proposed a method for identifying blast and brown spot diseases using the kNN classifier based on a number of geometric features: area, major and minor semiaxes of spots, and perimeter of the affected leaf part.

To extract features, the authors of [13] used the K-means clustering method, preliminarily removing green pixels from the affected parts of the leaf. Traits were extracted in three categories (color, shape and texture), and then SVMs were applied to identify rice diseases in a multiclass manner. Sanyal and Patel [14] proposed a method for extracting signs of rice diseases based on color and texture, which were fed to the input of a multilayer perceptron (MLP) for disease identification. Joshi and Jadhav [15] proposed to extract features based on color and shape, which, in turn, were fed to the input of the minimum distance classifier (MDC) and kNN classifier for disease identification. In the same line, the authors of [16] proposed combining texture and shape features of the rice leaf to detect different rice diseases using the SVM classifier. Majid et al. [17] used fuzzy entropy and a probabilistic neural network classifier to identify diseases. Xiao et al. [18] explored PCA for dimensionality reduction of features and used a back propagation (BP) neural network model to classify the different rice diseases.

The solution to this problem is of high practical value, since it will allow farmers, using cameras built into mobile devices, not only to identify diseases but also to immediately design the optimal course of treatment.

We believe that in the near future the main task of visual detection of plant diseases will be partially or completely solved by artificial intelligence systems. Such systems can be built as client-server applications installed on mobile or stationary devices. Systems of this kind are more heavyweight, but they allow machine learning models of any complexity to be introduced into their pipeline. Another type of system is a lightweight application architecture that works autonomously or semi-autonomously on mobile devices and requires neither a large amount of computing resources nor constant access to the Internet. Such solutions are the most convenient to use, but they have a number of specific limitations. In this work, we have concentrated on systems of the second type, which are able to work autonomously on devices. For this kind of architecture, the lightness of the underlying neural network architecture is of paramount importance.

The resulting solution will help in the future to address the following problems that occur throughout agriculture in different countries: (i) small or absent staff of phytopathologists, (ii) low qualification of specialists, and (iii) the high cost of training specialists or attracting them from outside.

Section 2 describes the main methods used by phytopathologists to identify rice diseases at different stages of the growing season. We are especially interested in the visual method, as it is most easily replaced by the methods of modern computer vision. This is followed by an overview of the main neural network architectures suitable for solving this problem. We describe the model dataset used to illustrate the proposed methods. Various data preprocessing options are also considered to help improve the quality of the final disease identification algorithm. Section 3 describes the main results of the numerical experiment performed. In Section 4, we discuss the results and examine how stable the model is to significant changes in data distribution. We test the trained model on data collected in a different geographic area. It is shown that the model works quite well on a completely new dataset of rice disease images.

### 2.1 Classical Methods for the Detection of Rice Diseases

Expert phytopathologists are currently using the following approaches to identify various rice diseases:

• Visual method - establishing the external symptoms of the disease, degree of disease development and its prevalence [19].

• Microscopic method - determination of the nature of changes in the diseased plant tissue, detection of the pathogen and its sporulation.

• Biological method - artificial infection; the degree of damage is determined in percentage according to the guidelines of VNIIF.

• Cultural method - the fungus (for spots) is isolated on a nutrient medium and its cultural and morphological characteristics are monitored [20].

• Molecular genetic method - diagnostics of rust fungi using polymerase chain reaction [21, 22].

In this paper, we focus on the first method, since it is perhaps the easiest one to automate with modern machine learning methods, replacing the human eye and an expert phytopathologist with a computer algorithm. Neural network algorithms have shown the best quality of all machine learning algorithms used for similar tasks. They allow one to automatically detect the presence of a disease in the image. In turn, among all neural network architectures, it is convolutional neural networks that have established themselves as state-of-the-art for digit recognition, having recently become the de facto main method used in computer vision to solve problems of classification, detection and segmentation of objects in images. Let us briefly describe the basic ideas and techniques used by convolutional networks, as well as some of the reasons for their success.

### 2.2 An Overview of Convolutional Neural Networks

The main idea behind CNNs is an attempt to bring a network closer to the mechanism of human vision. LeCun et al. [23] showed that, for image recognition by a neural network, it is sufficient to use not all connections between neurons of neighboring layers of the network, but only a small number of them. This is basically the simplest model of how vision works, when an eye, trying to find an object in the image, sequentially focuses on different parts of it rather than on the entire image at once. Mathematically, such “eye focusing” corresponds to the convolution operation, which matches an area of the image against some desired pattern, in our case the disease manifested on the rice leaf. Various CNN architectures can be obtained by combining such convolution operations into layers in different ways. Today this is a solid foundation of modern computer vision and helps to successfully solve a wide range of problems, such as classification, clustering, segmentation, etc.
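The pattern-matching view of convolution can be illustrated with a minimal sketch in plain Python. The toy 5×5 "leaf", the 2×2 "spot" pattern, and the function name `cross_correlate2d` are assumptions made for illustration only; they are not taken from the paper. The sliding product computed here is cross-correlation, which is the operation that deep learning libraries implement under the name "convolution".

```python
def cross_correlate2d(image, kernel):
    """Slide `kernel` over `image` (valid positions only) and return the response map."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            s = 0.0
            for i in range(kh):
                for j in range(kw):
                    s += image[r + i][c + j] * kernel[i][j]
            row.append(s)
        out.append(row)
    return out


# Hypothetical toy "leaf": background pixels are 0, a 2x2 lesion is marked with 1.
leaf = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
spot_pattern = [[1, 1], [1, 1]]

response = cross_correlate2d(leaf, spot_pattern)
# The strongest response marks the position where the pattern fits best.
best = max((v, r, c) for r, row in enumerate(response) for c, v in enumerate(row))
print(best)  # (4.0, 2, 2): full match with the top-left corner at row 2, column 2
```

In a trained CNN the kernel values are not hand-written as here but learned from data, and many such kernels are stacked in layers.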

Historically, LeCun et al. [23] is considered to be a starting point of convolutional neural networks. LeCun’s convolutional network took up less memory space than its fully connected version, learned much faster, and was able to successfully recognize handwritten numbers with accuracy no less than human. LeCun’s approach qualitatively changed the way we look at things. Before him, neural networks were considered, although more or less universal, but a very heavyweight algorithm, very capricious in setting up and requiring large computing power. Now the process of building and training a network has become simpler and closer to the ideas of the biology of vision itself.

Then AlexNet was introduced. It won the ImageNet 2012 challenge with a test accuracy of 84.6%. In all, there were about 1.2 million training images required for classification [24]. This is an incomparably more difficult task than the classification of black-and-white digits, and it required an increase in the volume of the neural network by orders of magnitude; the number of network parameters increased from 60k in LeNet-5 to more than 60M in AlexNet. This was such a large number at the time that the network architecture itself had to be split across two GPUs, each of which computed one of the two large network branches. And yet it was a major breakthrough. With the advent of AlexNet, CNNs, for the first time since LeNet, established themselves as state-of-the-art algorithms for working with large color images, capable of detecting many patterns and shapes in them.

The next big breakthrough was VGG CNN architecture, which relied on even more parameters [25]. So, VGG-16 won the ImageNet challenge in 2013 and had a total number of trained parameters equal to 138M. The use of a large number of parameters and small convolution kernels made it possible to achieve 92.7% test accuracy in ImageNet.

By 2013, convolutional architectures had undoubted success. However, this success was not associated with new ideas, as in the case of LeNet, but mainly with quantitative improvements and increased computing power. The volume of networks has grown from tens of thousands to hundreds of millions of parameters, again making them difficult, time consuming and expensive to be trained. At this point, it was unclear whether such complexity of the instrument was justified by the true complexity of the problem, or, as in the case of classical neural networks, new ideas were required.

Another huge boost to computer vision came in 2014 when Google introduced its GoogLeNet Inception architecture, which won the ImageNet competition with 93.3% test accuracy [25]. The network had only 6M parameters, that is, about 10 times fewer than AlexNet and more than 20 times fewer than VGG-16. A progressive idea that made it possible to significantly reduce the number of parameters and improve the quality of predictions was the use of special inception blocks that concatenate convolutions of different sizes. This allowed the algorithm to immediately see details at different scales, deciding which one is most significant for a given image.

The ResNet architecture became the next significant success in the world of computer vision. It won the ImageNet challenge in 2015 with 96.43% test accuracy [25]. The breakthrough idea here was the use of residual blocks, which make the network learn the identity mapping along with hidden dependencies. As a result, this solved the degradation problem and significantly improved the quality of predictions. An interesting feature of residual links is that they can be added to almost any architecture, improving its convergence and quality without increasing the total number of parameters. ResNet itself has 10M parameters or more, depending on the details of the architecture.

SqueezeNet and DenseNet architectures are also worth mentioning. SqueezeNet reached the quality level of AlexNet with 50 times fewer parameters. This was achieved by using 1×1 convolutions and shrinking larger convolution kernels, thereby reducing the number of input channels in each layer [26]. DenseNet also followed the path of qualitative changes, somewhat similar to ResNet, but unlike it, concatenating the previous layers with the subsequent ones, rather than summing them. As a result, with about 7M parameters, comparable to GoogLeNet, we get faster convergence and a slight increase in quality.
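To see why 1×1 "squeeze" convolutions cut the parameter count so sharply, one can count weights directly. The sketch below is a back-of-the-envelope calculation, not the authors' code: the channel sizes (128 in, 16 squeezed, 64 + 64 expanded) are assumed for illustration, and the helper names `conv_params` and `fire_params` are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a single k x k convolution layer."""
    return k * k * c_in * c_out + c_out


def fire_params(c_in, squeeze, expand1x1, expand3x3):
    """A 1x1 squeeze layer followed by parallel 1x1 and 3x3 expand layers."""
    return (
        conv_params(c_in, squeeze, 1)
        + conv_params(squeeze, expand1x1, 1)
        + conv_params(squeeze, expand3x3, 3)
    )


# Plain 3x3 convolution mapping 128 channels to 128 channels.
plain = conv_params(128, 128, 3)

# Squeeze to 16 channels first, then expand back to 64 + 64 = 128 channels.
fire = fire_params(128, squeeze=16, expand1x1=64, expand3x3=64)

print(plain, fire, round(plain / fire, 1))  # 147584 12432 11.9
```

With these assumed sizes, the squeezed block produces the same number of output channels with roughly a twelfth of the parameters, because the expensive 3×3 kernels only see 16 input channels instead of 128.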

In this research, we consider four of the most well-proven, relatively lightweight architectures: GoogLeNet, ResNet-18, SqueezeNet-1.0 and DenseNet-121. AlexNet and VGG architectures are rather heavy and usually give quality no higher than the later architectures, so we do not consider them.

### 2.3 Dataset Description

A good training set is required to train a neural network, like any supervised machine learning algorithm. Getting appropriately collected and preprocessed training data is usually the most challenging part of the task, as it requires detailed analysis from the perspective of both business and end users of the product. The vast majority of studies, including Hu et al. [27], others [28, 29], and Phadikar et al. [30], point out the exceptional role of competent collection of training data for the tasks of detecting diseases of various agricultural crops. Namely, they indicate that challenges can arise from illumination conditions, photo noise and insufficient severity of the disease.

Thus, it is not possible to train a neural network “for all occasions” at the current level of technology development. It is required to have a clear idea of how the trained neural network will then be used. Namely, before the work begins, the following parameters should be defined: (1) general conditions of photography, (2) shooting angle, (3) ranges of brightness and contrast, (4) possible noise and distortion, (5) illumination concerns, and (6) background influence.

One can limit the terms of photography and require end users to comply with them in order to increase the quality of the final neural network. Otherwise, no algorithm can guarantee the validation accuracy achieved during training.

In this research we use the dataset [31] slightly expanding it with data that are freely available on the Internet. We exclude rice hispa disease as irrelevant for the South of Russia. Eventually, we work with a dataset of 4,278 images: healthy 1,488 images, brown spot disease 1,195 images, and leaf blast disease 1,595 images (Figure 1).

One should note that one rice leaf can be simultaneously infected with several diseases. In this case, we are faced with the task of multilabel classification. Nevertheless, the dataset is labeled strictly, with a single class per image, and its visual analysis confirms this. Thus, in this research we consider the case of a strict multiclass classification: no more than one disease on one leaf.

In practice, this means that even an ideal model will most likely have a certain upper accuracy threshold other than 100%, which it cannot exceed without retraining. In this research we demonstrate that even under the assumption of a strict multiclass classification, a quite good accuracy can be achieved on the validation set - up to 96%.

In August 2020 we collected a test dataset for additional validation of the model results. The dataset size is 300 observations; the data were collected on the territory of the educational farm “Kuban” and Kuban State Agrarian University. For a number of reasons, at this time of the year only leaf blast disease was available for analysis. Thus, the final quality assessment of the model was carried out only on the following examples: healthy leaves or leaves infected with leaf blast. Checks of this kind are extremely important, since the distribution of the dataset on which the model was trained can quite often differ from the distribution to which the model is ultimately applied. Machine learning models are usually quite sensitive to changes in distribution, and it is necessary to know to what extent the model proposed in this work is resistant to them.

### 2.4 Data Preprocessing

As noted earlier, the collection of a dataset must first of all be aimed at the end user of the model. But even when the quality and shooting conditions are monitored both when collecting data and when using a trained model, a number of problems of a fundamental nature may arise that can significantly degrade the quality of the model. Among them are the following:

• insufficient sample size;

• natural invariance of predictions regarding rotation / image reflections;

• instability of predictions, when even insignificant noise can change the result;

• the effect of overfitting, when the quality of predictions on new images turns out to be significantly lower than on training images.

All these problems can be dealt with to a certain extent by organizing competent preprocessing of the original images. Here we use the following preprocessing stages for the original dataset:

• rotation by a random angle from 0° to 45°;

• flip an image along the main axes

• standard normalization of RGB image channels.

As a result, the size of the training sample increases, increasing the stability of predictions and ensuring their invariance to image rotations.
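The three preprocessing stages listed above can be sketched in plain Python on images stored as nested lists. This is an illustrative sketch, not the pipeline used in the paper: nearest-neighbor rotation about the image center, a flip probability of 0.5, and normalization by each channel's own mean and standard deviation are all assumptions, and the function names are hypothetical.

```python
import math
import random


def hflip(channel):
    """Mirror a 2-D channel (list of rows) left to right."""
    return [row[::-1] for row in channel]


def vflip(channel):
    """Mirror a 2-D channel top to bottom."""
    return [row[:] for row in channel[::-1]]


def rotate(channel, angle_deg, fill=0.0):
    """Rotate about the center with nearest-neighbor sampling (assumed scheme)."""
    h, w = len(channel), len(channel[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Inverse mapping: find the source pixel that lands on (r, c).
            dy, dx = r - cy, c - cx
            sc = int(round(cos_a * dx + sin_a * dy + cx))
            sr = int(round(-sin_a * dx + cos_a * dy + cy))
            if 0 <= sr < h and 0 <= sc < w:
                out[r][c] = channel[sr][sc]
    return out


def normalize(channel):
    """Shift and scale a channel to zero mean and unit variance."""
    vals = [v for row in channel for v in row]
    mean = sum(vals) / len(vals)
    std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals)) or 1.0
    return [[(v - mean) / std for v in row] for row in channel]


def augment(channels, rng):
    """Apply one random rotation/flip to all RGB channels, then normalize each."""
    angle = rng.uniform(0.0, 45.0)   # random angle from 0 to 45 degrees
    do_h = rng.random() < 0.5        # assumed flip probability
    do_v = rng.random() < 0.5
    out = []
    for ch in channels:
        ch = rotate(ch, angle)
        if do_h:
            ch = hflip(ch)
        if do_v:
            ch = vflip(ch)
        out.append(normalize(ch))
    return out
```

Calling `augment` repeatedly on the same image with a fresh random state yields different training samples, which is how such transformations effectively enlarge the training set.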

### 2.5 Model Architectures

We have chosen the categorical cross-entropy as the loss function, which is typical for multiclass classification problems. In our case of three classes, it can be written as

$L(w) = -\frac{1}{N}\sum_{n=1}^{N}\sum_{k=1}^{3} y_{nk} \ln p_{nk},$

where $N$ is the number of training examples, $y_{nk}$ are the ground truth answers (1 or 0), and $p_{nk}$ are the softmax model predictions, which depend on the model weights $w$. This loss function is most natural for classification problems, since it has a clear probabilistic interpretation and, due to the logarithm, greatly penalizes the model for incorrect answers.
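As a minimal sketch of the cross-entropy loss defined above, the following plain-Python code evaluates it for a small batch. Because the ground truth vector is one-hot, the inner sum over the three classes reduces to the logarithm of the probability assigned to the true class. The function names and the example logits are illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert raw model outputs to probabilities (shifted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits_batch, labels):
    """Mean over the batch of -ln p[true class]."""
    total = 0.0
    for logits, label in zip(logits_batch, labels):
        p = softmax(logits)
        total -= math.log(p[label])
    return total / len(labels)


# An undecided model (equal scores for all three classes) pays ln 3 per example.
undecided = cross_entropy([[0.0, 0.0, 0.0]], [1])

# A confident wrong answer is penalized far more than a confident right one.
confident_right = cross_entropy([[5.0, 0.0, 0.0]], [0])
confident_wrong = cross_entropy([[5.0, 0.0, 0.0]], [2])
```

Here `undecided` equals ln 3 (about 1.0986), which is a convenient reference point: a three-class model whose loss stays near this value has learned nothing beyond guessing.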

We use PyTorch v1.6.0, a machine learning library based on the Torch library, which provides all the necessary functionality for training and subsequent use of modern neural network architectures. This framework is widely used due to its simplicity and broad functionality. Nevertheless, the architectures used here are quite popular, and their implementations can be found in other frameworks, e.g., TensorFlow, Caffe, etc. The neural network was trained on a stationary computer with the following configuration: Intel Core i7, GeForce GTX 1660, GTX 1080 4 GB.

The Adam algorithm was used as the optimization method; it has proved to be excellent for such tasks. We used Adam with default settings for all models, but an optimal static learning rate was selected separately for each model (see Table 1).
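For readers unfamiliar with Adam, its update rule can be sketched in a few lines of plain Python. The default values of `beta1`, `beta2` and `eps` below are the commonly used ones; the quadratic test function, the learning rate of 0.05, and the function name `adam_minimize` are assumptions chosen only to make the example self-contained, and they are unrelated to the per-model learning rates in Table 1.

```python
import math


def adam_minimize(grad, w0, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a one-dimensional function given its gradient `grad`."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy problem: minimize (w - 3)^2, whose gradient is 2 * (w - 3).
w_opt = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0, lr=0.05, steps=2000)
```

The only setting varied between runs in this sketch is `lr`, mirroring the setup described above in which the remaining Adam settings keep their defaults.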

In this work, we specifically focused on lighter architectures that, if necessary, can be used directly on a mobile device. We trained the models using the PyTorch framework with standard pretrained models from the torchvision module, fully fine-tuned on the dataset described above. We examined a number of CNN architectures, both classical and modern, and then chose the following ones: GoogLeNet, ResNet-18, SqueezeNet-1.0, and DenseNet-121. They produce the most promising results and at the same time are the most compact. For example, the computationally heavy VGG and AlexNet showed results similar to those given below, but somewhat worse, while requiring much more computing resources both at the training stage and at the prediction stage. The training process is shown in Figure 2.

A comparison of the learning parameters of the CNNs is presented in Table 1. DenseNet-121 achieved the best accuracy with a relatively small number of parameters, and stabilized in the shortest time (about 14 epochs). The GoogLeNet architecture showed the second best result, stabilizing a little more slowly. ResNet-18 turned out to be the third in accuracy; however, it has significantly more parameters than the others. The architecture of SqueezeNet-1.0 should be especially noted. It was not much inferior to the others, showed similar accuracy, stabilized quickly and has the smallest number of parameters, just about 750K.

Table 2 shows the final quality metrics of the models under consideration: accuracy, and also micro and macro averaged f1, precision and recall. One notices that DenseNet-121 turns out to be the best in all characteristics. Next come GoogLeNet and SqueezeNet. The heavyweight ResNet architecture turns out to be worse than the others in this case.
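The difference between micro and macro averaging can be made concrete with a short sketch that derives both from a confusion matrix. The 3×3 matrix used in the example is hypothetical and is not taken from Table 2; macro F1 is computed here as the mean of per-class F1 scores, which is one common convention.

```python
def averaged_metrics(conf):
    """conf[i][j] = number of samples of true class i predicted as class j."""
    k = len(conf)
    tp = [conf[i][i] for i in range(k)]
    fp = [sum(conf[r][i] for r in range(k)) - conf[i][i] for i in range(k)]
    fn = [sum(conf[i]) - conf[i][i] for i in range(k)]

    def ratio(a, b):
        return a / b if b else 0.0

    prec = [ratio(tp[i], tp[i] + fp[i]) for i in range(k)]
    rec = [ratio(tp[i], tp[i] + fn[i]) for i in range(k)]
    f1 = [ratio(2 * p * r, p + r) for p, r in zip(prec, rec)]

    return {
        # Macro: average the per-class scores, every class weighs the same.
        "macro_precision": sum(prec) / k,
        "macro_recall": sum(rec) / k,
        "macro_f1": sum(f1) / k,
        # Micro: pool the counts first, every sample weighs the same.
        "micro_precision": ratio(sum(tp), sum(tp) + sum(fp)),
        "micro_recall": ratio(sum(tp), sum(tp) + sum(fn)),
        "accuracy": ratio(sum(tp), sum(sum(row) for row in conf)),
    }


# Hypothetical confusion matrix for (healthy, brown spot, leaf blast).
metrics = averaged_metrics([[48, 1, 1], [2, 45, 3], [1, 4, 45]])
```

For single-label multiclass problems such as this one, micro-averaged precision and recall both coincide with accuracy, so the macro-averaged values carry the additional information about how evenly the classes are handled.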

Figure 3 shows one-vs-all ROC curves for each class, built on the validation set according to the predictions of the best DenseNet model. All ROC-AUC scores are close to 1. A closer look at the errors of the algorithms shows that GoogLeNet is slightly better at predicting the presence of a disease, but it can confuse brown spot with leaf blast. The ResNet and SqueezeNet architectures show similar results on average. The most accurate DenseNet architecture misses only a small fraction of the diseased plants, perfectly identifying the rest of the cases (see Figure 4).

As shown above, DenseNet demonstrated the most accurate results among the network architectures considered. It has 6.9M parameters, which is small compared to ResNet-18, and can be trained on an average computer with 2–4 GB of GPU memory. DenseNet accuracy and other significant metrics are above 0.95, which is close to human-level quality of disease classification.

Since the training of the models was carried out on data from the Internet, the question is, to what extent the obtained model will be applicable to the realities of another region of the world without additional training on additional data. Putting the question more mathematically, it is interesting to find out how much the distribution of data collected elsewhere will differ from the distribution of the data on which the model was trained, and how stable the resulting model will be to such changes. In other words, whether the resulting model has a generalizing ability sufficient to predict the same diseases in completely different photographs of rice.

As noted above, we collected a trial dataset of 150 healthy leaves and 150 leaves infected with leaf blast disease. Brown spot disease was no longer observed at the indicated time, so only two of the initially available three classes were used for testing.

As a result of applying the best model (DenseNet) to the collected dataset, the classification accuracy was 80.3%. If we treat predictions of brown spot disease, which was not present in the test dataset, as the model’s uncertainty in its answer, then its accuracy would be 95.6% with a share of uncertainty of 16%. To understand whether the accuracy has dropped dramatically, it is useful to examine how much the distribution of the data has changed. One of the most popular methods for such an analysis is the Population Stability Index (PSI). PSI is a simple but reliable indicator of whether the distribution has changed so dramatically that the model can probably no longer be relied upon.
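The relationship between the two accuracy figures can be checked with a few lines of arithmetic. The sketch assumes that every brown spot prediction counts as an error in the 80.3% figure (brown spot does not occur in the test set) and is set aside as "uncertain" for the second figure; the variable names are illustrative.

```python
n_test = 300               # size of the collected test dataset
overall_accuracy = 0.803   # accuracy over all 300 images
uncertain_share = 0.16     # share of images predicted as brown spot

n_correct = overall_accuracy * n_test          # about 241 images
n_confident = (1 - uncertain_share) * n_test   # 252 images with a usable answer

accuracy_on_confident = n_correct / n_confident
print(round(accuracy_on_confident, 3))  # 0.956
```

In other words, the 95.6% figure describes the roughly 84% of images for which the model answered "healthy" or "leaf blast".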

We read the PSI as the average of the PSI indices of intermediate features, which DenseNet model gives without the most recent fully-connected layer, which translates the 1024-dimensional feature vector into the final 3-dimensional response vector:

$PSIo=∑b∈buckets(ptestb,o-pvalidb,o) lnptestb,opvalidb,o,$$PSI=11024∑o=11024PSIo.$

Here $PSI_o$ is the PSI corresponding to one of the 1024 outputs of the DenseNet-121 network; $b$ is the index of the interval (bucket) into which the studied values are binned; $p^{test}_{b,o}$ and $p^{valid}_{b,o}$ are the shares of values of output $o$ from the test and validation samples that fall into the interval $b$.
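A minimal sketch of this computation in plain Python is given below. It is not the authors' implementation: the function names, the use of 10 equal-width buckets spanning both samples, and the small constant `eps` that protects against empty buckets are all assumptions made for illustration, since the article does not state how the buckets were chosen.

```python
import math


def bucket_shares(values, lo, width, n_buckets):
    """Share of `values` falling into each of `n_buckets` equal-width buckets."""
    counts = [0] * n_buckets
    for v in values:
        # Clamp so that the maximum value lands in the last bucket.
        counts[min(int((v - lo) / width), n_buckets - 1)] += 1
    return [c / len(values) for c in counts]


def psi_one_feature(valid, test, n_buckets=10, eps=1e-6):
    """PSI of one feature: sum over buckets of (p_test - p_valid) * ln(p_test / p_valid)."""
    lo = min(min(valid), min(test))
    hi = max(max(valid), max(test))
    if hi == lo:                      # constant feature: no shift to measure
        return 0.0
    width = (hi - lo) / n_buckets
    # eps (assumed) keeps the logarithm finite when a bucket is empty.
    p_valid = [max(p, eps) for p in bucket_shares(valid, lo, width, n_buckets)]
    p_test = [max(p, eps) for p in bucket_shares(test, lo, width, n_buckets)]
    return sum((pt - pv) * math.log(pt / pv) for pt, pv in zip(p_test, p_valid))


def mean_psi(valid_rows, test_rows, n_buckets=10):
    """Average the per-feature PSI over all feature columns (1024 in the article)."""
    n_outputs = len(valid_rows[0])
    total = 0.0
    for o in range(n_outputs):
        valid_col = [row[o] for row in valid_rows]
        test_col = [row[o] for row in test_rows]
        total += psi_one_feature(valid_col, test_col, n_buckets)
    return total / n_outputs


# Toy data with two features; the first feature is shifted in the "test" sample.
valid_rows = [[0.1 * i, 1.0 - 0.01 * i] for i in range(100)]
test_rows = [[a + 5.0, b] for a, b in valid_rows]
print(mean_psi(valid_rows, valid_rows), mean_psi(valid_rows, test_rows))
```

Each term of the sum is non-negative, so the index is zero only when the bucket shares of the two samples coincide and grows as the distributions diverge.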

The calculation gives a value of 0.739, which indicates that the distribution of the features of the test dataset differs significantly from that of the original training dataset. For a model that is sensitive to such changes, one should therefore not expect high-quality predictions. In our case, on the contrary, the accuracy on the test dataset was, as mentioned above, from 80.3% to 95.6%. This indicates a high generalizing ability of the resulting model and its stability with respect to significant changes in the data. Of course, this does not negate the fact that, if possible, the model should be retrained on new data, allowing it to learn the details characteristic of the geographical area where it is to be used and thereby increasing its quality.

In this article, the authors consider the task of identifying fungal diseases of rice using modern neural network methods of computer vision. A comparison of various classical and modern convolutional neural network architectures shows that the problem lends itself very well to solution by these methods. The best result was demonstrated by the DenseNet-121 architecture, which reached an accuracy of 95.57% on the validation dataset. This architecture also demonstrated the fastest stabilization to values close to the maximum, in just 10–20 epochs. We argue that the problem of automating the detection of rice fungal diseases can be successfully solved given a competent organization of the process of collecting and preliminarily labeling the data. Finally, the best architecture was shown to be very stable to a shift in the data distribution. It is also shown that such models can be trained without serious computing power. Owing to its lightness, the final proposed solution may in a number of cases be used on mobile devices with limited computing resources.

The research was supported by the Kuban Science Foundation (No. MFI-20.1/75).

### Conflict of Interest

Fig. 1. Images from [31]: (a) healthy rice leaf, (b) brown spot, and (c) leaf blast.

Fig. 2. Accuracy curves of the CNN models during the 100-epoch training process.

Fig. 3. ROC curves for predictions of each class by the DenseNet model.

Fig. 4. Error matrix for the best model, DenseNet.

Table 1. Learning parameters comparison of the CNNs during the 250-epoch training process.

| model | weights, M | best epoch | train speed, sec/epoch | stab. epoch | learn. rate |
| --- | --- | --- | --- | --- | --- |
| ResNet-18 | 11.1 | 128 | 165 | 82 | 1e-4 + decay |
| SqueezeNet-1.0 | 0.7 | 247 | 134 | 122 | 4e-5 + decay |
| DenseNet-121 | 6.9 | 198 | 95 | 14 | 4e-5 + decay |

The stab. epoch column is the approximate number of the epoch after which the accuracy increases only slightly. The best results are shown in bold.

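Table 2. Final metrics of the models.

| model | accuracy, % | f1 macro | f1 micro | prec. macro | prec. micro | recall macro | recall micro |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-18 | 93.473 | 0.936 | 0.935 | 0.938 | 0.935 | 0.934 | 0.935 |
| SqueezeNet-1.0 | 94.172 | 0.942 | 0.942 | 0.944 | 0.942 | 0.942 | 0.942 |
| DenseNet-121 | 95.571 | 0.956 | 0.956 | 0.958 | 0.956 | 0.955 | 0.956 |

The DenseNet architecture is the best in all respects.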

1. Shahbandeh, M (2019/2020). Rice consumption worldwide in 2019/2020 by country. Available: https://www.statista.com/statistics/255971/topcountries-based-on-rice-consumption-2012-2013/
2. Zelenskij, GL (2016). Rice: biological basis of breeding and agricultural technology. Krasnodar, Russia: Kuban State Agrarian University
3. Barbedo, JGA (2016). A review on the main challenges in automatic plant disease identification based on visible range images. Biosystems Engineering. 144, 52-60. https://doi.org/10.1016/j.biosystemseng.2016.01.017
4. Shrivastava, VK, Pradhan, MK, Minz, S, and Thakur, MP (2019). Rice plant disease classification using transfer learning of deep convolutional neural network. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. XLII-3/W6, 631-635. https://doi.org/10.5194/isprs-archives-XLII-3-W6-631-2019
5. Shah, JP, Prajapati, HB, and Dabhi, VK (2016). A survey on detection and classification of rice plant diseases. Proceedings of 2016 IEEE International Conference on Current Trends in Advanced Computing (ICCTAC), Bangalore, India, pp. 1-8. https://doi.org/10.1109/ICCTAC.2016.7567333
6. Zhao, YX, Wang, KR, Bai, ZY, Li, SK, Xie, RZ, and Gao, S (2007). Bayesian classifier method on maize leaf disease identifying based images. Computer Engineering and Applications. 43, 193-195.
7. Tian, YW, Li, TL, Li, CH, Piao, ZL, Sun, GK, and Wang, B (2007). Method for recognition of grape disease based on support vector machine. Transactions of the CSAE. 23, 175-180.
8. Pantazi, XE, Tamouridou, AA, Alexandridis, TK, Lagopodi, AL, Kontouris, G, and Moshou, D (2017). Detection of Silybum marianum infection with Microbotryum silybum using VNIR field spectroscopy. Computers and Electronics in Agriculture. 137, 130-137. https://doi.org/10.1016/j.compag.2017.03.017
9. Ebrahimi, MA, Khoshtaghaza, M, Minaei, S, and Jamshidi, B (2017). Vision-based pest detection based on SVM classification method. Computers and Electronics in Agriculture. 137, 52-58. https://doi.org/10.1016/j.compag.2017.03.016
10. Moshou, D, Pantazi, XE, Kateris, D, and Gravalos, I (2014). Water stress detection based on optical multisensor fusion with a least squares support vector machine classifier. Biosystems Engineering. 117, 15-22. https://doi.org/10.1016/j.biosystemseng.2013.07.008
11. Dong, P, and Wang, X (2013). Recognition of greenhouse cucumber disease based on image processing technology. Open Journal of Applied Sciences. 3, 27-31. https://doi.org/10.4236/ojapps.2013.31B006
12. Suresha, M, Shreekanth, KN, and Thirumalesh, BV (2017). Recognition of diseases in paddy leaves using kNN classifier. Proceedings of the 2nd International Conference for Convergence in Technology (I2CT), Mumbai, India, pp. 663-666. https://doi.org/10.1109/I2CT.2017.8226213
13. Prajapati, HB, Shah, JP, and Dabhi, VK (2017). Detection and classification of rice plant diseases. Intelligent Decision Technologies. 11, 357-373. https://doi.org/10.3233/IDT-170301
14. Sanyal, P, and Patel, S (2008). Pattern recognition method to detect two diseases in rice plants. The Imaging Science Journal. 56, 319-325. https://doi.org/10.1179/174313108X319397
15. Joshi, A, and Jadhav, BD (2016). Monitoring and controlling rice diseases using image processing techniques. Proceedings of 2016 International Conference on Computing, Analytics and Security Trends (CAST), Pune, India, pp. 471-476. https://doi.org/10.1109/CAST.2016.7915015
16. Yao, Q, Guan, Z, Zhou, Y, Tang, J, Hu, Y, and Yang, B (2009). Application of support vector machine for detecting rice diseases using shape and color texture features. Proceedings of 2009 International Conference on Engineering Computation, Hong Kong, China, pp. 79-83. https://doi.org/10.1109/ICEC.2009.73
17. Majid, K, Herdiyeni, Y, and Rauf, A (2013). I-PEDIA: mobile application for paddy disease identification using fuzzy entropy and probabilistic neural network. Proceedings of 2013 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Sanur Bali, Indonesia, pp. 403-406. https://doi.org/10.1109/ICACSIS.2013.6761609
18. Xiao, M, Ma, Y, Feng, Z, Deng, Z, Hou, S, Shu, L, and Lu, Z (2018). Rice blast recognition based on principal component analysis and neural network. Computers and Electronics in Agriculture. 154, 482-490. https://doi.org/10.1016/j.compag.2018.08.028
19. Bidaux, JM (1978). Screening for horizontal resistance to rice blast (Pyricularia oryzae) in Africa. Rice in Africa: Proceedings of a conference held at the International Institute of Tropical Agriculture, 7–11 March 1977, Ibadan, Nigeria, pp. 159-174.
20. Aneja, KR (2005). Experiments in Microbiology Plant Pathology and Biotechnology. New Delhi, India: New Age International Publishers
21. Jena, KK, Moon, H, and Mackill, D (2003). Marker assisted selection: a new paradigm in plant breeding. Korean Journal of Breeding Science. 35, 133-140.
22. Mukhina, ZM, Tokmakov, SV, Myagkikh, UA, and Dubina, EV (2011). Developing of inside gene molecular markers of rice for increasing of breeding and seed production processes efficiency. Scientific Journal of KubSAU. 67, 282-292.
23. LeCun, Y, Boser, B, Denker, JS, Henderson, D, Howard, RE, Hubbard, W, and Jackel, LD (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation. 1, 541-551. https://doi.org/10.1162/neco.1989.1.4.541
24. Alom, MZ, Taha, TM, Yakopcic, C, Westberg, S, Sidike, P, Nasrin, MS, Van Esesn, BC, Awwal, AS, and Asari, VK (2018). The history began from AlexNet: a comprehensive survey on deep learning approaches. Available: https://arxiv.org/abs/1803.01164
25. He, K, Zhang, X, Ren, S, and Sun, J (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, pp. 770-778. https://doi.org/10.1109/cvpr.2016.90
26. Wu, B, Iandola, F, Jin, PH, and Keutzer, K (2017). Squeezedet: unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, pp. 129-137. https://doi.org/10.1109/CVPRW.2017.60
27. Hu, YH, Ping, XW, Xu, MZ, Dan, WX, and He, Y (2016). Detection of late blight disease on potato leaves using hyperspectral imaging technique. Guang pu xue yu guang pu fen xi. 36, 515-519.
28. Gayathri Devi, N, and Neelamegam, P (2019). Image processing based rice plant leaves diseases in Thanjavur, Tamilnadu. Cluster Computing. 22, 13415-13428. https://doi.org/10.1007/s10586-018-1949-x
29. Liu, L, and Zhou, G (2009). Identification method of rice leaf blast using multilayer perception neural network. Transactions of the Chinese Society of Agricultural Engineering. 25, 213-217.
30. Phadikar, S, Sil, J, and Das, AK (2013). Rice diseases classification using feature selection and rule generation techniques. Computers and Electronics in Agriculture. 90, 76-85. https://doi.org/10.1016/j.compag.2012.11.001
31. Do, HM (2019). Rice diseases image dataset: an image dataset for rice and its diseases. Available: https://www.kaggle.com/minhhuy2810/rice-diseases-image-dataset

Igor V. Arinichev is a Candidate of Economic Sciences and an associate professor at Kuban State University. He has authored more than 60 publications in peer-reviewed journals and conferences, as well as books on the application of mathematical methods in economics, agriculture, and technology.

E-mail: iarinichev@gmail.com

Sergey V. Polyanskikh holds a Ph.D. in Mathematics and Mechanics. He is currently a Senior Data Scientist at Plarium Inc. He has authored more than 20 theoretical and applied publications in hydrodynamics and mathematics.

E-mail:

Galina V. Volkova is a Doctor of Biological Science and Head of the Laboratory of Immunity of Cereal Crops to Fungal Diseases. She has authored more than 200 publications in peer-reviewed journals and conferences, as well as books.

E-mail:

Irina V. Arinicheva is a Doctor of Biological Sciences and a professor of the Department of Higher Mathematics (Kuban State Agrarian University). Her specialization is mathematical modeling of biological processes. She is the author of over 150 scientific articles, monographs, inventions, and educational materials for students.

E-mail:

### Abstract

In the article, the authors study the possibility of detecting some fungal diseases of rice using visual computing and machine learning techniques. Leaf blast and brown spot diseases are considered. Modern computer vision methods based on convolutional neural networks are used to identify a particular disease on an image. The authors compare the four most successful and compact convolutional neural network architectures: GoogleNet, ResNet-18, SqueezeNet-1.0, and DenseNet-121. The authors show that in the dataset used for the analysis, the disease can be detected with an accuracy of at least 95%. Testing the algorithm on real data not used in training showed an accuracy of up to 95.6%. This is a good indicator of the reliability and stability of the obtained solution even to a change in the data distribution.

Keywords: Convolutional neural networks, Machine learning, Computer vision, Rice, Fungal diseases

### 1. Introduction

Rice is one of the most important grain crops in the world. Global consumption of this cereal has increased over the past few years: in 2019 more than 490 million tons of rice were consumed worldwide, compared with 437.18 million tons in 2008. The main consumers of rice are China and India (143 and 100 million tons, respectively), which are also the main producing countries, providing more than 56% of world rice production; they are followed by Indonesia (37.7 million tons), Bangladesh (35.8 million tons), and Vietnam (21.5 million tons) [1]. Rice is also a valuable food, dietary, and medicinal product in Russia. Its share in the consumed cereals is more than 40%, while consumer demand is growing every year [2].

Diseases and pests are responsible for approximately 20%–40% of rice production losses in Russia. Fungal diseases in particular cause colossal economic damage to rice growing. Losses from blast disease alone (causative agent: Pyricularia oryzae Cavara) in Russia are, according to various estimates, from 5% to 25% in ordinary years and up to 60% or even 100% in years of epiphytotic development of the disease. The harmfulness increases significantly due to a sharp decrease in the quality of grain obtained from affected plants [2].

In today’s common pest management practices, agrochemicals are sprayed uniformly throughout the field, either as a preventive measure or when any symptoms of disease are detected. At the same time, diseases in the early stages are often identified incorrectly, and therefore the complex of agrochemicals is selected incorrectly. On the one hand, uniform spraying greatly increases the cost of disease control, since at least in the beginning the infection is concentrated mainly around the original foci. On the other hand, the larger amount of applied chemicals increases the likelihood of groundwater contamination and leads to toxic residues in agricultural products.

Thus, there is no doubt about the relevance of the problem of timely and accurate detection and classification of rice diseases. The traditional practice of detecting fungal diseases is based either on visual, pathogen-induced symptoms or on laboratory identification of the pathogen [3]. Visual assessment is subjective and, in some cases, can be inconsistent, leading to an incorrect diagnosis of the disease. Sometimes the visual classification is complicated by the fact that plants are affected by several diseases at the same time and many of their traits overlap within a single plant specimen. Identification in the laboratory, in turn, is a laborious process that requires time-consuming pathogen cultivation. In any case, both of these methods require the participation of high-level professionals in the detection process, who are not always available, especially on small farms [4, 5].

These limitations have contributed to the emergence of a significant number of works devoted to applying machine learning methods to the automatic detection and classification of crop diseases based on digital images. The general methodology in such studies is similar. First, images of diseases are captured using cameras or scanners. Second, the affected areas (spots) are separated from the background. Third, peculiarities of color, shape, or texture are extracted as features. Finally, classification techniques such as neural networks, Bayesian classifiers, k-nearest neighbors (kNN), support vector machines (SVMs), and others are used to classify disease images. For example, Zhao et al. [6] used a Bayesian classifier to detect five kinds of maize leaf diseases. The results showed that the accuracy of disease detection based on the proposed approach was higher than 83%. KY Huang (Huang, 2007) applied a backpropagation neural network classifier. The preprocessing involved segmentation of the lesions of three classes of disease: bacterial soft rot (BSR), bacterial brown spot (BBS), and Phytophthora black rot (PBR). They were segmented by an exponential transform and image processing techniques (gray level co-occurrence matrix, GLCM). The classification accuracy on test data was about 90%.

Tian et al. [7] recognized traits based on the textures, shapes, and colors of three types of vine leaf diseases and classified the diseases using an SVM. The results showed that the efficiency of SVM classification was higher than that of neural networks. Pantazi et al. [8] present a tool for the detection and recognition of healthy plants of Silybum marianum and those infected with the fungus Microbotryum silybum during vegetative growth. An experimental sample, including both classes, was used to obtain leaf spectra using a portable device. Pre-processing of the spectra included normalization and principal component analysis (PCA). Three models were used to identify systemically infected plants: supervised Kohonen network (SKN), counter propagation artificial neural network (CP-ANN), and XY-fusion network (XY-F). The identification rates reached the highest accuracy of 95% using the XY-F.

In [9], the authors proposed an SVM-based approach to the automatic detection and classification of parasites on strawberry plants under greenhouse conditions. To increase the performance of the algorithm, non-flower regions were considered as background and removed using the gamma operator. The case study [10] presented the development of a system that automatically recognizes water-stressed and healthy winter wheat plants in the presence of a Septoria tritici infection. The detection algorithm was based on the combination of a least squares support vector machine (LSSVM) with sensor fusion. Using LSSVM, the classification performance was over 99%.

The study by Dong and Wang [11] is mainly dedicated to image processing and recognition technologies for cucumber downy mildew and powdery mildew. The preprocessing included median filtering, as well as the extraction of a number of features based on color range segmentation, geometric parameters of the affected area, and texture parameters obtained using a GLCM. The diseased plants were classified using the shortest distance method. The experiment showed that the method achieved a disease recognition accuracy of over 96%.

Considering the focus of our article, the papers dedicated to the detection and classification of rice diseases were of particular interest. In [4, 5], a machine learning approach for the detection and recognition of rice diseases is presented using a sample of 619 images for four classes: (a) rice blast (RB), (b) bacterial leaf blight (BLB), (c) sheath blight (SB), and (d) healthy leaf (HL). A pre-trained deep convolutional neural network (CNN) was used as a feature extractor and an SVM as a classifier. Much attention in the article is paid to preprocessing, since plants can carry dust, dew drops, etc., which make the data noisy and, in turn, create problems at the stages of segmentation and feature extraction. Suresha et al. [12] proposed a method for identifying blast and brown spot diseases using a kNN classifier based on a number of geometric features: area, major and minor semi-axes of spots, and perimeter of the affected part of the leaf.

The authors of [13] used the K-means clustering method for feature extraction, having first removed green pixels from the affected parts of the leaf. Features were extracted in three categories (color, shape, and texture), and then an SVM was applied to identify rice diseases in a multiclass manner. Sanyal and Patel [14] proposed a method for extracting features of rice diseases based on color and texture, which were fed to the input of a multilayer perceptron (MLP) for disease identification. Joshi and Jadhav [15] proposed extracting features based on color and shape, which, in turn, were fed to the input of a minimum distance classifier (MDC) and a kNN classifier for disease identification. In the same vein, the authors of [16] proposed combining texture and shape features of the rice leaf to detect different rice diseases using an SVM classifier. Majid et al. [17] used fuzzy entropy and a probabilistic neural network classifier to identify diseases. Xiao et al. [18] explored PCA for dimensionality reduction of features and used a back propagation (BP) neural network model to classify different rice diseases.

The solution to this problem is of high practical value, since it will allow farmers, using the cameras built into mobile devices, not only to identify diseases but also to immediately design the optimal course of treatment.

We believe that in the near future the main task of visual detection of plant diseases will be partially or completely solved by artificial intelligence systems. Such systems can be built as client-server applications installed on mobile or stationary devices. Systems of this kind are more heavyweight, but they allow machine learning models of any complexity to be included in their pipeline. Another type of system is a lightweight application architecture that works autonomously or semi-autonomously on mobile devices and requires neither a large amount of computing resources nor constant access to the Internet. Such solutions are the most convenient to use, but they have a number of limitations. In this work, we have concentrated on systems of the second type, which are able to work autonomously on devices. For this kind of system, the lightness of the underlying neural network architecture is of paramount importance.

In the future, the resulting solution will make it possible to address the following problems, which occur everywhere in agriculture in different countries: (i) a small or absent staff of phytopathologists, (ii) low qualification of specialists, and (iii) the high cost of training specialists or attracting them from outside.

Section 2 describes the main methods used by phytopathologists to identify rice diseases at different stages of the growing season. We are especially interested in the visual method, as it is the one most easily replaced by the methods of modern computer vision. This is followed by an overview of the main neural network architectures suitable for solving this problem. We describe the model dataset used to illustrate the proposed methods. Various data preprocessing options are also considered that help improve the quality of the final disease identification algorithm. Section 3 describes the main results of the numerical experiment. In Section 4, we discuss the results and examine how stable the model is to significant changes in the data distribution. We test the trained model on data collected in a different geographic area. It is shown that the model works quite well on a completely new dataset of rice disease images.

### 2.1 Classical Methods for the Detection of Rice Diseases

Expert phytopathologists are currently using the following approaches to identify various rice diseases:

• Visual method - establishing the external symptoms of the disease, degree of disease development and its prevalence [19].

• Microscopic method - determination of the nature of changes in the diseased plant tissue, detection of the pathogen and its sporulation.

• Biological method - artificial infection; the degree of damage is determined in percentage according to the guidelines of VNIIF.

• Cultural method - the fungus (for spots) is isolated on a nutrient medium and its cultural and morphological characteristics are monitored [20].

• Molecular genetic method - diagnostics of rust fungi using polymerase chain reaction [21, 22].

In this paper, we focus on the first method, since it is perhaps the one to which modern machine learning methods are most easily adapted, replacing the human eye and an expert phytopathologist with a computer algorithm. Neural network algorithms have shown the best quality of all machine learning algorithms used for similar tasks. They allow one to automatically detect the presence of a disease in the image. In turn, among all neural network architectures, it is convolutional neural networks that have established themselves as state-of-the-art for digit recognition, having recently become the de facto main method used in computer vision to solve problems of classification, detection, and segmentation of objects in images. Let us briefly describe the basic ideas and techniques used by convolutional networks, as well as some of the reasons for their success.

### 2.2 An Overview of Convolutional Neural Networks

The main idea behind CNNs is an attempt to bring a network closer to the mechanism of human vision. LeCun et al. [23] showed that, for image recognition by a neural network, it is sufficient to use not all connections between neurons of neighboring layers of the network, but only a small number of them. This is basically the simplest model of how vision works: an eye trying to find an object in an image focuses sequentially on different parts of it rather than on the entire image at once. Mathematically, such “eye focusing” corresponds to the convolution operation, which matches an area of the image against some desired pattern, in our case the disease manifested on the rice leaf. Various CNN architectures can be obtained by combining such convolution operations into layers in different ways. Today this is a solid foundation of modern computer vision and helps to successfully solve a wide range of problems, such as classification, clustering, segmentation, etc.
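The pattern-matching reading of convolution can be illustrated with a small sketch in plain Python. The toy image, the 2×2 kernel, and the function name are hypothetical and chosen only for illustration; as in most CNN libraries, the kernel is slid over the image without flipping (strictly speaking, a cross-correlation), and in a real network the kernel values are learned rather than set by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 5x5 "leaf" with one bright 2x2 spot (rows 2-3, columns 1-2).
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
kernel = [[1, 1], [1, 1]]   # hand-made "spot" pattern

response = conv2d_valid(image, kernel)
# Position of the strongest match between the pattern and the image.
best = max(
    ((i, j) for i in range(len(response)) for j in range(len(response[0]))),
    key=lambda pos: response[pos[0]][pos[1]],
)
print(best, response[best[0]][best[1]])
```

The response is largest where the window fully overlaps the spot, which is the sense in which each convolution kernel acts as a detector for one local pattern.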

Historically, the work of LeCun et al. [23] is considered to be the starting point of convolutional neural networks. LeCun’s convolutional network took up less memory than its fully connected version, learned much faster, and was able to recognize handwritten digits with an accuracy no worse than that of a human. LeCun’s approach qualitatively changed the way we look at things. Before it, neural networks were considered a more or less universal but very heavyweight algorithm, capricious to set up and requiring large computing power. Now the process of building and training a network has become simpler and closer to the ideas of the biology of vision itself.

Then AlexNet was introduced. It won the ImageNet 2012 challenge with a test accuracy of 84.6%. In all, about 1.2 million training images were used for classification [24]. This is an incomparably more difficult task than the classification of black-and-white digits, and it required an increase in the size of the neural network by orders of magnitude: the number of network parameters increased from 60k in LeNet-5 to more than 60M in AlexNet. This was such a large number at the time that the network architecture itself had to be split across two GPUs, each of which computed one of the two large network branches. And yet it was a major breakthrough. With the advent of AlexNet, CNNs, for the first time since LeNet, established themselves as state-of-the-art algorithms for working with large color images, capable of detecting many patterns and shapes in them.

The next big breakthrough was the VGG CNN architecture, which relied on even more parameters [25]. VGG-16 won the ImageNet challenge in 2013 and had a total of 138M trained parameters. The use of a large number of parameters and small convolution kernels made it possible to achieve 92.7% test accuracy on ImageNet.

By 2013, convolutional architectures had achieved undoubted success. However, this success was associated not with new ideas, as in the case of LeNet, but mainly with quantitative improvements and increased computing power. The size of the networks had grown from tens of thousands to hundreds of millions of parameters, again making them difficult, time-consuming, and expensive to train. At this point, it was unclear whether such complexity of the tool was justified by the true complexity of the problem or whether, as in the case of classical neural networks, new ideas were required.

Another huge boost to computer vision came in 2014, when Google introduced its GoogLeNet Inception architecture, which won the ImageNet competition with 93.3% test accuracy [25]. The network had only 6M parameters, that is, about 10 times fewer than AlexNet and more than 20 times fewer than VGG-16. The progressive idea that made it possible to significantly reduce the number of parameters and improve the quality of predictions was the use of special inception blocks that concatenate convolutions of different sizes. This allowed the algorithm to see details at different scales at once and decide which one is most significant for a given image.

The ResNet architecture became the next significant success in the world of computer vision. It won the ImageNet challenge in 2015 with 96.43% test accuracy [25]. The breakthrough idea here was the use of residual blocks, which make the network learn the identity mapping along with the hidden dependencies. This solved the degradation problem and significantly improved the quality of predictions. An interesting feature of residual links is that they can be added to almost any architecture, improving its convergence and quality without increasing the total number of parameters. ResNet itself has 10M parameters or more, depending on the details of the architecture.
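The role of the identity mapping can be seen in a deliberately simplified sketch. Here a "layer" is any function on a list of numbers, and the names `residual_block`, `zero_branch`, and `small_branch` are hypothetical; real residual blocks operate on image tensors and use convolutions, batch normalization, and nonlinearities in the learned branch.

```python
def residual_block(x, branch):
    """Return branch(x) + x element-wise: the learned branch only adds a correction to x."""
    fx = branch(x)
    return [a + b for a, b in zip(fx, x)]


def zero_branch(x):
    """A branch that has learned nothing useful: it outputs zeros."""
    return [0.0] * len(x)


def small_branch(x):
    """A branch that applies a small correction (10% of the input)."""
    return [0.1 * v for v in x]


x = [1.0, -2.0, 3.5]
unchanged = residual_block(x, zero_branch)    # the block reduces to the identity
corrected = residual_block(x, small_branch)   # input plus a small learned correction
print(unchanged, corrected)
```

Because a block whose branch outputs zeros simply passes its input through, stacking more such blocks cannot make the representation worse by itself, which is one common way of explaining why residual links ease the training of deep networks.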

The SqueezeNet and DenseNet architectures are also worth mentioning. SqueezeNet reached the quality level of AlexNet with 50 times fewer parameters. This was achieved by using 1×1 convolutions and shrinking larger convolution kernels, thereby reducing the number of input channels in each layer [26]. DenseNet also followed the path of qualitative changes. It is somewhat similar to ResNet but concatenates the previous layers with the subsequent ones rather than summing them. As a result, with about 7M parameters, comparable to GoogLeNet, it achieves faster convergence and a slight increase in quality.

In this research, we consider four of the most well-proven, relatively lightweight architectures: GoogLeNet, ResNet-18, SqueezeNet-1.0, and DenseNet-121. The AlexNet and VGG architectures are rather heavy and usually give quality no higher than the later architectures, so we do not consider them.

### 2.3 Dataset Description

A good training set is required to train a neural network, like any supervised machine learning algorithm. Obtaining appropriately collected and preprocessed training data is usually the most challenging part of the task, as it requires detailed analysis from the perspective of both the business and the end users of the product. This is emphasized in the study by Hu et al. [27] and in others [28, 29]. Phadikar et al. [30] point out the exceptional role of competent collection of training data for detecting diseases of various agricultural crops. In particular, they note that difficulties can arise from illumination conditions, photo noise, and insufficient severity of the disease.

Thus, it is not possible to train a neural network “for all occasions” at the current level of technology development. One needs a clear idea of how the trained neural network will be used. Namely, before the work begins, the following parameters should be defined: (1) general conditions of photography, (2) shooting angle, (3) ranges of brightness and contrast, (4) possible noise and distortion, (5) illumination concerns, and (6) background influence.

One can limit the terms of photography and require end users to comply with them in order to increase the quality of the final neural network. Otherwise, no algorithm can guarantee the validation accuracy achieved during training.

In this research we use the dataset [31], slightly expanded with images that are freely available on the Internet. We exclude rice hispa disease as irrelevant for the South of Russia. In the end, we work with a dataset of 4,278 images: 1,488 healthy images, 1,195 images with brown spot disease, and 1,595 images with leaf blast disease (Figure 1).

Note that a single rice leaf can be infected with several diseases simultaneously, in which case the task becomes one of non-strict (multi-label) classification. Nevertheless, the dataset is labeled strictly, and its visual analysis confirms this. Thus, in this research we consider the case of strict multiclass classification: no more than one disease per leaf.

In practice, this means that even an ideal model will most likely have a certain upper accuracy threshold below 100%, which it cannot exceed without retraining. In this research we demonstrate that even under the assumption of strict multiclass classification, quite good accuracy can be achieved on the validation set: up to 96%.

In August 2020 we collected a test dataset for additional validation of the model results. The dataset contains 300 observations; the data were collected on the territory of the educational farm “Kuban” and Kuban State Agrarian University. For a number of reasons, only leaf blast disease was available for analysis at this time of the year. Thus, the final quality assessment of the model was carried out on two classes only: healthy leaves and leaves infected with leaf blast. Checks of this kind are extremely important, since the distribution of the dataset on which the model was trained can quite often differ from the distribution on which the model is ultimately applied. Machine learning models are usually quite sensitive to changes in distribution, and it is necessary to know to what extent the model proposed in this work is resistant to them.

### 2.4 Data Preprocessing

As noted earlier, the collection of a dataset must first of all be aimed at the end user of the model. But even when the quality and shooting conditions are controlled, both when collecting data and when using a trained model, a number of fundamental problems may arise that can significantly degrade the quality of the model. Among them are the following:

• insufficient sample size;

• natural invariance of predictions regarding rotation / image reflections;

• instability of predictions, when even insignificant noise can change the result;

• the effect of overfitting, when the quality of predictions on new images turns out to be significantly lower than on training images.

All these problems can be dealt with to a certain extent by organizing competent preprocessing of the original images. Here we use the following preprocessing stages for the original dataset:

• rotation by a random angle from 0° to 45°;

• flipping the image along the main axes;

• standard normalization of RGB image channels.

As a result, the effective size of the training sample grows, which increases the stability of predictions and makes them invariant to image rotations.
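The three preprocessing stages listed above can be illustrated with a small dependency-free sketch. It is not the authors' pipeline: the image is represented as a nested list of RGB tuples, the rotation uses nearest-neighbor sampling with a black fill, and the flip probability of 0.5 as well as the `mean` and `std` values passed to `normalize` are assumptions made purely for illustration.

```python
import math
import random

def hflip(image):
    """Mirror the image left-right by reversing every row."""
    return [row[::-1] for row in image]

def vflip(image):
    """Mirror the image top-bottom by reversing the row order."""
    return [row[:] for row in image[::-1]]

def rotate(image, angle_deg, fill=(0.0, 0.0, 0.0)):
    """Rotate around the image center using nearest-neighbor sampling."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel for each output pixel.
            dx, dy = x - cx, y - cy
            sx = round(cos_a * dx + sin_a * dy + cx)
            sy = round(-sin_a * dx + cos_a * dy + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out

def normalize(image, mean, std):
    """Standardize each RGB channel: (value - mean) / std."""
    return [[tuple((c - m) / s for c, m, s in zip(px, mean, std)) for px in row]
            for row in image]

def augment(image, mean, std, rng=random):
    """Random rotation in [0, 45] degrees, random flips, then normalization."""
    out = rotate(image, rng.uniform(0.0, 45.0))
    if rng.random() < 0.5:  # assumed flip probability
        out = hflip(out)
    if rng.random() < 0.5:  # assumed flip probability
        out = vflip(out)
    return normalize(out, mean, std)
```

In a real training pipeline these steps would normally come from an image-augmentation library rather than hand-written loops; the sketch only makes the geometry and the normalization explicit.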

### 2.5 Model Architectures

We have chosen the categorical cross-entropy function as the loss function, which is typical for multiclass classification problems. In our case of three classes, it can be written as

$L(w)=-\frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{3}y_{ik}\ln p_{ik},$

where $n$ is the number of samples, $y_{ik}$ are the ground truth answers (1 or 0) for sample $i$ and class $k$, and $p_{ik}$ are the softmax model predictions, which depend on the model weights $w$. This loss function is most natural for classification problems, since it has a clear probabilistic interpretation and, due to the logarithm, greatly penalizes the model for incorrect answers.
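As a minimal sketch of the loss above, the following plain-Python code applies a softmax to raw model outputs and averages the negative log-probability assigned to the true class. Because the ground truth is one-hot, the inner sum over the three classes reduces to a single term per sample. The function names and the small `eps` guard against `log(0)` are illustrative assumptions, not part of the original work.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits_batch, labels, eps=1e-12):
    """Mean categorical cross-entropy for integer class labels (0, 1, 2)."""
    loss = 0.0
    for logits, k in zip(logits_batch, labels):
        p = softmax(logits)
        # With one-hot targets only the true-class term survives the inner sum.
        loss -= math.log(p[k] + eps)
    return loss / len(labels)

# A confident correct answer is penalized far less than a confident wrong one.
print(cross_entropy([[5.0, 0.0, 0.0]], [0]))
print(cross_entropy([[5.0, 0.0, 0.0]], [2]))
```

For a model that assigns equal probability to all three classes, the loss equals ln 3, which is a useful reference point when reading training curves.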

We use PyTorch v1.6.0, a machine learning library based on the Torch library, which provides all the necessary functionality for training and subsequent use of modern neural network architectures. This framework is widely used due to its simplicity and broad functionality. Nevertheless, the architectures used here are quite popular, and their implementations can be found in other frameworks, e.g., TensorFlow and Caffe. The neural network was trained on a stationary computer with the following configuration: Intel Core i7, GeForce GTX 1660, GTX 1080 4 GB.

The Adam algorithm, which has proved to work very well for such tasks, was used as the optimization method. We used Adam with default settings for all models, but an optimal static learning rate was selected separately for each model (see Table 1).
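To make the optimizer concrete, here is a hedged plain-Python sketch of a single Adam update. The values `beta1=0.9`, `beta2=0.999`, and `eps=1e-8` are the usual PyTorch defaults; the learning rate `1e-4` is the value listed for ResNet-18 in Table 1. The decay schedule is not specified in the text, so it is omitted, and the helper names are hypothetical.

```python
import math

def init_state(n):
    """Create empty first- and second-moment estimates for n weights."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n}

def adam_step(w, grad, state, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a list of weights and return the new weights."""
    state["t"] += 1
    t = state["t"]
    new_w = []
    for i, (wi, gi) in enumerate(zip(w, grad)):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * gi
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * gi * gi
        # Bias correction compensates for the zero initialization.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        new_w.append(wi - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_w

# Toy usage: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, state = [0.0], init_state(1)
for _ in range(200):
    w = adam_step(w, [2 * (w[0] - 3.0)], state, lr=0.1)
```

Note that on the very first step the update magnitude is approximately equal to the learning rate regardless of the gradient scale, which is one reason the learning rate is the main setting tuned per model.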

In this work, we specifically focused on lighter architectures that, if necessary, can be run directly on a mobile device. We trained the models using the PyTorch framework, starting from the standard pretrained models in the torchvision module and fully fine-tuning them on the dataset described above. We examined a number of CNN architectures, both classical and modern, and then chose the following ones: GoogLeNet, ResNet-18, SqueezeNet-1.0, and DenseNet-121. They produce the most promising results and at the same time are the most compact. For example, the computationally heavy VGG and AlexNet showed results similar to, but somewhat worse than, those given below, while requiring much more computing resources at both the training and the prediction stages. The training process is shown in Figure 2.

A comparison of the learning parameters of the CNNs is presented in Table 1. DenseNet-121 achieved the best accuracy with a relatively small number of parameters and stabilized in the shortest time (about 14 epochs). The GoogLeNet architecture showed the second best result, stabilizing a little more slowly. ResNet-18 turned out to be the third in accuracy; however, it has significantly more parameters than the others. The SqueezeNet-1.0 architecture deserves special mention: it was not much inferior to the others in accuracy, stabilized quickly, and has the fewest parameters, just about 750K.

### 3. Results

Table 2 shows the final quality metrics of the models under consideration: accuracy, as well as micro- and macro-averaged F1, precision, and recall. DenseNet-121 turns out to be the best in all characteristics. Next come GoogLeNet and SqueezeNet. The heavyweight ResNet architecture turns out to be worse than the others in this case.
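The difference between micro and macro averaging can be sketched from a confusion matrix as follows. The matrix below is hypothetical and is not taken from the paper; macro F1 is computed here as the mean of per-class F1 scores, which is one common convention and an assumption about how the reported values were obtained.

```python
def averaged_metrics(cm):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    k = len(cm)
    tp = [cm[c][c] for c in range(k)]
    fp = [sum(cm[r][c] for r in range(k)) - tp[c] for c in range(k)]
    fn = [sum(cm[c]) - tp[c] for c in range(k)]

    def ratio(num, den):
        return num / den if den else 0.0

    def f1(p, r):
        return ratio(2 * p * r, p + r)

    prec = [ratio(tp[c], tp[c] + fp[c]) for c in range(k)]
    rec = [ratio(tp[c], tp[c] + fn[c]) for c in range(k)]
    # Macro: average the per-class scores, so every class weighs the same.
    macro = {
        "precision": sum(prec) / k,
        "recall": sum(rec) / k,
        "f1": sum(f1(p, r) for p, r in zip(prec, rec)) / k,
    }
    # Micro: pool the counts over all classes before taking the ratio.
    micro_p = ratio(sum(tp), sum(tp) + sum(fp))
    micro_r = ratio(sum(tp), sum(tp) + sum(fn))
    micro = {"precision": micro_p, "recall": micro_r, "f1": f1(micro_p, micro_r)}
    accuracy = ratio(sum(tp), sum(sum(row) for row in cm))
    return accuracy, macro, micro

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
cm = [[50, 2, 1], [3, 40, 5], [2, 4, 45]]
accuracy, macro, micro = averaged_metrics(cm)
```

In single-label multiclass classification the micro-averaged precision, recall, and F1 all coincide with accuracy, which agrees with the micro columns of Table 2 matching the accuracy column after rounding.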

Figure 3 shows one-vs-all ROC curves for each class, built on the validation set according to the predictions of the best DenseNet model. All ROC-AUC scores are close to 1. A closer look at the errors of the algorithms shows that GoogLeNet is slightly better at predicting the presence of a disease, but it can confuse brown spot with leaf blast. The ResNet and SqueezeNet architectures show similar, average results. The most accurate DenseNet architecture misses only a small fraction of the diseased plants and identifies the remaining cases correctly (see Figure 4).

### 4. Discussion

As shown above, DenseNet demonstrated the most accurate results among the network architectures considered. It has 6.9M parameters, which is small compared to ResNet-18, and can be trained on an average computer with 2–4 GB of GPU memory. DenseNet accuracy and other significant metrics are above 0.95, which is close to human-level quality of disease classification.

Since the models were trained on data from the Internet, the question is to what extent the obtained model will be applicable to the realities of another region of the world without additional training on additional data. Putting the question more mathematically, it is interesting to find out how much the distribution of data collected elsewhere will differ from the distribution of the data on which the model was trained, and how stable the resulting model will be to such changes. In other words, the question is whether the resulting model generalizes well enough to predict the same diseases in completely different photographs of rice.

As noted above, we collected a test dataset of 150 healthy leaves and 150 leaves infected with leaf blast disease. Brown spot disease was no longer observed at the indicated time, so only two of the three initially available classes were used for testing.

Applying the best model (DenseNet) to the collected dataset gave a classification accuracy of 80.3%. If we treat predictions of brown spot, a class that is absent from the test dataset, as the model’s uncertainty in its answer, then the accuracy on the remaining images is 95.6%, with an uncertainty share of 16%. To understand whether the accuracy has dropped dramatically, it is useful to examine how much the distribution of the data has changed. One of the most popular tools for this is the Population Stability Index (PSI). PSI is a simple but reliable indicator that the distribution has changed dramatically and that the model is probably not worth relying on.
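The relationship between the three reported percentages can be checked with simple arithmetic. The counts used below (241 correct answers and 48 brown spot predictions out of 300 images) are not given in the text; they are back-calculated assumptions that happen to reproduce the reported figures, and the function name is hypothetical.

```python
def summarize(n_total, n_correct, n_uncertain):
    """Return overall accuracy, uncertain share, and accuracy on the rest (in %)."""
    accuracy_all = 100.0 * n_correct / n_total
    uncertain_share = 100.0 * n_uncertain / n_total
    # Uncertain answers are excluded from the denominator; a correct answer
    # can never be "uncertain" because brown spot is absent from the test set.
    accuracy_confident = 100.0 * n_correct / (n_total - n_uncertain)
    return accuracy_all, uncertain_share, accuracy_confident

# Assumed counts, back-calculated from the reported percentages.
acc_all, unc, acc_conf = summarize(n_total=300, n_correct=241, n_uncertain=48)
print(round(acc_all, 1), round(unc, 1), round(acc_conf, 1))  # prints: 80.3 16.0 95.6
```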

We compute the PSI as the average of the PSI indices of the intermediate features that the DenseNet model produces before its last fully connected layer, which translates the 1024-dimensional feature vector into the final 3-dimensional response vector:

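$\mathrm{PSI}_o=\sum_{b\in\text{buckets}}\left(p^{\text{test}}_{b,o}-p^{\text{valid}}_{b,o}\right)\ln\frac{p^{\text{test}}_{b,o}}{p^{\text{valid}}_{b,o}},$

$\mathrm{PSI}=\frac{1}{1024}\sum_{o=1}^{1024}\mathrm{PSI}_o.$

Here $\mathrm{PSI}_o$ is the PSI corresponding to one of the 1024 outputs of the DenseNet-121 network; $b$ is the index of the interval (bucket) into which the studied values are binned; $p^{\text{test}}_{b,o}$ and $p^{\text{valid}}_{b,o}$ are the shares of values from the test and validation samples that fall into interval $b$.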
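A minimal sketch of this calculation in plain Python is given below. The text does not specify how the buckets are built, so the sketch assumes 10 equal-width buckets spanning the validation range of each feature, clamps out-of-range test values to the edge buckets, and floors empty-bucket shares at a small `eps` to keep the logarithm finite. All function names are hypothetical.

```python
import bisect
import math

def bucket_shares(values, edges, eps=1e-6):
    """Share of values per bucket; out-of-range values go to the edge buckets."""
    n_buckets = len(edges) - 1
    counts = [0] * n_buckets
    for v in values:
        b = bisect.bisect_right(edges, v) - 1
        counts[min(max(b, 0), n_buckets - 1)] += 1
    # Floor at eps so that empty buckets do not produce log(0).
    return [max(c / len(values), eps) for c in counts]

def psi_one_feature(test_values, valid_values, n_buckets=10):
    """PSI of a single feature between the test and validation samples."""
    lo, hi = min(valid_values), max(valid_values)
    width = (hi - lo) / n_buckets or 1.0  # constant feature: use unit width
    edges = [lo + i * width for i in range(n_buckets + 1)]
    p_test = bucket_shares(test_values, edges)
    p_valid = bucket_shares(valid_values, edges)
    return sum((pt - pv) * math.log(pt / pv) for pt, pv in zip(p_test, p_valid))

def psi_mean(test_features, valid_features, n_buckets=10):
    """Average PSI over all feature columns (1024 columns for DenseNet-121)."""
    n_features = len(valid_features[0])
    total = 0.0
    for o in range(n_features):
        total += psi_one_feature([row[o] for row in test_features],
                                 [row[o] for row in valid_features],
                                 n_buckets)
    return total / n_features
```

Here each row of `test_features` and `valid_features` stands for the feature vector of one image. Identical samples give a PSI of exactly zero, and the index grows as the two distributions move apart.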

As a result of the calculations, we get a value of 0.739, which indicates that the distribution of the features of the test dataset differs significantly from the distribution of the original training dataset. Therefore, one should not expect high-quality predictions from a model that is sensitive to such changes. In our case, on the contrary, the accuracy on the test dataset was, as mentioned above, from 80.3% to 95.6%. This indicates a high generalizing ability of the resulting model and its stability with respect to significant data changes. This of course does not negate the fact that, if possible, the model should be retrained on new data, allowing it to learn the details characteristic of the geographical area where it is planned to be used, thereby increasing its quality.

### 5. Conclusion

In the article, the authors consider the task of identifying fungal diseases of rice using modern neural network methods of computer vision. A comparison of various classical and modern architectures of convolutional neural networks shows that the problem lends itself very well to solution by these methods. The best result was demonstrated by the DenseNet-121 architecture, which reached an accuracy of 95.57% on the validation dataset. This architecture also demonstrated the fastest stabilization to values close to the maximum, in just 10–20 epochs. We argue that the problem of automating the detection of rice fungal diseases can be successfully solved given a competent organization of the process of collecting and labeling the data. Finally, the best architecture was shown to be very stable to a shift in the data distribution. It is also shown that such models can be trained without serious computing power. Owing to its lightness, the final proposed solution may in a number of cases be used on mobile devices with limited computing resources.

### Fig 1.

Figure 1.

Images from [31]: (a) healthy rice leaf, (b) brown spot, and (c) leaf blast.

The International Journal of Fuzzy Logic and Intelligent Systems 2021; 21: 1-11https://doi.org/10.5391/IJFIS.2021.21.1.1

### Fig 2.

Figure 2.

Accuracy curves of the CNN models during the 100-epoch training process.

### Fig 3.

Figure 3.

ROC curves for predictions of each class by the DenseNet model.

### Fig 4.

Figure 4.

Error matrix for the best model - DenseNet.

Table 1. Learning parameters comparison of the CNNs during the 250-epoch training process.

model | weights, M | best epoch | train speed, sec/epoch | stab. epoch | learn. rate
ResNet-1811.1128165821e-4 + decay
SqueezeNet-1.00.72471341224e-5 + decay
DenseNet-1216.919895144e-5 + decay

The “stab. epoch” column is the approximate number of the epoch after which the accuracy increases only slightly. The best results are shown in bold.

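Table 2. Final metrics of the models.

| model | accuracy, % | F1 macro | F1 micro | precision macro | precision micro | recall macro | recall micro |
|---|---|---|---|---|---|---|---|
| ResNet-18 | 93.473 | 0.936 | 0.935 | 0.938 | 0.935 | 0.934 | 0.935 |
| SqueezeNet-1.0 | 94.172 | 0.942 | 0.942 | 0.944 | 0.942 | 0.942 | 0.942 |
| DenseNet-121 | 95.571 | 0.956 | 0.956 | 0.958 | 0.956 | 0.955 | 0.956 |

The DenseNet architecture is the best in all respects.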

### References

1. Shahbandeh, M. (2019/2020). Rice Consumption Worldwide in 2019/2020 by country. Available: https://www.statista.com/statistics/255971/topcountries-based-on-rice-consumption-2012-2013/
2. Zelenskij, GL (2016). Rice: biological basis of breeding and agricultural technology. Krasnodar, Russia: Kuban State Agrarian University
3. Barbedo, JGA (2016). A review on the main challenges in automatic plant disease identification based on visible range images. Biosystems Engineering. 144, 52-60. https://doi.org/10.1016/j.biosystemseng.2016.01.017
4. Shrivastava, VK, Pradhan, MK, Minz, S, and Thakur, MP (2019). Rice plant disease classification using transfer learning of deep convolutional neural network. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. XLII-3/W6, 631-635. https://doi.org/10.5194/isprs-archives-XLII-3-W6-631-2019
5. Shah, JP, Prajapati, HB, and Dabhi, VK (2016). A survey on detection and classification of rice plant diseases. Proceedings of 2016 IEEE International Conference on Current Trends in Advanced Computing (ICCTAC), Bangalore, India, pp. 1-8. https://doi.org/10.1109/ICCTAC.2016.7567333
6. Zhao, YX, Wang, KR, Bai, ZY, Li, SK, Xie, RZ, and Gao, S (2007). Bayesian classifier method on maize leaf disease identifying based images. Computer Engineering and Applications. 43, 193-195.
7. Tian, YW, Li, TL, Li, CH, Piao, ZL, Sun, GK, and Wang, B (2007). Method for recognition of grape disease based on support vector machine. Transactions of the CSAE. 23, 175-180.
8. Pantazi, XE, Tamouridou, AA, Alexandridis, TK, Lagopodi, AL, Kontouris, G, and Moshou, D (2017). Detection of Silybum marianum infection with Microbotryum silybum using VNIR field spectroscopy. Computers and Electronics in Agriculture. 137, 130-137. https://doi.org/10.1016/j.compag.2017.03.017
9. Ebrahimi, MA, Khoshtaghaza, M, Minaei, S, and Jamshidi, B (2017). Vision-based pest detection based on SVM classification method. Computers and Electronics in Agriculture. 137, 52-58. https://doi.org/10.1016/j.compag.2017.03.016
10. Moshou, D, Pantazi, XE, Kateris, D, and Gravalos, I (2014). Water stress detection based on optical multisensor fusion with a least squares support vector machine classifier. Biosystems Engineering. 117, 15-22. https://doi.org/10.1016/j.biosystemseng.2013.07.008
11. Dong, P, and Wang, X (2013). Recognition of greenhouse cucumber disease based on image processing technology. Open Journal of Applied Sciences. 3, 27-31. https://doi.org/10.4236/ojapps.2013.31B006
12. Suresha, M, Shreekanth, KN, and Thirumalesh, BV (2017). Recognition of diseases in paddy leaves using kNN classifier. Proceedings of the 2nd International Conference for Convergence in Technology (I2CT), Mumbai, India, pp. 663-666. https://doi.org/10.1109/I2CT.2017.8226213
13. Prajapati, HB, Shah, JP, and Dabhi, VK (2017). Detection and classification of rice plant diseases. Intelligent Decision Technologies. 11, 357-373. https://doi.org/10.3233/IDT-170301
14. Sanyal, P, and Patel, S (2008). Pattern recognition method to detect two diseases in rice plants. The Imaging Science Journal. 56, 319-325. https://doi.org/10.1179/174313108X319397
15. Joshi, A, and Jadhav, BD (2016). Monitoring and controlling rice diseases using image processing techniques. Proceedings of 2016 International Conference on Computing, Analytics and Security Trends (CAST), Pune, India, pp. 471-476. https://doi.org/10.1109/CAST.2016.7915015
16. Yao, Q, Guan, Z, Zhou, Y, Tang, J, Hu, Y, and Yang, B (2009). Application of support vector machine for detecting rice diseases using shape and color texture features. Proceedings of 2009 International Conference on Engineering Computation, Hong Kong, China, pp. 79-83. https://doi.org/10.1109/ICEC.2009.73
17. Majid, K, Herdiyeni, Y, and Rauf, A (2013). I-PEDIA: mobile application for paddy disease identification using fuzzy entropy and probabilistic neural network. Proceedings of 2013 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Sanur Bali, Indonesia, pp. 403-406. https://doi.org/10.1109/ICACSIS.2013.6761609
18. Xiao, M, Ma, Y, Feng, Z, Deng, Z, Hou, S, Shu, L, and Lu, Z (2018). Rice blast recognition based on principal component analysis and neural network. Computers and Electronics in Agriculture. 154, 482-490. https://doi.org/10.1016/j.compag.2018.08.028
19. Bidaux, JM (1978). Screening for horizontal resistance to rice blast (Pyricularia oryzae) in Africa. Rice in Africa: Proceedings of a conference held at the International Institute of Tropical Agriculture, 7–11 March 1977, Ibadan, Nigeria, pp. 159-174.
20. Aneja, KR (2005). Experiments in Microbiology Plant Pathology and Biotechnology. New Delhi, India: New Age International Publishers
21. Jena, KK, Moon, H, and Mackill, D (2003). Marker assisted selection: a new paradigm in plant breeding. Korean Journal of Breeding Science. 35, 133-140.
22. Mukhina, ZM, Tokmakov, SV, Myagkikh, UA, and Dubina, EV (2011). Developing of inside gene molecular markers of rice for increasing of breeding and seed production processes efficiency. Scientific Journal of KubSAU. 67, 282-292.
23. LeCun, Y, Boser, B, Denker, JS, Henderson, D, Howard, RE, Hubbard, W, and Jackel, LD (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation. 1, 541-551. https://doi.org/10.1162/neco.1989.1.4.541
24. Alom, MZ, Taha, TM, Yakopcic, C, Westberg, S, Sidike, P, Nasrin, MS, Van Essen, BC, Awwal, AS, and Asari, VK (2018). The history began from AlexNet: a comprehensive survey on deep learning approaches. Available: https://arxiv.org/abs/1803.01164
25. He, K, Zhang, X, Ren, S, and Sun, J (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, pp. 770-778. https://doi.org/10.1109/cvpr.2016.90
26. Wu, B, Iandola, F, Jin, PH, and Keutzer, K (2017). Squeezedet: unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, pp. 129-137. https://doi.org/10.1109/CVPRW.2017.60
27. Hu, YH, Ping, XW, Xu, MZ, Dan, WX, and He, Y (2016). Detection of late blight disease on potato leaves using hyperspectral imaging technique. Guang pu xue yu guang pu fen xi. 36, 515-519.
28. Gayathri Devi, N, and Neelamegam, P (2019). Image processing based rice plant leaves diseases in Thanjavur, Tamilnadu. Cluster Computing. 22, 13415-13428. https://doi.org/10.1007/s10586-018-1949-x
29. Liu, L, and Zhou, G (2009). Identification method of rice leaf blast using multilayer perception neural network. Transactions of the Chinese Society of Agricultural Engineering. 25, 213-217.
30. Phadikar, S, Sil, J, and Das, AK (2013). Rice diseases classification using feature selection and rule generation techniques. Computers and Electronics in Agriculture. 90, 76-85. https://doi.org/10.1016/j.compag.2012.11.001
31. Do, HM (2019). Rice diseases image dataset: an image dataset for rice and its diseases. Available: https://www.kaggle.com/minhhuy2810/rice-diseases-image-dataset