
Investigation on the Growth of Green Bean Sprouts with Linear Discriminant Analysis

Kiju Kim and Seokwon Yeom

School of Computer and Communication Engineering, Daegu University, Gyeongsan, Korea
Correspondence to: Seokwon Yeom (yeom@daegu.ac.kr)
Received December 11, 2017; Revised December 19, 2017; Accepted December 24, 2017.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract

Machine vision techniques are playing an increasingly important role in monitoring plant growth. Identifying whether a growing plant is in good or bad condition requires specific features such as the color, length, and solidity of stems. This paper addresses the recognition of the growing status of green bean sprouts. For preprocessing, RGB color images are converted to hue-saturation-intensity (HSI) images, and principal component analysis (PCA) is then applied to reduce the high dimensionality of the image data. Finally, linear discriminant analysis (LDA) recognizes whether the growing status is good or bad. In the experiments, images of green bean sprouts growing in jars are acquired on 6 consecutive days. The correct classification rates are compared between the PCA-only process and LDA combined with PCA. The experimental results show that PCA+LDA can achieve more than 93.3% average accuracy after 3 days.

Keywords : Imaging classification, Plant monitoring, Feature extraction, Principal component analysis, Linear discriminant analysis
1. Introduction

Image processing and machine vision techniques have made considerable achievements in agricultural applications [1, 2]. Distinguishing the species and scales of plants is important for successful farming. Distinct features such as scale, shape, color, and texture are often used to identify the species and to estimate the density and scale of plants. In [3], wood species are classified by pattern recognition methods. The texture of cork tiles is recognized with neural networks in [4]. Different types of beer are classified by color information in [5]. In [6], infrared imaging is used to characterize agricultural fields. Image processing has also been extended to the diagnosis of plant diseases [7, 8].

Green bean sprouts are a culinary vegetable grown by sprouting small, round mung beans [9]. Successful cultivation requires good agricultural practices such as good seeds, proper lighting conditions, and enough fresh water. Since the sprouts are often infected by bacteria and grow for only a week, daily monitoring of the growing status is important.

In this paper, we address the recognition of the growing status of green bean sprouts. First, RGB color images of plants are converted into the hue-saturation-intensity (HSI) domain [10, 11]. Principal component analysis (PCA) and linear discriminant analysis (LDA) are then adopted to classify well and badly growing plant areas. PCA and LDA are widely used feature extraction and classification methods in the fields of pattern recognition and computer vision [12–23]. In [12–14], biometric information such as faces is classified with PCA. Plant growing status is identified with PCA in [15]. LDA combined with PCA is applied to classify faces in [16], and it is adopted to analyze 3D information in [17].

PCA projects input vectors onto a subspace spanned by the eigenvectors of the covariance matrix of the training data [18]. These eigenvectors are called the principal components; projecting onto those with the largest eigenvalues maximizes the variance of the projected data. This projection is optimal in terms of the mean square error (MSE) between the original and reconstructed vectors.

LDA is a classification method based on a linear projection into a space satisfying the Fisher criterion [18]. LDA projects vectors onto a lower-dimensional subspace where the ratio of the determinant of the between-class covariance matrix to the determinant of the within-class covariance matrix is maximized. However, LDA often suffers from the singularity problem of the within-class covariance matrix [19–21]. This problem occurs in most practical cases, where the number of pixels is larger than the number of training images. Therefore, dimensionality reduction such as PCA can be applied first to avoid the singularity problem. In [19, 22, 23], LDA is applied to photon-counting features, which do not involve the singularity problem.
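The singularity can be illustrated numerically. The sketch below uses synthetic Gaussian data (an assumption for illustration, not the sprout images) with more dimensions than training vectors and checks the rank of the estimated within-class covariance matrix. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 2, 5, 50   # dim exceeds the number of training vectors

# Synthetic training vectors: one (n_per_class x dim) array per class.
classes = [rng.normal(loc=j, size=(n_per_class, dim)) for j in range(n_classes)]
n_total = n_classes * n_per_class

# Within-class covariance: scatter about each class mean, pooled over classes.
sigma_w = np.zeros((dim, dim))
for x in classes:
    centered = x - x.mean(axis=0)
    sigma_w += centered.T @ centered
sigma_w /= n_total

# The rank cannot exceed n_total - n_classes, so the dim x dim matrix is singular.
rank = np.linalg.matrix_rank(sigma_w)
print(rank, "<=", n_total - n_classes, "<", dim)
```

Because the rank is bounded by the number of training vectors minus the number of classes, the matrix cannot be inverted until the dimension is reduced below that bound.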

In this paper, PCA is used either to classify unknown data directly or to extract features by reducing the dimensionality of the vectors before LDA. In the experiment, images of eight jars of growing green bean sprouts are captured on 6 consecutive days. Four images are acquired from well growing plants and the other four from badly growing plants. Three images per class are used for training, and the remaining image of each class is used for testing to evaluate the classification performance. The experimental results show that PCA+LDA can achieve a correct classification rate of more than 93.3% after 3 days and 83.6% on overall average.

The paper is organized as follows: Section 2 presents the classification methods, PCA and LDA. Section 3 shows the experimental results. The conclusion follows in Section 4.

2. Recognition of Growing Status

The images in the RGB domain are converted into images in the HSI domain. The HSI color model describes a color in terms of how it is perceived by the human eye [10]. In the following subsections, PCA and LDA are discussed in terms of expectation values and their estimation. It is assumed that LDA is preceded by PCA.
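As an illustration, the conversion can be sketched in NumPy with the commonly used geometric HSI formulas found in standard image processing textbooks. The function name `rgb_to_hsi`, the [0, 1] value range, the scaling of hue to a fraction of a full turn, and the small constant `eps` are assumptions made for this sketch. The paper does not state its exact implementation.

```python
import numpy as np

def rgb_to_hsi(rgb):
    """Convert an RGB image (H x W x 3, values in [0, 1]) to HSI.

    Returns an array of the same shape holding hue (fraction of a full
    turn, in [0, 1)), saturation in [0, 1], and intensity in [0, 1].
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    eps = 1e-12  # guards against division by zero for gray or black pixels

    intensity = (r + g + b) / 3.0
    min_rgb = np.minimum(np.minimum(r, g), b)
    saturation = np.where(
        intensity > 0, 1.0 - min_rgb / np.maximum(intensity, eps), 0.0
    )

    # Hue angle from the geometric derivation; mirrored when blue exceeds green.
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    theta = np.arccos(np.clip(num / den, -1.0, 1.0))
    hue = np.where(b <= g, theta, 2.0 * np.pi - theta) / (2.0 * np.pi)

    return np.stack([hue, saturation, intensity], axis=-1)
```

Hue is mathematically undefined for gray pixels (equal R, G, and B); this sketch returns an arbitrary but finite value there.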

2.1 Principal Component Analysis

A random vector is composed of the HSI information of the pixels as $x=[x_H^t, x_S^t, x_I^t]^t \in \mathbb{R}^{d\times 1}$, where $\mathbb{R}^{d\times 1}$ is the $d$-dimensional Euclidean space; $x_H$, $x_S$, and $x_I$ are the hue, saturation, and intensity vectors of the image data, respectively, so the dimension of each equals the number of pixels; $d$ is the dimension of the vector $x$; and $t$ denotes transpose. The mean vector and covariance matrix of $x$ are, respectively, defined as

$\mu_x = E_x(x),$ (1)

$\Sigma_{xx} = E_x[(x-\mu_x)(x-\mu_x)^t].$ (2)

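In the eigenanalysis, the following equations hold:

$\Lambda = E^t \Sigma_{xx} E,$ (3)

$E = [e_1 \cdots e_d],$ (4)

$\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_d),$ (5)

where $e_i$ is the eigenvector corresponding to the eigenvalue $\lambda_i$, with $\lambda_1 \ge \cdots \ge \lambda_d$; and $\mathrm{diag}(\cdot)$ denotes a diagonal matrix. The PCA projection is performed as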

$y = W_P^t x_u,$ (6)

where $W_P = [e_1 \cdots e_l]$, $l \le d$, is the PCA projection matrix; $x_u$ is an unknown vector; and $l$ is the reduced dimension after the PCA projection. In the experiments, $l$ is set to the smallest number satisfying the following inequality:

$\frac{\sum_{i=1}^{l}\lambda_i}{\sum_{i=1}^{d}\lambda_i} \ge 0.95.$ (7)
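A minimal sketch of this selection rule is given below. It takes the eigenvectors from a singular value decomposition of the centered training data instead of forming the $d \times d$ covariance matrix explicitly, which is a substitution made here because $d$ is far larger than the number of training vectors; the two routes give the same eigenvectors and eigenvalues. The function name `fit_pca` and its arguments are hypothetical.

```python
import numpy as np

def fit_pca(x_train, energy=0.95):
    """Return (mean, W_P, eigenvalues) for training rows x_train (n_t x d).

    W_P holds the first l eigenvectors of the sample covariance as columns,
    where l is the smallest number whose eigenvalues sum to at least
    `energy` times the sum of all eigenvalues.
    """
    x_train = np.asarray(x_train, dtype=np.float64)
    n_t = x_train.shape[0]
    mean = x_train.mean(axis=0)
    centered = x_train - mean

    # covariance = V diag(s**2 / n_t) V^t, so the rows of vt are eigenvectors
    # and s**2 / n_t are the eigenvalues, already sorted in descending order.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigvals = s ** 2 / n_t

    ratio = np.cumsum(eigvals) / np.sum(eigvals)
    l = int(np.searchsorted(ratio, energy)) + 1   # first index with ratio >= energy
    l = min(l, len(eigvals))
    return mean, vt[:l].T, eigvals[:l]
```

With the returned matrix, `x_u @ w_p` gives the projected vector $y = W_P^t x_u$ as a row.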

A decision for a class is made by minimizing the Euclidean distance between unknown vector and a conditional mean vector as

$\hat{j} = \arg\min_{j=1,\cdots,n_c} \| W_P^t (x_u - \mu_{x|j}) \|,$ (8)

where $n_c$ is the number of classes; $\mu_{x|j}$ is the conditional mean vector of class $j$, $\mu_{x|j} = E_{x|j}(x \mid j)$, thus $\mu_x = E_j(\mu_{x|j})$.

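The conditional and overall mean vectors and the covariance matrix are conventionally estimated from the training samples as

$\hat{\mu}_{x|j} = \frac{1}{n_j}\sum_{n=1}^{n_j} x_j(n),$ (9)

$\hat{\mu}_x = \sum_{j=1}^{n_c} \pi_j \hat{\mu}_{x|j},$ (10)

$\pi_j = \frac{n_j}{n_t},$ (11)

$\hat{\Sigma}_{xx} = \frac{1}{n_t}\sum_{j=1}^{n_c}\sum_{n=1}^{n_j}(x_j(n)-\hat{\mu}_x)(x_j(n)-\hat{\mu}_x)^t,$ (12)

where $x_j(n)$, $n = 1, \ldots, n_j$, is a realization of the random vector $x$ for class $j$; $\pi_j$ is the weight of the observations of class $j$; $n_j$ is the number of training vectors for class $j$; and $n_t$ is the total number of training vectors, that is, $n_t=\sum_{j=1}^{n_c} n_j$.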
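The sample estimates and the nearest-mean decision in the PCA subspace can be sketched as follows. The function names are hypothetical, and the projection matrix `w_p` is assumed to have been computed beforehand. Projecting the unknown vector and the class means separately is equivalent to projecting their difference, since the projection is linear.

```python
import numpy as np

def class_statistics(x_train, labels):
    """Sample estimates: class labels, weights pi_j = n_j / n_t, and class means."""
    x_train = np.asarray(x_train, dtype=np.float64)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    priors = counts / counts.sum()
    means = np.stack([x_train[labels == c].mean(axis=0) for c in classes])
    return classes, priors, means

def classify_nearest_mean(x_unknown, w_p, classes, means):
    """Pick, for each row of x_unknown, the class with the nearest projected mean."""
    y = np.asarray(x_unknown, dtype=np.float64) @ w_p    # (n, l) projected unknowns
    m = means @ w_p                                      # (n_c, l) projected class means
    dists = np.linalg.norm(y[:, None, :] - m[None, :, :], axis=-1)
    return classes[np.argmin(dists, axis=1)]
```

The weighted sum `priors @ means` reproduces the overall sample mean, which is the relation between the class means and the overall mean stated above.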

2.2 Linear Discriminant Analysis

The Fisher LDA is a linear projection method that maximizes the ratio of the determinant of the between-class covariance matrix to the determinant of the within-class covariance matrix. The within-class covariance matrix measures the concentration of each class:

$\Sigma_{yy}^{W} = E_j\{E_{y|j}[(y-\mu_{y|j})(y-\mu_{y|j})^t \mid j]\},$ (13)

where $\mu_{y|j} = W_P^t \mu_{x|j}$. The between-class covariance matrix measures the separation of the classes:

$\Sigma_{yy}^{B} = E_j[(\mu_{y|j}-\mu_y)(\mu_{y|j}-\mu_y)^t],$ (14)

where $\mu_y = W_P^t \mu_x$.

The Fisher criterion is as follows:

$W_F = \arg\max_{W \in \mathbb{R}^{l\times r}} \frac{|W^t \Sigma_{yy}^{B} W|}{|W^t \Sigma_{yy}^{W} W|},$ (15)

where $r$ is the reduced dimension of the vector, which equals $n_c - 1$ [18].

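The within-class and between-class covariance matrices are, respectively, estimated as [19]

$\hat{\Sigma}_{yy}^{W} = \frac{1}{n_t}\sum_{j=1}^{n_c}\sum_{n=1}^{n_j}(y_j(n)-\hat{\mu}_{y|j})(y_j(n)-\hat{\mu}_{y|j})^t,$ (16)

$\hat{\Sigma}_{yy}^{B} = \sum_{j=1}^{n_c}\pi_j(\hat{\mu}_{y|j}-\hat{\mu}_y)(\hat{\mu}_{y|j}-\hat{\mu}_y)^t,$ (17)

where $y_j(n) = W_P^t x_j(n)$.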

It is well known that Eq. (15) is equivalent to a generalized eigenvalue problem. The column vectors of $W_F$ are the eigenvectors of $(\hat{\Sigma}_{yy}^{W})^{-1}\hat{\Sigma}_{yy}^{B}$ corresponding to its non-zero eigenvalues. It is noted that $\mathrm{rank}(\hat{\Sigma}_{yy}^{W}) \le \min(n_t - n_c, l)$. Since $\hat{\Sigma}_{yy}^{W}$ is an $l \times l$ matrix, $l$ should be less than or equal to $n_t - n_c$ in order to avoid the singularity problem of $\hat{\Sigma}_{yy}^{W}$. In this paper, the dimensionality reduction is performed by PCA since the original dimension $d$ is larger than the number of training data. The decision process for LDA is implemented with the Euclidean distance between the projected unknown vector and the projected conditional mean as

$\hat{j} = \arg\min_{j=1,\cdots,n_c} \| W_F^t (y_u - \hat{\mu}_{y|j}) \| = \arg\min_{j=1,\cdots,n_c} \| W_F^t W_P^t (x_u - \hat{\mu}_{x|j}) \|.$ (18)
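The LDA step can be sketched in NumPy as below, under the assumption that the inputs are PCA features with $l \le n_t - n_c$ so that the within-class estimate is invertible. The function names `fit_lda` and `predict_lda` are hypothetical, and the eigenvectors are obtained from `numpy.linalg.eig` applied to the product of the inverse within-class estimate and the between-class estimate, which is one of several ways to solve the generalized eigenvalue problem.

```python
import numpy as np

def fit_lda(y_train, labels):
    """Fisher LDA on PCA features y_train (n_t x l).

    Returns W_F (l x r, with r = n_c - 1), the class labels, and the class means.
    """
    y_train = np.asarray(y_train, dtype=np.float64)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n_t, l = y_train.shape
    mean_all = y_train.mean(axis=0)

    sigma_w = np.zeros((l, l))
    sigma_b = np.zeros((l, l))
    means = []
    for c in classes:
        y_c = y_train[labels == c]
        mean_c = y_c.mean(axis=0)
        means.append(mean_c)
        centered = y_c - mean_c
        sigma_w += centered.T @ centered / n_t             # within-class estimate
        diff = (mean_c - mean_all)[:, None]
        sigma_b += (len(y_c) / n_t) * (diff @ diff.T)      # between-class estimate

    # Eigenvectors of inv(sigma_w) @ sigma_b; solve() avoids an explicit inverse.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(sigma_w, sigma_b))
    order = np.argsort(eigvals.real)[::-1][: len(classes) - 1]
    return eigvecs[:, order].real, classes, np.stack(means)

def predict_lda(y_unknown, w_f, classes, means):
    """Nearest projected class mean in the LDA space."""
    z = np.asarray(y_unknown, dtype=np.float64) @ w_f      # (n, r)
    m = means @ w_f                                        # (n_c, r)
    dists = np.linalg.norm(z[:, None, :] - m[None, :, :], axis=-1)
    return classes[np.argmin(dists, axis=1)]
```

For two classes the result is a single column, so each unknown vector is reduced to one scalar before the decision.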
3. Experimental Results

In the experiments, a jar and a water bowl are set up to grow green bean sprouts. A webcam with an illumination source is located above the jar as shown in Figure 1. Eight jars are imaged on 6 consecutive days. Four jars are kept in good growing conditions, whereas the other four are deprived of water or exposed to light during the daytime, so the sprouts in those jars grow badly.

Figure 2 shows the well growing plant (Class 1) and the badly growing plant (Class 2) images for 6 days. Each image size is 1600×1600 pixels.

Plant images are divided into non-overlapping 200×200 (Case 1) or 400×400 (Case 2) pixel windows. The pixels in each window comprise a random vector, thus 64 (=8×8) and 16 (=4×4) vectors are obtained from one 1600×1600 image for Case 1 and Case 2, respectively. The dimension of a vector is 120,000 (=200×200×3) and 480,000 (=400×400×3) for Case 1 and Case 2, respectively, considering the HSI color information.

Table 1 shows the training and testing image sets used in seven different experiments. The vectors from three images per class are used for training and the remaining image of each class is used for testing, thus the number of training data per class ($n_j$) is 192 (=64×3) and 48 (=16×3) for Case 1 and Case 2, respectively. The total number of training data ($n_t$) is 384 and 96 for Case 1 and Case 2, respectively, since the number of classes ($n_c$) is two.
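The tiling of an image into window vectors can be sketched as follows. The function name `split_into_windows` is hypothetical, and stacking the channels one after another within each vector (all hue pixels, then saturation, then intensity) is an assumption that mirrors the definition of $x$ in Section 2.1. For three HSI channels, 200×200 windows give vectors of dimension 120,000 and 400×400 windows give vectors of dimension 480,000.

```python
import numpy as np

def split_into_windows(image, window):
    """Split an (H x W x C) image into non-overlapping window x window tiles.

    Returns an array of shape (n_windows, C * window * window). Each row
    stacks the channels one after another for one tile.
    """
    h, w, c = image.shape
    if h % window or w % window:
        raise ValueError("image size must be a multiple of the window size")
    rows, cols = h // window, w // window
    tiles = image.reshape(rows, window, cols, window, c)
    tiles = tiles.transpose(0, 2, 4, 1, 3)     # (rows, cols, C, window, window)
    return tiles.reshape(rows * cols, c * window * window)

image = np.zeros((1600, 1600, 3), dtype=np.uint8)   # placeholder for one HSI image
for window in (200, 400):
    vectors = split_into_windows(image, window)
    print(window, vectors.shape)
```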

Figures 3 and 4 show the visualization of the column vectors of $W_P$, obtained for Case 1 and Case 2, respectively, at the 6th day of Exp. 1. They show the first and second eigenvectors, corresponding to the two largest eigenvalues. Each column vector is divided into three parts: hue, saturation, and intensity.

Figures 5 and 6 show the column vector of $W_P W_F$ for Case 1 and Case 2, respectively, at the 6th day of Exp. 1. Since the dimension is one after the LDA projection, Figures 5 and 6 show only one column vector.

Figures 7 and 8 show the histograms of image windows after the PCA+LDA projection for Case 1 and Case 2, respectively, at the 6th day of Exp. 1. The PCA+LDA projection results in a one-dimensional space in which the classes can be separated by a single threshold. The mean values of the training data in Class 1 and Class 2 are 2.68 and 15.24, respectively, in Figure 7. Thus, the space is divided by a threshold value of 8.96, the midpoint of the two means, according to Eq. (18). The mean values are −50.04 and 3.22 in Figure 8, and the threshold value is accordingly −23.41. The training and testing data are shown to be more separable in Case 2, with the larger window size.
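The threshold follows from the nearest-mean rule: in one dimension, the point equidistant from two class means is their midpoint. A minimal sketch of that arithmetic, with hypothetical function names and the mean values quoted above:

```python
def midpoint_threshold(mean_1, mean_2):
    """Decision boundary of a 1-D nearest-mean classifier: the midpoint of the class means."""
    return (mean_1 + mean_2) / 2.0

def classify_1d(z, mean_1, mean_2):
    """Return 1 or 2 according to which class mean the scalar z is closer to."""
    return 1 if abs(z - mean_1) <= abs(z - mean_2) else 2

print(round(midpoint_threshold(2.68, 15.24), 2))    # 8.96
print(round(midpoint_threshold(-50.04, 3.22), 2))   # -23.41
```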

Figures 9 and 10 show the average correct classification rate over the seven experiments for the 6 days. Figures 9(a) and 9(b) show the PCA results for Case 1 and Case 2, respectively. Figures 10(a) and 10(b) show the PCA+LDA results for Case 1 and Case 2, respectively. The classification rates are more than 93.3% after 3 days when the PCA+LDA scheme is applied with 400×400 pixel windows.

Table 2 shows the average classification rate over the 6 days. The best result, 83.6%, comes from the PCA+LDA scheme with 200×200 pixel windows. Although the overall performance of PCA+LDA is better for Case 1, Case 2 provides better or similar classification results from the 3rd day to the 6th day in Figure 10. The average classification rate for those 4 days is 92.1% for Case 2 but 90.7% for Case 1.

4. Conclusion

The recognition of the growing status of green bean sprouts was discussed in this paper. For preprocessing, RGB color images are converted to HSI images. PCA and LDA are adopted to classify the good or bad status of growing plants. Over the 6 days, the best average result comes from the PCA+LDA scheme with the smaller window size. The proposed method can be applied to other agricultural produce. The classification of other plants remains for future study.

Acknowledgements

This research is supported by Daegu University Research Grant.

Conflict of Interest

Figures
Fig. 1.

Experimental set-up for plant imaging.

Fig. 2.

Growing plant images: (a) Class 1 (good growing) and (b) Class 2 (bad growing).

Fig. 3.

Image of the column vectors of $W_P$ for Case 1 at the 6th day of Exp. 1. (a) 1st column vector and (b) 2nd column vector. Hue, saturation, intensity from left to right.

Fig. 4.

Image of the column vectors of $W_P$ for Case 2 at the 6th day of Exp. 1. (a) 1st column vector and (b) 2nd column vector. Hue, saturation, intensity from left to right.

Fig. 5.

Image of the column vector of $W_P W_F$ for Case 1 at the 6th day of Exp. 1. Hue, saturation, intensity from left to right.

Fig. 6.

Image of the column vector of $W_P W_F$ for Case 2 at the 6th day of Exp. 1. Hue, saturation, intensity from left to right.

Fig. 7.

Histogram of image windows after PCA+LDA for Case 1 at the 6th day of Exp. 1.

Fig. 8.

Histogram of image windows after PCA+LDA for Case 2 at the 6th day of Exp. 1.

Fig. 9.

PCA results of correct classification rate. (a) Case 1 (200×200 pixels) and (b) Case 2 (400×400 pixels).

Fig. 10.

PCA+LDA results of correct classification rate. (a) Case 1 (200×200 pixels) and (b) Case 2 (400×400 pixels).

Tables

Table 1

Training and testing image sets

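| Experiment | Training image, Class 1 | Training image, Class 2 | Testing image, Class 1 | Testing image, Class 2 |
| --- | --- | --- | --- | --- |
| Exp. 1 | G1, G2, G3 | B1, B2, B3 | G4 | B4 |
| Exp. 2 | G2, G3, G4 | B1, B2, B3 | G1 | B4 |
| Exp. 3 | G1, G3, G4 | B1, B2, B3 | G2 | B4 |
| Exp. 4 | G1, G2, G4 | B1, B2, B3 | G3 | B4 |
| Exp. 5 | G1, G2, G3 | B2, B3, B4 | G4 | B1 |
| Exp. 6 | G1, G2, G3 | B1, B3, B4 | G4 | B2 |
| Exp. 7 | G1, G2, G3 | B1, B2, B4 | G4 | B3 |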

Table 2

Average classification rate over 6 days

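| Window size | PCA, Class 1 | PCA, Class 2 | PCA, Avg. | PCA+LDA, Class 1 | PCA+LDA, Class 2 | PCA+LDA, Avg. |
| --- | --- | --- | --- | --- | --- | --- |
| Case 1 (200×200 pixels) | 0.824 | 0.787 | 0.806 | 0.852 | 0.820 | 0.836 |
| Case 2 (400×400 pixels) | 0.857 | 0.778 | 0.818 | 0.838 | 0.814 | 0.826 |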

References
1. Saxena, L, and Armstrong, L (2014). A survey of image processing techniques for agriculture. Proceedings of Asian Federation for Information Technology in Agriculture, Perth, Australia, pp. 406-418.
2. Vibhute, A, and Bodhe, SK (2012). Applications of image processing in agriculture: a survey. International Journal of Computer Applications. 52, 34-40.
3. You, M, and Cai, C (2009). Wood classification based on PCA, 2DPCA, (2D)²PCA and LDA. Proceedings of 2nd International Symposium on Knowledge Acquisition and Modeling, Wuhan, China, pp. 371-374.
4. Georgieva, A, and Jordanov, I (2009). Intelligent visual recognition and classification of cork tiles with neural networks. IEEE Transactions on Neural Networks. 20, 675-685.
5. Nikolova, KT, Gabrova, R, Boyadzhiev, D, Pisanova, ES, Ruseva, J, and Yanakiev, D (2017). Classification of different types of beer according to their colour characteristics. Journal of Physics: Conference Series. 794.
6. Basnet, P, McConkey, B, Meinert, B, Gatkze, C, and Nobble, G (2004). Agriculture field characterization using aerial photograph and satellite imagery. IEEE Geoscience and Remote Sensing Letters. 1, 7-10.
7. Shen, W, Wu, Y, Chen, Z, and Wei, H (2008). Grading method of leaf spot disease based on image processing. Proceedings of 2008 International Conference on Computer Science and Software Engineering, pp. 491-494.
8. Madiwalar, SC, and Wyawahare, MV (2017). Plant disease identification: a comparative study. Proceedings of 2017 International Conference on Data Management, Analytics and Innovation (ICDMAI), Pune, India, pp. 13-18.
9. Arika, T, Naoki, N, and Yoshikatsu, H (2014). Improved HIS color space without gamut problem. Proceedings of 2014 IEEE Asia Pacific Conference on Circuits and Systems, Ishigaki, Japan, pp. 37-40.
10. Gonzalez, RC, and Woods, RE (2002). Digital Image Processing. Upper Saddle River, NJ: Prentice Hall.
11. Swets, DL, and Weng, J (1996). Using discriminant eigenfeatures for image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence. 18, 831-836.
12. Ma, Z, Li, Q, Li, H, and Li, Z (2017). Image representation based PCA feature for image classification. Proceedings of 2017 IEEE International Conference on Mechatronics and Automation, Takamatsu, Japan, pp. 1121-1125.
13. Lionnie, R, and Alaydrus, M (2016). Biometric identification system based on principal component analysis. Proceedings of 12th International Conference on Mathematics, Statistics and Their Applications, Banda Aceh, Indonesia, pp. 59-63.
14. Kim, K, and Yeom, S (2017). Investigation on the growth of green bean sprouts with principal component analysis. Proceedings of the 18th International Symposium on Advanced Intelligent Systems, Daegu, Korea, pp. 1-6.
15. Belhumeur, PN, Hespanha, JP, and Kriegman, DJ (1997). Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence. 19, 711-720.
16. Yeom, S, and Javidi, B (2004). Three-dimensional distortion tolerant object recognition using integral imaging. Optics Express. 12, 5795-5809.
17. Duda, RO, Hart, PE, and Stork, DG (2001). Pattern Classification. New York: John Wiley & Sons.
18. Yeom, S, Javidi, B, and Watson, E (2007). Three-dimensional distortion tolerant object recognition using photon-counting integral imaging. Optics Express. 15, 1513-1533.
19. Huang, R, Liu, Q, Lu, H, and Ma, S (2002). Solving the small sample size problem of LDA. Proceedings of 16th International Conference on Pattern Recognition (ICPR), Quebec, Canada, pp. 29-32.
20. Yan, W (2012). On reducing feature dimensionality for partial discharge diagnosis applications. Proceedings of 2012 IEEE Conference on Prognostics and System Health Management Conference (PHM), Beijing, China, pp. 1-7.
21. Yeom, S (2012). Photon-counting linear discriminant analysis for face recognition at a distance. International Journal of Fuzzy Logic and Intelligent Systems. 12, 250-255.
22. Yeom, S (2014). Multi-frame face classification with decision level fusion based on photon-counting linear discriminant analysis. International Journal of Fuzzy Logic and Intelligent Systems. 14, 332-339.
Biographies

Kiju Kim is currently pursuing the B.S. degree in the School of Computer and Communication Engineering at Daegu University. He has been working on research related to digital signal and image processing and pattern recognition. His research interests include intelligent image processing and pattern recognition.

E-mail:

Seokwon Yeom received the B.S. and M.S. degrees in Electronics Engineering from Korea University, Seoul, Korea, and the Ph.D. degree in Electrical and Computer Engineering from the University of Connecticut in 2006. He is currently an associate professor in the Division of Computer and Communication Engineering at Daegu University, Korea. He is now conducting several research projects funded by the Korean government and Daegu University. His research interests include intelligent image processing, optical information processing, pattern recognition, and target tracking.

E-mail: yeom@daegu.ac.kr
