In computer vision, image classification assigns a given image to a pre-defined class. Conventional image classification comprises feature extraction and classification modules. Feature extraction derives a higher level of information from the raw pixels that can capture the differences among the classes. Normally, this process is performed in an unsupervised manner, wherein the classes of the image have nothing to do with the information extracted from the pixels. Typical and widely used feature extractors include GIST, HOG, SIFT, and LBP. After feature extraction, a classification module is trained with the images and their associated labels. Typical classifiers for this module include SVM, logistic regression, random forests, and decision trees.
Recurrent neural networks (RNNs), long short-term memory (LSTM), artificial neural networks (ANNs), and convolutional neural networks (CNNs) are the most popular neural network architectures. A CNN is suitable for image databases and performs remarkably well in computer vision tasks, such as image classification [1, 2], object detection [3], and image recognition [4]. A CNN does not require an explicit feature extractor; instead, it combines feature extraction and classification in an integrated framework, learning to extract representations from the images and classify them based on supervised data.
CNNs are utilized in various tasks and exhibit excellent performance in diverse applications. CNNs have provided an effective class of models for better understanding of image content, achieving better image recognition, segmentation, detection, and retrieval. CNN architectures are efficiently and successfully utilized in several pattern and image recognition applications [5], for example, gesture recognition [4, 6], face recognition [7, 8], object classification [9, 10], and scene description generation [11].
Zadeh [12] presented the idea of fuzzy logic (type-1 fuzzy) for tackling control-system-related problems. Later, researchers contributed several fascinating applications in the field of computer vision. Notably, the type-2 fuzzy set (T2FS) was introduced by Zadeh [13] in 1975 and was further developed by Jerry M. Mendel. In a type-1 fuzzy set (T1FS), the degree of membership is a crisp number in the interval [0, 1]. In a T2FS, the degree of membership is itself fuzzy and is specified by secondary membership functions (MFs). If the secondary MF values equal 1 at every point, the set is called an interval type-2 fuzzy set (IT2FS) [13–15]. The T2FS incorporates a third dimension and a footprint of uncertainty, as depicted in Figure 1, which provides an additional degree of freedom to handle uncertainty. This additional level of fuzziness provides a more capable method to deal with uncertainty. Figure 2 illustrates the secondary MFs (third dimension) of the T1FS (Figure 2(a)), IT2FS (Figure 2(b)), and general T2FS (Figure 2(c)), as activated by an input p, similar to that shown in Figure 1.
In particular, the type-1 fuzzy c-means (FCM) has become the most notable algorithm used in cluster analysis. Numerous researchers have demonstrated that T1FSs are limited in their ability to model and mitigate the effect of uncertainties, because their membership grades are crisp. A T2FS is represented by MFs that are themselves fuzzy. An IT2FS [16], a special case of T2FS, is currently the most widely used owing to its reduced computational cost. An IT2FS is bounded by two T1FSs, one above and one below, called the upper MF (UMF) and lower MF (LMF); the region between the UMF and LMF is called the footprint of uncertainty (FOU). Thus, a T2FS can model various uncertainties; however, the computational complexity increases owing to the additional dimension of secondary grades for every primary membership. Example applications include type-2 fuzzy clustering [17], Gaussian noise filtering, classification of coded video streams, medical applications, and color image segmentation.
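As a minimal illustration of these definitions (not part of the original paper; the Gaussian parameterization and function names are our own assumptions), an IT2FS can be represented by a Gaussian primary MF with an uncertain standard deviation, so that the UMF and LMF bound the FOU:

```python
import math

def gaussian(x, mean, sigma):
    """Type-1 Gaussian membership function."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)

def it2_membership(x, mean, sigma_lower, sigma_upper):
    """Interval type-2 membership of x: the UMF uses the larger sigma
    and the LMF the smaller one; the gap between them is the FOU."""
    return gaussian(x, mean, sigma_lower), gaussian(x, mean, sigma_upper)

# The membership of an input p is an interval [LMF(p), UMF(p)], not a crisp number.
lmf, umf = it2_membership(1.0, mean=0.0, sigma_lower=0.5, sigma_upper=1.0)
```

For any input, LMF ≤ UMF, and the width of the interval reflects the uncertainty being modeled.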
Recently, fuzzy logic and neural networks have been widely applied to solve real-world problems. Fuzzy logic is a set of mathematical principles used for knowledge representation, based on the degrees of membership, as opposed to the classical binary logic. It is a powerful tool to tackle imprecision and uncertainty and was initially introduced to provide robust and low-cost solutions for real-world problems. Generally, type-1 fuzzy logic systems (T1FLS) have been implemented in several systems such as forecasting systems, control systems, databases, and healthcare clinical diagnoses.
The drawback of the conventional type-1 fuzzy logic system is its limited capability to handle data uncertainties directly, as certain designed systems face high levels of uncertainty that can affect their performance. The type-2 fuzzy logic system (T2FLS) is an extension of the former that is intended to model the uncertainties that invariably exist in the rule base, because the MFs of type-2 fuzzy systems are themselves fuzzy. It provides a powerful framework for representing and handling such uncertainties. The interval type-2 fuzzy logic system (IT2FLS), a special case of T2FLS, has been applied to solve real-world problems. Recent theoretical and practical studies confirm that IT2FLSs handle uncertainties more adequately than T1FLSs, and an increasing number of IT2FLS applications is expected in different fields of science and engineering. T1FLSs and IT2FLSs have been applied in a wide variety of areas to solve problems.
The CNN is a type of neural network that has demonstrated commendable performance on several challenges related to computer vision and image processing. Notable application areas of CNNs include image classification and segmentation [18], object detection [3], video processing [19], natural language processing [20, 21], and speech recognition [22, 23]. The learning capacity of a deep CNN is primarily a result of its multiple feature extraction stages, which can automatically learn representations from the data. The availability of abundant data and advances in hardware have accelerated research on CNNs, and recently remarkable deep CNN models have been reported. Several promising directions for the advancement of CNNs have been investigated [7], for example, the use of different activation and loss functions, parameter optimization, regularization, and architectural innovations.
Karnik et al. [24] stated that using a T1FS to model a word is scientifically incorrect because a word is uncertain whereas a T1FS is certain. Therefore, they conducted in-depth research on type-2 fuzzy logic and contributed several papers [24–26] on the topic. Building on this work, researchers have contributed algorithms for numerous applications, for example, the classification of coded video streams, diagnosis of diseases, pre-processing of radiographic images, medical image applications, transport scheduling, forecasting of time series, learning linguistic membership grades, inference engine design, and control of mobile robots. The computational complexity of general type-2 fuzzy sets is high; therefore, they are often simplified to interval type-2 fuzzy sets, which significantly reduce the computational complexity in appropriate applications.
Fuzzy logic comprises mathematical principles for knowledge representation based on degrees of membership, as opposed to binary logic. It is a powerful tool for handling imprecision and uncertainty, and it was introduced to provide robust and low-cost solutions for real-world problems [27]. In particular, type-1 fuzzy logic systems have been implemented at a wide scale in numerous systems, including approximation and forecasting systems, control systems, databases, and healthcare clinical diagnosis.
Researchers have successfully combined and implemented neural networks and fuzzy logic in intelligent systems. A fuzzy neural network (FNN) combines fuzzy logic and neural network concepts, thereby incorporating the benefits of both. FNNs have been applied in several scientific and engineering areas, such as text sentiment evaluation [28], object classification with small training databases [3], extraction of emotion features from text [29], comprehension of emotions in movies [30], real-world object and image classification [19, 31], Marathi handwritten numeral recognition [32, 33], traffic flow prediction [34], electric load prediction [35], and handwritten digit recognition [36]. Keller and Hunt [37] proposed hierarchical deep neural network fuzzy systems that obtain information from both fuzzy and neural representations. Price et al. [38] introduced fuzzy layers for deep learning, exploring different fusion strategies and aggregating outputs from state-of-the-art pre-trained models, for example, AlexNet, VGG16, GoogLeNet, Inception-v3, and ResNet-18.
Generally, CNN architectures include two phases: feature extraction and classification. An FCNN is a combination of a CNN and fuzzy logic; the fuzzy logic may therefore be incorporated in either the feature extraction phase or the classification phase. Depending on the application, researchers have proposed various FCNN architectures with fuzzy logic in one of these phases. Here, two FCNN architectures with fuzzy logic in the classification phase were compared for image classification. Hsu et al. [3] integrated a CNN with a fuzzy neural network (FCNN model 1), where the FNN summarizes the feature information from every fuzzy map. Korshunova [10] (FCNN model 2) proposed a CFNN architecture that includes a fuzzy layer situated between the convolutional network and the classifier.
The new IT2FCNN architecture integrates the features of the CNN and FNN: it employs the interval type-2 fuzzy rectification unit (IT2FRU) [39] activation function in the convolutional layers for feature extraction and interval type-2 fuzzy-based classification in the fuzzy layer. This method combines the advantages of both network architectures and of interval type-2 fuzzy logic. The IT2FCNN architecture includes four types of layers: i) convolutional layer with IT2FRU, ii) pooling layer, iii) fuzzy layer, and iv) fuzzy classifier.
The convolutional part of the network receives an input image and applies a sequence of convolutional and pooling layers. The fuzzy layer then performs clustering using the interval type-2 fuzzy clustering algorithm. The outputs of the fuzzy layer neurons represent the values of the membership functions for the fuzzy clustering of the input data, and each input point is assigned to a cluster based on its membership grade. These values form the input of the classifier, whose output is the overall IT2FCNN output, that is, the class score for the image. Let C be the number of neurons in the fuzzy layer (the number of clusters). The activation functions of the fuzzy layer neurons are IT2FRUs representing the membership of the input vector x to each of the C clusters.
The IT2FRU employs the constraint Z = 0 to guarantee that σ = 0 ⇒ φ_0 = 0. Additionally, the heights of the LMFs are set as m2 = α and m1 = m3 = 1 − α, as suggested in [24]. The resulting IT2FM φ_0(σ) for σ ∈ [0, 1] can be formulated as follows:
where k(σ) is defined as
Similarly, for the input interval σ ∈ [−1, 0] the IT2FM can be derived as follows:
The complete activation unit can then be formulated by combining these expressions:
The parameter P controls the slope of the function in the positive quadrant, whereas the parameter N controls the slope in the negative quadrant. The resulting output of the IT2FRU can be a linear or nonlinear activation depending on the selection of the parameters. The IT2FRU has three learnable parameters: P, N, and α.
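Because the display equations for the IT2FRU are not reproduced above, the following sketch only mimics the qualitative behavior described in the text: P and N act as learnable slopes in the positive and negative quadrants, α blends toward a purely linear response, and σ = 0 always maps to 0. The function name and the blending rule are our own assumptions, not the published IT2FRU formula.

```python
def it2fru_like(sigma, P=1.0, N=0.25, alpha=0.5):
    """Hedged stand-in for an IT2FRU-style activation: P scales the
    positive quadrant, N the negative quadrant, and alpha interpolates
    between the slope-scaled response and the identity, so that
    sigma = 0 always yields 0."""
    slope = P if sigma >= 0 else N
    return alpha * sigma + (1.0 - alpha) * slope * sigma
```

With α = 1 the unit is linear; with α = 0 it behaves like a parametric (leaky) rectifier, matching the claim that the output can be linear or nonlinear depending on the parameter selection.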
The vector x = [x_{1}, x_{2}, ..., x_{j}, ..., x_{n}] represents the input of the network, and the fuzzy layer forms a vector comprising the degrees of membership of x to the particular cluster centers: [v_{1}, v_{2}, ..., v_{C}]. The components ū_{j}(x_{i}) and u̲_{j}(x_{i}) denote the upper and lower membership grades of x_{i} in cluster j.
The interval type-2 fuzzy membership then becomes
The cluster centers are updated as follows:
Type reduction and hard-partitioning can be acquired as follows:
where
and
where
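Because the clustering equations above are likewise not reproduced, the following numpy sketch follows the general interval type-2 FCM scheme: two fuzzifiers m1 < m2 yield upper and lower memberships, and a simplified average of the two resulting center estimates stands in for full Karnik–Mendel type reduction. All names and simplifications here are our own assumptions:

```python
import numpy as np

def fcm_memberships(X, centers, m):
    """Standard type-1 FCM membership update for fuzzifier m > 1."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
    inv = d ** (-2.0 / (m - 1))
    return inv / inv.sum(axis=1, keepdims=True)

def it2fcm_step(X, centers, m1=2.0, m2=5.0):
    """One hedged IT2FCM iteration: memberships under two fuzzifiers
    define the UMF/LMF envelope, and the updated center averages the
    centers obtained from the upper and lower memberships (a
    simplification of type reduction)."""
    u1 = fcm_memberships(X, centers, m1)
    u2 = fcm_memberships(X, centers, m2)
    upper = np.maximum(u1, u2)   # upper membership grades
    lower = np.minimum(u1, u2)   # lower membership grades
    c_up = (upper.T ** m1 @ X) / (upper.T ** m1).sum(axis=1, keepdims=True)
    c_lo = (lower.T ** m1 @ X) / (lower.T ** m1).sum(axis=1, keepdims=True)
    return (c_up + c_lo) / 2.0, lower, upper
```

Hard-partitioning then assigns each point to the cluster with the largest (type-reduced) membership.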
The processing in the IT2FCNN is divided into three phases: first, the input pattern (image) undergoes a sequence of transformations that produces a vector of high-level features; next, the fuzzy layer performs a preliminary distribution of the data into fuzzy clusters; finally, the fully connected layers perform the classification, assigning the resulting class label to each group of clusters.
Various datasets are available for the application of neural networks. The most popular datasets are CIFAR-10, Caltech101, and ImageNet. The CIFAR-10 dataset includes 60,000 images in 10 classes, with 6,000 images per class. The Caltech101 dataset includes 101 classes with 40 to 800 images per class. The ImageNet dataset includes more than 14 million images in over 20,000 categories. The experiments were executed on a Windows 7 64-bit operating system; the main memory and storage capacity of the computer were 8 GB RAM and 1 TB, respectively. An Intel(R) Core(TM) i5-4590 CPU @ 3.30 GHz and an NVIDIA GeForce GT 705 graphics card were used. The software used in this study comprised Python 3.6 and MATLAB. The PyCharm IDE was used, configured with the Keras libraries.
The training of the IT2FCNN was the foremost step and included three independent stages for the three components of the network. First, the convolutional part was trained on the abstract properties of the input image using the backpropagation model. Second, the fuzzy layer was tuned using a competitive learning scheme, which implied choosing the parameters of the MFs for setting the cluster centers. Various fuzzy clustering algorithms are available; here, IT2FCM was used for clustering. Finally, the classifier was trained by tuning the weights of the fully connected layers. After the completion of training, the IT2FCNN was ready for implementation. The image pixels were fed to the network; the output of the system was the class scores for input image p, and the image was assigned to the class with the maximum score.
In this study, AlexNet, ZFNet, GoogLeNet, VGGNet16, and ResNet50 pretrained on the CIFAR, ImageNet, and Caltech101 datasets were chosen for the experiment, and each was fine-tuned to classify the images. Here, 3, 5, and 7 epochs were used for training the models. In the fuzzy layer, IT2FCM clustering was applied to the data several times with different numbers of clusters, and the number of clusters that maximized the fuzzy partition coefficient was chosen for the experiment. Adam, a stochastic optimization method, was used to train the classifier (tune the weights) of the fully connected layer.
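The fuzzy partition coefficient used for this model selection can be sketched as follows (a standard definition; the function name is our own):

```python
import numpy as np

def fuzzy_partition_coefficient(u):
    """FPC = (1/n) * sum of squared memberships, where u has shape
    (n_samples, n_clusters) and each row sums to 1. FPC equals 1 for
    a crisp partition and 1/c for a maximally fuzzy one."""
    return float((u ** 2).sum() / u.shape[0])
```

The number of clusters is then chosen as the candidate that maximizes this value.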
Table 1 presents the model performance comparison between the existing CNN and fuzzy-based CNN architectures. Figure 5 clearly shows that the fuzzy-based CNN architectures increase the classification accuracy compared to the traditional CNN architecture. The investigation distinctly indicated that including the fuzzy layer in the CNN yielded higher accuracy than a conventional CNN.
The percentage error (% error), mean squared error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE) are the performance criteria for image classification. The corresponding calculation methods are defined as follows:
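The displayed formulas are not reproduced here; the standard definitions of these criteria can be sketched as follows (function names are our own):

```python
import math

def percentage_error(y_true, y_pred):
    """Percentage of misclassified samples."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return 100.0 * wrong / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mse(y_true, y_pred))

def mape(y_true, y_pred):
    """Mean absolute percentage error; assumes no true value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
```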
Table 2 presents the comparison of the FCNN models with the IT2FCNN based on MSE, RMSE, and MAPE. The results indicate that the proposed method generally yields lower errors than the compared FCNN models.
The experimental outcomes indicate that fuzzy neural networks represent a powerful and reasonable alternative to conventional classification methods. The merging of fuzzy logic with neural network applications is increasingly adopted in decision-making systems. In the proposed method, the CNN was used to extract the features, and interval type-2 fuzzy logic was integrated to classify the images, which increased the accuracy of the experiment. In addition, our test results demonstrated that it was possible to improve the testing accuracy by observing the distribution of pixels in the feature maps and modifying the membership function. This method provided a better solution and more advantages than the other existing methods. Although the results are optimistic, image classification based on interval type-2 fuzzy logic still requires further research.
No potential conflict of interest relevant to this article was reported.
Example of three kinds of fuzzy sets. The same input p is applied to each fuzzy set. (a) T1FS, (b) IT2FS, and (c) T2FS.
View of the secondary membership functions (third dimension) activated by an input p for (a) T1FS, (b) IT2FS, and (c) T2FS.
Structure of a convolutional neural network (CNN).
Outline of the proposed IT2FCNN method.
Performance comparison analysis for Dog vs. Cat with fine-tuning epochs of 3 (a), 5 (b), and 7 (c).
Performance comparison analysis with various fine-tuning epochs (3, 5, and 7)
Model | Epochs | Dog vs Cat: Regular | Dog vs Cat: FCNN 1 | Dog vs Cat: FCNN 2 | Dog vs Cat: IT2FCNN | Lion vs Tiger: Regular | Lion vs Tiger: FCNN 1 | Lion vs Tiger: FCNN 2 | Lion vs Tiger: IT2FCNN | Horse vs Donkey: Regular | Horse vs Donkey: FCNN 1 | Horse vs Donkey: FCNN 2 | Horse vs Donkey: IT2FCNN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AlexNet | 3 | 40 | 54 | 58 | 58 | 60 | 58 | 60 | 64 | | | | |
AlexNet | 5 | 51 | 60 | 61 | 65 | 61 | 64 | 65 | 68 | 60 | 62 | 64 | 68 |
AlexNet | 7 | 54 | 65 | 68 | 72 | 66 | 68 | 72 | 74 | 65 | 68 | 71 | 74 |
ZFNet | 3 | 41 | 53 | 56 | 61 | 56 | 58 | 61 | 64 | 53 | 58 | 61 | 64 |
ZFNet | 5 | 53 | 62 | 61 | 65 | 62 | 63 | 65 | 68 | 62 | 64 | 68 | 70 |
ZFNet | 7 | 54 | 64 | 67 | 73 | 68 | 69 | 73 | 76 | 64 | 69 | 72 | 76 |
GoogLeNet | 3 | 42 | 56 | 57 | 61 | 58 | 61 | 61 | 64 | 56 | 61 | 64 | 64 |
GoogLeNet | 5 | 54 | 61 | 64 | 68 | 64 | 68 | 68 | 72 | 61 | 68 | 70 | 73 |
GoogLeNet | 7 | 57 | 68 | 70 | 74 | 68 | 70 | 74 | 78 | 68 | 70 | 74 | 78 |
VGGNet16 | 3 | 44 | 58 | 57 | 61 | 58 | 57 | 61 | 65 | 58 | 57 | 62 | 65 |
VGGNet16 | 5 | 53 | 62 | 64 | 68 | 64 | 66 | 68 | 72 | 62 | 66 | 69 | 72 |
VGGNet16 | 7 | 55 | 70 | 72 | 76 | 70 | 72 | 76 | 78 | 70 | 72 | 74 | 78 |
ResNet50 | 3 | 43 | 56 | 56 | 63 | 59 | 61 | 63 | 65 | 56 | 61 | 64 | 65 |
ResNet50 | 5 | 54 | 61 | 62 | 67 | 63 | 62 | 67 | 69 | 61 | 62 | 65 | 69 |
ResNet50 | 7 | 56 | 69 | 71 | 78 | 68 | 72 | 76 | 79 | 69 | 72 | 74 | 79 |
Comparison of FCNN models with IT2FCNN based on MSE, RMSE, and MAPE
Model | MSE | RMSE | MAPE |
---|---|---|---|
AlexNet | | | |
FCNN Model 1 | 0.00245 | 0.052 | 4.4 |
FCNN Model 2 | 0.00183 | 0.043 | 3.2 |
IT2FCNN | 0.00123 | 0.035 | 2.4 |
ZFNet | | | |
FCNN Model 1 | 0.00254 | 0.054 | 4.2 |
FCNN Model 2 | 0.00143 | 0.045 | 3.1 |
IT2FCNN | 0.00134 | 0.037 | 2.1 |
GoogLeNet | | | |
FCNN Model 1 | 0.00249 | 0.053 | 4.2 |
FCNN Model 2 | 0.00197 | 0.041 | 3.2 |
IT2FCNN | 0.00123 | 0.036 | 2.0 |
VGGNet16 | | | |
FCNN Model 1 | 0.00244 | 0.056 | 4.2 |
FCNN Model 2 | 0.00158 | 0.047 | 3.0 |
IT2FCNN | 0.00198 | 0.039 | 2.2 |
ResNet50 | | | |
FCNN Model 1 | 0.00268 | 0.058 | 4.1 |
FCNN Model 2 | 0.00139 | 0.046 | 3.1 |
IT2FCNN | 0.00132 | 0.037 | 2.1 |
E-mail: pmurugeswarik7@gmail.com
E-mail: pandyviji@gmail.com