
Original Article


International Journal of Fuzzy Logic and Intelligent Systems 2023; 23(2): 214-228

Published online June 25, 2023

https://doi.org/10.5391/IJFIS.2023.23.2.214

© The Korean Institute of Intelligent Systems

Prediction of Soil Quality in Rwanda for Ideal Cultivation of Potato (Solanum tuberosum) Using Fuzzy Logic and Machine Learning

Christine Musanase1,2, Anthony Vodacek2,3, Damien Hanyurwimfura1,2, Alfred Uwitonze1,2, Aloys Fashaho1,2, and Adrien Turamyemyirijuru1,2

1African Center of Excellence in Internet of Things, University of Rwanda, Kigali, Rwanda
2Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology, Rochester, NY, USA
3College of Agriculture, Animal Sciences and Veterinary Medicine, University of Rwanda, Musanze, Rwanda

Correspondence to:
Christine Musanase (musanasechristine@gmail.com)

Received: March 23, 2023; Revised: May 23, 2023; Accepted: June 15, 2023

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

The ability to estimate soil quality has great value for agriculture, especially for low-income regions with minimal agricultural and financial resources. This prediction provides users with information that is useful in determining whether the soil is suitable for a specific crop, such as potato (Solanum tuberosum). Farmers in Rwanda lack information on soil quality. There are not enough soil laboratories to perform the requisite measurements of NPK, pH, and organic carbon, nor are there enough experts to analyze the data and provide farmers with timely results. The prime objective of the proposed study is to develop a predictive framework that can estimate soil quality for the ideal cultivation of potato (Solanum tuberosum) considering a case study of Rwanda. In this study, bootstrapping is used to augment the small soil dataset, and fuzzy logic is used to label soil data into four classes of soil suitability, with verification of the labeling by soil experts. Several machine learning methods are then tested on the labeled data, resulting in the classification of suitability for the augmented dataset and an assessment of their performance as a way to support experts in predicting soil quality. All machine learning methods applied were viable, with the best performance achieved using an artificial neural network. The quantified outcome showed that the adoption of a neural-network-based scheme has an average accuracy of 32% in contrast to other learning schemes. However, 70%-80% accuracy was achieved upon the adoption of fuzzy logic.

Keywords: Soil quality, Fuzzy logic, Artificial intelligence, Rwanda, Machine learning, NPK, Predictive model

Soil quality is an important factor in determining the suitability of sites for specific crop types. Soil quality is primarily a function of the nitrogen, phosphorus, and potassium (NPK) ratio, soil water pH, and quantity of organic matter [1, 2]. Dependency on nitrogen varies from plant to plant; however, it is known that the demand for nitrogen is always high in leafy plants. Therefore, an appropriate ratio of NPK fertilizer is required. The amounts of water-soluble chemicals and nutrients in the soil are significantly affected by the soil water pH. The availability of soil nutrients differs based on the acidic or alkaline conditions of the soil, and the development of plants can be adversely affected if the soil is highly acidic, with a pH value less than 5.5. This can occur due to various reasons, such as the toxicity associated with calcium, manganese, and aluminum, as well as magnesium deficiency. Furthermore, soil may suffer from deficiencies in manganese, boron, copper, and zinc in cases of high alkalinity. Similarly, a sufficient amount of soil organic matter can increase the capacity of the soil to supply key nutrients and improve soil quality. In addition, a variety of other characteristics such as compaction, soil texture, water-holding capacity, soil biological activity, infiltration rate, sodicity, and cation exchange capacity can affect soil quality. Desertification, biodiversity loss, erosion, deforestation, and agricultural practices also affect soil quality.

Furthermore, site conditions and climatic factors affect potato cultivation [3]. Recent work performed by Coelho et al. [4] suggested that the primary focus should be placed on soil nutrients and organic matter and proposed a scheme emphasizing these attributes. Considering this as an input factor, the proposed scheme uses machine learning to predict the soil quality for potato cultivation. Previous studies [5, 6] noted that machine learning was validated using a segment of an existing dataset with no explicit information required.

The idea of validation is to ensure that appropriate data are considered for predictive analysis. Although the level of nutrients can be altered by farmers, adopting the proposed scheme for predicting soil quality for potato cultivation has an added advantage. The scheme offers an autonomous and reliable score for an appropriate fertilizer along with the pH content, thereby facilitating effective decision-making by farmers. The machine learning process considers multiple inputs, along with an uncertainty factor in the form of the anticipated error rate, to ensure that the predictive outcome is practically applicable to the given test data for the soil of Rwanda. The predictors used in the models are density with respect to organic carbon (OC), density with respect to N, P, K, and water pH. Yields are significantly higher when soil quality is analyzed and used to determine the site suitability for growing specific crops [7, 8]. In Rwanda, relatively little soil variable data are available, and the interpretation of soil quality from these data is performed by experts.

Therefore, the assessment of soil quality for various crops in Rwanda is limited by the availability of soil data and expert analyses. One method to overcome this limitation is to apply machine learning methods to complement the efforts of agricultural experts [9]. For example, fuzzy logic has been used to interpret the values of NPK obtained from conventional soil tests to assess their levels in the soil and to predict possible NPK inputs. Artificial neural networks (ANNs), support vector machines, cubist regression, random forests, and multiple linear regression are other machine learning methods used to advance the prediction of soil OC content [10]. In [11], the authors showed that a machine learning approach improved the accuracy of soil property prediction based on a comparative analysis of six commonly used techniques: random forest, decision tree, naïve Bayes, support vector machine, least-squares support vector machine, and ANN.

In this study, an analysis was conducted in Rwanda, and a machine learning model was built to classify the type of soil into four different groups: 1) highly suitable, 2) moderately suitable, 3) marginally suitable, and 4) not suitable. Potatoes were the target crop because they are an important commercial crop in Rwanda [12]. Soil quality information must be accessible to every farmer to decide on the type of crop to be grown and to earn more profit. In addition to the NPK ratio of the soil, the water pH plays an important role [13]. Moreover, alkaline soils inhibit micronutrient absorption [14]. Hence, it must be ensured that these plants are not planted in hostile environments. If one type of soil is less suited for a particular plant and more suited for another, this result can be further used to deduce yield predictions, and thereby, profitability can be predicted. With input from soil scientists, soil data were processed using fuzzy logic to create labels for a small set of soil data. The experts validated the results to be reasonable, and then machine learning was applied to expand the results to compensate for the fact that there are not enough experts in Rwanda. The fuzzy logic model contains the values for the classification of soil into the values used to predict soil quality. In summary, a combination of soft computing and supervised machine learning was used to predict the soil suitability for potatoes in different regions of Rwanda.

1.1 Review of Literature

This section outlines the existing approaches developed for soil quality prediction. Udutalapally et al. [15] developed a device that can explore the health status of crops by assessing any form of disease using Internet of Things (IoT). The study inferred the inclusion of devices for the evaluation of crop health, which is a recent trend in the inclusion of IoT in agriculture. Existing studies have also included the Bayesian machine learning model, and a classified ensemble was reported in the work of Mosavi et al. [16] for identifying the salinity content of ground water. Zia et al. [17] developed a model that could predict the deficiency of nitrates in soil using a machine learning approach. These studies have focused mainly on salinity factors or single nutrient entities without the inclusion of other essential nutrient factors. In addition to machine learning, deep learning is slowly gaining pace in the predictive analysis of agricultural issues, as reported by Elavarasan and Vincent [18]. They used a deep reinforcement model to perform the predictions. Irrespective of the improved form of the model, benchmarking is lacking to prove its applicability on practical grounds. A unique study presented by de Lima Neto et al. [19] investigated the availability of nutrients in a specific form of banana. However, the model does not include any specific nutrients nor is it benchmarked.

The model presented by Zhang et al. [20] used multiple sets of machine learning methods to identify the carbon content in the soil in a specific region. Similarly, various other work was conducted to assess various problems that directly relate to assessment of soil quality; the studies included averaging method for soil properties [21], forecasting carbon in soil using multiple sets of machine learning [22], yield prediction [23], assessing effectiveness of machine learning [24], growth prediction using deep learning [25], assessing CO2 fluxes [26], prediction of texture [27], reliability forecasting model [28], and carbon variability assessment [29]. A fuzzy-logic-based approach for assessing soil quality has also been investigated by Ogunleye et al. [30], Nooriman et al. [31], Hoseini [32], Chen et al. [33], and Atijosan et al. [34]. However, the accuracy of the model largely depends on the massive size of the training data, which is sometimes unavailable. All the aforementioned studies considered various crops and different locations to predict soil quality. These studies aimed to determine the optimal amount of NPK fertilizer required for better soil quality. However, apart from the beneficial features and claims of these studies, there is also an open scope for further development toward predicting soil quality.

Studies have focused on determining the optimal amount of NPK fertilizer required for better soil quality. Additionally, various predictive mechanisms have been reported for investigating soil quality. Wu et al. [35] presented a fuzzy logic model to evaluate the soil quality based on soil samples. They also applied random forest, cubist mathematical models, and hybrid models (regression kriging) to 201 composite surface soil samples to predict soil chemical properties, such as soil OC and available phosphorus content. However, the study did not facilitate assessment based on individual nutrients, which led to unanswered questions about its applicability.

Kaya et al. [36] compared model-averaging techniques for predicting the spatial distribution of soil properties. Their results showed that ANNs and random forest-based learners were the most effective in predicting the soil properties. However, the applicability of learning-based models was found to differ in different use cases, and the study did not offer an inference of soil properties for optimal soil quality retention. Li et al. [37] used ANNs, multiple linear regression, and support vector machines to model the effects of nanomaterials in compacted clay soil. The applicability of this study is restricted to specific forms of soil. Ozcoban et al. [38] developed a model that can predict nitrate deficiency in soil using a machine learning approach. Arciniegas-Ortega et al. [39] used logistic regression to generate an index based on the land-use order. However, it uses a sophisticated approach that makes it difficult to gather data on a daily basis.

The arguments for the need of this study are as follows. The primary motive was to explore the need for predictive modeling of soil quality. Therefore, this study explores the scope that could offer improved potato productivity in Rwanda. The literature review shows that most studies employed sophisticated machine learning techniques to design their predictive framework. It has also been observed that most existing predictive models for soil quality lack flexibility, which restricts their applicability to a particular crop or region and limits their usage to a particular data size. Moreover, few studies have focused on computing the optimal amount of NPK fertilizer, which is a key indicator for soil prediction. Very few studies have explored the strength factors of the rule set for fuzzy logic type-2, which could assist in enhancing the performance of machine learning models. A review of the existing approaches shows that more work is required to consider real-time soil data, and no similar study has been conducted in Rwanda.

1.2 Purpose and Significance of the Study

The primary reason for conducting the proposed study was to address the current challenges faced by existing soil quality prediction techniques. The first challenge addressed by the proposed scheme is to develop a simplified fuzzy-logic-based scheme that balances ruleset deployment as well as effective decision-making performance verified by a soil expert. The second challenge addressed in this study was to consider the use case of potato cultivation in Rwanda, which has never been explored before. The third challenge addressed in the current study was the deployment of a simplified learning model to perform predictive tasks for determining soil quality. The primary purpose of the present work was to develop an effective machine learning model capable of the predictive analysis of soil quality in Rwanda. A review of existing approaches shows that little work has been performed to consider real-time soil data, and no similar study has been conducted in Rwanda. The next section discusses the materials and methodology used to achieve the aims of this study.

This section emphasizes the methodologies and formulated algorithms for predicting soil quality. This study’s model was developed considering the underlying principles of agricultural science as well as data science. The study model aggregates environmental information using agricultural science, and a similar concept is used to configure the thresholds of pH and NPK values.

2.1 Dataset Collection

Prior to discussing the data collection, it is essential to understand the complete structure of the proposed implementation. Figure 1 highlights the structure of the proposed implementation scheme, which exhibits the area from which soil data were collected, followed by a series of processes to analyze soil quality.

The soil in Rwanda is naturally fragile [40]. Agricultural development is governed by crop intensification programs [41]. Crops are selected for cultivation based on their potential to meet the food security demands of the region, the suitability of the climatic conditions of the ecological region, and comparative advantage. As shown in Figure 2, the Buberuka and Birunga regions, which include the Burera and Rubavu districts in the northern and western provinces of Rwanda, are preferred for cultivating potato [42]. Both districts have high rainfall, low temperatures, high altitude, and steeply sloping hills. The soils in these regions mainly fall under the category of volcanic soil, whereas the Gicumbi and Rwamagana districts are in the northeastern and eastern parts of Rwanda, with altitudes ranging from medium to high, generally drier conditions than Burera and Rubavu, and nonvolcanic soils.

For this investigation, soil samples were collected from two districts, Gicumbi and Rwamagana, during two seasons: September 2017 to February 2018 and March 2018 to August 2018. Soil sampling was performed at a depth of 30 cm, and samples were collected from 16 plots. Soil samples were also collected during the rainy season of 2017 from the Rubavu and Burera districts, where 12 randomly selected plots were sampled at a depth of 0–30 cm. The study was conducted in a controlled research environment following the standard procedure for soil analysis in the laboratory. The final dataset consisted of 6,051 soil samples from four locations: Gicumbi, Rwamagana, Rubavu, and Burera. The five variables of soil data (K, P, N, OC, and pH), which were considered to build our models, are shown in Table 1. The dataset was split for training and testing, with 80% for training and 20% for testing.
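The 80%/20% partition described above can be illustrated with a minimal sketch. The text does not say how the split was drawn, so the shuffled random split, the helper name `train_test_split`, the fixed seed, and the placeholder records below are all assumptions for illustration only.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test) lists."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical records (K, P, N, OC, pH); placeholder values, not taken from Table 1.
samples = [(0.1 * i, 0.2 * i, 0.01 * i, 0.05 * i, 5.0 + 0.01 * i) for i in range(100)]
train, test = train_test_split(samples)
print(len(train), len(test))
```

With class labels that are unevenly represented, a stratified split (same class proportions in both parts) may be preferable; that variant is not shown here.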

The data in Table 1 were gathered following the standard procedures of soil analysis in the soil laboratory of the University of Rwanda, College of Agriculture, Animal Sciences, and Veterinary Medicine, Busogo Campus.

2.2 Soil Sampling and Analysis using Fuzzy Logic

The proposed scheme was analyzed using both fuzzy logic and a machine learning approach. Fuzzy logic is applied to solve problems characterized by partial or vague sets of input information, thereby offering higher flexibility for reasoning in the presence of uncertainties. The proposed scheme implements fuzzy logic because it offers higher robustness: it does not require inputs to be noise-free or fixed. Furthermore, it assists in constructing user-defined rules that are practical to implement. The fuzzy controller can be tuned to optimize the system performance. The fuzzy logic mechanism can also assist in generating a user-friendly outcome while processing a reasonable number of inputs. Finally, fuzzy logic does not depend on complex mathematical formulations because it can efficiently handle nonlinear systems.

Figure 3 highlights the architecture of the fuzzy logic type-2 used in the proposed scheme for predicting soil quality by considering the uncertain and imprecise information associated with the condition of the soil. The first part of this implementation process involves defining the crisp key inputs that affect soil quality. There are various possibilities for such inputs, such as compaction, nutrient levels, moisture content, and organic matter content. The inputs considered were taken from four different agricultural sites in Rwanda, considering the pH value, OC, and proportions of N, P, and K fertilizer. The next processing step, shown in Figure 3, is the construction of a rule base to represent the connection between the input arguments and the outcome of soil quality. The study uses “If-Then” rules for constructing the fuzzy set exhibited in Lines 1–4 of the proposed algorithm. The next step is to develop a fuzzy inference system that applies the rule base to the input arguments. The objective was to assess the degree of influence of each rule on the input variables to yield an outcome. The next operation is defuzzification preceded by a type-reduction operation, which extends the defuzzification process of legacy type-1 fuzzy logic. The architecture shown in Figure 3 can control the uncertainty and possible imprecision associated with the soil data.

The proposed scheme to develop soil suitability labels uses fuzzy logic type-2, as shown in Figure 3, where the proposed scheme considers the soil dataset sample in the form of initial data, followed by the activation of the inference engine and mapping performed through rule construction. The outcome is processed by a type reducer, which is then subjected to defuzzification to obtain the final outcome of the type-reduction score and crisp output.

  • The first module, the fuzzifier, is responsible for mapping the crisp input; the mapping depends solely on the category of fuzzifier being deployed.

  • The second module, the rule base, is responsible for constructing rules from the N, P, and K attributes of soil quality, where the outcome of each rule states the matching predicted quality of soil.

  • The third module, the fuzzy inference engine, is responsible for mapping the input to the output considering all the stated rules in the fuzzy rule base.

  • The fourth module, the type reducer, is an extended operation of defuzzification in which a type-1 fuzzy set is obtained by transforming a type-2 fuzzy set.

  • The fifth and final module, the defuzzifier, is responsible for generating a quantified numerical outcome from the associated degree of membership function, crisp logic, and adopted fuzzy set. This final block maps the fuzzy set into a crisp set, which is essential in a fuzzy control system.
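The five modules listed above can be illustrated with a small interval type-2 sketch. Everything specific in it is an assumption rather than the scheme of Figure 3: the triangular membership shapes, the set boundaries and spreads, the four rules, the min operator for combining antecedents, and the simple averaging of lower and upper firing strengths as the type-reduction step (in the spirit of the Nie-Tan approach). Outputs use the class numbering of this article, 1 (highly suitable) to 4 (not suitable).

```python
def tri(x, a, b, c):
    """Type-1 triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def it2(x, abc, spread):
    """Fuzzifier: return (lower, upper) membership of an interval type-2 set."""
    a, b, c = abc
    upper = tri(x, a - spread, b, c + spread)   # wider triangle bounds from above
    lower = tri(x, a + spread, b, c - spread)   # narrower triangle bounds from below
    return lower, upper

# Hypothetical fuzzy sets (illustrative boundaries, not calibrated values).
PH_SETS = {"acidic": (3.5, 4.5, 5.5), "favourable": (5.0, 6.0, 7.0)}
N_SETS = {"low": (0.0, 0.05, 0.15), "adequate": (0.1, 0.2, 0.3)}
PH_SPREAD, N_SPREAD = 0.2, 0.02

# Rule base: (pH set, N set, crisp consequent = suitability class number).
RULES = [
    ("favourable", "adequate", 1.0),
    ("favourable", "low", 2.0),
    ("acidic", "adequate", 3.0),
    ("acidic", "low", 4.0),
]

def infer(ph, n):
    """Inference, type reduction, and defuzzification for one soil sample."""
    num = den = 0.0
    for ph_name, n_name, score in RULES:
        ph_lo, ph_up = it2(ph, PH_SETS[ph_name], PH_SPREAD)
        n_lo, n_up = it2(n, N_SETS[n_name], N_SPREAD)
        lo, up = min(ph_lo, n_lo), min(ph_up, n_up)   # firing interval of the rule
        strength = (lo + up) / 2.0                    # type reduction by averaging
        num += strength * score
        den += strength
    return num / den if den > 0 else None             # weighted-average defuzzifier

print(infer(6.0, 0.2), infer(4.5, 0.05), infer(5.3, 0.12))
```

A sample at the peak of the favourable sets maps to class 1, one at the peak of the unfavourable sets maps to class 4, and intermediate samples receive a score between the two that can then be rounded to a class.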

The significant benefit of adopting this fuzzy logic technique is that it facilitates modeling of varied levels of uncertainties in predicting soil quality, which cannot be carried out by type-1 fuzzy logic. The algorithmic steps are given in Algorithm 1.

The algorithm described above is used for labeling soil information using fuzzy logic. The algorithm takes the input of the soil data, which, after processing, yields an outcome for the soil. The algorithm constructs a matrix of four columns to retain the information associated with the pH water quality, N quality, P quality, and K quality (Line 1). The next step is to assign quality values (Line 2). Furthermore, the algorithm constructs a conditional logic to assess whether the quality of water pH and N is equivalent to 1, while it also assesses whether the quality of soil has the lowest quality score for P and K (Line 3). The algorithm performs similar rounds of checks for a pH water quality equivalent to 2 (Line 4), and the quality is equivalent to the pH water quality (Line 5). All the above processing steps yield inferences regarding the quality of the soil. The processing outcome showed that the majority of soil samples from the observation region were marginally suitable for potato cultivation.
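As a rough, crisp approximation of the labeling steps just described, the sketch below builds the four-column quality row (water pH, N, P, K) and derives one label from it. It is not Algorithm 1: the cut points in `CUTS` are hypothetical, and the final "worst attribute decides" rule is a simplification swapped in for the conditional checks of the original rule set. It also ignores the fact that an excessively high pH is unfavorable.

```python
from bisect import bisect_right

LABELS = {
    1: "highly suitable",
    2: "moderately suitable",
    3: "marginally suitable",
    4: "not suitable",
}

# Hypothetical cut points per attribute (ascending). A value below the first
# cut gets quality 4; a value at or above the last cut gets quality 1.
CUTS = {
    "pH": (4.5, 5.0, 5.5),
    "N": (0.05, 0.10, 0.20),
    "P": (5.0, 10.0, 20.0),
    "K": (0.1, 0.2, 0.4),
}

def quality(attr, value):
    """Map one measurement to a quality score from 1 (best) to 4 (worst)."""
    return 4 - bisect_right(CUTS[attr], value)

def quality_row(sample):
    """Four-column row: pH quality, N quality, P quality, K quality."""
    return [quality(attr, sample[attr]) for attr in ("pH", "N", "P", "K")]

def label(sample):
    """Overall label taken from the worst of the four quality scores."""
    return LABELS[max(quality_row(sample))]

good = {"pH": 5.8, "N": 0.25, "P": 25.0, "K": 0.5}
fair = {"pH": 5.2, "N": 0.25, "P": 25.0, "K": 0.5}
print(quality_row(good), label(good))
print(quality_row(fair), label(fair))
```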

Table 3. Numerical outcome of accuracy-based comparative analysis

ML approaches          Precision (%)   Recall (%)   F1-score
Gaussian NB            0.82            0.83         0.84
ANN                    0.87            0.85         0.86
Logistic regression    0.79            0.75         0.77
KNN                    0.75            0.72         0.73


The primary reason for the adoption of fuzzy logic for setting up this ruleset is mainly to develop a decision control system in which soil quality can be ascertained based on the rules generated by the user. The advantage of this rule, set up in fuzzy logic, is that the suitability score can be customized by farmers or agricultural experts based on the present agricultural environment of any farming region in Rwanda. Therefore, the higher the soil quality, the higher the anticipated potato yield in Rwanda.

2.3 Analysis Using Machine Learning

The applicability of fuzzy logic was restricted to a few soil samples, and quality predictions were found to be reasonable by soil experts. The soil experts involved in the proposed study were local skilled agriculturists and agronomists with extensive experience in analyzing and interpreting the quality of soil. The experts were consulted regarding the anticipated outcomes of the model. According to them, the lower the error rate of the predictive model, the higher is the possibility of reliable outcomes for a higher degree of soil quality. They also provided a recommendation to assess the individual scope of NPK fertilizer, followed by soil water pH. As experts are not available to assess all the results, the proposed analysis adopted a machine learning approach to increase the analytical coverage of the study. For this purpose, an ANN was adopted to predict soil quality.

An ANN is capable of learning complex problems, thereby making precise decisions regarding the quality of soil based on different quality parameters [43]. It can also perform parallel processing of multiple inputs, thus generating multiple outcomes. This means that the proposed model is capable of predicting the soil quality for any geographic region, not only Rwanda. Gaussian naïve Bayes is a simple and powerful algorithm for predictive modeling. It assumes that each input variable is independent [44]. We also used Gaussian naïve Bayes to classify the soil data because the data are continuous. With the Gaussian model, the mean and standard deviation of each variable are easily calculated from the training data. The naïve Bayes approach often yields accurate and stable models with very small sample sizes, such as our soil sample analysis dataset from Rwanda. Logistic regression can be used to analyze the relationship between multiple independent variables and categorical dependent variables and to estimate the probability of an event [45]. However, in this study, the dependent variable was not dichotomous but comprised four categories; therefore, we used multinomial logistic regression for soil quality prediction. The K-nearest neighbors (KNN) machine learning algorithm is a well-known nonparametric classification method. KNN determines the class of a new sample based on the class of its nearest neighbors [46]. For each modeling approach tested in this study, the assessment of the algorithm was based on four measures: accuracy, precision, recall, and F1 score, which were calculated using the following formulas:
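To make the Gaussian naïve Bayes step concrete, the following from-scratch sketch estimates a per-class mean and variance for each variable and classifies a new sample by the largest log-posterior. The function names, the small variance floor, and the toy (pH, N) data are assumptions for illustration; they are not the configuration or data used in the study.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate log-prior, per-feature mean, and variance for each class."""
    groups = defaultdict(list)
    for row, cls in zip(X, y):
        groups[cls].append(row)
    model = {}
    for cls, rows in groups.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        # Small floor keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(cols, means)]
        model[cls] = (math.log(n / len(X)), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log-posterior for one sample."""
    def log_posterior(cls):
        log_prior, means, variances = model[cls]
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(row, means, variances)
        )
        return log_prior + log_lik
    return max(model, key=log_posterior)

# Toy (pH, N) samples with placeholder values and two of the four labels.
X = [(5.8, 0.20), (6.0, 0.22), (5.9, 0.19), (4.6, 0.06), (4.8, 0.08), (4.7, 0.07)]
y = ["highly suitable"] * 3 + ["not suitable"] * 3
nb_model = fit_gaussian_nb(X, y)
print(predict(nb_model, (5.9, 0.21)), "|", predict(nb_model, (4.7, 0.07)))
```

Working in log space avoids numerical underflow when many variables are multiplied together, which is why the sketch sums log-densities instead of multiplying densities.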

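\begin{align}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN}, \tag{1} \\
\text{Precision} &= \frac{TP}{TP + FP}, \tag{2} \\
\text{Recall} &= \frac{TP}{TP + FN}, \tag{3} \\
\text{F1-score} &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN}. \tag{4}
\end{align}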

In the above expressions (1)–(4), the variables TP, TN, FP, and FN represent true positives, true negatives, false positives, and false negatives, respectively. A TP is a sample correctly predicted as having a given condition (here, belonging to a given suitability class), whereas a TN is a sample correctly predicted as not having it. An FP is a sample incorrectly predicted as having the condition, whereas an FN is a sample incorrectly predicted as not having it. The next section discusses the obtained outcomes.
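The four measures can be computed directly from the counts. The sketch below is illustrative only: the helper names are hypothetical, the one-vs-rest counting is an assumed way of obtaining TP, TN, FP, and FN for one class in a four-class problem, and the numbers are made up.

```python
def one_vs_rest_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN treating `positive` as the class of interest."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn

def metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1-score from the four counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, precision, recall, f1

# Made-up counts for one class.
acc, prec, rec, f1 = metrics(tp=8, tn=5, fp=2, fn=1)
print(acc, prec, rec, f1)

# Made-up labels: classes 1 to 4, scoring class 3 against the rest.
y_true = [3, 3, 1, 2, 3, 4]
y_pred = [3, 2, 1, 3, 3, 4]
print(one_vs_rest_counts(y_true, y_pred, positive=3))
```

Both forms of the F1-score agree: with these counts, 2PR/(P + R) and 2TP/(2TP + FP + FN) both give 16/19.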

This section describes the results of the fuzzy logic method for labeling data, followed by a comparison of the outcomes of different machine learning methods as predictive models. The labeling method was verified as reasonable by soil experts with knowledge of Rwandan soils. Machine learning methods were compared using performance metrics.

3.1 Outcomes of Fuzzy Logic Approach

The outcome of the fuzzy logic was analyzed using kernel density estimation plots. A kernel density estimate is an estimate of the probability density function; it is a variation of the histogram that uses kernel smoothing while plotting the values. Density plots were produced for OC, for N, P, and K individually, and for water pH. The X-axes represent the data points, and the Y-axes represent the probability density function. The region of the plot with a higher peak is the region that contains the largest number of data points.
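A kernel density estimate of this kind can be sketched in a few lines. The Gaussian kernel, the fixed bandwidth of 0.02, and the placeholder values below are assumptions; the text does not state which kernel or bandwidth was used for Figures 4–8.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)
    return density

# Placeholder measurements with two clusters (not values from the dataset).
values = [0.07, 0.08, 0.08, 0.09, 0.08, 0.19, 0.20, 0.21]
kde = gaussian_kde(values, bandwidth=0.02)

# Evaluate on a grid; a plotting library would draw these points as a curve.
step = 0.001
grid = [-0.1 + step * i for i in range(501)]
area = sum(kde(x) for x in grid) * step   # Riemann sum, should be close to 1
print(round(kde(0.08), 3), round(kde(0.14), 3), round(area, 3))
```

The bandwidth controls the smoothing: a larger value merges nearby peaks, while a smaller one makes the curve follow individual samples.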

As shown in Figure 4, the OC percentage was satisfactory throughout the region, and this study evaluated the impact of individual N, P, and K densities on OC proportion (g g⁻¹). Therefore, this variation did not influence the suitability score, and these data could be ignored. In Figure 5, the density plot of N shows two types of quantities in the soil. Most of it was 0.08, whereas another peak was present at 0.2; hence, this became a deciding factor for overall soil quality. In Figure 6, the P percentage is higher in a particular region but lower in others; in the case of a lower N percentage, the P percentage is the deciding factor, along with the K percentage. Negative values of P represent outliers.

Figures 7 and 8 show the density analysis for the proportion of K and water pH. The results show that the density of K is slightly lower, whereas the density of the water pH is quite good for the given region under observation in the proposed use case of Rwanda. The final outcome of soil quality is shown in Figure 9.

The outcomes shown in Figure 9 were produced using the amended version of fuzzy logic. The original sample size was 360 samples; however, a random function was used to increase the sample size programmatically to assess the influence of different samples on the predicted quality. Figure 9 was obtained by considering a first sample of 360 data points (Sample-1), a second sample of 720 data points (Sample-2), and a third sample of 1,440 data points (Sample-3). The ruleset was constructed using fuzzy logic as described earlier, and the result was validated by a soil expert. The outcome of processing showed that the majority of cases from the soil were marginally suitable for cultivating potatoes. It should be noted that the outcome obtained is highly specific to the use case of the four sites in Rwanda from which the samples were collected (represented in the graph as SC-1.0, SC-2.0, SC-3.0, and SC-4.0). According to experts who are also co-authors of this work, fuzzy logic can classify soil with an accuracy of 70%–80%; most of the soil samples came from Acrisols of Gicumbi and Lixisols of Rwamagana, where Irish potatoes are marginally cultivated. Therefore, fuzzy logic offers a better predictive performance.
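The programmatic enlargement of the 360-sample set can be sketched as resampling with replacement (bootstrapping, as named in the abstract). The exact random function used in the study is not given, so the helper `bootstrap`, the fixed seed, and the placeholder rows below are assumptions.

```python
import random

def bootstrap(rows, target_size, seed=0):
    """Draw target_size rows from rows, with replacement."""
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in range(target_size)]

# Placeholder rows standing in for the 360 labeled soil samples.
original = [("sample", i) for i in range(360)]

sample_1 = list(original)               # 360 data points
sample_2 = bootstrap(original, 720)     # 720 data points
sample_3 = bootstrap(original, 1440)    # 1,440 data points
print(len(sample_1), len(sample_2), len(sample_3))
```

Because every resampled row is a copy of an original row, this enlarges the dataset without adding new measurements; any split into training and test data is best made before resampling so that copies of one row do not land on both sides.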

3.2 Outcomes of Machine Learning Approach

The outcomes of the machine learning approach were assessed using performance metrics.

The proposed scheme was assessed through a comparison of the results with a set of frequently used machine learning methods, that is, ANN, Gaussian naïve Bayes, logistic regression, and KNN. The results shown in Figure 10 and Table 3 indicate that ANN offers better predictive accuracy than the other approaches. This suggests that predictive modeling of soil quality using ANNs offers higher reliability. Machine learning contributes to the development of a predictive model that can be used by farmers and agriculturists to understand the possibilities of suitable environments for potato farming. The model is scalable in the sense that, given any dataset of the same soil variables, it can perform predictions associated with soil quality. Based on the predicted outcomes, farmers can fine-tune the concentration of various fertilizers to ensure better crop cultivation in Rwanda. Hence, it saves time and effort in decision-making for a better yield. From the perspective of novelty, the core idea of the proposed scheme is to develop a cost-effective and simplified computational predictive model that can assess soil quality suitable for potato cultivation in Rwanda. The proposed model is based on data acquisition and the design of a fuzzy processor; applying a machine learning approach for this purpose is a novel notion under the environmental conditions of Rwanda. The model is simple in design and implementation and offers reliable outcomes in easy steps. The basis of this model deployment is to refrain from adopting any advanced data analytical or predictive scheme using machine learning or deep learning, where accuracy is achieved at the cost of the computational burden. However, future work can be conducted to extend the proposed machine learning predictive scheme to its advanced version, ensuring a lower computational cost without affecting accuracy. The next section presents a discussion of the results of the proposed study.

The results in the previous section show that the proposed scheme not only can identify the optimal fertilizers required for potato cultivation but also offers a robust classification mechanism to determine the soil quality required for this purpose. Although the scheme was implemented for the use case of potatoes in Rwanda, the same model can also be applied to other agricultural crops in other countries. This is possible by altering the input variables of the learning model used to predict soil quality, which, at present, are drawn from only four sites in Rwanda.

To understand the contribution of the proposed method to potato production, it is necessary to understand the agricultural and cultivation factors required for optimal potato production. The amount of fertilizer to be applied to the soil for potato cultivation depends on the soil test data. The proposed scheme used standard input data from four agricultural sites in Rwanda to determine the optimal distribution of nutrients. NPK fertilizers are typically administered in the same proportion at the same time during planting; however, the demand for these fertilizers changes over the period of cultivation. Type-2 fuzzy logic was applied to the soil data to predict soil quality with respect to water quality and with respect to the individual N, P, and K fertilizers. The accuracy of this predictive analysis was further improved by deploying machine learning schemes, in particular an ANN. Hence, the model is capable of predicting the change in the quantity of fertilizer required to reach the optimal soil quality and ensure better production.

Although a machine learning approach was introduced to overcome the constraints of using fuzzy logic across various types of soil, fuzzy logic should not be deemed a low-performing module. It has advantages of its own that a machine learning approach cannot match: it provides a user-friendly logical ruleset that is both easy to deploy and flexible to manage, and the ruleset can be redefined for the problem space under investigation. Machine learning, in turn, contributes higher predictive accuracy over a larger coverage of the soil area. The fuzzy logic analysis showed a higher proportion of OC density (Figure 4) and a lower proportion of N (Figure 5) and P (Figure 6), while the proportion of K (Figure 7) was better than those of N and P, which are highly essential for potato cultivation. Furthermore, the predictive performance of fuzzy logic was validated by a soil expert (Figure 10). The analysis of the machine learning schemes demonstrates that the ANN offers better predictive performance than the other approaches. The main reasons are the less effective feature management of the Gaussian naïve Bayes algorithm, the shortcomings of logistic regression in handling linearity across different categories of variables, and the limited applicability and scalability of the KNN algorithm to higher-dimensional data. In contrast, an ANN is not difficult to implement with parallel processing, which not only speeds up processing but also helps minimize the error at each successive epoch. Hence, the ANN demonstrated better performance in predicting soil quality in this study.
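To make one of the compared baselines concrete, the sketch below shows a minimal k-nearest-neighbors classifier over the five predictors (OC, N, P, K, and water pH). It is an illustration under stated assumptions, not the implementation used in this study: the function name `knn_predict`, the choice of k = 3, the Euclidean distance, and all feature values (assumed to be rescaled to the range 0 to 1) are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training rows."""
    nearest = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training rows: (OC, N, P, K, pH), each rescaled to [0, 1].
train_rows = [
    (0.9, 0.8, 0.7, 0.8, 0.6),
    (0.8, 0.9, 0.8, 0.7, 0.5),
    (0.5, 0.5, 0.6, 0.5, 0.5),
    (0.6, 0.4, 0.5, 0.6, 0.4),
    (0.2, 0.1, 0.2, 0.1, 0.9),
    (0.1, 0.2, 0.1, 0.2, 0.1),
]
train_labels = [
    "highly suitable", "highly suitable",
    "moderately suitable", "moderately suitable",
    "not suitable", "not suitable",
]

rich_soil = (0.85, 0.85, 0.75, 0.75, 0.55)
poor_soil = (0.15, 0.15, 0.15, 0.15, 0.30)
print(knn_predict(train_rows, train_labels, rich_soil))
print(knn_predict(train_rows, train_labels, poor_soil))
```

Because every prediction computes a distance to every training row across all features, the cost and the sensitivity to irrelevant features grow with the size and dimension of the data, which is consistent with the scalability concern raised above for KNN.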

This study presents a simplified predictive model for forecasting soil quality in selected areas of Rwanda, a topic that has been less explored in existing studies. The contributions of the proposed scheme are as follows: (1) prediction is performed using both fuzzy logic and machine learning, each with its own benefits, rather than the sophisticated combinations of multiple machine learning schemes found in existing studies; (2) the proposed predictive model is highly flexible in its operation and does not depend solely on the environment, which means that it can be applied to predict soil quality for other crops or other regions, irrespective of the size or dimension of the data; (3) the outcome of the proposed scheme shows that the proposed fuzzy logic scheme offers better predictive performance; (4) the quantified outcome of the machine learning scheme shows that the proposed artificial neural network offers approximately 25% higher accuracy than Gaussian naïve Bayes, approximately 32% higher accuracy than logistic regression, and approximately 37% higher accuracy than the KNN algorithm; and (5) overall, the proposed model achieves an accuracy of 89%, whereas the fuzzy logic algorithm proposed in this system classifies the soil with an accuracy of 70% to 80% with shallow learning algorithms.

The numerical outcome of the proposed scheme offers significant guidelines for potato cultivation in Rwanda. Based on this outcome, an agronomist will need to identify the geographical features of Rwanda and then develop a smart ruleset using fuzzy logic. This could help identify better proportions of NPK fertilizers for improving soil quality. Hence, without much re-engineering, this model can be used for any geographical location to exhibit better predictive performance than the frequently exercised schemes. Future work will focus on using soil sensors for soil analysis to overcome the shortage of soil datasets and soil experts and to address geographical differences. Future work will also address soil map generation to further assess soil quality under various challenging agricultural conditions.

  1. Krasilnikov, P, Taboada, MA, and Amanullah (2022). Fertilizer use, soil health and agricultural sustainability. Agriculture. 12. article no. 462
  2. Wan, LJ, Tian, Y, He, M, Zheng, YQ, Lyu, Q, Xie, RJ, Ma, YY, Deng, L, and Yi, SL (2021). Effects of chemical fertilizer combined with organic fertilizer application on soil properties, citrus growth physiology, and yield. Agriculture. 11. article no. 1207
  3. Manrique, LA (1992). Potato production in the tropics: crop requirements. Journal of Plant Nutrition. 15, 2679-2726. https://doi.org/10.1080/01904169209364504
  4. Coelho, ARF, Daccak, D, Marques, AC, Luis, IC, Pessoa, CC, and Silva, MM (2022). Comparison of soils of two fields for potato production located in the same region of Portugal. Chemistry Proceedings. 10. article no. 53
  5. Gomez, D, Salvador, P, Sanz, J, and Casanova, JL (2019). Potato yield prediction using machine learning techniques and sentinel 2 data. Remote Sensing. 11. article no. 1745
  6. Li, D, Miao, Y, Gupta, SK, Rosen, CJ, Yuan, F, Wang, C, Wang, L, and Huang, Y (2021). Improving potato yield prediction by combining cultivar information and UAV remote sensing data using machine learning. Remote Sensing. 13. article no. 3322
  7. Penn, CJ, and Camberato, JJ (2019). A critical review on soil chemical processes that control how soil pH affects phosphorus availability to plants. Agriculture. 9. article no. 120
  8. Basak, N, Mandal, B, Biswas, S, Basak, P, Mitran, T, and Saha, B (2022). Impact of long term nutrient management on soil quality indices in rice-wheat system of lower Indo-Gangetic Plain. Sustainability. 14. article no. 6533
  9. Global Hunger Index. (2022). Rwanda. [Online]. Available: https://www.globalhungerindex.org/rwanda.html
  10. Diao, X, Bahiigwa, G, and Pradesha, A (2014). The Role of Agriculture in the Fast-Growing Rwandan Economy: Assessing Growth Alternatives. Washington, DC: International Food Policy Research Institute
  11. Food and Agriculture Organization of the United Nations. (c2023). Rwanda at a glance. Available: https://www.fao.org/rwanda/our-office-in-rwanda/rwanda-at-a-glance/en/
  12. Muratore, C, Espen, L, and Prinsi, B (2021). Nitrogen uptake in plants: the plasma membrane root transport systems from a physiological and proteomic perspective. Plants. 10. article no. 681
  13. Dhaliwal, SS, Sharma, V, Shukla, AK, Verma, V, Kaur, M, and Shivay, YS (2022). Biofortification: a frontier novel approach to enrich micronutrients in field crops to encounter the nutritional security. Molecules. 27. article no. 1340
  14. Denton-Thompson, SM, and Sayer, EJ (2022). Micronutrients in food production: what can we learn from natural ecosystems?. Soil Systems. 6. article no. 8
  15. Udutalapally, V, Mohanty, SP, Pallagani, V, and Khandelwal, V (2021). sCrop: a novel device for sustainable automatic disease prediction, crop selection, and irrigation in Internet-of-Agro-Things for smart agriculture. IEEE Sensors Journal. 21, 17525-17538. https://doi.org/10.1109/JSEN.2020.3032438
  16. Mosavi, A, Hosseini, FS, Choubin, B, Goodarzi, M, and Dineva, AA (2020). Groundwater salinity susceptibility mapping using classifier ensemble and Bayesian machine learning models. IEEE Access. 8, 145564-145576. https://doi.org/10.1109/ACCESS.2020.3014908
  17. Zia, H, Harris, NR, Merrett, GV, and Rivers, M (2019). A low-complexity machine learning nitrate loss predictive model: towards proactive farm management in a networked catchment. IEEE Access. 7, 26707-26720. https://doi.org/10.1109/ACCESS.2019.2901218
  18. Elavarasan, D, and Vincent, PD (2020). Crop yield prediction using deep reinforcement learning model for sustainable agrarian applications. IEEE Access. 8, 86886-86901. https://doi.org/10.1109/ACCESS.2020.2992480
  19. de Lima Neto, AJ, Deus, JALD, Rodrigues Filho, VA, Natale, W, and Parent, LE (2020). Nutrient diagnosis of fertigated “prata” and “cavendish” banana (Musa spp.) at plot-scale. Plants. 9. article no. 1467
  20. Zhang, X, Xue, J, Chen, S, Wang, N, Shi, Z, Huang, Y, and Zhuo, Z (2022). Digital mapping of soil organic carbon with machine learning in Dryland of Northeast and North Plain China. Remote Sensing. 14. article no. 2504
  21. Taghizadeh-Mehrjardi, R, Khademi, H, Khayamim, F, Zeraatpisheh, M, Heung, B, and Scholten, T (2022). A comparison of model averaging techniques to predict the spatial distribution of soil properties. Remote Sensing. 14. article no. 472
  22. Emadi, M, Taghizadeh-Mehrjardi, R, Cherati, A, Danesh, M, Mosavi, A, and Scholten, T (2020). Predicting and mapping of soil organic carbon using machine learning algorithms in Northern Iran. Remote Sensing. 12. article no. 2234
  23. Abbas, F, Afzaal, H, Farooque, AA, and Tang, S (2020). Crop yield prediction through proximal sensing and machine learning algorithms. Agronomy. 10. article no. 1046
  24. Joshua, V, Priyadharson, SM, and Kannadasan, R (2021). Exploration of machine learning approaches for paddy yield prediction in eastern part of Tamilnadu. Agronomy. 11. article no. 2068
  25. Khan, T, Sherazi, HHR, Ali, M, Letchmunan, S, and Butt, UM (2021). Deep learning-based growth prediction system: a use case of China agriculture. Agronomy. 11. article no. 1551
  26. Adjuik, TA, and Davis, SC (2022). Machine learning approach to simulate soil CO2 fluxes under cropping systems. Agronomy. 12. article no. 197
  27. Gouda, MZ, Nagihi, EM, Khiari, L, Gallichand, J, and Ismail, M (2021). Artificial intelligence-based prediction of key textural properties from LUCAS and ICRAF spectral libraries. Agronomy. 11. article no. 1550
  28. Maksimovic, J, Pivic, R, Stanojkovic-Sebic, A, Jovkovic, M, Jaramaz, D, and Dinic, Z (2021). Influence of soil type on the reliability of the prediction model for bioavailability of Mn, Zn, Pb, Ni and Cu in the soils of the Republic of Serbia. Agronomy. 11. article no. 141
  29. John, K, Abraham Isong, I, Michael Kebonye, N, Okon Ayito, E, Chapman Agyeman, P, and Marcus Afu, S (2020). Using machine learning algorithms to estimate soil organic carbon variability with environmental variables and soil nutrient indicators in an alluvial soil. Land. 9. article no. 487
  30. Ogunleye, GO, Fashoto, SG, Mashwama, P, Arekete, SA, Olaniyan, OM, and Omodunbi, BA (2018). Fuzzy logic tool to forecast soil fertility in Nigeria. The Scientific World Journal. 2018. article no. 3170816
  31. Nooriman, WM, Abdullah, AH, Rahim, NA, and Tan, ESMM (2021). Fuzzy logic based prediction of micronutrients demand for Harumanis mango growth cycles. Journal of Physics: Conference Series. 2107. article no. 012048
  32. Hoseini, Y (2019). Use fuzzy interface systems to optimize land suitability evaluation for surface and trickle irrigation. Information Processing in Agriculture. 6, 11-19. https://doi.org/10.1016/j.inpa.2018.09.003
  33. Chen, LC, Wibowo, N, and Utama, DN (2021). Extended fuzzy decision support model for cropland recommendation of food cropping in Indonesia. Journal of Computer Science. 17, 709-723. https://doi.org/10.3844/jcssp.2021.709.723
  34. Atijosan, A, Muibi, K, Ogunyemi, S, Adewoyin, J, Badru, R, Alaga, A, and Shaba, A (2015). Agricultural land suitability assessment using fuzzy logic and geographic information system techniques. International Journal of Scientific Research in Science and Technology. 1, 113-118.
  35. Wu, C, Dai, E, Zhao, Z, Wang, Y, and Liu, G (2021). Soil quality assessment during the dry season in the Mun River Basin Thailand. Land. 10. article no. 61
  36. Kaya, F, Keshavarzi, A, Francaviglia, R, Kaplan, G, Başayigit, L, and Dedeoglu, M (2022). Assessing machine learning-based prediction under different agricultural practices for digital mapping of soil organic carbon and available phosphorus. Agriculture. 12. article no. 1062
  37. Li, H, Leng, W, Zhou, Y, Chen, F, Xiu, Z, and Yang, D (2014). Evaluation models for soil nutrient based on support vector machine and artificial neural networks. The Scientific World Journal. 2014. article no. 478569
  38. Ozcoban, MS, Isenkul, ME, Sevgen, S, Acarer, S, and Tufekci, M (2021). Modelling the effects of nanomaterial addition on the permeability of the compacted clay soil using machine learning-based flow resistance analysis. Applied Sciences. 12. article no. 186
  39. Arciniegas-Ortega, S, Molina, I, and Garcia-Aranda, C (2022). Soil order-land use index using field-satellite spectroradiometry in the Ecuadorian Andean territory for modeling soil quality. Sustainability. 14. article no. 7426
  40. Knoema. (c2023). Rwanda - Poverty headcount ratio at national poverty line. Available: https://knoema.com/atlas/Rwanda/Poverty-rate-atnational-poverty-line
  41. Nsabimana, A, Niyitanga, F, Weatherspoon, DD, and Naseem, A (2021). Land policy and food prices: evidence from a land consolidation program in Rwanda. Journal of Agricultural & Food Industrial Organization. 19, 63-73. https://doi.org/10.1515/jafio-2021-0010
  42. Karemangingo, C, and Bugenimana, DE (2018). Productivity of Irish potato varieties under increasing nitrogen fertilizer application rates in Eastern Rwanda. African Journal of Agricultural Research. 13, 988-995. https://doi.org/10.5897/AJAR2018.13068
  43. Kujawa, S, and Niedbała, G (2021). Artificial neural networks in agriculture. Agriculture. 11. article no. 497
  44. Gadekallu, TR, Alazab, M, Kaluri, R, Maddikunta, PKR, Bhattacharya, S, and Lakshmanna, K (2021). Hand gesture classification using a novel CNN-crow search algorithm. Complex & Intelligent Systems. 7, 1855-1868. https://doi.org/10.1007/s40747-021-00324-x
  45. Park, HA (2013). An introduction to logistic regression: from basic concepts to interpretation with particular attention to nursing domain. Journal of Korean Academy of Nursing. 43, 154-164. https://doi.org/10.4040/jkan.2013.43.2.154
  46. Saadatfar, H, Khosravi, S, Joloudari, JH, Mosavi, A, and Shamshirband, S (2020). A new K-nearest neighbors classifier for big data based on efficient data pruning. Mathematics. 8. article no. 286

Christine Musanase is currently serving as an assistant lecturer in the Department of Information Systems, School of ICT, at the University of Rwanda, College of Science and Technology. She holds a bachelor’s degree in information technology and a master’s degree in information systems, both from the University of Rwanda. She is a program leader of Information Systems. She is pursuing her Ph.D. in wireless intelligence sensor networks at the University of Rwanda through the African Center of Excellence in the Internet of Things (Rwanda). She is a member of RAWISE and the OWSD National chapter. She holds certifications in Oracle Database Developer & Programmer, Rapid Prototyping for Internet of Things, Python programming, Data Science, and R Programming for Data Analytics. Her research interests are informatics, wireless intelligence sensor networks, Internet of Things, artificial intelligence, data analytics, and machine learning. She is a member of three research grant projects titled (1) Tools for Evaluating African Lakes, (2) IoT and AI Applied Research Results Commercialization through the Incubation Hub, and (3) IoT Empowered Precision Agricultural Techniques for Improved Rice Production: An Automated Irrigation and Fertilization Application System for Small-scale Rice Producers in Rwanda. E-mail: musanasechristine@gmail.com

Anthony Vodacek is a full professor of Imaging Science at Rochester Institute of Technology (RIT). He received his B.S. (Chemistry) in 1981 from the University of Wisconsin-Madison, and his M.S. and Ph.D. (environmental engineering) in 1985 and 1990, respectively, from Cornell University. His areas of research lie broadly in multimodal remote sensing, with a focus on the coupling of imaging with modeling for monitoring human and natural terrestrial and aquatic systems. His expertise is in spectral phenomenology, image interpretation, machine learning, and dynamic data-driven application systems. He has recently applied these methods to projects addressing vehicle tracking, precision agriculture, and harmful algal blooms. His newest research areas involve remote sensing of the African Great Lakes and remote sensing of insects in the context of biodiversity assessment. He has worked in Rwanda for more than a decade on various teaching and research projects. Vodacek is on the Fulbright Specialist roster (2018–2023), is an Associate Editor for the Journal of Great Lakes Research, is a Senior Member of IEEE, supports the IEEE Geoscience and Remote Sensing Society global initiative as an ad hoc regional liaison to Sub-Saharan Africa, and is a Corresponding Fellow of the Pan-African Scientific Research Council. E-mail: axvpci@rit.edu

Damien Hanyurwimfura is an associate professor and the Acting Director of the African Center of Excellence in Internet of Things (ACEIoT), College of Science and Technology, University of Rwanda. He received his Bachelor of Engineering degree in computer engineering and information technology from the University of Rwanda (formerly KIST) in 2005. He obtained his Master of Engineering degree in computer science and technology and Ph.D. degree in computer science and technology from Hunan University, China, in 2010 and 2015, respectively. He has served as the Head of PhD studies and Research in the ACEIoT for four years. He has published and co-authored over 30 research papers in leading international journals and conferences. He participated in many AI workshops as a speaker. He has secured four research grants at the national and regional levels as a principal investigator or co-PI. His research interests include most aspects of data mining, machine learning, computer security, watermarking, Internet of Things, hate speech detection, and recommender systems. E-mail: hadamfr@gmail.com

Alfred Uwitonze is a senior lecturer and Dean of the School of Information and Communication Technology at the University of Rwanda, College of Science and Technology. He received a Bachelor of Science degree in electronics and telecommunication engineering from the University of Rwanda (UR), College of Science and Technology, Rwanda, in 2005 and an MSc degree in communication and information systems from Huazhong University of Science and Technology (HUST), China, in 2009. He completed his Ph.D. in information and communication engineering at HUST in 2017. His Ph.D. work focused on network coding and its applications. His research interests include network coding, computer networks, wireless sensor networks, and network security. E-mail: alfruwitonze@gmail.com

Aloys Fashaho holds a Ph.D. degree in soil science from Egerton University (2020). He holds a master’s degree in sanitary engineering/agricultural sciences and biological engineering from the “Faculté Universitaire des Sciences Agronomiques de Gembloux/Belgium” (2008) and a bachelor’s degree in soils and agricultural engineering from the former National University of Rwanda (NUR) (2002). Aloys is a lecturer at University of Rwanda, College of Agriculture, Animal Sciences and Veterinary Medicine, School of Agriculture and Food Sciences. He is also the Head of the Department of Soil Sciences. His research interests include soil fertility and conservation and agricultural waste management. He has worked on “Evaluation of Soil Properties and Response of Maize (Zea mays L.) to Bioslurry and Mineral Fertilizers in Terraced Acrisols and Lixisols of Rwanda.” E-mail: aloysfashaho@gmail.com

Adrien Turamyenyirijuru is a researcher and lecturer at the College of Agriculture, Animal Sciences and Veterinary Medicine, University of Rwanda. He received a B.Sc. in agriculture from National University of Rwanda in 2007, an M.Sc. in sustainable soil resource management from University of Nairobi in 2013, and a Ph.D. in agronomy from Egerton University in 2020. He has worked on sustainable soil management, sustainable fertilizer use, plant nutrition, and precision agriculture. He is currently the Team Leader of the Task Force in the process of operationalization of the ACES and Coordinator of Potato STIC and Dairy STIC. E-mail: adratur2005@yahoo.fr



Keywords: Soil quality, Fuzzy logic, Artificial intelligence, Rwanda, Machine learning, NPK, Predictive model

1. Introduction

Soil quality is an important factor in determining the suitability of sites for specific crop types. Soil quality is primarily a function of the nitrogen, phosphorus, and potassium (NPK) ratio, soil water pH, and quantity of organic matter [1, 2]. Dependency on nitrogen varies from plant to plant; however, the demand for nitrogen is known to be consistently high in leafy plants. Therefore, an appropriate ratio of NPK fertilizer is required. The amounts of water-soluble chemicals and nutrients in the soil are significantly affected by the soil water pH. The availability of soil nutrients differs based on the acidic or alkaline conditions of the soil, and the development of plants can be adversely affected if the soil is highly acidic, with a pH value less than 5.5. This can occur for various reasons, such as the toxicity associated with calcium, manganese, and aluminum, as well as magnesium deficiency. Furthermore, soil may suffer from deficiencies in manganese, boron, copper, and zinc in cases of high alkalinity. Similarly, a sufficient amount of soil organic matter can increase the capacity of the soil to supply key nutrients and improve soil quality. In addition, a variety of other characteristics such as compaction, soil texture, water-holding capacity, soil biological activity, infiltration rate, sodicity, and cation exchange capacity can affect soil quality. Desertification, biodiversity loss, erosion, deforestation, and agricultural practices also affect soil quality.

Furthermore, site conditions and climatic factors affect potato cultivation [3]. Recent work by Coelho et al. [4] suggested that the primary focus should be placed on soil nutrients and organic matter and proposed a scheme emphasizing these attributes. Taking these attributes as input factors, the proposed scheme uses machine learning to predict the soil quality for potato cultivation. In previous studies [5, 6], machine learning models were validated using a segment of an existing dataset, with no further explicit information required.

The idea of validation is to ensure that appropriate data are considered for predictive analysis. Although the level of nutrients can be altered by farmers, adopting the proposed scheme for predicting soil quality for potato cultivation has an added advantage: it offers an autonomous and reliable score for an appropriate fertilizer along with the pH content, thereby facilitating effective decision-making by farmers. The machine learning process considers multiple inputs, along with an uncertainty factor in the form of the anticipated error rate, to ensure that the predictive outcome is practical for the given test data on the soil of Rwanda. The predictors used in the models are the densities of organic carbon (OC), N, P, and K, and the water pH. Yields are significantly higher when soil quality is analyzed and used to determine the site suitability for growing specific crops [7, 8]. In Rwanda, relatively little soil variable data are available, and the interpretation of soil quality from these data is performed by an expert.

Therefore, the assessment of soil quality for various crops in Rwanda is limited by the availability of soil data and expert analyses. One method to overcome this limitation is to apply machine learning methods to complement the efforts of agricultural experts [9]. For example, fuzzy logic has been used to interpret the values of NPK obtained from conventional soil tests to assess their levels in the soil and to predict possible NPK inputs. Artificial neural networks (ANNs), support vector machines, cubist regression, random forests, and multiple linear regression are other machine learning methods used to advance the prediction of soil OC content [10]. In [11], the authors showed that a machine learning approach improved the accuracy of soil property prediction based on a comparative analysis of six commonly used techniques: random forest, decision tree, naïve Bayes, support vector machine, least-squares support vector machine, and ANN.

In this study, an analysis was conducted in Rwanda, and a machine learning model was built to classify soil into four groups: 1) highly suitable, 2) moderately suitable, 3) marginally suitable, and 4) not suitable. Potatoes were the target crop because they are an important commercial crop in Rwanda [12]. Soil quality information must be accessible to every farmer so that they can decide on the type of crop to grow and earn more profit. In addition to the NPK ratio of the soil, the water pH plays an important role [13]. Moreover, alkaline soils inhibit micronutrient absorption [14]. Hence, it must be ensured that potatoes are not planted in hostile environments. If one type of soil is less suited to a particular plant and more suited to another, this result can be further used to deduce yield predictions and thereby predict profitability. With input from soil scientists, soil data were processed using fuzzy logic to create labels for a small set of soil data. The experts confirmed that the results were reasonable, and machine learning was then applied to extend the results and compensate for the shortage of experts in Rwanda. The fuzzy logic model contains the value ranges used to classify the soil into the classes from which soil quality is predicted. In summary, a combination of soft computing and supervised machine learning was used to predict the soil suitability for potatoes in different regions of Rwanda.
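The labeling step can be pictured with the following minimal sketch, which maps a soil sample to one of the four suitability classes using trapezoidal membership functions. It is a simplified type-1 illustration and not the rule base developed in this study: the breakpoints in `FAVOURABLE`, the units, the use of the minimum as the fuzzy AND, and the class cutoffs are all hypothetical assumptions chosen for the example.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical "favourable for potato" ranges (a, b, c, d) per predictor.
# These numbers are placeholders, not values taken from the study.
FAVOURABLE = {
    "oc": (0.5, 2.0, 6.0, 10.0),
    "n": (0.05, 0.2, 0.5, 1.0),
    "p": (5.0, 15.0, 50.0, 100.0),
    "k": (50.0, 150.0, 400.0, 800.0),
    "ph": (4.5, 5.5, 6.5, 7.5),
}


def suitability_score(sample):
    """Fuzzy AND (minimum) of the favourable memberships of all predictors."""
    return min(trapezoid(sample[name], *FAVOURABLE[name]) for name in FAVOURABLE)


def suitability_label(score):
    """Map a score in [0, 1] to a class; the cutoffs are assumptions."""
    if score >= 0.75:
        return "highly suitable"
    if score >= 0.5:
        return "moderately suitable"
    if score >= 0.25:
        return "marginally suitable"
    return "not suitable"


sample = {"oc": 3.0, "n": 0.3, "p": 20.0, "k": 200.0, "ph": 6.0}
acidic = dict(sample, ph=5.0)
print(suitability_label(suitability_score(sample)))
print(suitability_label(suitability_score(acidic)))
```

Taking the minimum means that the least favourable predictor limits the overall label, so a sample with good nutrient levels but a low pH is still downgraded. Labels produced this way would then serve as training targets for the supervised classifiers after expert review.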

1.1 Review of Literature

This section outlines the existing approaches developed for soil quality prediction. Udutalapally et al. [15] developed a device that explores the health status of crops by assessing any form of disease using the Internet of Things (IoT). The study points to the use of devices for evaluating crop health, a recent trend in the application of IoT to agriculture. Bayesian machine learning models and a classifier ensemble were reported by Mosavi et al. [16] for identifying the salinity content of groundwater. Zia et al. [17] developed a model that could predict the deficiency of nitrates in soil using a machine learning approach. These studies focused mainly on salinity factors or single nutrient entities without including other essential nutrient factors. In addition to machine learning, deep learning is slowly gaining pace in the predictive analysis of agricultural issues, as reported by Elavarasan and Vincent [18], who used a deep reinforcement learning model to perform the predictions. Despite the improved form of the model, benchmarking is lacking to prove its applicability on practical grounds. A unique study presented by de Lima Neto et al. [19] investigated the availability of nutrients in specific varieties of banana. However, the model does not include any specific nutrients, nor is it benchmarked.

The model presented by Zhang et al. [20] used multiple sets of machine learning methods to identify the carbon content in the soil in a specific region. Similarly, various other work was conducted to assess various problems that directly relate to assessment of soil quality; the studies included averaging method for soil properties [21], forecasting carbon in soil using multiple sets of machine learning [22], yield prediction [23], assessing effectiveness of machine learning [24], growth prediction using deep learning [25], assessing CO2 fluxes [26], prediction of texture [27], reliability forecasting model [28], and carbon variability assessment [29]. A fuzzy-logic-based approach for assessing soil quality has also been investigated by Ogunleye et al. [30], Nooriman et al. [31], Hoseini [32], Chen et al. [33], and Atijosan et al. [34]. However, the accuracy of the model largely depends on the massive size of the training data, which is sometimes unavailable. All the aforementioned studies considered various crops and different locations to predict soil quality. These studies aimed to determine the optimal amount of NPK fertilizer required for better soil quality. However, apart from the beneficial features and claims of these studies, there is also an open scope for further development toward predicting soil quality.

Studies have focused on determining the optimal amount of NPK fertilizer required for better soil quality. Additionally, various predictive mechanisms have been reported for investigating soil quality. Wu et al. [35] presented a fuzzy logic model to evaluate the soil quality based on soil samples. They also applied random forest, cubist mathematical models, and hybrid models (regression kriging) to 201 composite surface soil samples to predict soil chemical properties, such as soil OC and available phosphorus content. However, the study did not facilitate assessment based on individual nutrients, which led to unanswered questions about its applicability.

Kaya et al. [36] compared model-averaging techniques for predicting the spatial distribution of soil properties. Their results showed that ANNs and random forest-based learners were the most effective in predicting the soil properties. However, the applicability of learning-based models was found to differ in different use cases, and the study did not offer an inference of soil properties for optimal soil quality retention. Li et al. [37] used ANNs, multiple linear regression, and support vector machines to model the effects of nanomaterials in compacted clay soil. The applicability of this study is restricted to specific forms of soil. Ozcoban et al. [38] developed a model that can predict nitrate deficiency in soil using a machine learning approach. Arciniegas-Ortega et al. [39] used logistic regression to generate an index based on the land-use order. However, it uses a sophisticated approach that makes it difficult to gather data on a daily basis.

The need for this study arises from the following observations. The primary motive was to explore predictive modeling of soil quality as a route to improved potato productivity in Rwanda. The literature review shows that most studies employed sophisticated machine learning techniques to design their predictive frameworks. Most existing predictive models for soil quality also lack flexibility: their applicability is restricted to a particular crop or region, and their usage is limited to a particular data size. Few studies have focused on computing the optimal amount of NPK fertilizer, which is a key indicator for soil prediction. Very few studies have explored the strength factors of the rule set for fuzzy logic type-2, despite its potential to enhance the performance of machine learning models. Finally, more work is required to consider real-time soil data, and no similar study has been conducted in Rwanda.

1.2 Purpose and Significance of the Study

The primary reason for conducting the proposed study was to address the current challenges faced by existing soil quality prediction techniques. The first challenge addressed by the proposed scheme is to develop a simplified fuzzy-logic-based scheme that balances ruleset deployment as well as effective decision-making performance verified by a soil expert. The second challenge addressed in this study was to consider the use case of potato cultivation in Rwanda, which has never been explored before. The third challenge addressed in the current study was the deployment of a simplified learning model to perform predictive tasks for determining soil quality. The primary purpose of the present work was to develop an effective machine learning model capable of the predictive analysis of soil quality in Rwanda. A review of existing approaches shows that little work has been performed to consider real-time soil data, and no similar study has been conducted in Rwanda. The next section discusses the materials and methodology used to achieve the aims of this study.

2. Materials and Methods

This section emphasizes the methodologies and formulated algorithms for predicting soil quality. This study’s model was developed considering the underlying principles of agricultural science as well as data science. The study model aggregates environmental information using agricultural science, and a similar concept is used to configure the thresholds of pH and NPK values.

2.1 Dataset Collection

Prior to discussing the data collection, it is essential to understand the complete structure of the proposed implementation. Figure 1 highlights the structure of the proposed implementation scheme, which exhibits the area from which soil data were collected, followed by a series of processes to analyze soil quality.

The soil in Rwanda is naturally fragile [40]. Agricultural development is governed by crop intensification programs [41]. Crops are selected for cultivation based on their potential to meet the food security demands of the region, the suitability of the climatic conditions of the ecological region, and comparative advantage. The Buberuka and Birunga regions, which include the Burera and Rubavu districts in the northern and western provinces of Rwanda, are preferred for cultivating potato [42] (Figure 2). Both districts have high rainfall, low temperatures, high altitude, and steeply sloping hills, and their soils are mainly volcanic. In contrast, the Gicumbi and Rwamagana districts are in the northeastern and eastern parts of Rwanda, with altitudes ranging from medium to high, generally drier conditions than Burera and Rubavu, and nonvolcanic soils.

For this investigation, soil samples were collected from the Gicumbi and Rwamagana districts during two seasons, September 2017 to February 2018 and March 2018 to August 2018. Soil sampling was performed at a depth of 30 cm, and samples were collected from 16 plots. Soil samples were also collected from the Rubavu and Burera districts during the rainy season of 2017, from 12 randomly selected plots at a depth of 0–30 cm. The study was conducted in a controlled research environment following the standard procedure for soil analysis in the laboratory. The final dataset consisted of 6,051 soil samples from four locations: Gicumbi, Rwamagana, Rubavu, and Burera. The five soil variables used to build our models (K, P, N, OC, and pH) are shown in Table 1. The dataset was split for training and testing, with 80% for training and 20% for testing.
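The 80/20 split can be sketched as follows. This is a minimal illustration that assumes a simple shuffled hold-out split; the source does not state how the split was drawn, so the function name `train_test_split`, the fixed seed, and the placeholder records are hypothetical.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and return (train, test) lists."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(rows)          # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder records standing in for labeled soil samples (not real data).
samples = [{"id": i} for i in range(100)]
train, test = train_test_split(samples)
print(len(train), len(test))
```

A fixed seed keeps the split reproducible, so that different classifiers can be compared on the same held-out samples.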

The data in Table 1 were gathered following the standard procedures of soil analysis in the soil laboratory of the University of Rwanda, College of Agriculture, Animal Sciences, and Veterinary Medicine, Busogo Campus.

2.2 Soil Sampling and Analysis using Fuzzy Logic

The proposed scheme was analyzed using both fuzzy logic and a machine learning approach. Fuzzy logic is applied to problems characterized by partial or vague input information, and it offers flexible reasoning in the presence of uncertainty. The proposed scheme implements fuzzy logic because it is robust to its inputs: it does not require them to be noise-free or fixed. Furthermore, it allows user-defined rules to be constructed, which makes the scheme practical to implement. The fuzzy controller can also be tuned to optimize system performance. The fuzzy logic mechanism can generate a user-friendly outcome while processing a reasonable number of inputs. Finally, fuzzy logic avoids dependence on complex mathematical formulations because it can efficiently model nonlinear systems.

Figure 3 shows the architecture of the fuzzy logic type-2 system used in the proposed scheme, which predicts soil quality while accounting for the uncertain and imprecise information associated with the condition of the soil. The first part of the implementation involves defining the crisp key inputs that affect soil quality. Many such inputs are possible, including compaction, nutrient levels, moisture content, and organic matter content. The inputs considered here were taken from four agricultural sites in Rwanda and comprise the pH value, OC, and the proportions of N, P, and K fertilizer. The next processing step, shown in Figure 3, is the construction of a rule base that represents the connection between the input arguments and the soil quality outcome. The study uses “If-Then” rules to construct the fuzzy set, as exhibited in Lines 1–4 of the proposed algorithm. The next step is to develop a fuzzy inference system that applies the rule base to the input arguments; the objective is to assess the degree of influence of each rule on the input variables to yield an outcome. The final operation is defuzzification preceded by a type-reduction operation, which extends the defuzzification process of legacy type-1 fuzzy logic. The architecture shown in Figure 3 can handle the uncertainty and possible imprecision associated with the soil data.

The proposed scheme to develop soil suitability labels uses fuzzy logic type-2, as shown in Figure 3, where the proposed scheme considers the soil dataset sample in the form of initial data, followed by the activation of the inference engine and mapping performed through rule construction. The outcome is processed by a type reducer, which is then subjected to defuzzification to obtain the final outcome of the type-reduction score and crisp output.

  • The first module, the fuzzifier, maps the crisp input to a fuzzy set; the mapping depends on the category of fuzzifier deployed.

  • The second module, the rule base, constructs rules from the N, P, and K attributes of soil quality; the outcome of each rule states the matching predicted soil quality.

  • The third module, the fuzzy inference engine, maps inputs to outputs using all the rules stated in the fuzzy rule base.

  • The fourth module, the type reducer, extends defuzzification: it transforms the type-2 fuzzy set into a type-1 fuzzy set.

  • The fifth and final module, the defuzzifier, generates a quantified numerical outcome from the associated degree of membership, the crisp logic, and the adopted fuzzy set. This final block maps the fuzzy set onto a crisp value, which is essential in a fuzzy control system.
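The fuzzifier, type-reduction, and defuzzifier modules can be illustrated with a small sketch. The source does not specify the membership functions or the type-reduction method, so everything here is an assumption: triangular membership functions, an interval type-2 set built by widening and narrowing a triangle by a hypothetical `spread`, and the Nie-Tan operator (averaging the lower and upper grades) as a simple stand-in for type reduction, followed by a centroid.

```python
def triangle(x, a, b, c):
    """Type-1 triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def interval_type2(x, a, b, c, spread):
    """Return (lower, upper) membership grades of an interval type-2 set.

    The upper function widens the triangle by `spread` and the lower one
    narrows it, so the gap between them models membership uncertainty.
    Requires spread < b - a and spread < c - b.
    """
    upper = triangle(x, a - spread, b, c + spread)
    lower = triangle(x, a + spread, b, c - spread)
    return lower, upper


def nie_tan_defuzzify(xs, grades):
    """Average lower/upper grades (type reduction), then take the centroid."""
    weights = [(lo + up) / 2 for lo, up in grades]
    total = sum(weights)
    if total == 0:
        raise ValueError("fuzzy set has no support on the given points")
    return sum(x * w for x, w in zip(xs, weights)) / total


# Hypothetical fuzzy set over water pH, sampled from 4.0 to 7.0.
xs = [4.0 + 0.1 * i for i in range(31)]
grades = [interval_type2(x, 4.5, 5.5, 6.5, spread=0.3) for x in xs]
crisp = nie_tan_defuzzify(xs, grades)
print(round(crisp, 3))
```

Because the example set is symmetric, the crisp output falls at its peak; with asymmetric sets or several fired rules the centroid shifts accordingly.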

The significant benefit of adopting this fuzzy logic technique is that it facilitates modeling of varied levels of uncertainties in predicting soil quality, which cannot be carried out by type-1 fuzzy logic. The algorithmic steps are given in Algorithm 1.

Algorithm 1 is used for labeling soil information using fuzzy logic. It takes the soil data as input and yields a soil quality label as output. The algorithm adds four columns to the soil data to hold the water pH quality, N quality, P quality, and K quality (Line 1). The next step is to assign quality values (Line 2). The algorithm then checks whether the water pH quality is 1: if the N quality is also 1, the overall quality is 1; otherwise, the overall quality is the smallest of the P and K quality scores (Line 3). A similar check is performed for a water pH quality of 2 (Line 4); in all other cases, the overall quality is set equal to the water pH quality (Line 5). Together, these steps yield an inference regarding the quality of the soil. The processing outcome showed that the majority of soil samples from the observation region were marginally suitable for potato cultivation.

The primary reason for adopting fuzzy logic to set up this ruleset is to develop a decision control system in which soil quality can be ascertained based on rules generated by the user. The advantage of a ruleset defined in fuzzy logic is that the suitability score can be customized by farmers or agricultural experts based on the present agricultural environment of any farming region in Rwanda. The higher the soil quality, the higher the anticipated potato yield in Rwanda.

2.3 Analysis Using Machine Learning

The applicability of fuzzy logic was restricted to a few soil samples, and quality predictions were found to be reasonable by soil experts. The soil experts involved in the proposed study were local skilled agriculturists and agronomists with extensive experience in analyzing and interpreting the quality of soil. The experts were consulted regarding the anticipated outcomes of the model. According to them, the lower the error rate of the predictive model, the higher is the possibility of reliable outcomes for a higher degree of soil quality. They also provided a recommendation to assess the individual scope of NPK fertilizer, followed by soil water pH. As experts are not available to assess all the results, the proposed analysis adopted a machine learning approach to increase the analytical coverage of the study. For this purpose, an ANN was adopted to predict soil quality.

An ANN can learn complex relationships and thereby make precise decisions regarding soil quality based on different quality parameters [43]. It can also process multiple inputs in parallel and generate multiple outputs. This means that the proposed model could be applied to predict soil quality for any geographic region, not only Rwanda. Gaussian naïve Bayes is a simple and powerful algorithm for predictive modeling that assumes each input variable is independent [44]. We used Gaussian naïve Bayes to classify the soil data because the data are continuous; under the Gaussian model, only the mean and standard deviation of each variable need to be calculated from the training data. The naïve Bayes approach often yields accurate and stable models with very small sample sizes, such as our soil sample analysis dataset from Rwanda. Logistic regression can be used to analyze the relationship between multiple independent variables and a categorical dependent variable and to estimate the probability of an event [45]. Because the dependent variable in this study was not dichotomous but comprised four categories, we used multinomial logistic regression for soil quality prediction. The K-nearest neighbors (KNN) machine learning algorithm is a well-known nonparametric classification method that determines the class of a new sample based on the classes of its nearest neighbors [46]. For each modeling approach tested in this study, the assessment was based on four measures: accuracy, precision, recall, and F1 score, which were calculated using the following formulas:

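$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{1}$$

$$\text{Precision} = \frac{TP}{TP + FP}, \tag{2}$$

$$\text{Recall} = \frac{TP}{TP + FN}, \tag{3}$$

$$\text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{2\,TP}{2\,TP + FP + FN}. \tag{4}$$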

In expressions (1)–(4), TP, TN, FP, and FN represent the numbers of true positives, true negatives, false positives, and false negatives, respectively. A TP is a sample correctly predicted as having a condition (here, belonging to a given suitability class), whereas a TN is a sample correctly predicted as not having it. An FP is a sample incorrectly predicted as having the condition, whereas an FN is a sample incorrectly predicted as not having it. The next section discusses the obtained outcomes.
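As a worked illustration, the four measures can be computed directly from the confusion counts. This is a minimal sketch: the function name `classification_metrics` and the counts in the example are hypothetical and are not results from this study, and returning 0.0 for an empty denominator is a convention chosen here.

```python
def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, f1) from confusion counts."""
    accuracy = _ratio(tp + tn, tp + tn + fp + fn)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    f1 = _ratio(2 * tp, 2 * tp + fp + fn)   # same as 2PR / (P + R)
    return accuracy, precision, recall, f1


# Hypothetical counts for one suitability class treated as "positive".
accuracy, precision, recall, f1 = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

For a four-class problem such as this one, the counts are typically taken per class (one class versus the rest) and the per-class scores are then averaged.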

3. Results

This section describes the results of the fuzzy logic method for labeling data, followed by a comparison of the outcomes of different machine learning methods as predictive models. The labeling method was verified as reasonable by soil experts with knowledge of Rwandan soils. Machine learning methods were compared using performance metrics.

3.1 Outcomes of Fuzzy Logic Approach

The outcome of the fuzzy logic was analyzed using kernel density estimation plots. A kernel density estimate is a smoothed variation of the histogram that applies kernel smoothing to estimate the probability density function of the plotted values. Density plots were produced for OC, for N, P, and K individually, and for water pH. The X-axes represent the data values, and the Y-axes represent the estimated probability density. A higher peak in a plot marks the range of values in which the most data points are concentrated.
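A kernel density estimate of this kind can be sketched in a few lines. The example assumes a Gaussian kernel with a hand-picked bandwidth; the source does not state which kernel or bandwidth was used, and the N values below are invented for illustration.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    if not data or bandwidth <= 0:
        raise ValueError("need at least one data point and a positive bandwidth")
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of one Gaussian bump per data point, centered on that point.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


# Hypothetical N (%) values with two clusters (illustrative only).
n_percent = [0.07, 0.08, 0.08, 0.09, 0.19, 0.20, 0.21]
density = gaussian_kde(n_percent, bandwidth=0.02)

for x in (0.08, 0.14, 0.20):
    print(f"density at N={x:.2f}: {density(x):.2f}")
```

The bandwidth controls the smoothing: a smaller value follows individual data points more closely, whereas a larger value merges nearby peaks.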

As shown in Figure 4, the OC percentage was satisfactory throughout the region, and this study evaluated the impact of the individual N, P, and K densities on the OC proportion (g g⁻¹). Because OC was satisfactory everywhere, its variation did not influence the suitability score, and these data could be ignored. In Figure 5, the density plot of N shows two groups of values in the soil: most values were near 0.08, whereas a second peak was present at 0.2; hence, N became a deciding factor for overall soil quality. In Figure 6, the P percentage is higher in a particular region but lower in others; in the case of a lower N percentage, the P percentage is the deciding factor, along with the K percentage. Negative values of P represent outliers.

Figures 7 and 8 show the density analysis for the proportion of K and water pH. The results show that the density of K is slightly lower, whereas the density of the water pH is quite good for the given region under observation in the proposed use case of Rwanda. The final outcome of soil quality is shown in Figure 9.

The outcomes shown in Figure 9 were produced using the amended version of fuzzy logic. The original dataset had 360 samples; a random function was then used to increase the sample size programmatically to assess the influence of different sample sizes on the predicted quality. Figure 9 was obtained by considering a first sample of 360 data points (Sample-1), a second sample of 720 data points (Sample-2), and a third sample of 1,440 data points (Sample-3). The labels follow the ruleset constructed using fuzzy logic, as described in Section 2.2, and the result was validated by a soil expert. The outcome showed that the majority of the soil samples were marginally suitable for cultivating potatoes. It should be noted that this outcome is specific to the four sites in Rwanda from which the samples were collected (represented in the graph as SC-1.0, SC-2.0, SC-3.0, and SC-4.0). According to the experts, who are also co-authors of this work, fuzzy logic can classify soil with an accuracy of 70%–80%. Most of the soil samples came from the Acrisols of Gicumbi and the Lixisols of Rwamagana, where Irish potatoes are only marginally cultivated, which is consistent with the predominance of the marginally suitable class. Therefore, fuzzy logic offers a reliable predictive performance for labeling.
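The enlargement from 360 to 720 and 1,440 data points can be sketched as resampling with replacement (bootstrapping). The exact random function used is not given in the source, so this is only one plausible reading; `bootstrap_resample`, the seeds, and the placeholder records are hypothetical.

```python
import random


def bootstrap_resample(rows, target_size, seed=0):
    """Draw `target_size` rows from `rows` uniformly, with replacement."""
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in range(target_size)]


# Placeholder identifiers standing in for the 360 original soil records.
sample_1 = list(range(360))
sample_2 = bootstrap_resample(sample_1, 720, seed=1)
sample_3 = bootstrap_resample(sample_1, 1440, seed=2)
print(len(sample_1), len(sample_2), len(sample_3))
```

Resampling with replacement adds no new measurements; it only repeats existing ones, which is why the class proportions are expected to stay similar across the three samples.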

3.2 Outcomes of Machine Learning Approach

The outcomes of the machine learning approach were assessed using performance metrics.

The proposed scheme was assessed through a comparison of the results with a set of frequently used machine learning methods, that is, ANN, Gaussian naïve Bayes, logistic regression, and KNN. The results shown in Figure 10 and Table 3 indicate that ANN offers better predictive accuracy than the other approaches. This suggests that predictive modeling of soil quality using ANNs offers higher reliability. Machine learning contributes to the development of a predictive model that can be used by farmers and agriculturists to understand the possibilities of suitable environments for potato farming. The scalability of this predictive model is that upon feeding any form of dataset, the model can perform predictions associated with soil quality. Based on the predicted outcomes, farmers can finetune the concentration of various fertilizers to ensure better crop cultivation in Rwanda. Hence, it saves time and effort in decision-making for a better yield. From the perspective of novelty, the core idea of the proposed scheme is to develop a cost-effective and simplified computational predictive model that can assess soil quality suitable for potato cultivation in Rwanda. The mechanism of adoption of the proposed model is based on data acquisition and designing a fuzzy processor; applying a machine learning approach for this purpose is a novel notion under the environmental conditions of Rwanda. The model is simple in design and implementation and offers reliable outcomes in easy steps. The basis of this model deployment is to refrain from adopting any advanced data analytical or predictive scheme using machine learning or deep learning, where accuracy is achieved at the cost of the computational burden. However, future work can be conducted to extend the proposed machine learning predictive scheme to its advanced version, ensuring a lower computational cost without affecting accuracy. The next section presents discussion of the results of the proposed study.

4. Discussion

From the highlights of the outcome obtained in the previous section, it is noted that the proposed scheme introduces a mechanism that not only can identify the optimal fertilizers required for potato cultivation but also offers a robust classification mechanism to determine the soil quality required for this purpose. There is no doubt that the proposed scheme is implemented considering the use case of potatoes in Rwanda; however, the same model can also be applied to different agricultural crops in different countries. This is possible by altering the input variables of this learning model to predict soil quality, which, at present, is considered only for four different sites in Rwanda.

To understand the contribution of the proposed method to potato production, it is necessary to understand the actual agricultural and cultivation factors required for optimal potato production. The amount of fertilizer to be provided in the soil for potato cultivation depends on the soil test data. The proposed scheme considered standard input data from four agricultural sites in Rwanda to determine the optimal distribution of nutrients. It is also known that NPK fertilizers are administered in the same proportion at the same time during planting; however, the demand for these fertilizers continues to change over a period of cultivation. Fuzzy logic type-2 was applied in the proposed scheme considering the input of soil data to predict soil quality with respect to water quality and quality with respect to individual N, P, and K fertilizers. Further accuracy of this predictive analysis was achieved by deploying machine learning schemes using an ANN. Hence, the model is capable of predicting the change in the quantity of fertilizer required to reach the optimal soil quality to ensure better production.

Although a machine learning approach has been introduced to overcome the constraints of using fuzzy logic for various types of soil, fuzzy logic should not be deemed a low-performing module. It has its own advantages that cannot be achieved using a machine learning approach. Fuzzy logic offers a better platform for introducing a user-friendly logical ruleset that is not only easy to deploy but also flexible to manage. The ruleset can be redefined based on the problem space considered in the investigation. Machine learning contributes to ensuring a higher predictive accuracy by considering more coverage of the soil area. The fuzzy logic analysis showed a higher proportion of OC density (Figure 4) and a lower proportion of N (Figure 5) and P (Figure 6), while the proportion of K (Figure 7) was found to be better than those of N and P, which are highly essential for potato cultivation. Furthermore, the predictive performance of fuzzy logic was validated by a soil expert (Figure 9). The analysis of various machine learning schemes demonstrates that ANNs offer better predictive performance than other machine learning approaches. The prime justifications for this are as follows: less effective feature management in the Gaussian naïve Bayes algorithm, shortcomings in handling linearity across different categories of variables in logistic regression, and lower applicability and scalability toward higher dimensions of data in the KNN algorithm. In contrast, an ANN is straightforward to implement with parallel processing. This not only speeds up processing but also supports the minimization of errors at each progressive epoch. Hence, the ANN demonstrated better performance in predicting soil quality in this study.

5. Conclusion

This study presents a simplified predictive modeling for forecasting the soil quality in selected areas of Rwanda, which has been less explored in existing studies. The contributions of the proposed scheme are as follows: (1) the predictive operation is performed using the fuzzy logic and machine learning approaches; both have unique benefits, unlike any form of sophisticated usage of multiple machine learning schemes found in existing studies; (2) the proposed predictive model is highly flexible in its operation and does not solely depend on the environment, which means that it can be applied for predicting soil quality for other crops or other regions, irrespective of any size or dimension of the data; (3) the outcome of the proposed scheme shows that the proposed fuzzy logic scheme offers better predictive performance; (4) the quantified outcome of the machine learning scheme shows that the proposed artificial neural network offers approximately 25% higher accuracy compared with Gaussian naïve Bayes, approximately 32% higher accuracy compared with logistic regression, and approximately 37% higher accuracy compared with the KNN algorithm; and (5) the overall quantified outcome shows that the proposed model is able to show an accuracy of 89%, whereas the implementation of the fuzzy logic algorithm proposed in this system is able to classify the soil with an accuracy of 70% to 80% with shallow learning algorithms.

The numerical outcome of the proposed scheme offers significant guidelines for potato cultivation in Rwanda. Based on this outcome, an agronomist will need to identify the geographical features of Rwanda, followed by the development of a smart ruleset using fuzzy logic. This could facilitate better proportion identification of NPK fertilizers for improving soil quality. Hence, without much demand for re-engineering, this model can be utilized for any geographical location to exhibit better predictive performance than the frequently exercised schemes. Future work will focus on using soil sensors for soil analysis to overcome the challenges of insufficient soil datasets and soil experts to address geographical differences. Future work will also be conducted in the direction of soil map generation to further assess soil quality in the presence of various challenging agricultural conditions.

Data availability

The datasets are available from the corresponding author upon request.

Figure 1. Structure of the proposed scheme of implementation.

The International Journal of Fuzzy Logic and Intelligent Systems 2023; 23: 214-228. https://doi.org/10.5391/IJFIS.2023.23.2.214

Figure 2. Study area from which the soil data were derived, i.e., Rubavu, Burera, Gicumbi, and Rwamagana.

Figure 3. Architecture of fuzzy logic type 2.

Figure 4. Histogram of OC percent.

Figure 5. N percentage histogram.

Figure 6. P percentage histogram.

Figure 7. K percentage histogram.

Figure 8. Water pH histogram.

Figure 9. Results of the fuzzy logic labeling for the three different samples. The classes are ordered by prevalence.

Figure 10. Comparison of all algorithms.

Algorithm 1. Algorithm for labeling using fuzzy logic type-2.

Input: Soil Data (S)
Output: Soil Quality

1. Add 4 new columns in S
   - c1 for pH quality
   - c2 for N quality
   - c3 for P quality
   - c4 for K quality
2. Assign quality values according to the study conducted
3. If pH quality is 1
   - if N quality is 1 then quality is 1
   - else quality is smallest among P and K quality
4. Else if pH quality is 2
   - if N quality is 1 then quality is 1
   - else quality is 2
5. Else quality is equal to pH quality
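Steps 3–5 of Algorithm 1 can be transcribed into a small function. This is a literal sketch under stated assumptions: the quality values are integer codes assigned beforehand (the source does not say which code denotes which class), step 5 is treated as the fallback branch, and the name `label_quality` is hypothetical.

```python
def label_quality(ph_q, n_q, p_q, k_q):
    """Combine per-property quality codes into one soil quality code.

    Follows steps 3-5 of Algorithm 1 literally; the meaning of each
    integer code is whatever was assigned in step 2.
    """
    if ph_q == 1:                       # step 3
        return 1 if n_q == 1 else min(p_q, k_q)
    if ph_q == 2:                       # step 4
        return 1 if n_q == 1 else 2
    return ph_q                         # step 5: fall back to the pH quality


# Illustrative codes only (not taken from the study's data).
examples = [(1, 1, 3, 4), (1, 2, 3, 4), (2, 1, 3, 3), (2, 3, 1, 1), (4, 1, 1, 1)]
for codes in examples:
    print(codes, "->", label_quality(*codes))
```

Note that the pH quality gates every branch, so N, P, and K only influence the label when the pH quality is 1 or 2.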

Table 1. Dataset sample presentation of values obtained from soil analysis.

pHOC (%)N (%)P (ppm)K (ppm)
52.880.0918.880.3
4.892.930.0729.373.5
4.942.910.078.3745.1
5.32.780.1018.888
5.02.710.0713.651
4.942.650.0778.3787

Table 2. Classification of selected soil properties values for potato.

Property | Suitable | Moderately suitable | Marginally suitable | Not suitable
K (ppm) | >55 | 35–55 | 15–35 | <15
P (ppm) | >10 | 6.5–10 | 2.5–6.5 | <2.5
N (%) | >0.30 | 0.225–0.30 | 0.125–0.225 | <0.125
OC (%) | >0.7 | 0.5–0.7 | 0.3–0.5 | <0.3
pH | 5.5–7 | 5–5.5 or 7–7.5 | 4–5 or 7.5–8 | <4 or >8
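The thresholds in Table 2 can be applied to a single measurement as in the following sketch. Table 2 does not say which class a value exactly on a boundary belongs to, so the inclusive and exclusive comparisons below are assumptions, as are the names `CUTOFFS`, `classify_nutrient`, and `classify_ph` and the sample values.

```python
LABELS = ("Suitable", "Moderately suitable", "Marginally suitable", "Not suitable")

# Per property: (suitable above, moderately suitable from, marginally suitable from).
CUTOFFS = {
    "K": (55, 35, 15),            # ppm
    "P": (10, 6.5, 2.5),          # ppm
    "N": (0.30, 0.225, 0.125),    # %
    "OC": (0.7, 0.5, 0.3),        # %
}


def classify_nutrient(name, value):
    """Map a K, P, N, or OC value to a Table 2 suitability class."""
    high, mid, low = CUTOFFS[name]
    if value > high:
        return LABELS[0]
    if value >= mid:              # boundary handling is an assumption
        return LABELS[1]
    if value >= low:
        return LABELS[2]
    return LABELS[3]


def classify_ph(ph):
    """Map water pH to a class; pH has an optimum band with limits on both sides."""
    if 5.5 <= ph <= 7:
        return LABELS[0]
    if 5 <= ph < 5.5 or 7 < ph <= 7.5:
        return LABELS[1]
    if 4 <= ph < 5 or 7.5 < ph <= 8:
        return LABELS[2]
    return LABELS[3]


# Illustrative sample (values invented for the example).
sample = {"pH": 4.9, "OC": 2.9, "N": 0.07, "P": 8.4, "K": 45.1}
for prop, value in sample.items():
    label = classify_ph(value) if prop == "pH" else classify_nutrient(prop, value)
    print(f"{prop}: {value} -> {label}")
```

pH is handled separately because, unlike the nutrients, it becomes less suitable in both directions away from the 5.5–7 band.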

Table 3. Numerical outcome of accuracy-based comparative analysis.

ML approach | Precision | Recall | F1-score
Gaussian NB | 0.82 | 0.83 | 0.84
ANN | 0.87 | 0.85 | 0.86
Logistic regression | 0.79 | 0.75 | 0.77
KNN | 0.75 | 0.72 | 0.73

