
## Original Article


International Journal of Fuzzy Logic and Intelligent Systems 2021; 21(4): 338-348

Published online December 25, 2021

https://doi.org/10.5391/IJFIS.2021.21.4.338

© The Korean Institute of Intelligent Systems

## Dynamic Type-2 Fuzzy Time Warping (DT2FTW): A Hybrid Model for Uncertain Time-Series Prediction

Aref Safari¹, Rahil Hosseini¹, and Mahdi Mazinani²

¹Department of Computer Engineering, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran
²Department of Electronic Engineering, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran

Correspondence to :
Rahil Hosseini (rahil.hosseini@qodsiau.ac.ir)

Received: April 4, 2020; Revised: June 22, 2021; Accepted: August 31, 2021

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Time-series prediction involves the analysis of nondeterministic patterns under uncertain conditions, so real-world applications need high-quality prediction methods. Type-2 fuzzy systems can handle high-order uncertainties, such as the sequential dependencies associated with time series. Precise and reliable prediction helps specialists develop reasonable strategies and plan policies for modeling events in uncertain time series. In this study, a hybrid model, dynamic type-2 fuzzy time warping (DT2FTW), was proposed for handling high-order uncertainties in time-series prediction. A type-2 fuzzy intelligent system was developed alongside a dynamic time warping algorithm to measure pattern similarity in long time series. The results demonstrate that the proposed DT2FTW model yields more reliable predictions on standard benchmarks such as the Mackey-Glass, Dow Jones, and NASDAQ time series. The results also confirm that the proposed DT2FTW model has lower error rates than its counterpart algorithms in terms of the root mean square error (RMSE), mean absolute error (MAE), and mean percentage error (MPE). In addition, the proposed model achieved an average area under the ROC curve (AUC) of 94%, with a 95% confidence interval of 92%-95%.

Keywords: Dynamic time warping, Interval type-2 fuzzy system, Time-series prediction

Modeling of time-series is important because many pattern-analysis problems contain the time component. These problems are typically not addressed, because the time dependence makes time-series-related problems difficult to handle. One of the most challenging issues associated with time-series is their prediction. Prediction problems are often classified into short-term, medium-term, and long-term, and are characterized by different orders of uncertainty. The uncertainty associated with time-series data is implicit, with a nonlinear pattern. On the other hand, unreliable accuracy is a major issue in time-series predictions. Different time-series models, such as fuzzy time-series, have been considered for improving the prediction accuracy.

Many intelligent models have been used for analyzing time-series patterns. Recently, the use of soft computing approaches, such as fuzzy logic, neural networks, simulated annealing, and genetic algorithms, has been reported in the literature on the time-series prediction. These approaches have been considered advantageous compared with traditional methods, because they can address nonlinearities and can approximate many types of complex dynamical systems better than linear statistical models. Fuzzy logic models have been widely adopted because of the prevalent uncertainty in the time-series data. Most related studies use the Euclidean distance metric for measuring time-series intervals. The Euclidean distance metric is widely known to be very sensitive to distortions along the time axis, but it has been very popular with many researchers. The ubiquity of the Euclidean distance metric in the face of increasing evidence of its poor accuracy for time-series prediction is almost certainly owing to its ease of implementation and its time and space efficiency. However, in our work, the problem of distortion along the time axis is addressed by dynamic time warping (DTW) alongside a type-2 fuzzy logic approach, for modeling high-order uncertainties in time-series prediction based on the footprint of uncertainty and unequal lengths of time intervals.

### 1.1 Literature Review

Analysis and prediction of time-series belong to the field of temporal pattern recognition. A time-series corresponds to a stretch of values on the same scale, indexed by a naturally occurring time, as encountered in many applications in the engineering, ecology, economy, medicine, and finance fields [1]. Five concepts are important for time-series: the starting time, pattern similarity, period range, confidence, and the endpoint time. Many techniques [2–8] have been applied for time-series prediction based on the period and data type. The DTW model has been effectively used to automatically deal with time deformations in different time-series ranges with time-dependent data, for pattern recognition and similarity analysis. The DTW approach is currently used in many areas, including online signature matching and handwriting recognition [9], gestures and sign language recognition [10], knowledge discovery, data mining and clustering [11], pattern recognition and data analysis [12], and signal processing [13,14]. In addition, in many real-world applications temporal problems are complex, uncertain, and chaotic [15]. Fuzzy logic is one of the most effective methods for handling uncertainties in dynamic and non-stationary environments [16]. Table 1 lists various hybrid fuzzy models that have been used for predicting time-series [17–22]. Fuzzy systems, especially hybrid fuzzy models, have been very promising for solving complex problems, where a model estimates and predicts the similarity between two time-series in uncertain conditions [18,23–27].

### 1.2 Highlights

The main objective of this study was to introduce an intelligent dynamic type-2 fuzzy time warping (DT2FTW) model for predicting long-term temporal data in realistic time-series. The proposed model aims to overcome the drawbacks of the existing methods and offer a more robust, reliable, and accurate model for predicting long time-series using type-2 fuzzy logic. The model is split into two parts: 1) high-order type-2 fuzzy logic-based time-series prediction and 2) DTW. The type-2 time-series prediction consists of several steps, and we applied the Karnik-Mendel (KM) algorithm for defuzzification. Complex min-max composition operators were applied to all predictions. The prediction performance was then evaluated using the root mean square error (RMSE), mean absolute error (MAE), and mean percentage error (MPE) metrics, along with a statistical evaluation using the left-tailed t-test.

The remainder of this paper is organized as follows: Section 2 presents the theoretical research background and materials of the proposed model. Section 3 describes the detailed structure of the proposed model. Performance evaluation and experimental results are presented in Section 4, and the paper is concluded in Section 5.

### 2. Theoretical Background

This section presents a brief overview of DTW, followed by an overview of the concept of the interval type-2 fuzzy set (IT2FS). Finally, relevant mathematical expressions are provided.

### 2.1 DTW

To obtain a reliable time-series model for real-time applications, DTW and event-DTW (E-DTW) models were applied in [28] to find the optimal distance measure between two related time-series. Because time is aggregated to represent a reasonable estimate, patterns may match a wide variety of actual time-series. Specifically, the pattern-detection task involves searching a time-series $S$ for a pattern $P$:

$$S = S_1, S_2, \ldots, S_i, \ldots, S_n, \qquad P = P_1, P_2, \ldots, P_j, \ldots, P_m.$$

Sequences S and P can be arranged to form an m × n grid, where each point (i, j) in the grid corresponds to an alignment between elements Si and Pj. A warping path W maps the elements of S onto those of P; the cumulative "distance" along this path must be minimized to yield lower uncertainty.

$$W = w_1, w_2, \ldots, w_k,$$

where $W$ is a sequence of grid points and each $w_k$ corresponds to a point $(i, j)_k$. To formulate the DTW algorithm for a time-series problem, we need a distance measure between two elements. The distance function $\delta$ is either the magnitude of the difference or the square of the difference between two data points, as follows:

$$\delta(i,j) = |S_i - P_j| \quad \text{or} \quad \delta(i,j) = (S_i - P_j)^2.$$

The cumulative distance for each path is as follows:

$$\mathrm{DTW}(S,P) = \min_{W}\left[\sum_{k=1}^{t} \delta(w_k)\right],$$

where $\delta$ is the distance measure between two time-series elements. In addition, an arbitrary warping path $W = \{w_1, w_2, \ldots, w_k\}$ is subject to three constraints, commonly stated as the boundary, monotonicity, and continuity conditions. Each element $w_k \in W$ represents a mapping between $S_i$ and $P_j$, that is, $w_k \in (S, P)$. Together, the three constraints allow element $w_{k+1}$ to appear only at one of the three positions adjacent to $w_k$.
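The recursion implied by the cumulative distance and the three adjacent positions can be sketched with dynamic programming. This is a minimal illustration rather than the authors' implementation: the function name `dtw`, the use of the squared difference as δ, and the tie-breaking rule in the backtracking step are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def dtw(s: Sequence[float], p: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the minimal cumulative squared distance and one optimal warping path.

    Both sequences are assumed to be non-empty.
    """
    n, m = len(s), len(p)
    inf = float("inf")
    # acc[i][j] holds the cost of the best path aligning s[:i] with p[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            delta = (s[i - 1] - p[j - 1]) ** 2
            # Only the three adjacent predecessors are allowed.
            acc[i][j] = delta + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])

    # Backtrack from the last cell to the first to recover the path.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return acc[n][m], path


distance, path = dtw([1, 2, 3], [1, 2, 2, 3])
print(distance, path)
```

In this usage example the second sequence repeats one value, so the path visits the same element of the first sequence twice and the cumulative distance is zero, whereas a point-by-point Euclidean comparison is not even defined for sequences of unequal length.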

### 2.2 An Overview of Interval Type-2 Fuzzy Sets

This section provides definitions of type-2 fuzzy sets (T2FSs) and related essential concepts. The membership grade assigned by the membership function (MF) of a T2FS to a given element is itself a type-1 fuzzy set (T1FS). A T2FS, denoted $\tilde{A}$, is characterized by a type-2 MF $\mu_{\tilde{A}}(x,u)$, where $x \in X$ and $u \in J_x \subseteq [0, 1]$, that is [29]:

$$\tilde{A} = \{((x,u), \mu_{\tilde{A}}(x,u)) \mid \forall x \in X,\ \forall u \in J_x \subseteq [0,1]\},$$

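where $0 \le \mu_{\tilde{A}}(x,u) \le 1$, $X$ is the domain of the fuzzy set, and $J_x$ is the domain of the secondary MF at $x$. $\tilde{A}$ can also be expressed as [29]:

$$\tilde{A} = \int_{x \in X}\int_{u \in J_x} \mu_{\tilde{A}}(x,u)/(x,u), \qquad J_x \subseteq [0,1],$$

where $\int\!\!\int$ represents a union over all admissible $x$ and $u$. When all secondary grades equal 1, $\tilde{A}$ is an interval type-2 fuzzy set:

$$\tilde{A} = \int_{x \in X}\int_{u \in J_x} 1/(x,u) = \int_{x \in X}\left[\int_{u \in J_x} 1/u\right]\Big/ x,$$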

where $x$ is the primary variable, $J_x \subseteq [0, 1]$ is the primary membership of $x$, $u$ is the secondary variable, and $\int_{u \in J_x} 1/u$ is the secondary MF at $x$. The uncertainty about $\tilde{A}$ is captured by the union of all the primary memberships, called the footprint of uncertainty (FOU) of $\tilde{A}$, denoted $\mathrm{FOU}(\tilde{A})$ and given as [30]:

$$\mathrm{FOU}(\tilde{A}) = \bigcup_{x \in X} J_x.$$

The FOU for a Gaussian primary MF with an uncertain standard deviation is shown in Figure 3. The FOU is bounded by an upper membership function (UMF) $\overline{\mu}_{\tilde{A}}(x)$ and a lower membership function (LMF) $\underline{\mu}_{\tilde{A}}(x)$, which are T1FSs; consequently, the membership grade of each element of an IT2FS is given by the interval $[\underline{\mu}_{\tilde{A}}(x), \overline{\mu}_{\tilde{A}}(x)]$. Compared with T1FSs, the UMF and LMF of an IT2FS represent the uncertainties of the input variables. In addition, the FOU in the IT2FSs provides more degrees of freedom [29].
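As a small illustration of the FOU bounds, the sketch below evaluates the lower and upper membership grades of a Gaussian primary MF whose standard deviation is only known to lie in an interval. The function name `it2_gaussian_grade` and the parameters `sigma_lo` and `sigma_hi` are hypothetical; the construction (narrowest Gaussian as LMF, widest as UMF) is the usual one for a fixed mean and uncertain standard deviation and is assumed here rather than taken from the paper.

```python
import math
from typing import Tuple


def it2_gaussian_grade(x: float, mean: float, sigma_lo: float, sigma_hi: float) -> Tuple[float, float]:
    """Return (LMF, UMF) grades at x for a Gaussian MF with sigma in [sigma_lo, sigma_hi]."""
    if not 0.0 < sigma_lo <= sigma_hi:
        raise ValueError("require 0 < sigma_lo <= sigma_hi")
    squared_dev = (x - mean) ** 2
    # The narrowest Gaussian gives the lower bound, the widest the upper bound.
    lower = math.exp(-squared_dev / (2.0 * sigma_lo ** 2))
    upper = math.exp(-squared_dev / (2.0 * sigma_hi ** 2))
    return lower, upper


# The interval [lower, upper] is the membership grade of x in the IT2FS.
for x in (0.0, 0.5, 1.0, 2.0):
    print(x, it2_gaussian_grade(x, mean=0.0, sigma_lo=0.5, sigma_hi=1.0))
```

At the mean both bounds equal 1, and the interval widens as `x` moves away from it, which is the extra degree of freedom the FOU provides.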

### 3. The Proposed DT2FTW Model

The block diagram of the proposed DT2FTW model is shown in Figure 1. The fuzzifier, which defines the membership grade of the input, can be singleton or non-singleton, according to the number of non-zero MF values; a singleton fuzzifier was implemented in this study. The TSK fuzzy rule type was adopted because it was considered more precise than the Mamdani rules. The output processing, comprising the type-reducer and defuzzifier, generates a crisp output; the KM algorithm [30] was applied for this step. The steps of the DT2FTW warping path prediction model are described below.

Step 1 (Fuzzification): Fuzzification is the first step in the proposed DT2FTW model. According to the number of non-zero MF values and inputs, the fuzzifier of the proposed architecture can be singleton or non-singleton. The TSK fuzzy rule type was considered in this study, with the product and minimum t-norms as acceptable inference methods for computing the firing strength of multiple antecedents. The membership degrees of the input data (measurements) in the IT2FSs were then defined. A Gaussian MF was applied to compute the fuzzy memberships from the normalized DT2FTW distances, under the assumption that the time steps are normally distributed; it was chosen because many real-world variables follow an approximately normal distribution, with a gradual change around a mean value. The Gaussian MF is expressed as follows:

$$\hat{\mu}_i = \exp\left(-\frac{(x_i - \bar{x}_c)^2}{2\sigma_c^2}\right),$$

where $\hat{\mu}_i$ is the membership grade given by the Gaussian MF, $x_i$ is the normalized DT2FTW distance of the class query sequence, and $\bar{x}_c$ and $\sigma_c$ are the mean and standard deviation, respectively, of the normalized DT2FTW distances. Eq. (15) was used for normalizing the fuzzy memberships:

$$\mu_i = \frac{\hat{\mu}_i}{\sum_i \hat{\mu}_i},$$

where μi is the normalized fuzzy membership degree, and μ̂i is the original fuzzy membership.
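The two fuzzification formulas (the Gaussian MF followed by normalization) can be sketched as follows. The function name `normalized_gaussian_memberships` is hypothetical, and the use of the population standard deviation for σc is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics
from typing import List, Sequence


def normalized_gaussian_memberships(distances: Sequence[float]) -> List[float]:
    """Map normalized distances to Gaussian membership grades that sum to 1."""
    center = statistics.fmean(distances)
    sigma = statistics.pstdev(distances)  # assumption: population standard deviation
    if sigma == 0.0:
        raise ValueError("all distances are identical; sigma is zero")
    # Raw Gaussian grades around the mean distance.
    raw = [math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2)) for x in distances]
    total = sum(raw)
    # Normalization step: divide each grade by the sum of all grades.
    return [grade / total for grade in raw]


print(normalized_gaussian_memberships([0.1, 0.5, 0.9]))
```

Distances close to the mean receive the largest normalized grade, and distances equally far from the mean on either side receive equal grades.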

Step 2 (Inference): To design the inference systems of the DT2FTW model, we applied a TSK type with three inputs and one output; therefore, each input-output variable corresponds to Gaussian MFs and has 2n if-then rules. In this study, 2n optimal fuzzy rules were applied to the experiments, for improving performance and for minimizing the prediction error rates in different scenarios. As mentioned above, the main goal of DTW is to equate two time-dependent sequences: S = (S1, S2, ..., Si, ..., SN).

The mentioned sequences may be discrete time-series or feature sequences sampled at intermediate points under chaotic and non-stationary conditions. Because a predicted value of $S$ relies on its past values $S(t_k)$, there are $s$ antecedents in every rule. According to Eqs. (1)–(3), $X(t)$ is the observed series at time $t$, $T(t)$ represents the trend factor, $S(t)$ represents the seasonal factor, and $I(t)$ denotes the irregular (random) factor, a white-noise component with zero mean and constant variance. Hence, an uncertain time-series condition is used for regulating the alignment between the $S$ and $P$ time patterns in an additive form of the time-series. This evolution can be defined by Eq. (16):

$$X(t+1) = F(X(t)),$$

where $X(t+1) \in \mathbb{R}^d$ is the state of the system at time step $t+1$, and $F$ is a nonlinear vector-valued function of the additive form of different types of time-series. The observations $x(t)$, $t = 1, \ldots, N$, form a time-ordered set. Each observation is related to the state $X(t)$ of the underlying system as follows:

$$x(t) = h(X(t)) + \varepsilon(t),$$

where $h$ is a nonlinear scalar-valued function, and $\varepsilon$ is a random variable that accounts for the uncertainties and the presence of noise. It is usually assumed that $\varepsilon(t)$ is drawn from a Gaussian distribution. The values of $h(X(t))$ define the time observations in the mid- and long-term, whereas the magnitude of $\varepsilon$ defines the random factor.

Step 3 (Finding similarity in sliding windows): By implementing the sliding windows, each data center was trained using observations from 0 to K – 1. The fuzzy inference system was trained with different significance levels to forecast the upper and lower bounds of the prediction intervals of S and P. In addition, to estimate the optimal values of S and P, we applied an accumulated cost matrix algorithm as C(s,p) function, where C(s,p) is the minimal interval of the distance measure between S and P. To compare two time-series, we need a local cost measure (LCM) and a local distance measure (LDM), as the functions

$$C: m \times n \to \mathbb{R}_{\ge 0}.$$

To compute the LCM for each interval of the sequences S and P, the cost matrix can be defined as $C \in \mathbb{R}^{n \times m}$, where $C(n,m) = C(S_n, P_m)$.

Step 4 (Best warping path): To find the best warping path between S and P, we need to find the path through the grid, as follows:

$$W = w_1, w_2, \ldots, w_k, \qquad w_t = (i_t, j_t),$$

where $w_t$ is the point of the warping path at step $t$, and the total distance over the pairs $(i_t, j_t)$ has to be minimized. To evaluate the time-normalized distance between S and P, we propose the following equation:

$$D(S,P) = \left[\frac{\sum_{t=1}^{k} d(p_t)\, w_t}{\sum_{t=1}^{k} w_t}\right],$$

where $d(p_t)$ is the distance between $i_t$ and $j_t$, and $w_t > 0$ is the weighting factor. To calculate the best alignment path between S and P, we used

$$w_0 = \arg\min_{W} \big(D(S,P)\big).$$

Moreover, at that point, the reduction was calculated using the boundary conditions of the warping window, as follows:

$$|i_t - j_t| \le r,$$

where r > 0 is the window length. At the end of this step, a weighting coefficient must be chosen. The time-normalized distance between S and P is

$$D(S,P) = \min_{W}\left[\frac{\sum_{t=1}^{k} d(p_t)\, w_t}{\sum_{t=1}^{k} w_t}\right].$$
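A window-constrained, time-normalized distance of this kind can be sketched as below. The weighting coefficients are not specified in the text, so this example assumes the symmetric weights of Sakoe and Chiba [8] (weight 2 for a diagonal step, 1 for a horizontal or vertical step). With that choice the sum of the weights equals n + m for every path, so the minimization can be done on the numerator alone. The function name `windowed_normalized_dtw` and the absolute difference as local distance are also assumptions.

```python
from typing import Sequence


def windowed_normalized_dtw(s: Sequence[float], p: Sequence[float], r: int) -> float:
    """Time-normalized DTW distance restricted to the band |i - j| <= r.

    Returns infinity when no path fits inside the band (for example when
    the lengths differ by more than r).
    """
    n, m = len(s), len(p)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells inside the warping window are evaluated.
        for j in range(max(1, i - r), min(m, i + r) + 1):
            d = abs(s[i - 1] - p[j - 1])
            acc[i][j] = min(
                acc[i - 1][j - 1] + 2.0 * d,  # diagonal step, weight 2
                acc[i - 1][j] + d,            # vertical step, weight 1
                acc[i][j - 1] + d,            # horizontal step, weight 1
            )
    # With symmetric weights the normalizer is n + m for every path.
    return acc[n][m] / (n + m)


print(windowed_normalized_dtw([0, 0, 0], [1, 1, 1], r=1))
```

Because the normalizer does not depend on the path, the dynamic program still finds the exact minimum of the ratio; with path-dependent weights this shortcut would no longer be valid.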

Step 5 (Measuring uncertainty): Uncertainty in a model affects the model’s confidence and accuracy. This paper presents a model for managing the uncertainty associated with time-series prediction by considering a distribution over data points. This distribution depends on the data points as follows: D = {X, Y }, where D is the distribution, and X, Y are the data points of the sliding windows, as follows: X = {x1, x2, ..., xn}, and Y = {y1, y2, ..., yn}. Therefore, the weight distribution after predicting the time-series can be expressed as follows:

$$p(\omega \mid X, Y),$$

where $\omega$ is the weight of the data points. To approximate this distribution, the Bernoulli rate equation is used, as follows:

$$p(\omega \mid X, Y) \approx \mathrm{Bern}(\omega; \alpha),$$

where $\alpha$ is the Bernoulli rate on the weights. Hence, the model uncertainty is the variance of $T$ Monte Carlo samples, as follows:

$$\mathrm{Var}(y) = \frac{1}{T}\sum_{t=1}^{T}(y_t - \bar{y})^2,$$

where $\{y_t\}_{t=1}^{T}$ is a set of $T$ outputs of the DT2FTW, with mean

$$\bar{y} = \frac{1}{T}\sum_{t} y_t.$$
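The variance estimate above can be illustrated with a toy Monte Carlo loop. Everything model-specific here is an assumption made for the example: the callable `predict` stands in for one forward pass of the model, `alpha` is treated as the probability of keeping a weight, and the function name `mc_uncertainty` is hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple


def mc_uncertainty(
    predict: Callable[[List[float]], float],
    weights: Sequence[float],
    alpha: float,
    n_samples: int,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (mean, variance) of n_samples outputs under Bernoulli weight masks."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        # Draw one Bernoulli(alpha) mask per weight (1 keeps the weight, 0 drops it).
        masked = [w if rng.random() < alpha else 0.0 for w in weights]
        outputs.append(predict(masked))
    mean = sum(outputs) / n_samples
    variance = sum((y - mean) ** 2 for y in outputs) / n_samples
    return mean, variance


# Toy predictor: the output is just the sum of the surviving weights.
print(mc_uncertainty(sum, [0.2, 0.3, 0.5], alpha=0.8, n_samples=200))
```

With `alpha = 1.0` no weight is ever dropped, so all outputs coincide and the estimated uncertainty is zero; lower values of `alpha` spread the outputs and increase the variance.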

Step 6 (Defuzzification): Defuzzification converts the elements of a fuzzy set into crisp values, which makes the DT2FTW output easier to interpret. The output processing of the proposed model used the KM algorithm [30]. Each time step (data point) must be assigned to the class with the highest membership; during defuzzification, we set up decision rules to guide this assignment of time steps to crisp sets. In the last part, a threshold value for the MFs was defuzzified and assigned to the class with the highest membership. Consequently, a group of time-lagged values in $d_E$ must have delivered adequate data to rebuild the states of a complex dynamical system. The centroid method was applied to the defuzzification process. The centroid $C_{\tilde{A}}$ of the DT2FTW model is the union of the centroids of all its embedded sets $A_e$, as follows [32–33]:

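$$c(A) = \frac{\sum_{i=1}^{N} x_i\, \mu_A(x_i)}{\sum_{i=1}^{N} \mu_A(x_i)}, \qquad C_{\tilde{A}} \equiv \bigcup_{\forall A_e} c(A_e) = [c_l(\tilde{A}), c_r(\tilde{A})],$$

where $\cup$ is the union operation, and

$$c_l(\tilde{A}) = \min_{\forall A_e} c(A_e), \qquad c_r(\tilde{A}) = \max_{\forall A_e} c(A_e),$$

where $c_l(\tilde{A})$ and $c_r(\tilde{A})$ can be expressed as follows:

$$c_l(\tilde{A}) = \frac{\sum_{i=1}^{L} x_i\, \overline{\mu}_{\tilde{A}}(x_i) + \sum_{i=L+1}^{N} x_i\, \underline{\mu}_{\tilde{A}}(x_i)}{\sum_{i=1}^{L} \overline{\mu}_{\tilde{A}}(x_i) + \sum_{i=L+1}^{N} \underline{\mu}_{\tilde{A}}(x_i)}, \qquad c_r(\tilde{A}) = \frac{\sum_{i=1}^{R} x_i\, \underline{\mu}_{\tilde{A}}(x_i) + \sum_{i=R+1}^{N} x_i\, \overline{\mu}_{\tilde{A}}(x_i)}{\sum_{i=1}^{R} \underline{\mu}_{\tilde{A}}(x_i) + \sum_{i=R+1}^{N} \overline{\mu}_{\tilde{A}}(x_i)},$$

where the switch points $x_L$ and $x_R$, as well as $c_l(\tilde{A})$ and $c_r(\tilde{A})$, are computed using the KM algorithm [33].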
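The iterative search for the switch points can be sketched as follows. This is a compact version of the standard Karnik-Mendel iteration on a discretized domain, not the authors' code: the function name `km_centroid`, the tolerance, and the iteration cap are assumptions, and the sample points `x` are assumed to be sorted in ascending order with strictly positive upper grades.

```python
from typing import List, Sequence, Tuple


def km_centroid(
    x: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
    tol: float = 1e-12,
    max_iter: int = 100,
) -> Tuple[float, float]:
    """Return (c_l, c_r), the end points of the centroid of an interval type-2 set."""

    def weighted_mean(theta: List[float]) -> float:
        total = sum(theta)
        if total == 0.0:
            raise ValueError("membership grades sum to zero")
        return sum(xi * t for xi, t in zip(x, theta)) / total

    def iterate(left: bool) -> float:
        # Start from the midpoint of the lower and upper grades.
        c = weighted_mean([(lo + up) / 2.0 for lo, up in zip(lower, upper)])
        for _ in range(max_iter):
            if left:
                # c_l: upper grades up to the switch point, lower grades after it.
                theta = [up if xi <= c else lo for xi, lo, up in zip(x, lower, upper)]
            else:
                # c_r: lower grades up to the switch point, upper grades after it.
                theta = [lo if xi <= c else up for xi, lo, up in zip(x, lower, upper)]
            c_new = weighted_mean(theta)
            if abs(c_new - c) <= tol:
                return c_new
            c = c_new
        return c

    return iterate(left=True), iterate(left=False)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
lmf = [0.1, 0.3, 0.5, 0.3, 0.1]
umf = [0.2, 0.6, 1.0, 0.6, 0.2]
print(km_centroid(xs, lmf, umf))
```

For this symmetric FOU the two end points are mirror images around x = 2; a crisp output is commonly taken as their midpoint, although the text does not state which final defuzzifier was used.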

### 4. Performance Evaluation

In this section, the evaluation of the proposed DT2FTW model is presented. First, the metrics for the performance measurements and datasets used in this study are explained. Then, the statistical results, a comparative study, and experimental results are discussed.

### 4.1 Datasets of This Study

This study used well-known time-series datasets: the Mackey-Glass, NASDAQ, and Dow Jones time-series. The selected datasets represent a wide range of uncertainties in the time-series prediction procedure. The proposed DT2FTW model was implemented to obtain a new estimator of the predictions, which also allowed us to compute the prediction uncertainties for the noisy Mackey-Glass chaotic time-series. The model was tested on time-series data simulated with the following form of the Mackey-Glass nonlinear delay differential equation:

$$\dot{x}(t) = \frac{0.2\,x(t-\tau)}{1 + x^{10}(t-\tau)} - 0.1\,x(t).$$

In addition, the NASDAQ is the leading United States electronic stock market. It lists around 3,300 companies. We applied 4,250 pairs of data points from the NASDAQ time-series corresponding to the window from 01/12/2018 to 01/12/2020; these data can be downloaded from Yahoo’s live daily data center. The first 3,025 pairs of the data points were used for training, while the remaining 1,225 pairs of the data points were used for validating the DT2FTW model. From the Dow Jones time-series, we used 1,250 pairs of data points, corresponding to the window from 01/03/2019 to 05/01/2020. These data can be downloaded from Yahoo’s live daily data center. The first 1,025 pairs of the data points were used for training, while the remaining 675 pairs of the data points were used for validating the DT2FTW model.
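Returning to the Mackey-Glass equation given earlier in this subsection, a benchmark series can be generated with a simple forward-Euler scheme. The delay `tau = 17`, the step `dt = 1.0`, and the constant initial history `x0 = 1.2` are common choices in the literature and are assumptions here, since the text does not report the values that were used; the function name `mackey_glass` is likewise hypothetical.

```python
from typing import List


def mackey_glass(n_steps: int, tau: int = 17, dt: float = 1.0, x0: float = 1.2) -> List[float]:
    """Simulate the Mackey-Glass delay equation with forward Euler steps."""
    delay = int(round(tau / dt))
    # Constant history x(t) = x0 for t <= 0 (assumption).
    history = [x0] * (delay + 1)
    series = []
    for _ in range(n_steps):
        x_t = history[-1]
        x_delayed = history[-1 - delay]
        dx = 0.2 * x_delayed / (1.0 + x_delayed ** 10) - 0.1 * x_t
        x_next = x_t + dt * dx
        history.append(x_next)
        series.append(x_next)
    return series


series = mackey_glass(500)
print(series[:5])
```

A smaller `dt` or a higher-order integrator gives a more accurate trajectory; Euler with a unit step is used only to keep the sketch short.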

### 4.2 Performance Metrics

To evaluate the prediction error, MAE, RMSE, and MPE metrics were applied to the proposed DT2FTW model, as follows:

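$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} |S_t - P_t|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} (S_t - P_t)^2}, \qquad \mathrm{MPE} = \frac{100\%}{n}\sum_{t=1}^{n} \frac{S_t - P_t}{S_t},$$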

where S denotes the real data points in the time-series, P denotes the DT2FTW predictions for the aggregation model in the final step of the prediction procedure, n is the number of data points, and t is the time variable of the time-series.
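The three error metrics can be computed directly from their usual definitions, which average over the n data points. The function names below are hypothetical, and the MPE assumes that no real data point is zero.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(s - p) for s, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((s - p) ** 2 for s, p in zip(actual, predicted)) / len(actual))


def mpe(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean percentage error in percent; signed, so errors can cancel out."""
    return 100.0 * sum((s - p) / s for s, p in zip(actual, predicted)) / len(actual)


real = [2.0, 4.0]
forecast = [1.0, 5.0]
print(mae(real, forecast), rmse(real, forecast), mpe(real, forecast))
```

Unlike the MAE and RMSE, the MPE keeps the sign of each error, so it indicates bias (systematic over- or under-prediction) rather than overall accuracy.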

### 4.3 Statistical Evaluation

A left-tailed t-test was used for assessing the proficiency and robustness of the DT2FTW method. The hypotheses were H0: μi > μj and H1: μi < μj, where μi and μj are the mean AUC values of the DT2FTW and DTW models, respectively, over 100 different runs of the cross-validation technique (Eq. 36). The t-test results in Table 2 reveal the superiority of the proposed DT2FTW model for Mackey-Glass time-series prediction.

$$\mu_i, \mu_j = \frac{1}{10}\sum_{k=1}^{10} \mathrm{AUC}_k.$$
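As a rough check, a t statistic can be computed from the fold-wise values reported in Table 2. The text does not say whether a paired or an independent test was used, so the paired form below is an assumption, as is the choice to take the differences as DTW minus DT2FTW so that a large negative statistic corresponds to a higher mean for DT2FTW. The function name `paired_t_statistic` is hypothetical.

```python
import math
import statistics
from typing import Sequence

# Fold-wise values copied from Table 2.
DT2FTW_AUC = [0.9089, 0.9071, 0.9079, 0.9109, 0.9154, 0.9237, 0.9341, 0.9472, 0.9429, 0.9503]
DTW_AUC = [0.6414, 0.6149, 0.6212, 0.6311, 0.7063, 0.7195, 0.7431, 0.7621, 0.7901, 0.7811]


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """t statistic of the mean paired difference a - b (n - 1 degrees of freedom)."""
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.fmean(diffs) / standard_error


print("mean DT2FTW:", statistics.fmean(DT2FTW_AUC))
print("mean DTW:", statistics.fmean(DTW_AUC))
print("t statistic (DTW - DT2FTW):", paired_t_statistic(DTW_AUC, DT2FTW_AUC))
```

The two means reproduce the last row of Table 2; the statistic would then be compared with the left-tail critical value of the t distribution with nine degrees of freedom at the chosen significance level.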

### 4.4 Comparative Study and Experimental Results

The results of learning the NASDAQ, Dow Jones, and Mackey-Glass time-series with added noise are detailed in Table 3, where the average RMSE curves and the acceptance ratios during search are shown in Figures 2 and 3, respectively. The obtained results show that the DT2FTW model achieves state-of-the-art performance for global benchmarks for uncertain time-series. These results show a significant difference between the DT2FTW model and its counterpart DTW model, clustering-DTW model [15], and the fuzzy deep artificial neural network (ANN) model [16]. The results, shown in Table 3, confirm that the proposed DT2FTW model performs better, with lower error rates in terms of the RMSE, MAE, and MPE metrics, for all scenarios.

The results of this study adhere to the uncertainty modeling theory, thus confirming that the proposed model is more reliable and accurate than its counterpart models in the literature. In addition, the results in Figure 2 show that the proposed DT2FTW model outperforms the clustering and fuzzy deep time-series models, for all scenarios, and that it also outperforms the classical DTW model. Figures 3 and 4 reveal that the proposed model performs better on realistic time series such as the Dow Jones and NASDAQ series, and on the Mackey-Glass time-series at different delay rates.

### 4.5 ROC Curve Analysis

ROC curve analysis was conducted for obtaining a reliable estimate of the DT2FTW model’s performance. The following equations were used for assessing the performance based on the ROC curve analysis of the proposed model. In addition, standard metrics, such as precision, recall, and F-measure, were used for evaluating the proposed DT2FTW model; those metrics were defined as follows:

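$$\mathrm{Precision} = \frac{TP}{TP + FP} \times 100\%, \qquad \mathrm{Recall} = \frac{TP}{TP + FN} \times 100\%, \qquad F\text{-measure} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$$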

Table 4 shows the comparison of ROC curve analysis results between the proposed DT2FTW model and its counterparts, revealing that the proposed DT2FTW model performs significantly better than its counterpart models. The ROC curve analysis results confirm that the AUC of the proposed DT2FTW model is 12 percentage points higher than that of the fuzzy clustering model [15], and 4 percentage points higher than that of the fuzzy deep ANN [16].
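For reference, the classification metrics used in this subsection can be computed from confusion-matrix counts as sketched below. Recall is taken in its standard form, true positives over true positives plus false negatives; the function name `precision_recall_f` and the example counts are hypothetical and not taken from the experiments.

```python
from typing import Tuple


def precision_recall_f(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure), all in percent."""
    if tp <= 0:
        raise ValueError("at least one true positive is required")
    precision = 100.0 * tp / (tp + fp)
    recall = 100.0 * tp / (tp + fn)
    # Harmonic mean of precision and recall.
    f_measure = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f_measure


print(precision_recall_f(tp=8, fp=2, fn=2))
```

True negatives do not enter any of the three metrics, which is why they are often reported alongside the AUC rather than instead of it.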

### 4.6 Time Complexity Analysis

The time complexity was determined from the number of computation steps required to run the algorithm as a function of the input size. Running time was measured in hours, minutes, and seconds (0:00:00). The results are shown in Tables 5 and 6.

Tables 5 and 6 show the results for all tests on the dataset that has been divided into three different scenarios. The DT2FTW model in the first, second, and third measurements for time complexity had a greater order in several runs with different datasets and different scenarios, namely the Mackey-Glass, Dow Jones, and NASDAQ time-series.

### 4.7 Discussion and Analysis

The proposed DT2FTW model has more degrees of freedom than the type-1 or other related models, because of the FOU parameters in T2FSs and its potential to model non-uniform time intervals. The proficiency of T2FSs has been demonstrated for high-order uncertainties such as non-stationary events in time-series. In addition, in the proposed model, the problem of distortion along the time axis was addressed by the DTW algorithm alongside a type-2 fuzzy logic approach for modeling high-order uncertainties in time-series prediction using the footprint of uncertainty and unequal lengths of the time intervals. According to the results of the ROC curve analysis, the AUC of the proposed DT2FTW model was 12 percentage points higher than that of the fuzzy clustering model [15], and 4 percentage points higher than that of the fuzzy deep ANN model [16]. Similarly, the experimental results confirmed that the proposed DT2FTW model has lower error rates than the current methods in terms of the RMSE, MAE, and MPE metrics.

### 5. Conclusion

This study presented the DT2FTW model for predicting uncertain time-series. The DT2FTW model was evaluated by applying it to three standard benchmark datasets. It is a generalized DTW algorithm that combines the advantages of DTW and type-2 fuzzy logic: optimal alignment between time instance paths and prediction under high-order uncertainties. The experimental results confirm that the proposed DT2FTW model has lower error rates than other state-of-the-art algorithms.

Fig. 1. Block diagram of the proposed DT2FTW model.

Fig. 2. Error rate comparison of the DT2FTW model with its counterparts, for global time series: (a) NASDAQ, (b) Dow Jones, and (c) Mackey-Glass.

Fig. 3. Error rate of the DT2FTW model, for different noise levels: (a) = 1, (b) = 0.5, and (c) = 0.2, for the Mackey-Glass time series.

Fig. 4. Prediction fit results of the DT2FTW model, for different datasets: (a) NASDAQ, (b) Dow Jones, and (c) Mackey-Glass.

Table 1. Related fuzzy models for time-series prediction.

| Method | Limitation | Accuracy (%) |
|---|---|---|
| FCM time-series [17] | No optimal parameter | 77.86 |
| Fuzzy logic [18] | No data reduction | 77.18 |
| Fuzzy Markov [19] | Complexity of model | 69.80 |
| Fuzzy time-series [22] | High-order uncertainty is not modeled | 77.00 |
| Fuzzy neural [24] | Weights are fixed | 79.24 |
| Fuzzy-PSO [25] | Limited to mid-range series | 80.12 |
| Gustafson-Kessel fuzzy clustering [26] | Insufficient results for long time-series | 82.93 |
| PSO-fuzzy time-series [27] | Not reliable for long time-series | 82.45 |
| Fuzzy-NN time-series clustering [18] | Time complexity is high | 84.73 |

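Table 2. T-test results for the DT2FTW and DTW.

| Fold # | DT2FTW | DTW |
|---|---|---|
| 1 | 0.9089 | 0.6414 |
| 2 | 0.9071 | 0.6149 |
| 3 | 0.9079 | 0.6212 |
| 4 | 0.9109 | 0.6311 |
| 5 | 0.9154 | 0.7063 |
| 6 | 0.9237 | 0.7195 |
| 7 | 0.9341 | 0.7431 |
| 8 | 0.9472 | 0.7621 |
| 9 | 0.9429 | 0.7901 |
| 10 | 0.9503 | 0.7811 |
| Mean | 0.92484 | 0.70108 |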

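Table 3. Comparison results for the DT2FTW model on the NASDAQ, Dow Jones, and Mackey-Glass data.

| Method | NASDAQ MAE | NASDAQ RMSE | NASDAQ MPE | Dow Jones MAE | Dow Jones RMSE | Dow Jones MPE | Mackey-Glass MAE | Mackey-Glass RMSE | Mackey-Glass MPE |
|---|---|---|---|---|---|---|---|---|---|
| Fuzzy clustering time-series [15] | 0.029 | 0.035 | 1.59 | 0.320 | 0.037 | 1.71 | 0.027 | 0.026 | 1.41 |
| Fuzzy deep ANN time-series [16] | 0.019 | 0.019 | 1.52 | 0.024 | 0.028 | 1.64 | 0.021 | 0.017 | 1.39 |
| DT2FTW | 0.015 | 0.013 | 1.39 | 0.019 | 0.017 | 1.41 | 0.011 | 0.009 | 1.19 |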

Table 4. Comparison of average performance of the DT2FTW model with its counterpart models (unit: %).

| Method | AUC | CI | Recall | Precision | F-measure |
|---|---|---|---|---|---|
| Fuzzy-clustering | 82 | 80–83 | 82 | 80 | 81 |
| Fuzzy deep ANN | 90 | 88–91 | 92 | 91 | 90 |
| DT2FTW | 94 | 92–95 | 94 | 95 | 93 |

Table 5. Average complexity of the proposed DT2FTW model.

| Pseudo-code | Average complexity |
|---|---|
| Class type | 6 · O(1) + O(N · M) + O(N²) |
| Training time | 8 · O(1) + O(N²) |
| Calculating the output | O(N) |
| Calculating the error | O(1) + O(M · N) |
| The output | O(1) + O(N) |

Table 6. Time consumption of the DT2FTW model, for the three datasets (h:mm:ss).

| Samples | Mackey-Glass | Dow Jones | NASDAQ |
|---|---|---|---|
| 36 | 0:00:01 | 0:00:08 | 0:00:14 |
| 48 | 0:00:01 | 0:00:09 | 0:00:23 |
| 60 | 0:00:02 | 0:00:12 | 0:00:29 |
| 112 | 0:00:04 | 0:00:21 | 0:00:37 |
| 224 | 0:00:08 | 0:00:32 | 0:01:05 |
| 448 | 0:00:16 | 0:00:55 | 0:02:23 |
| 1,200 | 0:00:29 | 0:01:22 | 0:03:37 |

1. Wang, X, and Wang, C (2020). Time series data cleaning: a survey. IEEE Access. 8, 1866-1881. https://doi.org/10.1109/ACCESS.2019.2962152
2. Wang, F, Li, M, Mei, Y, and Li, W (2020). Time series data mining: a case study with big data analytics approach. IEEE Access. 8, 14322-14328. https://doi.org/10.1109/ACCESS.2020.2966553
3. Kanungsukkasem, N, and Leelanupab, T (2019). Financial latent Dirichlet allocation (FinLDA): feature extraction in text and data mining for financial time series prediction. IEEE Access. 7, 71645-71664. https://doi.org/10.1109/ACCESS.2019.2919993
4. Stoffer, DS, and Ombao, H (2012). Special issue on time series analysis in the biological sciences. Journal of Time Series Analysis. 33, 701-703. https://doi.org/10.1111/j.1467-9892.2012.00805.x
5. Topol, EJ (2019). High-performance medicine: the convergence of human and artificial intelligence. Nature Medicine. 25, 44-56. https://doi.org/10.1038/s41591-018-0300-7
6. Bengio, Y, Courville, A, and Vincent, P (2013). Representation learning: a review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence. 35, 1798-1828. https://doi.org/10.1109/TPAMI.2013.50
7. Bose, JH, Flunkert, V, Gasthaus, J, Januschowski, T, Lange, D, Salinas, D, Schelter, S, Seeger, M, and Wang, Y (2017). Probabilistic demand forecasting at scale. Proceedings of the VLDB Endowment. 10, 1694-1705. https://doi.org/10.14778/3137765.3137775
8. Sakoe, H, and Chiba, S (1978). Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing. 26, 43-49. https://doi.org/10.1109/TASSP.1978.1163055
9. Tappert, CC, Suen, CY, and Wakahara, T (1990). The state of the art in online handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 12, 787-808. https://doi.org/10.1109/34.57669
10. Kuzmanic, A, and Zanchi, V (2007). Hand shape classification using DTW and LCSS as similarity measures for vision-based gesture recognition system. Proceedings of 2007 International Conference on Computer as a Tool (EUROCON), Warsaw, Poland, pp. 264-269. https://doi.org/10.1109/EURCON.2007.4400350
11. Niennattrakul, V, and Ratanamahatana, CA (2007). On clustering multimedia time series data using k-means and dynamic time warping. Proceedings of 2007 International Conference on Multimedia and Ubiquitous Engineering (MUE), Seoul, Korea, pp. 733-738. https://doi.org/10.1109/MUE.2007.165
12. Bahlmann, C, and Burkhardt, H (2004). The writer independent online handwriting recognition system frog on hand and cluster generative statistical dynamic time warping. IEEE Transactions on Pattern Analysis and Machine Intelligence. 26, 299-310. https://doi.org/10.1109/TPAMI.2004.1262308
13. Kahveci, T, Singh, A, and Gurel, A (2002). Similarity searching for multi-attribute sequences. Proceedings 14th International Conference on Scientific and Statistical Database Management, Edinburgh, UK, pp. 175-184. https://doi.org/10.1109/SSDM.2002.1029718
14. Woo, H, Boccelli, DL, Uber, JG, Janke, R, and Su, Y (2019). Dynamic time warping for quantitative analysis of tracer study time-series water quality data. Journal of Water Resources Planning and Management. 145. article no 04019052
15. Zhang, Y, Qu, H, Wang, W, and Zhao, J (2020). A novel fuzzy time series forecasting model based on multiple linear regression and time series clustering. Mathematical Problems in Engineering. 2020. article no 9546792
16. Rosato, A, and Panella, M (2020). Time series prediction using random weights fuzzy neural networks. Proceedings of 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), pp. 1-6. https://doi.org/10.1109/FUZZ48607.2020.9177651
17. Egrioglu, E, Aladag, CH, and Yolcu, U (2013). Fuzzy time series forecasting with a novel hybrid approach combining fuzzy c-means and neural networks. Expert Systems with Applications. 40, 854-857. https://doi.org/10.1016/j.eswa.2012.05.040
18. Chen, SM (1996). Forecasting enrollments based on fuzzy time series. Fuzzy Sets and Systems. 81, 311-319. https://doi.org/10.1016/0165-0114(95)00220-0
19. Ramadani, K, and Devianto, D (2020). The forecasting model of Bitcoin price with fuzzy time series Markov chain and Chen logical method. AIP Conference Proceedings. 2296. article no 020095
20. Cheng, CH, Cheng, GW, and Wang, JW (2008). Multi-attribute fuzzy time series method based on fuzzy clustering. Expert Systems with Applications. 34, 1235-1242. https://doi.org/10.1016/j.eswa.2006.12.013
21. Cheng, CH, Chen, TL, Teoh, HJ, and Chiang, CH (2008). Fuzzy time-series based on adaptive expectation model for TAIEX forecasting. Expert Systems with Applications. 34, 1126-1132. https://doi.org/10.1016/j.eswa.2006.12.021
22. Tsaur, RC, Yang, JCO, and Wang, HF (2005). Fuzzy relation analysis in fuzzy time series model. Computers & Mathematics with Applications. 49, 539-548. https://doi.org/10.1016/j.camwa.2004.07.014
23. Singh, SR (2007). A simple method of forecasting based on fuzzy time series. Applied Mathematics and Computation. 186, 330-339. https://doi.org/10.1016/j.amc.2006.07.128
24. Aladag, CH, Basaran, MA, Egrioglu, E, Yolcu, U, and Uslu, VR (2009). Forecasting in high order fuzzy times series by using neural networks to define fuzzy relations. Expert Systems with Applications. 36, 4228-4231. https://doi.org/10.1016/j.eswa.2008.04.001
25. Aladag, CH, Yolcu, U, Egrioglu, E, and Dalar, AZ (2012). A new time invariant fuzzy time series forecasting method based on particle swarm optimization. Applied Soft Computing. 12, 3291-3299. https://doi.org/10.1016/j.asoc.2012.05.002
26. Egrioglu, E, Aladag, CH, Yolcu, U, Uslu, VR, and Erilli, NA (2011). Fuzzy time series forecasting method based on Gustafson-Kessel fuzzy clustering. Expert Systems with Applications. 38, 10355-10357. https://doi.org/10.1016/j.eswa.2011.02.052
27. Egrioglu, E (2014). PSO-based high order time invariant fuzzy time series method: application to stock exchange data. Economic Modelling. 38, 633-639. https://doi.org/10.1016/j.econmod.2014.02.017
28. Corradini, A (2001). Dynamic time warping for off-line recognition of a small gesture vocabulary. Proceedings IEEE ICCV Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, Vancouver, Canada, pp. 82-89. https://doi.org/10.1109/RATFG.2001.938914
29. Wu, D, and Mendel, JM (2019). Recommendations on designing practical interval type-2 fuzzy systems. Engineering Applications of Artificial Intelligence. 85, 182-193. https://doi.org/10.1016/j.engappai.2019.06.012
30. Mendel, JM (2013). On KM algorithms for solving type-2 fuzzy set problems. IEEE Transactions on Fuzzy Systems. 21, 426-446. https://doi.org/10.1109/TFUZZ.2012.2227488

Aref Safari received his M.Sc. degree in artificial intelligence. He is now working on his Ph.D. thesis to model the uncertainty of non-stationary time-series. His research interests are soft computing, time-series analysis, and pattern analysis.

E-mail: safari.aref@gmail.com

Rahil Hosseini received her Ph.D. degree in computational intelligence from Kingston University. She is a faculty member at the Department of Computer Engineering. Her main research interests include pattern recognition, fuzzy modeling, and data mining.

E-mail: rahilhosseini@gmail.com

Mahdi Mazinani received his Ph.D. degree in electrical engineering from Kingston University. Currently, he is a faculty member of the Department of Electrical Engineering. His main research interests include probability theory and pattern recognition.

E-mail: mahdi mazinani@yahoo.com


### Abstract

Prediction of time series is associated with nondeterministic pattern analysis for uncertain conditions. Therefore, it is necessary to develop high-quality prediction methods for real-world applications. Type-2 fuzzy systems can handle high-order uncertainties, such as sequential dependencies associated with time series. Precise and reliable prediction can help to develop reasonable strategies and assist specialists in planning the best policies for modeling events in uncertain time series. In this study, a hybrid model (dynamic type-2 fuzzy time warping [DT2FTW]) was proposed for handling high-order uncertainties in time-series prediction. A type-2 fuzzy intelligent system was developed alongside a dynamic time warping algorithm for predicting the patterns’ similarity in long-time series for time-series prediction. The results demonstrate that the proposed DT2FTW model yields more reliable predictions on global standard benchmarks such as the Mackey-Glass, Dow Jones, and NASDAQ time-series. The results also confirm that the proposed DT2FTW model has lower error rates than its counterpart algorithms in terms of the root mean square error (RMSE), mean absolute error (MAE), and mean percentage error (MPE). In addition, the results confirm the superiority of the proposed model with an average area under the ROC curve (AUC) of 94%, with the 95% confidence interval (92%-95%).

Keywords: Dynamic time warping, Interval type-2 fuzzy system, Time-series prediction

### 1. Introduction

Modeling of time-series is important because many pattern-analysis problems contain the time component. These problems are typically not addressed, because the time dependence makes time-series-related problems difficult to handle. One of the most challenging issues associated with time-series is their prediction. Prediction problems are often classified into short-term, medium-term, and long-term, and are characterized by different orders of uncertainty. The uncertainty associated with time-series data is implicit, with a nonlinear pattern. On the other hand, unreliable accuracy is a major issue in time-series predictions. Different time-series models, such as fuzzy time-series, have been considered for improving the prediction accuracy.

Many intelligent models have been used for analyzing time-series patterns. Recently, the use of soft computing approaches, such as fuzzy logic, neural networks, simulated annealing, and genetic algorithms, has been reported in the literature on the time-series prediction. These approaches have been considered advantageous compared with traditional methods, because they can address nonlinearities and can approximate many types of complex dynamical systems better than linear statistical models. Fuzzy logic models have been widely adopted because of the prevalent uncertainty in the time-series data. Most related studies use the Euclidean distance metric for measuring time-series intervals. The Euclidean distance metric is widely known to be very sensitive to distortions along the time axis, but it has been very popular with many researchers. The ubiquity of the Euclidean distance metric in the face of increasing evidence of its poor accuracy for time-series prediction is almost certainly owing to its ease of implementation and its time and space efficiency. However, in our work, the problem of distortion along the time axis is addressed by dynamic time warping (DTW) alongside a type-2 fuzzy logic approach, for modeling high-order uncertainties in time-series prediction based on the footprint of uncertainty and unequal lengths of time intervals.

### 1.1 Literature Review

Analysis and prediction of time-series belong to the field of temporal pattern recognition. A time-series corresponds to a sequence of values on the same scale, indexed by a naturally occurring time, as encountered in many applications in the engineering, ecology, economy, medicine, and finance fields [1]. Five concepts are important for time-series: the starting time, pattern similarity, period range, confidence, and the endpoint time. Many techniques [2–8] have been applied for time-series prediction based on the period and data type. The DTW model has been effectively used to automatically deal with time deformations in different time-series ranges with time-dependent data, for pattern recognition and similarity analysis. The DTW approach is currently used in many areas, including online signature matching and handwriting recognition [9], gesture and sign language recognition [10], knowledge discovery, data mining and clustering [11], pattern recognition and data analysis [12], and signal processing [13,14]. In addition, in many real-world applications temporal problems are complex, uncertain, and chaotic [15]. Fuzzy logic is one of the most effective methods for handling uncertainties in dynamic and non-stationary environments [16]. Table 1 lists various hybrid fuzzy models that have been used for predicting time-series [17–22]. Fuzzy systems, especially hybrid fuzzy models, have been very promising for solving complex problems, where a model estimates and predicts the similarity between two time-series in uncertain conditions [18,23–27].

### 1.2 Highlights

The main objective of this study was to introduce an intelligent dynamic type-2 fuzzy time warping (DT2FTW) model for predicting long-term temporal data in realistic time-series. The proposed model aims to overcome the drawbacks of the existing methods and offer a more robust, reliable, and accurate model for predicting long time-series using type-2 fuzzy logic. The model is split into two parts: 1) high-order type-2 fuzzy logic-based time-series prediction and 2) DTW. The type-2 time-series prediction consists of several steps, and we applied the Karnik-Mendel (KM) algorithm for defuzzification. Complex min-max composition operators were applied to all predictions. The prediction performance was then evaluated using the root mean square error (RMSE), mean absolute error (MAE), and mean percentage error (MPE) metrics, along with a statistical evaluation using the left-tailed T-test.

The remainder of this paper is organized as follows: Section 2 presents the theoretical research background and materials of the proposed model. Section 3 describes the detailed structure of the proposed model. Performance evaluation and experimental results are presented in Section 4, and the paper is concluded in Section 5.

### 2. Theoretical Background

This section presents a brief overview of DTW, followed by an overview of the concept of the interval type-2 fuzzy set (IT2FS). Finally, relevant mathematical expressions are provided.

### 2.1 DTW

To obtain a reliable time-series model for real-time applications, DTW and event-DTW (E-DTW) models were applied in [28] to find the optimal distance measure between two related time-series. Because time is aggregated to represent a reasonable estimate, a pattern may match a wide variety of actual time-series. Specifically, the pattern-detection task involves searching a time-series, S, for a pattern, P:

$S = S_1, S_2, \ldots, S_i, \ldots, S_n,$

$P = P_1, P_2, \ldots, P_j, \ldots, P_m.$

Sequences S and P can be arranged to form an n × m grid, where each point (i, j) corresponds to an alignment between elements $S_i$ and $P_j$. A warping path W maps the elements of S onto those of P; the cumulative cost along this path is the "distance" between the sequences, which must be minimized to yield lower uncertainty.

$W = w_1, w_2, \ldots, w_k,$

where W is the sequence of grid points and each $w_k$ corresponds to a point $(i, j)_k$. To formulate the DTW algorithm for a time-series problem, we need a distance measure between two elements. The distance function δ is either the magnitude or the square of the difference between two data points:

$\delta(i,j) = |S_i - P_j|,$

$\delta(i,j) = (S_i - P_j)^2.$

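The cumulative distance is minimized over all warping paths:

$DTW(S,P) = \min_W \left[ \sum_{k} \delta(w_k) \right],$

where δ is the distance measure between two time-series elements and the sum runs over the points of the path. In addition, there are three constraints on an arbitrary warping path $W = \{w_1, w_2, \ldots, w_k\}$; in the standard DTW formulation these are the boundary, monotonicity, and continuity conditions. Element $w_k \in W$ represents the mapping between $S_i$ and $P_j$, that is, $w_k = (i, j)$. Together, the three constraints allow element $w_{k+1}$ to appear only at one of the three positions adjacent to $w_k$: $(i+1, j)$, $(i, j+1)$, or $(i+1, j+1)$.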
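The recursion implied by the cumulative distance and the three adjacent positions can be sketched with dynamic programming. This is a minimal illustration of classical DTW, not the authors' implementation; the function name `dtw_distance` and the `squared` flag are hypothetical.

```python
def dtw_distance(s, p, squared=False):
    """Cumulative DTW distance between sequences s and p (classical form)."""
    n, m = len(s), len(p)
    inf = float("inf")
    # acc[i][j] holds the minimal cumulative distance aligning s[:i] with p[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = s[i - 1] - p[j - 1]
            # Local distance: absolute difference or squared difference.
            delta = diff * diff if squared else abs(diff)
            # A path point can only be reached from its three adjacent predecessors.
            acc[i][j] = delta + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated value is absorbed by the warping, whereas a point-by-point Euclidean comparison is not even defined for sequences of unequal length.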

### 2.2 An Overview of Interval Type-2 Fuzzy Sets

This section provides definitions of type-2 fuzzy sets (T2FSs) and related essential concepts. The membership grade of a given element of a T2FS is itself a type-1 fuzzy set (T1FS). A T2FS, denoted Ã, is characterized by a type-2 MF $\mu_{\tilde{A}}(x,u)$, where $x \in X$ and $u \in J_x \subseteq [0, 1]$, that is [29]:

$\tilde{A} = \{((x,u), \mu_{\tilde{A}}(x,u)) \mid \forall x \in X, \forall u \in J_x \subseteq [0,1]\},$

where $0 \le \mu_{\tilde{A}}(x,u) \le 1$, X is the domain of the fuzzy set, and $J_x$ is the domain of the secondary MF at x. Ã can also be expressed as [29]:

$\tilde{A} = \int_{x \in X} \int_{u \in J_x} \mu_{\tilde{A}}(x,u) / (x,u), \quad J_x \subseteq [0,1],$

where ∫∫ represents a union over all admissible x and u. When every secondary grade equals 1, Ã is an IT2FS:

$\tilde{A} = \int_{x \in X} \int_{u \in J_x} 1/(x,u) = \int_{x \in X} \left[ \int_{u \in J_x} 1/u \right] \Big/ x,$

where x is the primary variable; $J_x$, an interval in [0, 1], is the primary membership of x; u is the secondary variable; and $\int_{u \in J_x} 1/u$ is the secondary MF at x. Uncertainty about Ã is conveyed by the union of all of the primary memberships, called the footprint of uncertainty (FOU) of Ã, denoted $FOU(\tilde{A})$ and given as [30]:

$FOU(\tilde{A}) = \bigcup_{x \in X} J_x.$

The FOU for a Gaussian primary MF with an uncertain standard deviation is shown in Figure 3. The FOU is bounded by an upper membership function (UMF) $\overline{\mu}_{\tilde{A}}(x)$ and a lower membership function (LMF) $\underline{\mu}_{\tilde{A}}(x)$, which are T1FSs; consequently, the membership grade of each element of an IT2FS is an interval $[\underline{\mu}_{\tilde{A}}(x), \overline{\mu}_{\tilde{A}}(x)]$. In IT2FSs, the UMF and LMF represent uncertainties of the input variables that T1FSs cannot capture. In addition, the FOU in the IT2FSs provides more degrees of freedom [29].
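To make the interval membership grade concrete, the sketch below evaluates the LMF and UMF of a Gaussian primary MF whose standard deviation is only known to lie in an interval. The parameter names (`sigma_low`, `sigma_high`) and the fixed mean are assumptions made for illustration; the source does not give the FOU parameters.

```python
import math


def gaussian(x, mean, sigma):
    """Type-1 Gaussian membership grade."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


def it2_gaussian_membership(x, mean, sigma_low, sigma_high):
    """Return the (lower, upper) membership grades of x.

    With a fixed mean, the narrower Gaussian (sigma_low) bounds the FOU
    from below and the wider Gaussian (sigma_high) bounds it from above.
    """
    lower = gaussian(x, mean, sigma_low)
    upper = gaussian(x, mean, sigma_high)
    return lower, upper
```

At the mean both bounds equal 1, and the interval widens as x moves away from the mean, which is where the uncertainty about the standard deviation matters most.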

### 3. The Proposed DT2FTW Model

The block diagram of the proposed DT2FTW model is shown in Figure 1. The fuzzifier, which defines the membership grade of the input, can be categorized into two types, singleton and non-singleton, according to the number of non-zero MF values. In this study, a singleton fuzzifier was implemented, and the TSK fuzzy rule type was adopted because it was considered more precise than Mamdani rules. The output processing, comprising the type-reducer and defuzzifier, generates a crisp output; the KM algorithm [30] was applied for this step. The steps of the proposed DT2FTW warping path prediction model are described below.

Step 1 (Fuzzification): Fuzzification is the first step in the proposed DT2FTW model. As noted above, the fuzzifier can be singleton or non-singleton, and the TSK fuzzy rule type was used. The product and minimum t-norms were adopted as inference methods for computing the firing strength of rules with multiple antecedents. The membership degrees of the input data, or measurements, in the IT2FSs were then defined using a Gaussian MF, which assumes that the time steps are normally distributed. The Gaussian MF was applied to compute the fuzzy memberships from the normalized DT2FTW distances; it was chosen because most real-world variables follow a normal distribution, with a gradual change around a mean value. The Gaussian MF is expressed as follows:

$\hat{\mu}_i = \exp\left(-\frac{(x_i - \bar{x}_c)^2}{2\sigma_c^2}\right),$

where $\hat{\mu}_i$ is the membership grade given by the Gaussian MF, $x_i$ is the normalized DT2FTW distance of the class query sequence, and $\bar{x}_c$ and $\sigma_c$ are the mean and standard deviation, respectively, of the normalized DT2FTW distances. Eq. (15) was used for normalizing the fuzzy memberships:

$\mu_i = \frac{\hat{\mu}_i}{\sum_i \hat{\mu}_i},$

where μi is the normalized fuzzy membership degree, and μ̂i is the original fuzzy membership.
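The two fuzzification formulas, the Gaussian MF followed by the normalization, can be sketched as below. This is an illustrative reading, assuming the mean and standard deviation are taken over the same list of distances being fuzzified; the function name and the uniform fallback for zero spread are hypothetical.

```python
import math
import statistics


def normalized_memberships(distances):
    """Gaussian memberships of normalized distances, rescaled to sum to 1."""
    mean = statistics.mean(distances)
    sigma = statistics.pstdev(distances)
    if sigma == 0.0:
        # All distances identical: every point gets the same membership.
        return [1.0 / len(distances)] * len(distances)
    raw = [math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2)) for x in distances]
    total = sum(raw)
    return [r / total for r in raw]
```

Distances close to the mean receive the largest normalized membership, and the returned grades always sum to 1.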

Step 2 (Inference): To design the inference system of the DT2FTW model, we applied a TSK type with three inputs and one output; each input-output variable is described by Gaussian MFs, and the system has $2^n$ if-then rules. In this study, $2^n$ optimal fuzzy rules were applied in the experiments to improve performance and minimize the prediction error rates in different scenarios. As mentioned above, the main goal of DTW is to align two time-dependent sequences, such as $S = (S_1, S_2, \ldots, S_i, \ldots, S_N)$.

These sequences may be discrete time-series or feature sequences sampled at intermediate points under chaotic and non-stationary conditions. Because a predicted value of S relies on $S(t_k)$, a past value of S, there are s antecedents in every rule. According to Eqs. (1)–(3), X(t) is the observed series at time t, T(t) represents the trend factor, S(t) represents the seasonal factor, and I(t) denotes the irregular or random factor, a white-noise component with zero mean and constant variance. Hence, an uncertain time-series condition is used for regulating the alignment between the S and P time patterns in an additive form of the time-series. This evolution can be defined by Eq. (16):

$X(t+1)=F(x(t)),$

where $X(t+1) \in \mathbb{R}^d$ is the state of the system at time step t+1, and F is a nonlinear vector-valued function of the additive form of different types of time-series. The observations x(t), t = 1, ..., N, form a time-ordered set. Each observation is related to the state X(t) of the underlying system as follows:

$x(t)=h∣X(t)∣+ɛ(t),$

where h is a nonlinear scalar-valued function, and ɛ is a random variable that accounts for the uncertainties and the presence of noise. It is usually assumed that ɛ(t) is drawn from a Gaussian distribution. The values of h|X(t)| define the time observations in the mid- and long-term, whereas the magnitude of ɛ defines the random factor.

Step 3 (Finding similarity in sliding windows): By implementing the sliding windows, each data center was trained using observations from 0 to K − 1. The fuzzy inference system was trained with different significance levels to forecast the upper and lower bounds of the prediction intervals of S and P. In addition, to estimate the optimal values of S and P, we applied an accumulated cost matrix algorithm with a function C(s, p), where C(s, p) is the minimal distance measure between S and P. To compare two time-series, we need a local cost measure (LCM) and a local distance measure (LDM), defined as functions

$C: m \times n \to \mathbb{R}_{\ge 0}.$

To compute the LCM for each pair of elements of the sequences S and P, the cost matrix can be defined as $C \in \mathbb{R}^{n \times m}$, where $C(n,m) = C(S_n, P_m)$.

Step 4 (Best warping path): To find the best warping path between S and P, we need to find the path through the grid, as follows:

$W = w_1, w_2, \ldots, w_k,$

$w_t = (i_t, j_t),$

where $w_t$ is the point of the warping path at step t, and the total distance between the aligned points $(i_t, j_t)$ has to be minimized. To evaluate the time-normalized distance between S and P, we propose the following equation:

$D(S,P) = \left[ \frac{\sum_{t=1}^{k} d(p_t) \cdot w_t}{\sum_{t=1}^{k} w_t} \right],$

where $d(p_t)$ is the distance between $i_t$ and $j_t$, and $w_t > 0$ here denotes the weighting factor of step t. To calculate the best alignment path between S and P, we used

$w_0 = \arg\min_W \left( D(S,P) \right).$

The search was then restricted using the boundary condition of the warping window, as follows:

$|i_t - j_t| \le r,$

where r > 0 is the window length. At the end of this step, a weighting coefficient must be chosen. The time-normalized distance between S and P is

$D(S,P) = \min_W \left[ \frac{\sum_{t=1}^{k} d(p_t) \cdot w_t}{\sum_{t=1}^{k} w_t} \right].$
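A minimal sketch of a windowed, time-normalized DTW is given below. The source does not specify the weighting coefficient, so this example assumes the symmetric weights of Sakoe and Chiba (weight 2 for a diagonal step, 1 for a horizontal or vertical step), for which the sum of weights is always n + m and the minimization reduces to ordinary dynamic programming. The function name is hypothetical.

```python
def banded_normalized_dtw(s, p, r):
    """Time-normalized DTW distance restricted to the band |i - j| <= r.

    Assumes symmetric step weights (diagonal = 2, horizontal/vertical = 1),
    so the normalizing sum of weights equals len(s) + len(p) for every path.
    Returns infinity when no path fits inside the band.
    """
    n, m = len(s), len(p)
    inf = float("inf")
    g = [[inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(i - j) > r:
                continue  # outside the warping window
            d = abs(s[i - 1] - p[j - 1])
            g[i][j] = min(
                g[i - 1][j] + d,            # vertical step, weight 1
                g[i][j - 1] + d,            # horizontal step, weight 1
                g[i - 1][j - 1] + 2.0 * d,  # diagonal step, weight 2
            )
    return g[n][m] / (n + m)
```

A narrower window lowers the cost of the search but can exclude the best alignment; with r = 0 only sequences of equal length can be aligned at all.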

Step 5 (Measuring uncertainty): Uncertainty in a model affects the model's confidence and accuracy. This paper presents a model for managing the uncertainty associated with time-series prediction by considering a distribution over data points. This distribution depends on the data points as follows: D = {X, Y}, where D is the distribution, and X and Y are the data points of the sliding windows, $X = \{x_1, x_2, \ldots, x_n\}$ and $Y = \{y_1, y_2, \ldots, y_n\}$. Therefore, the weight distribution after predicting the time-series can be expressed as follows:

$p(\omega \mid X, Y),$

where ω is the weight of the data points. To approximate this distribution, the Bernoulli rate equation must be calculated, as follows:

$p(\omega \mid X, Y) \approx \mathrm{Bern}(\omega; \alpha),$

where α is the Bernoulli rate on the weights. Hence, the model uncertainty is the variance of T Monte Carlo data points, as follows:

$\mathrm{Var}(y) = \frac{1}{T} \sum_{t=1}^{T} (y_t - \bar{y})^2,$

where $\{y_t\}_{t=1}^{T}$ is a set of T outputs of the DT2FTW, with mean

$\bar{y} = \frac{1}{T} \sum_{t} y_t.$
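The Monte Carlo estimate of the model uncertainty can be sketched as follows. The example assumes that each weight is kept with probability α in every stochastic run, and it uses a toy linear predictor as a stand-in for the DT2FTW output; `mc_uncertainty`, `linear_predict`, and the default values are hypothetical.

```python
import random


def linear_predict(weights, x):
    """Toy predictor standing in for the DT2FTW output (assumption)."""
    return sum(w * xi for w, xi in zip(weights, x))


def mc_uncertainty(predict, weights, x, alpha=0.9, runs=100, seed=0):
    """Mean and variance of T = runs stochastic outputs.

    In each run every weight is kept with probability alpha (Bernoulli mask)
    and zeroed otherwise.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(runs):
        masked = [w if rng.random() < alpha else 0.0 for w in weights]
        outputs.append(predict(masked, x))
    mean = sum(outputs) / runs
    variance = sum((y - mean) ** 2 for y in outputs) / runs
    return mean, variance
```

With α = 1 no weight is ever dropped, all T outputs coincide, and the estimated variance is zero; smaller values of α spread the outputs and raise the variance.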

Step 6 (Defuzzification): Defuzzification is the process by which elements of a fuzzy set are converted into crisp values. It was performed to make the DT2FTW output easier to interpret. The output processing of the proposed model used the KM algorithm [30]. The time steps (data points) must be assigned to the class with the highest membership; during defuzzification, we set up decision rules to guide this process by assigning time steps to crisp sets. In the last part, a threshold value for the MFs was defuzzified and assigned to the class with the highest membership. Consequently, a group of time-lagged values in $d_E$ must deliver adequate data to rebuild the states of a noticeably complex dynamical system. The centroid method was applied in the defuzzification process. The centroid of the DT2FTW model's output set Ã is the union of the centroids of all its embedded sets $A_e$, as follows [32–33]:

$c(A) = \frac{\sum_{i=1}^{N} x_i \mu_A(x_i)}{\sum_{i=1}^{N} \mu_A(x_i)},$

$C_{\tilde{A}} \equiv \bigcup_{\forall A_e} c(A_e) = [c_l(\tilde{A}), c_r(\tilde{A})],$

where ∪ is the union operation, and

$c_l(\tilde{A}) = \min_{\forall A_e} c(A_e),$

$c_r(\tilde{A}) = \max_{\forall A_e} c(A_e),$

where $c_l(\tilde{A})$ and $c_r(\tilde{A})$ can be expressed as follows:

$c_l(\tilde{A}) = \frac{\sum_{i=1}^{L} x_i \overline{\mu}_{\tilde{A}}(x_i) + \sum_{i=L+1}^{N} x_i \underline{\mu}_{\tilde{A}}(x_i)}{\sum_{i=1}^{L} \overline{\mu}_{\tilde{A}}(x_i) + \sum_{i=L+1}^{N} \underline{\mu}_{\tilde{A}}(x_i)},$

$c_r(\tilde{A}) = \frac{\sum_{i=1}^{R} x_i \underline{\mu}_{\tilde{A}}(x_i) + \sum_{i=R+1}^{N} x_i \overline{\mu}_{\tilde{A}}(x_i)}{\sum_{i=1}^{R} \underline{\mu}_{\tilde{A}}(x_i) + \sum_{i=R+1}^{N} \overline{\mu}_{\tilde{A}}(x_i)},$

where the switch points $x_L$ and $x_R$, as well as $c_l(\tilde{A})$ and $c_r(\tilde{A})$, are computed using the KM algorithm [33].
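The iterative search for the switch points can be sketched as below. This is a compact illustration of the Karnik-Mendel procedure for a discretized IT2FS, assuming the sample points are sorted in ascending order and the lower memberships are strictly positive; it is not taken from the source, and the function name and tolerance are hypothetical.

```python
def km_centroid(xs, lower, upper, tol=1e-12, max_iter=100):
    """Return (c_l, c_r), the centroid interval of a discretized IT2FS.

    xs must be sorted ascending; lower[i] <= upper[i] are the LMF and UMF
    grades at xs[i]. Lower grades are assumed to be strictly positive.
    """

    def weighted_average(theta):
        return sum(x * t for x, t in zip(xs, theta)) / sum(theta)

    def iterate(left):
        # Start from the midpoint of the membership interval at every point.
        theta = [(lo + up) / 2.0 for lo, up in zip(lower, upper)]
        c = weighted_average(theta)
        for _ in range(max_iter):
            if left:
                # c_l: upper grades to the left of the switch point, lower to the right.
                theta = [up if x <= c else lo for x, lo, up in zip(xs, lower, upper)]
            else:
                # c_r: lower grades to the left of the switch point, upper to the right.
                theta = [lo if x <= c else up for x, lo, up in zip(xs, lower, upper)]
            c_new = weighted_average(theta)
            if abs(c_new - c) <= tol:
                return c_new
            c = c_new
        return c

    return iterate(True), iterate(False)
```

A crisp output is commonly taken as the midpoint of the returned interval, (c_l + c_r) / 2. When the LMF and UMF coincide, both endpoints collapse to the ordinary type-1 centroid.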

### 4. Performance Evaluation

In this section, the evaluation of the proposed DT2FTW model is presented. First, the metrics for the performance measurements and datasets used in this study are explained. Then, the statistical results, a comparative study, and experimental results are discussed.

### 4.1 Datasets of This Study

This study used well-known existing time-series datasets, including the Mackey-Glass, NASDAQ, and Dow Jones time-series. The selected datasets represent a wide range of uncertainties in the time-series prediction procedure. The proposed DT2FTW model was extended to provide an estimator of the predictions that also allowed us to compute the uncertainties of predictions for the noisy Mackey-Glass chaotic time-series. The model was tested on time-series data simulated using the following form of the Mackey-Glass nonlinear delay differential equation:

$\dot{x}(t) = \frac{0.2\,x(t-\tau)}{1 + x^{10}(t-\tau)} - 0.1\,x(t).$
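A Mackey-Glass series can be generated by integrating the delay differential equation numerically. The sketch below uses a simple forward Euler step; the delay τ = 17, the unit step size, and the constant initial history of 1.2 are common benchmark choices assumed here, since the source does not state its simulation settings.

```python
def mackey_glass(n_steps, tau=17, dt=1.0, x0=1.2):
    """Simulate the Mackey-Glass equation with forward Euler integration.

    tau, dt, and x0 are assumed values, not settings taken from the paper.
    """
    delay = int(round(tau / dt))
    history = [x0] * (delay + 1)  # constant initial history on [-tau, 0]
    for _ in range(n_steps):
        x_t = history[-1]
        x_lag = history[-1 - delay]  # value tau time units in the past
        dx = 0.2 * x_lag / (1.0 + x_lag ** 10) - 0.1 * x_t
        history.append(x_t + dt * dx)
    return history[delay + 1:]
```

Noise can then be added to the returned values to emulate the noisy scenarios; a smaller `dt` gives a more accurate trajectory at the cost of more steps.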

In addition, the NASDAQ is the leading United States electronic stock market. It lists around 3,300 companies. We applied 4,250 pairs of data points from the NASDAQ time-series corresponding to the window from 01/12/2018 to 01/12/2020; these data can be downloaded from Yahoo’s live daily data center. The first 3,025 pairs of the data points were used for training, while the remaining 1,225 pairs of the data points were used for validating the DT2FTW model. From the Dow Jones time-series, we used 1,250 pairs of data points, corresponding to the window from 01/03/2019 to 05/01/2020. These data can be downloaded from Yahoo’s live daily data center. The first 1,025 pairs of the data points were used for training, while the remaining 675 pairs of the data points were used for validating the DT2FTW model.

### 4.2 Performance Metrics

To evaluate the prediction error, the MAE, RMSE, and MPE metrics were applied to the proposed DT2FTW model, as follows:

$MAE = \frac{1}{n} \sum_{t=1}^{n} |S_t - P_t|,$

$RMSE = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (S_t - P_t)^2},$

$MPE = \frac{100\%}{n} \sum_{t=1}^{n} \frac{S_t - P_t}{S_t},$

where S denotes the real data points in the time-series, P denotes the DT2FTW predictions for the aggregation model in the final step of the prediction procedure, n is the number of data points, and t is the time variable of the time-series.
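The three error metrics can be computed as in the sketch below, which assumes their standard definitions (averages over the n data points, with the MPE expressed in percent). The function names are illustrative.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(s - p) for s, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((s - p) ** 2 for s, p in zip(actual, predicted)) / len(actual))


def mpe(actual, predicted):
    """Mean percentage error in percent; actual values must be non-zero."""
    return 100.0 / len(actual) * sum((s - p) / s for s, p in zip(actual, predicted))
```

Unlike the MAE and RMSE, the MPE keeps the sign of each error, so over- and under-predictions can cancel; it indicates bias rather than overall accuracy.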

### 4.3 Statistical Evaluation

A left-tailed T-test was used for estimating the proficiency of the DT2FTW method and its robustness. The hypotheses were $H_0: \mu_i > \mu_j$ and $H_1: \mu_i < \mu_j$, where $\mu_i$ and $\mu_j$ are the mean AUC values of the DT2FTW and DTW models, respectively, over 100 different runs of the cross-validation technique (Eq. 36). The T-test results in Table 2 reveal the superiority of the proposed DT2FTW model for Mackey-Glass time-series prediction.

$\mu_i, \mu_j = \frac{1}{10} \sum_{k=1}^{10} AUC_k.$
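As an illustration of the test, the sketch below computes a paired t statistic from the ten per-fold values listed in Table 2. It assumes that those values are AUC scores, that the folds are paired, and that the alternative hypothesis is that the DTW mean is lower than the DT2FTW mean; the source does not state these details, so the result is only indicative. The critical value −1.833 is the one-tailed 5% point of Student's t distribution with 9 degrees of freedom.

```python
import math
import statistics

# Per-fold values copied from Table 2.
dt2ftw = [0.9089, 0.9071, 0.9079, 0.9109, 0.9154,
          0.9237, 0.9341, 0.9472, 0.9429, 0.9503]
dtw = [0.6414, 0.6149, 0.6212, 0.6311, 0.7063,
       0.7195, 0.7431, 0.7621, 0.7901, 0.7811]


def paired_t_statistic(a, b):
    """t statistic of the paired differences a[k] - b[k]."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Left-tailed test: a strongly negative statistic supports mean(DTW) < mean(DT2FTW).
t_stat = paired_t_statistic(dtw, dt2ftw)
T_CRITICAL = -1.833  # one-tailed 5% critical value, 9 degrees of freedom
reject_null = t_stat < T_CRITICAL
```

The fold means reproduce the last row of Table 2 (0.92484 and 0.70108), and the statistic falls well below the critical value under these assumptions.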

### 4.4 Comparative Study and Experimental Results

The results of learning the NASDAQ, Dow Jones, and Mackey-Glass time-series with added noise are detailed in Table 3, where the average RMSE curves and the acceptance ratios during search are shown in Figures 2 and 3, respectively. The obtained results show that the DT2FTW model achieves state-of-the-art performance for global benchmarks for uncertain time-series. These results show a significant difference between the DT2FTW model and its counterpart DTW model, clustering-DTW model [15], and the fuzzy deep artificial neural network (ANN) model [16]. The results, shown in Table 3, confirm that the proposed DT2FTW model performs better, with lower error rates in terms of the RMSE, MAE, and MPE metrics, for all scenarios.

The results of this study are consistent with uncertainty modeling theory, supporting the conclusion that the proposed model is more reliable and accurate than its counterpart models in the literature. In addition, the results in Figure 2 show that the proposed DT2FTW model outperforms the clustering and fuzzy deep time-series models for all scenarios, and that it also outperforms the classical DTW model. Figures 3 and 4 reveal that the proposed model performs better on realistic time series such as the Dow Jones and NASDAQ series, and on the Mackey-Glass time-series at different delay rates.

### 4.5 ROC Curve Analysis

ROC curve analysis was conducted for obtaining a reliable estimate of the DT2FTW model’s performance. The following equations were used for assessing the performance based on the ROC curve analysis of the proposed model. In addition, standard metrics, such as precision, recall, and F-measure, were used for evaluating the proposed DT2FTW model; those metrics were defined as follows:

$Precision = \frac{TP}{TP + FP} \times 100\%,$

$Recall = \frac{TP}{TP + FN} \times 100\%,$

$F\text{-}measure = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}.$
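For concreteness, the metrics can be computed from confusion-matrix counts as sketched below, using the standard definitions in which recall is TP / (TP + FN). The function name and the example counts are illustrative only.

```python
def classification_metrics(tp, fp, fn):
    """Return (precision, recall, f_measure) in percent.

    tp, fp, fn are the numbers of true positives, false positives, and
    false negatives; tp must be greater than zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2.0 * precision * recall / (precision + recall)
    return precision * 100.0, recall * 100.0, f_measure * 100.0
```

With 8 true positives, 2 false positives, and 2 false negatives, all three metrics equal 80%.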

Table 4 shows the comparison of ROC curve analysis results between the proposed DT2FTW model and its counterparts, revealing that the proposed DT2FTW model performs significantly better than its counterpart models. The ROC curve analysis results confirm that the AUC of the proposed DT2FTW model is 12 percentage points higher than that of the fuzzy clustering model [15], and 4 percentage points higher than that of the fuzzy deep ANN [16].

### 4.6 Time Complexity Analysis

The time complexity was determined from the number of computation steps required to run an algorithm as a function of the input size. Running time was measured in hours, minutes, and seconds (0:00:00). The results are shown in Tables 5 and 6.

Tables 5 and 6 show the results for all tests on the datasets, which were divided into three different scenarios, namely the Mackey-Glass, Dow Jones, and NASDAQ time-series. In Table 6, the running time of the DT2FTW model increases with the number of samples in every scenario and is longest for the NASDAQ series.

### 4.7 Discussion and Analysis

The proposed DT2FTW model has more degrees-of-freedom than the type-1 or other related models, because of the FOU parameters in T2FSs, and owing to its potential to model non-uniform time intervals. The proficiency of T2FSs has been proven for high-order uncertainties such as non-stationary events in time-series. In addition, in the proposed model, the problem of distortion of the time axis was addressed by the DTW algorithm alongside a type-2 fuzzy logic approach for modeling high-order uncertainties in time-series prediction using the footprint of uncertainty and unequal length of the time intervals. According to the results of the ROC curve analysis, the proposed DT2FTW model was 12% better than the fuzzy clustering model [15], and 4% better than the fuzzy deep ANN model [16], in terms of the AUC metric. Similarly, the experimental results confirmed that the proposed DT2FTW model has lower error rates than the current methods. The experimental results confirmed the superiority of the DT2FTW model in terms of the RMSE, MAE, and MPE metrics.

### 5. Conclusion

This study presented the DT2FTW model for predicting uncertain time-series. The DT2FTW model was evaluated by applying it to three global standard datasets. It is a generalized DTW algorithm that combines the advantages of DTW and type-2 fuzzy logic: optimal alignment between time instance paths and prediction under high-order uncertainties. The experimental results confirm that the proposed DT2FTW model has lower error rates than other state-of-the-art algorithms.

Figure 1. Block diagram of the proposed DT2FTW model.

Figure 2. Error rate comparison of the DT2FTW model with its counterparts, for global time series: (a) NASDAQ, (b) Dow Jones, and (c) Mackey-Glass.

Figure 3. Error rate of the DT2FTW model, for different noise levels: (a) = 1, (b) = 0.5, and (c) = 0.2, for the Mackey-Glass time series.

Figure 4. Prediction fit results of the DT2FTW model, for different datasets: (a) NASDAQ, (b) Dow Jones, and (c) Mackey-Glass.

Table 1. Related fuzzy models for time-series prediction.

| Method | Limitation | Accuracy (%) |
| --- | --- | --- |
| FCM time-series [17] | No optimal parameter | 77.86 |
| Fuzzy logic [18] | No data reduction | 77.18 |
| Fuzzy Markov [19] | Complexity of model | 69.80 |
| Fuzzy time-series [22] | High-order uncertainty is not modeled | 77.00 |
| Fuzzy neural [24] | Weights are fixed | 79.24 |
| Fuzzy-PSO [25] | Limited to mid-range series | 80.12 |
| Gustafson-Kessel fuzzy clustering [26] | Insufficient results for long time-series | 82.93 |
| PSO-fuzzy time-series [27] | Not reliable for long time-series | 82.45 |
| Fuzzy-NN time-series clustering [18] | Time complexity is high | 84.73 |

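Table 2. T-test results for the DT2FTW and DTW.

| Fold# | DT2FTW | DTW |
| --- | --- | --- |
| 1 | 0.9089 | 0.6414 |
| 2 | 0.9071 | 0.6149 |
| 3 | 0.9079 | 0.6212 |
| 4 | 0.9109 | 0.6311 |
| 5 | 0.9154 | 0.7063 |
| 6 | 0.9237 | 0.7195 |
| 7 | 0.9341 | 0.7431 |
| 8 | 0.9472 | 0.7621 |
| 9 | 0.9429 | 0.7901 |
| 10 | 0.9503 | 0.7811 |
| Mean | 0.92484 | 0.70108 |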

Comparison results for the DT2FTW model on the NASDAQ, Dow Jones, and Mackey-Glass data.

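| Method | NASDAQ MAE | NASDAQ RMSE | NASDAQ MPE | Dow Jones MAE | Dow Jones RMSE | Dow Jones MPE | Mackey-Glass MAE | Mackey-Glass RMSE | Mackey-Glass MPE |
|---|---|---|---|---|---|---|---|---|---|
| Fuzzy clustering time-series [15] | 0.029 | 0.035 | 1.59 | 0.320 | 0.037 | 1.71 | 0.027 | 0.026 | 1.41 |
| Fuzzy deep ANN time-series [16] | 0.019 | 0.019 | 1.52 | 0.024 | 0.028 | 1.64 | 0.021 | 0.017 | 1.39 |
| DT2FTW | 0.015 | 0.013 | 1.39 | 0.019 | 0.017 | 1.41 | 0.011 | 0.009 | 1.19 |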
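
For reference, the three error measures in the table above can be computed as in the following sketch. It uses the common textbook definitions of MAE, RMSE, and MPE, which is an assumption: the exact formulas and any normalization used for the reported values are not given in this section. The sample series are hypothetical and are not data from the paper.

```python
import math


def _check(actual, predicted):
    """Validate that both series are non-empty and of equal length."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")


def mae(actual, predicted):
    """Mean absolute error."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    _check(actual, predicted)
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mpe(actual, predicted):
    """Mean percentage error, in percent (signed, so errors can cancel)."""
    _check(actual, predicted)
    if any(a == 0 for a in actual):
        raise ValueError("MPE is undefined when an actual value is zero")
    ratios = sum((a - p) / a for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


# Hypothetical values for illustration only.
actual = [1.0, 2.0, 4.0]
predicted = [1.1, 1.9, 4.2]
print(f"MAE = {mae(actual, predicted):.4f}")
print(f"RMSE = {rmse(actual, predicted):.4f}")
print(f"MPE = {mpe(actual, predicted):.4f} %")
```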

Comparison of average performance of the DT2FTW model with its counterpart models (unit: %).

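| Method | AUC | CI | Recall | Precision | F-measure |
|---|---|---|---|---|---|
| Fuzzy-clustering | 82 | 80–83 | 82 | 80 | 81 |
| Fuzzy deep ANN | 90 | 88–91 | 92 | 91 | 90 |
| DT2FTW | 94 | 92–95 | 94 | 95 | 93 |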

Average complexity of the proposed DT2FTW model.

| Pseudo-code step | Average complexity |
|---|---|
| Class type | 6 * O(1) + O(N * M) + O(N²) |
| Training time | 8 * O(1) + O(N²) |
| Calculating the output | O(N) |
| Calculating the error | O(1) + O(M * N) |
| The output | O(1) + O(N) |
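
The O(N * M) terms in the table above are consistent with the cost of the classic dynamic time warping recurrence [8], which fills an N-by-M cost table. The sketch below shows that classic algorithm only. It is not the authors' DT2FTW model, and the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    Runs in O(N * M) time for len(a) == N and len(b) == M, and keeps
    only two rows of the cost table, so memory use is O(M).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # prev[j] is the best cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])  # assumed local distance
            # Extend the cheapest of the three admissible predecessors.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


# A repeated sample in the second series is absorbed by the warping path.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
print(dtw_distance([0, 0], [1]))
```

The nested loops make the quadratic cost explicit: every cell of the table is visited once and does a constant amount of work.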

Time consumption of the DT2FTW model, for the three datasets.

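| Samples | Mackey-Glass (h:mm:ss) | Dow Jones (h:mm:ss) | NASDAQ (h:mm:ss) |
|---|---|---|---|
| 36 | 0:00:01 | 0:00:08 | 0:00:14 |
| 48 | 0:00:01 | 0:00:09 | 0:00:23 |
| 60 | 0:00:02 | 0:00:12 | 0:00:29 |
| 112 | 0:00:04 | 0:00:21 | 0:00:37 |
| 224 | 0:00:08 | 0:00:32 | 0:01:05 |
| 448 | 0:00:16 | 0:00:55 | 0:02:23 |
| 1,200 | 0:00:29 | 0:01:22 | 0:03:37 |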

### References

1. Wang, X, and Wang, C (2020). Time series data cleaning: a survey. IEEE Access. 8, 1866-1881. https://doi.org/10.1109/ACCESS.2019.2962152
2. Wang, F, Li, M, Mei, Y, and Li, W (2020). Time series data mining: a case study with big data analytics approach. IEEE Access. 8, 14322-14328. https://doi.org/10.1109/ACCESS.2020.2966553
3. Kanungsukkasem, N, and Leelanupab, T (2019). Financial latent Dirichlet allocation (FinLDA): feature extraction in text and data mining for financial time series prediction. IEEE Access. 7, 71645-71664. https://doi.org/10.1109/ACCESS.2019.2919993
4. Stoffer, DS, and Ombao, H (2012). Special issue on time series analysis in the biological sciences. Journal of Time Series Analysis. 33, 701-703. https://doi.org/10.1111/j.1467-9892.2012.00805.x
5. Topol, EJ (2019). High-performance medicine: the convergence of human and artificial intelligence. Nature Medicine. 25, 44-56. https://doi.org/10.1038/s41591-018-0300-7
6. Bengio, Y, Courville, A, and Vincent, P (2013). Representation learning: a review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence. 35, 1798-1828. https://doi.org/10.1109/TPAMI.2013.50
7. Bose, JH, Flunkert, V, Gasthaus, J, Januschowski, T, Lange, D, Salinas, D, Schelter, S, Seeger, M, and Wang, Y (2017). Probabilistic demand forecasting at scale. Proceedings of the VLDB Endowment. 10, 1694-1705. https://doi.org/10.14778/3137765.3137775
8. Sakoe, H, and Chiba, S (1978). Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing. 26, 43-49. https://doi.org/10.1109/TASSP.1978.1163055
9. Tappert, CC, Suen, CY, and Wakahara, T (1990). The state of the art in online handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 12, 787-808. https://doi.org/10.1109/34.57669
10. Kuzmanic, A, and Zanchi, V (2007). Hand shape classification using DTW and LCSS as similarity measures for vision-based gesture recognition system. Proceedings of 2007 International Conference on Computer as a Tool (EUROCON), Warsaw, Poland, pp. 264-269. https://doi.org/10.1109/EURCON.2007.4400350
11. Niennattrakul, V, and Ratanamahatana, CA (2007). On clustering multimedia time series data using k-means and dynamic time warping. Proceedings of 2007 International Conference on Multimedia and Ubiquitous Engineering (MUE), Seoul, Korea, pp. 733-738. https://doi.org/10.1109/MUE.2007.165
12. Bahlmann, C, and Burkhardt, H (2004). The writer independent online handwriting recognition system frog on hand and cluster generative statistical dynamic time warping. IEEE Transactions on Pattern Analysis and Machine Intelligence. 26, 299-310. https://doi.org/10.1109/TPAMI.2004.1262308
13. Kahveci, T, Singh, A, and Gurel, A (2002). Similarity searching for multi-attribute sequences. Proceedings 14th International Conference on Scientific and Statistical Database Management, Edinburgh, UK, pp. 175-184. https://doi.org/10.1109/SSDM.2002.1029718
14. Woo, H, Boccelli, DL, Uber, JG, Janke, R, and Su, Y (2019). Dynamic time warping for quantitative analysis of tracer study time-series water quality data. Journal of Water Resources Planning and Management. 145. article no 04019052
15. Zhang, Y, Qu, H, Wang, W, and Zhao, J (2020). A novel fuzzy time series forecasting model based on multiple linear regression and time series clustering. Mathematical Problems in Engineering. 2020. article no 9546792
16. Rosato, A, and Panella, M (2020). Time series prediction using random weights fuzzy neural networks. Proceedings of 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), pp. 1-6. https://doi.org/10.1109/FUZZ48607.2020.9177651
17. Egrioglu, E, Aladag, CH, and Yolcu, U (2013). Fuzzy time series forecasting with a novel hybrid approach combining fuzzy c-means and neural networks. Expert Systems with Applications. 40, 854-857. https://doi.org/10.1016/j.eswa.2012.05.040
18. Chen, SM (1996). Forecasting enrollments based on fuzzy time series. Fuzzy Sets and Systems. 81, 311-319. https://doi.org/10.1016/0165-0114(95)00220-0
19. Ramadani, K, and Devianto, D (2020). The forecasting model of Bitcoin price with fuzzy time series Markov chain and Chen logical method. AIP Conference Proceedings. 2296. article no 020095
20. Cheng, CH, Cheng, GW, and Wang, JW (2008). Multi-attribute fuzzy time series method based on fuzzy clustering. Expert Systems with Applications. 34, 1235-1242. https://doi.org/10.1016/j.eswa.2006.12.013
21. Cheng, CH, Chen, TL, Teoh, HJ, and Chiang, CH (2008). Fuzzy time-series based on adaptive expectation model for TAIEX forecasting. Expert Systems with Applications. 34, 1126-1132. https://doi.org/10.1016/j.eswa.2006.12.021
22. Tsaur, RC, Yang, JCO, and Wang, HF (2005). Fuzzy relation analysis in fuzzy time series model. Computers & Mathematics with Applications. 49, 539-548. https://doi.org/10.1016/j.camwa.2004.07.014
23. Singh, SR (2007). A simple method of forecasting based on fuzzy time series. Applied Mathematics and Computation. 186, 330-339. https://doi.org/10.1016/j.amc.2006.07.128
24. Aladag, CH, Basaran, MA, Egrioglu, E, Yolcu, U, and Uslu, VR (2009). Forecasting in high order fuzzy times series by using neural networks to define fuzzy relations. Expert Systems with Applications. 36, 4228-4231. https://doi.org/10.1016/j.eswa.2008.04.001
25. Aladag, CH, Yolcu, U, Egrioglu, E, and Dalar, AZ (2012). A new time invariant fuzzy time series forecasting method based on particle swarm optimization. Applied Soft Computing. 12, 3291-3299. https://doi.org/10.1016/j.asoc.2012.05.002
26. Egrioglu, E, Aladag, CH, Yolcu, U, Uslu, VR, and Erilli, NA (2011). Fuzzy time series forecasting method based on Gustafson-Kessel fuzzy clustering. Expert Systems with Applications. 38, 10355-10357. https://doi.org/10.1016/j.eswa.2011.02.052
27. Egrioglu, E (2014). PSO-based high order time invariant fuzzy time series method: application to stock exchange data. Economic Modelling. 38, 633-639. https://doi.org/10.1016/j.econmod.2014.02.017
28. Corradini, A (2001). Dynamic time warping for off-line recognition of a small gesture vocabulary. Proceedings IEEE ICCV Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, Vancouver, Canada, pp. 82-89. https://doi.org/10.1109/RATFG.2001.938914
29. Wu, D, and Mendel, JM (2019). Recommendations on designing practical interval type-2 fuzzy systems. Engineering Applications of Artificial Intelligence. 85, 182-193. https://doi.org/10.1016/j.engappai.2019.06.012
30. Mendel, JM (2013). On KM algorithms for solving type-2 fuzzy set problems. IEEE Transactions on Fuzzy Systems. 21, 426-446. https://doi.org/10.1109/TFUZZ.2012.2227488