In this paper we suggest to use special “fuzzy” techniques for detection of structural breaks in time series, namely the fuzzy (F)-transform and one method of fuzzy natural logic (FNL). The idea is based on application of the F^{1}-transform which makes it possible to estimate effectively slope of time series (ignoring its possible volatility) and its evaluation by a suitable evaluative linguistic expression. The method is computationally very effective.
This paper addresses interesting problem related to the area of mining information from time series (cf. [1]), namely detection of structural breaks in the latter. These are unexpected shifts in the course of the time series that can be caused, e.g., by economic development, global shifts in capital and labor, changes in resource availability due to war or natural disaster, discovery or depletion of natural resources, or a change in political system, etc. From the general point of view, a structural change is possible because of the dynamic nature of the economic system and it occurs when an industry or market changes the way how it functions or operates. Structural breaks are quite frequent and concern not only time series analysis but also general signal processing, for example searching faults in a wire, irregularities in heart functioning, etc.
There are a lot of methods for detection of structural breaks in time series. Most of them are statistical, for example [2–4]. There are also non-statistical approaches, e.g., [5] where the authors suggest to use genetic algorithms.
In this paper we suggest to use special “fuzzy” techniques for detection of structural breaks in time series, namely the fuzzy (F)-transform and one method of fuzzy natural logic (FNL). Application of these techniques to the analysis and forecasting of time series was described in several papers and also in the book [6]. The first application of the F-transform to detection of changes in a signal was described in [7] where the problem was to recognize network attacks. Let us recall that the F-transform makes it possible to analyze time series, extract its trend and trend-cycle, and to detect specific areas, for example intervals of monotonous behavior. The FNL stems from these results and provides methods for forecasting of the future behavior of time series, automatic generation of linguistic description of specific parts of it, and for obtaining other kinds of information.
One of the important outcomes of the F-transform is its ability to estimate derivatives of the time series (taken as a function) over a larger area that can be specified imprecisely, and in which the volatility is ignored. Then, using methods of FNL, we can generate comments in natural language to various characteristic features of the time series (cf. [8, 9]).
In this paper, we suggest methods how the above mentioned techniques can be used for detection of the structural breaks in time series. We will apply estimation of the first derivative and the theory of evaluative linguistic expressions. The main outcome of this method is its relative simplicity, transparency and computation effectiveness because the complexity of the F-transform is linear.
The structure of the paper is the following. In Section 2, we will briefly overview the main principles of the fuzzy transform with the focus on those parts that are used in our paper. In Section 3, we will mention FNL and especially the theory of evaluative linguistic expressions that is applied in the sequel. In Section 4 we describe how the F-transform is applied in the analysis of time series. Finally, Section 5 is devoted to description of the method for finding structural breaks in time series. The paper is finished by demonstration of the method on four concrete examples of structural breaks and its comparison with statistical methods.
By a fuzzy set A ⫇ U we understand a function A : U → [0, 1] where U is an ordinary set called the universe. The [0, 1] is the support of a suitable algebra. In this paper we will consider the standard Łukasiewicz MV-algebra (see [10, 11]).
The fuzzy F-transform is a universal approximation technique introduced in [12, 13]. Its fundamental idea is to map a bounded continuous function f : [a, b] → ℝ to a finite vector
The power of the F-transform stems from the proof of its approximation power, its ability to filter out high frequencies and reduce noise [14], and the possibility to estimate average values of derivatives over a given imprecisely specified area [15]. This property is especially useful in mining information from time series.
The first step of the F-transform procedure is to form a fuzzy partition of the domain [a, b]. It consists of a finite set of fuzzy sets
defined over nodes
The properties of the fuzzy sets from are specified by five axioms, namely: normality, locality (bounded support), continuity, unimodality, and orthogonality. The support of k-the fuzzy set A_{k}, k = 1, . . . , n – 1, is formed by the interval [c_{k}_{–1}, c_{k}] ∪ [c_{k}, c_{k}_{+1}]. A fuzzy partition is called h-uniform if the nodes c_{0}, . . . , c_{n} are h-equidistant, i.e., for all k = 0, . . . , n – 1, c_{k}_{+1} = c_{k} + h, where h = (b – a)/n and the fuzzy sets A_{1}, . . . , A_{n}_{–1} are shifted copies of a generating function A : [−1, 1] –→ [0, 1] such that for all k = 1, . . . , n–1
(for k = 0 and k = n we consider only half of the function A, i.e. restricted to the interval [0, 1] and [−1, 0], respectively). Membership functions A_{0}, . . . , A_{n} of the fuzzy sets forming the fuzzy partition are usually called basic functions.
The orthogonality is formally defined as
(
Once the fuzzy partition A_{0}, . . . , is determined, we define a direct F-transform of a continuous function f as a vector
Clearly, the F_{k}[f] component is a weighted average of the functional values f(x) where weights are the membership degrees A_{k}(x). The inverse F-transform of f with respect to
The inverse F-transform f̂ has the following properties:
The sequence of inverse F-transforms {f̂_{n}} determined by a sequence of uniform fuzzy partitions based on uniformly distributed nodes with h = (b – a)/n uniformly converges to f for n→∞.
The F-transform is linear, i.e., if f = αu + βv then f̂ = αû + βv̂.
The F-transform introduced above is F^{0}-transform (i.e., zero-degree F-transform). Its components are real numbers. If we replace them by polynomials of arbitrary degree m ≥ 0, we arrive at the higher degree F^{m} transform. This generalization has been in detail described in [13]. Let us remark that the F^{1} transform enables to estimate also derivatives of the given function f as weighted average values over a specified area.
The direct F^{1}-transform of f with respect to A_{1}, . . . , A_{n}_{–1} is a vector
with the coefficients
Note that
We may also use the F^{2}-transform. Its components are the functions
(provided that the basic functions are triangles) and the coefficient
If f is four-times continuously differentiable on [a, b] then for each k = 1, . . . , n – 1,
Thus, each F-transform component provides a weighted average of values of the function f in the area around the node c_{k} (
The fuzzy natural logic is a class of formal logical theories developed on unique grounds whose goal is to develop a mathematical model of special human reasoning schemes that employ natural language. So far, the following theories are included:
A formal theory of evaluative linguistic expressions [16].
A formal theory of fuzzy/linguistic IF-THEN rules and approximate reasoning [17–20].
A formal theory of intermediate and generalized fuzzy quantifiers [21–25].
In this paper, we will apply only the theory of evaluative linguistic expressions.
Of course, the fuzzy IF-THEN rules or fuzzy and generalized quantifiers are studied by many authors, for example [26, 27] or [28–31]. Though a lot of their results are source of inspiration for the development of FNL, they lack a sound formal theory and so, we do not consider them as part of FNL.
Evaluative linguistic expressions are special expressions of natural language that have the general form
where 〈TE-adjective〉 is one of the canonical adjectives “small, medium, big”, or “zero” and possibly also a symmetric fuzzy number. Instead of the canonical adjectives, we can use other kinds of adjectives having similar nature (they belong to specific classes such as gradable or evaluative adjectives), for example “shallow, medium deep, deep”, “weak, medium strong, strong”, etc. The 〈linguistic hedge〉 is a special expression that makes the meaning of the 〈TE-adjective〉 more specific. Quite often it is represented by an intensifying adverb such as “very, roughly, approximately, significantly”, etc. Linguistic hedges can have narrowing (“extremely, significantly, very, typically”) and widening effect (“more or less, roughly, quite roughly, very roughly”) on the meaning of the 〈TE-adjective〉.
If 〈linguistic hedge〉 is not present (expressions such as “weak, large”, etc.) then we take it as presence of empty linguistic hedge. Thus, all the simple evaluative expressions have the same form (
We distinguish abstract evaluative expressions from more specific evaluative predications. The latter are expressions of natural language of the form ‘X is ’ where is an evaluative expression and X is a variable which stands for objects, for example “degrees of temperature, height, length, speed”, etc. Examples are “temperature is high”, “speed is extremely low”, “quality is very high”, etc. In general, the variable X represents certain features of objects such as “size, volume, force, strength,” etc. and so, its values are often real numbers.
When using evaluative expressions, we must first specify the linguistic context. In our theory it is an interval [v_{L}, v_{S}] ∪ [v_{S}, v_{R}] determined by a triple of numbers w = 〈v_{L}, v_{S}, v_{R}〉 where v_{L} is the leftmost typically small value, v_{S} is typically medium value and v_{R} is the rightmost typically big value. For example, when speaking about temperature of water, we may set v_{L} = 15°C (water temperature in the crane), v_{S} = 50°C and v_{R} = 100°C. In the sequel, we will consider a set of all linguistic contexts
The element x belongs to a context w ∈ W if x ∈ [v_{L}, v_{R}] and we may write x ∈ w.
The meaning of an evaluative linguistic expression or predication is represented by its intension
where ℱ(ℝ) is a set of all fuzzy sets on ℝ. This is a function that to each context w ∈ W assigns the extension Ext_{w}(X is ) where the latter is a specific fuzzy set on ℝ. Example of extensions of several evaluative linguistic expressions is in Figure 1. Let us emphasize that their shapes have been established on the basis of logical analysis of the meaning of the corresponding evaluative expressions (for the details, see [16]).
Moreover, there is a natural ordering ⋘ of these expressions obtained as lexicographic ordering based on the natural ordering
and the following ordering of hedges:
The linear lexicographic ordering ⋘ of simple evaluative expressions (
The function due to the following definition is needed in the sequel.
The function of local perception is a partial function LPerc : ℝ × W –→ EvExpr defined as follows:
If (
A time series is a stochastic process (see [32, 33]) X : Q × Ω −→ ℝ where Ω is a set of elementary random events and Q = {0, . . . , p} ⊂ ℕ is a finite set whose elements are interpreted as time moments. Statistical models assume that each X(t), t ∈ Q is a random variable having a specific distribution function. This assumption is the basis of many kinds of probabilistic methods for analysis and forecasting of time series.
Fuzzy techniques stem from the following decomposition model:
where Tr is the trend, C is a cyclic component, S is a seasonal component that is a mixture of periodic functions. Note that Tr, C, S are ordinary functions not being stochastic. Only R is a random noise, i.e., a sequence of independent random variables R(t) such that for each , the R(t) has zero mean and finite variance. Let us remark that trend and cyclic component are quite often joined into one component TC(t) = Tr(t) + C(t) called trend-cycle.
Let us now assume (without loss of generality) that the time series (
respectively (via the equality T = 2π/λ).
The following theorem demonstrates the power of the F-transform for time series analysis.
Let X(t) be realization of the stochastic process (
for t ∈ [c_{1}, c_{n}_{–1}], where D for d ≥ 2 is a certain small number and ω(h,TC) is a modulus of continuity of TC w.r.t. h.
The precise form of D and the detailed proof of this theorem can be found in [14, 34]. The theorem holds both for F^{0} as well as for F^{1}-transform. It follows from it that the F-transform makes it possible to filter out frequencies higher than a given threshold and also reduces the noise R.
The periodicity T_{q} can be found using the well known periodogram — see [32, 33]. Then, by setting a proper fuzzy partition due to Theorem 3, we first compute the F-transform of X(t) (either zero or first degree)
Then the estimation of the trend-cycle TC is obtained using the inverse F-transform: TC(t) ≈ X̂ (t).
Theorem 3 tells us how the distance between nodes of the fuzzy partition should be set to obtain good estimation of the trend-cycle. Namely, we put it equal to h = dT_{q} where d is in practice equal to 1 or 2. The fuzzy partition is constructed over an interval [0, p] of real numbers, i.e., it is not constructed over the discrete set of natural numbers. Consequently, we obtain X̂ ≈ TC which means that we can estimate the trend cycle TC with high fidelity. Of course, the estimation depends on the course of TC and it is the better the smaller is the modulus of continuity ω(h,TC) (which is usually the case).
In this section, we will briefly describe the principles of linguistic evaluation of the (direction of) trend of time series in a local time slot (for example, quarter of year, production period, etc.). Such evaluation is later used in the detection of structural breaks.
The trend is estimated using the coefficients
First, we must specify a context in which the (direction of) the trend is evaluated. This is clear because, for example, the increase of temperature in Sweden by 3°C in, say, 2 hours can be taken as a “sharp increase”, while the same in Africa is a “very small increase”. Therefore, must specify the three distinguished values v_{L}, v_{S}, v_{R} of the tangent where v_{R} is the extreme increase (decrease), while the smallest one is typically (but not always) v_{L} = 0. The typical medium value v_{S} is determined analogously as v_{R}. The result is the linguistic context w_{tg} = 〈v_{L}, v_{S}, v_{R}〉 for the trend. Moreover, because we distinguish between increase and decrease of time series, we must also distinguish positive context
The trend can be evaluated by the following sentence:
where
and
The empty hedge ∅︀ is used if no more specific characterization of the direction of the trend is required. Note that (
The evaluative predication (
where is an evaluative expression in which the TE-adjective is canonical (i.e., either of “small, medium, big” or “zero”). This leads to the following definition.
Let X be a time series (
Note that (
From (
Recall that (
The fundamental idea for detection of structural breaks consists in the application of the F^{1}-transform over properly defined fuzzy partition. Note that the supports of the basic functions cannot be too wide because then the structural breaks might disappear among other values of the time series X(t) (cf. Theorem 3).
Let the realization {X(t) | } of the time series X be given. Let w_{tg} be a context for the trend of X. A structural break is the sudden change of the values X(t) in an interval such that ± in (
As mentioned, the interval is supposed to be “short”. According to the experiments, its proper value should be {4, . . . , 10} (by | · | we denote length of the corresponding interval).
The method for detection of structural breaks using fuzzy modeling methods is the following:
Specify the distance h between nodes of the fuzzy partition (cf. Subsection 2.1). This means that width of the basic functions is equal to 2h. Recall that the fuzzy partition is constructed over the interval .
Set the context w_{tg} for evaluation of the trend in areas determined by the basic functions due to item 1. This can be determined as follows: v_{L} = 0, v_{R} = (±σ)/(2h) where σ is the standard deviation of the whole time series. Note that it covers both normal course of the time series as well as parts of the structural breaks. Alternatively, we can set v_{R} on the basis of the knowledge of the character of time series or, replace σ by
Compute the F^{1}-transform
Locate all components F_{k}[X], where the coefficient
To make the location more convincing, it is possible to shift the corresponding basic function(s) A_{k} slightly to the left or right. Let us emphasize, however, that the shift can be only little and, as the whole procedure is very robust, we do not need to find the maximal value of
The located components point to the areas of structural breaks in the time series. The areas are determined by the corresponding fuzzy sets (basic functions) from the fuzzy partition (
The method presented in the previous subsection is demonstrated on examples. We prepared 4 artificial time series with typical structural breaks—see Figure 2. The time series were obtained by combination of few real time series taken from the INDUSTRY subset of time series on a monthly basis from the M3-Competition published on the Internet.
Each time series in Figure 3 is depicted together with the fuzzy partition (Figure 3 was prepared using the experimental software FT-studio which makes it possible to apply fuzzy transform to a function given either by a precise formula, or by the data. The program was developed in the Institute for Research of Applications of Fuzzy Modeling of the University of Ostrava, Czech Republic. Its author is Radek Valášek). All the fuzzy partitions are equidistant with h = 4, formed by triangular basic functions with the support equal to 2h. The linguistic contexts of each time series were obtained by setting ±v_{R} = σ/(2h), v_{L} = 0 and v_{S} = 0.4 v_{R}. The standard deviations are σ_{(}_{a}_{)} = 978, σ_{(}_{b}_{)} = 1711, σ_{(}_{c}_{)} = 6307, σ_{(}_{d}_{)} = 4921, respectively. The results are depicted in Figure 3.
One can see that the places of structural breaks are well detected. Moreover, we also immediately know whether it is jump up or down (in correspondence with the sign of β^{1}):
There are 2 structural breaks determined by 3 basic functions. The values of the corresponding coefficients are
Linguistic evaluation of the course of time series between the structural breaks (the interval [41, 77]) is “stagnating” (Evaluation of the course of time series was obtained using the software LFL Forecaster developed in the Institute for Research of Applications of Fuzzy Modeling of the University of Ostrava, Czech Republic. Its author is Viktor Pavliska).
There are 4 structural breaks determined by 5 basic functions. The values of the corresponding coefficients are
Linguistic evaluations of the course of time series between the structural breaks are in both intervals [20, 44] and [52, 108] “stagnating”.
There are 2 narrow structural breaks, each determined by 2 basic functions, one for shift up and the second one for shift down. The values of the corresponding coefficients are
There are 2 structural breaks determined by 2 basic functions. The values of the corresponding coefficients are
Linguistic evaluation of the course of time series between the structural breaks (the interval [49, 89]) is also “stagnating” which means that the volatility does not lead to change of the trend. On the other hand, inside this area places with specific course can be detected, e.g., “clear decrease” in [52, 64] and “clear increase” in [64, 75].
For comparison, we also applied the classical Chow test on the above four artificial time series. It is based on statistical test of significance of non-zero change of regression coefficients
The null hypothesis is H_{0} : β_{0} = β_{i}. If it is rejected then there is a structural break detected. Note that Chow test requires longer time series.
The results of the Chow test applied to the above 4 artificial time series is in Figure 4. One can see that the detection using this method is not convincing. As a matter of fact, all p-values are significant which means that the structural breaks are detected in each time moment. Of course, it does not mean that other statistical methods cannot be successful. At any case, however, we may conclude that our method has the power to detect structural breaks in time series and sometimes is able to give even more convincing results than the standard statistical techniques.
In this paper we suggested the method how structural breaks in time series can be detected. The method is based on application of the first degree fuzzy transform that provides estimation of the slope of time series in an imprecisely determined area. The structural breaks are easily detected if we use fuzzy partition whose basic functions are sufficiently narrow. The method is robust w.r.t. the starting position of the fuzzy partition. It is also important to emphasize that the detection is extremely fast because time complexity of the fuzzy transform is linear.
Finally let us remark that this method is related to detection of intervals of monotonous course of time series described in [9]. Using this algorithm, we can obtain information about length and character of the behavior of the time series between structural breaks.
The paper has been supported by the project IT4I XS (LQ1602).
No potential conflict of interest relevant to this article was reported.
Shapes of extensions of few evaluative expressions in the context 〈0, 0.5, 1〉. The hedges are {Extremely, Significantly, Very, empty hedge} for “small” and “big” and {More-or-Less, Roughly, Quite Roughly, Very Roughly} for “small”, and “big”. Extension of “medium” is not modified but it can be modified by widening hedges.
Artificial time series with four types of structural breaks.
Artificial time series from
The result of Chow test applied to the artificial time series from
The translation table between special and canonical evaluative expressions
Special hedge, direction | from ( |
---|---|
stagnating | extremely small or zero |
negligibly | significantly small |
slightly | very small |
somewhat | rather small |
clearly | very roughly small or medium |
roughly | very roughly big |
fairly large | roughly big |
quite large | rather big |
large | big |
sharply | very big |
significantly | significantly big |
huge | extremely big |