JDAIP  Vol.5 No.3 , August 2017
Regression Analysis of a Kind of Trapezoidal Fuzzy Numbers Based on a Shape Preserving Operator
Author(s) Jie Sun, Qiujun Lu
ABSTRACT
Fuzzy regression offers additional tools for handling imprecise or vague problems. Traditional fuzzy regression is built on triangular fuzzy numbers, which can be represented by trapezoidal fuzzy numbers. In many situations the independent variables, their coefficients and the dependent variable in a regression model are all fuzzy numbers, and $T_W$, the shape preserving operator, is the only T-norm that induces a shape preserving multiplication of LL-type fuzzy numbers. In this paper, we therefore propose a new fuzzy regression model based on LL-type trapezoidal fuzzy numbers and $T_W$. Firstly, we introduce the basic fuzzy set theory, the basic arithmetic propositions of the shape preserving operator and a new distance measure between trapezoidal fuzzy numbers. Secondly, we investigate the specific algorithm for the FIFCFO model (fuzzy input-fuzzy coefficient-fuzzy output model) and introduce three goodness-of-fit criteria: Error Index, Similarity Measure and Distance Criterion. Thirdly, we use a designed data set and two reference data sets to compare our proposed model with the reference models and assess their goodness of fit with the above three criteria. Finally, we conclude that, by these three criteria, our proposed model is reasonable and has better prediction accuracy than the reference models, but lacks robustness. We can therefore extend the traditional fuzzy regression model to our proposed model.

1. Introduction

Fuzzy regression, one of the most popular methods of modeling and prediction, is an important statistical tool for evaluating the functional relationship between a set of explanatory variables and an explained variable (Montgomery and Peck, 2006 [1]). It shows particular advantages in analyzing complex systems where vague human subjective judgment does not work well, such as economic, social and environmental systems. In most fuzzy regression models, deviations between the observed and estimated values are assumed to be due to random errors, as in the classical linear regression model. In the real world, however, imprecise information, incomplete knowledge, unobtainable data and an indeterminable underlying model can lead to larger errors.

Therefore, fuzzy set theory, introduced by Zadeh (1965) [2], provides appropriate tools for regression analysis when the relationship between variables is vaguely defined or observations are recorded imprecisely. Since the introduction of fuzzy set theory, fuzzy regression techniques have fallen into two distinct areas. The first approach, possibilistic regression, proposed by Tanaka et al. (1982) [3], aims at minimizing the total spread of the output. In this case, the problem of fitting a fuzzy model can be viewed as a linear programming problem. Still in this area, Tanaka and Ishibushi (1991) [4] extended the approach to deal with interactive fuzzy parameters, and several further extensions have been proposed in the fuzzy literature [5] [6] [7] [8]. Five years later, Celmins (1987) [9] and Diamond (1988) [10] put forward the second approach, fuzzy least squares regression, which aims to minimize the overall squared error between the observed and the estimated values. Hong et al. (2001) [11] studied fuzzy least squares linear regression by using shape preserving operations. Moreover, several variants of this approach [12] [13] [14] [15] [16] have been used in fuzzy linear regression.

Both of the above approaches are widely used in fuzzy linear regression, but both are sensitive to outliers. In such cases, least absolute deviation (LAD), as an alternative to least squares deviation (LSD), is preferred as a robust method. In particular, when outliers occur in the response variable, the LAD estimator is more robust than the LSD estimator (Stahel and Weisberg, 1991 [17]). Building on this method, many researchers have further extended fuzzy linear regression models. However, each method has its strengths: when there are no outliers, LSD performs similarly to LAD, and is even better in that it yields a more stable and unique solution [18] [19] [20]. Besides, Yager (1980) [21] proposed the centroid method to translate fuzzy numbers into crisp numbers. Based on this, Zhang (2012) [22] proposed a statistical analysis of the fuzzy regression model based on the centroid method.

In the development of fuzzy linear regression models, a new problem gradually emerged: the usual multiplication changes the shape of fuzzy numbers in some cases. On the one hand, Hojati et al. (2005) [23] proposed evaluating the estimators of fuzzy outputs and parameters by fixing an α-level set in fuzzy multiplication, but the estimators of fuzzy outputs then depend on the value of α, which is unknown. On the other hand, the shape preserving operator $T_W$ was proved by Hong (2001) [24] to be the only T-norm that induces a shape preserving multiplication of LL-fuzzy numbers. Mesiar (1997) [25] and Hong et al. (1997) [26] made further studies based on $T_W$, which can efficiently control the shape of estimators and decrease the risk of bias caused by taking the minimum (Hong et al., 2001) [27].

However, traditional fuzzy regression is still based on triangular fuzzy numbers, or on models in which only some of the inputs, coefficients and outputs are fuzzy. Trapezoidal fuzzy numbers, which can represent other types of fuzzy numbers, play an important role among fuzzy numbers [28] [29] [30], and some researchers have studied fuzzy linear regression based on trapezoidal fuzzy numbers [31] [32] [33]. The distance between trapezoidal fuzzy numbers is also an important research topic in fuzzy set theory and a basis for many related applications, and many researchers have investigated it and obtained meaningful conclusions [34] [35] [36] [37]. Taking advantage of LSD and trapezoidal fuzzy numbers, and building on the paper by Wang and Lu (2016) [33], we first introduce the basic fuzzy set theory, the basic arithmetic propositions of $T_W$ and a new distance between trapezoidal fuzzy numbers. We then propose a new model whose coefficients are trapezoidal fuzzy numbers, based on the shape preserving operator $T_W$, to extend fuzzy regression for sample sets without outliers, and we investigate the model algorithm and analyze the model complexity.

The structure of this paper is as follows. In Section 2, we introduce some basic notions and prove the arithmetic properties of $T_W$ and of our proposed distance. In Section 3, we propose a fuzzy regression model based on least squares deviation with FIFCFO (fuzzy input-fuzzy coefficient-fuzzy output), describe its steps in detail, and introduce the error measures used to evaluate the model: error index, similarity measure and distance criterion. In Section 4, we use three examples to illustrate our proposed model and compare it with existing fuzzy regression models. In the last section, we give an overall analysis of our proposed model together with the results and conclusions.

2. Preliminary

For the sake of rigor and clarity, the basic fuzzy set theory and the basic arithmetic propositions of the shape preserving operator used in this paper are introduced in this section. Throughout this paper, we use R to denote the set of all real numbers and FN to denote the set of all fuzzy numbers in R.

Definition 1. (Zadeh, 1965 [2]). Suppose that $\tilde{A}$ is a fuzzy set in R and satisfies the following properties:

1) Regularity: $\exists x_0 \in R$, $\tilde{A}(x_0) = 1$.

2) Bounded closed interval: $\forall \lambda \in (0, 1]$, $A_\lambda = [A_\lambda^-, A_\lambda^+]$ is a bounded closed interval.

Then we call $\tilde{A}$ a fuzzy number in R.

Definition 2. (Hu, 2010 [38]). Let $\tilde{A}$ be a fuzzy number in R. If $\mathrm{supp}\, A \subseteq [0, +\infty)$, then we call $\tilde{A}$ a positive fuzzy number, and denote the set of all positive fuzzy numbers in R by PFN. If $\mathrm{supp}\, A \subseteq (-\infty, 0]$, then we call $\tilde{A}$ a negative fuzzy number, and denote the set of all negative fuzzy numbers in R by NFN.

Definition 3. (Hu, 2010 [38]). Suppose that the membership function of the LR-type fuzzy number $\tilde{A}$ is defined as follows:

$$\tilde{A}(x) = \begin{cases} L\left(\dfrac{a - x}{\alpha_A}\right), & x \le a \\[2mm] R\left(\dfrac{x - a}{\beta_A}\right), & x \ge a \end{cases} \tag{1}$$

where $L$, $R$ satisfy

1) $L, R: (-\infty, +\infty) \to [0, 1]$

2) $L(x) = L(-x)$, $R(x) = R(-x)$

3) $L(0) = R(0) = 1$

4) $L(x)$ and $R(x)$ are non-increasing functions on $[0, +\infty)$.

Here, $a$ is the center point, $\alpha_A$ is the width of the left side and $\beta_A$ is the width of the right side of the fuzzy number $\tilde{A}$, with $a \in R$ and $\alpha_A, \beta_A \ge 0$. Besides, we call $\tilde{A}$ an LL-fuzzy number when $L(x) = R(x)$.

Suppose $\tilde{A} = (a_1, a_2, \alpha_A, \beta_A)$ is a trapezoidal fuzzy number in R ($\alpha_A, \beta_A \ge 0$, $a_2 \ge a_1$). If the membership function of $\tilde{A}$ can be represented as in Definition 3, then we call $\tilde{A}$ an LL-trapezoidal fuzzy number and denote the set of all LL-trapezoidal fuzzy numbers by $TFN_{LL}$. We further let $PNTFN_{LL} = PTFN_{LL} \cup NTFN_{LL}$, where $PTFN_{LL}$ and $NTFN_{LL}$ stand for the positive and the negative members of $TFN_{LL}$ in R, respectively.

Definition 4. (Hu, 2010 [38]). Suppose that, for any $a, b, c, d \in [0, 1]$, the mapping $T: [0, 1] \times [0, 1] \to [0, 1]$ satisfies the following conditions:

1) commutative law: $T(a, b) = T(b, a)$

2) associative law: $T(T(a, b), c) = T(a, T(b, c))$

3) monotonicity: $a \le c,\ b \le d \Rightarrow T(a, b) \le T(c, d)$

4) boundary condition: $T(1, a) = a$.

Then we call $T$ a T-norm on $[0, 1]$.

Proposition 1. (Hu, 2010 [38]) If $T$ is a T-norm on $[0, 1]$, it is generally acknowledged that $T_W \le T \le T_M$, where

$$T_W(a, b) = \begin{cases} 0, & \max(a, b) < 1 \\ \min(a, b), & \text{otherwise} \end{cases} \tag{2}$$

$$T_M(a, b) = \min(a, b) \tag{3}$$

Here $T_W$ is called the drastic product and $T_M$ is called the minimum operator.
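The ordering in Proposition 1 can be checked numerically. The following is a minimal sketch, assuming that $T_M$ is the minimum T-norm; the function names `t_w`, `t_m` and `t_p` are illustrative and do not come from the paper, and the product T-norm is used only as one example of an intermediate T-norm.

```python
def t_w(a: float, b: float) -> float:
    """Drastic product T_W of Eq. (2): min(a, b) if one argument equals 1, else 0."""
    return min(a, b) if max(a, b) == 1.0 else 0.0


def t_m(a: float, b: float) -> float:
    """Minimum T-norm, assumed here to be the T_M of Proposition 1."""
    return min(a, b)


def t_p(a: float, b: float) -> float:
    """Product T-norm, used only as an example of a T-norm between T_W and T_M."""
    return a * b


# Check T_W <= T <= T_M on a coarse grid of [0, 1] x [0, 1].
grid = [i / 10 for i in range(11)]
ordered = all(
    t_w(a, b) <= t_p(a, b) <= t_m(a, b) for a in grid for b in grid
)
print("T_W <= T_P <= T_M on the grid:", ordered)
```

A grid check of this kind is only a sanity test for one particular T-norm, not a proof of the general inequality.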

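Definition 5. (Hu, 2010 [38]). Let $\tilde{A}, \tilde{B} \in FN$, let $*$ stand for an arithmetic operation on R, such as $+$, $-$, $\times$, and let $\circledast$ stand for the corresponding arithmetic operation on FN, such as $\oplus$, $\ominus$, $\otimes$:

$$(\tilde{A} \circledast_T \tilde{B})(z) = \sup_{x * y = z} T(\tilde{A}(x), \tilde{B}(y)) \tag{4}$$

Hence, we use $\tilde{A} \oplus_W \tilde{B}$, $\tilde{A} \ominus_W \tilde{B}$ and $\tilde{A} \otimes_W \tilde{B}$ to stand for the extended addition, extended subtraction and extended multiplication of $T_W$, respectively.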

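Proposition 2. Let $\tilde{A} = (a_1, a_2, \alpha_A, \beta_A)_{LL}$ and $\tilde{B} = (b_1, b_2, \alpha_B, \beta_B)_{LL}$ be LL-trapezoidal fuzzy numbers and $k \in R$, so we can get

1) $k \otimes_W \tilde{A} = \begin{cases} (k a_1, k a_2, k \alpha_A, k \beta_A)_{LL}, & k \ge 0 \\ (k a_2, k a_1, |k| \beta_A, |k| \alpha_A)_{LL}, & k < 0 \end{cases}$

2) $\tilde{A} \oplus_W \tilde{B} = (a_1 + b_1, a_2 + b_2, \max(\alpha_A, \alpha_B), \max(\beta_A, \beta_B))_{LL}$ (5)

3) $\tilde{A} \ominus_W \tilde{B} = (a_1 - b_2, a_2 - b_1, \max(\alpha_A, \beta_B), \max(\beta_A, \alpha_B))_{LL}$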
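The three rules of Proposition 2 translate directly into code. The sketch below is illustrative only: the `Trap` tuple and the function names `scale`, `add` and `sub` are hypothetical, and a trapezoidal fuzzy number is stored as (a1, a2, alpha, beta), that is, left core end, right core end, left spread and right spread.

```python
from typing import NamedTuple


class Trap(NamedTuple):
    """LL-trapezoidal fuzzy number (a1, a2, alpha, beta)."""
    a1: float
    a2: float
    alpha: float
    beta: float


def scale(k: float, A: Trap) -> Trap:
    """Scalar multiple under T_W; a negative k swaps the core ends and the spreads."""
    if k >= 0:
        return Trap(k * A.a1, k * A.a2, k * A.alpha, k * A.beta)
    return Trap(k * A.a2, k * A.a1, abs(k) * A.beta, abs(k) * A.alpha)


def add(A: Trap, B: Trap) -> Trap:
    """T_W-based addition: cores add, spreads take the maximum."""
    return Trap(A.a1 + B.a1, A.a2 + B.a2,
                max(A.alpha, B.alpha), max(A.beta, B.beta))


def sub(A: Trap, B: Trap) -> Trap:
    """T_W-based subtraction: B's right side meets A's left side and vice versa."""
    return Trap(A.a1 - B.a2, A.a2 - B.a1,
                max(A.alpha, B.beta), max(A.beta, B.alpha))


A = Trap(2, 3, 0.5, 1)
B = Trap(1, 2, 0.25, 0.5)
print(add(A, B))     # cores (3, 5), spreads (0.5, 1)
print(sub(A, B))     # cores (0, 2), spreads (0.5, 1)
print(scale(-2, A))  # cores (-6, -4), spreads (2, 1)
```

Note that, unlike min-based arithmetic, the spreads do not accumulate under addition: they are the maximum of the operands' spreads, which is what keeps the result in the same LL family.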

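Proposition 3. Let $\tilde{A} = (a_1, a_2, \alpha_A, \beta_A)_{LL}$, $\tilde{B} = (b_1, b_2, \alpha_B, \beta_B)_{LL} \in PNTFN_{LL}$, so we can get

$$\tilde{A} \otimes_W \tilde{B} = \begin{cases} (a_1 b_1, a_2 b_2, \max(\alpha_A b_1, \alpha_B a_1), \max(\beta_A b_2, \beta_B a_2))_{LL}, & \tilde{A}, \tilde{B} \in PTFN_{LL} \\ (a_2 b_2, a_1 b_1, \max(\beta_A |b_2|, \beta_B |a_2|), \max(\alpha_A |b_1|, \alpha_B |a_1|))_{LL}, & \tilde{A}, \tilde{B} \in NTFN_{LL} \\ (a_1 b_2, a_2 b_1, \max(\alpha_A b_2, \beta_B |a_1|), \max(\beta_A b_1, \alpha_B |a_2|))_{LL}, & \tilde{A} \in NTFN_{LL},\ \tilde{B} \in PTFN_{LL} \end{cases} \tag{6}$$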

Proof. Let $\tilde{A} = (a_1, a_2, \alpha_A, \beta_A)_{LL}$, $\tilde{B} = (b_1, b_2, \alpha_B, \beta_B)_{LL}$, and let their membership functions satisfy Definition 3. We consider the case of $\tilde{A}, \tilde{B} \in PTFN_{LL}$, which means $a_1, a_2, b_1, b_2 > 0$. Then,

1) For $z \le a_1 b_1$,

$$\begin{aligned} (\tilde{A} \otimes_W \tilde{B})(z) &= \sup_{xy = z} T_W(\tilde{A}(x), \tilde{B}(y)) = \max\left(\tilde{A}\left(\frac{z}{b_1}\right), \tilde{B}\left(\frac{z}{a_1}\right)\right) \\ &= \max\left(L\left(\frac{a_1 - z/b_1}{\alpha_A}\right), L\left(\frac{b_1 - z/a_1}{\alpha_B}\right)\right) \\ &= \max\left(L\left(\frac{a_1 b_1 - z}{\alpha_A b_1}\right), L\left(\frac{a_1 b_1 - z}{\alpha_B a_1}\right)\right) \\ &= L\left[\frac{a_1 b_1 - z}{\max(\alpha_A b_1, \alpha_B a_1)}\right] \end{aligned}$$

2) For $a_1 b_1 \le z \le a_2 b_2$,

$$\begin{aligned} (\tilde{A} \otimes_W \tilde{B})(z) &= \sup_{xy = z} T_W(\tilde{A}(x), \tilde{B}(y)) \\ &= \max\left(\max_{a_1 \le p \le a_2 b_2 / b_1} \tilde{A}(p),\ \max_{b_1 \le q \le b_2 a_2 / a_1} \tilde{B}(q)\right) = \max(1, 1) = 1 \end{aligned}$$

3) For $z \ge a_2 b_2$,

$$\begin{aligned} (\tilde{A} \otimes_W \tilde{B})(z) &= \sup_{xy = z} T_W(\tilde{A}(x), \tilde{B}(y)) = \max\left(\tilde{A}\left(\frac{z}{b_2}\right), \tilde{B}\left(\frac{z}{a_2}\right)\right) \\ &= \max\left(L\left(\frac{z/b_2 - a_2}{\beta_A}\right), L\left(\frac{z/a_2 - b_2}{\beta_B}\right)\right) \\ &= \max\left(L\left(\frac{z - a_2 b_2}{\beta_A b_2}\right), L\left(\frac{z - a_2 b_2}{\beta_B a_2}\right)\right) \\ &= L\left[\frac{z - a_2 b_2}{\max(\beta_A b_2, \beta_B a_2)}\right] \end{aligned}$$

It follows that $\tilde{A} \otimes_W \tilde{B} = (a_1 b_1, a_2 b_2, \max(\alpha_A b_1, \alpha_B a_1), \max(\beta_A b_2, \beta_B a_2))_{LL}$ for $\tilde{A}, \tilde{B} \in PTFN_{LL}$. The other cases in (6) can be obtained similarly, and we omit the proof.
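The positive-positive case of Proposition 3 can be cross-checked by evaluating the sup-$T_W$ product directly. The following is a rough sketch under stated assumptions: the shape function is taken as $L(x) = \max\{0, 1 - |x|\}$, both operands are positive, and the helper names `member`, `mul_pos` and `sup_tw_product` are hypothetical. Under $T_W$ only pairs with $\tilde{A}(x) = 1$ or $\tilde{B}(y) = 1$ contribute, so the supremum is searched over the two cores only.

```python
def L(x: float) -> float:
    """Assumed shape function L(x) = max(0, 1 - |x|)."""
    return max(0.0, 1.0 - abs(x))


def member(A, x: float) -> float:
    """Membership of x in the LL-trapezoidal number A = (a1, a2, alpha, beta)."""
    a1, a2, alpha, beta = A
    if a1 <= x <= a2:
        return 1.0
    if x < a1:
        return L((a1 - x) / alpha) if alpha > 0 else 0.0
    return L((x - a2) / beta) if beta > 0 else 0.0


def mul_pos(A, B):
    """Closed-form T_W product of two positive numbers (first case of Eq. (6))."""
    a1, a2, al_a, be_a = A
    b1, b2, al_b, be_b = B
    return (a1 * b1, a2 * b2,
            max(al_a * b1, al_b * a1), max(be_a * b2, be_b * a2))


def sup_tw_product(A, B, z: float, steps: int = 2000) -> float:
    """Brute-force sup over xy = z of T_W(A(x), B(y)), scanning the two cores."""
    best = 0.0
    for i in range(steps + 1):
        x = A[0] + (A[1] - A[0]) * i / steps  # x in the core of A, so A(x) = 1
        y = B[0] + (B[1] - B[0]) * i / steps  # y in the core of B, so B(y) = 1
        best = max(best, member(B, z / x), member(A, z / y))
    return best


A = (2.0, 3.0, 0.5, 1.0)
B = (1.0, 2.0, 0.25, 0.5)
C = mul_pos(A, B)
for z in (1.0, 1.8, 2.0, 4.0, 6.0, 7.0, 9.0):
    print(z, member(C, z), sup_tw_product(A, B, z))
```

For each test point the closed-form membership and the brute-force value should agree up to floating-point error; the negative cases of (6) are not covered by this sketch.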

Remark. The propositions 1.3 in Wang (2016) [33] are the special cases of our proposition 2 and proposition 3.

Proposition 4. $T_W$ is the only T-norm which can induce a shape preserving multiplication of $PNTFN_{LL}$.

Proof. From Proposition 3, $T_W$ induces a shape preserving multiplication of $PNTFN_{LL}$. It remains to prove that $T_W$ is the only T-norm that induces a shape preserving multiplication on $PNTFN_{LL}$.

Let $L(x - m)$ be a non-increasing continuous function from $[m, +\infty)$ to $[0, 1]$ with $\lim_{x \to +\infty} L(x - m) = 0$, $L(0) = 1$, $m \ge 2$, which includes the case of $L(1) = 0$, and assume $\{x \mid L(x - m) = 1\} = \{m\}$. Let $\tilde{A} = \tilde{B} = (m, m, 1, 1)_{LL} \in PTFN_{LL}$. Then $\tilde{A} \otimes_T \tilde{B} \ne 1_{\{m\}}$. For this, suppose $T(x_0, y_0) > 0$ for some $x_0, y_0 \in (0, 1)$; then there exist $a_0, b_0 \in (m, +\infty)$ such that $L(a_0 - m) = x_0$, $L(b_0 - m) = y_0$. Then

$$(\tilde{A} \otimes_T \tilde{B})(a_0 b_0) = \sup_{xy = a_0 b_0} T(A(x), B(y)) \ge T(L(a_0 - m), L(b_0 - m)) = T(x_0, y_0) > 0$$

Let $\{x \mid L(x - m) \ge h\} = [m, L_h + m]$. Then by Nguyen's theorem (1978) [39], $\{z \mid (\tilde{A} \otimes_{T_M} \tilde{B})(z) \ge h\} = [(L_h + m)^2, m^2] \cup [m^2, (L_h + m)^2]$ for $0 \le h \le 1$. Now suppose $\tilde{A} \otimes_T \tilde{B} = (m, m, \alpha, \alpha)$ for some $0 < \alpha < 2$, and hence $\{z \mid (\tilde{A} \otimes_T \tilde{B})(z) \ge h\} = [\alpha(L_h + m), m] \cup [m, \alpha(L_h + m)]$. But, since $\tilde{A} \otimes_T \tilde{B} \le \tilde{A} \otimes_{T_M} \tilde{B}$, we have $(L_h + m)^2 \ge \alpha(L_h + m)$ for any $h \in [0, 1]$. Then $\alpha = 0$, a contradiction. Hence $\tilde{A} \otimes_T \tilde{B}$ is not a fuzzy number of LL-type, which proves the proposition.

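Proposition 5. Let $\tilde{A} = (a_1, a_2, \alpha_A, \beta_A)_{LL}$, $\tilde{B} = (b_1, b_2, \alpha_B, \beta_B)_{LL} \in PNTFN_{LL}$, $\alpha \in [0, 1]$, $k \ne 0$, $k \in R$, so we can get

$$\begin{aligned} &(1)\ (\tilde{A} \oplus_W \tilde{B})_\alpha \subseteq \tilde{A}_\alpha + \tilde{B}_\alpha \\ &(2)\ (\tilde{A} \ominus_W \tilde{B})_\alpha \subseteq \tilde{A}_\alpha - \tilde{B}_\alpha \\ &(3)\ (k \otimes_W \tilde{A})_\alpha = k \tilde{A}_\alpha \\ &(4)\ (\tilde{A} \otimes_W \tilde{B})_\alpha \subseteq \tilde{A}_\alpha \tilde{B}_\alpha \end{aligned} \tag{7}$$

Proposition 6. Let $\tilde{A} = (a_1, a_2, \alpha_A, \beta_A)_{LL}$, $\tilde{B} = (b_1, b_2, \alpha_B, \beta_B)_{LL}$, $\tilde{C} = (c_1, c_2, \alpha_C, \beta_C)_{LL} \in PNTFN_{LL}$, $k_1, k_2, k \in R$, then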

(8)

Definition 6. (Xu and Li, 2001) Let $\tilde{A}, \tilde{B} \in FN$; then the distance between $\tilde{A}$ and $\tilde{B}$ is defined as follows:

$$d(\tilde{A}, \tilde{B}) = \left(\int_0^1 f(\lambda)\, d^2(\tilde{A}_\lambda, \tilde{B}_\lambda)\, d\lambda\right)^{\frac{1}{2}} \tag{9}$$

where $d^2(\tilde{A}_\lambda, \tilde{B}_\lambda) = (a_l(\lambda) - b_l(\lambda))^2 + (a_r(\lambda) - b_r(\lambda))^2$, $\tilde{A}_\lambda = [a_l(\lambda), a_r(\lambda)]$, $\tilde{B}_\lambda = [b_l(\lambda), b_r(\lambda)]$, $f(\lambda)$ is an increasing function on $[0, 1]$, $f(0) = 0$, and $\int_0^1 f(\lambda)\, d\lambda = \frac{1}{2}$.

Theorem 1. Let $\tilde{A} = (a_1, a_2, \alpha_A, \beta_A)$, $\tilde{B} = (b_1, b_2, \alpha_B, \beta_B) \in PNTFN_{LL}$, with membership functions of the form given in Definition 3; then the distance can be written as follows:

$$\begin{aligned} d^2(\tilde{A}, \tilde{B}) ={}& \Delta_1 (a_1 - b_1)^2 + \Delta_1 (a_2 - b_2)^2 + \Delta_2 (\alpha_A - \alpha_B)^2 + \Delta_2 (\beta_A - \beta_B)^2 \\ &- \Delta_3 (a_1 - b_1)(\alpha_A - \alpha_B) + \Delta_3 (a_2 - b_2)(\beta_A - \beta_B) \end{aligned} \tag{10}$$

where $\Delta_1 = \int_0^1 f(\lambda)\, d\lambda$, $\Delta_2 = \int_0^1 f(\lambda) L_\lambda^2\, d\lambda$, $\Delta_3 = 2 \int_0^1 f(\lambda) L_\lambda\, d\lambda$, $L_\lambda = L^{-1}(\lambda)$.

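Proof. For $\tilde{A} = (a_1, a_2, \alpha_A, \beta_A)$, $\tilde{B} = (b_1, b_2, \alpha_B, \beta_B)$, the $\lambda$-level sets of $\tilde{A}$, $\tilde{B}$ are

$$A_\lambda = [a_1 - \alpha_A L_\lambda,\ a_2 + \beta_A L_\lambda], \quad B_\lambda = [b_1 - \alpha_B L_\lambda,\ b_2 + \beta_B L_\lambda]$$

so,

$$\begin{aligned} d^2(A_\lambda, B_\lambda) &= [(a_1 - L_\lambda \alpha_A) - (b_1 - L_\lambda \alpha_B)]^2 + [(a_2 + L_\lambda \beta_A) - (b_2 + L_\lambda \beta_B)]^2 \\ &= [(a_1 - b_1) - L_\lambda (\alpha_A - \alpha_B)]^2 + [(a_2 - b_2) + L_\lambda (\beta_A - \beta_B)]^2 \\ &= (a_1 - b_1)^2 + (a_2 - b_2)^2 + L_\lambda^2 (\alpha_A - \alpha_B)^2 + L_\lambda^2 (\beta_A - \beta_B)^2 \\ &\quad - 2 L_\lambda (a_1 - b_1)(\alpha_A - \alpha_B) + 2 L_\lambda (a_2 - b_2)(\beta_A - \beta_B) \end{aligned}$$

Further, we can get

$$\begin{aligned} d(\tilde{A}, \tilde{B})^2 &= \int_0^1 f(\lambda)\, d^2(\tilde{A}_\lambda, \tilde{B}_\lambda)\, d\lambda \\ &= \Delta_1 (a_1 - b_1)^2 + \Delta_1 (a_2 - b_2)^2 + \Delta_2 (\alpha_A - \alpha_B)^2 + \Delta_2 (\beta_A - \beta_B)^2 \\ &\quad - \Delta_3 (a_1 - b_1)(\alpha_A - \alpha_B) + \Delta_3 (a_2 - b_2)(\beta_A - \beta_B) \end{aligned}$$

Hence, we complete the proof of Theorem 1.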

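In the following discussion, we set $f(\lambda) = \lambda$ and $L(x) = \max\{0, 1 - |x|\}$, so that $L_\lambda = 1 - \lambda$, $\Delta_1 = \frac{1}{2}$, $\Delta_2 = \frac{1}{12}$ and $\Delta_3 = \frac{1}{3}$. Then we can get

$$\begin{aligned} d(\tilde{A}, \tilde{B})^2 ={}& \frac{1}{2}(a_1 - b_1)^2 + \frac{1}{2}(a_2 - b_2)^2 + \frac{1}{12}(\alpha_A - \alpha_B)^2 + \frac{1}{12}(\beta_A - \beta_B)^2 \\ &- \frac{1}{3}(a_1 - b_1)(\alpha_A - \alpha_B) + \frac{1}{3}(a_2 - b_2)(\beta_A - \beta_B) \end{aligned}$$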
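As a sanity check on the constants $\frac{1}{2}$, $\frac{1}{12}$ and $\frac{1}{3}$, the closed-form squared distance can be compared with a direct numerical integration of Definition 6. The sketch below assumes $f(\lambda) = \lambda$ and $L(x) = \max\{0, 1 - |x|\}$ (so $L^{-1}(\lambda) = 1 - \lambda$); the function names `dist2` and `dist2_numeric` are illustrative.

```python
def dist2(A, B) -> float:
    """Closed-form squared distance for f(lambda) = lambda, L(x) = max(0, 1 - |x|)."""
    d_a1, d_a2 = A[0] - B[0], A[1] - B[1]
    d_al, d_be = A[2] - B[2], A[3] - B[3]
    return (0.5 * (d_a1 ** 2 + d_a2 ** 2)
            + (d_al ** 2 + d_be ** 2) / 12.0
            - d_a1 * d_al / 3.0
            + d_a2 * d_be / 3.0)


def dist2_numeric(A, B, n: int = 20000) -> float:
    """Midpoint-rule integration of f(lambda) * d^2(A_lambda, B_lambda) over [0, 1]."""
    total = 0.0
    for k in range(n):
        lam = (k + 0.5) / n
        s = 1.0 - lam  # L^{-1}(lambda) for the assumed shape function
        left = (A[0] - A[2] * s) - (B[0] - B[2] * s)    # difference of left ends
        right = (A[1] + A[3] * s) - (B[1] + B[3] * s)   # difference of right ends
        total += lam * (left ** 2 + right ** 2)
    return total / n


A = (2.0, 3.0, 0.5, 1.0)
B = (1.0, 2.0, 0.25, 0.5)
print(dist2(A, B), dist2_numeric(A, B))
```

The two values should agree to several decimal places; a different choice of $f$ or $L$ would change $\Delta_1$, $\Delta_2$ and $\Delta_3$ and therefore the closed form.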

3. Fuzzy Least Squares Linear Regression Model

In this section, we consider a group of n sample data, denoted by $(\tilde{X}_{i1}, \tilde{X}_{i2}, \tilde{X}_{i3}, \ldots, \tilde{Y}_i)$. Let $\tilde{X}_{ij} = (x_{ij1}, x_{ij2}, \alpha_{X_{ij}}, \beta_{X_{ij}})$ be the independent variable, $\tilde{B}_j = (b_{j1}, b_{j2}, \alpha_{B_j}, \beta_{B_j})$ be the $PNTFN_{LL}$ regression coefficient, and $\tilde{\varepsilon}_i = (\varepsilon_{i1}, \varepsilon_{i2}, \alpha_{\varepsilon_i}, \beta_{\varepsilon_i})$ be the random error. Here $\tilde{X}_{ij}, \tilde{B}_j, \tilde{\varepsilon}_i \in PNTFN_{LL}$ $(i = 1, 2, \ldots, n;\ j = 0, 1, \ldots, p)$. Then the general trapezoidal fuzzy linear regression model can be represented as follows:

$$\tilde{Y}_i = \sideset{}{_W}\bigoplus_{j=0}^{p} (\tilde{B}_j \otimes_W \tilde{X}_{ij}) \oplus_W \tilde{\varepsilon}_i \tag{11}$$

Now, we define the index sets $P$ and $N$: $P = \{j \mid \hat{b}_j \ge 0,\ j = 1, 2, \ldots, p\}$, $N = \{j \mid \hat{b}_j < 0\}$. That is, $j \in P$ if $\tilde{B}_j \in PTFN_{LL}$, and $j \in N$ otherwise. Specifying $\tilde{X}_{i0} = (1, 1, 0, 0)$ and applying the arithmetic of $T_W$, the linear regression model can be expanded as follows:

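$$\begin{aligned} \tilde{Y}_i &= \sideset{}{_W}\bigoplus_{j=0}^{p} (\tilde{B}_j \otimes_W \tilde{X}_{ij}) \oplus_W \tilde{\varepsilon}_i \\ &= \tilde{B}_0 \oplus_W (\tilde{B}_1 \otimes_W \tilde{X}_{i1}) \oplus_W \cdots \oplus_W (\tilde{B}_p \otimes_W \tilde{X}_{ip}) \oplus_W \tilde{\varepsilon}_i \\ &= \tilde{B}_0 \oplus_W \sideset{}{_W}\bigoplus_{j \in P} (\tilde{B}_j \otimes_W \tilde{X}_{ij}) \oplus_W \sideset{}{_W}\bigoplus_{j \in N} (\tilde{B}_j \otimes_W \tilde{X}_{ij}) \oplus_W \tilde{\varepsilon}_i \\ &= \tilde{B}_0 \oplus_W \sideset{}{_W}\bigoplus_{j \in P} (b_{j1}, b_{j2}, \alpha_{B_j}, \beta_{B_j})_{LL} \otimes_W (x_{ij1}, x_{ij2}, \alpha_{X_{ij}}, \beta_{X_{ij}})_{LL} \\ &\quad \oplus_W \sideset{}{_W}\bigoplus_{j \in N} (b_{j1}, b_{j2}, \alpha_{B_j}, \beta_{B_j})_{LL} \otimes_W (x_{ij1}, x_{ij2}, \alpha_{X_{ij}}, \beta_{X_{ij}})_{LL} \oplus_W \tilde{\varepsilon}_i \\ &= \tilde{B}_0 \oplus_W \sideset{}{_W}\bigoplus_{j \in P} \left(b_{j1} x_{ij1},\ b_{j2} x_{ij2},\ \max(\alpha_{B_j} x_{ij1}, \alpha_{X_{ij}} b_{j1}),\ \max(\beta_{B_j} x_{ij2}, \beta_{X_{ij}} b_{j2})\right)_{LL} \\ &\quad \oplus_W \sideset{}{_W}\bigoplus_{j \in N} \left(b_{j1} x_{ij2},\ b_{j2} x_{ij1},\ \max(\alpha_{B_j} x_{ij2}, \beta_{X_{ij}} |b_{j1}|),\ \max(\beta_{B_j} x_{ij1}, \alpha_{X_{ij}} |b_{j2}|)\right)_{LL} \\ &\quad \oplus_W (\varepsilon_{i1}, \varepsilon_{i2}, \alpha_{\varepsilon_i}, \beta_{\varepsilon_i})_{LL} \end{aligned} \tag{12}$$

$$\text{s.t.} \begin{cases} y_{i1} = b_{01} + \sum_{j \in P} b_{j1} x_{ij1} + \sum_{j \in N} b_{j1} x_{ij2} + \varepsilon_{i1} \\ y_{i2} = b_{02} + \sum_{j \in P} b_{j2} x_{ij2} + \sum_{j \in N} b_{j2} x_{ij1} + \varepsilon_{i2} \\ \alpha_{y_i} = \max\left(\alpha_{B_0},\ \max_{j \in P}(\alpha_{B_j} x_{ij1}, \alpha_{X_{ij}} b_{j1}),\ \max_{j \in N}(\alpha_{B_j} x_{ij2}, \beta_{X_{ij}} |b_{j1}|),\ \alpha_{\varepsilon_i}\right) \\ \phantom{\alpha_{y_i}} = \max\left(\alpha_{B_0},\ \{\alpha_{B_j} x_{ij1}, \alpha_{X_{ij}} b_{j1}\}_{j \in P},\ \{\alpha_{B_j} x_{ij2}, \beta_{X_{ij}} |b_{j1}|\}_{j \in N},\ \alpha_{\varepsilon_i}\right) \\ \beta_{y_i} = \max\left(\beta_{B_0},\ \max_{j \in P}(\beta_{B_j} x_{ij2}, \beta_{X_{ij}} b_{j2}),\ \max_{j \in N}(\beta_{B_j} x_{ij1}, \alpha_{X_{ij}} |b_{j2}|),\ \beta_{\varepsilon_i}\right) \\ \phantom{\beta_{y_i}} = \max\left(\beta_{B_0},\ \{\beta_{B_j} x_{ij2}, \beta_{X_{ij}} b_{j2}\}_{j \in P},\ \{\beta_{B_j} x_{ij1}, \alpha_{X_{ij}} |b_{j2}|\}_{j \in N},\ \beta_{\varepsilon_i}\right) \end{cases} \tag{13}$$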

We determine each estimated value $\hat{\tilde{B}}_j$ of the regression coefficient $\tilde{B}_j$ by the least squares deviation criterion, that is, by minimizing the overall squared error measured with the proposed squared distance, which gives the following objective function:

$$\min_{\hat{\tilde{B}}_j\ (j = 0, \ldots, p)} \sum_{i=1}^{n} d^2\left(\tilde{Y}_i,\ \sideset{}{_W}\bigoplus_{j=0}^{p} \hat{\tilde{B}}_j \otimes_W \tilde{X}_{ij}\right) \tag{14}$$

Finally, we obtain the fitted model:

$$\hat{\tilde{Y}}_i = \sideset{}{_W}\bigoplus_{j=0}^{p} \hat{\tilde{B}}_j \otimes_W \tilde{X}_{ij} \tag{15}$$
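To make the objective in (14) concrete, the following sketch evaluates the fitted output of (15) and the sum of squared distances for a given set of candidate coefficients. It is an illustration under stated assumptions, not the authors' MATLAB implementation: the inputs are assumed positive, a coefficient is treated as positive when its left core end is non-negative, the distance uses $f(\lambda) = \lambda$ and $L(x) = \max\{0, 1 - |x|\}$, and all function names (`add`, `mul`, `dist2`, `predict`, `objective`) are hypothetical. The minimization itself is left to whatever optimizer is preferred.

```python
def add(A, B):
    """T_W-based addition of trapezoidal numbers (a1, a2, alpha, beta)."""
    return (A[0] + B[0], A[1] + B[1], max(A[2], B[2]), max(A[3], B[3]))


def mul(Bj, X):
    """T_W-based product of a coefficient Bj (either sign) and a positive input X."""
    b1, b2, al_b, be_b = Bj
    x1, x2, al_x, be_x = X
    if b1 >= 0:  # positive coefficient, j in P
        return (b1 * x1, b2 * x2,
                max(al_b * x1, al_x * b1), max(be_b * x2, be_x * b2))
    # negative coefficient, j in N
    return (b1 * x2, b2 * x1,
            max(al_b * x2, be_x * abs(b1)), max(be_b * x1, al_x * abs(b2)))


def dist2(A, B):
    """Squared distance with Delta_1 = 1/2, Delta_2 = 1/12, Delta_3 = 1/3."""
    d1, d2, da, db = (A[k] - B[k] for k in range(4))
    return (0.5 * (d1 ** 2 + d2 ** 2) + (da ** 2 + db ** 2) / 12.0
            - d1 * da / 3.0 + d2 * db / 3.0)


def predict(coefs, xs):
    """Fitted output: intercept coefs[0] plus the T_W sum of coefs[j] * xs[j-1]."""
    out = coefs[0]
    for Bj, Xj in zip(coefs[1:], xs):
        out = add(out, mul(Bj, Xj))
    return out


def objective(coefs, data):
    """Sum of squared distances between observed and fitted outputs."""
    return sum(dist2(y, predict(coefs, xs)) for xs, y in data)


coefs = [(1.0, 2.0, 0.5, 1.0), (-2.0, -1.0, 0.25, 0.5)]
xs = [(2.0, 3.0, 0.5, 0.5)]
y_hat = predict(coefs, xs)
print(y_hat)
print(objective(coefs, [(xs, y_hat)]))  # zero when the data match the model exactly
```

Because the spreads enter through `max`, the objective is not differentiable everywhere, which is one reason a derivative-free or carefully initialized solver may be needed in practice.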

For efficient evaluation, we design the specific steps as follows. The whole process is solved by using MATLAB.

Step 1: Calculate $X_{ij}^c$ and $Y_i^c$, the centers of $\tilde{X}_{ij}$ and $\tilde{Y}_i$, with the centroid method, then obtain the estimates $\hat{b}_j$, $i = 1, 2, \ldots, n$, $j = 0, 1, \ldots, p$.

Step 2: Determine set $P$ and set $N$.

Step 3: Compare the sign of $\hat{\tilde{B}}_j$ with that of the estimate $\hat{b}_j$. If they are the same, $\hat{\tilde{B}}_j$ is determined; otherwise, modify set $P$ and set $N$ and repeat Step 2 until the sign of $\hat{\tilde{B}}_j$ is consistent with the preset one.
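Steps 1 and 2 can be sketched as follows. This is only one plausible reading of the procedure: it assumes linear sides ($L(x) = \max\{0, 1 - |x|\}$) when computing centroids and assumes that the crisp estimates $\hat{b}_j$ come from ordinary least squares on the centroids, which the text does not state explicitly. The names `centroid`, `ols` and `sign_sets` are hypothetical.

```python
def centroid(A) -> float:
    """Centroid of a trapezoidal number (a1, a2, alpha, beta) with linear sides."""
    a1, a2, alpha, beta = A
    area = (a2 - a1) + (alpha + beta) / 2.0
    if area == 0:  # crisp number
        return a1
    moment = ((a2 - a1) * (a1 + a2) / 2.0        # rectangular core
              + (alpha / 2.0) * (a1 - alpha / 3.0)  # left triangle
              + (beta / 2.0) * (a2 + beta / 3.0))   # right triangle
    return moment / area


def ols(rows, ys):
    """Ordinary least squares with intercept via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    M = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * y for x, y in zip(X, ys))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]


def sign_sets(b_hat):
    """Index sets P (non-negative slopes) and N (negative slopes), j = 1..p."""
    P = [j for j in range(1, len(b_hat)) if b_hat[j] >= 0]
    N = [j for j in range(1, len(b_hat)) if b_hat[j] < 0]
    return P, N


# Toy fuzzy sample with one regressor whose response decreases with x.
xs = [(1, 2, 0.5, 0.5), (2, 3, 0.5, 0.5), (3, 4, 0.5, 0.5), (4, 5, 0.5, 0.5)]
ys = [(9, 10, 1, 1), (7, 8, 1, 1), (5, 6, 1, 1), (3, 4, 1, 1)]
b_hat = ols([[centroid(x)] for x in xs], [centroid(y) for y in ys])
P, N = sign_sets(b_hat)
print(b_hat, P, N)
```

The resulting sets only provide a starting guess for the signs of the fuzzy coefficients; Step 3 would then re-check them against the fuzzy estimates and adjust $P$ and $N$ if they disagree.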

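3.1. Independent Variables, Dependent Variables and Regression Coefficients Are in $PNTFN_{LL}$

Based on the above, we can obtain the least-squares regression of the FIFCFO model:

$$\tilde{Y}_i = \tilde{B}_0 \oplus_W (\tilde{B}_1 \otimes_W \tilde{X}_{i1}) \oplus_W \cdots \oplus_W (\tilde{B}_p \otimes_W \tilde{X}_{ip}) \oplus_W \tilde{\varepsilon}_i \tag{16}$$

where

$\tilde{X}_{ij} = (x_{ij1}, x_{ij2}, \alpha_{X_{ij}}, \beta_{X_{ij}})_{LL}$, $\tilde{Y}_i = (y_{i1}, y_{i2}, \alpha_{Y_i}, \beta_{Y_i})_{LL}$,

$\tilde{B}_j = (b_{j1}, b_{j2}, \alpha_{B_j}, \beta_{B_j})_{LL}$, $\tilde{X}_{i0} = (1, 1, 0, 0)_{LL}$,

$\tilde{X}_{ij}, \tilde{B}_j \in PNTFN_{LL}$, $i = 1, 2, \ldots, n$, $j = 0, 1, \ldots, p$.

Let $\tilde{X}_{ij} \in PTFN_{LL}$,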

(17)

The other cases can be calculated as the above similarly.

3.2. Error Management Criterion

For the fuzzy linear regression model (14), let $\tilde{Y}_i$ and $\hat{\tilde{Y}}_i$ be the observed and estimated fuzzy response for the ith observation, respectively. $E_i$ represents the difference of membership values between two membership functions, $S_i$ represents the similarity of membership values between two membership functions, and $R_i$ represents the relative difference in shape between two membership functions. $\tilde{Y}_i(x)$ and $\hat{\tilde{Y}}_i(x)$ are the membership functions of $\tilde{Y}_i$ and $\hat{\tilde{Y}}_i$, respectively, and $S_{\tilde{Y}_i}$ and $S_{\hat{\tilde{Y}}_i}$ denote the supports of $\tilde{Y}_i$ and $\hat{\tilde{Y}}_i$.

1) Error Index (Kim and Bishu, 1998 [40])

$$E_i = \frac{\int_{S_{\tilde{Y}_i} \cup S_{\hat{\tilde{Y}}_i}} |\tilde{Y}_i(x) - \hat{\tilde{Y}}_i(x)|\, dx}{\int_{S_{\tilde{Y}_i}} \tilde{Y}_i(x)\, dx} \tag{18}$$

2) Similarity Measure (Rezaei et al., 2006 [41])

$$S_i = \frac{\int_{S_{\tilde{Y}_i} \cup S_{\hat{\tilde{Y}}_i}} \min(\tilde{Y}_i(x), \hat{\tilde{Y}}_i(x))\, dx}{\int_{S_{\tilde{Y}_i} \cup S_{\hat{\tilde{Y}}_i}} \max(\tilde{Y}_i(x), \hat{\tilde{Y}}_i(x))\, dx} \tag{19}$$

3) Distance Criterion

$$R_i = \frac{|\alpha_{\hat{\tilde{Y}}_i} - \alpha_{\tilde{Y}_i}|}{\alpha_{\tilde{Y}_i}} + \frac{|\beta_{\hat{\tilde{Y}}_i} - \beta_{\tilde{Y}_i}|}{\beta_{\tilde{Y}_i}} \tag{20}$$

Inspired by Chen and Hsueh (2007) [42], we propose $R_i$ to measure the fitting effect on the shape.

Each index has its own pros and cons. In general, the smaller $E_i$ and $R_i$ are and the larger $S_i$ is, the better the fitted model. So, in this paper, we compare the fitting effect from these different points of view.
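The three criteria can be approximated numerically for a pair of trapezoidal outputs. The sketch below assumes the shape function $L(x) = \max\{0, 1 - |x|\}$, an observed output with positive area and positive spreads, and integration over an interval covering both supports (the integrands vanish outside the supports, so this does not change the values). The names `member` and `criteria` are illustrative.

```python
def member(A, x: float) -> float:
    """Membership of x in the trapezoidal number A = (a1, a2, alpha, beta)."""
    a1, a2, alpha, beta = A
    if a1 <= x <= a2:
        return 1.0
    if x < a1:
        return max(0.0, 1.0 - (a1 - x) / alpha) if alpha > 0 else 0.0
    return max(0.0, 1.0 - (x - a2) / beta) if beta > 0 else 0.0


def criteria(Y, Y_hat, n: int = 20000):
    """Return (E, S, R): error index, similarity measure, distance criterion."""
    lo = min(Y[0] - Y[2], Y_hat[0] - Y_hat[2])
    hi = max(Y[1] + Y[3], Y_hat[1] + Y_hat[3])
    h = (hi - lo) / n
    diff = inter = union = area = 0.0
    for k in range(n):  # midpoint rule; the step h cancels in the ratios
        x = lo + (k + 0.5) * h
        u, v = member(Y, x), member(Y_hat, x)
        diff += abs(u - v)
        inter += min(u, v)
        union += max(u, v)
        area += u
    E = diff / area
    S = inter / union
    R = abs(Y_hat[2] - Y[2]) / Y[2] + abs(Y_hat[3] - Y[3]) / Y[3]
    return E, S, R


observed = (4.0, 5.0, 1.0, 1.0)
estimated = (4.2, 5.1, 0.8, 1.1)
print(criteria(observed, estimated))
```

A perfect fit gives $E_i = 0$, $S_i = 1$ and $R_i = 0$, while outputs with disjoint supports give $S_i = 0$; note that $R_i$ is undefined when an observed spread is zero.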

4. Numerical Analysis

Example 1. The source sample data were produced randomly by MATLAB. First, we consider the model $\tilde{Y}_i = \tilde{B}_0 \oplus_W \tilde{B}_1 \otimes_W \tilde{X}_i \oplus_W \tilde{\varepsilon}_i$. Then, we set the true values of

$\tilde{B}_0 = (3, 2, 0.5, 1)_{LL}$, $\tilde{B}_1 = (1, 2, 0.25, 0.5)_{LL}$,

$\tilde{X}_i = (x_{i1}, x_{i2}, \alpha_{X_i}, \beta_{X_i})_{LL}$, $\tilde{\varepsilon}_i = (\epsilon_{i1}, \epsilon_{i2}, \alpha_{\varepsilon_i}, \beta_{\varepsilon_i})_{LL}$

where $\tilde{X}_i \in PNTFN_{LL}$, $x_{i1} \sim U(2, 3)$, $x_{i2} \sim U(3, 4)$, $\alpha_{X_i}, \beta_{X_i} \sim U(0, 1)$, and $\tilde{\varepsilon}_i \in PNTFN_{LL}$, $\epsilon_{i1} - \alpha_{\varepsilon_i}$, $\epsilon_{i1}$, $\epsilon_{i2}$, $\epsilon_{i2} + \beta_{\varepsilon_i} \sim N(0, 0.01)$, $\epsilon_{i1} - \alpha_{\varepsilon_i} \le \epsilon_{i1} \le \epsilon_{i2} \le \epsilon_{i2} + \beta_{\varepsilon_i}$. Let $L(x) = \max\{0, 1 - |x|\}$. The sample size is 50. The resulting data set is presented in Table 1. We then use (14) to construct the fuzzy regression model, obtain the estimated output and use the Error Index, Similarity Measure and Distance Criterion to evaluate the deviation.

Y ˜ ^ S L = ( 2.9679 , 1.9909 , 0.5273 , 0.6661 ) W ( 0.9873 , 1.9973 , 0.2503 , 0.5003 ) W X ˜

Y ˜ ^ C O = ( 2.2344 , 4.6114 × 10 19 , 5.0184 , 1.4784 ) M ( 0.5093 , 0.2058 , 2.1246 × 10 18 , 6.2998 × 10 14 ) M X ˜

Y ˜ ^ Z = ( 2.7155 , 2.7155 , 1.3188 , 0.4816 ) M ( 0.7574 , 0.8908 , 1.8053 , 1.8053 ) M X ˜

From Table 2, we can see that the sums of $E_i$ and $R_i$ of our proposed model are smaller than those of the reference models, and the sum of $S_i$ of our proposed model is larger than that of the reference models, which means that our proposed model has lower deviations than the reference models.

Example 2. The source sample data come from Table 1 in Zhang (2012) [16], where the inputs are crisp real numbers and the outputs are trapezoidal fuzzy numbers. For the sake of applicability, we enlarge the sample size from 8 to 16 and expand the crisp inputs to fuzzy inputs. First, add $x_i = 0.5, 1.5, \ldots, 7.5$ and the corresponding $y_i$ into the sample data, then expand the crisp input to fuzzy input by setting. Now, we get the final sample data in Table 3. We again use (14) to construct the fuzzy regression model, obtain the estimated output and use the Error Index, Similarity Measure and Distance Criterion to evaluate the deviation. Besides the results in Table 4, we also illustrate the results through Figures 1(a)-(d) (we use $OT_i$ to denote the observed output, $CO_i$ to denote Li's estimated output, $Z_i$ to denote Zhang's estimated output, and $LS_i$ to denote our estimated output), which represent the fitting effect of the components of the trapezoidal fuzzy number between the observed outputs, $\hat{\tilde{Y}}_{CO}$, $\hat{\tilde{Y}}_{Z}$ and $\hat{\tilde{Y}}_{SL}$, respectively. In Figures 1(a)-(d), the horizontal axis represents the central value of the independent variable, and the vertical axis represents the value of the components of the trapezoidal fuzzy number.

Table 1. Sample data in Example 1.

Table 2. Comparison of the fitting effect in Example 1.

Table 3. Sample data in Example 2.

Figure 1. The fitting effect of the 1st, 2nd, 3rd and 4th component, panels (a), (b), (c) and (d).

Table 4. Comparison of the fitting effect in Example 2.

Y ˜ ^ S L = ( 3.9933 , 3.9933 , 0.3487 , 0.4184 ) W ( 0.7749 , 0.7749 , 0.0634 , 0.0493 ) W X ˜

Y ˜ ^ C O = ( 3.7636 , 0.0231 , 1.1044 × 10 10 , 0.0858 ) M ( 0.8081 , 1.2236 × 10 11 , 8.6385 × 10 12 , 9.3177 × 10 12 ) M X ˜

Y ˜ ^ Z = ( 3.9860 , 3.9860 , 3.9860 , 4.0511 ) M ( 0.7755 , 0.7755 , 0.7755 , 0.7755 ) M X ˜

From Table 4, we can see that the sums of $E_i$ and $R_i$ of our proposed model are smaller than those of the reference models, and the sum of $S_i$ of our proposed model is larger than that of the reference models, which means that our proposed model has lower deviations than the reference models. From Figure 1(a) and Figure 1(c), we can see that our proposed model is on par with the reference models. From Figure 1(b) and Figure 1(d), we can clearly see that the 2nd and 4th components are fitted very well and describe the trend of the shape of the output fuzzy numbers more aptly. From Figure 2, we can see that the estimated outputs of our proposed model have better coverage than those of the reference models, especially the 1st, 3rd and 4th. In conclusion, our proposed model has a better fitting effect in this case.

Example 3. The source sample data come from Table 2 in Zhang (2012) [16], where the inputs are crisp real numbers and the outputs are trapezoidal fuzzy numbers. For the sake of applicability, we modify the sample data and expand the crisp inputs to fuzzy inputs. The specific steps are similar to those in Example 2. After obtaining the sample data in Table 5, we again use (14) to construct the fuzzy regression model, obtain the estimated output and use the Error Index, Similarity Measure and Distance Criterion to evaluate the deviation, as shown in Table 6.

Table 5. Sample data in Example 3.

Figure 2. The shape of the four estimated outputs.

Table 6. Comparison of the fitting effect in Example 3.

Y ˜ ^ S L = ( 4.4703 , 4.4703 , 0.3554 , 0.2552 ) W ( 0.1531 , 0.1531 , 0.0340 , 0.0322 ) W X ˜ 1 W ( 0.7719 , 0.7719 , 0.0019 , 0.0010 ) W X ˜ 2 W ( 0.3951 , 0.3951 , 0.0032 , 0.0145 ) W X ˜ 3

Y ˜ ^ C O = ( 7.6546 , 0.5084 , 5.5503 × 10 20 , 0.5284 ) M ( 3.6192 × 10 20 , 3.3385 × 10 20 , 4.8089 × 10 21 , 5.1307 × 10 20 ) M X ˜ 1 M ( 0.7626 , 6.3898 × 10 21 , 4.1140 × 10 21 , 6.4876 × 10 21 ) M X ˜ 2 M ( 0.2108 , 6.3215 × 10 21 , 2.6331 × 10 21 , 1.0405 × 10 20 ) M X ˜ 3

Y ˜ ^ Z = ( 8.2600 , 8.2600 , 8.2600 , 8.4384 ) M ( 0.2238 , 0.2238 , 0.2238 , 0.2238 ) M X ˜ 1 M ( 0.3971 , 0.3385 , 0.2103 , 0.2032 ) M X ˜ 2 M ( 0.1443 , 0.1443 , 0.1443 , 0.1426 ) M X ˜ 3

From Table 6, we can see that the sum of $E_i$ of our proposed model is smaller than that of the reference models, while the sums of $S_i$ and $R_i$ of our proposed model are larger than those of the reference models. This means that our proposed model has lower deviations than the reference models, but a worse shape estimation.

5. Conclusions

In this study, we took advantage of the drastic product and classic LSD and used $T_W$ to design a regression model for a kind of trapezoidal fuzzy number ($PNTFN_{LL}$), which handles regression problems with fuzzy inputs, fuzzy coefficients and fuzzy outputs (FIFCFO). The first two examples strongly support our model, while in the last example it is inferior in $R_i$. In general, our proposed model performs better than the reference models when there are no outliers in the sample sets, which means that our proposed model lacks robustness.

Although the experimental results show that our proposed model performs better, the computational complexity is still a potential problem, even though it is alleviated to a certain extent by an optimized program. The larger the sample size or the number of variables, the more complex the computation. In future research, we will further study how to perform better when the sample size is large or when there are outliers in the sample sets, and apply the model to non-linear fuzzy regression analysis.

Acknowledgements

The authors appreciate the helpful comments of the referees on this manuscript.

Cite this paper
Sun, J. and Lu, Q. (2017) Regression Analysis of a Kind of Trapezoidal Fuzzy Numbers Based on a Shape Preserving Operator. Journal of Data Analysis and Information Processing, 5, 96-114. doi: 10.4236/jdaip.2017.53008.
References
[1]   Montgomery, D.C., Peck, E.A. and Vining, C.G. (2006) Introduction to Linear Regression Analysis. John Wiley & Sons, Hoboken.

[2]   Zadeh, L.A. (1965) Fuzzy Sets. Information and Control, 8, 338-353.

[3]   Tanaka, H., Uejima, S. and Asai, K. (1982) Linear Regression Analysis with Fuzzy Model. IEEE Transactions on Systems, Man, and Cybernetics, 12, 903-907.
https://doi.org/10.1109/TSMC.1982.4308925

[4]   Tanaka, H. and Ishibushi, H. (1991) Identification of Possibilistic Linear Systems by Quadratic Membership Functions of Fuzzy Parameters. Fuzzy Sets and Systems, 41, 145-160.

[5]   Choi, S.H. and Dong, K.H. (2004) Note on Fuzzy Regression Model. Iran. Stat. Conf., 7, 51-55.

[6]   Guo, P. and Tanaka, H. (2006) Dual Models for Possibilistic Regression Analysis. Computational Statistics & Data Analysis, 51, 253-266.

[7]   Tanaka, H. and Guo, P. (1999) Possibilistic Data Analysis for Operations Research. Springer-Verlag, New York.

[8]   Yen, K.K., Ghoshray, G. and Roig, G. (1999) A Linear Regression Model Using Triangular Fuzzy Number Coefficient. Fuzzy Sets and Systems, 106, 167-177.

[9]   Celmins, A. (1987) Least Squares Model Fitting to Fuzzy Vector Data. Fuzzy Sets and Systems, 22, 245-269.

[10]   Diamond, P. (1988) Fuzzy Least Squares. Information Sciences, 46, 141-157.

[11]   Hong, D.H., Song, L.K. and Do, H.Y. (2001) Fuzzy Least-Squares Linear Regression Analysis Using Shape Preserving Operations. Information Sciences, 138, 185-193.

[12]   Yang, M.S. and Lin, T.S. (2002) Fuzzy Least-Squares Linear Regression Analysis for Fuzzy Input-Output Data. Fuzzy Sets and Systems, 126, 389-399.

[13]   Choi, S.H. and Buckley, J.J. (2008) Fuzzy Regression Using Least Absolute Deviation Estimators. Soft Computing, 12, 257-263.
https://doi.org/10.1007/s00500-007-0198-3

[14]   Chen, L.H. and Hsueh, C.C. (2009) Fuzzy Regression Models Using the Least-Squares Method Based on the Concept of Distance. IEEE Transactions on Fuzzy Systems, 6, 1259-1272.
https://doi.org/10.1109/TFUZZ.2009.2026891

[15]   Hassanpour, H., Maleki, H.R. and Yaghoobi, M.A. (2010) Fuzzy Linear Regression Model with Crisp Coefficients: A Goal Programming Approach. Iranian Journal of Fuzzy Systems, 7, 19-39.

[16]   Zhang, A.W. (2012) A Least-Squares Approach to Fuzzy Regression Analysis with Trapezoidal Fuzzy Number. Mathematics in Practice and Theory, 42, 235-244. (In Chinese)

[17]   Stahel, W. and Weisberg, S. (1991) Directions in Robust Statistics and Diagnostics. Springer-Verlag, New York.
https://doi.org/10.1007/978-1-4615-6861-2

[18]   Kim, M.H. and Kim, M.S. (2016) Interactive Visual Least Absolutes Method: Comparison with the Least Squares and the Median Methods. Journal of Chemical Education, 93, 1737-1743.
https://doi.org/10.1021/acs.jchemed.6b00079

[19]   Zhang, J.L. (2007) The Least Square and Least Absolute Deviation. Science, 27, 152. (In Chinese)

[20]   Zhu, C.H. and Wang, W.P. (2011) An Empirical Analysis on Difference of Least One-Power Method and Least Squares Method. Journal of Wuhan Institute of Shipbuilding Technology, 3, 118-120. (In Chinese)

[21]   Yager, R.R. (1980) On a General Class of Fuzzy Connectives. Fuzzy Sets and Systems, 4, 235-242.

[22]   Zhang, A.W. (2012) Statistical Analysis of Fuzzy Linear Regression Model Based on Centroid Method. Fuzzy Systems and Mathematics, 26, 172-177. (In Chinese)

[23]   Hojati, M., Bector, C.R. and Smimou, K. (2005) A Simple Method for Computation for Fuzzy Linear Regression. European Journal of Operational Research, 166, 172-184.

[24]   Hong, D.H. (2001) Shape Preserving Multiplication of Fuzzy Numbers. Fuzzy Sets and Systems, 123, 81-84.

[25]   Mesiar, R. (1997) Shape Preserving Additions of Fuzzy Interval. Fuzzy Sets and Systems, 86, 73-78.

[26]   Hong, D.H. and Do, H.Y. (1997) Fuzzy System Reliability Analysis by the Use TW (The Weakest T-Norm) on the Fuzzy Number Arithmetic Operations. Fuzzy Sets and Systems, 90, 307-316.

[27]   Hong, D.H., Lee, S. and Do, H.Y. (2001) Fuzzy Linear Regression Analysis for Fuzzy Input-Output Data Using Shape-Preserving Operations. Fuzzy Sets and Systems, 122, 513-526.

[28]   Abbasbandy, S. and Asady, B. (2004) The Nearest Trapezoidal Fuzzy Number to a Fuzzy Quantity. Applied Mathematics and Computation, 156, 381-386.

[29]   Ban, A. (2008) Approximation of Fuzzy Numbers by Trapezoidal Fuzzy Numbers Preserving the Expected Interval. Fuzzy Sets and Systems, 159, 1327-1344.

[30]   Yeh, C.T. (2007) A Note on Trapezoidal Approximations of Fuzzy Numbers. Fuzzy Sets and Systems, 158, 747-754.

[31]   Amory, B., Reda, B. and Sylvie, G. (2010) A Revisited Approach to Linear Regression Using Trapezoidal Fuzzy Intervals. Information Sciences, 180, 3653-3673.

[32]   Li, J.H., Zeng, W.Y. and Yin, Q. (2016) A New Fuzzy Regression Model Based on Least Absolute Deviation. Engineering Applications of Artificial Intelligence, 52, 54-64.

[33]   Wang, N. and Lu, Q.J. (2016) Least-Absolutes Regression Using LL Type of Fuzzy Number Operations Based on Drastic Product. Operations Research and Management Science, 25, 145-153. (In Chinese)

[34]   Xu, R.N. and Li, C.L. (2001) Multidimensional Least Squares Fitting with a Fuzzy Model. Fuzzy Sets and Systems, 119, 182-203.

[35]   Baležentis, T. and Zeng, S. (2013) Group Multi-Criteria Decision Making Based upon Interval-Valued Fuzzy Numbers, an Extension of the MULTIMOORA Method. Expert Systems with Applications, 40, 543-550.

[36]   Chamodrakas, I. and Martakos, D. (2011) A Utility-Based Fuzzy TOPSIS Method for Energy Efficient Network Selection in Heterogeneous Wireless Networks. Applied Soft Computing, 11, 3734-3743.

[37]   Rashid, T., Beg, I. and Husnien, S.M. (2014) Robot Selection by Using Generalized Interval-Valued Fuzzy Numbers with TOPSIS. Applied Soft Computing, 21, 462-468.

[38]   Hu, B.Q. (2010) Fuzzy Theory Basis. 2nd Edition, Wuhan University Press. (In Chinese)

[39]   Nguyen, T. (1978) A Note on the Extension Principle for Fuzzy Sets. Journal of Mathematical Analysis and Applications, 64, 369-380.

[40]   Kim, B. and Bishu, R.R. (1998) Evaluations of Fuzzy Linear Regression Models by Comparing Membership Functions. Fuzzy Sets and Systems, 100, 343-352.

[41]   Rezaei, H., Emoto, M. and Mukaidono, M. (2006) New Similarity Measure between Two Fuzzy Sets. Journal of Advanced Computational Intelligence and Intelligent Informatics, 10, 946-953.
https://doi.org/10.20965/jaciii.2006.p0946

[42]   Chen, L.H. and Hsueh, C.C. (2007) A Mathematical Programming Method for Formulating a Fuzzy Regression Model Based on Distance Criterion. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 37, 705-712.
https://doi.org/10.1109/TSMCB.2006.889609

 
 