Reducing the dimensionality of multivariate indicators containing non-linearly dependent components

}, journal = {}, year = {2015}, number = {3(33)}, pages = {24-33}, url = {https://bijournal.hse.ru/en/2015--3(33) /162640920.html}, publisher = {}, abstract = {Elena R. Goryainova - Associate Professor, Department of Mathematics, Faculty of Economic Sciences, National Research University Higher School of Economics. Address: 20, Myasnitskaya Street, Moscow, 101000, Russian Federation. E-mail: el-goryainova@mail.ru. Julia A. Shalimova - Graduate Student, Faculty of Economic Sciences, National Research University Higher School of Economics. Address: 20, Myasnitskaya Street, Moscow, 101000, Russian Federation. E-mail: july.shalimova@yandex.ru. Factor analysis methods are used to reduce the dimension of a multidimensional vector of indicators. One of them is the maximum likelihood method (MLM). It identifies uncorrelated common factors among a set of correlated quantitative indicators. These common factors can represent the initial indicators without significant loss of information. Common factors are detected using a special representation of the correlation matrix of the observed indicators. However, the correlation coefficient is not defined for characteristics measured on a nominal scale. In addition, it cannot serve as a measure of the strength of association between indicators with nonlinear dependence. Traditional methods of factor analysis are ineffective in such situations. Two modifications of the MLM are proposed in the paper. They use Spearman rank correlation coefficients and Cramer coefficients as measures of the relationship between variables. Twelve-dimensional vectors whose coordinates depend on one another linearly and nonlinearly were simulated using the Monte Carlo method. A comparative analysis of the effectiveness of the traditional MLM and the two proposed modifications was then carried out on these data.
It is shown that only the adapted method that uses the Cramer coefficients is able to correctly combine indicators related by a nonmonotonic dependence into a common factor. On the other hand, this method is less efficient than the other two methods when the dependence between variables is linear or monotonic. To demonstrate the efficiency of these methods on real data, the problem of reducing the dimension of the dynamics of relative consumer price growth in 2008-2014 for a group of food products is solved.} }