2014. No. 4 (30)

Data analysis and intelligence systems

Maxim Kolomeychenko, Andrey Chepovskiy

Huge graph visualization and analysis

7–16

Maxim I. Kolomeychenko - Graduate of MSc Program, National Research University Higher School of Economics;
Address: 20, Myasnitskaya street, Moscow, 101000, Russian Federation.
E-mail: maxim.kolomeychenko@mail.ru

Andrey M. Chepovskiy - Associate Professor, Department of Information Security Management, Faculty of Business Informatics, National Research University Higher School of Economics; Professor, Department of Applied Mathematics and Systems Modeling, Institute of Communications and Media Business, Moscow State University of Printing Arts.
Address: 2a, Pryanishnikova street, Moscow, 127550, Russian Federation.
E-mail: achepovskiy@hse.ru

The problem of visualizing huge graphs arises in various fields of sociology and marketing. The relevance of this work is determined by the need for a software package for the analysis and visualization of such graphs. This paper gives a summary analysis of several software products and highlights their weaknesses: lack of cross-platform support, lack of specialized graph warehouses, and inability to deal with huge graphs.
This paper presents a detailed description of the overall architecture of the developed software and of each of its modules, as well as the procedure for communication between the core modules. Graphs are stored in a special graph warehouse capable of processing graphs with up to 100 million vertices and billions of links. The main data warehousing principles are also described. The use of a proprietary file system means that working with the warehouse requires no additional system calls, no complex addressing system and no redundant mechanisms, which avoids extra overhead associated with data warehousing.
The paper also describes the methodology of the data visualization module, the data structures used, and the computer graphics algorithms that make it possible to handle graphs of up to millions of vertices in real time.
The software offers a wide range of algorithms for automatic graph layout: random layout, circular layout, circular component layout, peacock tail layout, one or two themes layout, and layouts based on community detection and estimation of cohesion. This paper presents a detailed description of each of the layouts mentioned above.
Special attention is paid to the developed methods of graph analysis. Algorithms have been designed to detect communities in social networks, to evaluate graph cohesion, to find the shortest path between any pair of vertices, to compute the union and intersection of graphs, etc.
A key feature of all algorithms presented in this paper is their ability to handle huge graphs.
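As an illustration of the simplest of the layouts named in this abstract, the circular layout can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name `circular_layout`, the `radius` parameter and the dictionary output are assumptions made for the example, and nothing here addresses the huge-graph storage described above.

```python
import math


def circular_layout(vertices, radius=1.0):
    """Place vertices at equal angular steps on a circle centred at the origin.

    `vertices` is any sequence of hashable vertex identifiers (hypothetical
    input format); the result maps each vertex to an (x, y) pair.
    """
    n = len(vertices)
    positions = {}
    for i, v in enumerate(vertices):
        # The i-th vertex gets the angle 2*pi*i/n, measured from the x axis.
        angle = 2.0 * math.pi * i / n
        positions[v] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


layout = circular_layout(["a", "b", "c", "d"])
```

Each vertex is positioned independently of the others, so the cost is linear in the number of vertices, which is one reason this kind of layout remains usable on large graphs.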

Mikhail Lanin

Automatic detection of reference elements on semi-structured document images

17–23

Mikhail Lanin - Post-graduate student, Department of Images Recognition and Text Processing, Faculty of Innovations and High Technologies, Moscow Institute of Physics and Technology (State University); Software engineer, ABBYY Production.
Address: 9, Institutskiy per., Dolgoprudny, Moscow Region, 141700, Russian Federation.
E-mail: mike.lanin@gmail.com

The paper deals with automatic data extraction from semi-structured documents. End-to-end optical character recognition methods are of limited use for this kind of input. Machine learning methods are widely used to simplify the creation of structural descriptions of such documents; however, current solutions are still complicated for end users because they require manual description of document structure elements that are not directly relevant to the data to be extracted.
The article presents a possible approach to describing variable-structure document images, used in the document data capture system ABBYY FlexiCapture, and a method of automatic model creation based on the layout of all structure elements. The paper provides a detailed description of an algorithm for automatic detection of reference elements based on the user's layout of the data to be extracted, which dramatically simplifies building a structural model of an ABBYY FlexiCapture document from the user's perspective. Integrating this technology at the data extraction validation stage makes it possible to improve the structural model of a document incrementally, as the user only has to correct the location of wrongly found data. Finally, the paper describes a method for assessing the robustness of the proposed approach and presents test results. The described method of detecting reference elements has proved effective in processing actual payment documents of a number of German suppliers: 89.3% of invoices can be processed without errors and with minimal user intervention; furthermore, data were extracted correctly from 97.8% of fields.

Mikhail Orlov

An algorithm for multicriteria stratification

24–35

Mikhail A. Orlov - Post-graduate Student, Department of Data Analysis and Artificial Intelligence, Faculty of Computer Science, National Research University Higher School of Economics.
Address: 20, Myasnitskaya street, Moscow, 101000, Russian Federation.
E-mail: ormian@mail.ru

This paper elaborates an approach to the problem of multicriteria ranking referred to as multicriteria stratification. The target of stratification is an ordered partition with a predefined number of classes (strata) rather than a complete ranking of the set of objects. The ranking is computed by means of a linear convolution of the criteria with some weights. These weights are based on the assumption that the data fit a linear structure in which "parallel" layers (strata) can be identified.
In the paper [6] the authors formulated the problem of multicriteria stratification as the minimization of a cost function depending on the criteria weights; however, the random search algorithm proposed in that paper to solve this problem demonstrated low performance in comparison with some other stratification approaches.
In this paper a new algorithm based on quadratic programming is proposed to optimize the multicriteria stratification target function. A more sophisticated synthetic data generator has been developed for a comparative study of stratification algorithms. The new data generator has more parameters to tune and allows more flexible control of the geometry of synthetic strata (orientation, thickness, spread and intensity of layers), which makes it possible to reflect the structure of real data.
The novel algorithm has been compared experimentally with existing stratification approaches on synthetic data, and its competitiveness has been shown in the majority of case studies. Two real-world datasets have been processed: bibliometric indicators of 118 scientific journals and parameters of the publication activity of 102 countries. Applied to these data, the new algorithm has produced sensible and well interpretable outputs. Furthermore, on these data the proposed algorithm found the multicriteria stratification that is most coherent with those computed from each single criterion.
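The idea of ranking by a linear convolution of criteria and then cutting the ranking into ordered strata can be sketched as follows. This is only an illustration under stated assumptions: the weights are given as input rather than found by the quadratic programming procedure the abstract describes, the strata are formed as equal-sized groups of the ranking (an assumption, not the authors' rule), and the name `stratify` is hypothetical.

```python
def stratify(objects, weights, n_strata):
    """Score objects by a weighted sum of criteria and assign ordered strata.

    `objects` is a list of criteria vectors, `weights` a list of criteria
    weights (assumed given here, not optimized). Stratum 0 is the top one.
    """
    # Linear convolution: one aggregate score per object.
    scores = [sum(w * x for w, x in zip(weights, row)) for row in objects]
    # Indices of objects from the highest score to the lowest.
    order = sorted(range(len(objects)), key=lambda i: scores[i], reverse=True)
    strata = [0] * len(objects)
    for rank, i in enumerate(order):
        # Assumption: strata are equal-sized slices of the ranking.
        strata[i] = rank * n_strata // len(objects)
    return scores, strata


scores, strata = stratify(
    [[3.0, 1.0], [1.0, 1.0], [2.0, 2.0], [0.0, 0.0]],
    weights=[0.75, 0.25],
    n_strata=2,
)
```

With these inputs the first and third objects fall into the top stratum and the other two into the second one; changing the weights changes the aggregate scores and may therefore move objects between strata.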

Information systems and technologies in business

Igor Fomin, Nadezhda Serdyukova

Computational model for measuring electric power in meter-to-cash systems

36–44

Igor N. Fomin - Post-graduate Student, Department of Applied Informatics and Software Engineering, International School of Applied Information Technology, Yuri Gagarin State Technical University of Saratov.
Address: 77, Politechnicheskaya Street, Saratov, 410054, Russian Federation.
E-mail: ignik16@yandex.ru

Nadezhda V. Serdyukova - MSc Program Student, Department of Applied Informatics and Software Engineering, International School of Applied Information Technology, Yuri Gagarin State Technical University of Saratov.
Address: 77, Politechnicheskaya Street, Saratov, 410054, Russian Federation.
E-mail: serdukova@orgcentr.com

This article addresses building a computational model for measuring electric power in information systems designed to support automated pricing in the retail electricity market of the Russian Federation.
The object of this study is the retail electricity market and the business processes of retail power supply enterprises. The subject of the study is arrays of data on energy consumption patterns.
The purpose of this study is to build a conceptual framework and an information system model for calculating retail prices and the cost of electric power consumed.
The study addresses the following tasks:
· to identify entities that constitute parameters to calculate electric power prices and costs;
· to classify user data and estimate data involved in automated pricing, and to formulate classification criteria;
· to select various options to integrate these parameters;
· to simulate the structure of a database that forms the computational model for measuring electric power.
This paper presents definitions of user data and estimate indicators, which are elements of industry-specific corporate information systems; describes the underlying principles applied to build the computational model for measuring electric power when designing and implementing meter-to-cash systems; and formulates definitions of various electric power concepts from the perspective of information technologies.
Pricing in the retail electricity market is regarded as a socio-economic process to which mathematical simulation and systems analysis methods have been applied. The suggested classification criteria and the rules for building links in the database structure ensure stable links between elements and a stable organization, which manifests itself in a decrease in the entropy of the system as a whole compared with its backbone elements and factors.
The detected and described links can offer practical help in addressing the industry-specific software engineering and parametric adaptation challenges faced in the development of meter-to-cash systems.
This study involves simulation of the database structure that forms the computational model for measuring electric power, which makes it possible to operate on various contractual and technical power supply inputs and to determine retail prices and the cost of electric power consumed.
The theoretical and analytical research enabled the authors to create an innovative product, the software solution “AS Energo.UPEL” (certificate of state registration # 2013615624 issued on 17.06.2013), which has been successfully tested at power supply enterprises in several Russian regions.

Valeriy Chebotarev, Alexander Gromov

Automation of education process

45–52

Valeriy Chebotarev - Associate Professor, Department of Modeling and Business Process Optimization, National Research University Higher School of Economics.
Address: 20, Myasnitskaya str., Moscow, 101000, Russian Federation.
E-mail: vchebotarev@hse.ru

Alexander Gromov - Professor, Department of Modeling and Business Process Optimization, National Research University Higher School of Economics.
Address: 20, Myasnitskaya str., Moscow, 101000, Russian Federation.
E-mail: agromov@hse.ru

The article is based on the results obtained by the participants of a research seminar at the Faculty of Business Informatics of the Higher School of Economics. Its objective is to study the possibilities of managing learning as a process and of applying modern information technologies and tools to it. The learning process is defined as the joint activity of participants in organizing the study of a discipline and obtaining subjectively or objectively new subject and procedural knowledge. The coursework in the discipline “Modeling and optimization of business processes” was chosen for the research. We have analyzed the specifics of the learning process, defined the requirements for process management, and modeled and assessed automated process management. The analysis is based on data from real courses at the Faculty of Business Informatics. We have identified the following requirements for the learning process: fast adaptation to changes, flexible management using various teaching methods, constant interaction with education participants, and the use of participants’ creative and reflexive abilities. Two different notations were used: eEPC (ARIS) and interaction and behavior diagrams (Metasonic). The first notation belongs to the classical modeling methodology (ARIS), and the second to the subject-oriented methodology of process management. It is shown that the subject-oriented approach meets all the requirements of the education process, which is almost impossible with the classical approach. Principles for dividing the learning process into smaller processes have been developed, together with process models (interaction diagrams and behavior diagrams of subjects). The possibility for learning subjects (teachers, students) to change process management without programming, including automated generation of a workflow application, is demonstrated.
The results of the research may be useful to any organization considering a transition from a traditional rigid education structure to a more advanced reflexive learning environment with network communications.

Mathematical methods and algorithms of business informatics

Pavel Malyzhenkov, Tatiana Babkina, Aleksey Sergeev

Flexible organizational forms design based on a transaction approach

53–62

Pavel V. Malyzhenkov - Professor, Department of Information Systems and Technologies, Faculty of Business Informatics and Applied Mathematics, National Research University Higher School of Economics.
Address: 25/12, Bolshaya Pecherskaya str., Nizhniy Novgorod, 603155, Russian Federation.
E-mail: pmalyzhenkov@hse.ru

Tatiana S. Babkina - Senior Lecturer, Department of Applied Mathematics and Informatics, Faculty of Business Informatics and Applied Mathematics, National Research University Higher School of Economics.
Address: 25/12, Bolshaya Pecherskaya str., Nizhniy Novgorod, 603155, Russian Federation.
E-mail: tbabkina@hse.ru

Aleksey I. Sergeev - Post-graduate Student, Department of Information Systems and Technologies, Faculty of Business Informatics and Applied Mathematics, National Research University Higher School of Economics.
Address: 25/12, Bolshaya Pecherskaya str., Nizhniy Novgorod, 603155, Russian Federation.
E-mail: aisergeev07bi@gmail.com

The modern international economic environment is undergoing profound transformations of business operating conditions as a consequence of the financial crisis. Organizational flexibility is currently becoming the most important characteristic of enterprises. This, in turn, presumes the adoption of organizational structures in which business relationships and an aligned IT infrastructure are recognized as a specific type of resource that a company can use to achieve competitive advantage. This research analyzes various issues of flexible organization and enterprise models that influence the functionality and architecture constraints of enterprise information systems. For the analysis the authors have applied the concept of a transaction mechanism and a specific design methodology. This paper offers an insight into the key properties of four flexible organizational forms in close connection with the Enterprise Ontology formal modeling approach and DEMO, which follow the language-action perspective.

Software engineering

Denis Pashchenko, Andrey Blinov

Standardization in software production at the corporate level: Results of research in CIS

63–71

Denis S. Pashchenko - Independent consultant in software domain.
E-mail: denpas@rambler.ru

Andrey O. Blinov - Professor, Department of General Management, Faculty of Management, Financial University under the Government of the Russian Federation; Corresponding Member of the Russian Academy of Natural Sciences.
Address: 49, Leningradskiy Prospect, Moscow, 125993, Russian Federation.
E-mail: aoblinov@mail.ru

This paper focuses on the challenges associated with modifying and enhancing process models in software production in the CIS region, which are usually accompanied by specific risks and organizational resistance and aggravated by the weakness of formal corporate change management structures. All findings and conclusions are based on the authors’ survey, carried out at the end of 2013, which covered 21 managers of software companies from the CIS. The study aimed to address challenging issues of software production standardization and certification, organizational resistance and other specifics of change management at the level of an entire company. This paper highlights relevant institutional interventions to support change management at the planning, staff preparation and change implementation phases. The experts have ascertained that a systemic approach to change management is needed, including formal change planning activities and the establishment of a dedicated change management team for an internal project. The experts have also shared their practical experience and outputs: typical challenges, change reinforcement techniques and transformation timeframes.
The authors have summarized their research findings in the following recommendations: to use a 4-stage lifecycle change plan, to manage general and specific risks at all stages, to formalize change management, and to use the results of change implementation analysis in future practice.

Tatiana Bogdanova, Tatiana Yakovets

World demographic situation from the perspective of global demographic balance

72–77

Tatiana Bogdanova - Associate Professor, Department of Business Analytics, Faculty of Business Informatics, National Research University Higher School of Economics.
Address: 20, Myasnitskaya str., Moscow, 101000, Russian Federation.
E-mail: tanbog@hse.ru

Tatiana Yakovets - Corresponding Member of the Russian Academy of Natural Sciences; Doctoral Candidate, Institute of Social and Economic Problems of the Population of the Russian Academy of Sciences.
Address: 32, Nahimovskij prospect, Moscow, 117218, Russian Federation.
E-mail: tzag@mail.ru

In his 1971 Nobel lecture, Simon Kuznets noted that over the preceding one or two decades population growth had been ceasing to be the main force of economic growth. Accordingly, the authors have examined the contemporary demographic situation in the world based on the information given in the United Nations (UN) population prospects. This paper describes the global demographic balance method, which covers 5 age cohorts of the population of 20 countries and regions of the world for the period 1950-2010 and the UN population prospects to 2050. This method has been applied to analyze quantitative parameters of the demographic situation in developed, least developed and developing countries. Developed countries, which have passed the demographic transition, will face depopulation in the 21st century. The age structure of depopulation trends in these countries is given. In the least developed countries population growth persists, but at a lower pace than in the second half of the 20th century. BRICS countries stand out among developing countries. To assess the qualitative characteristics of the countries of the world, the Human Development Index (HDI) has been used. This paper outlines the characteristics of this indicator given by the United Nations Development Programme (UNDP). HDI values for BRICS countries are specified, and the conclusion is drawn that the economies of these countries need qualitative growth. Outputs of world population simulations and projections by G.P. Gorshkov, B.M. Dolgopolov and A.A. Akayev, adjusted for the ultimate capacity of the biosphere, are presented. A conclusion has been formulated that the projections by S.P. Kapitsa and UN experts that disregard the ultimate capacity of the biosphere are more realistic.