Improving bankruptcy prediction using Data Envelopment Analysis scores
Abstract
Most current bankruptcy prediction models are based on financial ratios, although their use is not supported by formal theory and their interpretation is problematic. One way to improve predictive models is to study other measures of firm performance, such as data envelopment analysis (DEA) scores. However, this raises the problem of choosing the optimal DEA specification, since the specification determines the shape of the efficiency frontier and the predictive properties of the model. This paper presents a method for automatically designing DEA models whose scores are then used as features to improve the quality of bankruptcy predictors. The method has two goals. The first is to improve prediction accuracy. The second is explanatory: if DEA scores improve the prediction, then the specification of the DEA model can provide information about the causes of failure. In the first step, accounting measures that are potentially suitable for DEA are selected using hierarchical clustering. The second step explores the causal relationships between the selected measures. The third step calculates pure technical efficiency, scale efficiency and mix efficiency. Experiments with two datasets show that including these scores in the feature set improves the AUC-ROC by more than 20%, which is superior to the results of previous works. The analysis of the DEA models provides insight into the reasons for a firm’s failure in both stable and crisis periods.
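To make the efficiency scores named in the abstract more concrete, the following minimal sketch computes two of them for a toy case with one input and one output per firm. It is an illustration under stated assumptions, not the paper's implementation: the data are invented, the function names are hypothetical, and it relies on the standard textbook decomposition in which pure technical efficiency is the input-oriented score under variable returns to scale (BCC) and scale efficiency is the ratio of the constant-returns (CCR) score to the BCC score. Mix efficiency, which requires a slacks-based model, is omitted.

```python
from itertools import permutations

# Hypothetical firms as (input, output) pairs; values are invented for illustration.
units = [(2.0, 1.0), (4.0, 4.0), (8.0, 6.0), (6.0, 3.0)]
names = ["A", "B", "C", "D"]


def crs_efficiency(units, i):
    """Input-oriented CCR score (constant returns to scale) of unit i.

    With one input and one output this is the unit's output/input ratio
    divided by the best ratio in the sample.
    """
    x0, y0 = units[i]
    best_ratio = max(y / x for x, y in units)
    return (y0 / x0) / best_ratio


def vrs_efficiency(units, i):
    """Input-oriented BCC score (variable returns to scale) of unit i.

    Finds the smallest input that a convex combination of observed units
    needs in order to produce at least y0, then divides it by x0.
    """
    x0, y0 = units[i]
    # Candidate 1: any single observed unit that produces at least y0.
    candidates = [x for x, y in units if y >= y0]
    # Candidate 2: a mix of two units that produces exactly y0.
    for (xa, ya), (xb, yb) in permutations(units, 2):
        if ya < y0 < yb:
            t = (yb - y0) / (yb - ya)  # weight on the lower-output unit
            candidates.append(t * xa + (1 - t) * xb)
    return min(candidates) / x0


def scale_efficiency(units, i):
    """Scale efficiency as the ratio of the CCR score to the BCC score."""
    return crs_efficiency(units, i) / vrs_efficiency(units, i)


for i, name in enumerate(names):
    print(
        f"{name}: pure technical = {vrs_efficiency(units, i):.3f}, "
        f"scale = {scale_efficiency(units, i):.3f}"
    )
```

In this toy sample, firm D is the only one that lies below the variable-returns frontier, so it is the only firm with a pure technical efficiency below 1. With several inputs and outputs the same scores are obtained by solving one linear program per firm, and the resulting columns can be appended to the feature matrix of any classifier.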
Copyright (c) 2025 HSE University

This work is licensed under a Creative Commons Attribution 4.0 International License.








