A multimodal model for commercial real estate valuation based on the integration of geospatial data and language-model-derived features

Keywords: street retail, automated real estate valuation, commercial real estate, machine learning, geospatial data, large language models, LLM, multimodal data

Abstract

The aim of this paper is to substantiate methodological approaches to commercial real estate valuation based on the integration of heterogeneous data, and to demonstrate their application to the Moscow market. A methodology for constructing a multimodal automated valuation model (AVM) is presented, built on the integration of structured, geospatial, and semantic data. The study was conducted in the street retail segment of the Moscow commercial real estate market. Three sequential feature configurations were developed: a baseline model, a model extended with geospatial characteristics, and a model further augmented with interpretable features extracted from property text descriptions via the GigaChat language model. Modeling was performed with the CatBoost and LightGBM algorithms. Hyperparameters were optimized through cross-validation on the training sample, and final model quality was assessed on a held-out test set. Adding geospatial features reduces the mean absolute percentage error (MAPE) by 17–20% for both algorithms, a significant reduction. The inclusion of LLM-derived semantic features yields a further accuracy gain of 1.6–3.3%. SHAP analysis confirms the dominant role of spatial factors, while qualitative property characteristics also contribute meaningfully to model output. The multimodal approach delivers higher accuracy and interpretability than conventional single-source models, demonstrating its applicability to mass appraisal tasks and its practical value for the banking sector, real estate developers, and valuation firms. The theoretical contribution of this work lies in identifying a viable direction for the development of valuation methodology in the era of digital data. The approach does not substitute for expert judgment; it reinforces that judgment by providing practitioners with a scalable, reproducible, and objective analytical instrument that formally incorporates both spatial and semantic context.
The results have practical relevance for real estate lending institutions, valuation companies, and entities owning or managing commercial property portfolios. They are also directly applicable to regional state budgetary institutions of the Russian Federation responsible for cadastral valuation, and to Rosreestr (the Federal Service for State Registration, Cadaster and Cartography) in its efforts to advance mass property appraisal methodology at the national level.
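
As a reading aid, the accuracy metric named in the abstract can be illustrated with a minimal sketch. It computes the mean absolute percentage error for two sets of predictions and the relative reduction between them. All prices and predictions are hypothetical and chosen only for easy arithmetic, the function names are illustrative, and the abstract does not state whether the reported 17–20% figure is a relative reduction or a difference in percentage points, so the relative form used here is an assumption.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_reduction(baseline_error: float, new_error: float) -> float:
    """Reduction of an error metric, in percent of the baseline value."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical observed prices and predictions (not data from the study).
actual = [100.0, 200.0, 400.0]
baseline_pred = [120.0, 160.0, 480.0]  # every prediction is off by 20%
geo_pred = [116.0, 168.0, 464.0]       # every prediction is off by 16%

baseline_mape = mape(actual, baseline_pred)
geo_mape = mape(actual, geo_pred)
reduction = relative_reduction(baseline_mape, geo_mape)

print(f"baseline MAPE: {baseline_mape:.1f}%")
print(f"extended MAPE: {geo_mape:.1f}%")
print(f"relative reduction: {reduction:.1f}%")
```

In this toy case the error falls from 20% to 16%, which is a 20% relative reduction but only 4 percentage points in absolute terms; the distinction matters when comparing reported gains across studies.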

Published
2026-06-30
How to Cite
Serikov, D. A., & Bogdanova, N. V. (2026). A multimodal model for commercial real estate valuation based on the integration of geospatial data and language-model-derived features. Business Informatics, 20(2), 22–37. Retrieved from https://bijournal.hse.ru/article/view/30221
Section
Articles