A multimodal model for commercial real estate valuation based on the integration of geospatial data and language-model-derived features
Abstract
The aim of this paper is to substantiate methodological approaches to commercial real estate valuation through the integration of heterogeneous data, and to demonstrate their application within the Moscow market. A methodology for constructing a multimodal automated valuation model (AVM) is presented, built on the integration of structured, geospatial, and semantic data. The study was conducted in the street retail segment of the Moscow commercial real estate market. Three sequential feature configurations were developed: a baseline model, a model extended with geospatial characteristics, and a model further augmented with interpretable features extracted from property text descriptions via the GigaChat language model. Modeling was performed with the CatBoost and LightGBM algorithms. Hyperparameters were optimized through cross-validation on the training sample, and final model quality was assessed on a held-out test set. Adding geospatial features reduces the mean absolute percentage error (MAPE) by 17–20% for both algorithms, a significant reduction. The inclusion of LLM-derived semantic features yields a further accuracy gain of 1.6–3.3%. SHAP analysis confirms the dominant role of spatial factors, while qualitative property characteristics also contribute meaningfully to model output. The multimodal approach delivers higher accuracy and interpretability than conventional single-source models, demonstrating its applicability to mass appraisal tasks and its practical value for the banking sector, real estate developers, and valuation firms. The theoretical contribution of this work lies in identifying a viable direction for the development of valuation methodology in the era of digital data. The approach does not substitute for expert judgment; it reinforces it, providing practitioners with a scalable, reproducible, and objective analytical instrument that formally incorporates both spatial and semantic context.
The results have practical relevance for real estate lending institutions, valuation companies, and entities owning or managing commercial property portfolios. They are also directly applicable to regional state budgetary institutions of the Russian Federation responsible for cadastral valuation, and to Rosreestr (the Federal Service for State Registration, Cadaster and Cartography) in its efforts to advance mass property appraisal methodology at the national level.
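The accuracy figures in the abstract are stated in terms of the mean absolute percentage error. As a minimal sketch of how such a comparison between feature configurations can be computed, the Python snippet below defines MAPE and the relative reduction between two error levels. The function names and all prices are hypothetical and chosen only for illustration; they are not the paper's data. The snippet also assumes that the reported reductions are relative to the previous configuration's error, which the abstract does not state explicitly.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical prices (arbitrary units), NOT taken from the paper.
actual = [100.0, 200.0, 400.0]
baseline_pred = [125.0, 150.0, 500.0]  # each prediction is off by 25%
geo_pred = [120.0, 160.0, 480.0]       # each prediction is off by 20%

baseline_mape = mape(actual, baseline_pred)
geo_mape = mape(actual, geo_pred)

print(f"baseline MAPE: {baseline_mape:.1f}%")
print(f"with geospatial features: {geo_mape:.1f}%")
print(f"relative reduction: {relative_reduction(baseline_mape, geo_mape):.1f}%")
```

Under this reading, a drop in MAPE from 25% to 20% is a relative reduction of about 20%, not a reduction of 20 percentage points.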
Copyright (c) 2026 National Research University Higher School of Economics

This work is licensed under a Creative Commons Attribution 4.0 International License.







