Using Machine Learning Regression Algorithms to Predict House Prices in Vietnam
Author: Minh-Thang Ha, Thi-Cham Nguyen, Thanh-Huyen Pham, Van-Hau Nguyen
Start Page / End Page: 505 / 527
Volume: 28
Issue Number: 4
Year: 2025
Publication: International Real Estate Review
Abstract
This study develops a machine learning (ML) framework for house price prediction in Vietnam, using a dataset of 28,156 property listings from a real estate website. We apply data preprocessing, feature engineering, and a comparative analysis of ML algorithms, including CatBoost, XGBoost, and random forests. The results show that ensemble methods perform best, with CatBoost achieving the highest performance on the main dataset (R² = 0.510, RMSE = 17.614). Regional analyses for Hanoi and Ho Chi Minh City show that the models adapt to local market dynamics. A SHapley Additive exPlanations (SHAP) analysis identifies key drivers of house prices, such as area, population density, and property-specific attributes. The findings contribute to the academic understanding of real estate valuation and provide actionable insights for policymakers, investors, and other stakeholders. This study lays the groundwork for developing automated valuation models and putting them into practice, as exemplified by a website application. By combining ML with data-driven insights, this research supports transparent, efficient, and informed decision-making in the real estate sector in Vietnam, and offers a methodology for house price prediction in emerging markets.
View PDF – https://doi.org/10.53383/100412
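The abstract reports model quality as R² and RMSE. As a reading aid, the minimal sketch below shows how these two metrics are conventionally defined. The function names and the toy price values are hypothetical illustrations, not the authors' code or data, and the abstract does not state the unit of the reported RMSE.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: typical size of a prediction error, in price units."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: share of price variance explained by the model."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total variation
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values, NOT taken from the study's dataset.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(f"RMSE = {rmse(actual, predicted):.3f}")
print(f"R^2  = {r_squared(actual, predicted):.3f}")
```

Under these definitions, an R² of 0.510 means the model accounts for roughly half of the variance in listing prices, while RMSE expresses the typical error in the same unit as the price variable.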
Keywords
House price prediction, Machine learning, Regression algorithms, Real estate market, Ensemble models