Dynamic pricing in the lodging industry: how can machine learning aid revenue management activities
Machine learning and data analytics have become a fundamental theme for businesses, and they have found applications in the lodging industry as well, in areas such as dynamic pricing, customer segmentation, demand forecasting, ratings and reviews analysis, cancellation prediction, and service customization. Price-setting operations are fundamental to the revenue management process of most businesses, and they have become a time-consuming task for managers, who must stay constantly up to date on trends, events, and market conditions over extended periods of time.
Currently, advanced analytics are used in the hospitality industry mostly by large companies that have access to considerable amounts of data. Nevertheless, they can also provide significant support to smaller businesses competing in the industry. Analyzing how this can take place was the primary subject of this research, which focused specifically on pricing support and on strategies that move towards a dynamic approach.
First, practices in dynamic pricing, demand forecasting, and cancellation prediction in the lodging industry were reviewed, with particular attention to the possibilities for daily use of these technologies by short-term rentals, and data were gathered by surveying local businesses in Verona, Italy. This first analysis showed how the thirty-four surveyed businesses approach pricing strategies and cancellation policies. Specifically, most of them performed these operations multiple times a week or more (59% of respondents) and relied on manual price adjustments (77%), revealing how time-consuming this procedure is: it can occupy managers for a sizable portion of their daily activities.
The survey also exposed the negative impact that cancellations regularly have on hotels and short-term rentals, and it provided insight into the variables these businesses consider during their pricing activities, which later aided the modeling process.
Using Property Management System (PMS) data provided by a local property manager and rental business, a framework and model for price prediction were created, with the aim of understanding the potential of machine learning technologies in this industry. The model was prepared and evaluated using Python and KNIME, with a dataset containing 3992 observations covering 15 months of activity between 2021 and 2022. New variables were derived from the initial dataset according to the previously collected insights, to extract as much value and information as possible, and the PMS data were joined with weather data so as to include proxies and measures for demand.
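The variable-creation step can be illustrated with a minimal sketch, under stated assumptions. The field names, the sample reservations, the Friday/Saturday definition of a weekend stay, and the choice to measure the five-day window from the booking date are all hypothetical; the actual PMS schema and definitions are not given here.

```python
from datetime import date, timedelta

# Hypothetical reservation records; field names are illustrative, not the PMS schema.
reservations = [
    {"booked_on": date(2022, 5, 1), "check_in": date(2022, 5, 6), "nights": 2, "price_per_night": 90.0},
    {"booked_on": date(2022, 5, 3), "check_in": date(2022, 6, 10), "nights": 3, "price_per_night": 110.0},
    {"booked_on": date(2022, 5, 4), "check_in": date(2022, 5, 9), "nights": 1, "price_per_night": 100.0},
    {"booked_on": date(2022, 5, 10), "check_in": date(2022, 5, 20), "nights": 2, "price_per_night": 120.0},
]

def includes_weekend(check_in, nights):
    """True if any night of the stay is a Friday or Saturday (assumed definition)."""
    return any((check_in + timedelta(days=i)).weekday() in (4, 5) for i in range(nights))

def add_features(rows, window_days=5):
    """Return copies of the rows with lead time, weekend flag, and trailing-window features."""
    rows = sorted(rows, key=lambda r: r["booked_on"])
    out = []
    for r in rows:
        start = r["booked_on"] - timedelta(days=window_days)
        # Use only reservations made strictly before this one, so no future data leaks in.
        prior = [p for p in rows if start <= p["booked_on"] < r["booked_on"]]
        prices = [p["price_per_night"] for p in prior]
        out.append({
            **r,
            "lead_time_days": (r["check_in"] - r["booked_on"]).days,
            "weekend_stay": includes_weekend(r["check_in"], r["nights"]),
            "bookings_prev_5d": len(prior),
            "price_ma_prev_5d": sum(prices) / len(prices) if prices else None,
        })
    return out

features = add_features(reservations)
```

Restricting the trailing window to earlier bookings keeps each derived value computable at the moment a price has to be set, which matters if the features are meant to act as demand proxies.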
After assessments regarding multicollinearity, outliers, and variable informativeness, the final dataset included the target, price per night, and the following independent features (predictors): length of stay, number of people, day, week, month, and year of check-in, lead time of reservation, language, booking source, room type (to identify and differentiate specific rooms), the maximum and minimum temperature at check-in, number of rooms booked during the five days prior to each reservation, occupancy rate, price moving average during the five days prior to each reservation, weekend stay (as a true/false value), and weather condition dummies for rain, fog, and storm.
The dataset was analyzed using three regression algorithms: Ordinary Least Squares (OLS), Random Forest (RF), and H2O Gradient Boosting (GB), the first through Python and the last two in a KNIME environment. The dataset was partitioned into training and testing sets, with 70% and 30% of the data respectively. OLS was used mainly for a first exploratory variable analysis through backward stepwise selection, with AIC (Akaike Information Criterion) as the selection metric, to better evaluate the predictive power of the model. After obtaining a deeper understanding of the variables’ importance, KNIME was used for the RF and GB models, running TPE-based (Tree-structured Parzen Estimator) Bayesian hyperparameter optimization for both to tune the algorithms, as shown in the figure, with RMSE as the metric to be minimized.
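As an illustration of the exploratory step, the following is a minimal sketch of backward stepwise selection driven by AIC. It is not the code used in the study: it relies on synthetic data, a hand-written least-squares solver, and the Gaussian form of AIC up to an additive constant, and every function name is hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols_aic(X, y, cols):
    """Fit OLS with an intercept on the chosen columns and return a Gaussian AIC."""
    design = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((t - sum(b * v for b, v in zip(beta, d))) ** 2 for d, t in zip(design, y))
    n = len(y)
    return n * math.log(rss / n) + 2 * k  # AIC up to an additive constant

def backward_stepwise(X, y):
    """Drop one feature at a time for as long as doing so lowers the AIC."""
    cols = list(range(len(X[0])))
    best = ols_aic(X, y, cols)
    while len(cols) > 1:
        trials = [(ols_aic(X, y, [c for c in cols if c != d]), d) for d in cols]
        aic, drop = min(trials)
        if aic >= best:
            break
        best, cols = aic, [c for c in cols if c != drop]
    return cols, best

# Synthetic data: the target depends on columns 0 and 1; column 2 is pure noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [50 + 10 * r[0] - 5 * r[1] + rng.gauss(0, 1) for r in X]

# 70/30 partition as described above; selection uses the training part only.
cut = int(0.7 * len(X))
X_train, y_train = X[:cut], y[:cut]
selected, aic = backward_stepwise(X_train, y_train)
```

Because AIC penalizes each extra coefficient, a feature is removed only when the fit does not worsen enough to justify keeping it. The two informative columns are therefore retained, while the noise column may or may not be dropped depending on the sample.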
The best model was GB, which presented good fit values, such as an R² of 0.80. However, other metrics, such as an RMSE of 23.3 and an MAE of 16.9, indicate that it is still not possible to create an automated pricing model using this data.
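The contrast between a good fit and errors that are still too large can be made concrete with the standard definitions of these three metrics. The sketch below uses invented prices purely for illustration and does not reproduce the study's figures.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute R^2, RMSE, and MAE from paired actual and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
    }

# Illustrative nightly prices, not the study's data.
actual = [100.0, 120.0, 80.0, 150.0]
predicted = [110.0, 115.0, 90.0, 130.0]
m = regression_metrics(actual, predicted)
```

R² is relative to the variance of the target, whereas RMSE and MAE are expressed in price units. A model can therefore explain most of the variation in prices and still miss an individual nightly price by an amount that matters commercially.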
This problem might be overcome by including other external variables that, as the survey showed, are often considered by managers, such as competitors’ prices, the market supply of rooms, and event calendars. These features can often be acquired through APIs or other external services. Likewise, the models could be combined with cancellation prediction algorithms so that cancellations, which, as the survey suggested, can cause significant uncertainty for businesses operating in the industry, are taken into account during pricing. Another aspect to consider is that these models were trained on sales data derived from manually set prices. The best the model can do is therefore to emulate those prices, which would certainly save managers precious time but would not improve current pricing strategies.
Despite the mentioned limitations, possible model deployment modes can be discussed, focusing on two applications:
- The model can be used to provide managers with suggested prices, as was done during testing. Managers could then adjust these prices according to their experience and personal considerations, which could significantly reduce the time spent on price-setting operations.
- The model could be deployed with the addition of warnings that notify managers when current and suggested prices differ significantly, together with alerts that track sudden changes in specific demand-related variables. This could enhance managers’ price and demand awareness, which would otherwise be hard to achieve, especially for reservations with long lead times.
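The second deployment mode can be sketched as a simple comparison between current and suggested prices. The 15% threshold, the record layout, and the function name are assumptions chosen for illustration; a real deployment would need thresholds agreed with the managers.

```python
def price_alerts(listings, threshold=0.15):
    """Flag listings whose current price deviates from the suggestion by more than `threshold`."""
    alerts = []
    for item in listings:
        # Relative gap between the price in use and the model's suggestion.
        gap = (item["current"] - item["suggested"]) / item["suggested"]
        if abs(gap) > threshold:
            direction = "above" if gap > 0 else "below"
            alerts.append((item["room"], direction, round(gap * 100, 1)))
    return alerts

# Hypothetical listings; prices are invented for the example.
listings = [
    {"room": "A", "current": 100.0, "suggested": 104.0},
    {"room": "B", "current": 80.0, "suggested": 110.0},
    {"room": "C", "current": 150.0, "suggested": 120.0},
]
alerts = price_alerts(listings)
```

In this example room A stays silent because its gap is under 4%, while rooms B and C are flagged as priced below and above the suggestion respectively, leaving the final decision with the manager.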
In conclusion, although the model itself was not found suitable for direct, automated deployment, the research showed the potential that advanced analytics has in the lodging industry. Research and survey results conveyed the importance of introducing algorithms for dynamic pricing and of finding advantageous model deployments to support pricing operations, which could both enhance businesses’ demand awareness and improve time management.