Let’s take a look at Seattle AirBnB Open Data using Python

Which features are most related to homestay cost? How do seasonal costs change? What is different between superhosts and regular hosts? Let’s find the answers by analyzing the data in Python.

Photo by MILKOVÍ on Unsplash

Summary

Seattle AirBnB Open Data describes the listing activity of AirBnB homestays in Seattle, WA until 2016. It consists of 3 files:

  • calendar.csv — including listing id and the price and availability for that day
  • reviews.csv — including unique id for each reviewer and detailed comments

How do seasonal costs change?

The graphs below show the mean and quantile prices by month.

  1. Prices fluctuate mainly at the high end; cheap rooms keep nearly the same price even when the season changes.
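The monthly statistics described above could be computed along these lines. This is only a sketch: it assumes calendar.csv has `date` and `price` columns and that prices are stored as strings such as "$1,250.00". The function name `monthly_price_stats` and the tiny sample table are made up for illustration.

```python
import pandas as pd


def monthly_price_stats(calendar: pd.DataFrame) -> pd.DataFrame:
    """Return the mean and quartile prices for each calendar month."""
    # Days without a price (assumed to be unavailable days) are skipped.
    df = calendar.dropna(subset=["price"]).copy()
    # Assumption: prices look like "$1,250.00", so strip "$" and ",".
    df["price"] = (
        df["price"].astype(str).str.replace(r"[$,]", "", regex=True).astype(float)
    )
    df["month"] = pd.to_datetime(df["date"]).dt.month
    grouped = df.groupby("month")["price"]
    # One row per month, one column per quantile.
    stats = grouped.quantile([0.25, 0.5, 0.75]).unstack()
    stats.columns = ["q25", "median", "q75"]
    stats.insert(0, "mean", grouped.mean())
    return stats


# Made-up rows standing in for calendar.csv.
sample = pd.DataFrame({
    "listing_id": [1, 1, 2, 2, 3],
    "date": ["2016-01-04", "2016-07-04", "2016-01-05", "2016-07-05", "2016-07-06"],
    "price": ["$80.00", "$100.00", "$120.00", "$1,200.00", None],
})
stats = monthly_price_stats(sample)
print(stats)
```

Comparing the `q25` column with the `q75` column month by month is one way to check whether the seasonal movement really is concentrated in the expensive listings.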

Which features are most related to homestay cost?

Correlation analysis is easy with `pandas.DataFrame.corr`. Let’s show a heatmap of the correlation matrix.
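As a rough sketch of that step, the snippet below computes the correlation matrix and ranks the features by their correlation with price. The column names and the sample rows are invented for illustration, and `price` is assumed to have been converted to a number already. The helper name `price_correlations` is hypothetical.

```python
import pandas as pd


def price_correlations(listings: pd.DataFrame, target: str = "price") -> pd.Series:
    """Rank numeric features by their Pearson correlation with `target`."""
    # Keep numeric columns only; text columns cannot be correlated.
    numeric = listings.select_dtypes(include="number")
    corr = numeric.corr()  # Pearson correlation is the default method
    return corr[target].drop(target).sort_values(ascending=False)


# Made-up rows standing in for listings.csv.
sample = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "accommodates": [1, 2, 2, 4, 6, 8],
    "bedrooms": [1, 1, 1, 2, 3, 4],
    "number_of_reviews": [40, 55, 10, 30, 5, 8],
    "price": [50.0, 75.0, 80.0, 130.0, 200.0, 260.0],
})
ranking = price_correlations(sample)
print(ranking)
```

For the heatmap itself, the full matrix returned by `corr()` can be handed to a plotting function such as `seaborn.heatmap`.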

Correlations related to reviews

  • The individual review score features are strongly correlated with each other.
  • The number of reviews and price have a slight negative correlation. Expensive homestays presumably attract fewer guests and therefore collect fewer reviews.

Correlations related to bathrooms, accommodates, bedrooms, beds

  • The number of bedrooms, the number of bathrooms, and the number of beds are highly correlated with each other and with price. It’s natural to think that the more rooms, beds, and bathrooms a homestay has, the more expensive it becomes. However, the number of guests a homestay accommodates had the strongest correlation with price among them (0.69).
  • The minimum and maximum lengths of stay, which indicate whether a homestay can be rented long term, are correlated with each other but not significantly correlated with any other feature.

What is different between superhosts and regular hosts?

  1. Another difference is that, on average, superhosts have fewer rooms than the rest of the hosts.
  2. Homestays operated by superhosts have slightly fewer beds, bathrooms, and accommodated guests on average, yet they have shorter minimum/maximum stay periods and allow more included guests. The differences are not big, though.
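A comparison like the one above can be sketched with a group-by on the superhost flag. The snippet assumes a `host_is_superhost` column holding "t" or "f". The helper `compare_hosts`, the chosen feature columns, and the sample rows are illustrative only.

```python
import pandas as pd


def compare_hosts(listings: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Average each feature separately for superhosts and regular hosts."""
    # Assumption: the flag is stored as "t"/"f"; other values become NaN
    # and are left out of both groups.
    is_super = listings["host_is_superhost"].map({"t": True, "f": False})
    means = listings[columns].groupby(is_super).mean()
    means.index = means.index.map({True: "superhost", False: "regular"})
    table = means.T  # one row per feature, one column per host type
    table["difference"] = table["superhost"] - table["regular"]
    return table


# Made-up rows standing in for listings.csv.
sample = pd.DataFrame({
    "host_is_superhost": ["t", "f", "f", "t"],
    "beds": [1, 3, 2, 2],
    "number_of_reviews": [40, 10, 20, 60],
    "review_scores_rating": [98, 94, 92, 96],
})
table = compare_hosts(sample, ["beds", "number_of_reviews", "review_scores_rating"])
print(table)
```

Sorting the result by the absolute value of `difference` only makes sense after the features are put on a comparable scale, since a gap of 4 rating points and a gap of 35 reviews are not directly comparable.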

Date the host started hosting

Location differences between superhosts and regular hosts

It was not possible to identify nearby facilities with the given data, so I marked each host’s location on the map instead. Red marks superhosts and blue marks regular hosts.

Conclusion

  1. Accommodation prices increase in summer and December and are lowest at the beginning of the year. However, the lower the price, the lower the volatility.
  2. The size of the accommodation has a big impact on the price. Price and number of reviews have a weak negative correlation.
  3. The biggest distinction between superhosts and the rest is the number of reviews. Superhosts’ ratings are slightly higher than the rest, but at about the same level.
