Course Project MISM 6214: Business Analytics Capstone

Optimization of Airbnb Pricing Strategies

In short-term rentals, and on Airbnb in particular, setting the right listing price is pivotal for attracting guests and increasing revenue. This project uses data analytics to help Airbnb hosts refine their pricing strategies amid increasing competition. Short-term rental listings in the U.S. have grown 23.2% year over year to 1.38 million, driven by wealthier individuals joining platforms like Airbnb. This growth intensifies competition for individual hosts even as Airbnb's overall business flourishes. Our project addresses that gap with a tool built on data analysis and predictive modeling. The tool guides hosts in pricing decisions, with the goal of strengthening their market position, bookings, and revenue.

Dataset

  • Source: Inside Airbnb.

  • Listings.csv

  • Calendar.csv

  • Neighborhoods.csv

  • Data sets from other cities:

    • Boston: 3862 rows, 75 columns.

    • Seattle: 6376 rows, 75 columns.

    • Denver: 5362 rows, 75 columns.

Data Wrangling

  • Inspected the datasets using data profiling and functions such as describe() and info().

  • Combined multiple datasets.

  • Handled duplicates and missing (null) values by dropping them or filling with specific values.

  • Dropped irrelevant columns.

  • Converted data types and encoded categorical data.
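
The wrangling steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the project's actual pipeline: the two tiny frames stand in for per-city Listings.csv extracts, and the column names (`id`, `price`, `beds`, `room_type`, `listing_url`), the choice of median imputation, and the helper `wrangle` are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical miniature extracts standing in for per-city Listings.csv files.
boston = pd.DataFrame({
    "id": [1, 2, 2],
    "price": ["$150.00", "$1,200.00", "$1,200.00"],
    "beds": [1.0, None, None],
    "room_type": ["Entire home/apt", "Private room", "Private room"],
    "listing_url": ["u1", "u2", "u2"],
})
denver = pd.DataFrame({
    "id": [3],
    "price": ["$95.00"],
    "beds": [2.0],
    "room_type": ["Private room"],
    "listing_url": ["u3"],
})


def wrangle(frames):
    """Combine city frames and apply the cleaning steps listed above."""
    # Combine multiple datasets, tagging each row with its city.
    df = pd.concat(
        [frame.assign(city=city) for city, frame in frames.items()],
        ignore_index=True,
    )
    # Handle duplicates: keep one row per (city, id) pair.
    df = df.drop_duplicates(subset=["city", "id"])
    # Drop a column assumed to be irrelevant to pricing.
    df = df.drop(columns=["listing_url"])
    # Convert data types: "$1,200.00" -> 1200.0
    df["price"] = df["price"].str.replace(r"[$,]", "", regex=True).astype(float)
    # Fill missing values (median imputation is an assumption here).
    df["beds"] = df["beds"].fillna(df["beds"].median())
    # Encode categorical data as 0/1 indicator columns.
    df = pd.get_dummies(df, columns=["room_type", "city"], dtype=int)
    return df.reset_index(drop=True)


clean = wrangle({"Boston": boston, "Denver": denver})
print(clean)
```

Whether to drop or fill a missing value depends on the column; the median fill shown here is only one of the "specific values" a project like this might choose.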

Exploratory Data Analysis

Highlights:

  • Largest gap between superhost and regular-host counts, at 32.85%.

  • Price varies from $110 to $301 based on location.

Highlights:

  • 48% of the 5361 listings are managed by superhosts.

  • Price ranges from $69/night to $313/night based on location.

  • Superhosts have higher review scores.

Highlights:

  • Highest proportion of superhosts: 58.3% of 5886 listings.

  • Price fluctuates dramatically from $53 to $615.

Feature Selection

  • Correlation matrix

    • Higher absolute values indicate a stronger relationship

  • Random Forest

    • Accuracy: 0.13

  • Select-K-Best

    • Used chi-squared test.

    • Accuracy: 0.08
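
Two of the feature-selection steps above, the correlation ranking and Select-K-Best with a chi-squared test, might look like the sketch below. It runs on synthetic data, so the feature names and the price formula are invented for illustration. Because scikit-learn's `chi2` expects non-negative features and a categorical target, the sketch assumes price is first binned into high and low classes at the median; the source does not say how the project prepared its target.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(0)
n = 200

# Synthetic, non-negative listing features (names are hypothetical).
X = pd.DataFrame({
    "bedrooms": rng.integers(1, 5, n),
    "bathrooms": rng.integers(1, 4, n),
    "is_superhost": rng.integers(0, 2, n),
    "noise": rng.integers(0, 10, n),
})
# Made-up price formula so that the "right" features are known in advance.
price = 60 * X["bedrooms"] + 25 * X["bathrooms"] + rng.normal(0, 10, n)

# Correlation ranking: higher absolute values indicate a stronger
# linear relationship with price.
corr_with_price = X.corrwith(price).abs().sort_values(ascending=False)

# Select-K-Best with chi-squared: bin price into low (0) / high (1)
# at the median (an assumption), then keep the k best-scoring features.
price_class = (price > price.median()).astype(int)
selector = SelectKBest(score_func=chi2, k=2).fit(X, price_class)
selected = X.columns[selector.get_support()].tolist()

print(corr_with_price)
print("Selected features:", selected)
```

For a continuous price target, a regression-oriented score such as `f_regression` is the usual alternative to `chi2` in `SelectKBest`.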

Machine Learning Models

Linear Regression

  • Modeled for Boston, Seattle, and Denver.

  • Variables: Bathrooms, Beds, Superhost status, Neighborhoods, Amenities.
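
A linear regression on variables of this kind could be sketched as below. The data are synthetic and the neighborhood premiums (+80 and -50) are made up for the example; they are not the project's estimates. The column names and the choice of "Other" as the baseline neighborhood are likewise assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
n = 300

# Synthetic listings (all values are invented for illustration).
df = pd.DataFrame({
    "bathrooms": rng.integers(1, 4, n),
    "beds": rng.integers(1, 6, n),
    "is_superhost": rng.integers(0, 2, n),
    "neighbourhood": rng.choice(["Back Bay", "East Boston", "Other"], n),
})
# Made-up data-generating process with known neighborhood premiums.
premium = df["neighbourhood"].map(
    {"Back Bay": 80.0, "East Boston": -50.0, "Other": 0.0}
)
df["price"] = (
    100 + 30 * df["bathrooms"] + 20 * df["beds"]
    + 10 * df["is_superhost"] + premium + rng.normal(0, 5, n)
)

# One-hot encode the neighborhood and drop one level as the baseline,
# so each remaining dummy coefficient is measured relative to "Other".
X = pd.get_dummies(df.drop(columns="price"), columns=["neighbourhood"], dtype=float)
X = X.drop(columns="neighbourhood_Other")

model = LinearRegression().fit(X, df["price"])
coefs = pd.Series(model.coef_, index=X.columns)
print(coefs.round(2))
```

Each neighborhood coefficient reads as the expected price difference from the baseline neighborhood for a listing with the same bathrooms, beds, and superhost status, which is what "holding all else constant" means in the output that follows.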

Selected Output & Insights

  • Back Bay listing:

    • Associated with a price about $82.33 higher (coefficient 82.332), holding all else constant.

    • Desirable features include:

      • Proximity to downtown, attractions, dining, and shopping.

      • Historic charm.

      • Unique local amenities leading to higher prices.

  • East Boston listing:

    • Associated with a price about $49.86 lower (coefficient -49.858), holding all else constant.

    • Possible reasons for lower desirability:

      • Distance from tourist hotspots.

      • Less appealing local amenities.

      • Perceived safety concerns.

Logistic Regression

  • Classifies each listing's price as Low (0) or High (1).
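
A low/high price classifier of this kind could be sketched as follows. The data are synthetic, the feature names are hypothetical, and the median split used to define Low (0) and High (1) is an assumption; the source does not state the threshold the project used.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n = 400

# Synthetic listing features (names are hypothetical).
df = pd.DataFrame({
    "bathrooms": rng.integers(1, 4, n),
    "is_private_room": rng.integers(0, 2, n),
    "is_central": rng.integers(0, 2, n),
})
# Made-up price formula: central location raises price, private rooms lower it.
price = (
    120 + 40 * df["bathrooms"] - 60 * df["is_private_room"]
    + 50 * df["is_central"] + rng.normal(0, 15, n)
)

# Label each listing Low (0) or High (1) by splitting at the median price.
y = (price > price.median()).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, random_state=0, stratify=y
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Positive coefficients raise the log-odds of the High class.
coefs = pd.Series(clf.coef_[0], index=df.columns)
accuracy = clf.score(X_test, y_test)
print(coefs.round(2))
print("Held-out accuracy:", round(accuracy, 3))
```

A positive coefficient means the feature increases the odds that a listing falls in the High class, and a negative one decreases them.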

Selected Output & Insights

  • Boston insights:

    • Central neighborhoods like Back Bay have positive coefficients.

    • Private room type generally results in a lower price.

  • Seattle insights:

    • Downtown neighborhoods like Central Business District have positive coefficients.

    • Listings with more than three bathrooms are likely higher priced.

  • Denver insights:

    • Listings with 2+ shared baths decrease predicted log price.

    • Listings with 3+ baths increase predicted log price.

Selected Final Output