I need help solving a Business Intelligence and Data Analytics assessment that involves a lot of coding in R. I only took a winter course, and I think it is impossible for me to solve. The deadline is 5 January 2022, before midnight. I seriously need help.

University of La Sabana Winter School 2021

Business Intelligence

Assessment: Individual Home Project Assignment

Submission deadline: 12:00 pm on 05 January 2021 Colombia time

Overview

In the hotel industry, it is very common for customers to cancel their bookings before check-in or to not show up at check-in time. Both cases are usually recorded as cancellations in the hotel’s booking system. Predicting the likelihood that a booking will be cancelled can help the hotel manager allocate rooms effectively. In this assessment, you are going to predict hotel booking cancellations using data-driven models.

The data for this assessment can be downloaded from Teams. You need to write an analysis report that discusses how you completed the tasks and goes into sufficient depth to demonstrate knowledge and critical understanding of the relevant processes involved. 100% of the available marks are awarded for the written report.

Report Guidance

Your report must conform to the below structure and include the required content as described. You must supply a
written report containing three distinct sections that provide a full and reflective account of the processes undertaken.

Section I: Data Loading and Preparation (15%)

As a first step, you need to download the datasets from Teams. There are two datasets: hotel_bookings_01.csv and hotel_bookings_02.csv. The variables in both datasets are briefly explained below:

Variable | Description
ADR | Average daily rate
Adults | Number of adults
Agent | ID of the travel agency that made the booking. Null if there is no agent.
ArrivalDateDayOfMonth | Day of the month of the arrival date
ArrivalDateMonth | Month of arrival date with 12 categories: “January” to “December”
ArrivalDateWeekNumber | Week number of the arrival date
ArrivalDateYear | Year of arrival date
AssignedRoomType | Code for the type of room assigned to the booking. Sometimes the assigned room type differs from the reserved room type due to hotel operation reasons (e.g. overbooking) or by customer request. Code is presented instead of designation for anonymity reasons
Babies | Number of babies
BookingChanges | Number of changes/amendments made to the booking from the moment the booking was entered on the PMS until the moment of check-in or cancellation
Children | Number of children
Country | Country of origin. Categories are represented in the ISO 3155–3:2013 format
CustomerType | Type of booking, assuming one of four categories: Contract – when the booking has an allotment or other type of contract associated to it; Group – when the booking is associated to a group; Transient – when the booking is not part of a group or contract, and is not associated to other transient booking; Transient-party – when the booking is transient, but is associated to at least one other transient booking
DaysInWaitingList | Number of days the booking was in the waiting list before it was confirmed to the customer
DepositType | Indication of whether the customer made a deposit to guarantee the booking. This variable can assume three categories: No Deposit – no deposit was made; Non Refund – a deposit was made in the value of the total stay cost; Refundable – a deposit was made with a value under the total cost of stay.
IsCanceled | Value indicating if the booking was canceled (1) or not (0)
IsRepeatedGuest | Value indicating if the booking name was from a repeated guest (1) or not (0)
LeadTime | Number of days that elapsed between the entering date of the booking into the PMS and the arrival date
Meal | Type of meal booked. Categories are presented in standard hospitality meal packages: Undefined/SC – no meal package; BB – Bed & Breakfast; HB – Half board (breakfast and one other meal – usually dinner); FB – Full board (breakfast, lunch and dinner)
PreviousBookingsNotCanceled | Number of previous bookings not cancelled by the customer prior to the current booking
PreviousCancellations | Number of previous bookings that were cancelled by the customer prior to the current booking
RequiredCardParkingSpaces | Number of car parking spaces required by the customer
ReservationStatus | Reservation last status, in one of three categories: Canceled – booking was canceled by the customer; Check-Out – customer has checked in but already departed; No-Show – customer did not check-in and did inform the hotel of the reason why
ReservedRoomType | Code of room type reserved. Code is presented instead of designation for anonymity reasons
StaysInWeekendNights | Number of weekend nights (Saturday or Sunday) the guest stayed or booked to stay at the hotel
StaysInWeekNights | Number of week nights (Monday to Friday) the guest stayed or booked to stay at the hotel
TotalOfSpecialRequests | Number of special requests made by the customer (e.g. twin bed or high floor)

1. First, merge the two datasets hotel_bookings_01.csv and hotel_bookings_02.csv using R. You need to provide screenshots of the key steps and report the dimensions (i.e., number of rows and number of columns) of the merged dataset. (4%)
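A minimal sketch of the merge step in base R is given here for orientation. It assumes that both files share exactly the same columns, so that "merging" means stacking rows; the brief does not confirm this. The two small files written to a temporary folder are stand-ins for the real data from Teams.

```r
# Two tiny stand-in files so the sketch runs on its own;
# the real files are the ones downloaded from Teams.
tmp <- tempdir()
f1 <- file.path(tmp, "hotel_bookings_01.csv")
f2 <- file.path(tmp, "hotel_bookings_02.csv")
write.csv(data.frame(ADR = c(80, 120), Adults = c(2, 1), IsCanceled = c(0, 1)),
          f1, row.names = FALSE)
write.csv(data.frame(ADR = c(95, 60, 150), Adults = c(2, 2, 3), IsCanceled = c(1, 0, 0)),
          f2, row.names = FALSE)

b1 <- read.csv(f1, stringsAsFactors = FALSE)
b2 <- read.csv(f2, stringsAsFactors = FALSE)

# Assumption: identical column names in both files, so rows can be stacked.
stopifnot(identical(names(b1), names(b2)))
bookings <- rbind(b1, b2)

# Dimension of the merged dataset: number of rows, then number of columns.
dim(bookings)
```

If the two files instead held different columns for the same bookings, a key-based join with merge() would be the more likely fit; which case applies can only be checked against the actual files.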

2. Do you notice any feature columns in the merged dataset that have missing values? If so, please report these features and deal with the missing values. Usually, there are three ways of dealing with missing values:
• removing the instances with missing values.
• filling in the missing values with other values.
• removing the columns with missing values.
Please provide screenshots of the key steps and justify the way you choose to deal with the missing values. (9%)
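The three options listed above can be sketched in base R as follows. The data is a stand-in, and the idea that a missing Agent appears as the text "NULL" in the CSV is an assumption based on the variable table, not something the brief states about the files.

```r
# Stand-in data; "NULL" and empty fields are treated as missing (an assumption).
csv_text <- "ADR,Children,Agent,Country
80,0,9,PRT
120,NA,NULL,GBR
95,1,NULL,
60,0,240,ESP"
bookings <- read.csv(text = csv_text, na.strings = c("NA", "NULL", ""),
                     stringsAsFactors = FALSE)

# Count missing values per column and show only the affected columns.
na_counts <- colSums(is.na(bookings))
na_counts[na_counts > 0]

# Option 1: remove the instances (rows) with missing values.
rows_removed <- na.omit(bookings)

# Option 2: fill in missing values, here with the median for a numeric
# column and an explicit label for a text column.
filled <- bookings
filled$Children[is.na(filled$Children)] <- median(filled$Children, na.rm = TRUE)
filled$Country[is.na(filled$Country)] <- "Unknown"

# Option 3: remove the columns with missing values.
cols_removed <- bookings[, na_counts == 0, drop = FALSE]
```

Which option is appropriate depends on how many values are missing and on what a missing value means for that feature, which is the justification the question asks for.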

3. Please convert and export the prepared dataset into Excel file format (e.g. xlsx). (2%)
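Base R has no xlsx writer, so the export needs an add-on package. The sketch below uses writexl, which is one possible choice rather than a package named in the brief; the output file name is hypothetical and the data frame is a stand-in.

```r
bookings <- data.frame(ADR = c(80, 120), IsCanceled = c(0, 1))  # stand-in data
out_file <- file.path(tempdir(), "hotel_bookings_prepared.xlsx")  # hypothetical name

# writexl is a third-party package; install.packages("writexl") if it is missing.
if (requireNamespace("writexl", quietly = TRUE)) {
  writexl::write_xlsx(bookings, path = out_file)
} else {
  message("writexl is not installed; nothing was written")
}
```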

Section II: Descriptive Analytics (25%)

In this section, you are going to perform some descriptive analytics on the prepared dataset. You can use either Excel
or R to complete the questions as below:

1. How many numeric features can you identify in the prepared dataset, and what are they? Please provide a summary table of descriptive statistics (as in the example shown below) for these numeric features, and calculate their correlation coefficients R. (6%)

Feature name | Mean | Median | Min | Max | Standard deviation | Number of unique values
… | … | … | … | … | … | …
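A summary table in the layout shown above could be built in base R roughly as follows. The data frame is a stand-in with two numeric columns and one text column; the real prepared dataset will have more.

```r
bookings <- data.frame(
  ADR = c(80, 120, 95, 60, 150),
  LeadTime = c(10, 200, 45, 3, 90),
  Meal = c("BB", "HB", "BB", "SC", "FB"),
  stringsAsFactors = FALSE
)

# Columns stored as numbers; text columns such as Meal are skipped.
numeric_cols <- names(bookings)[sapply(bookings, is.numeric)]

summary_table <- data.frame(
  Feature = numeric_cols,
  Mean = sapply(bookings[numeric_cols], mean),
  Median = sapply(bookings[numeric_cols], median),
  Min = sapply(bookings[numeric_cols], min),
  Max = sapply(bookings[numeric_cols], max),
  SD = sapply(bookings[numeric_cols], sd),
  Unique = sapply(bookings[numeric_cols], function(x) length(unique(x))),
  row.names = NULL
)
summary_table

# Pairwise Pearson correlation coefficients between the numeric features.
cor_matrix <- cor(bookings[numeric_cols])
cor_matrix
```

Note that is.numeric() also picks up columns that are stored as numbers but are categorical in meaning (for example a 0/1 flag or an ID), so the list of numeric features may need a manual check.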

2. How is the customer type distributed between city and resort hotels? Please answer this question by visualizing the data, and report the key steps of your operations in Excel or your R code. (3%)
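One way to visualize such a distribution in base R is a grouped bar chart of shares per hotel. The sketch assumes the dataset has a column identifying the hotel, here called Hotel; that column name is hypothetical, since it does not appear in the variable table above. The data is a stand-in.

```r
bookings <- data.frame(
  Hotel = c("City", "City", "Resort", "Resort", "City", "Resort"),  # hypothetical column
  CustomerType = c("Transient", "Group", "Transient",
                   "Contract", "Transient", "Transient"),
  stringsAsFactors = FALSE
)

# Cross-tabulate customer type by hotel, then convert to shares within each hotel.
counts <- table(bookings$CustomerType, bookings$Hotel)
shares <- prop.table(counts, margin = 2)

pdf(NULL)  # null device so the sketch also runs without a display
barplot(shares, beside = TRUE, legend.text = rownames(shares),
        xlab = "Hotel", ylab = "Share of bookings")
invisible(dev.off())
```

Plotting shares rather than raw counts keeps the two hotels comparable when they have different numbers of bookings.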

3. Is the average daily rate of “No-Show” bookings smaller than that of “Canceled” and “Check-Out” bookings for both city and resort hotels? Please answer this question by visualizing the data, and report the key steps of your operations in Excel or your R code. (3%)

4. If a customer is a repeated guest, is he/she more likely to check out or not? Please answer this question by visualizing the data, and report the key steps of your operations in Excel or your R code. (3%)

5. Customers can book hotels directly or through agents. Which way is quicker and which way is cheaper? Please answer these questions by visualizing the data, and report the key steps of your operations in Excel or your R code. (4%)

6. Draw line plots of the average daily rate and the average number of booking changes against the days in the waiting list, respectively. The plots should be in one figure, with one x-axis and two different y-axes. Please answer this question by visualizing the data, and report the key steps of your operations in Excel or your R code. Do they have any statistically significant linear correlation? (6%)
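A figure with one x-axis and two y-axes can be drawn in base R by overlaying a second plot and adding a right-hand axis. The sketch below uses randomly generated stand-in data, and it reads the correlation question as "is each average linearly related to the days in the waiting list"; that reading is an assumption.

```r
set.seed(1)  # only to make the stand-in data reproducible
bookings <- data.frame(
  DaysInWaitingList = rep(0:9, each = 5),
  ADR = runif(50, 50, 150),
  BookingChanges = rpois(50, 1)
)

# Average ADR and booking changes for each waiting-list length.
avg <- aggregate(cbind(ADR, BookingChanges) ~ DaysInWaitingList,
                 data = bookings, FUN = mean)

pdf(NULL)  # null device so the sketch also runs without a display
par(mar = c(5, 4, 2, 5))  # extra right margin for the second y-axis
plot(avg$DaysInWaitingList, avg$ADR, type = "l", col = "blue",
     xlab = "Days in waiting list", ylab = "Average daily rate")
par(new = TRUE)  # overlay the second series on the same x-axis
plot(avg$DaysInWaitingList, avg$BookingChanges, type = "l", col = "red",
     axes = FALSE, xlab = "", ylab = "")
axis(side = 4)
mtext("Average booking changes", side = 4, line = 3)
invisible(dev.off())

# Pearson correlation tests against the days in the waiting list.
ct_adr <- cor.test(avg$DaysInWaitingList, avg$ADR)
ct_chg <- cor.test(avg$DaysInWaitingList, avg$BookingChanges)
ct_adr$p.value
ct_chg$p.value
```

A small p-value would point to a statistically significant linear correlation; with the random stand-in data used here, no particular result should be expected.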

Section III: Hotel Booking Cancellation Prediction (60%)

You need to use R to develop data-driven models to predict if a customer will cancel his/her booking or not.

1. Either IsCanceled or ReservationStatus can be used as the target/response variable. Here we will use IsCanceled as the target variable. Can you explain why? Can you also discuss any steps we could take if we wanted to use ReservationStatus as the target variable? (5%)

2. You need to select two different classification models that you have studied in the course to predict cancellations in the target variable IsCanceled. Please introduce your two selected models. Each model should have a short paragraph, and your description of each should be no more than 300 words. (10%)

3. Show the key steps and your R code for model training and testing. Here we use accuracy as the model evaluation metric. (20%)

You need to:
• Describe which variables are used as model input.
• Discuss and provide screenshots of the key steps and settings of model training and testing.
• Find your random seed number in BI_Random_Seed_2021.pdf and use your allocated random seed number for all modules in your experiment if applicable.
• Show and discuss your data split ratio.
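The requirements above (a fixed seed, a data split, training, testing, and accuracy) can be sketched in base R as follows. Everything specific here is illustrative: the seed 123 is a placeholder for the allocated one, the 70/30 ratio is just one common choice, logistic regression is only one candidate classifier and may not be among the models taught in the course, and the data is simulated.

```r
seed <- 123  # placeholder; use the seed allocated in BI_Random_Seed_2021.pdf
set.seed(seed)

# Simulated stand-in data with a weak signal so the sketch runs on its own.
n <- 500
bookings <- data.frame(LeadTime = rexp(n, 1 / 100),
                       TotalOfSpecialRequests = rpois(n, 0.5))
p <- plogis(-1 + 0.01 * bookings$LeadTime - 0.5 * bookings$TotalOfSpecialRequests)
bookings$IsCanceled <- rbinom(n, 1, p)

# 70/30 train/test split (the ratio is an illustrative choice).
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- bookings[train_idx, ]
test <- bookings[-train_idx, ]

# Logistic regression as one example classifier.
fit <- glm(IsCanceled ~ LeadTime + TotalOfSpecialRequests,
           data = train, family = binomial)

# Predict on the held-out set and compute accuracy at a 0.5 threshold.
prob <- predict(fit, newdata = test, type = "response")
pred <- as.integer(prob > 0.5)
accuracy <- mean(pred == test$IsCanceled)
accuracy
```

Setting the seed before both the split and any randomized model step is what makes the reported accuracy reproducible.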

4. Select the best classification model from your developed two models, and explain why this is the best model.
Your best model selection and the model settings should be clearly presented. (10%)

5. Discuss the business insights provided by your final selected model. For example, which variables/features are
important in prediction? What types/segments of customers are more likely to cancel their bookings? What do
you think might be the reasons behind the findings? Your analysis needs to be reasonable, and you can include
any theories or evidence from related subjects (e.g., consumer behaviour, marketing, psychology) or other
empirical studies to justify your statements along with your findings. (15%)

The report must

• Contain your student number and course name.
• Be in PDF and no more than 15 pages (excluding cover page and references if they are included).
• Be formatted single-spaced with 11 pt font size.
• Not include this briefing document.

This assessment is an individually assessed component. If you have included any citations, your citations and referencing should follow university guidelines. If you are unsure about any aspect of this assignment, please seek the advice of the course coordinator.