Chapter 5: Descriptive Data Mining: Case Problem 2 Know Thy Customer
Book Title: Business Analytics
Printed By: Jigar Jitendrak Patel
© 2021 Cengage Learning, Cengage Learning

Chapter Review

Case Problem 2 Know Thy Customer

Know Thy Customer (KTC) is a financial consulting company that provides personalized financial advice to its clients. As a basis for developing this tailored advising, KTC would like to segment its customers into several representative groups based on key characteristics. Peyton Blake, the director of KTC’s fledgling analytics division, plans to establish the set of representative customer profiles based on 600 customer records in the file KnowThyCustomer. Each customer record contains data on age, gender, annual income, marital status, number of children, whether the customer has a car loan, and whether the customer has a home mortgage. KTC’s market research staff has determined that these seven characteristics should form the basis of the customer clustering.

Peyton has invited a summer intern, Danny Riles, into her office so they can discuss how to proceed. As they review the data on the computer screen, Peyton’s brow furrows as she realizes that this task may not be trivial. The data contains both categorical variables (Female, Married, Car, and Mortgage) and numerical variables (Age, Income, and Children).

1. Using Manhattan distance to compute dissimilarity between observations, apply hierarchical clustering on all seven variables, experimenting with complete linkage and group average linkage. Normalize the values of the input variables. Recommend a set of customer profiles (clusters). Describe these clusters according to their “average” characteristics. Why might hierarchical clustering not be a good method to use for these seven variables?

2. Apply a two-step approach:

a. Using matching distance to compute dissimilarity between observations, employ hierarchical clustering with group average linkage to produce four clusters using the variables Female, Married, Loan, and Mortgage.

b. Based on the clusters from part (a), split the original 600 observations into four separate data sets as suggested by the four clusters from part (a). For each of these four data sets, apply k-means clustering with k = 2 using Age, Income, and Children as variables. Normalize the values of the input variables. This will generate a total of eight clusters. Describe these eight clusters according to their “average” characteristics. What benefit does this two-step clustering approach have over just using hierarchical clustering on all seven variables as in part (1) or just using k-means clustering on all seven variables? What weakness does it have?
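
The mechanics of parts (1) and (2) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the textbook's solution or software: the 60 records are randomly generated stand-ins for the KnowThyCustomer file, every function name is hypothetical, the choice of four clusters in part (1) is arbitrary, normalization is assumed to mean z-scores, and only group average linkage is implemented (complete linkage would take the maximum pairwise distance instead of the average).

```python
import random
from statistics import pstdev

def matching_distance(a, b):
    """Share of binary variables on which two records disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def manhattan_distance(a, b):
    """Sum of absolute differences across variables."""
    return sum(abs(x - y) for x, y in zip(a, b))

def normalize(rows):
    """z-score every column; a constant column is mapped to zeros."""
    cols = list(zip(*rows))
    mus = [sum(c) / len(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]

def group_average_clusters(rows, k, dist):
    """Naive agglomerative clustering with group average linkage.
    Returns k lists of row indices. Roughly O(n^3), so toy data only."""
    n = len(rows)
    d = {(i, j): dist(rows[i], rows[j]) for i in range(n) for j in range(i + 1, n)}

    def avg(c1, c2):
        total = sum(d[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        pairs = ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters)))
        a, b = min(pairs, key=lambda pq: avg(clusters[pq[0]], clusters[pq[1]]))
        clusters[a] += clusters.pop(b)  # merge the closest pair (b > a)
    return clusters

def kmeans(rows, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    centers = random.Random(seed).sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(r, centers[c])))
                  for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

# Hypothetical toy records, NOT the KnowThyCustomer data.
rng = random.Random(42)
binary = [tuple(rng.randint(0, 1) for _ in range(4)) for _ in range(60)]  # Female, Married, Loan, Mortgage
numeric = [(rng.randint(20, 70), rng.uniform(20000, 90000), rng.randint(0, 3))
           for _ in range(60)]                                            # Age, Income, Children

# Part (1): Manhattan distance on all seven normalized variables (4 clusters is an arbitrary choice).
all_seven = normalize([b + x for b, x in zip(binary, numeric)])
part1 = group_average_clusters(all_seven, 4, manhattan_distance)

# Part (2a): matching distance on the four binary variables, four clusters.
groups = group_average_clusters(binary, 4, matching_distance)

# Part (2b): k-means with k = 2 on the normalized numeric variables inside each group.
profiles = {}
for g, members in enumerate(groups):
    if len(members) < 2:
        continue  # too small to split into two clusters
    labels = kmeans(normalize([numeric[i] for i in members]), 2)
    for i, lab in zip(members, labels):
        profiles.setdefault((g, lab), []).append(i)

# "Average" characteristics (Age, Income, Children) of each final cluster.
for key, idx in sorted(profiles.items()):
    means = [sum(numeric[i][j] for i in idx) / len(idx) for j in range(3)]
    print(key, len(idx), [round(v, 1) for v in means])
```

In this sketch the numeric variables are normalized separately within each of the four groups, which is one reading of part (b); normalizing once over all 600 records before splitting is an equally plausible choice and could change the resulting clusters.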

© 2022 Cengage Learning Inc. All rights reserved. No part of this work may be reproduced or used in any form or by any means (graphic, electronic, or mechanical), or in any other manner, without the written permission of the copyright holder.
