Predictive Analytics
Assignment: Customer Segmentation, Association Rule Mining, and Market Basket Analysis (MBA) Case Studies
Part A - Cluster Analysis
A car insurance company wants to understand the different purchasing behaviours of its customers. As a first step, it plans to identify different customer segments in order to improve its current target marketing campaign.
The CUSTOMER_DATA dataset contains the basic details about customers obtained via customer IDs. In this dataset each row represents an individual customer. There are seven variables in the dataset.
The variables in the data set are shown below with the appropriate roles and levels.
Name | Model Role | Measurement Level | Description |
CustomerID | ID | Interval | Identification number of the customer |
Customer_Value_Score | Input | Interval | Customer value score represents a customer's value to a company based on past performance. |
Education | Input | Nominal | Highest Education |
Gender | Input | Nominal | Gender of the customer |
Income | Input | Interval | Annual Income of the customer ($) |
Location_Code | Input | Nominal | Residential Locality |
Marital_Status | Input | Nominal | Marital status of customer |
You, as the data analyst, are required to conduct a cluster analysis of the data set and provide an insightful report on the different customer segments to the manager of the insurance company.
a. Create a new diagram in your project. Name the diagram as Profiling.
b. Define the data set CUSTOMER_DATA as a data source and set appropriate roles and levels.
c. Add an Input Data Source node to the diagram workspace and select the CUSTOMER_DATA data table as the data source.
d. Determine whether the model roles and measurement levels assigned to the variables are appropriate.
Examine the distribution of the variables.
- Are there any skewed variables? Are there missing values that should be replaced?
- If yes, use the Transform Variables node to transform the skewed variables. (Hint: Use the log transformation, LOG(variable_name).)
e. Add a Cluster node to the diagram workspace and set the number of clusters as four.
f. Set the appropriate properties for the Cluster node.
Leave the default setting as Internal Standardization => Standardization
What would happen if inputs were not standardized? Explain using knowledge from discussions in the class.
g. Run the diagram from the Cluster node and examine the results.
Does the number of clusters created seem reasonable? Discuss using knowledge from class discussions, for example what a cluster is and what the ideal number of clusters is.
h. Increase the number of clusters to a maximum of six and re-run the Cluster node. How do the number and quality of clusters compare to those obtained in question e? Do you think it is better to further increase the number of clusters? (You can answer this question by trying out a higher number of clusters, or discuss based on the previous clustering outcomes.)
i. Use the Segment Profile node to summarize the nature of the clusters based on the better number of clusters from question h. Describe the profiles based on different customer behaviors.
(Hints: The distribution of each interval variable in the segments can be interpreted in the same way as in the tutorial. The distribution of each nominal variable in the segments is visualized by a pie chart: the inner ring represents the distribution of the total population, while the outer ring represents the distribution for the given segment.)
j. The insurance company manager would like to develop a target marketing strategy based on this cluster analysis. Discuss how the clustering you have carried out could be used in such a strategy.
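As an aid for steps d and f, the following is a minimal Python sketch (not part of the SAS Enterprise Miner workflow) of what a log transformation and standardization do to distance-based clustering. The four customers and their values are made up for illustration and are not taken from CUSTOMER_DATA; the helper names `log_transform`, `standardize`, and `euclidean` are hypothetical.

```python
import math
from statistics import mean, pstdev

# Hypothetical customers: (Income in $, Customer_Value_Score).
# These values are invented for illustration only.
customers = [
    (20_000, 3.0),
    (25_000, 9.0),
    (30_000, 3.5),
    (400_000, 9.5),  # one very high income makes Income right-skewed
]


def log_transform(values):
    """Apply the natural log to each value; assumes all values are > 0."""
    return [math.log(v) for v in values]


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


incomes = [c[0] for c in customers]
scores = [c[1] for c in customers]

# Raw inputs: Income is measured in thousands of dollars, so it dominates
# the distance and the score difference is almost ignored.
raw_d01 = euclidean(customers[0], customers[1])
raw_d02 = euclidean(customers[0], customers[2])

# Log-transform the skewed Income, then standardize both inputs so that
# each contributes on a comparable scale.
z_points = list(zip(standardize(log_transform(incomes)), standardize(scores)))
std_d01 = euclidean(z_points[0], z_points[1])
std_d02 = euclidean(z_points[0], z_points[2])

print(f"raw:          d(0,1)={raw_d01:.1f}  d(0,2)={raw_d02:.1f}")
print(f"standardized: d(0,1)={std_d01:.2f}  d(0,2)={std_d02:.2f}")
```

On the raw values, customer 0 appears closer to customer 1 because only the income gap matters. After the transformation and standardization, customer 0 is closer to customer 2, whose value score is similar. This is one way to reason about the question in step f.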
Part B - Market Basket Analysis and Association Rules
In order to plan innovative promotions to move items that are often purchased together, a supermarket chain is interested in market basket analysis of groceries purchased. You are a member of the analytics team assigned to the task.
The supermarket chose to conduct a market basket analysis of specific items purchased online. The TRANSACTIONS data set contains information about more than 38,000 transactions made over the past three months, covering 167 different items including:
Whole milk |
soda |
Tropical fruit |
Citrus fruit |
Shopping bags |
Bottled beer |
Other vegetables |
yogurt |
Bottled water |
Pip fruit |
Canned beer |
Newspapers |
Rolls/buns |
Root vegetables |
sausage |
pastry |
Whipped/sour cream |
frankfurter |
You have access to SAS Enterprise Miner data analytics tools and decided to carry out a market basket and association rule-based analysis of the data. The following instructions will help you to set up the SAS diagram for the analysis.
There are three variables in the data set: Date, MemberId, and Item.
a. Create a new diagram. Name the diagram Retail.
b. Create a new data source using the data set TRANSACTIONS.
c. Assign the variable Date the model role Rejected. This variable is not used in this analysis. Assign the ID model role to the variable MemberId and the Target model role to the variable Item. Change the data source role to Transaction.
d. Add the TRANSACTIONS data set and an Association node to the diagram.
e. Change the setting for the Export Rule by ID property to Yes.
f. Leave the remaining default settings for the Association node and run the analysis.
g. Examine the results of the association analysis. Your team leader has indicated that the answers to the following questions will be useful to the management. You have to answer the questions and prepare a report giving evidence to support your answers (e.g. screenshots, numeric values, etc.).
1. What is lift, and why is it important to calculate it? What is the significance of the lift value of a rule?
2. What is the highest lift value for the resulting rules, and which rules have this value? What does the highest lift value signify?
3. Based on the association rules, briefly describe 3 example product bundles and promotions that you might suggest.
4. You are required to provide a detailed report of the outcomes of the analysis to your manager.
Prepare a brief report (max. 1000 words) presenting:
(a) The problem
(b) Your solution/approach
(c) Outcomes
(d) Analysis results and interpretation
5. You should explain the approach and outcomes, such as support, confidence, and lift, and how the product bundles you suggested could be used (practical value) by the departments.
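To make the measures named in the questions above concrete, here is a minimal Python sketch of how support, confidence, and lift are commonly defined for a rule A => B. The baskets are made up for illustration and are not drawn from the TRANSACTIONS data set. The function names are hypothetical, and the Association node may report these measures in a different form (for example, as percentages).

```python
# Hypothetical baskets, one set of items per customer (invented data).
baskets = [
    {"whole milk", "yogurt"},
    {"whole milk", "yogurt", "soda"},
    {"whole milk", "rolls/buns"},
    {"soda", "rolls/buns"},
    {"yogurt", "whole milk", "rolls/buns"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(lhs, rhs, baskets):
    """Among baskets containing `lhs`, the fraction that also contain `rhs`."""
    return support(set(lhs) | set(rhs), baskets) / support(lhs, baskets)


def lift(lhs, rhs, baskets):
    """Confidence of lhs => rhs divided by the baseline support of `rhs`.

    A value above 1 means `rhs` appears with `lhs` more often than it
    would if the two were independent; a value below 1 means less often.
    """
    return confidence(lhs, rhs, baskets) / support(rhs, baskets)


rule_lhs, rule_rhs = {"yogurt"}, {"whole milk"}
print("support   :", support(rule_lhs | rule_rhs, baskets))
print("confidence:", confidence(rule_lhs, rule_rhs, baskets))
print("lift      :", lift(rule_lhs, rule_rhs, baskets))
```

In this toy data, yogurt => whole milk has a lift above 1, while soda => rolls/buns has a lift below 1, so only the first pair would be a candidate for a bundle.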
Part C - Open Discussion - Analytics Case Study
This question is based on the week 11 workshop materials. It is very important that you attend the guest lecture to be able to answer this question.
Read the provided article: 'Prediction: The future of CX'. You may also reference additional material you can find to gain background knowledge of the area.
You are expected to summarize the content of the guest lecture and discuss how you relate the guest lecture to the article 'Prediction: The future of CX'.
- How would you relate the knowledge and understanding you gained from the guest lecture to topics discussed in the article?
- Do you agree that using customer questionnaires to collect data to understand the customers will limit the possible insights that can be generated? Using information from the guest lecture discuss your thoughts on this topic.
You are expected to write a report (between 700 and 1000 words) discussing the above points (you may check the answer guidelines for more detailed information on the requirements for answer preparation).