Predictive Modeling vs. Clustering

Data mining activities are concerned with improving customer understanding within an enterprise to impact the bottom line across customer service, sales, marketing or operations by generating revenue or controlling costs or risks.  Though predictive modeling and clustering are both considered techniques within data mining, their methods and application are quite different even when the goals are the same.  The algorithms used vary as well depending on whether the analysis is supervised (predictive modeling) or unsupervised (clustering).

Predictive modeling is concerned with the classification or estimation of an attribute and is a supervised approach.  This means that historic data is used to train and test a model, and the model is then used to score new records for the purpose of prediction.  An example would be to use historic demographic, transactional, attitudinal and behavioral data within a financial institution to address customer retention.  The field of interest is a flag (1 / 0 or true / false) indicating whether or not a customer is still a current customer.  Predictive models are created based on the combined data sources, and customers are then scored for their likelihood to leave, or churn.  Typical algorithms used for predictive modeling can be from the field of statistics (linear regression, logistic regression, Cox regression, discriminant analysis) or data mining (decision trees, neural networks).
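
As a rough illustration of the train-then-score idea, here is a minimal sketch of a logistic regression churn model written in plain Python. The two input fields (months since last transaction, number of products held), the tiny data set, and the learning-rate settings are all made up for the example; a real project would use a statistics or data mining package and hold out test data.

```python
import math
from statistics import mean, pstdev

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_scaler(rows):
    """Column means and standard deviations, used to standardize inputs."""
    cols = list(zip(*rows))
    return [mean(c) for c in cols], [pstdev(c) for c in cols]

def scale(row, means, stdevs):
    return [(v - m) / s for v, m, s in zip(row, means, stdevs)]

def train_logistic(rows, labels, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        errors = [sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
                  for x, y in zip(rows, labels)]
        weights = [w - lr * sum(e * x[j] for e, x in zip(errors, rows)) / n
                   for j, w in enumerate(weights)]
        bias -= lr * sum(errors) / n
    return weights, bias

# Hypothetical history: [months_since_last_transaction, products_held]
history = [[1, 4], [2, 3], [1, 5], [3, 3], [8, 1], [9, 1], [7, 2], [10, 1]]
churned = [0, 0, 0, 0, 1, 1, 1, 1]  # the 1 / 0 flag of interest

means, stdevs = fit_scaler(history)
weights, bias = train_logistic([scale(r, means, stdevs) for r in history], churned)

# Score new records for their likelihood to churn (0.0 to 1.0).
new_customers = {"A": [2, 4], "B": [9, 1]}
scores = {
    name: sigmoid(bias + sum(w * v for w, v in zip(weights, scale(x, means, stdevs))))
    for name, x in new_customers.items()
}
```

Customers with scores near 1.0 are the ones a retention program would contact first.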

Clustering or segmentation is an unsupervised approach that divides records in a data source into sub-groups, where records within a sub-group are similar to each other and likely will behave in a similar fashion. An application of this would be to cluster the customers within a financial institution using demographic, transactional, attitudinal and behavioral data, then design programs to address retention, cross-selling and growth initiatives for each sub-group. Clustering differs from predictive modeling in that it is usually an initial step and not the end goal, as the clusters discovered are often used as inputs to predictive models, in the form of an indicator of cluster membership. First, the clusters uncovered must be explored for meaning and relevance. If the initial analysis results in 5 clusters, and 2 have little usefulness, the analysis can be re-run to force 3 or 7 clusters in the result to better understand the segments. When a sub-group can be acted upon from a business perspective, we have a good sense of its usefulness. Kohonen, K-Means and Two-Step are examples of clustering algorithms.
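
To make the idea concrete, here is a minimal K-Means sketch in plain Python. The customer records are hypothetical and already scaled to the 0-1 range, and the starting centroids are simply the first k records, which is a simplification (real tools use better initialization).

```python
def kmeans(points, k, iterations=20):
    """Plain K-Means: returns (cluster index per record, centroids)."""
    centroids = [list(p) for p in points[:k]]  # naive start: first k records
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each record joins its nearest centroid.
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assignments[i] = dists.index(min(dists))
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

# Hypothetical scaled fields: [account_balance, monthly_transactions]
customers = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9], [0.15, 0.15], [0.85, 0.95]]

cluster_ids, centroids = kmeans(customers, k=2)
```

Changing `k` is how the analysis is re-run to force a different number of clusters, and `cluster_ids` is the cluster membership indicator that can be passed to a predictive model as an input field.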

In summary, fully understanding the business issue at hand is the key to determining whether predictive modeling or clustering or both is the proper approach. Both techniques have their place in impacting the bottom line for customers, both in generating revenue and controlling costs and risks.

The Evolution of Customer Analytics & Marketing in the Gaming Industry – Continued

Stage 3: Predict Customer Value and Behavior to Align Offers – Casinos in this stage have already successfully created customer segments based on actual customer profitability. Casinos in stage three are adding the power of prediction to their marketing strategies, i.e., forecasting future customer profitability based on data from their CMS systems, player card, and other data like gender, age, and distance from the casino. A forecasting model predicts future/potential profitability for each and every casino customer as well as the likelihood to attrite. These predictions are then used to segment customers into as many segments as is deemed practical for marketing purposes. For example:

Potential Diamond
Potential Platinum
Potential Gold
Potential Silver
Potential Bronze

Specific promotions and comps are aligned to keep high value customers loyal, and to upwardly migrate low value customers that possess high potential.
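
A minimal sketch of how predicted profitability could be turned into potential segments follows. The dollar cutoffs and the customer records are hypothetical; each casino would set its own thresholds.

```python
from bisect import bisect_right

TIERS = ["Bronze", "Silver", "Gold", "Platinum", "Diamond"]
CUTOFFS = [500, 2000, 5000, 15000]  # hypothetical annual-profit thresholds

def tier_for(annual_profit):
    """Map a profit figure to a tier using the cutoffs above."""
    return TIERS[bisect_right(CUTOFFS, annual_profit)]

# Hypothetical customers with actual and model-predicted annual profit.
customers = [
    {"id": "C1", "actual": 400, "predicted": 6000},
    {"id": "C2", "actual": 16000, "predicted": 17000},
    {"id": "C3", "actual": 2500, "predicted": 2200},
]

for c in customers:
    current = tier_for(c["actual"])
    potential = tier_for(c["predicted"])
    c["current_tier"] = current
    c["potential_tier"] = "Potential " + potential
    # Low current value but high potential: candidate for upward migration.
    c["migrate_up"] = TIERS.index(potential) > TIERS.index(current)

upgrade_targets = [c["id"] for c in customers if c["migrate_up"]]
```

Here `upgrade_targets` holds the low value, high potential customers that would receive the richer promotions and comps.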

Stage 4: Improve Customer Forecasts and Segmentations Based on a 360° View of the Customer

Cutting-edge casinos like Harrah’s are better able to assess current and predicted customer value by adding new and important data. A 360° view of the customer can be created by integrating data from Lodging, Food and Beverage, Meetings, the Call Center, the web and customer surveys. Once the 360° view is developed, new analytical techniques can be used to increase customer insight and deploy even more sophisticated segmentation strategies. These analytical techniques include traditional data mining using additional data, text mining (of unstructured data from call centers and the web), and web mining of customer website behavior.

The improvement in forecast accuracy from additional data can be very significant. Web data can deliver 20% better forecasts; attitudinal data can increase forecast accuracy by 30%; and, unstructured text data can increase forecast accuracy by as much as 40%.

Stage 5: Optimize and Operationalize Campaigns and Offers Across Channels

Many casinos today are achieving increased marketing productivity by utilizing campaign management applications to automate labor intensive/manual customer segmentation and marketing activities. The most advanced casinos create, optimize and operationalize their marketing campaigns over time and channels based on:

• Current and predicted customer value and behavior
• Contact optimization based on business rules, and in some cases mathematical optimization algorithms
• Real-time offers based on awareness of customer location and activity in the casino property

The goal here is to use all data that is available to develop models and an optimal contact strategy that meets the needs of the customers and the casino – a win-win situation.

Marketing strategies associated with current and predicted value were already covered in earlier stages, above. We will now turn our attention to optimization and real-time offer management.

Optimization

Optimization is a technology for creating an optimal contact strategy across campaigns over a given timeframe. It is based on a number of factors:

• Customer communication channel preferences
• Stated customer interests based on survey or web-form data
• Value of an offer/comp to the casino
• Channel capacity
• The probability of customer response to specific offers/comps
• Customer contact frequency constraints (to minimize contact fatigue)

Contact optimization approaches can be based on business rules or on mathematical optimization algorithms. While both approaches are valid, it is difficult to understand and explain how optimization algorithms actually arrive at their contact strategies. Business rules are easier to understand and explain, and are often a good first step in getting acquainted with contact optimization.

Here is an example of business rules used for developing an optimal contact strategy: “Over the next quarter’s planned campaigns, let’s only communicate with customers that have a 70% or greater likelihood to respond to various offers, while taking into account:

• The maximum number of communications per customer is four.
• Only extend one offer to each customer over this time period – the one with the greatest return for the casino.
• Direct mail channel capacity is 1MM pieces.”

Without optimization, the casino may send out 2.5MM offers (increasing costs and extending the campaign schedule due to capacity constraints), while communicating with some customers ten times with conflicting, lesser value offers. This is not good for customer loyalty or casino/property profitability.
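
The rule-based approach can be sketched in a few lines of Python. This is an illustration only: the candidate offers are invented, and "greatest return" is interpreted here as response probability times offer value, which is an assumption rather than something the rules above spell out.

```python
def plan_contacts(candidates, min_prob=0.70, capacity=1_000_000):
    """Apply the business rules to (customer, offer, probability, value) rows.

    Rule 1: drop offers whose response probability is below min_prob.
    Rule 2: keep one offer per customer, the one with the highest expected return.
    Rule 3: keep only as many contacts as the channel capacity allows.
    """
    best = {}
    for cust, offer, prob, value in candidates:
        if prob < min_prob:
            continue
        expected = prob * value  # assumed definition of "return"
        if cust not in best or expected > best[cust][2]:
            best[cust] = (cust, offer, expected)
    ranked = sorted(best.values(), key=lambda row: row[2], reverse=True)
    return ranked[:capacity]

# Hypothetical scored offers: (customer_id, offer, response_probability, value)
candidates = [
    ("C1", "spa", 0.80, 50),
    ("C1", "buffet", 0.90, 30),
    ("C2", "room", 0.75, 120),
    ("C2", "show", 0.60, 300),
    ("C3", "buffet", 0.72, 30),
    ("C4", "room", 0.40, 120),
]

# A capacity of 2 stands in for the 1MM-piece direct mail limit.
plan = plan_contacts(candidates, min_prob=0.70, capacity=2)
```

With these inputs each selected customer receives a single offer, and lower-ranked customers are dropped once the channel is full.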

Real-time Offers Based on Awareness of Customer Activity in the Casino

Some leading casinos realize that appropriate offers in real-time can keep customers in their hotel property. Imagine making a real-time 50% off spa-treatment offer to a woman who just lost $300 at Baccarat. It makes her feel better and increases her property spend and loyalty.

Harrah’s uses its Total Rewards Program to track customer behavior in real-time. A guest celebrating her birthday might insert her loyalty card in a slot machine and be surprised by a promotions manager bearing a birthday card and a cookie.

The key is to know where customers are in the property (based on use of the customer’s loyalty card), and to tie that to real-time offers that are developed dynamically or are pre-determined via campaign management applications. Offer integration may be data-based for pre-determined offers, or real-time via web services for dynamic offer creation. Getting the right offer to the right customer at the right time will increase profitability, customer loyalty and share-of-wallet.

In summary, optimization is a process where customer preferences, customer value, and casino/corporate goals and constraints are managed to create an optimal contact strategy that increases the return on marketing investment. More specifically, optimization requires the identification of a goal (e.g., maximizing profits), as well as the identification of constraints (e.g., marketing budget, channel capacity, maximum number of contacts per individual per month, etc.). Optimal contact strategies are then typically based on the development and prioritization of business rules, or on linear or non-linear optimization algorithms.
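
For readers who prefer to see the goal and constraints written out, one possible linear formulation is sketched below. The notation is ours, not taken from any particular product: x_ij is 1 if customer i receives offer j and 0 otherwise, p_ij is the predicted response probability, v_j the value of offer j to the casino, c_j its cost, B the marketing budget, m the maximum number of contacts per customer, and K_h the capacity of channel h, with J_h the set of offers delivered through that channel.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{i}\sum_{j} p_{ij}\, v_{j}\, x_{ij}
  && \text{(expected return)} \\
\text{subject to}\quad
  & \sum_{i}\sum_{j} c_{j}\, x_{ij} \le B
  && \text{(marketing budget)} \\
  & \sum_{j} x_{ij} \le m \quad \text{for every customer } i
  && \text{(contact limit)} \\
  & \sum_{i}\sum_{j \in J_h} x_{ij} \le K_h \quad \text{for every channel } h
  && \text{(channel capacity)} \\
  & x_{ij} \in \{0, 1\}
\end{aligned}
```

Business rules approximate a solution to this kind of problem by applying the constraints one at a time in priority order, while an optimization algorithm considers them all at once.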

Summary

The use of additional customer data, data mining, sophisticated segmentation schemes and contact optimization can lead to increased customer satisfaction, loyalty and spend in the casino. These benefits can even extend beyond the casino. Marketing strategies based on a 360° view of the customer can benefit the property (including Lodging, Spa, Meetings, etc.) as well as the enterprise in the case of a multi-property entity.

Casinos not utilizing these strategies will fall behind and realize less and less share of wallet.

The Evolution of Customer Analytics & Marketing in the Gaming Industry

Leading casinos create competitive advantage by integrating predictive analytics into their marketing processes and business systems. Predictive analytics allows casinos to accurately predict the future behavior and value of their customers to drive intelligent marketing and sales activities. Today, leading casinos are going even further by utilizing new data and optimization techniques to increase marketing ROI and customer loyalty.

There has been significant growth in the use of analytics to increase customer and casino profitability over the last two decades. The benefits of creating customer segments based on profitability and aligning appropriate marketing offers are well known, and the approach is widely practiced today. As competition has increased, casinos are working even harder to increase their share of each customer’s wallet. In an effort to create and sustain a competitive advantage, cutting edge casinos have moved into the realm of “predicting” customer value and behavior to proactively manage customer relationships in an optimal manner. For instance, leading casinos predict which low to medium value customers are likely to become high rollers. These “high potential” customers are then targeted with more costly and attractive promotions and comps in efforts to migrate these customers to more profitable segments.

While predictive analytics is seen as a competitive differentiator, some casinos realize there are still relatively untapped assets and technologies available to keep them ahead of the curve. The use of customer data from disparate systems and optimization techniques are the tools of leading casinos today.

Casinos are realizing more and more profits from non-gaming activities like events, lodging and restaurants. The ability to integrate and utilize these data represents significant untapped potential to increase customer loyalty and profitability across the entire enterprise. More recently, a subset of the most advanced casinos are now starting to utilize technology to optimize their customer relationships and profitability.

The evolutionary path of analytics in the gaming industry includes the following stages:

1) Business Intelligence – Basic customer profiling and reporting. Manually driven.
2) Segmentation of customers based on profitability. Drives differing messaging and offers.
3) Prediction of future customer value which drives campaigns to reap full customer value.
4) Improve predictions based on 360 degree view of customer from all casino systems, e.g., gaming, lodging, spa, etc.
5) Campaign/offer optimization to maximize revenues or profitability.

Real competitive advantage begins at stage three. A few select players in the gaming industry are pushing the envelope with stage four and five initiatives.

Our next article will examine stages three, four and five in detail.

Jim Stafford has worked for leading companies in the Marketing Automation space (BI, data mining, campaign management and eMarketing) for over 10 years. He has held roles of Director – Database Marketing Solutions, Pre-Sales Manager, Product Manager, and Solution Architect at companies like Aprimo, Group1 Software, SAS, Siebel, SPSS and Unica. Mr. Stafford has consistently helped sales teams meet or beat established sales targets. He was the principal pre-sales contributor to Siebel’s second largest MA sale with General Motors. Jim has had considerable exposure to many verticals including: Financial Services, Hospitality & Entertainment, Automotive, Communications, and Utilities. He is a seasoned expert at discovery and knows key industry trends. Jim has an M.A. Degree in Economics from the University of Maryland and has been a frequent speaker at annual National Center for Database Marketing and Direct Marketing Associations events. Visit http://www.staffordsbsg.com/ to learn more about Jim and his company’s services.

What Data Do I Need for Data Mining?

Part II of Successful Data Mining: 80% Data Prep, 20% Modeling & Assessment

At a high level, there are two types of data — primary and secondary.

Primary data are data that the customer has directly provided (e.g., from product registrations, web-page profile registrations, surveys, etc.) and data that you have collected directly from customer interactions, like purchases.  Secondary data are data acquired from another, indirect source.  These data elements can include demographic (for B2C companies) or firmographic (for B2B companies) elements.  Some of these data elements may be specific to your customer; others may be inferred.  An example of inferred data is to assume that a specific customer has the same household income as the average of all households at the zip code or zip+4 level – a level provided by the U.S. Census.
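
A small sketch of the inferred-income example follows. The zip codes, income figures and field names are invented for illustration; in practice the lookup table would come from census or third-party data.

```python
# Hypothetical zip-level averages (secondary data).
ZIP_AVG_INCOME = {"20740": 68000, "20742": 54000}

def add_income(customer, zip_lookup):
    """Return a copy of the record with household income filled in if missing.

    The income_inferred flag records whether the value came from the
    zip-level average rather than from the customer directly.
    """
    enriched = dict(customer)
    if enriched.get("household_income") is None:
        enriched["household_income"] = zip_lookup.get(enriched["zip"])
        enriched["income_inferred"] = enriched["household_income"] is not None
    else:
        enriched["income_inferred"] = False
    return enriched

# Hypothetical primary data: one customer supplied income, one did not.
customers = [
    {"id": 1, "zip": "20740", "household_income": None},
    {"id": 2, "zip": "20742", "household_income": 91000},
]

enriched = [add_income(c, ZIP_AVG_INCOME) for c in customers]
```

Keeping the flag lets a later model treat stated and inferred values differently.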

The most important data elements are those related to prior transactions (primary data).  These data almost always give the most lift to predictive models.  Secondary data are not always found to be important in terms of explanatory power.  However, they can be very helpful when building customer acquisition models and for profiling customer segments to help deliver the right message.

So, if you want to use data mining to increase your marketing ROI, start collecting these data at the customer level today!

The next posting will address “How Much Data Do I Need?”, or “How Many Customer Records Do I Need to Have Confidence in My Models?”


Basic Data Profiling

When it comes to analyzing data, whether for discovery or prediction, fully understanding the ‘ingredients’ is key.  To profile data sources, a variety of techniques can be employed including a data audit, calculating summary statistics, and graphical techniques.

Data Audit

Creation of a Data Audit report will provide a complete initial look at the data and will guide the approach for dealing with records and fields that are anomalous or extreme.  A data audit includes an analysis of all features of the data attributes such as outliers, extremes, missing values, percent complete, valid records, null values, empty values, white space, unique values and blanks.  Once questionable instances are identified, techniques such as filtering, selecting, de-selecting and imputation methods can be employed to address these issues.   
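
As a sketch of what a data audit computes for a single field, the function below counts nulls, empty strings, whitespace-only values, valid values and unique values. The field contents are hypothetical, and a real audit report would cover every field plus outlier and extreme checks.

```python
def is_blank(value):
    """True for None, empty strings and whitespace-only strings."""
    return value is None or (isinstance(value, str) and value.strip() == "")

def audit_field(values):
    """Basic completeness counts for one field."""
    n = len(values)
    nulls = sum(v is None for v in values)
    empties = sum(isinstance(v, str) and v == "" for v in values)
    whitespace = sum(isinstance(v, str) and v != "" and v.strip() == "" for v in values)
    valid = n - nulls - empties - whitespace
    return {
        "records": n,
        "null": nulls,
        "empty": empties,
        "white_space": whitespace,
        "valid": valid,
        "unique": len({v for v in values if not is_blank(v)}),
        "pct_complete": 100.0 * valid / n if n else 0.0,
    }

# Hypothetical age field as it might arrive from a source system.
ages = [34, None, 51, "", "  ", 34, 29]
report = audit_field(ages)
```

Records flagged here are the ones to filter, de-select or impute before modeling.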

Summary Statistics

Analyzing summary statistics for all numeric fields is an excellent approach for an initial profiling of a data set.  Statistics such as minimum, maximum, sum, range, mean, standard deviation, variance, skewness and kurtosis are all useful measures to assess location, range and variability of attributes.  Summarizing data in table format is also useful, such as cross tabs or aggregating / averaging numeric fields by other attributes.  Examples include calculating the average age by gender or the total revenue by state.  Armed with these results, data errors, problematic distributions and data obstacles can be revealed and dealt with prior to subsequent discovery and modeling efforts.
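
The following sketch computes several of these statistics for one numeric field and an average by group, using only the Python standard library. The records are hypothetical, and the standard deviation, skewness and kurtosis shown are the population versions (kurtosis is reported as excess kurtosis), which is one of several conventions.

```python
from collections import defaultdict
from statistics import mean, pstdev

def summarize(values):
    """Location, range and shape statistics for a numeric field."""
    m, s = mean(values), pstdev(values)
    z = [(v - m) / s for v in values] if s else [0.0] * len(values)
    return {
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "sum": sum(values),
        "mean": m,
        "std": s,
        "variance": s ** 2,
        "skewness": sum(x ** 3 for x in z) / len(values),
        "kurtosis": sum(x ** 4 for x in z) / len(values) - 3.0,
    }

def average_by(records, group_key, value_key):
    """Average of one field within each value of another (e.g. age by gender)."""
    groups = defaultdict(list)
    for r in records:
        groups[r[group_key]].append(r[value_key])
    return {g: mean(vs) for g, vs in groups.items()}

# Hypothetical customer records.
records = [
    {"gender": "F", "age": 30}, {"gender": "F", "age": 40},
    {"gender": "M", "age": 50}, {"gender": "M", "age": 20},
    {"gender": "M", "age": 35},
]

age_summary = summarize([r["age"] for r in records])
avg_age_by_gender = average_by(records, "gender", "age")
```

A minimum or maximum that falls outside the plausible range for a field is often the first sign of a data error.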

Graphical Techniques

Numerous graphical techniques are available for preliminary profiling of data elements.  These techniques include 2-D and 3-D scatter plots, distribution (bar & pie charts) plots, histograms, heat maps, box plots, time series plots and web plots.  Visual techniques will provide a comprehensive view into an entire data field or fields, and extending the plots with color, shape, size or panel overlays using additional variables can reveal further insights.  

When data challenges are uncovered during initial steps of data understanding and data preparation, time and resources can be used efficiently and effectively to correct these situations.  The data profiling steps apply to both source data and derived fields, so these techniques are typically focused on the early stages of the data mining process, though it can also be useful to employ profiling during modeling, evaluation and deployment steps for assessing analytic results.

Successful Data Mining: 80% Data Prep, 20% Modeling & Assessment

Successful data mining is really all about getting your data properly prepared. Data miners spend about 80% of their model building efforts on data preparation. Preparation includes:

1) Missing data analysis. What fields have missing values? Should you fill in the missing values? If so, what values do you use? Should the field be used at all?

2) Outlier detection. Is “33 children in a household” extreme? Probably, and consequently this value should be adjusted, perhaps to the average or to a plausible maximum number of children in your customers’ households.

3) Transformations and standardizations. When various fields have vastly different ranges (e.g., number of children per household and income), it’s often helpful to standardize or normalize your data to get better results. It’s also useful to transform data to get better predictive relationships. For instance, it’s common to transform monetary variables by using their natural logs.

4) Binning Data. Binning continuous variables is an approach that can help with noisy data. It is also required by some data mining algorithms.
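
The four preparation steps above can be sketched in plain Python as follows. The fields, the cap of 6 children, and the choice of median imputation, log(1 + x) and equal-width bins are all assumptions made for the example; the right choices depend on the data and the algorithm.

```python
import math
from statistics import mean, median, pstdev

def impute_median(values):
    """1) Missing data: fill missing entries (None) with the median."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def cap(values, upper):
    """2) Outliers: pull extreme values down to a chosen upper limit."""
    return [min(v, upper) for v in values]

def log_transform(values):
    """3a) Transformation: natural log of (1 + x), which tolerates zeros."""
    return [math.log1p(v) for v in values]

def standardize(values):
    """3b) Standardization: rescale to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def bin_equal_width(values, n_bins):
    """4) Binning: split the observed range into n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical fields: children per household and household income.
children = [2, 0, None, 33, 1, 3]
income = [30000, 45000, 60000, 250000]

children_clean = cap(impute_median(children), upper=6)
income_scaled = standardize(log_transform(income))
children_bins = bin_equal_width(children_clean, n_bins=3)
```

After these steps the two fields are on comparable scales and the extreme household no longer dominates the range.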

These topics will be discussed in more detail in future posts.

Surviving a Recession with Analytics-based Targeted Marketing

During a recession you not only have to compete against your regular competitors, you must also fight the most dreaded enemy of all—no decision! During tough economic times the only thing you can count on is that you’re going to have to work twice as hard to close the same amount of business as before. Therefore, you must ramp up your targeted marketing efforts accordingly.

RFM and data mining help you find a subset of your customers that are most likely to react/respond to your marketing campaigns.  By targeting ONLY those likely to respond, you achieve about the same response/sales at a fraction of the cost.
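
As an illustration of RFM, the sketch below scores each customer from 1 to 5 on recency, frequency and monetary value and then picks the highest combined scores. The customer figures and the cutoff of 12 are hypothetical, and ranking into equal-sized groups is only one of several ways to assign RFM scores.

```python
def quantile_scores(values, n_groups=5, higher_is_better=True):
    """Rank values into n_groups equal-sized groups, scored 1 (worst) to n_groups (best)."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        scores[i] = 1 + rank * n_groups // len(values)
    return scores

# Hypothetical customers, one entry per customer in each list.
customer_ids = ["C1", "C2", "C3", "C4", "C5"]
recency_days = [5, 40, 200, 15, 90]      # days since last purchase
frequency = [12, 3, 1, 8, 2]             # purchases in the last year
monetary = [900, 150, 40, 600, 100]      # total spend in the last year

r = quantile_scores(recency_days, higher_is_better=False)  # recent is better
f = quantile_scores(frequency)
m = quantile_scores(monetary)

rfm_total = [sum(parts) for parts in zip(r, f, m)]

# Target only the customers with the strongest combined score.
targets = [cid for cid, total in zip(customer_ids, rfm_total) if total >= 12]
```

Mailing only `targets` instead of the whole list is what cuts the campaign cost.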

Data Mining Lift Curve
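
The numbers behind a lift curve can be computed as in the sketch below: customers are sorted by model score, and for each fraction of the list contacted we measure the share of all responders captured and the lift over contacting at random. The scores and responses are invented for illustration.

```python
def cumulative_gains(scores, responded, fractions=(0.2, 0.5, 1.0)):
    """For each fraction of the list, return (share of responders captured, lift)."""
    ranked = [resp for _, resp in
              sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)]
    total = sum(ranked)
    results = {}
    for fraction in fractions:
        k = round(fraction * len(ranked))
        captured = sum(ranked[:k]) / total
        results[fraction] = (captured, captured / fraction)
    return results

# Hypothetical model scores and actual responses (1 = responded) for 10 customers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responded = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

gains = cumulative_gains(scores, responded)
```

In this toy data, contacting the top half of the list captures every responder, which is the sense in which targeting delivers about the same response at a fraction of the cost.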

Data Mining For Marketers

Fortune 1000 companies stay Fortune 1000 companies by allocating large budgets to teams of statisticians and data miners to help increase marketing and sales effectiveness. The types of studies performed by these teams enable marketers to more effectively target specific customers for campaigns, resulting in higher response rates, lower costs, and greater customer satisfaction. Leading non-profits also employ these techniques.

Now small businesses can benefit from RFM (Recency, Frequency and Monetary Value) and data mining too.  Post your comment here to have us follow up and explain our services.
