ruleMining

mbanalysis

Description

The mbanalysis action performs market basket analysis, a common technique in data mining used to discover co-occurrence relationships among items. It identifies association rules from a transactional dataset, which is useful for understanding customer purchasing patterns, such as which products are frequently bought together. This information can be leveraged for store layout design, cross-marketing promotions, and catalog design.

ruleMining.mbanalysis <result=results> <status=rc> / antecedentList={"string-1" <, "string-2", ...>}, conf=double, consequentList={"string-1" <, "string-2", ...>}, hierarchy={{castable-1} <, {castable-2}, ...>}, idVariable="variable-name", items=integer, lift=double, maxItems=integer, minItems=integer, nLHS_range={nLHSRHSOpts}, norm=TRUE | FALSE, nRHS_range={nLHSRHSOpts}, out={casouttable}, outfreq={casouttable}, outrule={casouttable}, saveState={casouttable}, separator="string", sup_lift=double, supmin=double, suppct=double, table={castable}, tgtVariable="variable-name";
Settings
Parameter: Description
antecedentList: Specifies the regular expression strings to match in the antecedent (left-hand side) of a rule.
conf: Specifies the minimum confidence for the rules, as a percentage. Default: 50.
consequentList: Specifies the regular expression strings to match in the consequent (right-hand side) of a rule.
hierarchy: Specifies one or more hierarchy tables. If you omit this parameter, the action performs simple association analysis without a hierarchy. You can specify up to five tables, each specifying one level of the hierarchy.
idVariable: Specifies the variable used to group the target variable into baskets (transactions).
items: Specifies the number of items in a rule. Default: 2 when 'out' or 'outrule' is specified, otherwise 1.
lift: Specifies the minimum lift value necessary to generate a rule. Default: 1.
maxItems: Specifies a maximum basket size; baskets larger than this value are rejected. Default: 1000.
minItems: Specifies a minimum basket size; baskets smaller than this value are rejected. Default: 1.
nLHS_range: Specifies the range for the number of items in the left-hand side (LHS) of a rule, including 'lower' and 'upper' bounds.
norm: When set to TRUE, normalizes the values of the target variable and the items in the output tables. Default: FALSE.
nRHS_range: Specifies the range for the number of items in the right-hand side (RHS) of a rule, including 'lower' and 'upper' bounds.
out: Specifies the output table to contain frequent item sets used to generate rules, including transaction counts and support.
outfreq: Specifies the output table to contain the unique frequent items along with their transaction counts and support.
outrule: Specifies the output table to contain the generated rules, including LHS, RHS, support, confidence, and lift.
saveState: Specifies the table in which to save the mining model for future scoring.
separator: Specifies the separator character used in the rule's antecedent and consequent strings. Default: '&'.
sup_lift: Specifies the minimum support lift necessary to generate a rule. Default: 0.
supmin: Specifies the minimum absolute support count for a rule. This overrides the 'suppct' parameter.
suppct: Specifies the minimum support for a rule as a percentage of the total number of baskets.
table: Specifies the input data table containing transaction data.
tgtVariable: Specifies the nominal variable to be used as the target item in the transactions.
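
To show how several of these parameters combine in one call, here is a minimal sketch that requests all three output tables at once. It assumes a transaction table like the 'sample_transactions' table from the Data Preparation section already exists in the 'casuser' caslib; the output table names 'itemsets', 'item_freq', and 'rules' are hypothetical and chosen only for illustration.

```sas
PROC CAS;
   ruleMining.mbanalysis /
      TABLE={name='sample_transactions', caslib='casuser'},  /* assumed input table */
      idVariable='transaction_id',
      tgtVariable='item',
      supmin=1,                                   /* absolute count; overrides suppct */
      conf=50,                                    /* minimum confidence, in percent */
      out={name='itemsets', replace=true},        /* hypothetical name: frequent item sets */
      outfreq={name='item_freq', replace=true},   /* hypothetical name: unique frequent items */
      outrule={name='rules', replace=true};       /* hypothetical name: generated rules */
RUN;
QUIT;
```

Because 'out' and 'outrule' are specified, the 'items' parameter defaults to 2 here, as described in the table above.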
Data Preparation
Create Sample Transaction Data

This example code creates a simple CAS table named 'sample_transactions' containing transaction data. Each row represents an item within a transaction, identified by 'transaction_id'. This format is typical for market basket analysis.

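/* Assumes an active CAS session and a CAS libref named casuser, */
/* e.g. cas mysess; libname casuser cas caslib="casuser";        */
DATA casuser.sample_transactions;
   INFILE DATALINES;
   /* One row per item; transaction_id groups items into baskets */
   INPUT transaction_id item $;
   DATALINES;
1 apple
1 banana
1 milk
2 bread
2 butter
3 apple
3 bread
3 cheese
;
RUN;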
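
Before mining, it can help to confirm that the table loaded as expected by counting how often each item appears. The following is a minimal sketch using the simple.freq action; it assumes the session is still active, that the table lives in the 'casuser' caslib, and that the simple action set is available in the session.

```sas
PROC CAS;
   /* Count occurrences of each distinct value of 'item' */
   simple.freq /
      TABLE={name='sample_transactions', caslib='casuser'},
      inputs={'item'};
RUN;
QUIT;
```

With the sample data above, 'apple' and 'bread' each appear in two rows and every other item appears once.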

Examples

This example performs a basic market basket analysis on the 'sample_transactions' table. It identifies rules with a minimum support of 1% and a minimum confidence of 50%. The resulting rules are saved to the 'basic_rules' CAS table.

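PROC CAS;
   ruleMining.mbanalysis /
      /* caslib is stated explicitly so the call does not depend on the active caslib */
      TABLE={name='sample_transactions', caslib='casuser'},
      idVariable='transaction_id',   /* groups rows into baskets */
      tgtVariable='item',            /* the item in each basket */
      suppct=1,                      /* minimum support: 1% of baskets */
      conf=50,                       /* minimum confidence: 50% */
      outrule={name='basic_rules', caslib='casuser', replace=true};
RUN;
QUIT;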
Result:
An output table named 'basic_rules' is created in the 'casuser' caslib. This table contains association rules that meet the specified support and confidence thresholds, showing relationships like which items are often purchased together.
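
To inspect the generated rules, the output table can be fetched into the results window. This is a minimal sketch using the table.fetch action, assuming 'basic_rules' was written to the 'casuser' caslib as described above; the row limit of 20 is an arbitrary choice for illustration.

```sas
PROC CAS;
   /* Display up to 20 rows of the rules table */
   table.fetch /
      TABLE={name='basic_rules', caslib='casuser'},
      to=20;
RUN;
QUIT;
```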

This example demonstrates a more targeted analysis. It searches for rules containing exactly 3 items, filters for rules where 'apple' is in the antecedent (LHS), and requires a minimum lift of 1.2. This helps identify strong, specific associations involving a particular product.

PROC CAS;
   ruleMining.mbanalysis /
      TABLE={name='sample_transactions', caslib='casuser'},
      idVariable='transaction_id',
      tgtVariable='item',
      items=3,                       /* number of items in a rule */
      lift=1.2,                      /* minimum lift */
      antecedentList={'apple'},      /* regular expression matched against the LHS */
      outrule={name='detailed_apple_rules', replace=true};
RUN;
QUIT;
Result:
The action produces a CAS table named 'detailed_apple_rules'. This table will contain only the 3-item rules where 'apple' is part of the antecedent and the rule's lift is at least 1.2, indicating a stronger-than-random co-occurrence with the consequent items.
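
To make the lift threshold concrete, the following worked example computes support, confidence, and lift by hand for one candidate rule from the sample data. It assumes the standard textbook definitions of these measures; the exact scaling the action reports in its output tables (for example, percentages rather than fractions) is not confirmed here.

```latex
% Sample baskets:
%   T1 = {apple, banana, milk},  T2 = {bread, butter},  T3 = {apple, bread, cheese}
\[
\mathrm{supp}(X) = \frac{\text{number of baskets containing } X}{\text{number of baskets}},
\qquad
\mathrm{conf}(A \Rightarrow B) = \frac{\mathrm{supp}(A \cup B)}{\mathrm{supp}(A)},
\qquad
\mathrm{lift}(A \Rightarrow B) = \frac{\mathrm{conf}(A \Rightarrow B)}{\mathrm{supp}(B)}
\]

% Rule: apple => banana & milk
\[
\mathrm{supp}(\{\text{apple}, \text{banana}, \text{milk}\}) = \tfrac{1}{3},
\qquad
\mathrm{supp}(\{\text{apple}\}) = \tfrac{2}{3},
\qquad
\mathrm{supp}(\{\text{banana}, \text{milk}\}) = \tfrac{1}{3}
\]
\[
\mathrm{conf} = \frac{1/3}{2/3} = 0.5,
\qquad
\mathrm{lift} = \frac{0.5}{1/3} = 1.5
\]
```

Under these definitions the rule has a lift of 1.5, which is above the 1.2 threshold used in this example.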

FAQ

What does the mbanalysis action do?
What are the required parameters for the mbanalysis action?
What is the purpose of the 'conf' parameter?
How can I filter rules based on lift?
Is it possible to perform analysis with a product hierarchy?
How can I save the results of the market basket analysis?