The mbanalysis action performs market basket analysis, a common technique in data mining used to discover co-occurrence relationships among items. It identifies association rules from a transactional dataset, which is useful for understanding customer purchasing patterns, such as which products are frequently bought together. This information can be leveraged for store layout design, cross-marketing promotions, and catalog design.
| Parameter | Description |
|---|---|
| antecedentList | Specifies the regular expression strings to match in the antecedent (left-hand side) of a rule. |
| conf | Specifies the minimum confidence for the rules, as a percentage. Default: 50. |
| consequentList | Specifies the regular expression strings to match in the consequent (right-hand side) of a rule. |
| hierarchy | Specifies one or more hierarchy tables. If you omit this parameter, the action performs simple association analysis without a hierarchy. You can specify up to five tables, each specifying one level of the hierarchy. |
| idVariable | Specifies the variable used to group the target variable into baskets (transactions). |
| items | Specifies the number of items in a rule. Default: 2 when 'out' or 'outrule' is specified; otherwise 1. |
| lift | Specifies the minimum lift value necessary to generate a rule. Default: 1. |
| maxItems | Specifies a maximum basket size; baskets larger than this value are rejected. Default: 1000. |
| minItems | Specifies a minimum basket size; baskets smaller than this value are rejected. Default: 1. |
| nLHS_range | Specifies the range for the number of items in the left-hand side (LHS) of a rule, including 'lower' and 'upper' bounds. |
| norm | When set to TRUE, normalizes the values of the target variable and the items in the output tables. Default: FALSE. |
| nRHS_range | Specifies the range for the number of items in the right-hand side (RHS) of a rule, including 'lower' and 'upper' bounds. |
| out | Specifies the output table to contain frequent item sets used to generate rules, including transaction counts and support. |
| outfreq | Specifies the output table to contain the unique frequent items along with their transaction counts and support. |
| outrule | Specifies the output table to contain the generated rules, including LHS, RHS, support, confidence, and lift. |
| saveState | Specifies the table in which to save the mining model for future scoring. |
| separator | Specifies the separator character used in the rule's antecedent and consequent strings. Default: '&'. |
| sup_lift | Specifies the minimum support lift necessary to generate a rule. Default: 0. |
| supmin | Specifies the minimum absolute support count for a rule. This overrides the 'suppct' parameter. |
| suppct | Specifies the minimum support for a rule as a percentage of the total number of baskets. |
| table | Specifies the input data table containing transaction data. |
| tgtVariable | Specifies the nominal variable to be used as the target item in the transactions. |
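The thresholds set by 'suppct', 'conf', and 'lift' are easier to read against the usual textbook definitions of the three rule metrics. The formulas below are those standard definitions, assumed here to match what the action computes. The symbols are introduced for illustration only: $N$ is the total number of baskets, and $n(X)$ is the number of baskets that contain every item in the set $X$.

$$
\mathrm{supp}(A \Rightarrow B) = \frac{n(A \cup B)}{N}, \qquad
\mathrm{conf}(A \Rightarrow B) = \frac{n(A \cup B)}{n(A)}, \qquad
\mathrm{lift}(A \Rightarrow B) = \frac{\mathrm{conf}(A \Rightarrow B)}{n(B)/N}
$$

Support and confidence are fractions that the parameters express as percentages. A lift above 1 means the antecedent and consequent occur together more often than they would if they were independent.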
This example code creates a simple CAS table named 'sample_transactions' containing transaction data. Each row represents an item within a transaction, identified by 'transaction_id'. This format is typical for market basket analysis.
```sas
/* Create a CAS table with one row per item per transaction. */
DATA casuser.sample_transactions;
    INFILE DATALINES;
    INPUT transaction_id item $;
    DATALINES;
1 apple
1 banana
1 milk
2 bread
2 butter
3 apple
3 bread
3 cheese
;
RUN;
```
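The DATA step above writes to a libref named 'casuser', so it needs an active CAS session and a libref bound to a caslib before it runs. A minimal setup sketch follows, assuming a default SAS Viya configuration. The session name `mysess` is a hypothetical placeholder, and binding the libref to the CASUSER caslib is an assumption about where the table should be stored.

```sas
/* Start a CAS session. The name mysess is a hypothetical placeholder. */
cas mysess;

/* Bind the libref CASUSER to the CASUSER caslib so that
   DATA casuser.sample_transactions writes an in-memory CAS table. */
libname casuser cas caslib=casuser;
```

Run these statements once per SAS session, before the DATA step.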
This example performs a basic market basket analysis on the 'sample_transactions' table. It identifies rules with a minimum support of 1% and a minimum confidence of 50%. The resulting rules are saved to the 'basic_rules' CAS table.
```sas
PROC CAS;
    /* Mine rules with at least 1% support and 50% confidence. */
    ruleMining.mbanalysis /
        TABLE={name='sample_transactions'},
        idVariable='transaction_id',
        tgtVariable='item',
        suppct=1,
        conf=50,
        outrule={name='basic_rules', replace=true};
RUN;
QUIT;
```
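To inspect the rules that the action produced, the output table can be fetched in the same way as any other CAS table. This is a sketch that assumes the action completed successfully and that 'basic_rules' was written to the active caslib, the same caslib the input table was read from. The row limit of 20 is an arbitrary choice.

```sas
PROC CAS;
    /* Print up to 20 rows of the rules table written by mbanalysis. */
    table.fetch /
        table={name='basic_rules'},
        to=20;
RUN;
QUIT;
```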
This example demonstrates a more targeted analysis. It searches for rules containing exactly 3 items, filters for rules where 'apple' is in the antecedent (LHS), and requires a minimum lift of 1.2. This helps identify strong, specific associations involving a particular product.
```sas
PROC CAS;
    /* Keep only 3-item rules with 'apple' on the left-hand side and lift >= 1.2. */
    ruleMining.mbanalysis /
        TABLE={name='sample_transactions'},
        idVariable='transaction_id',
        tgtVariable='item',
        items=3,
        lift=1.2,
        antecedentList={'apple'},
        outrule={name='detailed_apple_rules', replace=true};
RUN;
QUIT;
```
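To see what the lift threshold means on the sample data, the metrics for one candidate three-item rule can be worked out by hand. The calculation uses the standard definitions (confidence is the share of antecedent baskets that also contain the consequent; lift is confidence divided by the consequent's overall basket share), which are assumed to match the action's. Whether the action emits this exact rule depends on its defaults, so treat it as an illustration rather than expected output. The sample table has $N = 3$ baskets. For the rule $\{\text{apple}, \text{banana}\} \Rightarrow \{\text{milk}\}$, all three items appear together only in transaction 1, the antecedent appears only in transaction 1, and milk appears only in transaction 1:

$$
\mathrm{supp} = \frac{1}{3} \approx 33.3\%, \qquad
\mathrm{conf} = \frac{1}{1} = 100\%, \qquad
\mathrm{lift} = \frac{1}{1/3} = 3
$$

A lift of 3 clears the minimum of 1.2. By contrast, the pair apple and bread co-occurs in one basket while each item appears in two, giving a confidence of $1/2$ and a lift of $\frac{1/2}{2/3} = 0.75$, which a minimum lift of 1.2 would reject.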