simple

compare

Description

Compares two tables by computing the index and frequency of each group, cumulative frequency, and the index of each record within its group.

simple.compare / table={...} <table2={...}> <casOut={...}> <casOut2={...}> <freqOut={...}> <inputs={{...}, ...}> <table2Inputs={{...}, ...}> <copyVars={"var1", "var2", ...}> <table2CopyVars={"var1", "var2", ...}> <attributes={{...}, ...}> <table2Attributes={{...}, ...}> <details=boolean> <divider="string"> <frequencyName="name"> <frequency2Name="name"> <cumFreqName="name"> <cumFreq2Name="name"> <groupIDName="name"> <groupID2Name="name"> <groupIDBothName="name"> <positionName="name"> <position2Name="name"> <generatedColumns={"ALL" | "CUMFREQ" | "F" | "FREQUENCY" | "GROUPID" | "NONE" | "POSITION"}> <includeDuplicates={"CASOUT" | "CASOUT2" | "FREQOUT"}> <includeMissing=boolean> <inputRegEx={"regex1", "regex2", ...}> <keyModify={"L" | "R" | "U" | "C", ...}> <maxFrequency=long> <maxPosition=long> <minFrequency=long> <minGroupsLL=long> <minPosition=long> <noVars=boolean> <noVars2=boolean> <outputNamedTables=boolean> <position=long> <sparse=boolean>;
Settings
ParameterDescription
tableSpecifies the base table for the comparison.
table2Specifies the comparison table. The records and groups in this table are compared with the records and groups in the base table.
casOutSpecifies the output table that contains records unique to the base table.
casOut2Specifies the output table that contains records unique to the comparison table.
freqOutSpecifies the output table that contains frequency information for the groups.
inputsSpecifies the input variables from the base table to use for grouping.
table2InputsSpecifies the input variables from the comparison table to use for grouping.
copyVarsSpecifies additional variables to copy from the base table to the casOut table.
table2CopyVarsSpecifies additional variables to copy from the comparison table to the casOut2 table.
detailsWhen set to TRUE, writes detailed information, such as the number of groups, to the SAS log.
dividerSpecifies the character used to concatenate character variable values for grouping.
frequencyNameSpecifies the name for the frequency column in the base table's output.
frequency2NameSpecifies the name for the frequency column in the comparison table's output.
cumFreqNameSpecifies the name for the cumulative frequency column in the base table's output.
cumFreq2NameSpecifies the name for the cumulative frequency column in the comparison table's output.
groupIDNameSpecifies the name for the group ID column for the base table.
groupID2NameSpecifies the name for the group ID column for the comparison table.
groupIDBothNameSpecifies the name for the column that contains group IDs common to both tables.
positionNameSpecifies the name for the column indicating a record's position within its group in the base table.
position2NameSpecifies the name for the column indicating a record's position within its group in the comparison table.
generatedColumnsSpecifies which generated columns (like _Frequency_, _GroupID_, _Position_) to include in the output.
includeDuplicatesSpecifies whether to include all records or only unique groups in the output tables.
includeMissingSpecifies whether to include groups with missing values in the analysis.
Data Preparation View data prep sheet
Data Creation

This example creates two tables, `sales_q1` and `sales_q2`, to represent sales data for two different quarters. These tables will be compared to identify differences in product sales.

Copied!
1DATA casuser.sales_q1;
2 INFILE DATALINES;
3 INPUT Product $ Sales;
4 DATALINES;
5Apple 100
6Orange 150
7Banana 200
8Apple 120
9;
10RUN;
11 
12DATA casuser.sales_q2;
13 INFILE DATALINES;
14 INPUT Product $ Sales;
15 DATALINES;
16Apple 110
17Orange 160
18Grape 220
19Orange 140
20;
21RUN;

Examples

This example compares the product groups between `sales_q1` and `sales_q2`. The `inputs` parameter specifies that the comparison should be based on the `Product` column. The results will show which products are unique to each quarter and which are common.

SAS® / CAS Code Code awaiting community validation
Copied!
1PROC CAS;
2 SIMPLE.compare /
3 TABLE={name='sales_q1'},
4 table2={name='sales_q2'},
5 inputs={{name='Product'}},
6 casOut={name='q1_unique', replace=true},
7 casOut2={name='q2_unique', replace=true},
8 freqOut={name='comparison_freq', replace=true};
9RUN;
10QUIT;
Result :
Three output tables are created: `q1_unique` contains products only sold in Q1 (Banana), `q2_unique` contains products only sold in Q2 (Grape), and `comparison_freq` shows the frequency of products that appear in both or have differing counts.

This example performs a detailed comparison, generating several analytical columns. `generatedColumns` is used to create columns for Group ID, Frequency, and Position within the group for both tables. This provides a deeper insight into the structure and composition of sales in each quarter.

SAS® / CAS Code Code awaiting community validation
Copied!
1PROC CAS;
2 SIMPLE.compare /
3 TABLE={name='sales_q1', caslib='casuser'},
4 table2={name='sales_q2', caslib='casuser'},
5 inputs={{name='Product'}},
6 casOut={name='q1_details', replace=true},
7 casOut2={name='q2_details', replace=true},
8 freqOut={name='comparison_details_freq', replace=true},
9 generatedColumns={'GROUPID', 'FREQUENCY', 'POSITION'},
10 groupIDName='Product_ID_Q1',
11 frequencyName='Count_Q1',
12 positionName='Pos_Q1',
13 groupID2Name='Product_ID_Q2',
14 frequency2Name='Count_Q2',
15 position2Name='Pos_Q2';
16RUN;
17QUIT;
Result :
The output tables `q1_details` and `q2_details` will contain the original data plus new columns: `Product_ID_Q1`, `Count_Q1`, `Pos_Q1` for the first table, and similarly named columns for the second. The `comparison_details_freq` table will provide a summary of group frequencies across both tables, helping to identify changes in sales patterns.

FAQ

What is the purpose of the simple.compare action?
What are the primary input tables for the compare action?
How can I manage the output tables generated by the compare action?
What does the 'generatedColumns' parameter control?