simple

crossTab

Description

The crossTab action performs one-way or two-way tabulations, also known as frequency counts. It can compute various statistics such as chi-square tests for independence and measures of association. This action is useful for understanding the distribution and relationship between categorical variables.

simple.crossTab / table={name='tableName', caslib='caslibName'} row='rowVar' col='colVar' weight='weightVar' chiSq=true association=true includeMissing=true descending=false aggregator='N' rowNBins=0 colNBins=0 niceBinning=true fullTable=false acrossBy=false orderByGbyRaw=false groupByLimit=0 rowFormat='format' colFormat='format';
Settings
Parameter | Description
acrossBy | When set to True, the levels of the row and column variables are the same across the group-by variables.
aggregator | Specifies the aggregator used to roll up the values of the weight variable into a rank order score when a weight variable is specified. Values include 'CSS', 'CV', 'KURTOSIS', 'MAX', 'MEAN', 'MIN', 'N', 'NMISS', 'PROBT', 'SKEWNESS', 'STD', 'STDERR', 'SUM', 'TSTAT', 'USS', 'VAR'.
association | When set to True, measures of association between the row and column variables of the crosstabulation are computed.
chiSq | When set to True, chi-square statistics and their asymptotic p-values are computed for the test of independence of the row and column variables.
col | Specifies the column variable for the crosstabulation.
colFormat | Specifies a format for the column variable.
colNBins | Specifies the number of bins to use in binning the column variable.
descending | When set to True, the formatted levels of the variables are arranged in descending order.
fullTable | When set to True, a full-table scan is performed.
groupByLimit | Specifies the maximum number of levels in a group-by set. When the server determines that the group-by set has this number of levels, it stops and does not return a result.
includeMissing | When set to True, missing values are included in the crosstabulation.
niceBinning | When set to True, the nice binning algorithm is used.
orderByGbyRaw | When set to True, the group-by variables are ordered by their raw values rather than their formatted values.
row | Specifies the row variable for the crosstabulation.
rowFormat | Specifies a format for the row variable.
rowNBins | Specifies the number of bins to use in binning the row variable.
table | Specifies the input CAS table for the analysis.
weight | Specifies the numeric weight variable used to compute the statistics in the table cells and in the margins of the table.
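
As a rough illustration of how 'weight' and 'aggregator' work together, the sketch below asks for the mean of the weight variable in each cell rather than a sum. It assumes the 'cars' table built in the Data Preparation section and uses only parameters listed in the table above; the exact layout of the returned table has not been validated here.

```sas
PROC CAS;
  /* Assumption: 'cars' is already loaded in the 'casuser' caslib. */
  /* aggregator='MEAN' requests the mean of MSRP per Type x Origin cell. */
  SIMPLE.crossTab /
    TABLE={name='cars', caslib='casuser'}
    row='Type'
    col='Origin'
    weight='MSRP'
    aggregator='MEAN';
RUN;
```

Swapping 'MEAN' for another value from the aggregator list (for example 'MAX' or 'STD') should change only the statistic reported in each cell.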
Data Preparation
Data Creation

The following code creates the 'cars' table in the 'casuser' caslib, which will be used in the examples. This table is a copy of the 'sashelp.cars' dataset.

/* Copy sashelp.cars into the casuser caslib. */
DATA casuser.cars; SET sashelp.cars; RUN;

Examples

This example generates a simple one-way frequency table for the 'Type' variable from the 'cars' table. This is the most basic use of the crossTab action.

SAS® / CAS Code (code awaiting community validation)
PROC CAS;
  /* One-way frequency table: only the row variable is specified. */
  SIMPLE.crossTab / TABLE={name='cars', caslib='casuser'} row='Type';
RUN;

Result:
The output will be a result table named 'Crosstab' showing the frequency counts ('_Freq_'), percentages ('_Pct_'), and cumulative frequencies for each car type found in the 'Type' column.

This example performs a two-way tabulation between the 'Type' and 'Origin' variables. It uses the 'MSRP' column as the weight variable, so each cell contains the sum of MSRP for that combination of 'Type' and 'Origin' rather than a count of observations. It also requests chi-square statistics to test for independence between 'Type' and 'Origin'.

SAS® / CAS Code
PROC CAS;
  /* Two-way table weighted by MSRP, with chi-square statistics. */
  SIMPLE.crossTab / TABLE={name='cars', caslib='casuser'} row='Type' col='Origin' weight='MSRP' chiSq=true;
RUN;

Result:
The result includes a primary crosstabulation table where each cell contains the sum of 'MSRP' for that combination of 'Type' and 'Origin'. A second table, 'ChiSq', provides the Chi-Square test statistics, degrees of freedom (DF), and the corresponding p-value to assess the statistical significance of the association.
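
For reference, a chi-square test of independence on a table with R rows and C columns is conventionally based on the Pearson statistic below. This is the textbook form, offered as an assumption about what the 'ChiSq' table reports; the action may return other chi-square variants as well. Here n_ij is the (weighted) total in cell (i, j), n_i. and n_.j are the row and column totals, n is the grand total, and e_ij is the expected cell total under independence.

```latex
\[
\chi^2 = \sum_{i=1}^{R}\sum_{j=1}^{C} \frac{(n_{ij} - e_{ij})^2}{e_{ij}},
\qquad e_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n},
\qquad \mathrm{DF} = (R-1)(C-1)
\]
```

A large statistic relative to its degrees of freedom yields a small p-value, which is evidence against independence of the row and column variables.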

This example computes a two-way frequency table for 'Type' and 'Origin' and requests various measures of association to quantify the strength of the relationship between these two categorical variables.

SAS® / CAS Code
PROC CAS;
  /* Two-way frequency table with measures of association. */
  SIMPLE.crossTab / TABLE={name='cars', caslib='casuser'} row='Type' col='Origin' association=true;
RUN;

Result:
The output will include the standard crosstabulation table. Additionally, a second table named 'Measures' will be generated, containing statistics like Gamma, Kendall's Tau-b, Stuart's Tau-c, Somers' D, and others, along with their standard errors and confidence limits.
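
The binning and missing-value parameters from the Settings table can be exercised in the same way. The following is a sketch rather than validated code: it assumes the same 'cars' table, that 'Horsepower' is a numeric column suitable for binning, and that 'Cylinders' contains a small number of missing values in this dataset. No output is shown because it has not been confirmed.

```sas
PROC CAS;
  /* Bin the numeric 'Horsepower' column into 5 bins and cross it with 'Origin'. */
  SIMPLE.crossTab /
    TABLE={name='cars', caslib='casuser'}
    row='Horsepower' rowNBins=5
    col='Origin'
    chiSq=true;

  /* Include rows whose 'Cylinders' value is missing in the tabulation. */
  SIMPLE.crossTab /
    TABLE={name='cars', caslib='casuser'}
    row='Cylinders'
    col='Type'
    includeMissing=true;
RUN;
```

Comparing the second call with and without includeMissing=true is a simple way to see how many observations the default setting leaves out.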

FAQ

What is the primary function of the crossTab action?
How can I perform a two-way tabulation?
What does the 'chiSq' parameter do?
How can I handle missing values in the tabulation?
What is the purpose of the 'weight' parameter?
How can I calculate measures of association between variables?
What does the 'aggregator' parameter control?

Associated Scenarios

Use Case
Retail Revenue Distribution Analysis

A large retail chain wants to optimize its supply chain by analyzing sales performance. Instead of a simple transaction count, they need to sum the total 'MSRP' (Market Suggeste...

Use Case
IoT Temperature Binning and Independence Test

An industrial manufacturing plant monitors engine temperatures via IoT sensors. The data is continuous, but engineers need to categorize temperatures into 5 distinct 'bins' (ran...

Use Case
Clinical Trial Attrition with Missing Data

In a clinical drug trial, some patients dropped out, resulting in missing values for the 'SideEffects' variable. The researchers must include these missing values in the analysi...