crossTab

Q: What is the primary function of the crossTab action?

The crossTab action is used to perform one-way or two-way tabulations on a given dataset.

Q: How can I perform a two-way tabulation?

To perform a two-way tabulation, you must specify both the 'row' parameter for the row variable and the 'col' parameter for the column variable.

Q: What does the 'chiSq' parameter do?

When the 'chiSq' parameter is set to True, the action computes chi-square statistics to test the independence of the row and column variables, including their asymptotic p-values. The default is False.
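For reference, the textbook Pearson chi-square statistic for a table with R rows and C columns is written out below. This is the standard definition, not a statement of exactly which statistics the action reports, since the source does not list them. Here n_ij is the observed cell count, n_i. and n_.j are the row and column totals, and n is the grand total.

```latex
\chi^2 = \sum_{i=1}^{R}\sum_{j=1}^{C}\frac{(n_{ij}-e_{ij})^2}{e_{ij}},
\qquad e_{ij} = \frac{n_{i\cdot}\,n_{\cdot j}}{n},
\qquad \mathrm{df} = (R-1)(C-1)
```

A small asymptotic p-value for this statistic is evidence against independence of the row and column variables.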

Q: How can I handle missing values in the tabulation?

You can include missing values in the crosstabulation by setting the 'includeMissing' parameter to True. By default, it is False.
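A minimal sketch of this option, assuming the 'cars' table from the Data Creation section is already loaded in 'casuser'. The choice of 'Cylinders' as the row variable is an assumption made for illustration, on the expectation that it contains a few missing values.

```sas
proc cas;
   /* Two-way table that keeps missing values as their own level.      */
   /* Assumption: Cylinders has some missing values in casuser.cars.   */
   simple.crossTab /
      table={name='cars', caslib='casuser'}
      row='Cylinders'
      col='Origin'
      includeMissing=true;
run;
quit;
```

Running the same call with includeMissing=false (the default) and comparing the two outputs is a simple way to see how many rows the missing level accounts for.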

Q: What is the purpose of the 'weight' parameter?

The 'weight' parameter specifies a numeric variable whose values are used to compute weighted statistics for each cell in the table, as well as for the margins. The specific statistic is determined by the 'aggregator' parameter.

Q: How can I calculate measures of association between variables?

Set the 'association' parameter to True to compute various measures of association between the row and column variables of the crosstabulation. The default is False.

Q: What does the 'aggregator' parameter control?

The 'aggregator' parameter specifies the type of statistic to compute when a 'weight' variable is used. Options include "SUM", "MEAN", "N" (number of observations), "STD" (standard deviation), and many others.
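As an illustration, the sketch below combines 'weight' and 'aggregator' to request the mean of a numeric variable per cell. It assumes the 'cars' table from the Data Creation section is loaded in 'casuser'; the names of the output columns are not confirmed by the source.

```sas
proc cas;
   /* Each Type-by-Origin cell should report the mean MSRP            */
   /* instead of a row count, because aggregator='MEAN' is applied    */
   /* to the weight variable.                                         */
   simple.crossTab /
      table={name='cars', caslib='casuser'}
      row='Type'
      col='Origin'
      weight='MSRP'
      aggregator='MEAN';
run;
quit;
```

Swapping 'MEAN' for another listed value such as 'SUM' or 'MAX' changes only the statistic computed for each cell and margin.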

Description

The crossTab action performs one-way or two-way tabulations, also known as frequency counts. It can compute various statistics such as chi-square tests for independence and measures of association. This action is useful for understanding the distribution and relationship between categorical variables.

simple.crossTab /
   table={name='tableName', caslib='caslibName'}
   row='rowVar' col='colVar' weight='weightVar'
   chiSq=true association=true includeMissing=true descending=false
   aggregator='N' rowNBins=0 colNBins=0 niceBinning=true
   fullTable=false acrossBy=false orderByGbyRaw=false groupByLimit=0
   rowFormat='format' colFormat='format';

Settings

Parameter	Description
acrossBy	When set to True, the levels of row and column variables are the same across the group-by variables.
aggregator	Specifies the statistic used to aggregate the values of the weight variable when a weight variable is specified. Values include 'CSS', 'CV', 'KURTOSIS', 'MAX', 'MEAN', 'MIN', 'N', 'NMISS', 'PROBT', 'SKEWNESS', 'STD', 'STDERR', 'SUM', 'TSTAT', 'USS', 'VAR'.
association	When set to True, measures of association between the row and column variables of the crosstabulation are computed.
chiSq	When set to True, chi-square statistics and their asymptotic p-values are computed for the test of independence of the row and column variables.
col	Specifies the column variable for the crosstabulation.
colFormat	Specifies a format for the column variable.
colNBins	Specifies the number of bins to use in binning the column variable.
descending	When set to True, the formatted levels of the variables are arranged in descending order.
fullTable	When set to True, a full-table scan is performed.
groupByLimit	Specifies the maximum number of levels in a group-by set. If the server determines that a group-by set reaches this number of levels, it stops and does not return a result.
includeMissing	When set to True, missing values are included in the crosstabulation.
niceBinning	When set to True, the nice binning algorithm is used.
orderByGbyRaw	When set to True, the ordering of the group-by variables is based on the raw values of the variables, not the formatted values.
row	Specifies the row variable for the crosstabulation.
rowFormat	Specifies a format for the row variable.
rowNBins	Specifies the number of bins to use in binning the row variable.
table	Specifies the input CAS table for the analysis.
weight	Specifies the numeric weight variable used to compute the statistics in the table cell and in the margins of the table.
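The binning parameters in the table above can be combined as in the following sketch, which groups a continuous variable into bins before tabulating it. It assumes the 'cars' table from the Data Creation section is loaded in 'casuser'. 'Horsepower' and the bin count of 5 are illustrative choices, and the source does not state whether niceBinning may adjust the number of bins actually produced.

```sas
proc cas;
   /* Bin the continuous Horsepower variable into (about) 5 bins,     */
   /* then cross-tabulate the bins against Origin.                    */
   simple.crossTab /
      table={name='cars', caslib='casuser'}
      row='Horsepower'
      rowNBins=5
      niceBinning=true
      col='Origin';
run;
quit;
```

Use colNBins in the same way when the column variable is the continuous one.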

Data Preparation

Data Creation

The following code creates the 'cars' table in the 'casuser' caslib, which will be used in the examples. This table is a copy of the 'sashelp.cars' dataset.

data casuser.cars;
   set sashelp.cars;
run;

Examples

One-Way Frequency Table

This example generates a one-way frequency table for the 'Type' variable in the 'cars' table. It is the most basic use of the crossTab action.

SAS® / CAS Code (awaiting community validation)

proc cas;
   /* One-way frequency table of Type */
   simple.crossTab / table={name='cars', caslib='casuser'} row='Type';
run;

Result:
The output will be a result table named 'Crosstab' showing the frequency counts ('_Freq_'), percentages ('_Pct_'), and cumulative frequencies for each car type found in the 'Type' column.

Two-Way Tabulation with Weighting and Chi-Square Test

This example performs a two-way tabulation of 'Type' by 'Origin'. It uses the 'MSRP' column as the weight variable, so each cell reports the sum of MSRP for that combination rather than a count of rows. It also requests chi-square statistics to test whether 'Type' and 'Origin' are independent.

SAS® / CAS Code (awaiting community validation)

proc cas;
   /* Type by Origin, weighted by MSRP, with chi-square statistics */
   simple.crossTab / table={name='cars', caslib='casuser'} row='Type' col='Origin' weight='MSRP' chiSq=true;
run;

Result:
The result includes a primary crosstabulation table where each cell contains the sum of 'MSRP' for that combination of 'Type' and 'Origin'. A second table, 'ChiSq', provides the Chi-Square test statistics, degrees of freedom (DF), and the corresponding p-value to assess the statistical significance of the association.

Two-Way Tabulation with Measures of Association

This example computes a two-way frequency table for 'Type' and 'Origin' and requests measures of association to quantify the strength of the relationship between these two categorical variables.

SAS® / CAS Code (awaiting community validation)

proc cas;
   /* Type by Origin with measures of association */
   simple.crossTab / table={name='cars', caslib='casuser'} row='Type' col='Origin' association=true;
run;

Result:
The output will include the standard crosstabulation table. Additionally, a second table named 'Measures' will be generated, containing statistics like Gamma, Kendall's Tau-b, Stuart's Tau-c, Somers' D, and others, along with their standard errors and confidence limits.
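Because the examples on this page are marked as awaiting validation, it can help to inspect what the action actually returns rather than rely on the table names quoted in the result descriptions. The sketch below captures the results in a CASL variable and lists their structure. The variable name r is arbitrary, and the 'cars' table is assumed to be loaded as in the Data Creation section.

```sas
proc cas;
   /* Store everything the action returns in the CASL variable r */
   simple.crossTab result=r /
      table={name='cars', caslib='casuser'}
      row='Type'
      col='Origin'
      chiSq=true
      association=true;

   /* Show the keys and types of the returned result tables */
   describe r;

   /* Print the full contents of the results */
   print r;
run;
quit;
```

The keys listed by describe can then be used to reference a single result table, for example to print only the chi-square output.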
