Reshapes a table from a long format (multiple rows per subject) to a wide format (one row per subject with multiple columns for variables). This action is useful for preparing data for analyses that require a wide data structure.
| Parameter | Description |
|---|---|
| table | Specifies the input long-format table. Use its `groupBy` sub-parameter to specify the classification variable whose values will define the new columns in the output wide table. |
| id | Specifies the variable(s) that identify the observations. Each unique combination of ID variable values will form a single row in the output wide table. |
| inputs | Specifies the numeric variable(s) whose values will populate the cells of the new columns in the wide table. |
| casOut | Specifies the output wide-format table. |
| attributes | Specifies attributes for the variables, such as formats and labels. |
| charSeparatorChar | Specifies a character to use as a separator in the names of new variables when concatenating character values. |
| numSeparatorNum | Specifies a character to use as a separator in the names of new variables when concatenating numeric values. |
| cumFreqName | Specifies the variable name for the cumulative frequency in the output table. |
| frequencyName | Specifies the variable name for the frequency in the output table. |
| groupIdName | Specifies the variable in the output table that contains the group ID. |
| keyModify | Specifies modifications to character key values, such as converting to uppercase (U) or compressing blanks (C). |
| maxPosition | Specifies the maximum value of the position variable to consider. Records with a position value greater than this are ignored. |
| noPrefix | When set to True, prevents prefixing the statistic name to the variable name in the output table (e.g., '_sum' instead of 'var_sum'). |
| sum | Specifies numeric variables for which to compute the sum for each ID group. |
| mean | Specifies numeric variables for which to compute the mean for each ID group. |
| min | Specifies numeric variables for which to compute the minimum value for each ID group. |
| max | Specifies numeric variables for which to compute the maximum value for each ID group. |
| range | Specifies numeric variables for which to compute the range of values for each ID group. |
| nMiss | Specifies variables for which to count the number of missing values for each ID group. |
| orderByTable | Specifies a pre-sorted and grouped table to improve performance, typically from the groupBy or groupByInfo actions. |
This SAS code creates a sample table named 'sales_long' in the active caslib. The table is in a long format, with each row representing the sales of a single product for a single month. This format is ideal for reshaping into a wide format.
| 1 | DATA mycas.sales_long; |
| 2 | LENGTH product $ 10 month $ 3; |
| 3 | INFILE DATALINES; |
| 4 | INPUT product $ month $ sales; |
| 5 | DATALINES; |
| 6 | apple jan 100 |
| 7 | apple feb 110 |
| 8 | apple mar 120 |
| 9 | orange jan 80 |
| 10 | orange feb 85 |
| 11 | orange mar 95 |
| 12 | banana jan 120 |
| 13 | banana feb 125 |
| 14 | banana mar 130 |
| 15 | ; |
| 16 | RUN; |
This example converts the 'sales_long' table to a wide format. 'product' becomes the key row identifier, the unique values of 'month' become new columns, and the 'sales' values fill the cells of these new columns.
| 1 | PROC CAS; |
| 2 | datashaping.longToWide / |
| 3 | TABLE={name='sales_long', groupBy={'month'}} |
| 4 | id={'product'} |
| 5 | inputs={'sales'} |
| 6 | casOut={name='sales_wide', replace=true}; |
| 7 | RUN; |
| 8 | |
| 9 | PROC PRINT DATA=mycas.sales_wide; RUN; |
This example reshapes the long table while also generating summary statistics. The 'sum' and 'mean' parameters compute the total and average sales for each product. The 'numSeparatorNum' parameter specifies an underscore '_' to separate the original input variable name ('sales') from the group-by variable values ('jan', 'feb', 'mar') in the new column names.
| 1 | PROC CAS; |
| 2 | datashaping.longToWide / |
| 3 | TABLE={name='sales_long', groupBy={'month'}} |
| 4 | id={'product'} |
| 5 | inputs={'sales'} |
| 6 | sum={'sales'} |
| 7 | mean={'sales'} |
| 8 | numSeparatorNum='_' |
| 9 | casOut={name='sales_wide_stats', replace=true}; |
| 10 | RUN; |
| 11 | |
| 12 | PROC PRINT DATA=mycas.sales_wide_stats; RUN; |