In the world of data analysis with SAS©, producing descriptive statistics is a fundamental step. Three procedures particularly stand out for these tasks: PROC MEANS, PROC SUMMARY, and PROC TABULATE. Although they share a common foundation, each has its own specifics and precise use cases.
This article explores their differences, their respective operations, and how to use them effectively, whether through code or graphical interfaces like SAS© Enterprise Guide and SAS© Studio.
The PROC MEANS procedure is often the analyst's first instinct. Its main function is to calculate descriptive statistics for variables, either for all observations or by groups.
Its capabilities include:
Estimating quantiles (including the median).
Calculating confidence limits for the mean.
Identifying extreme values.
Performing a Student's t-test of the hypothesis that the mean is zero (T and PROBT statistics).
Main feature: By default, PROC MEANS displays its results directly in the Output window.
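As an illustration of these capabilities, here is a minimal sketch using sashelp.cars. The choice of variables, the work.mpg_diff table, and the mpg_diff variable are assumptions made for the example, not part of any reference implementation.

```sas
/* Quantiles, extremes (MIN/MAX) and 95% confidence limits for the mean, */
/* computed for each vehicle type                                        */
proc means data=sashelp.cars n mean median q1 q3 min max clm alpha=0.05 maxdec=2;
    class type;
    var mpg_city mpg_highway;
run;

/* Hypothetical helper table: highway minus city consumption per vehicle */
data work.mpg_diff;
    set sashelp.cars;
    mpg_diff = mpg_highway - mpg_city;
run;

/* T and PROBT test H0: mean(mpg_diff) = 0, i.e. a paired comparison */
proc means data=work.mpg_diff n mean stderr t probt maxdec=3;
    var mpg_diff;
run;
```

The statistics are requested as keywords on the PROC MEANS statement; when none are listed, the procedure reports N, mean, standard deviation, minimum, and maximum.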
Technically, PROC SUMMARY is identical to PROC MEANS in its statistical calculations: it accepts the same statements and offers the same processing options.
The key difference: Unlike MEANS, which favors display, SUMMARY is designed to write its results to an output table (dataset). It displays nothing by default in the results window, making it ideal for preparing intermediate data without cluttering reports.
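The two procedures can be made to behave like each other by switching the default display on or off. The sketch below assumes the output table name work.hp_means and the variable name avg_hp, both chosen only for illustration.

```sas
/* PROC MEANS with NOPRINT: nothing is displayed, only the dataset is written */
proc means data=sashelp.cars noprint;
    class origin;
    var horsepower;
    output out=work.hp_means mean=avg_hp;
run;

/* PROC SUMMARY with PRINT: the same statistic, displayed like PROC MEANS */
proc summary data=sashelp.cars print mean;
    class origin;
    var horsepower;
run;
```

With a CLASS statement, the output dataset also contains the automatic variables _TYPE_ and _FREQ_, plus an overall row (_TYPE_ = 0) unless the NWAY option is specified.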
The PROC TABULATE procedure builds on the concepts of MEANS and SUMMARY but goes much further in terms of presentation. It specializes in displaying descriptive statistics in hierarchical tables.
Its major strengths:
Flexibility: It allows for classifying variable values and establishing complex hierarchical relationships between them.
Dual Output: It can send results to the output window and/or to a data table.
Formatting: It offers full control over the labels and formatting of the generated statistics.
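These three strengths can be combined in a single step. The following is a sketch under assumptions: the labels, the formats, and the output table name work.tab_out are illustrative choices.

```sas
proc tabulate data=sashelp.cars out=work.tab_out;   /* OUT= also writes a dataset */
    class origin type;
    var msrp;
    /* Rows: vehicle type nested within origin (hierarchy).        */
    /* Columns: count and average price with custom labels/formats */
    table origin='Region of origin' * type='Vehicle type',
          msrp='Retail price' * (n='Count'*f=8. mean='Average'*f=dollar12.);
run;
```

In the TABLE statement, the comma separates the row dimension from the column dimension, the asterisk nests one element inside another, and ='label' overrides the default heading.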
Note on SAS© Viya™: In the SAS© Viya™ environment, these three procedures can delegate their processing to CAS (Cloud Analytic Services) actions when the input is a CAS table, which generally improves performance on large volumes of data.
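A minimal sketch of this behavior, assuming a SAS© Viya™ environment in which you are allowed to start a CAS session and write to the CASUSER caslib. The session name mysess and the libref mycas are hypothetical.

```sas
/* Assumption: a CAS server is available to the current SAS session */
cas mysess;                           /* start a CAS session (hypothetical name) */
libname mycas cas caslib=casuser;     /* libref pointing to the CASUSER caslib   */

/* Load the sample data into CAS memory */
data mycas.cars;
    set sashelp.cars;
run;

/* The input is a CAS table, so the summarization can be performed in CAS */
proc means data=mycas.cars n mean max;
    class type;
    var msrp;
run;

cas mysess terminate;                 /* end the session when finished */
```

The procedure code itself is unchanged; only the libref of the input table differs from a classic SAS session.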
Here are concrete examples using the sashelp.cars dataset, included in all SAS© installations.
Using MEANS and SUMMARY
The code below illustrates the difference in output. PROC MEANS displays the result, while PROC SUMMARY creates a table named WORK.summaryout.
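A minimal sketch of this comparison follows. The analysis variables, the requested statistics, and the output variable names (mean_weight, mean_wheelbase) are assumptions made for the example; only the table name WORK.summaryout comes from the text.

```sas
/* PROC MEANS: the statistics are displayed in the results window */
proc means data=sashelp.cars mean std min max maxdec=1;
    class type;
    var weight wheelbase;
run;

/* PROC SUMMARY: nothing is displayed, the results go to WORK.summaryout */
proc summary data=sashelp.cars nway;      /* NWAY keeps only the per-type rows */
    class type;
    var weight wheelbase;
    output out=work.summaryout
           mean=mean_weight mean_wheelbase;
run;

/* Inspect the dataset that PROC SUMMARY created */
proc print data=work.summaryout noobs;
run;
```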
Using TABULATE for a structured table
Here, we create a table crossing the vehicle type (row) with weight and wheelbase statistics (column).
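One possible way to write such a table is sketched below. The choice of statistics (mean, min, max), the labels, and the 8.1 format are illustrative assumptions.

```sas
proc tabulate data=sashelp.cars;
    class type;
    var weight wheelbase;
    /* Rows: each vehicle type, then a total row.                 */
    /* Columns: three statistics for each of the two variables.   */
    table type='Vehicle type' all='All types',
          (weight wheelbase) * (mean min max) * f=8.1;
run;
```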
Going Further: Conditional Formatting with TABULATE
One of the great advantages of PROC TABULATE is its ability to use custom formats to highlight data (e.g., "traffic light" color coding).
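A sketch of the technique, assuming arbitrary thresholds (18 and 25 miles per gallon) and a hypothetical format name mpgcolor. The format maps value ranges to background colors, and the STYLE= option applies it to the table cells.

```sas
/* Hypothetical "traffic light" format: value range -> background color */
proc format;
    value mpgcolor
        low -<  18  = 'cxF4A6A6'    /* light red    */
        18  -<  25  = 'cxFFF2A8'    /* light yellow */
        25  - high  = 'cxB7E4B0';   /* light green  */
run;

proc tabulate data=sashelp.cars;
    class type;
    var mpg_city;
    /* The cell background is chosen by passing each cell value through mpgcolor. */
    table type='Vehicle type',
          mpg_city='City MPG' * mean='Average' * f=8.1
              * [style=[background=mpgcolor.]];
run;
```

The colors are rendered in ODS destinations that support styles, such as HTML, PDF, or RTF; the plain text listing ignores them.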
Comparative Summary
To help you choose the right procedure for your needs, here is a summary table of the main functional differences and their availability in graphical interfaces:
| Characteristic | PROC MEANS | PROC SUMMARY | PROC TABULATE |
| --- | --- | --- | --- |
| Main Objective | Quick exploration and standard statistics | Calculation of statistics for storage (ETL) | Presentation reports and complex tables |
| Default Output | Results window (Output) | None displayed; SAS© data table (dataset) via the OUTPUT statement | Results window (Output) |
| Layout Flexibility | Standard (Vertical list) | N/A (Database structure) | High (Cross-tab and hierarchical tables) |
| SAS© Enterprise Guide Support | Yes (Via Wizard) | No (Code required) | Yes (Via Wizard) |
| SAS© Studio Support | Yes (Via Tasks) | No (Code required) | No (Code required) |
Although PROC MEANS, SUMMARY, and TABULATE are similar in the descriptive statistics they generate, they differ in the nature of their outputs and their flexibility.
While PROC MEANS is ideal for a quick check and PROC SUMMARY for creating data tables, PROC TABULATE offers the greatest control. Its versatility often makes it the best choice for scenarios requiring a polished and hierarchical presentation of data.