Data Step

Understanding Implicit vs. Explicit Output

Simon

The difference between implicit and explicit writing (OUTPUT) is often a source of confusion for beginners, yet it's what gives the Data Step all its power.

On forums, the question often comes up: "I understand the theory, but when should I use one or the other?"

Here is a clear explanation based on expert answers, illustrating how to take control of SAS©'s writing cycle.

The Default Behavior: Implicit Output
To fully understand, you need to visualize what SAS© does "behind the scenes" during a classic Data step.

By default, every Data step has an implicit (automatic) output statement at the very end of the code, just before the RUN. The cycle proceeds as follows:

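1. SAS© reads an observation.
2. It executes your calculations.
3. It writes the row to the output table (implicit output).
4. It returns to the top of the step and resets the PDV (Program Data Vector) before reading the next observation: variables created by assignment or INPUT go back to missing, while variables read by SET, named in RETAIN, or accumulated with a sum statement keep their values.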

This is why you don't need to write OUTPUT for a simple copy:
DATA test;
  SET SOURCE;
  /* The output happens automatically here! */
RUN;
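
To make the hidden statement visible, the simple copy above can be rewritten with the OUTPUT spelled out as the last statement of the step. This is only a sketch of the equivalence: the three-row `source` table, the `valeur_x2` variable, and the `test_explicit` name are assumptions invented so that the example runs on its own.

```sas
/* Hypothetical input table, invented to make the sketch runnable */
DATA source;
  INPUT id valeur;
  DATALINES;
1 10
2 20
3 30
;
RUN;

/* Same result as a step with no OUTPUT at all: the statement that SAS
   normally adds for you is written by hand, just before RUN. */
DATA test_explicit;
  SET source;
  valeur_x2 = valeur * 2;  /* calculated before the write, so it is kept */
  OUTPUT;                  /* stands in for the implicit output */
RUN;
```

Removing the OUTPUT line from `test_explicit` gives the same three rows, which is why nobody writes it in a simple one-row-in, one-row-out step.
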
Taking Control: Explicit Output
As soon as an OUTPUT statement (or OUTPUT table_name) appears anywhere in your Data step, the automatic implicit output is disabled for the whole step, even if that OUTPUT sits inside a condition that is never true.

SAS© then considers that you are in command: it will only write a row if and when you tell it to.
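
A small sketch can show both consequences of taking control: rows are written only when the OUTPUT executes, and anything computed after the OUTPUT arrives too late for that row. The `source` table, its values, and the `controle` and `statut` names are hypothetical, chosen only for the illustration.

```sas
/* Hypothetical input: the second row has a missing value */
DATA source;
  INPUT id valeur;
  DATALINES;
1 10
2 .
3 30
;
RUN;

DATA controle;
  SET source;
  statut = 'before';
  IF valeur NE . THEN OUTPUT;  /* the only place where a row is written */
  statut = 'after';            /* too late: the row has already been written */
RUN;

/* With this input, CONTROLE holds ids 1 and 3, both with statut = 'before'.
   Id 2 is never written because no OUTPUT executes for it. */
```

The position of the OUTPUT statement therefore matters as much as its presence: it takes a snapshot of the PDV at the moment it runs.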

This opens the door to two major use cases explained in the discussion:

Case A: Dispatching (Distributing data)
As Cathy explains, it is essential for splitting one table into several sub-tables in a single pass.

Without explicit OUTPUT, SAS© would send every row to all tables. With OUTPUT, you direct the flow:
DATA hommes femmes;
  SET demog;
  IF sexe = 'M' THEN OUTPUT hommes;           /* Writes only to the HOMMES table */
  ELSE IF sexe = 'F' THEN OUTPUT femmes;      /* Writes only to the FEMMES table */
RUN;
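
For contrast, here is a sketch of what happens when several tables are named but no OUTPUT is written, followed by a variant of the split that adds a catch-all table. The `demog` rows, the `U` code, and the `hommes_tout`, `femmes_tout`, and `autres` table names are assumptions made up for the example.

```sas
/* Hypothetical input, invented for the illustration */
DATA demog;
  INPUT nom $ sexe $;
  DATALINES;
Alice F
Bruno M
Chloe F
Dany U
;
RUN;

/* No OUTPUT statement: the implicit output writes every row to BOTH tables,
   so each of them ends up with all 4 rows. */
DATA hommes_tout femmes_tout;
  SET demog;
RUN;

/* Explicit OUTPUT with a catch-all: a row that is neither 'M' nor 'F'
   would otherwise be written nowhere and silently disappear. */
DATA hommes femmes autres;
  SET demog;
  IF sexe = 'M' THEN OUTPUT hommes;
  ELSE IF sexe = 'F' THEN OUTPUT femmes;
  ELSE OUTPUT autres;
RUN;
```

Whether unexpected values deserve their own table is a design choice; the point of the sketch is that, once OUTPUT is explicit, every row without a matching branch is dropped.
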
Case B: Aggregation and Filtering (Controlling write frequency)
The second expert shows how to reduce the number of rows (aggregation) without using a statistical procedure (PROC MEANS).

The idea is to perform calculations row by row (sums, counters), but only write the result at the end of a group (when last.variable is true).
DATA synthese;
  SET ventes;
  BY produit; /* Required to use first. and last. (ventes must be sorted by produit) */

  /* Accumulate the sums... */
  IF first.produit THEN total = 0;
  total + montant;

  /* Write ONLY when the group is finished */
  IF last.produit THEN OUTPUT;
RUN;
In this example:

If you had left the implicit output, you would have had one output row for each input row (with partial totals).

With the conditional explicit output (if ... then output), you only get one row per product.
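
The "partial totals" case can be reproduced by running the same accumulation without any OUTPUT statement. The sketch below assumes a small `ventes` table with made-up amounts, already in `produit` order; the `cumul` table name is hypothetical.

```sas
/* Hypothetical input, already sorted by produit */
DATA ventes;
  INPUT produit $ montant;
  DATALINES;
A 10
A 20
B 5
B 15
B 30
;
RUN;

/* Same accumulation as above, but with the implicit output left in place */
DATA cumul;
  SET ventes;
  BY produit;
  IF first.produit THEN total = 0;
  total + montant;
  /* no OUTPUT statement: one row is written per input row */
RUN;

/* With this input, CUMUL has 5 rows and total runs 10, 30, 5, 20, 50.
   Adding "IF last.produit THEN OUTPUT;" keeps only the rows where
   total is 30 (product A) and 50 (product B). */
```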

Summary: The Golden Rule

Type     | Description                                                        | When to use it?
---------|--------------------------------------------------------------------|----------------
Implicit | Automatic at the end of the step.                                  | For simple transformations (1 input row = 1 output row).
Explicit | Manual via the OUTPUT statement. Disables the automatic behavior.  | To split data (1 to N tables), multiply rows (1 to N rows), or aggregate (N to 1 row).
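
The "multiply rows" case in the table has no example in the discussion, so here is a minimal sketch of it: an OUTPUT placed inside a DO loop writes several rows for a single input row. The `contrats` table, its columns, and the `echeances` name are hypothetical.

```sas
/* Hypothetical input: one row per contract, with a number of months */
DATA contrats;
  INPUT contrat $ nb_mois;
  DATALINES;
C1 3
C2 2
;
RUN;

/* One output row per month of each contract (1 input row -> N output rows) */
DATA echeances;
  SET contrats;
  DO mois = 1 TO nb_mois;
    OUTPUT;   /* executes nb_mois times for the same input row */
  END;
RUN;

/* With this input, ECHEANCES has 5 rows: C1 with mois 1 to 3, C2 with mois 1 to 2 */
```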

Technical Note: Another frequent use mentioned is parsing complex files. If a text file contains mixed headers, details, and footers, explicit output allows sending the headers to table A and the details to table B, all in a single pass.
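
As a rough sketch of that parsing pattern, assume a file where the first field of each line is a record-type code: `H` for header, `D` for detail, `F` for footer. This layout, the sample lines, and all table and variable names are assumptions made for the illustration, not a format described in the discussion.

```sas
DATA entetes (KEEP=id_lot date_lot)
     details (KEEP=id_lot article quantite);
  INFILE DATALINES TRUNCOVER;
  LENGTH type $1 id_lot $8 date_lot $10 article $12;
  RETAIN id_lot;            /* carry the header id down to its detail rows */

  INPUT type $ @;           /* read the record type and hold the line */

  IF type = 'H' THEN DO;
    INPUT id_lot $ date_lot $;
    OUTPUT entetes;         /* headers go to table ENTETES */
  END;
  ELSE IF type = 'D' THEN DO;
    INPUT article $ quantite;
    OUTPUT details;         /* details go to table DETAILS */
  END;
  /* footers (type 'F') reach no OUTPUT, so they are written nowhere */

  DATALINES;
H L001 2024-01-15
D stylo 12
D cahier 3
F 2
H L002 2024-01-16
D gomme 7
F 1
;
RUN;

/* With these lines, ENTETES has 2 rows and DETAILS has 3 rows */
```

The trailing `@` keeps the current line available for the second INPUT, and the RETAIN lets each detail row inherit the identifier of the header that precedes it.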