The difference between implicit and explicit output (the OUTPUT statement) is often a source of confusion for beginners, yet it's what gives the Data step all its power.
On forums, the question often comes up: "I understand the theory, but when should I use one or the other?"
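By default, every Data step has an implicit (automatic) OUTPUT statement at the very end of the code, just before the RUN statement. For each observation, the cycle proceeds as follows:
SAS reads the observation into the PDV (Program Data Vector) and executes your statements.
It writes the row to the output table (Implicit Output).
It returns to the beginning of the step, resets the non-retained computed variables in the PDV to missing, and reads the next observation.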
This is why you don't need to write OUTPUT for a simple copy:
data test;
    set source;
    /* The output happens automatically here! */
run;
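To make the implicit output visible, here is a minimal sketch. It assumes a small hypothetical table `source` built with DATALINES; the variable names `id`, `valeur` and `valeur_x2` are illustrative and not from the original example. The two steps should produce identical tables, since the second one simply spells out the OUTPUT that SAS otherwise performs at the end of the step.

```sas
/* Hypothetical sample table, for illustration only */
data source;
    input id valeur;
    datalines;
1 10
2 20
3 30
;
run;

/* Implicit output: one row is written at the end of each iteration */
data test_implicit;
    set source;
    valeur_x2 = valeur * 2;
run;

/* Same result, with the output written explicitly as the last statement */
data test_explicit;
    set source;
    valeur_x2 = valeur * 2;
    output;
run;
```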
Taking Control: Explicit Output
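As soon as you write the OUTPUT keyword (or OUTPUT table_name) somewhere in your Data step, you disable the automatic implicit output for the whole step: a row is then written only when an OUTPUT statement actually executes. A first classic use is splitting one input table into several output tables: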
data hommes femmes;
    set demog;
    if sexe = 'M' then output hommes;        /* Writes only to the HOMMES table */
    else if sexe = 'F' then output femmes;   /* Writes only to the FEMMES table */
run;
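One consequence of disabling the implicit output is that a row matching none of the conditions is written nowhere. The sketch below illustrates this with a hypothetical `demog` table and an extra catch-all table named `autres`; both the sample data and the `autres` table are assumptions added for illustration.

```sas
/* Hypothetical sample data, for illustration only */
data demog;
    input nom $ sexe $;
    datalines;
Alice F
Bruno M
Camille X
;
run;

data hommes femmes autres;
    set demog;
    if sexe = 'M' then output hommes;
    else if sexe = 'F' then output femmes;
    else output autres;   /* without this branch, Camille's row would be lost */
run;
```

With this sample, each of the three tables should receive exactly one row. Removing the final ELSE branch would leave `autres` empty while the other two tables stay unchanged.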
Case B: Aggregation and Filtering (Controlling write frequency)
The second expert shows how to reduce the number of rows (aggregation) without using a statistical procedure (PROC MEANS).
The idea is to perform calculations row by row (sums, counters), but only write the result at the end of a group (when last.variable is true).
data synthese;
    set ventes;
    by produit;   /* Required to use first. and last.; ventes must be sorted by produit */
    /* Accumulate the sums... */
    if first.produit then total = 0;
    total + montant;   /* Sum statement: total is retained across rows */
    /* Write ONLY when the group is finished */
    if last.produit then output;
run;
In this example:
If you had left the implicit output, you would have had one output row for each input row (with partial totals).
With the conditional explicit output (if ... then output), you only get one row per product.
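The contrast between the two behaviors can be checked on a tiny dataset. The sketch below assumes a hypothetical `ventes` table with five rows, already sorted by `produit`; the table name `cumul` is also an illustrative choice.

```sas
/* Hypothetical sales data, already sorted by produit */
data ventes;
    input produit $ montant;
    datalines;
A 10
A 20
B 5
B 7
B 8
;
run;

/* Implicit output: 5 rows, total holds the running sum within each product */
data cumul;
    set ventes;
    by produit;
    if first.produit then total = 0;
    total + montant;
run;

/* Conditional explicit output: 2 rows, one final total per product */
data synthese;
    set ventes;
    by produit;
    if first.produit then total = 0;
    total + montant;
    if last.produit then output;
run;
```

Here `cumul` should contain the partial totals 10, 30, 5, 12 and 20, while `synthese` should contain only the final totals: 30 for product A and 20 for product B.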
Summary: The Golden Rule
| Type | Description | When to use it? |
|---|---|---|
| Implicit | Automatic at the end of the step. | For simple transformations (1 input row = 1 output row). |
| Explicit | Manual via the OUTPUT statement. Disables the automatic behavior. | To split data (1 to N tables), multiply rows (1 to N rows), or aggregate (N to 1 row). |
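The "1 to N rows" case is not illustrated above, so here is a minimal sketch of it. It assumes a hypothetical `contrats` table where `duree` is a number of months; all names are illustrative. Placing OUTPUT inside a DO loop writes one row per loop iteration for each input row.

```sas
/* Hypothetical input: one row per contract, with a duration in months */
data contrats;
    input id duree;
    datalines;
1 3
2 2
;
run;

/* OUTPUT inside the loop writes duree rows for each input row */
data echeances;
    set contrats;
    do mois = 1 to duree;
        output;
    end;
run;
```

With this sample, `echeances` should contain five rows: months 1 to 3 for contract 1 and months 1 to 2 for contract 2.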
Technical Note: Another frequent use mentioned is parsing complex files. If a text file contains mixed headers, details, and footers, explicit output allows sending the headers to table A and the details to table B, all in a single pass.
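As a rough sketch of that single-pass parsing, the step below reads a hypothetical mixed file from DATALINES. The record layout is an assumption made for illustration: the first field is a type code (`H` for header, `D` for detail, `F` for footer), and the table and variable names are invented. The trailing `@` holds the current line so that a second INPUT statement can read the rest of it according to the record type.

```sas
data entetes (keep=id_commande client)
     details (keep=id_commande article quantite);
    infile datalines truncover;
    retain id_commande;                 /* carry the order id down to its detail rows */
    length type $1 client $20 article $20;
    input type $ @;                     /* read the record type and hold the line */
    if type = 'H' then do;
        input id_commande client $;
        output entetes;                 /* headers go to the first table */
    end;
    else if type = 'D' then do;
        input article $ quantite;
        output details;                 /* details go to the second table */
    end;
    /* footer records (any other type) are read but written nowhere */
    datalines;
H 1001 Dupont
D Stylo 3
D Cahier 2
F 2
H 1002 Martin
D Gomme 1
F 1
;
run;
```

With this sample, `entetes` should receive two rows and `details` three, each detail row carrying the `id_commande` of the header that precedes it.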
Important Disclaimer
The codes and examples provided on WeAreCAS.eu are for educational purposes. It is imperative not to blindly copy-paste them into your production environments. The best approach is to understand the logic before applying it. We strongly recommend testing these scripts in a test environment (Sandbox/Dev). WeAreCAS accepts no responsibility for any impact or data loss on your systems.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.