Published on :
ETL MIXTE

Concatenation of SAS Tables

This code is also available in: Deutsch Español Français
Awaiting validation
The script compares the behaviors of different data appending methods. It examines the DATA step with the SET statement, PROC SQL set operators (UNION, UNION ALL, UNION CORR, OUTER UNION), and the PROC DATASETS procedure with the APPEND statement. It highlights duplicate handling, column alignment (different naming), and the use of the FORCE option when table structures differ.
Data Analysis

Type : MIXTE


Data is derived from the 'sashelp.class' system table for the 'big' table, and manually created via a DATA Step for the 'small' table.

1 Code Block
DATA STEP Data
Explanation :
Creation of two working tables: 'big' (extracted from sashelp.class with intentional duplication) and 'small' (a single row with different column names/order to simulate structural incompatibilities).
Copied!
1title '2.8 Appending Data Sets';
2* Create a not so big data set;
3DATA big;
4 SET sashelp.class(keep=name sex age height weight);
5 where name > 'L';
6 OUTPUT big;
7 * create a duplicate for Mary;
8 IF name=:'M' THEN OUTPUT big;
9 RUN;
10DATA small;
11 * The variable WEIGHT has been misspelled as WT;
12 * The variables WT and HEIGHT are out of order;
13 name='fred'; sex='m'; age=5; wt=45; height=30;
14 RUN;
2 Code Block
DATA STEP Data
Explanation :
Simple concatenation using the DATA step and the SET statement. This stacks the two tables. Columns present in one but not the other (like 'wt') will have missing values in rows from the table where they are absent.
Copied!
1title2 'Using the SET Statement';
2* Append using the SET statement;
3DATA bigger;
4 SET big small;
5 RUN;
6PROC PRINT DATA=bigger;
7 RUN;
3 Code Block
PROC SQL Data
Explanation :
Use of the SQL UNION operator. This method concatenates the results but removes exact duplicate rows (default behavior of UNION).
Copied!
1title2 'Using SQL UNION';
2* Append using UNION;
3PROC SQL noprint;
4create TABLE bigger as
5 select *
6 from big
7 union
8 select *
9 from small;
10 QUIT;
11PROC PRINT DATA=bigger;
12RUN;
4 Code Block
PROC SQL Data
Explanation :
Use of SQL UNION ALL. Unlike simple UNION, UNION ALL retains all rows, including duplicates.
Copied!
1title2 'Using SQL UNION ALL';
2* Append using UNION;
3PROC SQL noprint;
4create TABLE bigger as
5 select *
6 from big
7 union all
8 select *
9 from small;
10 QUIT;
11PROC PRINT DATA=bigger;
12RUN;
5 Code Block
PROC SQL Data
Explanation :
Use of UNION ALL with explicit renaming of the 'wt' column to 'Weight' in the second query to align data structures.
Copied!
1title2 'Using SQL UNION ALL';
2title3 'Naming the Variables';
3* Append using UNION ALL;
4PROC SQL noprint;
5create TABLE bigger as
6 select *
7 from big
8 union all
9 select Name,Sex,Age,Height,wt as Weight
10 from small;
11 QUIT;
12PROC PRINT DATA=bigger;
13RUN;
6 Code Block
PROC SQL Data
Explanation :
Use of the UNION CORR (CORResponding) operator. SAS associates columns by their name, not their position. Columns that do not exist in both tables are excluded from the result.
Copied!
1title2 'Using SQL UNION CORR';
2* Append using UNION;
3PROC SQL noprint;
4create TABLE bigger as
5 select *
6 from big
7 union corr
8 select *
9 from small;
10 QUIT;
11PROC PRINT DATA=bigger;
12RUN;
7 Code Block
PROC SQL Data
Explanation :
Use of OUTER UNION. This operator performs a full union without attempting to overlay columns if they do not match, although in a simple SAS SQL context, this often resembles a permissive append.
Copied!
1title2 'Using SQL OUTER UNION';
2* Append using UNION;
3PROC SQL noprint;
4create TABLE bigger as
5 select *
6 from big
7 outer union
8 select *
9 from small;
10 QUIT;
11PROC PRINT DATA=bigger;
12RUN;
8 Code Block
PROC DATASETS
Explanation :
Use of PROC DATASETS with the APPEND statement to add data directly to the base table ('big') without recreating the entire table. The first attempt fails because the structures (variable names/types) differ. The second succeeds thanks to the FORCE option, which forces the append by truncating overly long variables or ignoring those not in the base table.
Copied!
1title2 'Using the APPEND Statement';
2* Append using the APPEND Statement;
3* This step fails because of mismatched PDVs;
4PROC DATASETS library=work nolist;
5 append base=big DATA=small;
6 QUIT;
7 
8* Use the FORCE option;
9PROC DATASETS library=work nolist;
10 append base=big DATA=small force;
11 QUIT;
12PROC PRINT DATA=big;
13 RUN;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.