The script compares the behaviors of different data appending methods. It examines the DATA step with the SET statement, PROC SQL set operators (UNION, UNION ALL, UNION CORR, OUTER UNION), and the PROC DATASETS procedure with the APPEND statement. It highlights duplicate handling, column alignment (different naming), and the use of the FORCE option when table structures differ.
Data Analysis
Type : MIXTE
Data is derived from the 'sashelp.class' system table for the 'big' table, and manually created via a DATA Step for the 'small' table.
1 Code Block
DATA STEP Data
Explanation : Creation of two working tables: 'big' (extracted from sashelp.class with intentional duplication) and 'small' (a single row with different column names/order to simulate structural incompatibilities).
Copied!
title '2.8 Appending Data Sets';
* Create a not so big data set;
data big;
set sashelp.class(keep=name sex age height weight);
where name > 'L';
output big;
* create a duplicate for Mary;
if name=:'M' then output big;
run;
data small;
* The variable WEIGHT has been misspelled as WT;
* The variables WT and HEIGHT are out of order;
name='fred'; sex='m'; age=5; wt=45; height=30;
run;
1
title '2.8 Appending Data Sets';
2
* Create a not so big data set;
3
DATA big;
4
SET sashelp.class(keep=name sex age height weight);
5
where name > 'L';
6
OUTPUT big;
7
* create a duplicate for Mary;
8
IF name=:'M'THENOUTPUT big;
9
RUN;
10
DATA small;
11
* The variable WEIGHT has been misspelled as WT;
12
* The variables WT and HEIGHT are out of order;
13
name='fred'; sex='m'; age=5; wt=45; height=30;
14
RUN;
2 Code Block
DATA STEP Data
Explanation : Simple concatenation using the DATA step and the SET statement. This stacks the two tables. Columns present in one but not the other (like 'wt') will have missing values in rows from the table where they are absent.
Copied!
title2 'Using the SET Statement';
* Append using the SET statement;
data bigger;
set big small;
run;
proc print data=bigger;
run;
1
title2 'Using the SET Statement';
2
* Append using the SET statement;
3
DATA bigger;
4
SET big small;
5
RUN;
6
PROC PRINTDATA=bigger;
7
RUN;
3 Code Block
PROC SQL Data
Explanation : Use of the SQL UNION operator. This method concatenates the results but removes exact duplicate rows (default behavior of UNION).
Copied!
title2 'Using SQL UNION';
* Append using UNION;
proc sql noprint;
create table bigger as
select *
from big
union
select *
from small;
quit;
proc print data=bigger;
run;
1
title2 'Using SQL UNION';
2
* Append using UNION;
3
PROC SQL noprint;
4
create TABLE bigger as
5
select *
6
from big
7
union
8
select *
9
from small;
10
QUIT;
11
PROC PRINTDATA=bigger;
12
RUN;
4 Code Block
PROC SQL Data
Explanation : Use of SQL UNION ALL. Unlike simple UNION, UNION ALL retains all rows, including duplicates.
Copied!
title2 'Using SQL UNION ALL';
* Append using UNION;
proc sql noprint;
create table bigger as
select *
from big
union all
select *
from small;
quit;
proc print data=bigger;
run;
1
title2 'Using SQL UNION ALL';
2
* Append using UNION;
3
PROC SQL noprint;
4
create TABLE bigger as
5
select *
6
from big
7
union all
8
select *
9
from small;
10
QUIT;
11
PROC PRINTDATA=bigger;
12
RUN;
5 Code Block
PROC SQL Data
Explanation : Use of UNION ALL with explicit renaming of the 'wt' column to 'Weight' in the second query to align data structures.
Copied!
title2 'Using SQL UNION ALL';
title3 'Naming the Variables';
* Append using UNION ALL;
proc sql noprint;
create table bigger as
select *
from big
union all
select Name,Sex,Age,Height,wt as Weight
from small;
quit;
proc print data=bigger;
run;
1
title2 'Using SQL UNION ALL';
2
title3 'Naming the Variables';
3
* Append using UNION ALL;
4
PROC SQL noprint;
5
create TABLE bigger as
6
select *
7
from big
8
union all
9
select Name,Sex,Age,Height,wt as Weight
10
from small;
11
QUIT;
12
PROC PRINTDATA=bigger;
13
RUN;
6 Code Block
PROC SQL Data
Explanation : Use of the UNION CORR (CORResponding) operator. SAS associates columns by their name, not their position. Columns that do not exist in both tables are excluded from the result.
Copied!
title2 'Using SQL UNION CORR';
* Append using UNION;
proc sql noprint;
create table bigger as
select *
from big
union corr
select *
from small;
quit;
proc print data=bigger;
run;
1
title2 'Using SQL UNION CORR';
2
* Append using UNION;
3
PROC SQL noprint;
4
create TABLE bigger as
5
select *
6
from big
7
union corr
8
select *
9
from small;
10
QUIT;
11
PROC PRINTDATA=bigger;
12
RUN;
7 Code Block
PROC SQL Data
Explanation : Use of OUTER UNION. This operator performs a full union without attempting to overlay columns if they do not match, although in a simple SAS SQL context, this often resembles a permissive append.
Copied!
title2 'Using SQL OUTER UNION';
* Append using UNION;
proc sql noprint;
create table bigger as
select *
from big
outer union
select *
from small;
quit;
proc print data=bigger;
run;
1
title2 'Using SQL OUTER UNION';
2
* Append using UNION;
3
PROC SQL noprint;
4
create TABLE bigger as
5
select *
6
from big
7
outer union
8
select *
9
from small;
10
QUIT;
11
PROC PRINTDATA=bigger;
12
RUN;
8 Code Block
PROC DATASETS
Explanation : Use of PROC DATASETS with the APPEND statement to add data directly to the base table ('big') without recreating the entire table. The first attempt fails because the structures (variable names/types) differ. The second succeeds thanks to the FORCE option, which forces the append by truncating overly long variables or ignoring those not in the base table.
Copied!
title2 'Using the APPEND Statement';
* Append using the APPEND Statement;
* This step fails because of mismatched PDVs;
proc datasets library=work nolist;
append base=big data=small;
quit;
* Use the FORCE option;
proc datasets library=work nolist;
append base=big data=small force;
quit;
proc print data=big;
run;
1
title2 'Using the APPEND Statement';
2
* Append using the APPEND Statement;
3
* This step fails because of mismatched PDVs;
4
PROC DATASETS library=work nolist;
5
append base=big DATA=small;
6
QUIT;
7
8
* Use the FORCE option;
9
PROC DATASETS library=work nolist;
10
append base=big DATA=small force;
11
QUIT;
12
PROC PRINTDATA=big;
13
RUN;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.
This site uses technical and analytical cookies to improve your experience.
Read more.