Published on :
ETL INTERNAL_CREATION

Examples: Concatenating Data

This code is also available in: Deutsch Español
Awaiting validation
The functional analysis of this document focuses on three main methods for concatenating SAS© datasets. The SET statement is used to sequentially read observations from multiple input datasets into a single output dataset, appending observations one after another. PROC SQL uses the OUTER UNION CORRESPONDING clause to combine rows from two tables, superimposing common variables. Finally, PROC APPEND is used to add observations from one dataset to the end of an existing dataset, without creating a new dataset. Considerations for handling uncommon variables and using the OPEN=DEFER option are also discussed.
Data Analysis

Type : INTERNAL_CREATION


Examples use generated data (datalines) for datasets 'animal', 'plant', 'Year1', 'Year2', 'Table1', and 'Table2'.

1 Code Block
DATA STEP Data
Explanation :
This example uses the SET statement to concatenate two datasets ('animal' and 'plant') into a single output dataset ('concatenate'). Observations from 'animal' are placed before those from 'plant'. Common variables ('common') are handled, and uncommon variables are filled with missing values. The example demonstrates the creation of the 'animal' and 'plant' datasets using datalines to make them self-contained.
Copied!
1DATA animal;
2 INPUT common $ animal$;
3 DATALINES;
4a Ant
5b Bird
6c Cat
7d Dog
8e Eagle
9f Frog
10;
11RUN;
12 
13DATA plant;
14 INPUT common $ plant$;
15 DATALINES;
16a Apple
17b Banana
18c Coconut
19d Dewberry
20e Eggplant
21f Fig
22;
23RUN;
24 
25DATA concatenate;
26 SET animal plant;
27RUN;
28PROC PRINT DATA=concatenate;
29RUN;
2 Code Block
PROC SQL Data
Explanation :
This example uses PROC SQL with the OUTER UNION CORRESPONDING statement to concatenate the 'animal' and 'plant' datasets. This combines all rows from both tables and superimposes common variables. The result is stored in a new table named 'combined'. The 'animal' and 'plant' datasets are created using datalines to ensure the example's autonomy.
Copied!
1DATA animal;
2 INPUT common $ animal$;
3 DATALINES;
4a Ant
5b Bird
6c Cat
7d Dog
8e Eagle
9f Frog
10;
11RUN;
12 
13DATA plant;
14 INPUT common $ plant$;
15 DATALINES;
16a Apple
17b Banana
18c Coconut
19d Dewberry
20e Eggplant
21f Fig
22;
23RUN;
24 
25PROC SQL;
26 create TABLE combined as
27 select * from animal
28 outer union corresponding
29 select * from plant;
30QUIT;
3 Code Block
PROC APPEND Data
Explanation :
This example demonstrates the use of PROC APPEND to add observations from the 'Year2' dataset to the end of the 'Year1' dataset. Unlike the SET statement, PROC APPEND directly modifies the 'BASE=' dataset ('Year1' here) and does not create a new dataset. The 'Year1' and 'Year2' datasets are created with datalines.
Copied!
1DATA Year1;
2 INPUT Date;
3 DATALINES;
42009
52010
62011
72012
8;
9RUN;
10 
11DATA Year2;
12 INPUT Date;
13 DATALINES;
142010
152011
162012
172013
182014
19;
20RUN;
21 
22PROC APPEND base=Year1 DATA=Year2;
23RUN;
24PROC PRINT DATA=Year1;
25 title "Year1";
26RUN;
4 Code Block
DATA STEP Data
Explanation :
This example illustrates how the OPEN=DEFER option in the DATA statement can be used to generate a message in the SAS log when there are variables in the input datasets that are not present in the first dataset read. It first shows concatenation without OPEN=DEFER, where missing variables are simply filled with missing values. Then, it shows the same process with OPEN=DEFER, which results in the ignoring of additional variables and a warning in the log. The 'Table1' and 'Table2' datasets are created using datalines.
Copied!
1DATA Table1;
2 INPUT var1 var2 var3;
3 DATALINES;
41 10 100
52 20 200
63 30 300
74 40 400
8;
9RUN;
10 
11DATA Table2;
12 INPUT var1 var2 var3 predict lowermean;
13 DATALINES;
145 50 500 0.5 0.1
156 60 600 0.6 0.2
167 70 700 0.7 0.3
178 80 800 0.8 0.4
18;
19RUN;
20 
21DATA concat;
22 SET table1 table2;
23RUN;
24PROC PRINT DATA=concat;
25 title "Concatenate Table1 and Table2";
26RUN;
27 
28DATA concat2;
29 SET table1 table2 open=defer;
30RUN;
31PROC PRINT DATA=concat2;
32 title "Concatenate with OPEN=DEFER";
33RUN;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : Copyright © SAS Institute Inc. All Rights Reserved