
Demonstration of Appending SAS Data Sets

The program first creates four data sets (advisees, advisees_2012, advisees_2013, advisees_2014) using DATA steps with DATALINES statements, then combines them with PROC APPEND in several ways. The first append is the simple case: both data sets have identical variable structures. The second appends a data set (advisees_2013) that lacks the 'gender' variable, showing that PROC APPEND still appends the observations and sets 'gender' to missing in them. The last two examples append a data set (advisees_2014) that contains an additional variable ('program') not present in the base data set. Without the FORCE option, PROC APPEND refuses the operation and appends nothing. With the FORCE option, the new observations are appended, but 'program' is dropped, because PROC APPEND never adds variables to the base data set. A PROC PRINT step after each append shows the result.
Data Analysis

Type: internally created data


All data used in this script is internally created via DATA STEP blocks and DATALINES statements. No external or SASHELP data sets are required.

1 Code Block
DATA STEP Data
Explanation :
This DATA STEP block creates a data set named 'advisees' with three variables: 'first' (first name, character), 'gender' (gender, character), and 'matric' (matriculation year, numeric). It is initialized with three observations.
DATA advisees;
  INPUT first $ gender $ matric;
  DATALINES;
Angela F 2010
Dawn F 2011
Aaron M 2011
;
RUN;
2 Code Block
DATA STEP Data
Explanation :
This DATA STEP block creates a data set named 'advisees_2012' with the same variable structure as 'advisees'. It contains three observations for the year 2012.
DATA advisees_2012;
  INPUT first $ gender $ matric;
  DATALINES;
Sruthi F 2012
Lindsey F 2012
Natalie F 2012
;
RUN;
3 Code Block
PROC APPEND
Explanation :
This PROC APPEND step appends all observations from the 'advisees_2012' data set to the end of the 'advisees' data set. Because the two data sets have identical variable structures, no special options are needed.
PROC APPEND BASE = advisees
            DATA = advisees_2012;
RUN;
4 Code Block
PROC PRINT
Explanation :
This PROC PRINT procedure displays the current content of the 'advisees' data set after the first append operation, showing the original observations and those from 'advisees_2012'.
PROC PRINT DATA = advisees;
RUN;
5 Code Block
DATA STEP Data
Explanation :
This DATA STEP block creates a data set named 'advisees_2013' with only the 'first' and 'matric' variables. The 'gender' variable is missing compared to the 'advisees' data set.
DATA advisees_2013;
  INPUT first $ matric;
  DATALINES;
Sara 2013
Dennis 2013
;
RUN;
6 Code Block
PROC APPEND
Explanation :
This PROC APPEND step appends 'advisees_2013' to 'advisees'. Because 'advisees_2013' does not contain the 'gender' variable present in 'advisees', SAS writes a warning to the log and the new observations receive a missing value for 'gender'. The FORCE option is not required in this case.
PROC APPEND BASE = advisees
            DATA = advisees_2013;
RUN;
7 Code Block
PROC PRINT
Explanation :
This PROC PRINT procedure displays the content of the 'advisees' data set after appending 'advisees_2013', highlighting the missing values for 'gender' in the newly added observations.
PROC PRINT DATA = advisees;
RUN;
8 Code Block
DATA STEP Data
Explanation :
This DATA STEP block creates a data set named 'advisees_2014', which includes a new character variable 'program' not present in the base 'advisees' data set and, like 'advisees_2013', omits 'gender'.
DATA advisees_2014;
  INPUT first $ matric program $;
  DATALINES;
Nathan 2014 MPH
Gloria 2014 PhD
;
RUN;
9 Code Block
PROC APPEND
Explanation :
This PROC APPEND step attempts to append 'advisees_2014' to 'advisees' without the FORCE option. Because 'advisees_2014' contains a variable ('program') that does not exist in the base data set, SAS reports the mismatch in the log and appends nothing; the log message suggests using the FORCE option. The 'advisees' data set is left unchanged.
PROC APPEND BASE = advisees
            DATA = advisees_2014;
RUN;
10 Code Block
PROC PRINT
Explanation :
This PROC PRINT step displays the content of 'advisees' after the failed append. The data set is unchanged: the observations from 'advisees_2014' are absent and there is no 'program' variable.
PROC PRINT DATA = advisees;
RUN;
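One way to confirm that the failed append left the base data set untouched is to inspect its descriptor rather than its rows. The following is a minimal sketch, assuming the earlier steps of this program have already been run so that 'advisees' exists. At this point PROC CONTENTS should report eight observations and the three variables 'first', 'gender', and 'matric'.

```sas
/* Assumes 'advisees' was built by the preceding steps.       */
/* VARNUM lists the variables in their creation order.        */
PROC CONTENTS DATA = advisees VARNUM;
RUN;
```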
11 Code Block
PROC APPEND
Explanation :
This PROC APPEND step appends 'advisees_2014' to 'advisees' using the FORCE option, which allows the append even though the variable structures differ. The observations are appended, but the 'program' variable is dropped with a warning in the log, because PROC APPEND never changes the structure of the base data set. Since 'advisees_2014' also lacks 'gender', the new observations have a missing value for that variable.
PROC APPEND BASE = advisees
            DATA = advisees_2014 FORCE;
RUN;
12 Code Block
PROC PRINT
Explanation :
This PROC PRINT step displays the final content of 'advisees' after the forced append: ten observations with the variables 'first', 'gender', and 'matric'. The two observations from 'advisees_2014' are present, but 'program' is not.
PROC PRINT DATA = advisees;
RUN;
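PROC APPEND never adds variables to the base data set, so a different technique is needed when the extra variable should be kept. One common alternative is to concatenate with a SET statement in a DATA step, which keeps the union of the input variables. The sketch below illustrates this under stated assumptions: the data set names base_demo, new_demo, and combined_demo are hypothetical and are not part of the original program, and the rows are a small subset chosen for illustration.

```sas
/* Hypothetical data sets used only for this sketch */
DATA base_demo;
  INPUT first $ gender $ matric;
  DATALINES;
Angela F 2010
Dawn F 2011
;
RUN;

DATA new_demo;
  INPUT first $ matric program $;
  DATALINES;
Nathan 2014 MPH
Gloria 2014 PhD
;
RUN;

/* SET concatenates the inputs and keeps every variable found in  */
/* either one; values that an input does not supply are missing.  */
DATA combined_demo;
  SET base_demo new_demo;
RUN;

PROC PRINT DATA = combined_demo;
RUN;
```

The trade-off is that the DATA step reads and rewrites every observation from both inputs, whereas PROC APPEND only reads the data set being appended, which matters mostly for large base data sets.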
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.