Published on :
ETL INTERNAL_CREATION

Comparison of variable length definition methods

This code is also available in: Deutsch Español Français
Awaiting validation
This script presents three approaches to creating a conditional variable 'result' in a Data step. It illustrates how SAS© determines the length of a character variable: implicitly via the first assignment (causing potential truncation), implicitly via a string of spaces, or explicitly via the LENGTH statement (best practice). It also calculates averages via the MEAN function and an arithmetic formula to show the different handling of missing values.
Data Analysis

Type : INTERNAL_CREATION


Data is generated directly in the code using the DATALINES statement.

1 Code Block
DATA STEP Data
Explanation :
Creation of the Grades1 table. WARNING: The 'result' variable is initialized to '' (empty string), which sets its length to 1 character. The values 'Failed' and 'Passed' will be truncated to 'F' and 'P'.
Copied!
1DATA Grades1;
2 INPUT subj 1-4 name $ 5-23 sex exam1 exam2 exam3 exam4 exam5 project finalexam;
3 actualavg = mean(exam1,exam2,exam3,exam4,exam5,project,finalexam);
4 avgscore = (exam1+exam2+exam3+exam4+exam5+project+finalexam)/7;
5
6 IF avgscore=. THEN RESULT=''; * missing value;
7 ELSE IF avgscore < 80 THEN RESULT='Failed';
8 ELSE RESULT='Passed';
9 DATALINES;
101011 Alia Bhatt 2 100 65 83 84 99 91 96
111012 Maria Smith 1 78 82 86 . 100 95 97
121111 Thomas Jones 2 88 81 96 69 91 90 98
131121 Benedictine Arnold 2 68 82 82 89 89 93 99
141301 Trisha Gupta 1 51 69 79 59 85 64 100
15;
16RUN;
2 Code Block
PROC PRINT
Explanation :
Display of Grades1 to observe the truncation of the 'result' variable.
Copied!
1 
2PROC PRINT
3DATA=Grades1;
4var name avgscore actualavg RESULT;
5RUN;
6 
3 Code Block
DATA STEP Data
Explanation :
Creation of Grades2. Here, 'result' is initialized with a 9-space string. This implicitly forces the length to 9, avoiding truncation, but consumes unnecessary space if the string is empty.
Copied!
1DATA Grades2;
2 INPUT subj 1-4 name $ 5-23 sex exam1 exam2 exam3 exam4 exam5 project finalexam;
3 actualavg = mean(exam1,exam2,exam3,exam4,exam5,project,finalexam);
4 avgscore = (exam1+exam2+exam3+exam4+exam5+project+finalexam)/7;
5
6 IF avgscore=. THEN RESULT=' '; * missing value;
7 ELSE IF avgscore < 80 THEN RESULT='Failed';
8 ELSE RESULT='Passed';
9 DATALINES;
101011 Alia Bhatt 2 100 65 83 84 99 91 96
111012 Maria Smith 1 78 82 86 . 100 95 97
121111 Thomas Jones 2 88 81 96 69 91 90 98
131121 Benedictine Arnold 2 68 82 82 89 89 93 99
141301 Trisha Gupta 1 51 69 79 59 85 64 100
15;
16RUN;
4 Code Block
PROC PRINT
Explanation :
Display of Grades2. The values 'Failed' and 'Passed' appear correctly.
Copied!
1 
2PROC PRINT
3DATA=Grades2;
4var name avgscore actualavg RESULT;
5RUN;
6 
5 Code Block
DATA STEP Data
Explanation :
Creation of Grades3. Using the 'LENGTH result $8.;' statement before using the variable. This is the recommended method for precisely controlling the type and length of variables.
Copied!
1DATA Grades3;
2 LENGTH RESULT $8.;
3 INPUT subj 1-4 name $ 5-23 sex exam1 exam2 exam3 exam4 exam5 project finalexam;
4 actualavg = mean(exam1,exam2,exam3,exam4,exam5,project,finalexam);
5 avgscore = (exam1+exam2+exam3+exam4+exam5+project+finalexam)/7;
6
7 IF avgscore=. THEN RESULT=''; * missing value;
8 ELSE IF avgscore < 80 THEN RESULT='Failed';
9 ELSE RESULT='Passed';
10 DATALINES;
111011 Alia Bhatt 2 100 65 83 84 99 91 96
121012 Maria Smith 1 78 82 86 . 100 95 97
131111 Thomas Jones 2 88 81 96 69 91 90 98
141121 Benedictine Arnold 2 68 82 82 89 89 93 99
151301 Trisha Gupta 1 51 69 79 59 85 64 100
16;
17RUN;
6 Code Block
PROC PRINT
Explanation :
Final display of Grades3 confirming correct variable handling.
Copied!
1 
2PROC PRINT
3DATA=Grades3;
4var name avgscore actualavg RESULT;
5RUN;
6 
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.