
Untitled

This script compares two ways to clean data in which the value 9 codes a missing response. It first creates a test data set. It then shows a manual, repetitive approach with one IF statement per variable, followed by a more concise approach that uses a SAS ARRAY and a DO loop to process every numeric variable without listing the names.
Data Analysis

Type : CREATION_INTERNE


Data is generated directly within the script via the DATALINES statement.

1 Code Block
DATA STEP Data
Explanation :
Creation of the 'health_study' table with raw data included in the code.
DATA health_study;
  INPUT id expc listen good take hlprob share livaln livchld slpsick nerves exclude count tellfl
        supress nocare satlife vigact liftgroc stairs bend;
DATALINES;
10001 9 0 1 1 1 0 0 9 0 0 0 1 1 1 0 1 1 9 1 9
10003 1 1 0 0 0 0 1 0 1 0 1 0 1 1 1 0 0 0 9 1
10004 0 0 1 0 0 1 1 0 9 1 0 0 1 0 0 1 9 1 1 1
10005 0 9 9 9 9 9 9 1 0 9 9 9 9 9 9 9 1 0 0 0
;
RUN;
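Before recoding, it can help to see which variables actually contain the code 9. The following is a minimal sketch, assuming the 'health_study' table created above is in the WORK library. It relies on the name range list expc -- bend, which selects the variables in data set order and so leaves out id.

```sas
/* Sketch: summarize the survey items before recoding.        */
/* Assumes health_study was created by the DATA step above.   */
PROC MEANS DATA=health_study N NMISS MIN MAX;
  VAR expc -- bend;   /* survey items only; id is not included */
RUN;
```

A maximum of 9 for a variable indicates that the missing-value code is still present in that column.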
2 Code Block
DATA STEP
Explanation :
Manual method: Using individual IF conditions for each variable to convert the value 9 to a missing value (.).
DATA health_study2;
  SET health_study;
  IF expc = 9 THEN expc = .;
  IF listen = 9 THEN listen = .;
  IF good = 9 THEN good = .;
  IF take = 9 THEN take = .;
  IF hlprob = 9 THEN hlprob = .;
  IF share = 9 THEN share = .;
  IF livaln = 9 THEN livaln = .;
  IF livchld = 9 THEN livchld = .;
  IF slpsick = 9 THEN slpsick = .;
  IF nerves = 9 THEN nerves = .;
  IF exclude = 9 THEN exclude = .;
  IF count = 9 THEN count = .;
  IF tellfl = 9 THEN tellfl = .;
  IF supress = 9 THEN supress = .;
  IF nocare = 9 THEN nocare = .;
  IF satlife = 9 THEN satlife = .;
  IF vigact = 9 THEN vigact = .;
  IF liftgroc = 9 THEN liftgroc = .;
  IF stairs = 9 THEN stairs = .;
  IF bend = 9 THEN bend = .;
RUN;
3 Code Block
DATA STEP
Explanation :
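Concise method: an ARRAY groups all numeric variables (_NUMERIC_) and a DO loop applies the transformation to each of them. Note that _NUMERIC_ also includes id, so an id equal to 9 would be set to missing as well; this does not happen with the test data, where every id has five digits.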
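DATA health_study3;
  SET health_study;
  /* _NUMERIC_ covers every numeric variable defined so far, including id */
  ARRAY variable {*} _numeric_;
  DO i = 1 TO DIM(variable);
    IF variable{i} = 9 THEN variable{i} = .;
  END;
  DROP i; /* keep the loop counter out of the output table */
RUN;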
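Because _NUMERIC_ also picks up id, and because the loop index is written to the output unless it is dropped, a slightly narrower variant may be safer on real data. The sketch below is one possible variation, not the script's own method: it limits the array to the survey items with the name range list expc -- bend and then compares the result with the manual version. The names health_study4 and items are hypothetical, and the code assumes health_study and health_study2 from the earlier steps already exist.

```sas
/* Sketch: same recode, restricted to the survey items.            */
/* health_study4 and items are illustrative names, not from the    */
/* original script.                                                 */
DATA health_study4;
  SET health_study;
  /* expc -- bend: every variable from expc through bend in        */
  /* data set order, so id is never touched                         */
  ARRAY items {*} expc -- bend;
  DO i = 1 TO DIM(items);
    IF items{i} = 9 THEN items{i} = .;
  END;
  DROP i;   /* do not write the loop index to the output */
RUN;

/* Check the array version against the manual version */
PROC COMPARE BASE=health_study2 COMPARE=health_study4;
RUN;
```

If both steps ran as written, PROC COMPARE should report no unequal values, which gives some confidence that the loop reproduces the twenty IF statements.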
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.