Published on :
ETL CREATION_INTERNE

Restructuring SAS Tables (Wide to Long and vice versa)

This code is also available in: Deutsch Español Français
Awaiting validation
This educational script illustrates two fundamental data manipulation techniques. First, it transforms a 'wide' dataset containing baby weights at different months (weight3, weight6, etc. columns) into a 'long' format (one row per month per baby) using arrays (ARRAY) and an iterative loop in a DATA step. Second, it shows how to revert to the 'wide' format using the TRANSPOSE procedure, detailing the successive steps to refine the result (using BY, ID, and PREFIX).
Data Analysis

Type : CREATION_INTERNE


The 'baby_weight' data is statically generated in the script via the DATALINES statement.

1 Code Block
DATA STEP Data
Explanation :
Creation of the initial 'baby_weight' dataset containing identifiers, gender, and weights measured at 3, 6, 9, and 12 months.
Copied!
1DATA baby_weight;
2 INPUT id gender $ weight3 weight6 weight9 weight12;
3DATALINES;
41001 F 9 13 16 17
51002 F 11 16 17 20
61003 M 17 20 23 24
71004 F 16 18 21 22
81005 M 11 15 16 18
91006 M 17 21 25 26
101007 M 16 17 19 21
111008 F 15 16 18 19
12;
13RUN;
2 Code Block
PROC PRINT
Explanation :
Display of raw data for verification.
Copied!
1PROC PRINT DATA = baby_weight;
2RUN;
3 Code Block
DATA STEP Data
Explanation :
'Wide to Long' transformation. Uses two arrays (ARRAY): one for fixed months and one for weight variables. A DO loop iterates over the arrays and the OUTPUT statement generates a row for each month.
Copied!
1DATA baby_weight_long (keep = id gender month weight);
2 SET baby_weight;
3 array months{4} (3 6 9 12);
4 array weights{4} weight3--weight12;
5 DO i = 1 to 4;
6 month = months{i};
7 weight = weights{i};
8 OUTPUT;
9 END;
10RUN;
4 Code Block
PROC PRINT
Explanation :
Visualization of the restructured long format result.
Copied!
1PROC PRINT DATA = baby_weight_long;
2RUN;
5 Code Block
DATA STEP Data
Explanation :
Pedagogical example (intermediate or incorrect step for the final goal) showing iteration without the explicit OUTPUT statement at each iteration, which does not produce the desired 'long' structure for all observations.
Copied!
1DATA baby_weight_long1;
2 SET baby_weight;
3 array weights{4} weight3--weight12;
4 DO i = 1 to 4;
5 weight = weights{i};
6 END;
7RUN;
6 Code Block
PROC PRINT
Explanation :
Visualization of the intermediate step.
Copied!
1PROC PRINT DATA = baby_weight_long1;
2RUN;
7 Code Block
PROC SORT Data
Explanation :
Sorting 'long' data by identifier and month, a necessary prerequisite for correct or structured transposition.
Copied!
1 
2PROC SORT
3DATA = baby_weight_long;
4BY id month;
5RUN;
6 
8 Code Block
PROC TRANSPOSE Data
Explanation :
Transposition Step 1: Basic attempt at transposition without grouping variables (BY), which transposes the entire table.
Copied!
1 
2PROC TRANSPOSE
3DATA = baby_weight_long
4out = baby_weight_wide;
5RUN;
6 
9 Code Block
PROC TRANSPOSE Data
Explanation :
Transposition Step 2: Adding the BY 'id' statement to transpose the data for each subject separately.
Copied!
1PROC TRANSPOSE DATA = baby_weight_long
2 out = baby_weight_wide;
3 BY id;
4RUN;
10 Code Block
PROC TRANSPOSE Data
Explanation :
Transposition Step 3: Adding the ID 'month' statement to use the values of the 'month' variable as suffixes/names for the new transposed columns.
Copied!
1PROC TRANSPOSE DATA = baby_weight_long
2 out = baby_weight_wide;
3 BY id;
4 id month;
5RUN;
11 Code Block
PROC TRANSPOSE Data
Explanation :
Step 4 (Final): Refinement of the transposition with the PREFIX='weight' option to name the columns (e.g., weight3, weight6...) and deletion of the automatic _NAME_ column.
Copied!
1PROC TRANSPOSE DATA = baby_weight_long
2 out = baby_weight_wide (drop = _NAME_)
3 prefix = weight;
4 BY id;
5 id month;
6RUN;
12 Code Block
PROC PRINT
Explanation :
Visualization of the final transposition result (back to wide format).
Copied!
1PROC PRINT DATA = baby_weight_wide;
2RUN;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.