Data Flow Analysis with SCAPROC

The program starts by registering an autocall macro library and switching on verbose debug mode for SCAPROC, then begins SCAPROC recording. Two DATA steps create copies ('x' and 'y') of the 'sashelp.class' data. Each copy is summarized by 'sex' with PROC SUMMARY, producing the intermediate datasets 'x2' and 'y2', which PROC SORT then sorts by 'sex' into 'x3' and 'y3'. A DATA step merges 'x3' and 'y3' by 'sex' into 'z', and PROC PRINT displays 'z'. Finally, PROC SQL performs a left join between the original 'x' and 'y' datasets to create 'sql_table'. SCAPROC recording is then stopped and the %scaproc_analyse macro is called to generate a GraphViz description of the processing flow.
Data Analysis

Type : SASHELP


The source data comes exclusively from the internal dataset 'SASHELP.CLASS', which is a standard example table provided with SAS. No external or inline-generated ('datalines') data is used.

1 Code Block
System Configuration and Macros
Explanation :
This block prepares the SAS session to use the analysis macros. It assigns the fileref 'mymacros' to the current directory, appends that fileref to the SASAUTOS autocall search path, and sets the macro variable '_eandebug' to 'scaproc,verbose' to turn on SCAPROC recording with verbose logging. Finally, the '%eanbegin' macro (not part of SAS itself, so presumably supplied by that autocall library) starts recording the program's execution flow under the name 'Sample 1'.
filename mymacros '.';
options append=sasautos=(mymacros);

%* Set macro variable to turn on SCAPROC and set verbose logging ;
%let _eandebug=scaproc,verbose;

%* Start recording SCAPROC data ;
%eanbegin(Sample 1)
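The internals of '%eanbegin' and '%eanend' are not shown on this page. As a rough sketch only, recording with the native PROC SCAPROC statements looks like the following; the file name 'record.txt' is a hypothetical placeholder, and the macros may well do more than this.

```sas
/* Start recording. 'record.txt' is a hypothetical output file name. */
/* ATTR adds variable attributes, OPENTIMES adds dataset open times. */
proc scaproc ;
  record 'record.txt' attr opentimes ;
run ;

/* ... the DATA steps and procedures to be analysed go here ... */

/* Stop recording and write the collected information to the file. */
proc scaproc ;
  write ;
run ;
```

The recorded file contains comment lines describing the inputs and outputs of each step, which is the information a flow diagram needs.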
2 Code Block
DATA STEP Data
Explanation :
This DATA step creates a new dataset named 'x' as a full copy of 'sashelp.class', with all observations and variables.
data x ;
  set sashelp.class ;
run ;
3 Code Block
DATA STEP Data
Explanation :
A second DATA step creates the dataset 'y' in the same way, again as a full copy of 'sashelp.class'.
data y ;
  set sashelp.class ;
run ;
4 Code Block
PROC SUMMARY Data
Explanation :
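This procedure computes summary statistics for the 'x' dataset: the mean of 'height' for each value of 'sex'. Because 'mean=' names no output variable, the mean is stored under the original name 'height'. The results are written to the new dataset 'x2', which also carries the automatic variables _TYPE_ and _FREQ_. Since the NWAY option is not specified, 'x2' contains an overall row (_TYPE_=0, with 'sex' missing) in addition to one row per value of 'sex'.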
proc summary data=x ;
  class sex ;
  var height ;
  output out=x2 mean= ;
run ;
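If only the per-group rows are wanted, the NWAY option restricts the output to the highest _TYPE_ value. The variant below is a sketch, not part of the original program; it reads 'sashelp.class' directly so it can run on its own, and the output name 'x2_nway' is made up for illustration.

```sas
/* NWAY keeps only the rows for the full CLASS combination,  */
/* so the overall (_TYPE_=0) row is not written.             */
/* 'x2_nway' is a hypothetical dataset name.                 */
proc summary data=sashelp.class nway ;
  class sex ;
  var height ;
  output out=x2_nway mean= ;
run ;
```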
5 Code Block
PROC SUMMARY Data
Explanation :
As in the previous block, this procedure computes summary statistics, this time for the 'y' dataset. It groups by 'sex', calculates the mean of 'height', and saves the results in 'y2'.
proc summary data=y ;
  class sex ;
  var height ;
  output out=y2 mean= ;
run ;
6 Code Block
PROC SORT Data
Explanation :
PROC SORT sorts the 'x2' dataset by the 'sex' variable and saves the sorted result as 'x3'. Input that is ordered by the BY variable is a prerequisite for a DATA step MERGE with a BY statement (PROC SQL joins, by contrast, do not require presorted input).
proc sort data=x2 out=x3 ;
  by sex ;
run ;
7 Code Block
PROC SORT Data
Explanation :
As in the previous block, this PROC SORT sorts the 'y2' dataset by the 'sex' variable and stores the result in 'y3'. This prepares 'y3' for merging with 'x3'.
proc sort data=y2 out=y3 ;
  by sex ;
run ;
8 Code Block
DATA STEP Data
Explanation :
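This DATA step merges the sorted datasets 'x3' and 'y3' by the 'sex' variable, combining observations with the same 'sex' value into the new 'z' dataset. Because both inputs carry the same variable names (including 'height'), the values read from 'y3' overwrite those from 'x3' in each BY group. Here that makes no visible difference, since both inputs derive from identical copies of 'sashelp.class'.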
data z ;
  merge x3 y3 ;
  by sex ;
run ;
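When the two inputs hold different values, the overwrite can be avoided by renaming the shared analysis variable on input. The following is a sketch under assumptions: it relies on 'x3' and 'y3' from the steps above already existing in WORK, and the names 'z_both', 'height_x' and 'height_y' are invented for illustration.

```sas
/* Keep both means side by side by renaming HEIGHT on each input. */
/* 'z_both', 'height_x' and 'height_y' are hypothetical names.    */
/* Assumes x3 and y3 were created by the preceding steps.         */
data z_both ;
  merge x3(rename=(height=height_x))
        y3(rename=(height=height_y)) ;
  by sex ;
run ;
```

The remaining shared variables (_TYPE_ and _FREQ_) are still overwritten by the right-hand dataset, which is harmless in this example.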
9 Code Block
PROC PRINT
Explanation :
Because no DATA= option is given, PROC PRINT displays the most recently created dataset (tracked by the _LAST_ system option). In this case, that is the 'z' dataset resulting from the merge.
proc print ;
run ;
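Relying on the most recently created dataset can print the wrong table if a step is later inserted before PROC PRINT. A more explicit variant, assuming 'z' exists from the merge step above, might look like this; the title text is illustrative only.

```sas
/* Name the dataset explicitly instead of relying on _LAST_. */
/* Assumes z was created by the preceding merge step.        */
title 'Mean height by sex (merged)' ;
proc print data=z noobs ;
run ;
title ;  /* clear the title so it does not carry over to later output */
```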
10 Code Block
PROC SQL Data
Explanation :
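This block uses PROC SQL to create a new table named 'sql_table' by combining the 'x' and 'y' datasets with a LEFT JOIN on the 'sex' variable. All observations from 'x' are kept, and each is paired with every observation from 'y' that has the same 'sex' value. Because 'sex' is not unique, this is a many-to-many join: with the 9 female and 10 male students in 'sashelp.class', it yields 81 + 100 = 181 rows. Since 'select *' returns identically named columns from both tables, SAS keeps only the first occurrence of each name in 'sql_table' and writes a warning to the log for the others.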
proc sql ;
  create table sql_table as
  select *
  from x
  left join
  y
  on x.sex=y.sex ;
quit ;
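Selecting and renaming columns explicitly keeps the values from both sides and avoids the duplicate-name warnings. The sketch below is an illustration rather than the original code: it joins 'sashelp.class' to itself (equivalent here, since 'x' and 'y' are plain copies) so that it runs on its own, and the names 'sql_pairs', 'name_x' and 'name_y' are invented.

```sas
/* Explicit column list with aliases: no duplicate column names. */
/* 'sql_pairs', 'name_x' and 'name_y' are hypothetical names.    */
proc sql ;
  create table sql_pairs as
  select x.sex,
         x.name as name_x,
         y.name as name_y
  from sashelp.class as x
  left join
       sashelp.class as y
  on x.sex = y.sex ;
quit ;
```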
11 Code Block
SCAPROC Analysis
Explanation :
The '%eanend' macro stops SCAPROC recording and writes out the recorded data. The '%scaproc_analyse' macro then processes that data and generates output in the GraphViz DOT language, which can be rendered as a flowchart of the SAS program's execution flow.
%* Finish recording SCAPROC data and write it out ;
%eanend

%* Generate the GraphViz DOT language to be used to make a diagram ;
%scaproc_analyse
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.