Published on :
ETL CREATION_INTERNE

Creating Variables and Filtering Data with the DATA Step

This code is also available in: Deutsch Español Français
Awaiting validation
The DATA step allows building a new CAS table named CreditQualify from an existing table, CreditScores. Categorical variables such as 'Age_Range', 'FICO_Rating', and 'Credit_Qualification' are created based on the numerical values of the 'Age' and 'Credit_Score' variables. Classifications are made using IF-THEN/ELSE conditional blocks. The code includes the use of CAS sessions ('sessref=mysession') and viewing the resulting table via the 'table.fetch' action of PROC CAS.
Data Analysis

Type : CREATION_INTERNE


The example is based on a CAS table 'mycaslib.creditscores'. To make the example self-contained, we will simulate this table with a DATA step and datalines.

1 Code Block
DATA STEP Data
Explanation :
This code block first creates a source table 'mycaslib.creditscores' with dummy data for age and credit score. Then, it executes the original DATA step. This step reads observations from 'mycaslib.creditscores' and creates three new categorical variables: 'Age_Range' (age range), 'FICO_Rating' (FICO score evaluation), and 'Credit_Qualification' (credit qualification). The values of these variables are assigned conditionally based on 'Age' and 'Credit_Score' using IF-THEN/ELSE statements. The resulting table 'mycaslib.creditqualify' is stored on the CAS server.
Copied!
1DATA mycaslib.creditscores;
2 INPUT Age Credit_Score;
3 DATALINES;
4 20 550
5 30 680
6 40 750
7 50 810
8 60 400
9 70 700
10 22 350
11 35 600
12 45 780
13 55 850
14 65 590
15 ;
16RUN;
17 
18DATA mycaslib.creditqualify / sessref=mysession;
19 SET mycaslib.creditscores;
20 LENGTH Age_Range $8;
21 IF Age in (18, 19, 20, 21, 22, 23, 24, 25) THEN Age_Range="18-25";
22 ELSE IF Age in (26, 27, 28, 29, 30, 31, 32, 33, 34, 35) THEN Age_Range="26-35";
23 ELSE IF Age in (36, 37, 38, 39, 40, 41, 42, 43, 44, 45) THEN Age_Range="36-45";
24 ELSE IF Age in (46, 47, 48, 49, 50, 51, 52, 53, 54, 55) THEN Age_Range="46-55";
25 ELSE IF Age in (56, 57, 58, 59, 60, 61, 62, 63, 64) THEN Age_Range="56-64";
26 ELSE IF Age>=65 THEN Age_Range="65+";
27 
28 LENGTH FICO_Rating $12;
29 IF 300<=Credit_Score<=570 THEN FICO_Rating="Very Poor";
30 ELSE IF 580<=Credit_Score<=669 THEN FICO_Rating="Fair";
31 ELSE IF 670<=Credit_Score<=739 THEN FICO_Rating="Good";
32 ELSE IF 740<=Credit_Score<=799 THEN FICO_Rating="Very Good";
33 ELSE IF Credit_Score>=800 THEN FICO_Rating="Exceptional";
34 
35 LENGTH Credit_Qualification $12;
36 IF Credit_Score>=740 THEN Credit_Qualification="Platinum";
37 ELSE IF 650<=Credit_Score<=739 THEN Credit_Qualification="Gold";
38 ELSE IF 450<=Credit_Score<=649 THEN Credit_Qualification="Secured Card";
39 ELSE IF Credit_Score<=449 THEN Credit_Qualification="N/A";
40RUN;
2 Code Block
PROC CAS
Explanation :
This code block uses the CAS procedure to execute the 'table.fetch' action. This action retrieves and displays the observations from the 'creditqualify' CAS table that was created in the previous DATA step.
Copied!
1 
2PROC CAS;
3TABLE.fetch / TABLE="creditqualify" index=false;
4QUIT;
5 
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.