ETL CAS

From Cloud to Disk: How to Efficiently Convert CAS Tables to SAS Datasets

This code is also available in: Deutsch Español
Difficulty Level
Beginner
Published on :
When you specify a CAS engine libref on the output dataset in a DATA step without input data, the DATA step executes in CAS. The CAS DATA step only operates on in-memory CAS tables, meaning both input and output data must be in-memory CAS tables. CAS tables must contain at least one variable. The DATA step automatically executes in CAS when you use CAS engine librefs on both input and output datasets, or when you do not specify an input dataset and specify output data using a CAS engine libref. Not all language elements are supported in a CAS DATA step.
Data Analysis

Type : CREATION_INTERNE


Examples use generated data (datalines) or dynamically created data.

1 Code Block
DATA Step / PROC PRINT Data
Explanation :
This basic example shows how to create a CAS table 'produits_cas' from inline data, then convert it into a local SAS dataset 'mysas.produits_sas'. A 'mysas' libref is created to specify the save path for the SAS dataset. Finally, a PROC PRINT is used to display the content of the new SAS table. The temporary CAS table is then dropped.
Copied!
1LIBNAME mycas cas;
2 
3DATA mycas.produits_cas;
4 INPUT ID $ NomProduit $ Prix;
5 DATALINES;
6 A001 Pomme 1.00
7 B002 Poire 1.20
8 C003 Orange 0.90
9 ;
10RUN;
11 
12LIBNAME mysas '/tmp';
13 
14DATA mysas.produits_sas;
15 SET mycas.produits_cas;
16RUN;
17 
18PROC PRINT DATA=mysas.produits_sas;
19 title 'Produits convertis en SAS Dataset';
20RUN;
21 
22/* Nettoyage */
23PROC CASUTIL;
24 droptable casdata='produits_cas' incaslib='CASUSER';
25RUN;
26 
2 Code Block
DATA Step / PROC PRINT Data
Explanation :
This intermediate example shows how to convert a CAS table to a SAS dataset, applying common options. We create a CAS table 'ventes_cas'. During conversion, only sales with a 'Quantite' greater than 8 are selected ('where') and the 'IDVente' variable is dropped ('drop=IDVente'). Additionally, a new 'TotalVente' variable is calculated. The resulting SAS table 'mysas.ventes_filtrees' contains only the specified records and variables, plus the new calculated variable. The temporary CAS table is then dropped.
Copied!
1LIBNAME mycas cas;
2 
3DATA mycas.ventes_cas;
4 INPUT IDVente $ Produit $ Quantite PrixUnitaire DateVente :yymmdd10.;
5 FORMAT DateVente yymmdd10.;
6 DATALINES;
7 V001 Pomme 10 1.00 2023-01-05
8 V002 Poire 5 1.20 2023-01-05
9 V003 Orange 12 0.90 2023-01-06
10 V004 Pomme 7 1.00 2023-01-06
11 V005 Banane 20 0.75 2023-01-07
12 ;
13RUN;
14 
15LIBNAME mysas '/tmp';
16 
17DATA mysas.ventes_filtrees (drop=IDVente);
18 SET mycas.ventes_cas;
19 where Quantite > 8;
20 TotalVente = Quantite * PrixUnitaire;
21RUN;
22 
23PROC PRINT DATA=mysas.ventes_filtrees;
24 title 'Ventes filtrées et calculées, converties en SAS Dataset';
25RUN;
26 
27/* Nettoyage */
28PROC CASUTIL;
29 droptable casdata='ventes_cas' incaslib='CASUSER';
30RUN;
3 Code Block
DATA Step / PROC FORMAT / PROC PRINT Data
Explanation :
This advanced example uses a macro variable to filter data and applies a custom format during conversion. A CAS table 'employes_cas' is created. A custom format 'service_fmt' is defined and applied to the 'Service' variable. A macro variable `min_salaire` is used in the WHERE clause to select employees with a salary greater than or equal to a defined value. A new 'CategorieAge' variable is also created based on the birth year. The resulting SAS table 'mysas.employes_filtres_cas' contains the filtered data, the new variable, and the applied format. The temporary CAS tables and format are then dropped.
Copied!
1LIBNAME mycas cas;
2 
3/* Création d'une table CAS avec des données */
4DATA mycas.employes_cas;
5 INPUT ID Employe $ Service $ Salaire Anniversaire :yymmdd10.;
6 FORMAT Anniversaire yymmdd10.;
7 DATALINES;
8 101 Dupont RH 50000 1980-05-15
9 102 Martin Ventes 65000 1975-11-22
10 103 Dubois RH 52000 1992-03-01
11 104 Lefevre Marketing 60000 1988-07-30
12 ;
13RUN;
14 
15/* Création d'un format personnalisé dans CAS */
16PROC FORMAT library=mycas.formats_lib;
17 value $service_fmt
18 'RH' = 'Ressources Humaines'
19 'Ventes' = 'Commercial'
20 'Marketing' = 'Marketing & Com';
21RUN;
22 
23LIBNAME mysas '/tmp';
24 
25%let min_salaire = 55000;
26 
27DATA mysas.employes_filtres_cas;
28 SET mycas.employes_cas;
29 where Salaire >= &min_salaire;
30 LENGTH CategorieAge $15.;
31 IF year(Anniversaire) <= 1980 THEN CategorieAge = 'Senior';
32 ELSE CategorieAge = 'Junior';
33 FORMAT Service $service_fmt.;
34RUN;
35 
36PROC PRINT DATA=mysas.employes_filtres_cas;
37 title 'Employés filtrés et formatés (SAS Dataset)';
38RUN;
39 
40/* Nettoyage */
41PROC CASUTIL;
42 droptable casdata='employes_cas' incaslib='CASUSER';
43RUN;
44PROC FORMAT library=mycas.formats_lib;
45 delete $service_fmt;
46RUN;
47 
4 Code Block
DATA Step / PROC CAS / PROC PRINT Data
Explanation :
This Viya/CAS-focused example demonstrates how to handle advanced cases when converting a CAS table to a SAS dataset, including missing value management and logging. A CAS table 'transactions_cas' is created with missing amounts. The conversion DATA step includes logic to replace missing amounts with 0. It also uses a condition to simulate invalid data detection (although the `call cas.log` part is commented out to avoid an actual error in this example, it illustrates the error handling approach). The objective is to ensure conversion robustness and optimize the data process in a distributed environment. The temporary CAS table is then dropped.
Copied!
1LIBNAME mycas cas;
2 
3/* Création d'une table CAS avec des données et des valeurs manquantes */
4DATA mycas.transactions_cas;
5 INPUT IDTrans $ Montant DateTrans :yymmdd10. Statut $;
6 FORMAT DateTrans yymmdd10.;
7 DATALINES;
8 T001 100.50 2023-10-01 Succes
9 T002 . 2023-10-02 Echec
10 T003 250.00 2023-10-03 Succes
11 T004 50.25 2023-10-04 Succes
12 T005 . 2023-10-05 Annule
13 ;
14RUN;
15 
16LIBNAME mysas '/tmp';
17 
18/* Utilisation de l'option _ERROR_ pour la gestion d'erreurs et de VALIDFMT */
19DATA mysas.transactions_processed;
20 SET mycas.transactions_cas;
21 /* Gérer les montants manquants - les remplacer par 0 */
22 IF missing(Montant) THEN Montant = 0;
23
24 /* Simuler une erreur de format pour illustrer _ERROR_ (non exécutée dans cet exemple, mais pour démonstration) */
25 /* if IDTrans = 'T002' then call symputx('invalid_data_found', 'YES'); */
26 
27 /* Utilisation de l'option VALIDFMT pour vérifier les formats des variables */
28 /* L'option VALIDFMT est généralement appliquée lors de la lecture, ici on s'assure de la propreté des données */
29 IF Statut not in ('Succes', 'Echec', 'Annule') THEN call cas.log('Statut invalide détecté pour IDTrans: ' || IDTrans, 'ERROR');
30 
31RUN;
32 
33PROC PRINT DATA=mysas.transactions_processed;
34 title 'Transactions traitées et converties (SAS Dataset)';
35RUN;
36 
37/* Nettoyage */
38PROC CASUTIL;
39 droptable casdata='transactions_cas' incaslib='CASUSER';
40RUN;
41 
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : Copyright © SAS Institute Inc. All Rights Reserved.


Related Documentation

Aucune documentation spécifique pour cette catégorie.