Get Started! A Beginner's Guide to Programming in the SAS® Cloud Analytic Services (CAS) Environment
Simon
Difficulty Level: Beginner
Expert Advice
Michael
The secret to mastering SAS CAS is minimizing data movement between the client and the server. Always aim to process your data 'in-place' using CAS-enabled procedures rather than pulling tables back to the Compute Server for every step. This habit is the key to unlocking the true speed of distributed, in-memory analytics.
The Major Difference: Parallel Processing
In CAS, the DATA step runs in a distributed manner. Very large datasets are divided among the available "threads" on the different machines. The DATA step code is copied and executed simultaneously on each thread, processing only the portion of data local to that thread.
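A DATA step runs in CAS only when its input and output tables are both referenced through a CAS engine libref. The lines below are a minimal setup sketch, not the article's own code. They assume a session named casauto and the casuser caslib, which are the names used elsewhere in this article; your environment may use others.

```sas
/* Assumption: session name casauto and caslib casuser, as used in this article */
cas casauto;                          /* start (or connect to) a CAS session */
libname mycas cas caslib=casuser;     /* CAS engine libref pointing to the casuser caslib */
```

With this in place, a table written as mycas.some_table lives in CAS memory rather than on the Compute Server.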
Example of Parallel Processing:
In the example below, we add transaction fees to a large banking table. Writing the automatic variable _THREADID_ to the log shows that the code runs on several different threads (e.g., 4 threads).
data mycas.updated_transaction_history;
   set mycas.transaction_history;
   /* Logic to add a fee depending on the year */
   if year(transaction_dt)=2013 then fee=1;
   /* ... other years ... */
   new_transaction_amt=transaction_amt+fee;
   put _threadid_=; /* Writes the thread number to the log */
run;
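Writing _THREADID_ to the log for every row can flood the log on a large table. As an alternative, the sketch below counts how many rows each thread processed and keeps one summary row per thread. It assumes the mycas libref and the transaction_history table from the example above; the output table name thread_counts and the variables thread, rows and eof are hypothetical.

```sas
/* Sketch: one output row per thread that received data */
data mycas.thread_counts;
   set mycas.transaction_history end=eof;  /* in CAS, END= is set per thread */
   rows + 1;                               /* sum statement: running count on this thread */
   if eof then do;
      thread = _threadid_;                 /* copy the automatic variable so it can be kept */
      output;
   end;
   keep thread rows;
run;
```

Because each thread sees only its local portion of the data, the row counts in thread_counts reflect how the table was distributed rather than a single grand total.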
PROC CAS is the interface for executing the CAS Language (CASL). CASL interacts with the server via "actions". Actions are requests for specific tasks (table management, analyses, etc.), grouped into "action sets".
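To see which action sets and actions are available in a session, the builtins action set can be queried. This is a short sketch assuming the casauto session used in this article.

```sas
proc cas;
   session casauto;
   /* List the action sets currently loaded in the session */
   builtins.actionSetInfo;
   /* List the actions that belong to the table action set */
   builtins.help / actionSet='table';
quit;
```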
Here are examples of actions from the table action set for managing data:
Check if a table exists and retrieve info:
proc cas;
   session casauto;
   /* Check whether the table exists */
   table.tableexists result=r / caslib='casuser' name='updated_transaction_history';
   if (r.exists) then do;
      /* Get information about the table */
      table.tableinfo / caslib='casuser' name='updated_transaction_history';
      /* Fetch a sample of rows */
      table.fetch / table={caslib='casuser', name='updated_transaction_history'} from=1 to=20;
   end;
quit;
Method A: Use a CAS Action (e.g., simple.freq)
The simple action set provides basic analytical functions.
proc cas;
   session casauto;
   /* Frequency distribution of transaction status, grouped by year */
   simple.freq /
      inputs={'transaction_status'}
      table={caslib='casuser', name='updated_transaction_history', groupby={name='year'}};
quit;
/* Data preparation: add a month column */
data mycas.updated_transaction_history2;
   set mycas.updated_transaction_history;
   month=put(transaction_dt,monname8.);
run;
/* Compute the sums by year and month */
proc mdsummary data=mycas.updated_transaction_history2(where=(fee ne 0));
   var fee;
   groupby year month / out=mycas.summary_transaction_history;
run;
/* Display the results (MDSUMMARY only produces an output table) */
proc print data=mycas.summary_transaction_history label;
   title 'Summary of Transaction History';
   var year month _sum_;
   label _sum_='Total Collected ($)';
   format _sum_ dollar8.;
run;
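The same aggregation can also be requested as a CAS action, in the style of Method A. The sketch below uses simple.summary and is an assumption about how the MDSUMMARY step above might be translated, not code from the original article. It reuses the table names from the example above, and the layout of the output table may differ from what MDSUMMARY produces.

```sas
proc cas;
   session casauto;
   /* Sum of fee by year and month, restricted to rows with a nonzero fee */
   simple.summary /
      table={caslib='casuser', name='updated_transaction_history2',
             groupby={'year','month'}, where='fee ne 0'}
      inputs={'fee'}
      subset={'SUM'}
      casout={caslib='casuser', name='summary_transaction_history', replace=true};
quit;
```

Either way, the aggregation runs in CAS and only the small summary table needs to be printed on the client.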
The code and examples provided on WeAreCAS.eu are for educational purposes. Do not copy and paste them blindly into your production environments; understand the logic before applying it. We strongly recommend running these scripts in a test environment (Sandbox/Dev) first. WeAreCAS accepts no responsibility for any impact or data loss on your systems.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.