Upload a Local CSV File and Apply SAS Format

The 'upload' action of the 'table action set' in PROC CAS is used to transfer a CSV file from the client's local file system (or the Compute server if executed there) to an in-memory table in CAS. The 'importOptions' option is crucial, as it allows specifying the file type (CSV), indicating whether the first row contains column names ('getNames=True'), and especially applying specific informats and formats to variables via the 'vars' sub-parameter. For dates in 'MM/JJ/AAAA' format, the 'mmddyy10.' informat is used to convert the string to a SAS^© numeric value (number of days since January 1, 1960), and the 'nldate20.' format is then applied to display these numeric values in a readable and local manner. Each example includes the creation of a temporary CSV file to ensure autonomy and executability.

Data Analysis

Type : CREATION_INTERNE

Examples use CSV data generated directly in SAS code via a DATA _NULL_ step, simulating the availability of a local CSV file.

1 Code Block

PROC CAS / UPLOAD Data

Explanation :
This example illustrates the upload of a simple CSV file. It first creates a CSV file named 'myusage_basic.csv' with basic data. Then, it uses the `upload` action in `PROC CAS` to load this file into a CAS table named 'myusage_basic'. The `importOptions` specify that the file is a CSV, that the first line contains column names (`getNames=True`), and apply the `mmddyy10.` informat and `nldate20.` format to the date columns (`recorddt`, `startdt`, `enddt`) to interpret them correctly. Finally, `print myusage_basic(obs=3)` displays the first three observations of the loaded table.

Copied!

1	/* Création du fichier CSV local */
2	DATA _NULL_;
3	FILE "/tmp/myusage_basic.csv" DSD LRECL=200;
4	PUT 'account,high,low,recorddt,startdt,enddt,kwh';
5	PUT '1000001,82,71,07/03/2013,07/01/2013,07/02/2013,54';
6	PUT '1000001,85,70,07/02/2013,06/30/2013,07/01/2013,70';
7	PUT '1000001,85,68,07/01/2013,06/29/2013,06/30/2013,78';
8	RUN;
9
10	/* Connexion au serveur CAS (remplacer par vos infos) */
11	*options cashost='cloud.example.com' casport=5570;
12	cas casauto;
13
14	PROC CAS;
15	SESSION casauto;
16	upload path="/tmp/myusage_basic.csv"
17	casout={name="myusage_basic", replace=True},
18	importOptions={
19	fileType="csv",
20	getNames=True,
21	vars={
22	recorddt={informat="mmddyy10.", FORMAT="nldate20."},
23	startdt={informat="mmddyy10.", FORMAT="nldate20."},
24	enddt={informat="mmddyy10.", FORMAT="nldate20."}
25	}};
26
27	PRINT myusage_basic(obs=3);
28	RUN;
29	QUIT;
30
31	/* Suppression du fichier CSV temporaire */
32	x "rm /tmp/myusage_basic.csv";

2 Code Block

PROC CAS / UPLOAD Data

Explanation :
This example extends the basic case by adding a 'cost' column and using `promote=True` for the CAS table, thus making it available to all CAS sessions. The CSV file 'myusage_inter.csv' is created with a new 'cost' column. During upload, the `comma8.2` informat and `dollar10.2` format are applied to the 'cost' column to handle monetary values. After the upload, the `table.columnInfo` action is used to check column properties, including the applied informats and formats, and then the complete table is displayed.

Copied!

1	/* Création du fichier CSV local avec plus de données */
2	DATA _NULL_;
3	FILE "/tmp/myusage_inter.csv" DSD LRECL=200;
4	PUT 'account,high,low,recorddt,startdt,enddt,kwh,cost';
5	PUT '1000001,82,71,07/03/2013,07/01/2013,07/02/2013,54,12.50';
6	PUT '1000001,85,70,07/02/2013,06/30/2013,07/01/2013,70,15.75';
7	PUT '1000002,85,68,07/01/2013,06/29/2013,06/30/2013,78,20.00';
8	PUT '1000002,90,70,06/30/2013,06/28/2013,06/29/2013,72,18.25';
9	PUT '1000003,88,68,06/29/2013,06/27/2013,06/28/2013,54,10.00';
10	RUN;
11
12	/* Connexion au serveur CAS (remplacer par vos infos) */
13	*options cashost='cloud.example.com' casport=5570;
14	cas casauto;
15
16	PROC CAS;
17	SESSION casauto;
18	upload path="/tmp/myusage_inter.csv"
19	casout={name="myusage_inter", replace=True, promote=True},
20	importOptions={
21	fileType="csv",
22	getNames=True,
23	vars={
24	recorddt={informat="mmddyy10.", FORMAT="nldate20."},
25	startdt={informat="mmddyy10.", FORMAT="nldate20."},
26	enddt={informat="mmddyy10.", FORMAT="nldate20."},
27	cost={informat="comma8.2", FORMAT="dollar10.2"}
28	}};
29
30	TABLE.columnInfo RESULT=r / TABLE={name='myusage_inter'};
31	PRINT r;
32
33	PRINT myusage_inter;
34	RUN;
35	QUIT;
36
37	/* Suppression du fichier CSV temporaire */
38	x "rm /tmp/myusage_inter.csv";

3 Code Block

PROC CAS / UPLOAD Data

Explanation :
This advanced example uses macro variables (`%LET`) to dynamically define the CSV file name (including system date and time for uniqueness), the CAS table name, and the caslib. Before uploading, it includes conditional logic (`if table.tableInfo(...) then do; table.dropTable...; end;`) to delete an existing CAS table if it has the same name, ensuring clean execution. The CSV file is created and uploaded. New informats/formats (`date9.`, `comma10.2`, `dollar10.2`) are used for various data types. Finally, an `x "rm &CSV_FILE";` command is added to clean up the temporary file on the operating system.

Copied!

1	/* Création du fichier CSV local */
2	%LET CSV_FILE = /tmp/mydata_advanced_&sysdate._%sysfunc(compress(&systime.)).csv;
3	%LET CAS_TABLE = mydata_advanced;
4	%LET CAS_LIB = CASUSER;
5
6	DATA _NULL_;
7	FILE "&CSV_FILE" DSD LRECL=200;
8	PUT 'ID,Category,Value,EntryDate';
9	PUT '1,A,123.45,12-JAN-2023';
10	PUT '2,B,678.90,01-FEB-2023';
11	PUT '3,A,111.22,03-MAR-2023';
12	PUT '4,C,333.33,15-APR-2023';
13	RUN;
14
15	/* Connexion au serveur CAS (remplacer par vos infos) */
16	*options cashost='cloud.example.com' casport=5570;
17	cas casauto;
18
19	PROC CAS;
20	SESSION casauto;
21	/* Vérifier si la table existe déjà et la supprimer si nécessaire */
22	IF TABLE.tableInfo(caslib="&CAS_LIB", name="&CAS_TABLE") THEN DO;
23	TABLE.dropTable caslib="&CAS_LIB" name="&CAS_TABLE";
24	END;
25
26	upload path="&CSV_FILE"
27	casout={name="&CAS_TABLE", caslib="&CAS_LIB", replace=True, promote=True},
28	importOptions={
29	fileType="csv",
30	getNames=True,
31	vars={
32	EntryDate={informat="date9.", FORMAT="nldate20."},
33	Value={informat="comma10.2", FORMAT="dollar10.2"}
34	}};
35
36	PRINT &CAS_TABLE;
37	RUN;
38	QUIT;
39
40	/* Supprimer le fichier CSV temporaire après utilisation */
41	x "rm &CSV_FILE";

4 Code Block

PROC CAS / UPLOAD, FEDSQL, SAVETABLE Data

Explanation :
This example illustrates a complete workflow in a Viya/CAS environment. It begins by creating a CSV file 'transactions.csv'. Then, the `upload` action loads this data into a CAS table 'raw_transactions', applying informats/formats (`yymmdd10.`, `comma10.2`, `dollar10.2`) for dates and amounts. The next step uses `fedsql.execDirect` to execute an SQL query directly on the CAS in-memory data, aggregating transactions by `CustomerID` to calculate the total number of transactions and the total amount spent. The resulting aggregated table is then persistently saved in the 'CASUSER' caslib as a SASHDA file (`table.save`). Finally, the aggregated table is displayed. This process highlights CAS's ability to manage data loading, processing, and persistence in memory.

Copied!

1	/* Création du fichier CSV local */
2	DATA _NULL_;
3	FILE "/tmp/transactions.csv" DSD LRECL=200;
4	PUT 'TransactionID,CustomerID,Amount,TransactionDate,ProductCode';
5	PUT '1,C001,150.75,2023-01-15,PROD001';
6	PUT '2,C002,230.00,2023-01-16,PROD002';
7	PUT '3,C001,50.25,2023-01-15,PROD003';
8	PUT '4,C003,100.00,2023-01-17,PROD001';
9	PUT '5,C002,75.50,2023-01-16,PROD004';
10	RUN;
11
12	/* Connexion au serveur CAS (remplacer par vos infos) */
13	*options cashost='cloud.example.com' casport=5570;
14	cas casauto;
15
16	PROC CAS;
17	SESSION casauto;
18	caslib _all_ assign;
19
20	/* 1. Charger le fichier CSV */
21	upload path="/tmp/transactions.csv"
22	casout={name="raw_transactions", replace=True, promote=True},
23	importOptions={
24	fileType="csv",
25	getNames=True,
26	vars={
27	TransactionDate={informat="yymmdd10.", FORMAT="nldate20."},
28	Amount={informat="comma10.2", FORMAT="dollar10.2"}
29	}};
30
31	/* 2. Effectuer une agrégation avec FedSQL sur la table CAS */
32	fedsql.execDirect RESULT=res / query="
33	CREATE TABLE aggregated_transactions AS
34	SELECT
35	CustomerID,
36	COUNT(TransactionID) AS TotalTransactions,
37	SUM(Amount) AS TotalAmountSpent
38	FROM raw_transactions
39	GROUP BY CustomerID
40	ORDER BY TotalAmountSpent DESC;
41	";
42	PRINT res;
43
44	/* 3. Sauvegarder la table agrégée dans une caslib (ex: CASUSER) */
45	TABLE.save RESULT=saveres /
46	TABLE={name="aggregated_transactions", caslib="CASUSER"},
47	name="aggregated_transactions.sashdat",
48	replace=True;
49	PRINT saveres;
50
51	/* 4. Afficher les résultats agrégés */
52	PRINT aggregated_transactions;
53	RUN;
54	QUIT;
55
56	x "rm /tmp/transactions.csv";

This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.

Retour à la liste

Expert Advice

Simon

Expert SAS et fondateur.

« For very large local files, ensure your client-to-server network bandwidth is sufficient. If the file is already on a server accessible by CAS, table.loadTable is generally faster than table.upload. »