Published on :
Data Management INTERNAL_CREATION

Index an Existing SASHDAT File

This code is also available in: Deutsch Español Français
Awaiting validation
The `table.index` action is used to create an index on specified columns of a SASHDAT file. Unlike in-memory indexing after loading, this method allows the SASHDAT file to be designated as input for the `index` action via the `table` parameter. This optimizes resource utilization because the action reads the file directly from disk rather than requiring the full original table to be loaded into memory. The result of the operation is a new in-memory CAS table with the indexed columns. This new table can then be saved to disk, potentially replacing the original SASHDAT file or creating a new one, to persist the indexing. The `table.tableDetails` action is then used to verify the presence and size of the index in the resulting table.
Data Analysis

Type : INTERNAL_CREATION


The examples use data generated via SAS DATA steps and CAS procedures to create temporary SASHDAT files for demonstration purposes, ensuring the autonomy of each code block.

1 Code Block
DATA STEP / table.index Data
Explanation :
This example creates a simple SASHDAT file `monfichier_unindexed.sashdat` in a temporary caslib. It then uses the `table.index` action to create an index on the 'ID' column by specifying the SASHDAT file directly. The indexed table `monfichier_indexed` is then saved to persist the indexing in the original SASHDAT file. The `table.tableDetails` action is used to confirm that the index has been created.
Copied!
1/* Assurez-vous qu'une session CAS est active */
2options cashost="localhost" casport=5570;
3cas casauto;
4 
5/* Crée une caslib temporaire si elle n'existe pas */
6PROC CAS;
7 caslib _all_ assign;
8 BUILTINS.addCaslib / name='mycaslib', path='/tmp/mycaslib', subdirs=TRUE, dataSource={srcType='PATH'};
9QUIT;
10 
11/* 1. Création d'un fichier SASHDAT simple sur disque */
12DATA casuser.monfichier_unindexed;
13 INPUT ID $ Nom $ Valeur;
14 DATALINES;
15 A Jean 100
16 B Marie 150
17 C Pierre 200
18 D Anne 120
19 ;
20RUN;
21 
22PROC CASUTIL;
23 save casuser.monfichier_unindexed / caslib='mycaslib' replace;
24QUIT;
25 
26/* 2. Indexation d'une seule colonne du fichier SASHDAT */
27PROC CAS;
28 TABLE.index /
29 TABLE={name='monfichier_unindexed.sashdat', caslib='mycaslib'},
30 casout={name='monfichier_indexed', replace=TRUE, indexVars={'ID'}};
31RUN;
32 
33/* 3. Sauvegarde de la table indexée, remplaçant l'originale pour persister l'index */
34 TABLE.save /
35 TABLE='monfichier_indexed',
36 name='monfichier_unindexed.sashdat',
37 caslib='mycaslib',
38 replace=TRUE;
39RUN;
40 
41/* 4. Vérification de l'indexation (optionnel) */
42 TABLE.tableDetails / TABLE={name='monfichier_indexed', caslib='mycaslib'};
43RUN;
44QUIT;
45 
2 Code Block
DATA STEP / table.index Data
Explanation :
This example illustrates the indexing of multiple columns ('Produit', 'Region') of a SASHDAT file. After creating and saving the `ventes_unindexed.sashdat` file, the `table.index` action creates a composite index. `table.tableDetails` is then used to display detailed information for the in-memory table `ventes_indexed`, confirming the presence and configuration of the new index, including its size.
Copied!
1/* Assurez-vous qu'une session CAS est active */
2options cashost="localhost" casport=5570;
3cas casauto;
4 
5/* Crée une caslib temporaire si elle n'existe pas */
6PROC CAS;
7 caslib _all_ assign;
8 BUILTINS.addCaslib / name='mycaslib', path='/tmp/mycaslib', subdirs=TRUE, dataSource={srcType='PATH'};
9QUIT;
10 
11/* 1. Création d'un autre fichier SASHDAT sur disque */
12DATA casuser.ventes_unindexed;
13 INPUT Produit $ Region $ Montant;
14 DATALINES;
15 A Est 1000
16 B Ouest 1500
17 A Nord 800
18 C Sud 2200
19 B Est 1100
20 ;
21RUN;
22 
23PROC CASUTIL;
24 save casuser.ventes_unindexed / caslib='mycaslib' replace;
25QUIT;
26 
27/* 2. Indexation de plusieurs colonnes */
28PROC CAS;
29 TABLE.index /
30 TABLE={name='ventes_unindexed.sashdat', caslib='mycaslib'},
31 casout={name='ventes_indexed', replace=TRUE, indexVars={'Produit', 'Region'}};
32RUN;
33 
34/* 3. Sauvegarde de la table indexée */
35 TABLE.save /
36 TABLE='ventes_indexed',
37 name='ventes_unindexed.sashdat',
38 caslib='mycaslib',
39 replace=TRUE;
40RUN;
41 
42/* 4. Affichage des détails de la table pour confirmer l'index */
43 TABLE.tableDetails / TABLE={name='ventes_indexed', caslib='mycaslib'};
44RUN;
45QUIT;
46 
3 Code Block
DATA STEP / table.index / table.loadTable Data
Explanation :
This advanced example demonstrates how to create an index on a large SASHDAT file and reload it into memory with the index intact. After creating a large dataset `grandedonnees_unindexed.sashdat` and indexing it on 'ID' and 'Groupe', the indexed table is saved. The in-memory table is then dropped to simulate a fresh start. Finally, the table is reloaded into memory via `table.loadTable`, and `table.tableDetails` confirms that the indexing was preserved during persistence and reloading, demonstrating the efficiency of this approach for large volumes.
Copied!
1/* Assurez-vous qu'une session CAS est active */
2options cashost="localhost" casport=5570;
3cas casauto;
4 
5/* Crée une caslib temporaire si elle n'existe pas */
6PROC CAS;
7 caslib _all_ assign;
8 BUILTINS.addCaslib / name='mycaslib', path='/tmp/mycaslib', subdirs=TRUE, dataSource={srcType='PATH'};
9QUIT;
10 
11/* 1. Création d'un fichier SASHDAT de plus grande taille */
12DATA casuser.grandedonnees_unindexed;
13 DO i = 1 to 100000;
14 ID = i;
15 Groupe = ceil(i / 1000);
16 Valeur = ranuni(0) * 1000;
17 OUTPUT;
18 END;
19RUN;
20 
21PROC CASUTIL;
22 save casuser.grandedonnees_unindexed / caslib='mycaslib' replace;
23QUIT;
24 
25/* 2. Indexation de colonnes clés directement depuis le fichier SASHDAT sur disque */
26PROC CAS;
27 TABLE.index /
28 TABLE={name='grandedonnees_unindexed.sashdat', caslib='mycaslib'},
29 casout={name='grandedonnees_indexed', replace=TRUE, indexVars={'ID', 'Groupe'}};
30RUN;
31 
32/* 3. Sauvegarde de la table indexée, pour persister l'index sur disque */
33 TABLE.save /
34 TABLE='grandedonnees_indexed',
35 name='grandedonnees_unindexed.sashdat',
36 caslib='mycaslib',
37 replace=TRUE;
38RUN;
39 
40/* 4. Nettoyage de la table en mémoire pour simuler un rechargement */
41 TABLE.dropTable / name='grandedonnees_indexed';
42RUN;
43 
44/* 5. Chargement de la table SASHDAT qui inclut maintenant l'index */
45 TABLE.loadTable /
46 caslib='mycaslib',
47 path='grandedonnees_unindexed.sashdat',
48 casout={name='grandedonnees_reloaded', replace=TRUE};
49RUN;
50 
51/* 6. Vérification que la table rechargée possède bien l'index */
52 TABLE.tableDetails / TABLE={name='grandedonnees_reloaded', caslib='mycaslib'};
53RUN;
54QUIT;
55 
4 Code Block
DATA STEP / table.upload / table.index Data
Explanation :
This example demonstrates how to work with SASHDAT files from the file system to CAS. A SASHDAT file is first created locally, then loaded into an in-memory CAS table. The `table.index` action is then applied to this in-memory CAS table to create a new CAS table with the desired indexes. Finally, this indexed CAS table is saved to a new SASHDAT file on disk, incorporating the indexing. `table.tableDetails` is used to confirm the index on the in-memory CAS table.
Copied!
1/* Assurez-vous qu'une session CAS est active */
2options cashost="localhost" casport=5570;
3cas casauto;
4 
5/* Crée une caslib temporaire si elle n'existe pas */
6PROC CAS;
7 caslib _all_ assign;
8 BUILTINS.addCaslib / name='mycaslib', path='/tmp/mycaslib', subdirs=TRUE, dataSource={srcType='PATH'};
9QUIT;
10 
11/* 1. Création d'un fichier SASHDAT local */
12DATA _null_;
13 file '/tmp/mycaslib/data_locale_unindexed.sashdat';
14 DO i = 1 to 5000;
15 ID = i;
16 Categorie = ceil(i / 1000);
17 Value = rannor(0);
18 OUTPUT;
19 END;
20RUN;
21 
22/* 2. Chargement du fichier SASHDAT dans CAS en tant que table en mémoire */
23PROC CASUTIL;
24 load DATA='/tmp/mycaslib/data_locale_unindexed.sashdat' outcaslib='mycaslib' casout='data_locale_cas' replace;
25QUIT;
26 
27/* 3. Création d'une nouvelle table CAS indexée à partir de la table en mémoire */
28PROC CAS;
29 TABLE.index /
30 TABLE={name='data_locale_cas', caslib='mycaslib'},
31 casout={name='data_locale_cas_indexed', replace=TRUE, indexVars={'ID', 'Categorie'}};
32RUN;
33 
34/* 4. Vérification de l'indexation de la nouvelle table CAS */
35 TABLE.tableDetails / TABLE={name='data_locale_cas_indexed', caslib='mycaslib'};
36RUN;
37 
38/* 5. Sauvegarde de la table CAS indexée vers un fichier SASHDAT */
39 TABLE.save /
40 TABLE='data_locale_cas_indexed',
41 name='data_locale_indexed.sashdat',
42 caslib='mycaslib',
43 replace=TRUE;
44RUN;
45QUIT;
46 
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : Copyright © SAS Institute Inc. All Rights Reserved. Last updated: January 5, 2022