
Force a Specific Order on a Variable (PROC CARDINALITY)

Awaiting validation
The CARDINALITY procedure analyzes the cardinality of variables in a CAS table. It counts the unique levels of each specified variable and, when that count exceeds a defined limit (MAXLEVELS=), reports only the first MAXLEVELS levels in the requested order. The ORDER= option in the VAR statement sets the order (for example, ASC for ascending or DESC for descending) in which levels are selected and displayed; both of these orderings use the raw values rather than any user-defined format. This example applies ascending numerical order to the 'engineSize' variable.
Data Analysis

Type : SASHELP / CREATION_INTERNE


The examples use the built-in 'sashelp.cars' dataset and data generated in a DATA step, so they are self-contained.

1 Code Block
PROC CARDINALITY
Explanation :
This example illustrates the simplest use of the CARDINALITY procedure. It loads the 'sashelp.cars' dataset into a CAS table (mylib.cars), then analyzes the 'Make', 'Model', and 'Type' variables to determine their cardinality. The results are stored in 'mylib.card_basic' and printed. Because no ORDER= option is specified, the procedure applies its default ordering, which is reported in the `_ORDER_` column.
/* DATA step that loads sashelp.cars into a CAS library (assumes an active CAS session) */
LIBNAME mylib cas;
DATA mylib.cars;
  SET sashelp.cars;
RUN;

/* Basic use of PROC CARDINALITY */
title 'Default variable cardinality';
PROC CARDINALITY DATA=mylib.cars outcard=mylib.card_basic;
  var Make Model Type;
RUN;

PROC PRINT DATA=mylib.card_basic;
  var _varname_ _order_ _cardinality_;
  title 'Cardinality summary (basic)';
RUN;
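As a quick sanity check on the reported level counts, the following minimal sketch counts distinct values with PROC SQL on the original 'sashelp.cars' table. It is not part of the original example, and the column aliases (`n_make`, `n_model`, `n_type`) are hypothetical names chosen for illustration.

```sas
/* Count distinct levels directly, to compare with the cardinality summary */
proc sql;
  select count(distinct Make)  as n_make,
         count(distinct Model) as n_model,
         count(distinct Type)  as n_type
  from sashelp.cars;
quit;
```

The three counts should match the `_cardinality_` values for the same variables, assuming the CAS table is an unmodified copy of 'sashelp.cars'.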
2 Code Block
PROC CARDINALITY
Explanation :
This example follows the logic of the original documentation. It analyzes the 'engineSize' variable from the 'mylib.cars' table. The `ORDER=ASC` option in the VAR statement enforces an ascending sort of the 'engineSize' levels on their raw values, ignoring any formats. `MAXLEVELS=5` limits the detailed output to the first 5 levels. The `outcard` and `outdetails` options name the output tables that hold the summary and the level details, which are then printed.
/* DATA step that loads sashelp.cars into a CAS library (assumes an active CAS session) */
LIBNAME mylib cas;
DATA mylib.cars;
  SET sashelp.cars;
RUN;

/* Force ascending order on 'engineSize' */
title 'Cardinality of engineSize with ascending order and MAXLEVELS';
PROC CARDINALITY DATA=mylib.cars outcard=mylib.card_asc outdetails=mylib.details_asc maxlevels=5;
  var engineSize / order=asc;
RUN;

PROC PRINT DATA=mylib.card_asc;
  var _varname_ _order_ _more_ _cardinality_;
  title 'Cardinality summary (ASC order)';
RUN;

/* Display the level details */
PROC PRINT DATA=mylib.details_asc;
  title 'Level details for engineSize (ASC order)';
RUN;
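To see which values an ascending order with a limit of five levels should surface, the sketch below lists the five smallest distinct engine sizes using Base SAS only. It is an independent cross-check, not the procedure's own method, and `work.engine_levels` is a hypothetical table name.

```sas
/* Keep one row per distinct EngineSize, sorted in ascending raw-value order */
proc sort data=sashelp.cars(keep=EngineSize)
          out=work.engine_levels nodupkey;
  by EngineSize;
run;

/* Print the first five distinct levels */
proc print data=work.engine_levels(obs=5) noobs;
  title 'First five distinct EngineSize values (ascending)';
run;
```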
3 Code Block
PROC CARDINALITY Data
Explanation :
This advanced example shows how to handle a categorical variable (`Status`) that carries a user-defined format (`$statusfmt.`). The `mylib.employees` table is created directly in the CAS library from inline `datalines`. The CARDINALITY procedure is used with `ORDER=DESC` to sort the unique levels of 'Status' in descending order of their raw values (A, I, S), not their formatted labels (Active, Inactive, Suspended).
/* Create data with a user-defined format and load it into CAS (assumes an active CAS session) */
LIBNAME mylib cas;

PROC FORMAT;
  value $statusfmt 'A' = 'Active' 'I' = 'Inactive' 'S' = 'Suspended';
RUN;

DATA mylib.employees;
  INPUT EmployeeID STATUS $ Salary;
  FORMAT STATUS $statusfmt.;
  DATALINES;
101 A 50000
102 I 60000
103 A 55000
104 S 70000
105 A 48000
106 I 62000
107 S 75000
108 A 51000
109 I 58000
110 A 53000
;
RUN;

/* Cardinality analysis with descending order for the Status variable */
title 'Cardinality of Status with descending order on raw values';
PROC CARDINALITY DATA=mylib.employees outcard=mylib.card_desc outdetails=mylib.details_desc maxlevels=3;
  var STATUS / order=desc;
RUN;

PROC PRINT DATA=mylib.card_desc;
  var _varname_ _order_ _cardinality_;
  title 'Cardinality summary (DESC order)';
RUN;

PROC PRINT DATA=mylib.details_desc;
  title 'Level details for Status (DESC order)';
RUN;
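The difference between ordering on raw values and ordering on formatted labels can be illustrated with PROC FREQ, whose `ORDER=INTERNAL` and `ORDER=FORMATTED` options make the same distinction. This is a minimal sketch with Base SAS, not PROC CARDINALITY itself. The format `$lvlfmt`, its labels, and the table `work.status_demo` are hypothetical, chosen so that the two orderings visibly differ.

```sas
/* Hypothetical format: labels deliberately sort differently from the raw codes */
proc format;
  value $lvlfmt 'A' = 'Regular'
                'I' = 'On hold'
                'S' = 'Active';
run;

/* Small illustrative table, several values per input line */
data work.status_demo;
  input Status $ @@;
  format Status $lvlfmt.;
  datalines;
A I A S A I S A I A
;
run;

/* Ordered by raw values A, I, S (shown as Regular, On hold, Active) */
proc freq data=work.status_demo order=internal;
  tables Status / nocum nopercent;
run;

/* Ordered by formatted labels: Active, On hold, Regular */
proc freq data=work.status_demo order=formatted;
  tables Status / nocum nopercent;
run;
```

A descending order simply reverses each of these sequences, so the choice between raw and formatted ordering matters whenever labels do not sort the same way as the underlying codes.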
4 Code Block
PROC CARDINALITY
Explanation :
This example highlights more advanced features of `PROC CARDINALITY` in a CAS environment. It uses the `GROUP BY` statement to calculate the cardinality of the `Type` variable separately for each `Origin`. The `ORDER=FREQ` option sorts the levels of `Type` by their frequency of occurrence. `MAXLEVELS=2` is deliberately low so that the `_MORE_` column flags groups that have more unique levels than are reported. On larger tables, the same pattern helps identify high-cardinality variables within each group.
/* Make sure sashelp.cars is loaded into CAS (assumes an active CAS session) */
LIBNAME mylib cas;
DATA mylib.cars;
  SET sashelp.cars;
RUN;

/* Grouped cardinality analysis with few levels reported */
title 'Grouped cardinality with few levels reported';
PROC CARDINALITY DATA=mylib.cars outcard=mylib.card_grouped outdetails=mylib.details_grouped maxlevels=2;
  var Type / order=freq;
  class Origin;
  group BY Origin;
RUN;

/* _more_ is printed so that unreported levels are visible */
PROC PRINT DATA=mylib.card_grouped;
  var Origin _varname_ _order_ _more_ _cardinality_;
  title 'Grouped cardinality summary';
RUN;

PROC PRINT DATA=mylib.details_grouped;
  where _varname_ = 'Type';
  title 'Level details for Type (grouped by Origin)';
RUN;
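A per-group level count can also be cross-checked with PROC SQL on the original 'sashelp.cars' table. This is a minimal sketch, independent of PROC CARDINALITY; the alias `n_type_levels` is a hypothetical name.

```sas
/* Number of distinct Type levels within each Origin */
proc sql;
  select Origin,
         count(distinct Type) as n_type_levels
  from sashelp.cars
  group by Origin;
quit;
```

Any group whose count exceeds the `MAXLEVELS=` value used in the analysis has levels that are not written to the details table.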
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Expert Advice
Michael
Viya Infrastructure Manager.
"In the SAS Viya environment, PROC CARDINALITY is more than a simple counting tool; it is a high-performance data profiling engine designed to handle massive distributed datasets. Understanding how to manipulate the ORDER= and MAXLEVELS options is essential for preparing data for machine learning or complex reporting."