The CARDINALITY procedure analyzes the cardinality of variables in a CAS table. It counts the unique levels of each specified variable and, when the number of levels exceeds a defined limit (MAXLEVELS), reports only the levels up to that limit in the requested order. The ORDER= option in the VAR statement sets the sort order (for example, ASC for ascending or DESC for descending) used to select and display variable levels; the order is based on the raw values rather than on user-defined formats. This example demonstrates how to apply an ascending numerical order to the 'engineSize' variable.
Data Analysis
Type: SASHELP / internally generated
The examples use the built-in 'sashelp.cars' dataset and data generated via DATA Step to ensure autonomy.
1 Code Block
PROC CARDINALITY
Explanation : This example illustrates the simplest use of the CARDINALITY procedure. It loads the 'sashelp.cars' dataset into a CAS session (mylib.cars) then analyzes the 'Make', 'Model', and 'Type' variables to determine their cardinality without specifying any particular order. The results are stored in 'mylib.card_basic' and displayed. The default order is alphabetical or by frequency if a threshold is reached.
/* DATA step to load sashelp.cars into a CAS library (assumes an active CAS session) */
libname mylib cas;
data mylib.cars;
   set sashelp.cars;
run;
/* Basic use of PROC CARDINALITY */
title 'Default variable cardinality';
proc cardinality data=mylib.cars outcard=mylib.card_basic;
   var Make Model Type;
run;
/* Print the cardinality summary table */
proc print data=mylib.card_basic;
   var _varname_ _order_ _cardinality_;
   title 'Cardinality summary (basic)';
run;
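As a hedged cross-check of the counts that the example above stores in `_cardinality_`, the distinct values can also be counted directly with PROC SQL. This is only a sketch that reads the built-in `sashelp.cars` table rather than the CAS copy; the column aliases (`n_make`, `n_model`, `n_type`) are illustrative names that do not appear in the original. The two results may differ for a variable that has more levels than the MAXLEVELS limit in effect.

```sas
/* Cross-check: count the distinct values of each analyzed variable */
/* n_make, n_model and n_type are illustrative aliases              */
proc sql;
   select count(distinct Make)  as n_make,
          count(distinct Model) as n_model,
          count(distinct Type)  as n_type
   from sashelp.cars;
quit;
```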
2 Code Block
PROC CARDINALITY
Explanation : This example follows the logic of the original documentation. It analyzes the 'engineSize' variable from the 'mylib.cars' dataset. The `ORDER=ASC` option is used in the VAR statement to enforce an ascending sort of 'engineSize' levels, ignoring any formats. `MAXLEVELS=5` limits the detailed display to the first 5 levels. The `outcard` and `outdetails` output tables are used to store and display the results.
/* DATA step to load sashelp.cars into a CAS library (assumes an active CAS session) */
libname mylib cas;
data mylib.cars;
   set sashelp.cars;
run;
/* Force an ascending order on 'engineSize' */
title 'Cardinality of engineSize with ascending order and MAXLEVELS';
proc cardinality data=mylib.cars outcard=mylib.card_asc outdetails=mylib.details_asc maxlevels=5;
   var engineSize / order=asc;
run;
/* Print the summary; _more_ flags variables with levels beyond MAXLEVELS */
proc print data=mylib.card_asc;
   var _varname_ _order_ _more_ _cardinality_;
   title 'Cardinality summary (ASC order)';
run;
/* Print the level details */
proc print data=mylib.details_asc;
   title 'Level details for engineSize (ASC order)';
run;
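To see which levels an ascending order on raw values would put first, the distinct values of `EngineSize` can be listed with standard Base SAS steps. The sketch below is an independent illustration, not part of PROC CARDINALITY; the output table name `work.engine_levels` is a hypothetical choice, and `obs=5` simply mirrors the `MAXLEVELS=5` setting used above.

```sas
/* Keep one row per distinct EngineSize value, sorted ascending */
/* work.engine_levels is a hypothetical scratch table           */
proc sort data=sashelp.cars(keep=EngineSize)
          out=work.engine_levels nodupkey;
   by EngineSize;
run;

/* Show the first five levels in ascending order */
proc print data=work.engine_levels(obs=5);
   title 'First five distinct EngineSize values (ascending)';
run;
```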
3 Code Block
PROC CARDINALITY Data
Explanation : This advanced example shows how to handle a categorical variable (`Status`) with a user-defined format (`$statusfmt.`). The `mylib.employees` dataset is created using `datalines` and then loaded into CAS. The CARDINALITY procedure is used with `ORDER=DESC` to sort the unique levels of 'Status' in descending order based on their raw values (A, I, S), not their displayed formats (Actif, Inactif, Suspendu).
/* Create data with a user-defined format and load it into CAS (assumes an active CAS session) */
libname mylib cas;
/* Assumption: depending on the environment, the format may also need to be added */
/* to a CAS format library (PROC FORMAT CASFMTLIB= option) so CAS can resolve it  */
proc format;
   value $statusfmt 'A' = 'Actif' 'I' = 'Inactif' 'S' = 'Suspendu';
run;
data mylib.employees;
   input EmployeeID Status $ Salary;
   format Status $statusfmt.;
   datalines;
101 A 50000
102 I 60000
103 A 55000
104 S 70000
105 A 48000
106 I 62000
107 S 75000
108 A 51000
109 I 58000
110 A 53000
;
run;
/* Cardinality analysis with descending order for the Status variable */
title 'Cardinality of Status in descending order of raw values';
proc cardinality data=mylib.employees outcard=mylib.card_desc outdetails=mylib.details_desc maxlevels=3;
   var Status / order=desc;
run;
proc print data=mylib.card_desc;
   var _varname_ _order_ _cardinality_;
   title 'Cardinality summary (DESC order)';
run;
proc print data=mylib.details_desc;
   title 'Level details for Status (DESC order)';
run;
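The distinction between raw and formatted values can be made visible with a small side-by-side listing. The sketch below is a minimal illustration under stated assumptions: it redefines the same `$statusfmt.` format, uses a hypothetical three-row table `work.status_demo` in the WORK library rather than the CAS table, and sorts by the raw value in descending order. Note that with these particular labels the raw values (A, I, S) and the formatted labels happen to sort in the same alphabetical order, so a format whose labels sort differently would be needed to observe a difference between the two orderings.

```sas
/* Same format as in the example above */
proc format;
   value $statusfmt 'A' = 'Actif' 'I' = 'Inactif' 'S' = 'Suspendu';
run;

/* Hypothetical three-row table, one row per status code */
data work.status_demo;
   input EmployeeID Status $;
   datalines;
101 A
102 I
104 S
;
run;

/* List raw and formatted values side by side, sorted by raw value descending */
proc sql;
   select distinct Status as raw_value,
          put(Status, $statusfmt.) as formatted_value
   from work.status_demo
   order by raw_value desc;
quit;
```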
4 Code Block
PROC CARDINALITY
Explanation : This example aims to show more advanced use of `PROC CARDINALITY` in a CAS environment. It is written with `CLASS` and `GROUP BY` statements intended to calculate the cardinality of the `Type` variable separately for each `Origin`, and with `ORDER=FREQ` intended to sort the levels of `Type` by their frequency of occurrence. `GROUP BY` is normally PROC SQL syntax, so check the PROC CARDINALITY syntax reference for your release to confirm that these statements and `ORDER=FREQ` are accepted. `MAXLEVELS=2` is deliberately low to show how the `_MORE_` column flags unreported levels when a group has more unique levels than are displayed. On larger tables, the same pattern is meant to help identify high-cardinality variables per group.
/* Make sure sashelp.cars is loaded into CAS (assumes an active CAS session) */
libname mylib cas;
data mylib.cars;
   set sashelp.cars;
run;
/* Grouped cardinality analysis with few levels displayed */
/* Verify that your release of PROC CARDINALITY accepts CLASS, GROUP BY, and ORDER=FREQ; */
/* GROUP BY is normally PROC SQL syntax */
title 'Grouped cardinality with few levels displayed';
proc cardinality data=mylib.cars outcard=mylib.card_grouped outdetails=mylib.details_grouped maxlevels=2;
   var Type / order=freq;
   class Origin;
   group by Origin;
run;
proc print data=mylib.card_grouped;
   var Origin _varname_ _order_ _cardinality_;
   title 'Grouped cardinality summary';
run;
proc print data=mylib.details_grouped;
   where _varname_ = 'Type';
   title 'Level details for Type (grouped by Origin)';
run;
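The per-group figures described above can also be approximated with PROC SQL, where `GROUP BY` is standard syntax. The following is a hedged sketch that works on the built-in `sashelp.cars` table and does not rely on PROC CARDINALITY; the aliases `n_type_levels` and `n_rows` are illustrative names. The first query counts the distinct `Type` levels per `Origin`, and the second lists each level by descending frequency within its group.

```sas
/* Distinct Type levels per Origin (n_type_levels is an illustrative alias) */
proc sql;
   select Origin, count(distinct Type) as n_type_levels
   from sashelp.cars
   group by Origin;
quit;

/* Frequency of each Type within Origin, most frequent first */
proc sql;
   select Origin, Type, count(*) as n_rows
   from sashelp.cars
   group by Origin, Type
   order by Origin, n_rows desc;
quit;
```

Comparing `n_type_levels` with a `MAXLEVELS` value shows which groups would have unreported levels.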
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
« In the SAS Viya environment, PROC CARDINALITY is more than a simple counting tool; it is a high-performance data profiling engine designed to handle massive distributed datasets. Understanding how to manipulate the ORDER= and MAXLEVELS options is essential for preparing data for machine learning or complex reporting. »
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.