The examples generate their own data in DATA steps with DATALINES, so they are self-contained and reproducible.
1 Code Block
DATA / PROC DATAMETRICS
Explanation : This example initializes a 'my_data' dataset with dummy information (name, address, city, state). Then, the PROC DATAMETRICS procedure is executed with the minimum required parameters to calculate data quality metrics for the 'name' and 'address' variables. The results are stored in the 'basic_metrics' table.
data work.my_data;
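length name $30 address $50 city $20 state $2;
/* DSD with a blank delimiter reads each quoted value, embedded spaces included, as one field and strips the quotes */
infile datalines dsd dlm=' ' truncover;
input name $ address $ city $ state $;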
datalines;
"John Doe" "123 Main St" "Anytown" "NY"
"Jane Smith" "456 Oak Ave" "Anycity" "CA"
"John Doe" "123 Main St" "Anytown" "NY"
"Peter Jones" "789 Pine Ln" "Otherville" "TX"
"Alice Brown" "101 Maple Dr" "Anytown" "NY"
"Bob White" "202 Elm St" "Otherville" "TX"
"Charlie Green" "303 Cedar Rd" "Anycity" "CA"
"David Black" "404 Birch Ct" "Anytown" "NY"
;
run;
proc datametrics data=work.my_data out=work.basic_metrics;
variables name address;
run;
proc print data=work.basic_metrics;
title "Basic Metrics for Name and Address";
run;
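The sample rows deliberately contain a repeated record, which is the kind of thing a data quality profile is meant to surface. As a rough manual check, separate from PROC DATAMETRICS, the sketch below compares the total row count with the number of distinct names using Base SAS PROC SQL. It assumes work.my_data from the step above already exists; the column aliases n_rows and n_distinct_names are illustrative names, not part of the original example.

```sas
/* Manual cross-check: total rows versus distinct values of name */
/* Assumes work.my_data was created by the DATA step above       */
proc sql;
  select count(*)             as n_rows,
         count(distinct name) as n_distinct_names
  from work.my_data;
quit;
```

If the two numbers differ, at least one name occurs more than once in the input.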
2 Code Block
Explanation : Building on the dataset from the previous example, this example adds common options: 'frequencies=10' to report the 10 most frequent values, 'minmax=5' to report the 5 smallest and 5 largest values, and 'median' to calculate the median. The 'identities' statement points to a Quality Knowledge Base (QKB) with the 'ENUSA' locale and the 'Field Content' definition to add identification analysis.
/* Make sure work.my_data has already been created in Example 1 */
proc datametrics data=work.my_data out=work.common_metrics frequencies=10
minmax=5 median;
identities qkb='/sas/dqc/QKBLoc' locale='ENUSA' def='Field Content';
variables name address city;
run;
proc print data=work.common_metrics;
title "Metrics with Frequencies, Min/Max, Median, and QKB";
run;
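To get a feel for what a "most frequent values" listing contains, the counts can be reproduced by hand with Base SAS. This is a minimal sketch, assuming work.my_data from Example 1 exists; it uses PROC FREQ rather than PROC DATAMETRICS, so the layout of its output will not match the 'common_metrics' table. The choice of 'city' as the variable is only for illustration.

```sas
/* Manual cross-check of value frequencies, most frequent first */
/* Assumes work.my_data was created in Example 1                */
proc freq data=work.my_data order=freq;
  tables city / nocum nopercent;
  title "City Value Counts (Manual Cross-Check)";
run;
title; /* clear the title so it does not carry over to later steps */
```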
3 Code Block
PROC FORMAT / DATA / PROC DATAMETRICS
Explanation : This example defines a custom format for the 'state' variable, then attaches it to that variable in a new dataset, 'formatted_data'. PROC DATAMETRICS is then run on this formatted table with 'frequencies=20' and 'minmax=10', plus 'threads=4' to request parallel processing. The 'format' option is also passed on the PROC statement. The 'multiidentity' option in the 'identities' statement allows multiple data quality identities to be analyzed for the specified variables.
/* Make sure work.my_data has already been created in Example 1 */
proc format;
value $statefmt
'NY'='New York'
'CA'='California'
'TX'='Texas'
other='Other';
run;
data work.formatted_data;
set work.my_data;
format state $statefmt.;
run;
proc datametrics data=work.formatted_data out=work.advanced_metrics
frequencies=20 minmax=10 threads=4 format;
identities qkb='/sas/dqc/QKBLoc' locale='ENUSA'
def='Field Content' multiidentity;
variables name address city state;
run;
proc print data=work.advanced_metrics;
title "Advanced Metrics with Formats, Threads, and Multi-Identities";
run;
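Before profiling a formatted table, it can help to confirm that the format is actually attached. The sketch below is a simple check in Base SAS, assuming work.formatted_data and the $statefmt format from the code above exist. PROC FREQ groups and displays values by their formatted labels, so the state codes should appear under the labels defined in PROC FORMAT.

```sas
/* Check that $statefmt. is attached: PROC FREQ reports formatted values */
/* Assumes work.formatted_data and $statefmt were created above          */
proc freq data=work.formatted_data;
  tables state / nocum;
  title "State Values as Displayed Through $statefmt.";
run;
title; /* clear the title so it does not carry over to later steps */
```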
4 Code Block
CASLIB / PROC CASUTIL / PROC DATAMETRICS
Explanation : This example demonstrates integration with the SAS Viya Cloud Analytic Services (CAS) environment. The 'caslib _all_ assign' statement assigns librefs for the available caslibs so that 'casuser' can be referenced as a library. The 'my_data' dataset is then loaded into a CAS library ('casuser.my_cas_data') using PROC CASUTIL, and PROC DATAMETRICS is executed directly on the in-memory CAS table with 'frequencies=5', 'minmax=3', and 'threads=2'. The results are also stored in a CAS table.
/* Make sure work.my_data has already been created in Example 1 */
/* Start a CAS session if one is not already active; the CASLIB statement needs one */
cas;
caslib _all_ assign;
proc casutil;
load data=work.my_data outcaslib='casuser' casout='my_cas_data' replace;
run;
proc datametrics data=casuser.my_cas_data out=casuser.cas_metrics
frequencies=5 minmax=3 threads=2;
identities qkb='/sas/dqc/QKBLoc' locale='ENUSA' def='Field Content';
variables name address city;
run;
proc print data=casuser.cas_metrics;
title "Metrics via DATAMETRICS on CAS";
run;
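After running the CAS example, it may be useful to confirm which tables are in memory and to clean up afterwards. The following is a minimal sketch using PROC CASUTIL, assuming an active CAS session and the 'casuser' caslib and table names from the code above. The cleanup statements are optional and are not part of the original example.

```sas
/* Inspect and optionally clean up the in-memory tables in casuser */
/* Assumes an active CAS session and the tables created above      */
proc casutil;
  /* List the tables currently loaded in the casuser caslib */
  list tables incaslib='casuser';
  /* Optional cleanup: QUIET suppresses errors if a table is absent */
  droptable casdata='my_cas_data' incaslib='casuser' quiet;
  droptable casdata='cas_metrics' incaslib='casuser' quiet;
quit;
```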
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.