simple compare

Edge Case: Customer Merge with Nulls and Duplicates

Scénario de test & Cas d'usage

Business Context

A CRM system cleanup process needs to compare two lists of potential leads where data quality is poor (missing IDs, duplicate entries) to ensure robust handling of exceptions.
Data Preparation

Creation of datasets with missing keys (nulls) and duplicate entries for the same key.

Copied!
1DATA casuser.crm_list_A; LENGTH Email $20; INPUT Email $ STATUS $; DATALINES; . Active
2john@doe.com Active
3john@doe.com Pending
4; RUN; DATA casuser.crm_list_B; LENGTH Email $20; INPUT Email $ STATUS $; DATALINES; jane@doe.com New
5. Archived
6; RUN;

Étapes de réalisation

1
Run comparison including missing values and duplicates to test robustness.
Copied!
1 
2PROC CAS;
3SIMPLE.compare / TABLE={name='crm_list_A'} table2={name='crm_list_B'} inputs={{name='Email'}} includeMissing=true includeDuplicates='CASOUT' casOut={name='unique_source_A', replace=true};
4 
5RUN;
6 
7QUIT;
8 
2
Check if duplicates were preserved in the output as requested.
Copied!
1 
2PROC CAS;
3TABLE.recordCount / TABLE={name='unique_source_A'};
4 
5RUN;
6 
7QUIT;
8 

Expected Result


The comparison does not fail on null keys due to 'includeMissing=true'. The 'unique_source_A' table includes both entries for 'john@doe.com' because 'includeDuplicates' was set to 'CASOUT', verifying the system handles non-unique keys correctly.