simple compare

High-Volume Call Log Synchronization Check

Test Scenario & Use Case

Business Context

A telecom operator validates that millions of call records generated by cell towers (Edge) are correctly replicated to the central Data Lake without data loss.

Data Preparation

Two DATA steps generate the simulated call logs in a loop: 10,000 records for the tower table and 9,950 for the data lake table, so the data lake is missing the last 50 CallIDs.

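/* Tower (Edge) logs: 10,000 simulated call records */
DATA casuser.tower_logs;
  DO i = 1 TO 10000;
    CallID = catx('-', 'C', i);            /* C-1 ... C-10000 */
    Duration = rand('integer', 10, 600);   /* random call duration, 10 to 600 */
    OUTPUT;
  END;
RUN;

/* Data lake logs: only the first 9,950 records */
DATA casuser.datalake_logs;
  DO i = 1 TO 9950;
    CallID = catx('-', 'C', i);            /* C-1 ... C-9950 */
    Duration = rand('integer', 10, 600);
    OUTPUT;
  END;
RUN;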
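Before comparing, it can help to confirm that both tables have the expected size. The following is a minimal sketch, assuming the `casuser` libref used above maps to the CASUSER caslib; it uses the `table.recordCount` action, and the explicit `caslib='casuser'` is an assumption added for illustration rather than something the original scenario specifies.

```sas
/* Sketch: check row counts of the two generated tables.          */
/* Assumption: both tables live in the CASUSER caslib.            */
PROC CAS;
  table.recordCount / table={name='tower_logs', caslib='casuser'};
  table.recordCount / table={name='datalake_logs', caslib='casuser'};
RUN;
QUIT;
```

If the DATA steps ran as written, the counts should be 10,000 and 9,950 respectively.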

Implementation Steps

1
Compare the two large tables using generated columns to index the differences efficiently.
PROC CAS;
  SIMPLE.compare /
    TABLE={name='tower_logs'}
    table2={name='datalake_logs'}
    inputs={{name='CallID'}}
    generatedColumns={'GROUPID', 'POSITION'}
    groupIDName='Log_ID_Group'
    casOut={name='lost_packets', replace=true};
RUN;
QUIT;
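Once the comparison has run, the output table can be inspected directly. This is a small sketch, assuming `lost_packets` was written to the CASUSER caslib (that is, CASUSER was the active caslib when the action ran); it relies only on the `table.columnInfo` and `table.fetch` actions.

```sas
/* Sketch: inspect the comparison output.                          */
/* Assumption: lost_packets was created in the CASUSER caslib.     */
PROC CAS;
  /* List the columns, including any generated ones */
  table.columnInfo / table={name='lost_packets', caslib='casuser'};
  /* Show the first 10 rows */
  table.fetch / table={name='lost_packets', caslib='casuser'} to=10;
RUN;
QUIT;
```

The column listing is a quick way to see whether the name requested through `groupIDName` appears in the output.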

Expected Result


The action completes successfully on the larger data set. The 'lost_packets' table contains exactly the 50 CallIDs present in the tower logs but missing from the data lake (C-9951 through C-10000). The generated column 'Log_ID_Group' helps index these missing records.
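The expected set of missing CallIDs can be cross-checked independently of the compare action. The sketch below is one possible approach, not part of the original scenario: it uses PROC SQL through the `casuser` libref to build an anti-join, and `work.expected_missing` is a hypothetical table name chosen for illustration.

```sas
/* Sketch: independent cross-check of the missing CallIDs.         */
/* work.expected_missing is a hypothetical name for illustration.  */
PROC SQL;
  /* CallIDs present in the tower logs but absent from the data lake */
  CREATE TABLE work.expected_missing AS
    SELECT t.CallID
    FROM casuser.tower_logs AS t
    WHERE t.CallID NOT IN
      (SELECT d.CallID FROM casuser.datalake_logs AS d);

  /* Number of missing CallIDs */
  SELECT COUNT(*) AS n_missing
    FROM work.expected_missing;
QUIT;
```

For the data generated in this scenario, `n_missing` should be 50, which can then be compared against the contents of 'lost_packets'. PROC SQL pulls the rows to the client through the libref, which is acceptable at this table size but would not be the preferred route for millions of records.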