Scénario de test & Cas d'usage
Active learning to optimize data labeling.
Discover all actions of activeLearnCreates two tables with non-perfectly overlapping keys. 'customer_list' has customers 1-5. 'subscription_list' has subscriptions for customers 4, 5, and 6 (an orphan record).
| 1 | DATA casuser.customer_list; |
| 2 | LENGTH cust_id $ 5; |
| 3 | DO i = 1 TO 5; |
| 4 | cust_id = 'C' || LEFT(PUT(i, 4.)); |
| 5 | region = CHOOSE(i, 'NORTH', 'SOUTH', 'EAST', 'WEST', 'NORTH'); |
| 6 | OUTPUT; |
| 7 | END; |
| 8 | RUN; |
| 9 | |
| 10 | DATA casuser.subscription_list; |
| 11 | LENGTH cust_id $ 5 sub_type $ 10; |
| 12 | cust_id = 'C4'; sub_type = 'PREMIUM'; OUTPUT; |
| 13 | cust_id = 'C5'; sub_type = 'BASIC'; OUTPUT; |
| 14 | cust_id = 'C6'; sub_type = 'PREMIUM'; OUTPUT; |
| 15 | RUN; |
| 1 | PROC CAS; |
| 2 | ACTION activeLearn.alJoin / |
| 3 | TABLE={name='customer_list'}, |
| 4 | annotatedTable={name='subscription_list'}, |
| 5 | id='cust_id', |
| 6 | joinType='FULL', |
| 7 | casOut={name='reconciliation_view', replace=true}; |
| 8 | RUN; |
| 9 | QUIT; |
| 1 | PROC CAS; |
| 2 | TABLE.fetch / TABLE={name='reconciliation_view'}; |
| 3 | TABLE.rowCount / TABLE={name='reconciliation_view'}; |
| 4 | RUN; |
| 5 | QUIT; |
The output table 'reconciliation_view' must contain 6 rows in total. Customers C1, C2, C3 will have missing values for 'sub_type'. Customers C4 and C5 will have complete data. Customer C6 will have a value for 'sub_type' but missing values for 'region'. This confirms the FULL join works as expected, retaining all records and highlighting data integrity issues.