Test Scenario & Use Case
Active learning to optimize data labeling.
Creates a main customer table ('customers_main') and a smaller table of campaign responders ('campaign_responders'). The join will be performed on 'customer_id'.

    /* Main table: 50 customers (CUST1 to CUST50), each with a random age from 20 to 59 */
    DATA casuser.customers_main;
      LENGTH customer_id $ 10;
      DO i = 1 TO 50;
        customer_id = 'CUST' || LEFT(PUT(i, 8.));
        age = 20 + FLOOR(RAND('UNIFORM') * 40);
        OUTPUT;
      END;
    RUN;

    /* Responder table: 5 of the 50 customers, all tagged with response_type 'CLICK' */
    DATA casuser.campaign_responders;
      LENGTH customer_id $ 10 response_type $ 8;
      DO i = 5, 15, 25, 35, 45;
        customer_id = 'CUST' || LEFT(PUT(i, 8.));
        response_type = 'CLICK';
        OUTPUT;
      END;
    RUN;

    /* List the tables in the casuser caslib to confirm that both inputs were created */
    PROC CAS;
      TABLE.tableInfo / caslib='casuser';
    RUN;
    QUIT;

    /* Left-join the responder annotations onto the main table, matching on customer_id */
    PROC CAS;
      ACTION activeLearn.alJoin /
        TABLE={name='customers_main'},
        annotatedTable={name='campaign_responders'},
        id='customer_id',
        joinType='LEFT',
        casOut={name='campaign_results', replace=true};
    RUN;
    QUIT;

    /* Preview the first 10 rows of the result, then check its row count */
    PROC CAS;
      TABLE.fetch /
        TABLE={name='campaign_results'},
        to=10;
    RUN;
      TABLE.tableInfo / name='campaign_results';
    RUN;
    QUIT;

The output table 'campaign_results' should contain exactly 50 rows, one for each customer. The 'response_type' column will be 'CLICK' for the 5 customers who responded and missing (null) for the other 45 customers. This correctly prepares the data for profiling responders vs. non-responders.
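
The expected counts above can be checked directly rather than by reading the fetched rows. The following is a minimal sketch, not part of the original scenario. It assumes that the `casuser` libref used in the DATA steps is still assigned, that `alJoin` wrote 'campaign_results' to the casuser caslib, and that the 'age' column from 'customers_main' is carried through the join. The `segment`, `n_rows`, `n_responders`, `n_customers`, and `mean_age` names are illustrative only.

```sas
/* Assumption: the casuser libref is assigned and campaign_results is visible through it. */
PROC SQL;
  /* COUNT(*) counts every row; COUNT(response_type) skips missing values. */
  SELECT COUNT(*)             AS n_rows,
         COUNT(response_type) AS n_responders
  FROM casuser.campaign_results;

  /* Illustrative profiling step: customer count and average age per segment. */
  /* Assumption: 'age' survives the join from customers_main.                 */
  SELECT CASE WHEN MISSING(response_type) THEN 'NON-RESPONDER'
              ELSE 'RESPONDER'
         END                  AS segment,
         COUNT(*)             AS n_customers,
         AVG(age)             AS mean_age
  FROM casuser.campaign_results
  GROUP BY CALCULATED segment;
QUIT;
```

If the join behaves as described, the first query should report 50 rows and 5 responders, and the second should split the customers into a group of 5 and a group of 45. The ages are randomly generated, so the mean values will differ from run to run.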