Test Scenario & Use Case
Rule-based scoring of text documents.
Simulate a very large dataset of documents. The text does not need to be complex; only a high row count matters, because the goal is to test the action's throughput. A simple mock model table is also created.

    /* Generate 2,000,000 short documents; 1 in 100 carries the "privileged" text. */
    DATA mycas.legal_docs_large;
        LENGTH doc_uuid $ 36 doc_content $ 256;
        DO i = 1 TO 2000000;
            doc_uuid = uuidgen();
            IF mod(i, 100) = 0 THEN doc_content = 'This document discusses the confidential merger agreement and is privileged.';
            ELSE doc_content = 'Please find attached the weekly status report.';
            OUTPUT;
        END;
    RUN;

    /* Mock model table: a single row with a placeholder _mco_ value. */
    DATA mycas.legal_model;
        LENGTH _mco_ 8;   /* 8-byte numeric placeholder column */
        _mco_ = 445566;
    RUN;

    PROC CAS;
        /* Load the action set in case it is not already loaded in this session. */
        loadactionset 'textRuleScore';
        textRuleScore.applyCategory /
            table={name='legal_docs_large'},
            docId='doc_uuid',
            text='doc_content',
            model={name='legal_model'},
            casOut={name='legal_docs_categorized', replace=true};
    RUN;
    QUIT;

The action successfully processes the 2 million documents without errors and in a timely manner. A single output table, 'legal_docs_categorized', is created. The test validates the scalability and stability of the action under a heavy load.
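
The expected outcome above can be checked with a short follow-up step. The sketch below is one possible way to do it, not part of the original test. It assumes the step runs in the same CAS session and active caslib that produced the output table, and the result variable names `chk` and `cnt` are arbitrary. It confirms only that 'legal_docs_categorized' exists and reports its row count. It asserts no expected count, because the number of output rows depends on how many documents match a category.

```sas
PROC CAS;
    /* Assumption: same CAS session and active caslib as the scoring step. */
    table.tableExists result=chk / name='legal_docs_categorized';

    IF chk.exists = 0 THEN
        PRINT 'ERROR: legal_docs_categorized was not created.';
    ELSE DO;
        /* Report the size of the output table; no expected count is asserted. */
        simple.numRows result=cnt / table={name='legal_docs_categorized'};
        PRINT 'Rows in legal_docs_categorized: ' cnt.numrows;
    END;
RUN;
QUIT;
```

To back the "timely manner" criterion with a number, compare the elapsed time that the SAS log reports for the scoring step against an agreed threshold.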