langModel calculateErrorRate

High-Volume Customer Support Analytics

Test Scenario & Use Case

Business Context

A telecommunications company processes 10,000 customer support calls per hour. The data science team needs to ensure the `calculateErrorRate` action can handle batch processing of high-volume short utterances without performance degradation or memory errors.

About the Set: langModel

Management of Large Language Models (LLM) and NLP.

Data Preparation

Programmatically generate 10,000 rows of synthetic call logs. The hypothesis (predicted) transcripts are identical to the reference transcripts in 90% of rows, simulating a high-performing model. Every tenth row of the hypothesis table carries an injected transcription error, which accounts for the remaining 10%.

/* Reference (ground-truth) transcripts */
DATA mycas.call_center_truth;
  LENGTH call_id $20 transcript $100;
  DO i=1 TO 10000;
    call_id=cats('CALL_', i);
    transcript='Customer requests cancellation of service plan A';
    OUTPUT;
  END;
RUN;

/* Hypothesis (model output): every 10th row contains an error */
DATA mycas.call_center_pred;
  LENGTH call_id $20 transcript $100;
  DO i=1 TO 10000;
    call_id=cats('CALL_', i);
    IF mod(i, 10) = 0 THEN transcript='Customer request cancel service plan A';
    ELSE transcript='Customer requests cancellation of service plan A';
    OUTPUT;
  END;
RUN;
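Before running the action, it can help to confirm that the synthetic data really contains the intended share of mismatches. The following is a minimal sketch, assuming the `mycas` libref from the DATA steps above is still assigned and that both tables can be read through it by PROC SQL. The column aliases `n_rows`, `n_mismatch`, and `share_mismatch` are illustrative names, not part of the action's output.

```sas
/* Sanity check on the generated data (sketch, not part of the action). */
/* Assumption: the mycas libref points at the caslib holding both tables. */
PROC SQL;
  SELECT count(*)                           AS n_rows,
         sum(t.transcript ne p.transcript)  AS n_mismatch,
         mean(t.transcript ne p.transcript) AS share_mismatch FORMAT=percent8.1
  FROM mycas.call_center_truth AS t
       INNER JOIN mycas.call_center_pred AS p
       ON t.call_id = p.call_id;
QUIT;
```

With the data generated as described, the query should report 10,000 joined rows, of which 1,000 differ.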

Implementation Steps

1
Ensure tables are loaded into memory.
PROC CAS;
  /* Check both the reference and the hypothesis table */
  TABLE.tableDetails / name='call_center_truth';
  TABLE.tableDetails / name='call_center_pred';
RUN;
QUIT;
2
Run the calculation on the full 10,000-row dataset, relying on the action's default column assumptions, since both tables use the same column names.
PROC CAS;
  langModel.calculateErrorRate / TABLE={name='call_center_pred'} reference={name='call_center_truth'};
RUN;
QUIT;
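To put a number on "reasonable execution time", the action call can be wrapped in a simple wall-clock timer. This is a rough sketch: the macro variable `t0` is a hypothetical name, the action call is copied unchanged from step 2, and the measured time includes client-side overhead as well as the CAS computation itself.

```sas
/* Record the start time (seconds since 1960-01-01, with fractions) */
%LET t0 = %SYSFUNC(datetime());

PROC CAS;
  langModel.calculateErrorRate / TABLE={name='call_center_pred'} reference={name='call_center_truth'};
RUN;
QUIT;

/* Write the elapsed wall-clock time to the SAS log */
%PUT NOTE: calculateErrorRate elapsed: %SYSEVALF(%SYSFUNC(datetime()) - &t0) seconds;
```

Running it a few times and comparing the log notes gives a baseline against which later, larger batches can be judged.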

Expected Result


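The action completes within a reasonable execution time, with no system timeouts or memory allocation errors. Because every tenth row of the hypothesis table was altered, the aggregate report should show a non-zero word error rate (WER) for exactly 1,000 of the 10,000 rows (10%) and a WER of zero for the remaining 9,000 rows.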
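For reference, the WER of the injected error can be worked out independently of the action. The sketch below computes a word-level edit distance for the one reference/hypothesis pair used in the test data, in plain DATA step code. It assumes whitespace tokenization and case-sensitive comparison; how `calculateErrorRate` tokenizes or normalizes text internally is not stated here, so its figures may differ. The table name `work.wer_check` and all variable names are illustrative.

```sas
/* Word-level edit distance for one reference/hypothesis pair (sketch). */
DATA work.wer_check;
  LENGTH ref hyp $100;
  ref = 'Customer requests cancellation of service plan A';
  hyp = 'Customer request cancel service plan A';

  n_ref = countw(ref, ' ');   /* words in the reference  */
  n_hyp = countw(hyp, ' ');   /* words in the hypothesis */

  /* d[i,j] = edits to turn the first i reference words into the first j hypothesis words */
  ARRAY d[0:20, 0:20] _TEMPORARY_;
  DO i = 0 TO n_ref; d[i, 0] = i; END;
  DO j = 0 TO n_hyp; d[0, j] = j; END;

  DO i = 1 TO n_ref;
    DO j = 1 TO n_hyp;
      cost = (scan(ref, i, ' ') ne scan(hyp, j, ' '));  /* 0 if the words match */
      d[i, j] = min(d[i-1, j] + 1,          /* deletion              */
                    d[i, j-1] + 1,          /* insertion             */
                    d[i-1, j-1] + cost);    /* substitution or match */
    END;
  END;

  edits = d[n_ref, n_hyp];
  wer   = edits / n_ref;      /* WER = edits / reference word count */
  PUT edits= wer=;
  KEEP ref hyp n_ref n_hyp edits wer;
RUN;
```

Under these assumptions the altered rows need 3 edits (two substitutions and one deletion) against 7 reference words, so each has a WER of 3/7, roughly 0.43. If the action reports a corpus-level rate, the same assumptions give 3,000 edits over 70,000 reference words, roughly 4.3%.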