langModel calculateErrorRate

Robustness to Missing Data and ID Mismatches

Test Scenario & Use Case

Business Context

In a real-world pipeline, audio files sometimes fail to process, or metadata gets corrupted. This test simulates a 'dirty' dataset where some hypothesis IDs are missing (audio failure), some reference IDs are missing (extra predictions), and some text fields are empty.
About the Set: langModel

Management of Large Language Models (LLM) and NLP.

Data Preparation

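Two partially overlapping datasets are created: 'ID_A' exists in both with identical text, 'ID_B' exists only in the reference (Ref), 'ID_C' exists only in the hypothesis (Hyp), and 'ID_D' exists in both but has empty text in the hypothesis.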

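/* Reference transcripts; assumes the mycas CAS libref is already assigned */
DATA mycas.dirty_ref;
    LENGTH uid $5 text $50;
    /* TRUNCOVER keeps a record whose text field is empty instead of reading past it */
    INFILE DATALINES TRUNCOVER;
    INPUT uid $ text &;
    DATALINES;
ID_A The quick brown fox
ID_B Jumps over the dog
ID_D Silent audio segment
;
RUN;

/* Hypothesis transcripts; ID_D deliberately has no text */
DATA mycas.dirty_hyp;
    LENGTH uid $5 text $50;
    INFILE DATALINES TRUNCOVER;
    INPUT uid $ text &;
    DATALINES;
ID_A The quick brown fox
ID_C New unmatched sentence
ID_D
;
RUN;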
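Before running the action, it can help to confirm which IDs fall into which category. The following is a minimal sketch, not part of the original test: it assumes the `mycas` libref is assigned and that both tables were created as described above. The output table `work.id_check` and the column `status` are illustrative names chosen here.

```sas
/* Classify every uid by where it appears (illustrative check only) */
proc sql;
    create table work.id_check as
    select coalesce(r.uid, h.uid) as uid,
           case
               when r.uid is null   then 'hyp only'
               when h.uid is null   then 'ref only'
               when missing(h.text) then 'matched, empty hyp'
               else 'matched'
           end as status
    from mycas.dirty_ref as r
    full join mycas.dirty_hyp as h
        on r.uid = h.uid
    order by 1;  /* sort by the first selected column (uid) */
quit;

proc print data=work.id_check noobs;
run;
```

If the data was loaded as intended, 'ID_A' should be classified as matched, 'ID_B' as reference only, 'ID_C' as hypothesis only, and 'ID_D' as matched with an empty hypothesis.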

Implementation Steps

1
Fetch both datasets to confirm they were created as expected.
PROC CAS;
    table.fetch / table='dirty_ref';
    table.fetch / table='dirty_hyp';
RUN;
2
Run the action, which is expected to handle unmatched keys and empty strings.
PROC CAS;
    langModel.calculateErrorRate /
        table={name='dirty_hyp'}
        reference={name='dirty_ref'}
        tableId='uid'
        referenceId='uid';
RUN;

Expected Result


The action should not crash. It should calculate error rates only for the IDs present in both tables ('ID_A' and 'ID_D'). For 'ID_A', the hypothesis is identical to the reference, so no errors are expected. For 'ID_D' (empty text in the hypothesis versus content in the reference), it should report a 100% deletion error rate. Unmatched IDs ('ID_B', 'ID_C') should ideally be ignored or flagged in a log or warning, but must not stop the execution.
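As a worked check of the 100% figure, here is the arithmetic under the conventional word error rate definition. This is an assumption: the page does not state which formula the action uses. Here $S$, $D$, and $I$ are the counts of substituted, deleted, and inserted words, and $N$ is the number of words in the reference.

```latex
\mathrm{WER} = \frac{S + D + I}{N}

% ID_D: reference "Silent audio segment" (N = 3), hypothesis empty,
% so all three reference words count as deletions.
\mathrm{WER}_{\mathrm{ID\_D}} = \frac{0 + 3 + 0}{3} = 1 = 100\%

% ID_A: hypothesis identical to the reference "The quick brown fox" (N = 4).
\mathrm{WER}_{\mathrm{ID\_A}} = \frac{0 + 0 + 0}{4} = 0\%
```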