Standard MFCC Extraction for Call Center Transcription

Business Context

A banking call center wants to automate the transcription of customer support calls so that it can analyze sentiment and intent. The speech-to-text model takes standard Mel-Frequency Cepstral Coefficients (MFCCs) as input. The data pipeline must process raw audio blobs and output a feature matrix that retains the call ID.

Data Preparation

Create a simulated table, 'CALL_LOGS', containing call IDs and raw binary audio data.

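DATA casuser.call_logs;
    LENGTH call_id $10 audio_data $2000;
    /* MISSOVER keeps one observation per data line; audio_data stays blank in this simulation */
    INFILE DATALINES MISSOVER;
    INPUT call_id $ audio_data $;
    DATALINES;
CUST_001
CUST_002
;
RUN;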

Implementation Steps

Run feature extraction with standard MFCC parameters (13 coefficients) and carry the call ID through to the output table.

PROC CAS;
    /* Load the audio action set in case it is not already loaded in this session */
    loadactionset 'audio';
    audio.computeFeatures /
        table={name='call_logs', caslib='casuser'}
        audioColumn='audio_data'
        copyVars={'call_id'}
        mfccOptions={nCeps=13}
        casOut={name='call_features', caslib='casuser', replace=true};
RUN;

Expected Result

The output table 'call_features' should contain the 'call_id' column and the computed MFCC feature vectors for each audio frame. The number of cepstral coefficients per frame should be 13.
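
One way to check this result is to inspect the output table after the action has run. The sketch below is not part of the original example. It assumes the 'call_features' table was created in the 'casuser' caslib as shown above, and it uses the general-purpose table.columnInfo and table.fetch actions. The names and layout of the feature columns are not assumed here; read them from the column listing.

```sas
PROC CAS;
    /* List the columns of the output table; call_id should appear alongside the feature columns */
    table.columnInfo /
        table={name='call_features', caslib='casuser'};

    /* Preview the first five rows to confirm that call_id was copied through */
    table.fetch /
        table={name='call_features', caslib='casuser'},
        to=5;
RUN;
```

If 'call_id' is missing from the column listing, the copyVars setting in the extraction step is the first thing to review.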

See the technical documentation for computeFeatures