High-Volume FBank Computation with Context for Noise Monitoring

Business Context

A smart city project deploys thousands of sensors to monitor urban noise pollution. The objective is to classify sound events (sirens, drilling, traffic) using a Deep Neural Network. The model performs better with Filter Bank (FBank) features that include temporal context (frames before/after) and are standardized to handle varying volume levels.

Data Preparation

Simulate a high-volume sensor table, 'SENSOR_STREAM', with one row per recording location (1000 locations).


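/* Assumes the CASUSER libref is already assigned to the CAS engine. */
DATA casuser.sensor_stream;
    LENGTH sensor_loc $20 raw_sound $1000;
    DO i=1 to 1000;
        sensor_loc=cats('LOC_', i);
        /* Placeholder payload standing in for real audio bytes. */
        raw_sound='<simulated_noise_bytes>';
        OUTPUT;
    END;
RUN;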
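Before extracting features, it can help to confirm that the table landed in CAS with the expected number of rows. The following is a minimal sketch, assuming an active CAS session and that the `simple` action set is loaded; the DATA step above should produce 1000 rows.

```sas
PROC CAS;
    /* Report the row count of the simulated sensor table (1000 expected). */
    simple.numRows /
        table={name='sensor_stream', caslib='casuser'};
RUN;
QUIT;
```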

Implementation Steps

Compute FBank features as log filter-bank values, add 5 frames of context on each side, and apply standardization.


PROC CAS;
    audio.computeFeatures /
        TABLE={name='sensor_stream', caslib='casuser'}
        audioColumn='raw_sound'
        /* Log filter-bank values */
        fbankOptions={useLogFbank=true, usePower=true}
        /* 5 frames of context on each side of every frame */
        nContextFrames=5
        featureScalingMethod='STANDARDIZATION'
        casOut={name='noise_features', caslib='casuser', replace=true};
RUN;
QUIT;

Expected Result

The 'noise_features' table contains FBank features where each frame vector includes data from the 5 preceding and 5 succeeding frames. Values are standardized (mean 0, variance 1) across frames.
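One way to check this result is to inspect the output table's columns and summary statistics. The sketch below assumes an active CAS session and uses the general-purpose `table.columnInfo` and `simple.summary` actions. It makes no assumption about the feature column names that computeFeatures generates, so it summarizes every numeric column. If standardization is applied per recording rather than over the whole table, the table-wide means and standard deviations may be only approximately 0 and 1.

```sas
PROC CAS;
    /* List the columns produced by the feature extraction step. */
    table.columnInfo /
        table={name='noise_features', caslib='casuser'};

    /* Summarize all numeric columns; means near 0 and standard
       deviations near 1 are consistent with standardization. */
    simple.summary /
        table={name='noise_features', caslib='casuser'};
RUN;
QUIT;
```

The number of columns reported by `table.columnInfo` also gives a quick indication of whether the context frames were appended to each frame vector.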

See the technical documentation for computeFeatures