langModel

lmImport

Description

The lmImport action imports an n-gram language model from a CAS table into a binary format that is optimized for speech-to-text processing. This action is a preparatory step to make a language model usable by other actions, such as `lmDecode`. It can also apply a label mapping during the import process, which is useful for aligning the vocabulary of the language model with the character set of an acoustic model.

proc cas;
   langModel.lmImport /
      table={name='myLanguageModel'},
      casOut={name='myImportedLm', replace=true},
      labelMapTable={name='myLabelMapTable'};
run;

Settings

| Parameter | Description |
| --- | --- |
| table | Specifies the input CAS table that contains the n-gram language model. This table should include columns for the n-grams, their log probabilities, and backoff weights. |
| casOut | Specifies the output CAS table where the imported language model will be stored in a binary format. |
| labelMapTable | Specifies an optional input CAS table that maps the labels (words) from the source language model to a new set of labels. This is useful for aligning with the vocabulary of a speech-to-text system. |

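
The log probabilities and backoff weights expected in the `table` input are conventionally combined as in ARPA-style backoff models. The source does not state that `lmImport` follows exactly this convention, so the following is only a sketch under that assumption. The symbols $\ell(\cdot)$ (listed log probability) and $\beta(\cdot)$ (listed backoff weight) are introduced here for illustration.

```latex
\log P(w_2 \mid w_1) =
\begin{cases}
  \ell(w_1\, w_2) & \text{if the bigram } w_1\, w_2 \text{ is listed in the table,} \\
  \beta(w_1) + \ell(w_2) & \text{otherwise.}
\end{cases}
```

For instance, with the sample values used in the Data Preparation section (unigram `a` at -0.5, unigram `b` with backoff weight -0.2, and no entry for the bigram `b a`), the score of `a` following `b` would be -0.2 + (-0.5) = -0.7 under this assumption.
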
Data Preparation
Creating Input Data: n-Gram Language Model and Label Map Tables

This SAS code demonstrates how to create a sample n-gram language model table (`myLanguageModel`) and a label map table (`myLabelMapTable`). The n-gram table must contain the n-gram term, its log probability, and its backoff weight. The label map table provides a mapping from one set of labels to another.

/* Assumes the libref PUBLIC is already assigned to a caslib through the CAS LIBNAME engine. */
/* n-gram table: term, log probability, backoff weight */
DATA public.myLanguageModel(promote=yes);
   LENGTH _NGRAM_ $50;
   INFILE DATALINES dlm='|';
   INPUT _NGRAM_ $ _LOGPROB_ _BACKOFF_;
   DATALINES;
a|-0.5|-0.1
b|-0.6|-0.2
a b|-1.2|-0.3
;
RUN;

/* Label map table: source label, target label */
DATA public.myLabelMapTable(promote=yes);
   LENGTH _FROM_LABEL_ $50 _TO_LABEL_ $50;
   INFILE DATALINES dlm='|';
   INPUT _FROM_LABEL_ $ _TO_LABEL_ $;
   DATALINES;
a|ah
b|bee
;
RUN;
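
Before running the import, it can help to confirm that both tables were created with the expected columns. The following is a minimal sketch using the standard `table.columnInfo` and `table.fetch` actions. It assumes that the `public` libref points to a caslib named `Public`, which the code above does not confirm.

```sas
proc cas;
   /* Assumption: the libref PUBLIC maps to the caslib named 'Public' */
   table.columnInfo / table={caslib='Public', name='myLanguageModel'};
   table.fetch / table={caslib='Public', name='myLanguageModel'}, to=5;

   table.columnInfo / table={caslib='Public', name='myLabelMapTable'};
   table.fetch / table={caslib='Public', name='myLabelMapTable'}, to=5;
run;
```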

Examples

This example shows the basic usage of the `lmImport` action. It takes an n-gram model stored in a CAS table and converts it into the required binary format in an output table.

SAS® / CAS Code (awaiting community validation)

PROC CAS;
   langModel.lmImport /
      table={name='myLanguageModel'},
      casOut={name='myImportedLm', replace=true};
RUN;

Result:
The action imports the language model from the 'myLanguageModel' table and creates a new CAS table named 'myImportedLm'. This output table contains the model in a binary format ready for use in speech-to-text tasks.
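
To check that the output table was produced, one option is the standard `table.tableInfo` action. This sketch assumes `myImportedLm` was written to the active caslib, since the example's `casOut` names no caslib.

```sas
proc cas;
   /* Reports row and column counts for the imported model table */
   table.tableInfo / name='myImportedLm';
run;
```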

This example demonstrates how to import an n-gram language model while applying a label mapping. The `labelMapTable` parameter is used to specify a table that maps the original words in the model to new labels. This is particularly useful for ensuring consistency between the language model's vocabulary and the acoustic model's characters or phonemes.

SAS® / CAS Code (awaiting community validation)

PROC CAS;
   langModel.lmImport /
      table={name='myLanguageModel'},
      labelMapTable={name='myLabelMapTable'},
      casOut={name='myImportedLmWithMap', replace=true};
RUN;

Result:
The language model from 'myLanguageModel' is imported. During the import process, the labels 'a' and 'b' are mapped to 'ah' and 'bee' respectively, as defined in 'myLabelMapTable'. The resulting re-mapped model is stored in the 'myImportedLmWithMap' CAS table.
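
Because the input tables in the Data Preparation step are promoted to global scope, they remain available to other sessions until they are dropped. A possible cleanup is sketched here with the standard `table.dropTable` action, assuming the input caslib is named `Public` and the output tables were written to the active caslib.

```sas
proc cas;
   /* quiet=true suppresses the error if a table no longer exists */
   table.dropTable / caslib='Public', name='myLanguageModel', quiet=true;
   table.dropTable / caslib='Public', name='myLabelMapTable', quiet=true;

   /* Output tables from the two examples, assumed to be in the active caslib */
   table.dropTable / name='myImportedLm', quiet=true;
   table.dropTable / name='myImportedLmWithMap', quiet=true;
run;
```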

FAQ

What does the lmImport action do?
What is the purpose of the 'table' parameter in the lmImport action?
What does the 'casOut' parameter do?
What is the 'labelMapTable' parameter used for?