The `lmImport` action imports an n-gram language model from a CAS table and converts it to a binary format optimized for speech-to-text processing. It is a preparatory step: other actions, such as `lmDecode`, consume the imported model. The action can also apply a label mapping during import, which helps align the language model's vocabulary with the character set of an acoustic model.
| Parameter | Description |
|---|---|
| `table` | Specifies the input CAS table that contains the n-gram language model. This table should include columns for the n-grams, their log probabilities, and their backoff weights. |
| `casOut` | Specifies the output CAS table where the imported language model is stored in binary format. |
| `labelMapTable` | Specifies an optional input CAS table that maps the labels (words) in the source language model to a new set of labels. This helps align the model's vocabulary with that of a speech-to-text system. |
This SAS code demonstrates how to create a sample n-gram language model table (`myLanguageModel`) and a label map table (`myLabelMapTable`). The n-gram table must contain the n-gram term, its log probability, and its backoff weight. The label map table provides a mapping from one set of labels to another.
```sas
/* Assumes the libref "public" is already assigned to a CAS caslib. */

/* Sample n-gram table: term, log probability, backoff weight. */
DATA public.myLanguageModel(promote=true);
    LENGTH _NGRAM_ $50;
    INFILE DATALINES dlm='|';
    INPUT _NGRAM_ $ _LOGPROB_ _BACKOFF_;
    DATALINES;
a|-0.5|-0.1
b|-0.6|-0.2
a b|-1.2|-0.3
;
RUN;

/* Label map: source label on the left, replacement label on the right. */
DATA public.myLabelMapTable(promote=true);
    LENGTH _FROM_LABEL_ $50 _TO_LABEL_ $50;
    INFILE DATALINES dlm='|';
    INPUT _FROM_LABEL_ $ _TO_LABEL_ $;
    DATALINES;
a|ah
b|bee
;
RUN;
```
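Before importing, it can help to confirm that both tables loaded as intended. A minimal check, assuming the same `public` libref used above is still assigned, might look like this:

```sas
/* List the columns and their types for the n-gram table. */
PROC CONTENTS DATA=public.myLanguageModel;
RUN;

/* Print the rows of both tables. CAS does not guarantee row order, */
/* so the rows may not appear in the order they were entered.       */
PROC PRINT DATA=public.myLanguageModel;
RUN;

PROC PRINT DATA=public.myLabelMapTable;
RUN;
```

The n-gram table should show one character column (`_NGRAM_`) and two numeric columns (`_LOGPROB_`, `_BACKOFF_`), and the bigram `a b` should appear as a single value rather than being split at the space.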
This example shows the basic usage of the `lmImport` action. It takes an n-gram model stored in a CAS table and converts it into the required binary format in an output table.
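```sas
PROC CAS;
    /* Load the action set in case it is not already loaded in this session. */
    loadActionSet 'langModel';

    /* Add caslib= to the table specification if the input table */
    /* is not in the active caslib.                              */
    langModel.lmImport /
        TABLE={name='myLanguageModel'},
        casOut={name='myImportedLm', replace=true};
RUN;
```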
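One way to confirm that the import produced an output table is to query it with the general-purpose `table` action set. The sketch below assumes `myImportedLm` was written to the active caslib. The result variable `r` is a name chosen here for illustration.

```sas
PROC CAS;
    /* tableExists returns exists=0 when the table is not found. */
    table.tableExists result=r / name='myImportedLm';

    IF r.exists > 0 THEN DO;
        /* Show the row count, column count, and other table metadata. */
        table.tableInfo / name='myImportedLm';
    END;
    ELSE DO;
        PRINT 'myImportedLm was not found in the active caslib.';
    END;
RUN;
```

Because the imported model is stored in a binary format, the table metadata is likely to be more informative than printing its rows.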
This example demonstrates how to import an n-gram language model while applying a label mapping. The `labelMapTable` parameter is used to specify a table that maps the original words in the model to new labels. This is particularly useful for ensuring consistency between the language model's vocabulary and the acoustic model's characters or phonemes.
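```sas
PROC CAS;
    /* Load the action set in case it is not already loaded in this session. */
    loadActionSet 'langModel';

    /* labelMapTable supplies the label mapping to apply during import. */
    langModel.lmImport /
        TABLE={name='myLanguageModel'},
        labelMapTable={name='myLabelMapTable'},
        casOut={name='myImportedLmWithMap', replace=true};
RUN;
```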
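To make the effect of a label map concrete, the sketch below relabels the sample n-grams token by token in an ordinary DATA step. It only illustrates the idea and is not how `lmImport` is implemented. The `work` table names, the `mapped` column, and the assumption that the mapping is applied to each space-separated token are all introduced here for the example.

```sas
/* Local copies of the sample data so the sketch runs without CAS. */
DATA work.ngrams;
    LENGTH _NGRAM_ $50;
    INFILE DATALINES dlm='|';
    INPUT _NGRAM_ $ _LOGPROB_ _BACKOFF_;
    DATALINES;
a|-0.5|-0.1
b|-0.6|-0.2
a b|-1.2|-0.3
;
RUN;

DATA work.labelmap;
    LENGTH _FROM_LABEL_ $50 _TO_LABEL_ $50;
    INFILE DATALINES dlm='|';
    INPUT _FROM_LABEL_ $ _TO_LABEL_ $;
    DATALINES;
a|ah
b|bee
;
RUN;

/* Replace each token that has an entry in the label map. */
DATA work.relabeled;
    LENGTH _FROM_LABEL_ $50 _TO_LABEL_ $50 token $50 mapped $200;
    IF _N_ = 1 THEN DO;
        /* Load the label map into an in-memory hash keyed by source label. */
        DECLARE HASH m(dataset: 'work.labelmap');
        m.defineKey('_FROM_LABEL_');
        m.defineData('_TO_LABEL_');
        m.defineDone();
        CALL MISSING(_FROM_LABEL_, _TO_LABEL_);
    END;
    SET work.ngrams;
    mapped = ' ';
    DO i = 1 TO COUNTW(_NGRAM_, ' ');
        token = SCAN(_NGRAM_, i, ' ');
        /* find() returns 0 on a hit and fills _TO_LABEL_; */
        /* unmapped tokens are kept unchanged.             */
        IF m.find(key: token) = 0 THEN token = _TO_LABEL_;
        mapped = CATX(' ', mapped, token);
    END;
    KEEP _NGRAM_ mapped _LOGPROB_ _BACKOFF_;
RUN;
```

Under these assumptions, the bigram `a b` is relabeled as `ah bee`, while its log probability and backoff weight are carried over unchanged.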