Test Scenario & Use Case
Create a dataset of tokenized customer reviews with associated features (POS tag, capitalization) and BIO target labels (B-PROD, B-BRAND, I-BRAND, O).
    /* Comma-delimited rows: _start_, _end_, _token_, feature_pos, feature_cap, label */
    DATA casuser.retail_reviews;
       LENGTH _token_ $20 feature_pos $5 feature_cap $5 label $10;
       /* DLM=',' is required because the data lines are comma-separated */
       INFILE DATALINES DLM=',';
       INPUT _start_ $ _end_ $ _token_ $ feature_pos $ feature_cap $ label $;
       DATALINES;
    BEGIN,WORD,Great,ADJ,Cap,O
    WORD,WORD,running,VERB,Low,O
    WORD,END,shoes,NOUN,Low,B-PROD
    BEGIN,WORD,I,PRON,Cap,O
    WORD,WORD,love,VERB,Low,O
    WORD,WORD,my,PRON,Low,O
    WORD,WORD,Nike,NOUN,Cap,B-BRAND
    WORD,END,Air,NOUN,Cap,I-BRAND
    ;
    RUN;

    PROC CAS;
       /* Make sure the action set is available; harmless if already loaded */
       loadactionset "conditionalRandomFields";
       /* Train the CRF; each model= entry names one output table */
       conditionalRandomFields.crfTrain /
          table={name='retail_reviews', caslib='casuser'}
          target='label'
          template='U00:%x[0,0]'
          model={label={name='retail_labels'},
                 attr={name='retail_attrs'},
                 feature={name='retail_feats'},
                 attrfeature={name='retail_attrfeats'},
                 template={name='retail_tpl'}};
    RUN;
    QUIT;

The action should train the model successfully and generate the five output tables named in the model parameter (retail_labels, retail_attrs, retail_feats, retail_attrfeats, retail_tpl), capturing the relationship between capitalized tokens and the brand labels (B-BRAND, I-BRAND).
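
One way to confirm the expected result is to check that each output table exists after training. The following is a minimal sketch, not part of the original scenario. It assumes the tables were written to the casuser caslib; the model parameter does not name a caslib, so this holds only if casuser is the active caslib. The variable names outTables, tbl and r are illustrative.

```sas
proc cas;
   /* Output table names copied from the model= parameter of crfTrain */
   outTables = {"retail_labels", "retail_attrs", "retail_feats",
                "retail_attrfeats", "retail_tpl"};
   do tbl over outTables;
      /* Assumption: the tables live in casuser (the active caslib).
         table.tableExists returns exists=0 when the table is not found. */
      table.tableExists result=r / caslib="casuser" name=tbl;
      if (r.exists == 0) then
         print "MISSING: " tbl;
      else
         print "Found: " tbl;
   end;
run;
quit;
```

If a table is reported as missing, check which caslib was active when crfTrain ran before concluding that training failed.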