Test Scenario & Use Case
Generate a large volume of simulated reviews with a loop.
/* Build a CAS table of 100,000 simulated reviews, one row per document. */
DATA casuser.large_reviews;
   LENGTH _doc_id_ $20 _text_ $500;
   DO i=1 TO 100000;
      /* Document identifier: REV_1, REV_2, ... */
      _doc_id_=cat('REV_', i);
      /* Sample Spanish review: "This product is excellent, delivery was fast but the packaging was damaged." */
      _text_='Este producto es excelente, la entrega fue rápida pero el embalaje estaba dañado.';
      OUTPUT;
   END;
RUN;
/* Load the five CRF model tables from CSV files in the active caslib. */
PROC CAS;
   TABLE.loadTable / path='sentiment_label.csv' casOut={name='s_label', replace=true};
   TABLE.loadTable / path='sentiment_attr.csv' casOut={name='s_attr', replace=true};
   TABLE.loadTable / path='sentiment_feature.csv' casOut={name='s_feature', replace=true};
   TABLE.loadTable / path='sentiment_attrfeature.csv' casOut={name='s_attrfeature', replace=true};
   TABLE.loadTable / path='sentiment_template.csv' casOut={name='s_template', replace=true};
RUN;
/* Score the simulated reviews with the loaded CRF model tables. */
PROC CAS;
   conditionalRandomFields.crfScore /
      table={name='large_reviews'}
      model={label={name='s_label'}, attr={name='s_attr'}, feature={name='s_feature'},
             attrfeature={name='s_attrfeature'}, template={name='s_template'}}
      casOut={name='reviews_scored', replace=true}
      target='ner_tag'
      threadBlockSize=32768;
RUN;
Expected result: the process completes without memory errors. The 'reviews_scored' table contains 100,000 processed documents, with the entities correctly labeled in the 'ner_tag' variable.
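One way to check this result is to count the rows in the scored table and preview a few of them. The following is a minimal sketch, not part of the original scenario. It assumes the CAS session from the previous steps is still active and that 'reviews_scored' is in the active caslib. The row count and layout of the output depend on how crfScore writes its results, so treat the numbers as something to inspect rather than a guaranteed match.

```sas
/* Sketch only: assumes an active CAS session in which the   */
/* table 'reviews_scored' exists in the active caslib.       */
PROC CAS;
   /* Report the number of rows written by the scoring step. */
   table.recordCount / table={name='reviews_scored'};

   /* Preview the first five rows, including the 'ner_tag' column. */
   table.fetch / table={name='reviews_scored'} to=5;
RUN;
QUIT;
```

If the reported count or the 'ner_tag' values differ from the expected result, the SAS log for the scoring step is the first place to look.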