Test Scenario & Use Case
Simulate a larger dataset of medical notes: the DATA step below generates 1,000 three-token sequences (3,000 observations in total) to exercise the optimizer's performance.
    DATA casuser.medical_notes;
       LENGTH _token_ $20 feature_suffix $3 label $10;
       DO i=1 TO 1000;
          /* First token of the sequence */
          _start_='BEGIN'; _end_='WORD';
          _token_='Patient'; feature_suffix='ent'; label='O';
          OUTPUT;
          /* Middle token */
          _start_='WORD'; _end_='WORD';
          _token_='shows'; feature_suffix='ows'; label='O';
          OUTPUT;
          /* Final token, labeled as the beginning of a symptom entity */
          _start_='WORD'; _end_='END';
          _token_='symptoms'; feature_suffix='oms'; label='B-SYM';
          OUTPUT;
       END;
    RUN;
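As a cross-check on the shape of the data the DATA step produces, here is a plain-Python sketch (not SAS, and not connected to CAS) that builds the same 1,000 three-token sequences in memory; the function name `build_medical_notes` is hypothetical:

```python
def build_medical_notes(n_sequences=1000):
    """Build 1000 sequences of three tokens each, mirroring the SAS DATA step."""
    # One tuple per token: (_start_, _end_, _token_, feature_suffix, label)
    tokens = [
        ("BEGIN", "WORD", "Patient",  "ent", "O"),
        ("WORD",  "WORD", "shows",    "ows", "O"),
        ("WORD",  "END",  "symptoms", "oms", "B-SYM"),
    ]
    rows = []
    for i in range(1, n_sequences + 1):
        for _start_, _end_, _token_, feature_suffix, label in tokens:
            rows.append({
                "i": i,
                "_start_": _start_,
                "_end_": _end_,
                "_token_": _token_,
                "feature_suffix": feature_suffix,
                "label": label,
            })
    return rows

rows = build_medical_notes()
print(len(rows))  # 3000 observations: 1000 sequences x 3 tokens each
```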
    PROC CAS;
       conditionalRandomFields.crfTrain
          table={name='medical_notes', caslib='casuser'}
          target='label'
          template='U00:%x[0,0]
    U01:%x[0,1]'
          nloOpts={algorithm='LBFGS',
                   optmlOpt={regL1=0.2, maxIters=100},
                   lbfgsOpt={lineSearchMethod='WOLFE'}}
          model={label={name='med_labels'},
                 attr={name='med_attrs'},
                 feature={name='med_features'},
                 attrfeature={name='med_attrfeats'},
                 template={name='med_template'}};
    RUN;
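For readers unfamiliar with the template syntax, `%x[row,col]` in CRF++-style templates selects the attribute at (current position + row offset, column col). A hedged Python sketch of that expansion follows; the assumption that column 0 maps to `_token_` and column 1 to `feature_suffix` matches the table above but may differ in the actual crfTrain implementation:

```python
import re

def expand_template(template, sequence, pos):
    """Expand one CRF++-style unigram template at position pos of a sequence.

    sequence is a list of attribute tuples, e.g. (_token_, feature_suffix).
    """
    def repl(m):
        row, col = int(m.group(1)), int(m.group(2))
        idx = pos + row
        if 0 <= idx < len(sequence):
            return sequence[idx][col]
        return "_B" + str(row)  # out-of-range placeholder

    return re.sub(r"%x\[(-?\d+),(\d+)\]", repl, template)

# One three-token sequence from the dataset above
seq = [("Patient", "ent"), ("shows", "ows"), ("symptoms", "oms")]
print(expand_template("U00:%x[0,0]", seq, 1))  # U00:shows
print(expand_template("U01:%x[0,1]", seq, 1))  # U01:ows
```

So `U00:%x[0,0]` generates a feature from the current token and `U01:%x[0,1]` from its suffix, at every position of every sequence.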
The model trains with the L-BFGS solver. The output log should show the Wolfe line search and the L1 regularization term (regL1=0.2) in effect, and training should converge within the 100-iteration limit (maxIters=100).
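To make the `lineSearchMethod='WOLFE'` setting concrete, here is a hedged, stdlib-only sketch of the (weak) Wolfe conditions that such a line search enforces on a step length; this is an illustrative example on a 1-D quadratic, not the CAS implementation, and `satisfies_wolfe` is a hypothetical helper:

```python
def satisfies_wolfe(f, grad, x, p, alpha, c1=1e-4, c2=0.9):
    """Check the two weak Wolfe conditions for step length alpha.

    1. Sufficient decrease (Armijo): f(x + alpha*p) <= f(x) + c1*alpha*grad(x).p
    2. Curvature: grad(x + alpha*p).p >= c2*grad(x).p
    """
    fx, gx = f(x), grad(x)
    x_new = [xi + alpha * pi for xi, pi in zip(x, p)]
    g_dot_p = sum(g * d for g, d in zip(gx, p))
    armijo = f(x_new) <= fx + c1 * alpha * g_dot_p
    curvature = sum(g * d for g, d in zip(grad(x_new), p)) >= c2 * g_dot_p
    return armijo and curvature

# f(x) = x^2 with descent direction p = -grad(x) at x = 1
f = lambda x: x[0] ** 2
grad = lambda x: [2 * x[0]]
print(satisfies_wolfe(f, grad, [1.0], [-2.0], alpha=0.5))   # True: step to the minimizer
print(satisfies_wolfe(f, grad, [1.0], [-2.0], alpha=0.01))  # False: step too small, fails curvature
```

The curvature condition is what rejects overly small steps, which is why a Wolfe line search tends to keep L-BFGS progressing steadily toward the iteration cap.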