What options are available for the 'tokenizer' parameter and what is the default?
Answer
The 'tokenizer' parameter specifies which tokenizer to use. There are two options. 'STANDARD' is the default and applies a language-specific tokenizer. 'BASIC' separates words at white space and punctuation; it is available for Chinese, Japanese, and Korean, where it can improve rule matching.
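To make the 'BASIC' behavior more concrete, here is a rough Python sketch of splitting on white space and punctuation. It is only an illustration, not the product's actual implementation: the function name `basic_tokenize` is hypothetical, and it assumes punctuation is discarded rather than kept as separate tokens, which the answer above does not specify.

```python
import re


def basic_tokenize(text: str) -> list[str]:
    """Hypothetical sketch: split text at white space and punctuation.

    Assumption: punctuation acts only as a delimiter and is dropped.
    """
    # [^\W_]+ matches runs of Unicode letters and digits, so white space,
    # punctuation (including CJK punctuation such as "。") and underscores
    # all act as token boundaries.
    return re.findall(r"[^\W_]+", text)


if __name__ == "__main__":
    print(basic_tokenize("Hello, world! It's 2024."))
    # ['Hello', 'world', 'It', 's', '2024']
    print(basic_tokenize("東京都 に 住む。"))
    # ['東京都', 'に', '住む']
```

Note that in this sketch a run of Chinese, Japanese, or Korean text with no spaces or punctuation stays a single token, whereas a language-specific tokenizer would normally segment it into words.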