class  | 
ClassicTokenizerDescriptor | 
 This tokenizer has heuristics for special treatment of acronyms, company names, email addresses, and internet host
 names. 
 | 
class  | 
KeywordTokenizerDescriptor | 
 The keyword tokenizer is a “noop” tokenizer that accepts whatever text it is given and outputs the exact same text
 as a single term. 
 | 
class  | 
LetterTokenizerDescriptor | 
 The letter tokenizer breaks text into terms whenever it encounters a character which is not a letter. 
 | 
class  | 
NGramTokenizerDescriptor | 
 A tokenizer that produces a stream of n-grams. 
 | 
class  | 
PathHierarchyTokenizerDescriptor | 
 Tokenizer for path-like hierarchies. 
 | 
class  | 
PatternTokenizerDescriptor | 
 The pattern tokenizer uses a regular expression to either split text into terms whenever it matches a word separator,
 or to capture matching text as terms. 
 | 
class  | 
StandardTokenizerDescriptor | 
 A standard tokenizer based on the Unicode Text Segmentation algorithm (Unicode Standard Annex #29). 
 | 
class  | 
UAXURLEmailTokenizerDescriptor | 
 This tokenizer is like the standard tokenizer, except that it recognises URLs and email addresses as single tokens. 
 | 
class  | 
WhitespaceTokenizerDescriptor | 
 The whitespace tokenizer breaks text into terms whenever it encounters a whitespace character. 
 |
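
The behaviour described for several of these tokenizers can be illustrated with a rough sketch. The Python below is only an approximation of the descriptions in the table, not the API of the descriptor classes themselves; every function name and parameter (`min_gram`, `max_gram`, `delimiter`, `capture`) is hypothetical and chosen for illustration, and details such as the default settings of the real tokenizers are not modelled.

```python
import re


def keyword_tokenize(text):
    # "No-op" behaviour: the whole input becomes a single term.
    return [text]


def whitespace_tokenize(text):
    # Break the text into terms at every run of whitespace.
    return text.split()


def letter_tokenize(text):
    # Emit maximal runs of letters; any non-letter character ends a term.
    return re.findall(r"[^\W\d_]+", text)


def ngram_tokenize(text, min_gram=1, max_gram=2):
    # Emit every substring whose length lies in [min_gram, max_gram],
    # ordered by start position, then by length.
    return [
        text[i:i + n]
        for i in range(len(text))
        for n in range(min_gram, max_gram + 1)
        if i + n <= len(text)
    ]


def path_hierarchy_tokenize(path, delimiter="/"):
    # Emit one term per level of the hierarchy, each including its ancestors.
    parts = path.split(delimiter)
    terms = []
    for i in range(1, len(parts) + 1):
        prefix = delimiter.join(parts[:i])
        if prefix:
            terms.append(prefix)
    return terms


def pattern_tokenize(text, pattern, capture=False):
    # Two modes: split on separators that match the pattern,
    # or capture the matching text itself as terms.
    if capture:
        return [m.group(0) for m in re.finditer(pattern, text)]
    return [term for term in re.split(pattern, text) if term]


if __name__ == "__main__":
    print(keyword_tokenize("New York"))
    print(whitespace_tokenize("The 2 quick-brown foxes"))
    print(letter_tokenize("You're the 1st"))
    print(ngram_tokenize("abc"))
    print(path_hierarchy_tokenize("/usr/local/bin"))
    print(pattern_tokenize("a,b;c", r"[,;]"))
    print(pattern_tokenize("id=42 n=7", r"\d+", capture=True))
```

Comparing the outputs side by side shows how the choice of tokenizer changes the terms produced from the same kind of input: for example, the whitespace sketch keeps `quick-brown` as one term, while the letter sketch would split it at the hyphen.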