Package com.atlassian.confluence.search.v2.analysis
Class Descriptions

- Folds tokens into their ASCII equivalents.
- This filter removes the English possessive from the end of words, and it removes dots from acronyms.
- This tokenizer has heuristics for special treatment of acronyms, company names, email addresses, and internet host names.
- Token filter that generates bigrams for frequently occurring terms.
- A token filter that decomposes compound words found in many Germanic languages, based on a dictionary.
- Strips HTML tags.
- A token filter that only keeps tokens whose text is contained in a predefined set of words.
- The keyword tokenizer is a "noop" tokenizer that accepts whatever text it is given and outputs the exact same text as a single term.
- The letter tokenizer breaks text into terms whenever it encounters a character which is not a letter.
- A token filter of type lowercase that normalizes token text to lower case.
- A char filter that maps one string to another.
- A tokenizer that produces a stream of n-grams.
- Tokenizer for path-like hierarchies.
- A char filter that replaces a string matching the given pattern with the specified replacement.
- The pattern tokenizer uses a regular expression to either split text into terms whenever it matches a word separator, or to capture matching text as terms.
- Interface which checks whether a search language is supported by the current platform.
- Breaks the text of a text field into tokens.
- A token filter that constructs shingles (token n-grams) from a token stream.
- A standard token filter.
- A standard tokenizer based on the Unicode segmentation standard.
- A tokenizer like the standard tokenizer, except that it recognises URLs and email addresses as single tokens.
- The whitespace tokenizer breaks text into terms whenever it encounters a whitespace character.
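To make the tokenizer/filter pipeline concrete, here is a minimal conceptual sketch, not the Confluence API, of what a letter tokenizer followed by a lowercase token filter does: text is broken into terms at every non-letter character, and each term is then normalized to lower case. The class and method names below are illustrative inventions, not types from this package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Conceptual sketch only -- NOT the com.atlassian.confluence.search.v2.analysis API.
// Models a letter tokenizer (break on non-letters) chained with a
// lowercase token filter (normalize each emitted term).
public class LetterTokenizerSketch {
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);                // extend the current term
            } else if (current.length() > 0) {
                // any non-letter character ends the term
                tokens.add(current.toString().toLowerCase(Locale.ROOT));
                current.setLength(0);
            }
        }
        if (current.length() > 0) {               // flush a trailing term
            tokens.add(current.toString().toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "Don't" splits at the apostrophe, since ' is not a letter
        System.out.println(tokenize("Don't STOP me now"));
        // prints [don, t, stop, me, now]
    }
}
```

The same pipeline shape applies to the other entries above: char filters transform the raw text before tokenization, the tokenizer produces the term stream, and token filters transform or drop terms afterwards.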