Class LimitedTextContentExtractor
java.lang.Object
com.atlassian.confluence.search.v2.extractor.BaseAttachmentContentExtractor
com.atlassian.confluence.impl.search.v2.extractor.LimitedTextContentExtractor
- All Implemented Interfaces:
Extractor2
A subclass of BaseAttachmentContentExtractor that limits how many bytes of the input stream are read into memory. This prevents huge attachment streams from being read in full and triggering memory starvation.
As a side effect, content located beyond the limit may not be indexed, but that is preferable to an OutOfMemoryError (OOME).
The default value was changed from a fixed 10 MB to be in line with the value set for Attachments:
- Since:
- 7.17
- See Also:
-
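The byte limit described above can be illustrated with a minimal sketch. This is not the Confluence implementation: the class name `BoundedReadSketch`, the helper `readLimited`, the `maxBytes` parameter and the UTF-8 decoding are all assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class BoundedReadSketch {

    // Hypothetical helper: copies at most maxBytes from the stream into memory
    // and decodes them as UTF-8 (the charset is an assumption of this sketch).
    static String readLimited(InputStream is, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int remaining = maxBytes;
        while (remaining > 0) {
            // Never request more than the remaining budget.
            int n = is.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n == -1) {
                break; // end of stream reached before the limit
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        // Anything past the limit is never read, so it cannot be indexed.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "hello world".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLimited(in, 5)); // prints "hello"
    }
}
```

Cutting at a byte boundary can split a multi-byte character at the very end of the result; a real extractor would need to decide how to handle that case.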
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
protected String | extractText(InputStream is, SearchableAttachment attachment) |
protected boolean | shouldExtractFrom(String fileName, String contentType) | Extract text from MIME types like 'text/*', 'application/xml*' and 'application/*+xml'
Methods inherited from class com.atlassian.confluence.search.v2.extractor.BaseAttachmentContentExtractor
extractFields, extractText, extractText
-
Constructor Details
-
LimitedTextContentExtractor
public LimitedTextContentExtractor()
-
-
Method Details
-
shouldExtractFrom
protected boolean shouldExtractFrom(String fileName, String contentType)
Extract text from MIME types like 'text/*', 'application/xml*' and 'application/*+xml'
- Specified by:
shouldExtractFrom in class BaseAttachmentContentExtractor
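The three MIME type patterns named above could be matched roughly as follows. This is a sketch under stated assumptions, not the actual method body: the class `MimeTypeMatchSketch` and the helper `looksLikeText` are hypothetical, the file name argument of the real method is ignored, and stripping parameters such as `; charset=UTF-8` is a choice made only for this example.

```java
import java.util.Locale;

class MimeTypeMatchSketch {

    // Hypothetical check mirroring the documented patterns:
    // 'text/*', 'application/xml*' and 'application/*+xml'.
    static boolean looksLikeText(String contentType) {
        if (contentType == null) {
            return false;
        }
        String type = contentType.toLowerCase(Locale.ROOT).trim();
        // Drop any parameters, e.g. "text/plain; charset=UTF-8" -> "text/plain".
        int semicolon = type.indexOf(';');
        if (semicolon >= 0) {
            type = type.substring(0, semicolon).trim();
        }
        return type.startsWith("text/")
                || type.startsWith("application/xml")
                || (type.startsWith("application/") && type.endsWith("+xml"));
    }

    public static void main(String[] args) {
        System.out.println(looksLikeText("text/plain"));            // prints "true"
        System.out.println(looksLikeText("application/atom+xml"));  // prints "true"
        System.out.println(looksLikeText("application/pdf"));       // prints "false"
    }
}
```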
-
extractText
protected String extractText(InputStream is, SearchableAttachment attachment)
- Specified by:
extractText in class BaseAttachmentContentExtractor
- Parameters:
is - a stream containing the attachment contents
attachment - contains useful attachment metadata, e.g. filename
- Returns:
- a String with a textual representation of the attachment's contents
-