TextAbsorber
TextAbsorber class
Represents a text absorber object. Performs text extraction and provides access to the result via the Text property.
public class TextAbsorber
Constructors
Name | Description |
---|---|
TextAbsorber() | Initializes a new instance of the TextAbsorber. |
TextAbsorber(TextExtractionOptions) | Initializes a new instance of the TextAbsorber with extraction options. |
TextAbsorber(TextSearchOptions) | Initializes a new instance of the TextAbsorber with text search options. |
TextAbsorber(TextExtractionOptions, TextSearchOptions) | Initializes a new instance of the TextAbsorber with extraction and text search options. |
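As an illustration of the two-argument constructor, the following is a minimal sketch rather than a definitive usage pattern. It assumes that `TextExtractionOptions` can be constructed from a `TextExtractionOptions.TextFormattingMode` value and that `TextSearchOptions` has a constructor taking a boolean regular-expression flag; neither signature is stated on this page. The file name `input.pdf` is a placeholder.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class ExtractWithOptions
{
    static void Main()
    {
        // "input.pdf" is a placeholder path used only for illustration.
        Document doc = new Document("input.pdf");

        // Assumption: TextExtractionOptions accepts a formatting mode,
        // and Pure is one of the available modes.
        TextExtractionOptions extractionOptions =
            new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Pure);

        // Assumption: this constructor takes a flag that enables or
        // disables regular-expression search.
        TextSearchOptions searchOptions = new TextSearchOptions(false);

        // Pass both option objects to the absorber.
        TextAbsorber absorber = new TextAbsorber(extractionOptions, searchOptions);

        // Extract text from the first page and print it.
        doc.Pages[1].Accept(absorber);
        Console.WriteLine(absorber.Text);
    }
}
```

The same options can presumably also be assigned after construction through the ExtractionOptions and TextSearchOptions properties listed below.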
Properties
Name | Description |
---|---|
Errors { get; } | List of TextExtractionError objects. It contains information about errors that were found during text extraction. Errors are searched for only if TextSearchOptions.LogTextExtractionErrors = true, which may decrease performance. |
virtual ExtractionOptions { get; set; } | Gets or sets text extraction options. |
HasErrors { get; } | Value that indicates whether errors were found during text extraction. Errors are searched for only if TextSearchOptions.LogTextExtractionErrors = true, which may decrease performance. |
virtual Text { get; } | Gets the text that the TextAbsorber extracted from the PDF document or page. |
virtual TextSearchOptions { get; set; } | Gets or sets text search options. |
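The error-related properties can be sketched as follows. This is a hedged example, not a confirmed recipe: it assumes that a default-constructed TextAbsorber exposes a non-null TextSearchOptions instance and that Errors supports `Count` (the table above describes it only as a list). The file name `input.pdf` is a placeholder.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class ExtractWithErrorLogging
{
    static void Main()
    {
        // "input.pdf" is a placeholder path used only for illustration.
        Document doc = new Document("input.pdf");
        TextAbsorber absorber = new TextAbsorber();

        // Error logging is off unless requested; enabling it may slow extraction.
        // Assumption: TextSearchOptions is already initialized on a new absorber.
        absorber.TextSearchOptions.LogTextExtractionErrors = true;

        // Extract text from the first page.
        doc.Pages[1].Accept(absorber);

        if (absorber.HasErrors)
        {
            // Assumption: Errors exposes a Count, as a generic list would.
            Console.WriteLine("Extraction errors found: " + absorber.Errors.Count);
        }

        Console.WriteLine(absorber.Text);
    }
}
```

Because error logging may decrease performance, it seems reasonable to enable it only when diagnosing problematic documents.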
Methods
Name | Description |
---|---|
virtual Visit(Document) | Extracts text from the specified document. |
virtual Visit(Page) | Extracts text from the specified page. |
virtual Visit(XForm) | Extracts text from the specified XForm. |
Remarks
The TextAbsorber object is used to extract text from a PDF document or from a page of the document.
Examples
The example demonstrates how to extract text from the first page of a PDF document.
// open document
Document doc = new Document(inFile);
// create TextAbsorber object to extract text
TextAbsorber absorber = new TextAbsorber();
// accept the absorber for first page
doc.Pages[1].Accept(absorber);
// get the extracted text
string extractedText = absorber.Text;
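The same pattern can be extended to every page of a document. The sketch below assumes that the page collection returned by `doc.Pages` offers an `Accept` method that takes a TextAbsorber, which this page does not document; `input.pdf` is a placeholder path.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class ExtractAllPages
{
    static void Main()
    {
        // "input.pdf" is a placeholder path used only for illustration.
        Document doc = new Document("input.pdf");

        // Create a TextAbsorber object to extract text.
        TextAbsorber absorber = new TextAbsorber();

        // Assumption: the page collection accepts the absorber for all pages.
        doc.Pages.Accept(absorber);

        // Text now holds the text extracted from the whole document.
        Console.WriteLine(absorber.Text);
    }
}
```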
See Also
- namespace Aspose.Pdf.Text
- assembly Aspose.PDF