Class TextExtractorOptions
TextExtractorOptions class
Represents text extraction options for the TextExtractor plugin.
public sealed class TextExtractorOptions : PdfExtractorOptions
Constructors
Name | Description |
---|---|
TextExtractorOptions() | Initializes a new instance of the TextExtractorOptions class with the ‘Raw’ (default) text formatting mode. |
TextExtractorOptions(TextFormattingMode) | Initializes a new instance of the TextExtractorOptions class with the specified text formatting mode. |
Properties
Name | Description |
---|---|
FormattingMode { get; } | Gets the text formatting mode. |
Inputs { get; } | Returns the PdfExtractor plugin data collection. |
override OperationName { get; } | Returns the name of the operation. |
Methods
Name | Description |
---|---|
AddInput(IDataSource) | Adds a new data source to the PdfExtractor plugin data collection. |
Other Members
Name | Description |
---|---|
enum TextFormattingMode | Defines different modes which can be used while converting a PDF document into text. See TextExtractorOptions class. |
Remarks
The TextExtractorOptions object is used to set the TextFormattingMode and other options for the text extraction operation. It also inherits the members for adding data (files, streams) that represent the input PDF documents.
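The two constructors listed above differ only in the formatting mode they select. The following is a minimal sketch, not an official sample: it assumes the Aspose.PDF package is referenced, and the class name `FormattingModeSketch` is hypothetical. It uses only members documented on this page.

```csharp
using System;
using Aspose.Pdf.Plugins;

// Hypothetical class name, used only for this illustration.
class FormattingModeSketch
{
    static void Main()
    {
        // Parameterless constructor: per the table above, this selects the default (Raw) mode.
        TextExtractorOptions defaultOptions = new TextExtractorOptions();

        // Explicit mode, passed through the nested TextFormattingMode enum.
        TextExtractorOptions pureOptions =
            new TextExtractorOptions(TextExtractorOptions.TextFormattingMode.Pure);

        // FormattingMode is read-only, so the mode is fixed at construction time.
        Console.WriteLine(defaultOptions.FormattingMode);
        Console.WriteLine(pureOptions.FormattingMode);
    }
}
```

Because FormattingMode exposes only a getter, switching modes means creating a new options object rather than mutating an existing one.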
Examples
The example demonstrates how to extract the text content of a PDF document.
// create TextExtractor object to extract PDF contents
using (TextExtractor extractor = new TextExtractor())
{
    // create TextExtractorOptions object to set TextFormattingMode (Pure, or Raw - default)
    TextExtractorOptions extractorOptions = new TextExtractorOptions(TextExtractorOptions.TextFormattingMode.Pure);
    // add input file path to data sources (inputPath is the path to the input PDF file)
    extractorOptions.AddInput(new FileDataSource(inputPath));
    // perform extraction process
    ResultContainer resultContainer = extractor.Process(extractorOptions);
    // get the extracted text from the ResultContainer object
    string textExtracted = resultContainer.ResultCollection[0].ToString();
}
See Also
- class PdfExtractorOptions
- namespace Aspose.Pdf.Plugins
- assembly Aspose.PDF