PdfTextExtractionOptions
PdfTextExtractionOptions class
Represents text extraction options for the PdfExtractor plugin.
public sealed class PdfTextExtractionOptions : PdfExtractorOptions
Constructors
Name | Description |
---|---|
PdfTextExtractionOptions() | Initializes a new instance of the PdfTextExtractionOptions class with the ‘Raw’ (default) text formatting mode. |
PdfTextExtractionOptions(TextFormattingMode) | Initializes a new instance of the PdfTextExtractionOptions class with the specified text formatting mode. |
Properties
Name | Description |
---|---|
DataCollection { get; } | Returns the PdfExtractor plugin data collection. |
FormattingMode { get; } | Gets the text formatting mode. |
override OperationName { get; } | Returns the name of the operation. |
Methods
Name | Description |
---|---|
AddDataSource(IDataSource) | Adds a new data source to the PdfExtractor plugin data collection. |
Other Members
Name | Description |
---|---|
enum TextFormattingMode | Defines different modes which can be used while converting a PDF document into text. See PdfTextExtractionOptions class. |
Remarks
The PdfTextExtractionOptions object is used to set the TextFormattingMode and other options for the text extraction operation. It also inherits methods for adding data sources (files, streams) that represent the input PDF documents.
Examples
The example demonstrates how to extract the text content of a PDF document.
// create a PdfExtractor object to extract the PDF contents
using (PdfExtractor extractor = new PdfExtractor())
{
    // create a PdfTextExtractionOptions object to set the TextFormattingMode (Pure, or Raw - default)
    PdfTextExtractionOptions extractorOptions = new PdfTextExtractionOptions(PdfTextExtractionOptions.TextFormattingMode.Pure);
    // add the input file path (inputPath) to the data sources
    extractorOptions.AddDataSource(new FileDataSource(inputPath));
    // perform the extraction process
    ResultContainer resultContainer = extractor.Process(extractorOptions);
    // get the extracted text from the ResultContainer object
    string textExtracted = resultContainer.ResultCollection[0].ToText();
}
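The two constructors listed above differ only in the formatting mode they select. The following is a minimal sketch of that difference, assuming the Aspose.PDF package is referenced and the project allows C# top-level statements. The variable names are illustrative only. It reads the mode back through the read-only FormattingMode property and does not run an extraction.

```csharp
using System;
using Aspose.Pdf.Plugins;

// Parameterless constructor: per the constructor table, this selects the default (Raw) mode.
var defaultOptions = new PdfTextExtractionOptions();

// Constructor overload that takes an explicit formatting mode.
var pureOptions = new PdfTextExtractionOptions(
    PdfTextExtractionOptions.TextFormattingMode.Pure);

// FormattingMode is get-only, so the mode is fixed when the options object is constructed.
Console.WriteLine(defaultOptions.FormattingMode);
Console.WriteLine(pureOptions.FormattingMode);
```

Because FormattingMode exposes no setter, switching modes would mean constructing a new options object rather than changing an existing one.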
See Also
- class PdfExtractorOptions
- namespace Aspose.Pdf.Plugins
- assembly Aspose.PDF