Aspose::Pdf::Facades::PdfExtractor class

PdfExtractor class

Class for extracting images and text from a PDF document.

class PdfExtractor : public Aspose::Pdf::Facades::Facade

Methods

| Method | Description |
| --- | --- |
| BindPdf(System::String) override | Binds the input PDF file. |
| BindPdf(System::SharedPtr<System::IO::Stream>) override | Binds a PDF document from a stream. |
| BindPdf(System::SharedPtr<Aspose::Pdf::Document>) override | Initializes the facade. |
| virtual BindPdf(System::SharedPtr<Document>) | Binds a PDF document for editing. |
| Close() override | Disposes the Aspose.Pdf.Document bound to the facade. |
| Dispose() override | Disposes the facade. |
| ExtractAttachment() | Extracts attachments from a PDF document. |
| ExtractAttachment(System::String) | Extracts an attachment from the PDF file by attachment name. |
| ExtractImage() | Extracts images from the PDF file. |
| ExtractText() | Extracts text from a PDF document using Unicode encoding. |
| ExtractText(System::SharedPtr<System::Text::Encoding>) | Extracts text from a PDF document using the specified encoding. |
| get_Document() const | Gets the document the facade is working on. |
| get_EndPage() const | Gets the end page of the page range in which the extraction is performed. |
| get_ExtractImageMode() const | Gets the mode for the image extraction process. |
| get_ExtractTextMode() const | Gets the mode for the text extraction result. |
| get_IsBidi() | Returns true when the text contains Hebrew or Arabic symbols. This case needs special handling because string functions change their behaviour and process the text from right to left (except numbers and other non-text characters). |
| get_Password() const | Gets the input file’s password. |
| get_Resolution() const | Gets the resolution for extracted images. The default value is 150. A higher resolution gives clearer images but increases the time and memory needed to extract them. A resolution of 150 or 300 is usually enough to get a clear image. |
| get_StartPage() const | Gets the start page of the page range in which the extraction is performed. |
| get_TextSearchOptions() const | Gets the text search options. |
| GetAttachment(System::String) | Stores the attachment into a file. |
| GetAttachment() | Saves all the attachment files to streams. |
| GetAttachmentInfo() | Gets the list of attachments. |
| GetAttachNames() | Returns the list of attachments in the PDF file. Note: ExtractAttachment must be called before using this method. |
| GetNextImage(System::String) | Retrieves the next image from the PDF document. Note: ExtractImage must be called before using this method. |
| GetNextImage(System::String, System::SharedPtr<System::Drawing::Imaging::ImageFormat>) | Retrieves the next image from the PDF document in the given image format. Note: ExtractImage must be called before using this method. |
| GetNextImage(System::SharedPtr<System::IO::Stream>, System::SharedPtr<System::Drawing::Imaging::ImageFormat>) | Retrieves the next image from the PDF file and stores it in a stream in the given image format. |
| GetNextImage(System::SharedPtr<System::IO::Stream>) | Retrieves the next image from the PDF file and stores it in a stream. |
| GetNextPageText(System::String) | Saves one page’s text to a file. |
| GetNextPageText(System::SharedPtr<System::IO::Stream>) | Saves one page’s text to a stream. |
| GetText(System::String) | Saves text to a file. See also: ExtractText. |
| GetText(System::SharedPtr<System::IO::Stream>) | Saves text to a stream. See also: ExtractText. |
| GetText(System::SharedPtr<System::IO::Stream>, bool) | Saves text to a stream. See also: ExtractText. |
| HasNextImage() | Checks whether more images are available in the PDF document. Note: ExtractImage must be called before using this method. |
| HasNextPageText() | Indicates whether more page text can be retrieved. |
| PdfExtractor() | Initializes a new PdfExtractor object. |
| PdfExtractor(System::SharedPtr<Aspose::Pdf::Document>) | Initializes a new PdfExtractor object based on the document. |
| set_EndPage(int32_t) | Sets the end page of the page range in which the extraction is performed. |
| set_ExtractImageMode(Aspose::Pdf::ExtractImageMode) | Sets the mode for the image extraction process. |
| set_ExtractTextMode(int32_t) | Sets the mode for the text extraction result. |
| set_Password(System::String) | Sets the input file’s password. |
| set_Resolution(int32_t) | Sets the resolution for extracted images. The default value is 150. A higher resolution gives clearer images but increases the time and memory needed to extract them. A resolution of 150 or 300 is usually enough to get a clear image. |
| set_StartPage(int32_t) | Sets the start page of the page range in which the extraction is performed. |
| set_TextSearchOptions(System::SharedPtr<Aspose::Pdf::Text::TextSearchOptions>) | Sets the text search options. |
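
The notes on the image methods imply a call order: bind a document, optionally set the page range, call ExtractImage, then loop on HasNextImage and GetNextImage. The following is a minimal sketch of that loop, not library code. `FakeExtractor`, `SaveAllImages` and `RunDemo` are hypothetical names introduced only for illustration. The stand-in class uses `std::string` and in-memory data so that the sketch compiles without the Aspose headers; with the real class, the same sequence of calls would presumably be made on a PdfExtractor object with `System::String` arguments, which is an assumption based on the signatures listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in that mimics the call order of the image methods.
// It is NOT the Aspose class: it fabricates one fake image per page.
class FakeExtractor {
public:
    void BindPdf(const std::string& path) { bound_ = path; }
    void set_StartPage(int page) { start_ = page; }
    void set_EndPage(int page) { end_ = page; }

    // Prepares the image list for the selected page range.
    void ExtractImage() {
        images_.clear();
        for (int p = start_; p <= end_; ++p)
            images_.push_back("image-from-page-" + std::to_string(p));
        next_ = 0;
        extracted_ = true;
    }

    // Reports false until ExtractImage() has been called.
    bool HasNextImage() const { return extracted_ && next_ < images_.size(); }

    // Records the output name for the next image; returns false if none is left.
    bool GetNextImage(const std::string& outputFile) {
        if (!HasNextImage()) return false;
        saved_.emplace_back(outputFile, images_[next_++]);
        return true;
    }

    const std::vector<std::pair<std::string, std::string>>& saved() const {
        return saved_;
    }

private:
    std::string bound_;
    int start_ = 1;
    int end_ = 1;
    bool extracted_ = false;
    std::size_t next_ = 0;
    std::vector<std::string> images_;
    std::vector<std::pair<std::string, std::string>> saved_;
};

// Extract-then-iterate loop; works with any type exposing these method names.
template <class Extractor>
int SaveAllImages(Extractor& extractor, const std::string& prefix) {
    int count = 0;
    extractor.ExtractImage();  // must come before HasNextImage/GetNextImage
    while (extractor.HasNextImage()) {
        if (!extractor.GetNextImage(prefix + std::to_string(count + 1) + ".png"))
            break;
        ++count;
    }
    return count;
}

// Example run against the stand-in: pages 2..4 yield three fake images.
inline int RunDemo() {
    FakeExtractor extractor;
    extractor.BindPdf("input.pdf");
    extractor.set_StartPage(2);
    extractor.set_EndPage(4);
    const int n = SaveAllImages(extractor, "image_");
    assert(n == static_cast<int>(extractor.saved().size()));
    return n;
}
```

The text methods in the table (ExtractText, HasNextPageText, GetNextPageText) appear to follow the same extract-then-iterate shape, so a similar loop would likely apply there.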

See Also