ParagraphAbsorber class

Represents an absorber of page structure objects such as sections and paragraphs. It searches for sections and paragraphs of text and provides access to the rectangles and polygons that describe them in text coordinate space. It also searches for text segments and provides access to the results via TextFragment collections grouped by structure element.

public class ParagraphAbsorber


Constructors

Name                    Description
ParagraphAbsorber()     Initializes a new instance of ParagraphAbsorber that searches for sections and paragraphs of a document or page.
ParagraphAbsorber(int)  Initializes a new instance of ParagraphAbsorber that searches for sections and paragraphs of a document or page.


Properties

Name                                         Description
IsMulticolumnParagraphsAllowed { get; set; }  Gets or sets a value indicating whether the first text lines of the next section may be treated as a continuation of the last paragraph of the previous section.
PageMarkups { get; }                          Gets the collection of PageMarkup objects that were absorbed.
SectionsSearchDepth { get; set; }             Gets or sets how many successive search passes are performed to find finer structure elements. The default depth is 3, which means three searches for horizontally divided sections (headers, paragraphs, etc.) and three searches for vertically divided ones (columns).


Methods

Name             Description
Visit(Document)  Searches for sections and paragraphs in the specified Document.
Visit(Page)      Searches for sections and paragraphs on the specified Page.


Remarks

When the search is completed, the PageMarkups collection contains PageMarkup objects that represent the page structure as collections of MarkupSection and MarkupParagraph objects. Each TextFragment object provides access to the text of a search occurrence and its text properties, and allows editing the text and changing the text state (font, font size, color, etc.).


Examples

The example demonstrates how to find the first text fragment of each paragraph on the first page of a PDF document and highlight it.

using Aspose.Pdf;
using Aspose.Pdf.Text;

// Open document
Document doc = new Document("input.pdf");

// Create ParagraphAbsorber object
ParagraphAbsorber absorber = new ParagraphAbsorber();

// Run the absorber on the first page (the Pages collection is 1-based)
absorber.Visit(doc.Pages[1]);

// Get markup object of first page
PageMarkup markup = absorber.PageMarkups[0];

// Loop through structure elements of the page text to find first text fragment of each paragraph
foreach (MarkupSection section in markup.Sections)
{
    foreach (MarkupParagraph paragraph in section.Paragraphs)
    {
        TextFragment fragment = paragraph.Fragments[0];
        // Update text properties
        fragment.TextState.BackgroundColor = Color.LightBlue;
    }
}

// Save document ("output.pdf" is an illustrative file name)
doc.Save("output.pdf");
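
The members listed above can also be combined to scan a whole document rather than a single page. The following is a minimal sketch, not a definitive recipe: it assumes that SectionsSearchDepth and IsMulticolumnParagraphsAllowed take effect when set before Visit is called, the values chosen (2 and true) are arbitrary illustrations, and "input.pdf" is a placeholder path. It uses only the members described on this page plus Console.WriteLine.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

// "input.pdf" is a placeholder path
Document doc = new Document("input.pdf");

ParagraphAbsorber absorber = new ParagraphAbsorber();

// Illustrative settings (assumed to apply to the search that follows)
absorber.SectionsSearchDepth = 2;               // fewer passes than the default of 3
absorber.IsMulticolumnParagraphsAllowed = true; // let paragraphs continue into the next section

// Search every page of the document
absorber.Visit(doc);

// Count the sections and paragraphs found in each absorbed page markup
int markupIndex = 0;
foreach (PageMarkup markup in absorber.PageMarkups)
{
    int sectionCount = 0;
    int paragraphCount = 0;

    foreach (MarkupSection section in markup.Sections)
    {
        sectionCount++;
        foreach (MarkupParagraph paragraph in section.Paragraphs)
        {
            paragraphCount++;
        }
    }

    Console.WriteLine($"Markup {markupIndex}: {sectionCount} sections, {paragraphCount} paragraphs");
    markupIndex++;
}
```

Comparing the counts for different SectionsSearchDepth values on the same file is one way to judge how deep a search a given layout needs.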

See Also