DocumentSplitCriteria enumeration

DocumentSplitCriteria enumeration

Specifies how the document is split into parts when saving to SaveFormat.HTML, SaveFormat.EPUB or SaveFormat.AZW3 format.

DocumentSplitCriteria is a set of flags which can be combined. For instance you can split the document at page breaks and heading paragraphs in the same export operation.

Different criteria can partially overlap. For instance, Heading 1 style is frequently given ParagraphFormat.page_break_before property so it falls under two criteria: DocumentSplitCriteria.PAGE_BREAK and DocumentSplitCriteria.HEADING_PARAGRAPH. Some section breaks can cause page breaks and so on. In typical cases specifying only one flag is the most practical option.

Members

NameDescription
NONEThe document is not split.
PAGE_BREAKThe document is split into parts at explicit page breaks. A page break can be specified by a ControlChar.PAGE_BREAK character, a section break specifying start of new section on a new page, or a paragraph that has its ParagraphFormat.page_break_before property set to True.
COLUMN_BREAKThe document is split into parts at column breaks. A column break can be specified by a ControlChar.COLUMN_BREAK character or a section break specifying start of new section in a new column.
SECTION_BREAKThe document is split into parts at a section break of any type.
HEADING_PARAGRAPHThe document is split into parts at a paragraph formatted using a heading style Heading 1, Heading 2 etc. Use together with HtmlSaveOptions.document_split_heading_level to specify the heading levels (from 1 to the specified level) at which to split.

Examples

Shows how to use a specific encoding when saving a document to .epub.

doc = aw.Document(MY_DIR + "Rendering.docx")

# Use a SaveOptions object to specify the encoding for a document that we will save.
save_options = aw.saving.HtmlSaveOptions()
save_options.save_format = aw.SaveFormat.EPUB
save_options.encoding = "utf-8"

# By default, an output .epub document will have all its contents in one HTML part.
# A split criterion allows us to segment the document into several HTML parts.
# We will set the criteria to split the document into heading paragraphs.
# This is useful for readers who cannot read HTML files more significant than a specific size.
save_options.document_split_criteria = aw.saving.DocumentSplitCriteria.HEADING_PARAGRAPH

# Specify that we want to export document properties.
save_options.export_document_properties = True

doc.save(ARTIFACTS_DIR + "HtmlSaveOptions.doc2_epub_save_options.epub", save_options)

See Also