Document class

Represents a Word document. To learn more, visit the Working with Document documentation article.

Remarks

The Document is a central object in the Aspose.Words library.

To load an existing document in any of the LoadFormat formats, pass a file name or a stream into one of the Document constructors. To create a blank document, call the constructor without parameters.

Use one of the Save method overloads to save the document in any of the SaveFormat formats.
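The constructor and save overloads described above can be combined into a round trip. The following is a minimal sketch, assuming the `aspose-words` package is installed and imported as `aw`; the helper name `round_trip_through_stream` is hypothetical, and the import is guarded only so that the snippet can be loaded where the package is absent.

```python
import io

try:
    import aspose.words as aw  # third-party package: pip install aspose-words
except ImportError:  # assumption: allow importing this sketch without the package
    aw = None


def round_trip_through_stream(text: str) -> str:
    """Create a blank document, save it to a stream as DOCX, and load it back."""
    doc = aw.Document()  # constructor without parameters: blank document
    builder = aw.DocumentBuilder(doc)
    builder.writeln(text)

    stream = io.BytesIO()
    doc.save(stream, aw.SaveFormat.DOCX)  # save(stream, save_format) overload

    stream.seek(0)  # rewind before reading the saved bytes back
    reloaded = aw.Document(stream)  # Document(stream): format is detected automatically
    return reloaded.get_text()
```

Calling `round_trip_through_stream("Hello")` should return text that contains the string written by the builder. In evaluation mode the library may add its own notice to the output.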

Document.mail_merge is the Aspose.Words reporting engine. It lets you quickly and easily populate reports designed in Microsoft Word with data from various data sources.
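As a small illustration of the reporting engine, the sketch below fills two MERGEFIELDs from a plain Python mapping by using the `execute(field_names, values)` overload of `MailMerge`. The helper names `split_record` and `merge_record` are hypothetical, and the guarded import is an assumption made so that the snippet loads without the package.

```python
from typing import List, Mapping, Tuple

try:
    import aspose.words as aw  # third-party package: pip install aspose-words
except ImportError:  # assumption: allow importing this sketch without the package
    aw = None


def split_record(record: Mapping[str, object]) -> Tuple[List[str], List[object]]:
    """Split a mapping into parallel lists of field names and values."""
    return list(record.keys()), list(record.values())


def merge_record(record: Mapping[str, object]) -> str:
    """Build a two-field template in memory and merge one record into it."""
    doc = aw.Document()
    builder = aw.DocumentBuilder(doc)
    builder.insert_field(" MERGEFIELD CustomerName ")
    builder.insert_paragraph()
    builder.insert_field(" MERGEFIELD Address ")

    field_names, values = split_record(record)
    doc.mail_merge.execute(field_names, values)  # one record, given as two lists
    return doc.get_text()
```

For example, `merge_record({"CustomerName": "Thomas Hardy", "Address": "120 Hanover Sq., London"})` should return text in which both fields have been replaced by the supplied values.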

Document stores document-wide information such as DocumentBase.styles, Document.built_in_document_properties, Document.custom_document_properties, lists and macros. Most of these objects are accessible via the corresponding properties of the Document.

The Document is the root node of a tree that contains all other nodes of the document. The tree follows the Composite design pattern and is in many ways similar to XmlDocument. The content of the document can be freely manipulated programmatically.

Consider using DocumentBuilder that simplifies the task of programmatically creating or populating the document tree.
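To make the tree model concrete, here is a minimal sketch that populates a document with DocumentBuilder and then reads the paragraphs back through `get_child_nodes`. The helper names `build_sample` and `paragraph_texts` are hypothetical, and the guarded import is an assumption made so that the snippet loads without the package.

```python
from typing import List

try:
    import aspose.words as aw  # third-party package: pip install aspose-words
except ImportError:  # assumption: allow importing this sketch without the package
    aw = None


def build_sample():
    """Create a document with two paragraphs by using DocumentBuilder."""
    doc = aw.Document()
    builder = aw.DocumentBuilder(doc)
    builder.writeln("First")
    builder.writeln("Second")
    return doc


def paragraph_texts(doc) -> List[str]:
    """Collect the text of every paragraph anywhere below the document node."""
    # is_deep=True searches the whole tree, not only the immediate children.
    paragraphs = doc.get_child_nodes(aw.NodeType.PARAGRAPH, True)
    # get_text() includes the paragraph break character, so strip it.
    return [paragraph.get_text().strip() for paragraph in paragraphs]
```

The builder hides the Section, Body and Paragraph nodes that would otherwise have to be created and linked by hand, while `get_child_nodes` shows that those nodes are still present in the tree.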

The Document can contain only Section objects.

In Microsoft Word, a valid document needs to have at least one section.
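The section requirement can be sketched with `remove_all_children` and `ensure_minimum`, both of which appear in the method list of this class. The helper name `empty_then_restore` is hypothetical, and the guarded import is an assumption made so that the snippet loads without the package.

```python
from typing import Tuple

try:
    import aspose.words as aw  # third-party package: pip install aspose-words
except ImportError:  # assumption: allow importing this sketch without the package
    aw = None


def empty_then_restore() -> Tuple[int, int]:
    """Return the section count after emptying a document and after restoring it."""
    doc = aw.Document()  # a blank document starts with one section

    doc.remove_all_children()  # removes every Section child of the document
    count_when_empty = doc.sections.count

    doc.ensure_minimum()  # creates one section with one paragraph if none exist
    count_after_restore = doc.sections.count

    return count_when_empty, count_after_restore
```

Calling `ensure_minimum` before saving is a cheap way to make sure a document that was emptied programmatically still has the minimal structure.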

Inheritance: Document → DocumentBase → CompositeNode → Node

Constructors

| Name | Description |
| --- | --- |
| Document() | Creates a blank Word document. |
| Document(file_name) | Opens an existing document from a file. Automatically detects the file format. |
| Document(file_name, load_options) | Opens an existing document from a file. Allows specifying additional options such as an encryption password. |
| Document(stream) | Opens an existing document from a stream. Automatically detects the file format. |
| Document(stream, load_options) | Opens an existing document from a stream. Allows specifying additional options such as an encryption password. |
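The `load_options` overloads are typically needed for encrypted files. The sketch below saves a password-protected DOCX to a stream and reopens it with `LoadOptions`. It assumes the `aw.saving.OoxmlSaveOptions.password` property and the `aw.loading.LoadOptions(password)` constructor behave as in the library's documentation. The helper name `save_and_reopen_encrypted` is hypothetical, and the guarded import is an assumption made so that the snippet loads without the package.

```python
import io

try:
    import aspose.words as aw  # third-party package: pip install aspose-words
except ImportError:  # assumption: allow importing this sketch without the package
    aw = None


def save_and_reopen_encrypted(text: str, password: str) -> str:
    """Encrypt a document on save, then reopen it by supplying the password."""
    doc = aw.Document()
    aw.DocumentBuilder(doc).writeln(text)

    save_options = aw.saving.OoxmlSaveOptions()  # defaults to the DOCX format
    save_options.password = password  # encrypts the saved package

    stream = io.BytesIO()
    doc.save(stream, save_options)  # save(stream, save_options) overload
    stream.seek(0)

    load_options = aw.loading.LoadOptions(password)  # password used for decryption
    reopened = aw.Document(stream, load_options)  # Document(stream, load_options)
    return reopened.get_text()
```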

Properties

| Name | Description |
| --- | --- |
| attached_template | Gets or sets the full path of the template attached to the document. |
| automatically_update_styles | Gets or sets a flag indicating whether the styles in the document are updated to match the styles in the attached template each time the document is opened in MS Word. |
| background_shape | Gets or sets the background shape of the document. Can be None. (Inherited from DocumentBase) |
| bibliography | Gets the Bibliography object that represents the list of sources available in the document. |
| built_in_document_properties | Returns a collection that represents all the built-in document properties of the document. |
| compatibility_options | Provides access to document compatibility options (that is, the user preferences entered on the Compatibility tab of the Options dialog in Word). |
| compliance | Gets the OOXML compliance version determined from the loaded document content. Makes sense only for OOXML documents. |
| count | Gets the number of immediate children of this node. (Inherited from CompositeNode) |
| custom_document_properties | Returns a collection that represents all the custom document properties of the document. |
| custom_node_id | Specifies a custom node identifier. (Inherited from Node) |
| custom_xml_parts | Gets or sets the collection of Custom XML Data Storage Parts. |
| default_tab_stop | Gets or sets the interval (in points) between the default tab stops. |
| digital_signatures | Gets the collection of digital signatures for this document and their validation results. |
| document | Gets the document to which this node belongs. (Inherited from Node) |
| endnote_options | Provides options that control numbering and positioning of endnotes in this document. |
| field_options | Gets a FieldOptions object that represents options to control field handling in the document. |
| first_child | Gets the first child of the node. (Inherited from CompositeNode) |
| first_section | Gets the first section in the document. |
| font_infos | Provides access to properties of fonts used in this document. (Inherited from DocumentBase) |
| font_settings | Gets or sets document font settings. |
| footnote_options | Provides options that control numbering and positioning of footnotes in this document. |
| frameset | Returns a Frameset instance if this document represents a frames page. |
| glossary_document | Gets or sets the glossary document within this document or template. A glossary document is a storage for AutoText, AutoCorrect and Building Block entries defined in a document. |
| grammar_checked | Returns True if the document has been checked for grammar. |
| has_child_nodes | Returns True if this node has any child nodes. (Inherited from CompositeNode) |
| has_macros | Returns True if the document has a VBA project (macros). |
| has_revisions | Returns True if the document has any tracked changes. |
| hyphenation_options | Provides access to document hyphenation options. |
| include_textboxes_footnotes_endnotes_in_stat | Specifies whether to include textboxes, footnotes and endnotes in word count statistics. |
| is_composite | Returns True if this node can contain other nodes. (Inherited from Node) |
| justification_mode | Gets or sets the character spacing adjustment of a document. |
| last_child | Gets the last child of the node. (Inherited from CompositeNode) |
| last_section | Gets the last section in the document. |
| layout_options | Gets a LayoutOptions object that represents options to control the layout process of this document. |
| lists | Provides access to the list formatting used in the document. (Inherited from DocumentBase) |
| mail_merge | Returns a MailMerge object that represents the mail merge functionality for the document. |
| mail_merge_settings | Gets or sets the object that contains all of the mail merge information for a document. |
| next_sibling | Gets the node immediately following this node. (Inherited from Node) |
| node_changing_callback | Called when a node is inserted or removed in the document. (Inherited from DocumentBase) |
| node_type | Returns NodeType.DOCUMENT. |
| original_file_name | Gets the original file name of the document. |
| original_load_format | Gets the format of the original document that was loaded into this object. |
| package_custom_parts | Gets or sets the collection of custom parts (arbitrary content) that are linked to the OOXML package using “unknown relationships”. |
| page_color | Gets or sets the page color of the document. This property is a simpler version of DocumentBase.background_shape. (Inherited from DocumentBase) |
| page_count | Gets the number of pages in the document as calculated by the most recent page layout operation. |
| parent_node | Gets the immediate parent of this node. (Inherited from Node) |
| previous_sibling | Gets the node immediately preceding this node. (Inherited from Node) |
| protection_type | Gets the currently active document protection type. |
| range | Returns a Range object that represents the portion of a document that is contained in this node. (Inherited from Node) |
| remove_personal_information | Gets or sets a flag indicating that Microsoft Word will remove all user information from comments, revisions and document properties upon saving the document. |
| resource_loading_callback | Allows controlling how external resources are loaded. (Inherited from DocumentBase) |
| revisions | Gets a collection of revisions (tracked changes) that exist in this document. |
| revisions_view | Gets or sets a value indicating whether to work with the original or revised version of a document. |
| sections | Returns a collection that represents all sections in the document. |
| shade_form_data | Specifies whether to turn on the gray shading on form fields. |
| show_grammatical_errors | Specifies whether to display grammar errors in this document. |
| show_spelling_errors | Specifies whether to display spelling errors in this document. |
| spelling_checked | Returns True if the document has been checked for spelling. |
| styles | Returns a collection of styles defined in the document. (Inherited from DocumentBase) |
| theme | Gets the Theme object for this document. |
| track_revisions | True if changes are tracked when this document is edited in Microsoft Word. |
| variables | Returns the collection of variables added to a document or template. |
| vba_project | Gets or sets a VbaProject. |
| versions_count | Gets the number of document versions that were stored in the DOC document. |
| view_options | Provides options to control how the document is displayed in Microsoft Word. |
| warning_callback | Called during various document processing procedures when an issue is detected that might result in data or formatting fidelity loss. (Inherited from DocumentBase) |
| watermark | Provides access to the document watermark. |
| web_extension_task_panes | Returns a collection that represents a list of task pane add-ins. |
| write_protection | Provides access to the document write protection options. |
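A few of the properties listed above can be read together to summarize a document. The following is a minimal sketch. The helper name `describe` and the keys of the returned dictionary are hypothetical, and the guarded import is an assumption made so that the snippet loads without the package.

```python
from typing import Any, Dict

try:
    import aspose.words as aw  # third-party package: pip install aspose-words
except ImportError:  # assumption: allow importing this sketch without the package
    aw = None


def describe(doc) -> Dict[str, Any]:
    """Collect a handful of document-wide properties into a dictionary."""
    return {
        "title": doc.built_in_document_properties.title,
        "sections": doc.sections.count,
        "has_macros": doc.has_macros,
        "has_revisions": doc.has_revisions,
        "protection": doc.protection_type,
    }


def describe_blank(title: str) -> Dict[str, Any]:
    """Describe a blank document after setting its built-in Title property."""
    doc = aw.Document()
    doc.built_in_document_properties.title = title
    return describe(doc)
```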

Methods

| Name | Description |
| --- | --- |
| accept(visitor) | Accepts a visitor. |
| accept_all_revisions() | Accepts all tracked changes in the document. |
| accept_end(visitor) | Accepts a visitor for visiting the end of the document. |
| accept_start(visitor) | Accepts a visitor for visiting the start of the document. |
| append_child(new_child) | Adds the specified node to the end of the list of child nodes for this node. (Inherited from CompositeNode) |
| append_document(src_doc, import_format_mode) | Appends the specified document to the end of this document. |
| append_document(src_doc, import_format_mode, import_format_options) | Appends the specified document to the end of this document. |
| cleanup() | Cleans unused styles and lists from the document. |
| cleanup(options) | Cleans unused styles and lists from the document depending on the given CleanupOptions. |
| clone() | Performs a deep copy of the Document. |
| clone(is_clone_children) | Performs a deep copy of the Document. |
| compare(document, author, date_time) | Compares this document with another document, producing changes as a number of edit and format revisions (Revision). |
| compare(document, author, date_time, options) | Compares this document with another document, producing changes as a number of edit and format revisions (Revision). Allows specifying comparison options using CompareOptions. |
| copy_styles_from_template(template) | Copies styles from the specified template to a document. |
| copy_styles_from_template(template) | Copies styles from the specified template to a document. |
| ensure_minimum() | If the document contains no sections, creates one section with one paragraph. |
| expand_table_styles_to_direct_formatting() | Converts formatting specified in table styles into direct formatting on tables in the document. |
| extract_pages(index, count) | Returns the Document object representing the specified range of pages. |
| get_ancestor(ancestor_type) | Gets the first ancestor of the specified object type. (Inherited from Node) |
| get_ancestor(ancestor_type) | Gets the first ancestor of the specified NodeType. (Inherited from Node) |
| get_child(node_type, index, is_deep) | Returns an Nth child node that matches the specified type. (Inherited from CompositeNode) |
| get_child_nodes(node_type, is_deep) | Returns a live collection of child nodes that match the specified type. (Inherited from CompositeNode) |
| get_page_info(page_index) | Gets the page size, orientation and other information about a page that might be useful for printing or rendering. |
| get_text() | Gets the text of this node and of all its children. (Inherited from Node) |
| import_node(src_node, is_import_children) | Imports a node from another document to the current document. (Inherited from DocumentBase) |
| import_node(src_node, is_import_children, import_format_mode) | Imports a node from another document to the current document with an option to control formatting. (Inherited from DocumentBase) |
| index_of(child) | Returns the index of the specified child node in the child node array. (Inherited from CompositeNode) |
| insert_after(new_child, ref_child) | Inserts the specified node immediately after the specified reference node. (Inherited from CompositeNode) |
| insert_before(new_child, ref_child) | Inserts the specified node immediately before the specified reference node. (Inherited from CompositeNode) |
| join_runs_with_same_formatting() | Joins runs with the same formatting in all paragraphs of the document. |
| next_pre_order(root_node) | Gets the next node according to the pre-order tree traversal algorithm. (Inherited from Node) |
| node_type_to_string(node_type) | A utility method that converts a node type enum value into a user-friendly string. (Inherited from Node) |
| normalize_field_types() | Changes the field type values (FieldChar.field_type) of FieldStart, FieldSeparator and FieldEnd nodes in the whole document so that they correspond to the field types contained in the field codes. |
| prepend_child(new_child) | Adds the specified node to the beginning of the list of child nodes for this node. (Inherited from CompositeNode) |
| previous_pre_order(root_node) | Gets the previous node according to the pre-order tree traversal algorithm. (Inherited from Node) |
| protect(type) | Protects the document from changes without changing the existing password or assigns a random password. |
| protect(type, password) | Protects the document from changes and optionally sets a protection password. |
| remove() | Removes itself from the parent. (Inherited from Node) |
| remove_all_children() | Removes all the child nodes of the current node. (Inherited from CompositeNode) |
| remove_blank_pages() | Removes blank pages from the document. |
| remove_child(old_child) | Removes the specified child node. (Inherited from CompositeNode) |
| remove_external_schema_references() | Removes external XML schema references from this document. |
| remove_macros() | Removes all macros (the VBA project) as well as toolbars and command customizations from the document. |
| remove_smart_tags() | Removes all SmartTag descendant nodes of the current node. (Inherited from CompositeNode) |
| save(file_name) | Saves the document to a file. Automatically determines the save format from the extension. |
| save(file_name, save_format) | Saves the document to a file in the specified format. |
| save(file_name, save_options) | Saves the document to a file using the specified save options. |
| save(stream, save_format) | Saves the document to a stream using the specified format. |
| save(stream, save_options) | Saves the document to a stream using the specified save options. |
| select_nodes(xpath) | Selects a list of nodes matching the XPath expression. (Inherited from CompositeNode) |
| select_single_node(xpath) | Selects the first Node that matches the XPath expression. (Inherited from CompositeNode) |
| start_track_revisions(author, date_time) | Starts automatically marking all further changes you make to the document programmatically as revision changes. |
| start_track_revisions(author) | Starts automatically marking all further changes you make to the document programmatically as revision changes. |
| stop_track_revisions() | Stops automatic marking of document changes as revisions. |
| to_string(save_format) | Exports the content of the node into a string in the specified format. (Inherited from Node) |
| to_string(save_options) | Exports the content of the node into a string using the specified save options. (Inherited from Node) |
| unlink_fields() | Unlinks fields in the whole document. |
| unprotect() | Removes protection from the document regardless of the password. |
| unprotect(password) | Removes protection from the document if a correct password is specified. |
| update_actual_reference_marks() | Updates the Footnote.actual_reference_mark property of all footnotes and endnotes in the document. |
| update_fields() | Updates the values of fields in the whole document. |
| update_list_labels() | Updates list labels for all list items in the document. |
| update_page_layout() | Rebuilds the page layout of the document. |
| update_table_layout() | Implements an earlier approach to table column widths re-calculation that has known issues. |
| update_thumbnail(options) | Updates BuiltInDocumentProperties.thumbnail of the document according to the specified options. |
| update_thumbnail() | Updates BuiltInDocumentProperties.thumbnail of the document using default options. |
| update_word_count() | Updates word count properties of the document. |
| update_word_count(update_lines_count) | Updates word count properties of the document, optionally updates BuiltInDocumentProperties.lines property. |
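As one illustration of the methods listed above, the sketch below applies read-only protection with `protect(type, password)` and then lifts it with `unprotect()`, reading `protection_type` after each step. The helper name `protect_and_release` is hypothetical, and the guarded import is an assumption made so that the snippet loads without the package.

```python
try:
    import aspose.words as aw  # third-party package: pip install aspose-words
except ImportError:  # assumption: allow importing this sketch without the package
    aw = None


def protect_and_release(password: str):
    """Return the protection type while protected and after unprotecting."""
    doc = aw.Document()

    doc.protect(aw.ProtectionType.READ_ONLY, password)
    while_protected = doc.protection_type

    doc.unprotect()  # removes protection regardless of the password
    after_release = doc.protection_type

    return while_protected, after_release
```

Note that this kind of protection restricts editing in Microsoft Word. It is separate from encrypting the file with a password on save.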

Examples

Shows how to execute a mail merge with data from a DataTable.

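# This excerpt assumes `import aspose.words as aw`. DataTable, ARTIFACTS_DIR and
# the enclosing ExMailMerge test class are defined elsewhere and are not shown here.
def test_execute_data_table(self):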

    table = DataTable("Test")
    table.columns.add("CustomerName")
    table.columns.add("Address")
    table.rows.add(["Thomas Hardy", "120 Hanover Sq., London"])
    table.rows.add(["Paolo Accorti", "Via Monte Bianco 34, Torino"])

    # Below are two ways of using a DataTable as the data source for a mail merge.
    # 1 - Use the entire table to create one merged output for every row in the table:
    doc = ExMailMerge.create_source_doc_execute_data_table()

    doc.mail_merge.execute(table)

    doc.save(ARTIFACTS_DIR + "MailMerge.execute_data_table.whole_table.docx")

    # 2 - Use a single row of the table (here the second row) to create one merged output:
    doc = ExMailMerge.create_source_doc_execute_data_table()

    doc.mail_merge.execute(table.rows[1])

    doc.save(ARTIFACTS_DIR + "MailMerge.execute_data_table.one_row.docx")

@staticmethod
def create_source_doc_execute_data_table() -> aw.Document:
    """Creates a mail merge source document."""

    doc = aw.Document()
    builder = aw.DocumentBuilder(doc)

    builder.insert_field(" MERGEFIELD CustomerName ")
    builder.insert_paragraph()
    builder.insert_field(" MERGEFIELD Address ")

    return doc
