Document class
Represents a Word document. To learn more, visit the Working with Document documentation article.
Remarks
The Document is a central object in the Aspose.Words library.
To load an existing document in any of the LoadFormat formats, pass a file name or a stream into one of the Document constructors. To create a blank document, call the constructor without parameters.
Use one of the Save method overloads to save the document in any of the SaveFormat formats.
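As a rough illustration of the save step, the sketch below writes one document to several formats by relying on the `save(file_name)` overload, which determines the format from the file extension. The helper name `save_as` and its parameters are hypothetical, and `doc` is assumed to be an already loaded `Document`.

```python
def save_as(doc, base_name, extensions):
    """Save `doc` once per extension and return the file names written.

    `doc` is assumed to be an aspose.words Document (or any object
    exposing a compatible save(file_name) method).
    """
    written = []
    for ext in extensions:
        file_name = f"{base_name}.{ext}"
        # save(file_name) determines the save format from the extension.
        doc.save(file_name)
        written.append(file_name)
    return written
```

With `import aspose.words as aw`, a call might look like `save_as(aw.Document("report.docx"), "report", ["pdf", "html"])`; the file names here are placeholders.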
Document.mail_merge is the Aspose.Words reporting engine. It lets you quickly populate reports designed in Microsoft Word with data from various data sources.
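A simple mail merge could look roughly like the sketch below. It assumes that the `MailMerge` object returned by `Document.mail_merge` offers an `execute(field_names, values)` method taking two parallel lists; that method is not described on this page, so check the `MailMerge` reference before relying on it. The helper name `fill_template` is hypothetical.

```python
def fill_template(doc, values):
    """Fill the merge fields of `doc` from a {field name: value} mapping.

    `doc` is assumed to be an aspose.words Document that contains
    MERGEFIELD fields named after the keys of `values`.
    """
    field_names = list(values.keys())
    field_values = list(values.values())
    # Assumption: MailMerge.execute accepts parallel lists of names and values.
    doc.mail_merge.execute(field_names, field_values)
    return doc
```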
Document stores document-wide information such as DocumentBase.styles, Document.built_in_document_properties, Document.custom_document_properties, lists and macros. Most of these objects are accessible via the corresponding properties of the Document.
The Document is the root node of a tree that contains all other nodes of the document. The tree follows the Composite design pattern and is in many ways similar to XmlDocument. The content of the document can be manipulated freely in code:
- The nodes of the document can be accessed via typed collections, for example Document.sections or ParagraphCollection.
- The nodes of the document can be selected by their node type using CompositeNode.get_child_nodes() or using an XPath query with CompositeNode.select_nodes() or CompositeNode.select_single_node().
- Content nodes can be added or removed from anywhere in the document using CompositeNode.insert_before(), CompositeNode.insert_after(), CompositeNode.remove_child() and other methods provided by the base class CompositeNode.
- The formatting attributes of each node can be changed via the properties of that node.
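Selecting nodes by type can be sketched as follows. The helper `collect_text` is hypothetical; it only uses `get_child_nodes(node_type, is_deep)` and `get_text()` as listed in the method tables, and it assumes the returned collection can be iterated with a `for` loop.

```python
def collect_text(doc, node_type, deep=True):
    """Return the stripped text of every child node of the given type.

    `doc` is assumed to be an aspose.words Document (or any composite
    node); `node_type` is a NodeType value supplied by the caller.
    """
    # get_child_nodes returns a live collection, so copy the text out
    # rather than holding on to the collection itself.
    nodes = doc.get_child_nodes(node_type, deep)
    return [node.get_text().strip() for node in nodes]
```

With `import aspose.words as aw`, a call such as `collect_text(doc, aw.NodeType.PARAGRAPH)` would be the typical use, assuming `NodeType.PARAGRAPH` is the enum member for paragraphs.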
Consider using DocumentBuilder, which simplifies the task of programmatically creating or populating the document tree.
The Document can contain only Section objects.
In Microsoft Word, a valid document needs to have at least one section.
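The section requirement can be illustrated with a small sketch that empties a document and then restores the minimum valid structure. The helper name `reset_to_blank` is hypothetical; it relies only on `remove_all_children()`, `ensure_minimum()` and the `count` property described on this page.

```python
def reset_to_blank(doc):
    """Remove all content from `doc` and restore a minimal valid structure.

    Returns the number of immediate children afterwards. Because a
    Document can contain only Section objects, this is the section count.
    """
    # After this call the document has no sections at all.
    doc.remove_all_children()
    # ensure_minimum() creates one section with one paragraph if none exist.
    doc.ensure_minimum()
    return doc.count
```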
Inheritance: Document → DocumentBase → CompositeNode → Node
Constructors
Name | Description |
---|---|
Document() | Creates a blank Word document. |
Document(file_name) | Opens an existing document from a file. Automatically detects the file format. |
Document(file_name, load_options) | Opens an existing document from a file. Lets you specify additional options such as an encryption password. |
Document(stream) | Opens an existing document from a stream. Automatically detects the file format. |
Document(stream, load_options) | Opens an existing document from a stream. Lets you specify additional options such as an encryption password. |
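The file-based constructors might be used along the lines of the sketch below. The helper `open_document` is hypothetical, and the sketch assumes that `LoadOptions` lives in the `aspose.words.loading` module and accepts a password as its only argument. The import sits inside the function only so that the snippet can be defined without the package installed; regular code would import at module level.

```python
def open_document(path, password=None):
    """Open a document from `path`, optionally supplying a password."""
    # Requires the aspose-words package.
    import aspose.words as aw

    if password is None:
        # Document(file_name): the file format is detected automatically.
        return aw.Document(path)
    # Assumption: LoadOptions(password) carries the encryption password.
    load_options = aw.loading.LoadOptions(password)
    # Document(file_name, load_options): load with additional options.
    return aw.Document(path, load_options)
```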
Properties
Name | Description |
---|---|
attached_template | Gets or sets the full path of the template attached to the document. |
automatically_update_styles | Gets or sets a flag indicating whether the styles in the document are updated to match the styles in the attached template each time the document is opened in MS Word. |
background_shape | Gets or sets the background shape of the document. Can be None. (Inherited from DocumentBase) |
bibliography | Gets the Bibliography object that represents the list of sources available in the document. |
built_in_document_properties | Returns a collection that represents all the built-in document properties of the document. |
compatibility_options | Provides access to document compatibility options (that is, the user preferences entered on the Compatibility tab of the Options dialog in Word). |
compliance | Gets the OOXML compliance version determined from the loaded document content. Makes sense only for OOXML documents. |
count | Gets the number of immediate children of this node. (Inherited from CompositeNode) |
custom_document_properties | Returns a collection that represents all the custom document properties of the document. |
custom_node_id | Specifies custom node identifier. (Inherited from Node) |
custom_xml_parts | Gets or sets the collection of Custom XML Data Storage Parts. |
default_tab_stop | Gets or sets the interval (in points) between the default tab stops. |
digital_signatures | Gets the collection of digital signatures for this document and their validation results. |
document | Gets the document to which this node belongs. (Inherited from Node) |
endnote_options | Provides options that control numbering and positioning of endnotes in this document. |
field_options | Gets a FieldOptions object that represents options to control field handling in the document. |
first_child | Gets the first child of the node. (Inherited from CompositeNode) |
first_section | Gets the first section in the document. |
font_infos | Provides access to properties of fonts used in this document. (Inherited from DocumentBase) |
font_settings | Gets or sets document font settings. |
footnote_options | Provides options that control numbering and positioning of footnotes in this document. |
footnote_separators | Provides access to the footnote/endnote separators defined in the document. (Inherited from DocumentBase) |
frameset | Returns a Frameset instance if this document represents a frames page. |
glossary_document | Gets or sets the glossary document within this document or template. A glossary document is a storage for AutoText, AutoCorrect and Building Block entries defined in a document. |
grammar_checked | Returns True if the document has been checked for grammar. |
has_child_nodes | Returns True if this node has any child nodes. (Inherited from CompositeNode) |
has_macros | Returns True if the document has a VBA project (macros). |
has_revisions | Returns True if the document has any tracked changes. |
hyphenation_options | Provides access to document hyphenation options. |
include_textboxes_footnotes_endnotes_in_stat | Specifies whether to include textboxes, footnotes and endnotes in word count statistics. |
is_composite | Returns True if this node can contain other nodes. (Inherited from Node) |
justification_mode | Gets or sets the character spacing adjustment of a document. |
last_child | Gets the last child of the node. (Inherited from CompositeNode) |
last_section | Gets the last section in the document. |
layout_options | Gets a LayoutOptions object that represents options to control the layout process of this document. |
lists | Provides access to the list formatting used in the document. (Inherited from DocumentBase) |
mail_merge | Returns a MailMerge object that represents the mail merge functionality for the document. |
mail_merge_settings | Gets or sets the object that contains all of the mail merge information for a document. |
next_sibling | Gets the node immediately following this node. (Inherited from Node) |
node_changing_callback | Called when a node is inserted or removed in the document. (Inherited from DocumentBase) |
node_type | Returns NodeType.DOCUMENT. |
original_file_name | Gets the original file name of the document. |
original_load_format | Gets the format of the original document that was loaded into this object. |
package_custom_parts | Gets or sets the collection of custom parts (arbitrary content) that are linked to the OOXML package using “unknown relationships”. |
page_color | Gets or sets the page color of the document. This property is a simpler version of DocumentBase.background_shape. (Inherited from DocumentBase) |
page_count | Gets the number of pages in the document as calculated by the most recent page layout operation. |
parent_node | Gets the immediate parent of this node. (Inherited from Node) |
previous_sibling | Gets the node immediately preceding this node. (Inherited from Node) |
protection_type | Gets the currently active document protection type. |
punctuation_kerning | Specifies whether kerning applies to both Latin text and punctuation. |
range | Returns a Range object that represents the portion of a document that is contained in this node. (Inherited from Node) |
remove_personal_information | Gets or sets a flag indicating that Microsoft Word will remove all user information from comments, revisions and document properties upon saving the document. |
resource_loading_callback | Controls how external resources are loaded. (Inherited from DocumentBase) |
revisions | Gets a collection of revisions (tracked changes) that exist in this document. |
revisions_view | Gets or sets a value indicating whether to work with the original or revised version of a document. |
sections | Returns a collection that represents all sections in the document. |
shade_form_data | Specifies whether to turn on the gray shading on form fields. |
show_grammatical_errors | Specifies whether to display grammar errors in this document. |
show_spelling_errors | Specifies whether to display spelling errors in this document. |
spelling_checked | Returns True if the document has been checked for spelling. |
styles | Returns a collection of styles defined in the document. (Inherited from DocumentBase) |
theme | Gets the Theme object for this document. |
track_revisions | True if changes are tracked when this document is edited in Microsoft Word. |
variables | Returns the collection of variables added to a document or template. |
vba_project | Gets or sets the VbaProject of the document. |
versions_count | Gets the number of document versions that were stored in the DOC document. |
view_options | Provides options to control how the document is displayed in Microsoft Word. |
warning_callback | Called during various document processing procedures when an issue is detected that might result in data or formatting fidelity loss. (Inherited from DocumentBase) |
watermark | Provides access to the document watermark. |
web_extension_task_panes | Returns a collection that represents a list of task pane add-ins. |
write_protection | Provides access to the document write protection options. |
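Several of the read-only properties above can be combined into a quick report, as in the sketch below. The helper name `describe` and the dictionary keys are hypothetical; the property names are taken from the table.

```python
def describe(doc):
    """Collect a few document-wide facts from `doc` into a dictionary.

    `doc` is assumed to be an aspose.words Document.
    """
    return {
        "file": doc.original_file_name,
        # Page count as calculated by the most recent page layout operation.
        "pages": doc.page_count,
        # A Document contains only Section objects, so count == sections.
        "sections": doc.count,
        "has_macros": doc.has_macros,
        "has_revisions": doc.has_revisions,
        "protection": doc.protection_type,
    }
```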
Methods
Name | Description |
---|---|
accept(visitor) | Accepts a visitor. |
accept_all_revisions() | Accepts all tracked changes in the document. |
accept_end(visitor) | Accepts a visitor for visiting the end of the document. |
accept_start(visitor) | Accepts a visitor for visiting the start of the document. |
append_child(new_child) | Adds the specified node to the end of the list of child nodes for this node. (Inherited from CompositeNode) |
append_document(src_doc, import_format_mode) | Appends the specified document to the end of this document. |
append_document(src_doc, import_format_mode, import_format_options) | Appends the specified document to the end of this document. |
cleanup() | Cleans unused styles and lists from the document. |
cleanup(options) | Cleans unused styles and lists from the document depending on given CleanupOptions. |
clone() | Performs a deep copy of the Document. |
clone(is_clone_children) | Performs a deep copy of the Document. |
compare(document, author, date_time) | Compares this document with another document, producing changes as a number of edit and format revisions (Revision). |
compare(document, author, date_time, options) | Compares this document with another document, producing changes as a number of edit and format revisions (Revision). Lets you specify comparison options using CompareOptions. |
copy_styles_from_template(template) | Copies styles from the specified template to a document. The template is given as a file name. |
copy_styles_from_template(template) | Copies styles from the specified template to a document. The template is given as a Document object. |
ensure_minimum() | If the document contains no sections, creates one section with one paragraph. |
expand_table_styles_to_direct_formatting() | Converts formatting specified in table styles into direct formatting on tables in the document. |
extract_pages(index, count) | Returns a Document object representing the specified range of pages. |
get_ancestor(ancestor_type) | Gets the first ancestor of the specified object type. (Inherited from Node) |
get_ancestor(ancestor_type) | Gets the first ancestor of the specified NodeType. (Inherited from Node) |
get_child(node_type, index, is_deep) | Returns an Nth child node that matches the specified type. (Inherited from CompositeNode) |
get_child_nodes(node_type, is_deep) | Returns a live collection of child nodes that match the specified type. (Inherited from CompositeNode) |
get_page_info(page_index) | Gets the page size, orientation and other information about a page that might be useful for printing or rendering. |
get_text() | Gets the text of this node and of all its children. (Inherited from Node) |
import_node(src_node, is_import_children) | Imports a node from another document to the current document. (Inherited from DocumentBase) |
import_node(src_node, is_import_children, import_format_mode) | Imports a node from another document to the current document with an option to control formatting. (Inherited from DocumentBase) |
index_of(child) | Returns the index of the specified child node in the child node array. (Inherited from CompositeNode) |
insert_after(new_child, ref_child) | Inserts the specified node immediately after the specified reference node. (Inherited from CompositeNode) |
insert_before(new_child, ref_child) | Inserts the specified node immediately before the specified reference node. (Inherited from CompositeNode) |
join_runs_with_same_formatting() | Joins runs with same formatting in all paragraphs of the document. |
next_pre_order(root_node) | Gets next node according to the pre-order tree traversal algorithm. (Inherited from Node) |
node_type_to_string(node_type) | A utility method that converts a node type enum value into a user-friendly string. (Inherited from Node) |
normalize_field_types() | Changes field type values FieldChar.field_type of FieldStart, FieldSeparator, FieldEnd in the whole document so that they correspond to the field types contained in the field codes. |
prepend_child(new_child) | Adds the specified node to the beginning of the list of child nodes for this node. (Inherited from CompositeNode) |
previous_pre_order(root_node) | Gets the previous node according to the pre-order tree traversal algorithm. (Inherited from Node) |
protect(type) | Protects the document from changes, either keeping the existing password or assigning a random password. |
protect(type, password) | Protects the document from changes and optionally sets a protection password. |
remove() | Removes itself from the parent. (Inherited from Node) |
remove_all_children() | Removes all the child nodes of the current node. (Inherited from CompositeNode) |
remove_blank_pages() | Removes blank pages from the document. |
remove_child(old_child) | Removes the specified child node. (Inherited from CompositeNode) |
remove_external_schema_references() | Removes external XML schema references from this document. |
remove_macros() | Removes all macros (the VBA project) as well as toolbars and command customizations from the document. |
remove_smart_tags() | Removes all SmartTag descendant nodes of the current node. (Inherited from CompositeNode) |
save(file_name) | Saves the document to a file. Automatically determines the save format from the extension. |
save(file_name, save_format) | Saves the document to a file in the specified format. |
save(file_name, save_options) | Saves the document to a file using the specified save options. |
save(stream, save_format) | Saves the document to a stream using the specified format. |
save(stream, save_options) | Saves the document to a stream using the specified save options. |
select_nodes(xpath) | Selects a list of nodes matching the XPath expression. (Inherited from CompositeNode) |
select_single_node(xpath) | Selects the first Node that matches the XPath expression. (Inherited from CompositeNode) |
start_track_revisions(author, date_time) | Starts automatically marking all further changes you make to the document programmatically as revision changes. |
start_track_revisions(author) | Starts automatically marking all further changes you make to the document programmatically as revision changes. |
stop_track_revisions() | Stops automatic marking of document changes as revisions. |
to_string(save_format) | Exports the content of the node into a string in the specified format. (Inherited from Node) |
to_string(save_options) | Exports the content of the node into a string using the specified save options. (Inherited from Node) |
unlink_fields() | Unlinks fields in the whole document. |
unprotect() | Removes protection from the document regardless of the password. |
unprotect(password) | Removes protection from the document if a correct password is specified. |
update_actual_reference_marks() | Updates the Footnote.actual_reference_mark property of all footnotes and endnotes in the document. |
update_fields() | Updates the values of fields in the whole document. |
update_list_labels() | Updates list labels for all list items in the document. |
update_page_layout() | Rebuilds the page layout of the document. |
update_table_layout() | Implements an earlier approach to recalculating table column widths that has known issues. |
update_thumbnail(options) | Updates BuiltInDocumentProperties.thumbnail of the document according to the specified options. |
update_thumbnail() | Updates BuiltInDocumentProperties.thumbnail of the document using default options. |
update_word_count() | Updates word count properties of the document. |
update_word_count(update_lines_count) | Updates word count properties of the document and, optionally, the BuiltInDocumentProperties.lines property. |
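Several of the methods above are often chained before a document is distributed. The sketch below shows one possible sequence; the helper name `prepare_for_release` and the choice of steps are assumptions, while the method and property names come from the tables on this page.

```python
def prepare_for_release(doc):
    """Clean up `doc` before distribution and return its page count.

    `doc` is assumed to be an aspose.words Document.
    """
    if doc.has_revisions:
        # Accept all tracked changes so none remain in the output.
        doc.accept_all_revisions()
    if doc.has_macros:
        # Remove the VBA project together with toolbar customizations.
        doc.remove_macros()
    # Refresh field values first, then rebuild the layout so that
    # page_count reflects the final content.
    doc.update_fields()
    doc.update_page_layout()
    return doc.page_count
```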
See Also
- module aspose.words
- class DocumentBase