Document constructor
Document()
Creates a blank Word document.
def __init__(self):
...
Remarks
A blank document is retrieved from resources, and by default, the resulting document looks more like created by MsWordVersion.WORD2007. This blank document contains a default fonts table, minimal default styles, and latent styles.
CompatibilityOptions.optimize_for() method can be used to optimize the document contents as well as default Aspose.Words behavior to a particular version of MS Word.
The document paper size is Letter by default. If you want to change page setup, use Section.page_setup.
After creation, you can use DocumentBuilder to add document content easily.
Document(file_name)
Opens an existing document from a file. Automatically detects the file format.
def __init__(self, file_name: str):
...
| Parameter | Type | Description |
|---|---|---|
| file_name | str | File name of the document to open. |
Exceptions
| exception | condition |
|---|---|
| RuntimeError (Proxy error(UnsupportedFileFormatException)) | The document format is not recognized or not supported. |
| RuntimeError (Proxy error(FileCorruptedException)) | The document appears to be corrupted and cannot be loaded. |
| RuntimeError (Proxy error(Exception)) | There is a problem with the document and it should be reported to Aspose.Words developers. |
| RuntimeError (Proxy error(IOException)) | There is an input/output exception. |
| RuntimeError (Proxy error(IncorrectPasswordException)) | The document is encrypted and requires a password to open, but you supplied an incorrect password. |
| RuntimeError (Proxy error(ArgumentException)) | The name of the file cannot be null or empty string. |
Document(file_name, load_options)
Opens an existing document from a file. Allows to specify additional options such as an encryption password.
def __init__(self, file_name: str, load_options: aspose.words.loading.LoadOptions):
...
| Parameter | Type | Description |
|---|---|---|
| file_name | str | File name of the document to open. |
| load_options | LoadOptions | Additional options to use when loading a document. Can be None. |
Exceptions
| exception | condition |
|---|---|
| RuntimeError (Proxy error(UnsupportedFileFormatException)) | The document format is not recognized or not supported. |
| RuntimeError (Proxy error(FileCorruptedException)) | The document appears to be corrupted and cannot be loaded. |
| RuntimeError (Proxy error(Exception)) | There is a problem with the document and it should be reported to Aspose.Words developers. |
| RuntimeError (Proxy error(IOException)) | There is an input/output exception. |
| RuntimeError (Proxy error(IncorrectPasswordException)) | The document is encrypted and requires a password to open, but you supplied an incorrect password. |
| RuntimeError (Proxy error(ArgumentException)) | The name of the file cannot be null or empty string. |
Document(stream)
Opens an existing document from a stream. Automatically detects the file format.
def __init__(self, stream: io.BytesIO):
...
| Parameter | Type | Description |
|---|---|---|
| stream | io.BytesIO | Stream where to load the document from. |
Remarks
The document must be stored at the beginning of the stream. The stream must support random positioning.
Exceptions
| exception | condition |
|---|---|
| RuntimeError (Proxy error(UnsupportedFileFormatException)) | The document format is not recognized or not supported. |
| RuntimeError (Proxy error(FileCorruptedException)) | The document appears to be corrupted and cannot be loaded. |
| RuntimeError (Proxy error(Exception)) | There is a problem with the document and it should be reported to Aspose.Words developers. |
| RuntimeError (Proxy error(IOException)) | There is an input/output exception. |
| RuntimeError (Proxy error(IncorrectPasswordException)) | The document is encrypted and requires a password to open, but you supplied an incorrect password. |
| RuntimeError (Proxy error(ArgumentNullException)) | The stream cannot be null. |
| RuntimeError (Proxy error(NotSupportedException)) | The stream does not support reading or seeking. |
| RuntimeError (Proxy error(ObjectDisposedException)) | The stream is a disposed object. |
Document(stream, load_options)
Opens an existing document from a stream. Allows to specify additional options such as an encryption password.
def __init__(self, stream: io.BytesIO, load_options: aspose.words.loading.LoadOptions):
...
| Parameter | Type | Description |
|---|---|---|
| stream | io.BytesIO | The stream where to load the document from. |
| load_options | LoadOptions | Additional options to use when loading a document. Can be None. |
Remarks
The document must be stored at the beginning of the stream. The stream must support random positioning.
Exceptions
| exception | condition |
|---|---|
| RuntimeError (Proxy error(UnsupportedFileFormatException)) | The document format is not recognized or not supported. |
| RuntimeError (Proxy error(FileCorruptedException)) | The document appears to be corrupted and cannot be loaded. |
| RuntimeError (Proxy error(Exception)) | There is a problem with the document and it should be reported to Aspose.Words developers. |
| RuntimeError (Proxy error(IOException)) | There is an input/output exception. |
| RuntimeError (Proxy error(IncorrectPasswordException)) | The document is encrypted and requires a password to open, but you supplied an incorrect password. |
| RuntimeError (Proxy error(ArgumentNullException)) | The stream cannot be null. |
| RuntimeError (Proxy error(NotSupportedException)) | The stream does not support reading or seeking. |
| RuntimeError (Proxy error(ObjectDisposedException)) | The stream is a disposed object. |
Examples
Shows how to create and load documents.
# There are two ways of creating a Document object using Aspose.Words.
# 1 - Create a blank document:
doc = aw.Document()
# New Document objects by default come with the minimal set of nodes
# required to begin adding content such as text and shapes: a Section, a Body, and a Paragraph.
doc.first_section.body.first_paragraph.append_child(aw.Run(doc=doc, text='Hello world!'))
# 2 - Load a document that exists in the local file system:
doc = aw.Document(file_name=MY_DIR + 'Document.docx')
# Loaded documents will have contents that we can access and edit.
self.assertEqual('Hello World!', doc.first_section.body.first_paragraph.get_text().strip())
# Some operations that need to occur during loading, such as using a password to decrypt a document,
# can be done by passing a LoadOptions object when loading the document.
doc = aw.Document(file_name=MY_DIR + 'Encrypted.docx', load_options=aw.loading.LoadOptions(password='docPassword'))
self.assertEqual('Test encrypted document.', doc.first_section.body.first_paragraph.get_text().strip())
Shows how to create simple document.
doc = aw.Document()
# New Document objects by default come with the minimal set of nodes
# required to begin adding content such as text and shapes: a Section, a Body, and a Paragraph.
section = doc.append_child(aw.Section(doc)).as_section()
body = section.append_child(aw.Body(doc)).as_body()
para = body.append_child(aw.Paragraph(doc)).as_paragraph()
para.append_child(aw.Run(doc=doc, text='Hello world!'))
Shows how to open a document and convert it to .PDF.
doc = aw.Document(file_name=MY_DIR + 'Document.docx')
doc.save(file_name=ARTIFACTS_DIR + 'Document.ConvertToPdf.pdf')
Shows how to load a PDF.
doc = aw.Document()
builder = aw.DocumentBuilder(doc)
builder.write('Hello world!')
doc.save(ARTIFACTS_DIR + 'PDF2Word.load_pdf.pdf')
# Below are two ways of loading PDF documents using Aspose products.
# 1 - Load as an Aspose.Words document:
aspose_words_doc = aw.Document(ARTIFACTS_DIR + 'PDF2Word.load_pdf.pdf')
self.assertEqual('Hello world!', aspose_words_doc.get_text().strip())
Shows how to convert a PDF to a .docx.
doc = aw.Document()
builder = aw.DocumentBuilder(doc)
builder.write('Hello world!')
doc.save(ARTIFACTS_DIR + 'PDF2Word.convert_pdf_to_docx.pdf')
# Load the PDF document that we just saved, and convert it to .docx.
pdf_doc = aw.Document(ARTIFACTS_DIR + 'PDF2Word.convert_pdf_to_docx.pdf')
pdf_doc.save(ARTIFACTS_DIR + 'PDF2Word.convert_pdf_to_docx.docx')
Shows how to load an encrypted Microsoft Word document.
doc = None
# Aspose.Words throw an exception if we try to open an encrypted document without its password.
with self.assertRaises(Exception):
doc = aw.Document(file_name=MY_DIR + 'Encrypted.docx')
# When loading such a document, the password is passed to the document's constructor using a LoadOptions object.
options = aw.loading.LoadOptions(password='docPassword')
# There are two ways of loading an encrypted document with a LoadOptions object.
# 1 - Load the document from the local file system by filename:
doc = aw.Document(file_name=MY_DIR + 'Encrypted.docx', load_options=options)
# 2 - Load the document from a stream:
with system_helper.io.File.open_read(MY_DIR + 'Encrypted.docx') as stream:
doc = aw.Document(stream=stream, load_options=options)
Shows how to load a document using a stream.
with system_helper.io.File.open_read(MY_DIR + 'Document.docx') as stream:
doc = aw.Document(stream=stream)
self.assertEqual('Hello World!\r\rHello Word!\r\r\rHello World!', doc.get_text().strip())
Shows how to load a document from a URL.
from io import BytesIO
from api_example_base import ApiExampleBase, MY_DIR, ARTIFACTS_DIR, GOLDS_DIR, TEMP_DIR, IMAGE_DIR, FONTS_DIR
import aspose.words as aw
from urllib.request import urlopen
# Create a URL that points to a Microsoft Word document.
url = 'https://filesamples.com/samples/document/docx/sample3.docx'
# Download the document into a byte array, then load that array into a document using a memory stream.
with urlopen(url) as response:
data_bytes = response.read()
with BytesIO(data_bytes) as byte_stream:
doc = aw.Document(byte_stream)
# At this stage, we can read and edit the document's contents and then save it to the local file system.
assert doc.first_section.body.paragraphs[3].get_text().strip() == 'There are eight section headings in this document. At the beginning, "Sample Document" is a level 1 heading. ' + 'The main section headings, such as "Headings" and "Lists" are level 2 headings. ' + 'The Tables section contains two sub-headings, "Simple Table" and "Complex Table," which are both level 3 headings.'
doc.save(ARTIFACTS_DIR + 'Document.LoadFromWeb.docx')
Shows how to open an HTML document with images from a stream using a base URI.
with system_helper.io.File.open_read(MY_DIR + 'Document.html') as stream:
# Pass the URI of the base folder while loading it
# so that any images with relative URIs in the HTML document can be found.
load_options = aw.loading.LoadOptions()
load_options.base_uri = IMAGE_DIR
doc = aw.Document(stream=stream, load_options=load_options)
# Verify that the first shape of the document contains a valid image.
shape = doc.get_child(aw.NodeType.SHAPE, 0, True).as_shape()
self.assertTrue(shape.is_image)
self.assertIsNotNone(shape.image_data.image_bytes)
self.assertAlmostEqual(32, aw.ConvertUtil.point_to_pixel(points=shape.width), delta=0.01)
self.assertAlmostEqual(32, aw.ConvertUtil.point_to_pixel(points=shape.height), delta=0.01)
See Also
- module aspose.words
- class Document