HTMLDocument

HTMLDocument class

Represents an HTML document. All top-level HTML objects are added to this object. This class represents the HTML page as it appears in a browser: forms, tables, scripts, and other elements are added to the page through the interfaces of this class. HTMLDocument is the HTML implementation of the more general Document interface, and both serve as the root of the DOM (Document Object Model). These concepts follow the official web standards. For the purposes of web development, you can generally think of HTMLDocument as an alias for Document, upon which HTMLDocument is based.

public class HTMLDocument : Document, IDocumentCSS

Constructors

Name Description
HTMLDocument() Creates a new HTMLDocument object that represents a web page and serves as an entry point into the page’s content.
HTMLDocument(Configuration) Creates a new HTMLDocument object that represents a web page and serves as an entry point into the page’s content, with specified environment configuration settings.
HTMLDocument(RequestMessage) Creates an HTML document from the RequestMessage object.
HTMLDocument(string) Loads the HTML document from an address.
HTMLDocument(Url) Loads the HTML document from a URL.
HTMLDocument(RequestMessage, Configuration) Creates an HTML document from the RequestMessage object with specified environment configuration settings.
HTMLDocument(Stream, string) Creates an HTML document from a Stream content with specified base-uri that is used to resolve the relative resources’ path.
HTMLDocument(Stream, Url) Creates an HTML document from a Stream content with specified base-uri that is used to resolve the relative resources’ path.
HTMLDocument(string, Configuration) Loads the HTML document from an address with specified environment configuration settings.
HTMLDocument(string, string) Creates an HTML document from a String content with specified base-uri.
HTMLDocument(string, Url) Creates an HTML document from a String content with specified base-uri.
HTMLDocument(Url, Configuration) Loads the HTML document from a URL with specified environment configuration settings.
HTMLDocument(Stream, string, Configuration) Creates an HTML document from a Stream content with specified base-uri and environment configuration settings.
HTMLDocument(Stream, Url, Configuration) Creates an HTML document from a Stream content with specified base-uri and environment configuration settings.
HTMLDocument(string, string, Configuration) Creates an HTML document from a String content with specified base-uri and environment configuration settings.
HTMLDocument(string, Url, Configuration) Creates an HTML document from a String content with specified base-uri and environment configuration settings.
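
As a minimal sketch of the HTMLDocument(string, string) overload listed above, the following helper parses markup held in memory and reads the Title property. The class name ConstructorSketch, the helper ReadTitle, the sample markup, and the "." base URI are illustrative assumptions, not part of the API; the Aspose.Html namespace is assumed to be the one that declares HTMLDocument.

```csharp
using System;
using Aspose.Html;

public static class ConstructorSketch
{
    // Hypothetical helper: parses in-memory markup and returns the document title.
    public static string ReadTitle(string html)
    {
        // "." is an assumed base URI used to resolve relative resource paths.
        using (var document = new HTMLDocument(html, "."))
        {
            return document.Title;
        }
    }

    public static void Main()
    {
        var html = "<html><head><title>Demo</title></head><body></body></html>";
        Console.WriteLine(ReadTitle(html));
    }
}
```

The using statement relies on the Dispose() method listed under Methods, so the document's resources are released as soon as the title has been read.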

Properties

Name Description
Anchors { get; } A collection of all the anchor (A) elements in a document with a value for the name attribute. For reasons of backward compatibility, the returned set of anchors only contains those anchors created with the name attribute, not those created with the id attribute. Note that in [XHTML 1.0], the name attribute (see section 4.10) has no semantics and is only present for legacy user agents: the id attribute is used instead. Users should prefer the iterator mechanisms provided by [DOM Level 2 Traversal] instead.
Applets { get; } A collection of all the OBJECT elements that include applets and APPLET (deprecated) elements in a document.
virtual Attributes { get; } The attributes property returns a live collection of all attribute nodes registered to the specified node. Attributes is a key/value pair of strings that represents any information regarding that attribute.
override BaseURI { get; } The absolute base URI of this node or null if the implementation wasn’t able to obtain an absolute URI.
Body { get; set; } The element that contains the content for the document. In documents with BODY contents, returns the BODY element. In frameset documents, this returns the outermost FRAMESET element.
CharacterSet { get; } Gets the document’s encoding.
Charset { get; } Gets the document’s encoding.
ChildElementCount { get; } Returns the current number of element nodes that are children of this element. 0 if this element has no child nodes that are of nodeType 1.
ChildNodes { get; } The read-only childNodes property of the Node interface returns a live NodeList of child nodes of the given element where the first child node is assigned index 0. Child nodes include elements, text and comments.
Children { get; } Returns the child elements.
ContentType { get; } Gets the document content type.
Context { get; } Gets the current browsing context.
DefaultView { get; } The defaultView IDL attribute of the Document interface, on getting, must return this Document’s browsing context’s WindowProxy object, if this Document has an associated browsing context, or null otherwise.
Doctype { get; } The Document Type Declaration associated with this document.
DocumentElement { get; } This is a convenience attribute that allows direct access to the child node that is the document element of the document.
DocumentURI { get; } The location of the document or null if undefined or if the Document was created using DOMImplementation.createDocument.
Domain { get; } The domain name of the server that served the document, or null if the server cannot be identified by a domain name.
FirstChild { get; } The read-only firstChild property of the Node interface returns the node’s first child in the tree, or null if the node has no children.
FirstElementChild { get; } Returns the first child element node of this element. null if this element has no child elements.
Forms { get; } A collection of all the forms of a document.
Images { get; } A collection of all the IMG elements in a document. The behavior is limited to IMG elements for backwards compatibility. As suggested by [HTML 4.01], to include images, authors may use the OBJECT element or the IMG element. Therefore, it is recommended not to use this attribute to find the images in the document but getElementsByTagName with HTML 4.01 or getElementsByTagNameNS with XHTML 1.0.
Implementation { get; } The DOMImplementation object that handles this document.
InputEncoding { get; } Gets the document’s encoding.
LastChild { get; } The read-only lastChild property of the Node interface returns the last child of the node. If its parent is an element, then the child is generally an element node, a text node, or a comment node. It returns null if there are no child nodes.
LastElementChild { get; } Returns the last child element node of this element. null if this element has no child elements.
Links { get; } A collection of all AREA elements and anchor (A) elements in a document with a value for the href attribute.
virtual LocalName { get; } Returns the local part of the qualified name of this node. For nodes of any type other than ELEMENT_NODE and ATTRIBUTE_NODE and nodes created with a DOM Level 1 method, such as Document.createElement(), this is always null.
Location { get; } The location of the document.
virtual NamespaceURI { get; } The Element.namespaceURI read-only property returns the namespace URI of the element, or null if the element is not in a namespace.
NextElementSibling { get; } Returns the next sibling element node of this element. null if this element has no element sibling nodes that come after this one in the document tree.
NextSibling { get; } The read-only nextSibling property of the Node interface returns the node immediately following the specified one in their parent’s childNodes, or returns null if the specified node is the last child in the parent element.
override NodeName { get; } The name of this node, depending on its type.
override NodeType { get; } A code representing the type of the underlying object.
virtual NodeValue { get; set; } The nodeValue property of the Node interface returns or sets the value of the current node.
Origin { get; } Gets the document origin.
override OwnerDocument { get; } Gets the owner document.
ParentElement { get; } The read-only parentElement property of Node interface returns the DOM node’s parent Element, or null if the node either has no parent, or its parent isn’t a DOM Element.
ParentNode { get; } The read-only parentNode property of the Node interface returns the parent of the specified node in the DOM tree.
virtual Prefix { get; set; } The prefix property returns the namespace prefix of the specified element, or null if no prefix is specified.
PreviousElementSibling { get; } Returns the previous sibling element node of this element. null if this element has no element sibling nodes that come before this one in the document tree.
PreviousSibling { get; } The read-only previousSibling property of the Node interface returns the node immediately preceding the specified one in its parent’s childNodes list, or null if the specified node is the first in that list.
ReadyState { get; } Returns the document readiness. The value is “loading” while the Document is loading, “interactive” once it has finished parsing but is still loading sub-resources, and “complete” once it has loaded.
Referrer { get; } Returns the URI [IETF RFC 2396] of the page that linked to this page. The value is an empty string if the user navigated to the page directly (not through a link, but, for example, via a bookmark).
StrictErrorChecking { get; set; } An attribute specifying whether error checking is enforced or not. When set to false, the implementation is free to not test every possible error case normally defined on DOM operations, and not raise any DOMException on DOM operations or report errors while using Document.normalizeDocument(). In case of error, the behavior is undefined. This attribute is true by default.
StyleSheets { get; } A list containing all the style sheets explicitly linked into or embedded in a document. For HTML documents, this includes external style sheets, included via the HTML LINK element, and inline STYLE elements.
virtual TextContent { get; set; } The textContent property of the Node interface represents the text content of the node and its descendants.
Title { get; set; } The title of a document as specified by the TITLE element in the head of the document.
XmlStandalone { get; set; } An attribute specifying, as part of the XML declaration, whether this document is standalone. This is false when unspecified.
XmlVersion { get; set; } An attribute specifying, as part of the XML declaration, the version number of this document. If there is no declaration and if this document supports the “XML” feature, the value is “1.0”. If this document does not support the “XML” feature, the value is always null.

Methods

Name Description
AddEventListener(string, IEventListener) The addEventListener() method of the EventTarget interface sets up a function that will be called whenever the specified event is delivered to the target.
AddEventListener(string, DOMEventHandler, bool) The addEventListener() method of the EventTarget interface sets up a function that will be called whenever the specified event is delivered to the target.
AddEventListener(string, IEventListener, bool) The addEventListener() method of the EventTarget interface sets up a function that will be called whenever the specified event is delivered to the target.
AppendChild(Node) The appendChild() method of the Node interface adds a node to the end of the list of children of a specified parent node. If the given child is a reference to an existing node in the document, appendChild() moves it from its current position to the new position (there is no requirement to remove the node from its parent node before appending it to some other node).
CloneNode() The cloneNode() method of the Node interface returns a duplicate of the node on which this method was called. Its parameter controls if the subtree contained in a node is also cloned or not.
CloneNode(bool) The cloneNode() method of the Node interface returns a duplicate of the node on which this method was called. Its parameter controls if the subtree contained in a node is also cloned or not.
CreateAttribute(string) The Document.createAttribute() method creates a new attribute node, and returns it. The object created is a node implementing the Attr interface. The DOM does not enforce what sort of attributes can be added to a particular element in this manner.
CreateAttributeNS(string, string) The Document.createAttributeNS() method creates a new attribute node with the specified namespace URI and qualified name, and returns it. The object created is a node implementing the Attr interface. The DOM does not enforce what sort of attributes can be added to a particular element in this manner.
CreateCDATASection(string) Creates a CDATASection node whose value is the specified string.
CreateComment(string) Creates a Comment node given the specified string.
CreateDocumentFragment() Creates a new empty DocumentFragment into which DOM nodes can be added to build an offscreen DOM tree.
CreateDocumentType(string, string, string, string) The method returns a DocumentType object which can either be used with DOMImplementation.createDocument upon document creation or can be put into the document via methods like Node.insertBefore() or Node.replaceChild().
CreateElement(string) In an HTML document, the document.createElement() method creates the HTML element specified by tagName, or an HTMLUnknownElement if tagName isn’t recognized.
CreateElementNS(string, string) Creates an element of the given qualified name and namespace URI.
CreateEntityReference(string) Creates an EntityReference object. In addition, if the referenced entity is known, the child list of the EntityReference node is made the same as that of the corresponding Entity node.
CreateEvent(string) Creates an Event of a type supported by the implementation.
CreateExpression(string, IXPathNSResolver) Creates a parsed XPath expression with resolved namespaces. This is useful when an expression will be reused in an application since it makes it possible to compile the expression string into a more efficient internal form and preresolve all namespace prefixes which occur within the expression.
CreateNodeIterator(Node) Create a new NodeIterator over the subtree rooted at the specified node.
CreateNodeIterator(Node, long) Create a new NodeIterator over the subtree rooted at the specified node.
CreateNodeIterator(Node, long, INodeFilter) Create a new NodeIterator over the subtree rooted at the specified node.
CreateNSResolver(Node) Adapts any DOM node to resolve namespaces so that an XPath expression can be easily evaluated relative to the context of the node where it appeared within the document. This adapter works like the DOM Level 3 method lookupNamespaceURI on nodes in resolving the namespaceURI from a given prefix using the current information available in the node’s hierarchy at the time lookupNamespaceURI is called, also correctly resolving the implicit xml prefix.
CreateProcessingInstruction(string, string) Creates a ProcessingInstruction node given the specified name and data strings.
CreateTextNode(string) Creates a Text node given the specified string.
CreateTreeWalker(Node) Create a new TreeWalker over the subtree rooted at the specified node.
CreateTreeWalker(Node, long) Create a new TreeWalker over the subtree rooted at the specified node.
CreateTreeWalker(Node, long, INodeFilter) Create a new TreeWalker over the subtree rooted at the specified node.
DispatchEvent(Event) Dispatches an Event at the specified EventTarget, (synchronously) invoking the affected EventListeners in the appropriate order. The normal event processing rules (including the capturing and optional bubbling phase) also apply to events dispatched manually with dispatchEvent().
Dispose() Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources.
Evaluate(string, Node, IXPathNSResolver, XPathResultType, object) Evaluates an XPath expression string and returns a result of the specified type if possible.
GetElementById(string) The Document method getElementById() returns an Element object representing the element whose id property matches the specified string. Since element IDs are required to be unique if specified, they’re a useful way to get access to a specific element quickly.
GetElementsByClassName(string) The getElementsByClassName method of Document interface returns an array-like object of all child elements which have all of the given class name(s).
GetElementsByTagName(string) The getElementsByTagName method of Document interface returns an HTMLCollection of elements with the given tag name.
GetElementsByTagNameNS(string, string) Returns a list of elements with the given tag name belonging to the given namespace. The complete document is searched, including the root node.
GetOverrideStyle(Element, string) This method is used to retrieve the override style declaration for a specified element and a specified pseudo-element.
virtual GetPlatformType() This method is used to retrieve the ECMAScript object type.
virtual HasAttributes() The hasAttributes() method of the Element interface returns a boolean value indicating whether the current element has any attributes or not.
HasChildNodes() The hasChildNodes() method of the Node interface returns a boolean value indicating whether the given Node has child nodes or not.
ImportNode(Node, bool) Imports a node from another document to this document, without altering or removing the source node from the original document; this method creates a new copy of the source node.
InsertBefore(Node, Node) The insertBefore() method of the Node interface inserts a node before a reference node as a child of a specified parent node.
IsDefaultNamespace(string) The isDefaultNamespace() method of the Node interface accepts a namespace URI as an argument. It returns a boolean value that is true if the namespace is the default namespace on the given node and false if not.
IsEqualNode(Node) The isEqualNode() method of the Node interface tests whether two nodes are equal. Two nodes are equal when they have the same type and defining characteristics (for elements, this would be their ID, number of children, and so forth), their attributes match, and so on. The specific set of data points that must match varies depending on the types of the nodes.
IsSameNode(Node) The isSameNode() method of the Node interface is a legacy alias for the === strict equality operator. That is, it tests whether two nodes are the same (in other words, whether they reference the same object).
LookupNamespaceURI(string) The lookupNamespaceURI() method of the Node interface takes a prefix as parameter and returns the namespace URI associated with it on the given node if found (and null if not).
LookupPrefix(string) The lookupPrefix() method of the Node interface returns a String containing the prefix for a given namespace URI, if present, and null if not. When multiple prefixes are possible, the first prefix is returned.
Navigate(RequestMessage) Loads the document based on specified request object, replacing the previous content.
Navigate(string) Loads the document at the specified Uniform Resource Locator (URL) into the current instance, replacing the previous content.
Navigate(Url) Loads the document at the specified Uniform Resource Locator (URL) into the current instance, replacing the previous content.
Navigate(Stream, string) Loads the document from specified content and using baseUri to resolve relative resources, replacing the previous content. Document loading starts from the current position in the stream.
Navigate(Stream, Url) Loads the document from specified content and using baseUri to resolve relative resources, replacing the previous content. Document loading starts from the current position in the stream.
Navigate(string, string) Loads the document from specified content and using baseUri to resolve relative resources, replacing the previous content.
Navigate(string, Url) Loads the document from specified content and using baseUri to resolve relative resources, replacing the previous content.
Normalize() Puts all Text nodes in the full depth of the sub-tree underneath this Node, including attribute nodes, into a “normal” form where only structure (e.g., elements, comments, processing instructions, CDATA sections, and entity references) separates Text nodes, i.e., there are neither adjacent Text nodes nor empty Text nodes. This can be used to ensure that the DOM view of a document is the same as if it were saved and re-loaded, and is useful when operations (such as XPointer [XPointer] lookups) that depend on a particular document tree structure are to be used. If the parameter “normalize-characters” of the DOMConfiguration object attached to the Node.ownerDocument is true, this method will also fully normalize the characters of the Text nodes.
QuerySelector(string) Returns the first Element in the document that matches the selector.
QuerySelectorAll(string) Returns a NodeList of all the Elements in the document that match the selector.
RemoveChild(Node) The removeChild() method of the Node interface removes a child node from the DOM and returns the removed node.
RemoveEventListener(string, IEventListener) This method allows the removal of event listeners from the event target. If an EventListener is removed from an EventTarget while it is processing an event, it will not be triggered by the current actions. EventListeners can never be invoked after being removed.
RemoveEventListener(string, DOMEventHandler, bool) This method allows the removal of event listeners from the event target. If an EventListener is removed from an EventTarget while it is processing an event, it will not be triggered by the current actions. EventListeners can never be invoked after being removed.
RemoveEventListener(string, IEventListener, bool) This method allows the removal of event listeners from the event target. If an EventListener is removed from an EventTarget while it is processing an event, it will not be triggered by the current actions. EventListeners can never be invoked after being removed.
override RenderTo(IDevice) This method is used to print the contents of the current document to the specified device.
ReplaceChild(Node, Node) Replaces the child node oldChild with newChild in the list of children, and returns the oldChild node. If newChild is a DocumentFragment object, oldChild is replaced by all of the DocumentFragment children, which are inserted in the same order. If the newChild is already in the tree, it is first removed.
Save(IOutputStorage) Saves the document content and resources to the output storage.
Save(string) Saves the document to a local file specified by path. All resources used in this document will be saved into an adjacent folder, whose name will be constructed as: output_file_name + “_files”.
Save(Url) Saves the document to a local file specified by url. All resources used in this document will be saved into an adjacent folder, whose name will be constructed as output_file_name + “_files”.
Save(IOutputStorage, HTMLSaveFormat) Saves the document content and resources to the output storage.
Save(IOutputStorage, HTMLSaveOptions) Saves the document content and resources to the output storage.
Save(IOutputStorage, MarkdownSaveOptions) Saves the document content and resources to the output storage.
Save(IOutputStorage, MHTMLSaveOptions) Saves the document content and resources to the output storage.
Save(string, HTMLSaveFormat) Saves the document to a local file specified by path. All resources used in this document will be saved into an adjacent folder, whose name will be constructed as output_file_name + “_files”.
Save(string, HTMLSaveOptions) Saves the document to a local file specified by path. All resources used in this document will be saved into an adjacent folder, whose name will be constructed as: output_file_name + “_files”.
Save(string, MarkdownSaveOptions) Saves the document to a local file specified by path. All resources used in this document will be saved into an adjacent folder, whose name will be constructed as: output_file_name + “_files”.
Save(string, MHTMLSaveOptions) Saves the document to a local file specified by path. All resources used in this document will be saved into an adjacent folder, whose name will be constructed as: output_file_name + “_files”.
Save(Url, HTMLSaveFormat) Saves the document to a local file specified by url. All resources used in this document will be saved into an adjacent folder, whose name will be constructed as output_file_name + “_files”.
Save(Url, HTMLSaveOptions) Saves the document to a local file specified by url. All resources used in this document will be saved into an adjacent folder, whose name will be constructed as: output_file_name + “_files”.
Save(Url, MarkdownSaveOptions) Saves the document to a local file specified by url. All resources used in this document will be saved into an adjacent folder, whose name will be constructed as: output_file_name + “_files”.
Save(Url, MHTMLSaveOptions) Saves the document to a local file specified by url. All resources used in this document will be saved into an adjacent folder, whose name will be constructed as: output_file_name + “_files”.
override ToString() Returns a String that represents this instance.
Write(params string[]) Writes a string of text to a document stream opened by open(). Note that the function will produce a document which is not necessarily driven by a DTD and therefore might produce an invalid result in the context of the document.
WriteLn(params string[]) Writes a string of text followed by a newline character to a document stream opened by open(). Note that the function will produce a document which is not necessarily driven by a DTD and therefore might produce an invalid result in the context of the document.
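
The DOM methods in the table can be combined as in the following sketch, which appends a paragraph and then counts the matches of a CSS selector. The class name QuerySketch, the helper CountParagraphs, and the sample markup are hypothetical; the sketch also assumes that the NodeList returned by QuerySelectorAll exposes a Length property, as the DOM NodeList interface does.

```csharp
using System;
using Aspose.Html;

public static class QuerySketch
{
    // Hypothetical helper: adds one paragraph, then counts all <p> elements.
    public static int CountParagraphs(string html)
    {
        // "." is an assumed base URI used to resolve relative resource paths.
        using (var document = new HTMLDocument(html, "."))
        {
            // Create a new element and attach it to the end of the body.
            var extra = document.CreateElement("p");
            extra.TextContent = "Added via DOM";
            document.Body.AppendChild(extra);

            // Assumption: NodeList.Length reports the number of matched nodes.
            return document.QuerySelectorAll("p").Length;
        }
    }

    public static void Main()
    {
        var html = "<html><body><p>One</p><p>Two</p></body></html>";
        Console.WriteLine(CountParagraphs(html));
    }
}
```

QuerySelectorAll is used here rather than GetElementsByTagName only to show selector syntax; either method would find the same elements for a plain tag name.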

Remarks

More info about HTMLDocument, Document and DOM can be obtained in popular web development resources:

General Document interface.
HTML-specific HTMLDocument interface.
What is the HTML DOM.

Standards Reference:

DOM Standard - defines a platform-neutral model for events, aborting activities, and node trees.
DOM Standard (DOM) # htmldocument.
GitHub - repository hosts the DOM Standard.

Examples

    // Assumed namespaces: Aspose.Html declares HTMLDocument and HTMLParagraphElement,
    // Aspose.Html.Rendering.Pdf declares PdfDevice.
    using System.IO;
    using System.Linq;
    using Aspose.Html;
    using Aspose.Html.Rendering.Pdf;

    // OutputDir is expected to be defined by the caller as the target folder path.

    // Create an instance of an HTML document
    using (var document = new HTMLDocument())
    {
        // Create a style element that colors all elements with the class name 'gr' green
        var style = document.CreateElement("style");
        style.TextContent = ".gr { color: green }";

        // Find the document head element and append the style element to it
        var head = document.GetElementsByTagName("head").First();
        head.AppendChild(style);

        // Create a paragraph element with the class name 'gr'
        var p = (HTMLParagraphElement)document.CreateElement("p");
        p.ClassName = "gr";

        // Create a text node
        var text = document.CreateTextNode("Hello World!!");

        // Append the text node to the paragraph
        p.AppendChild(text);

        // Append the paragraph to the document body element
        document.Body.AppendChild(p);

        // Save the HTML document to a file
        document.Save(Path.Combine(OutputDir, "using-dom.html"));

        // Create an instance of the PDF output device and render the document to it
        using (var device = new PdfDevice(Path.Combine(OutputDir, "using-dom.pdf")))
        {
            // Render HTML to PDF
            document.RenderTo(device);
        }
    }

See Also