DocumentVisitor
Inheritance: java.lang.Object
public abstract class DocumentVisitor
Base class for custom document visitors.
To learn more, visit the Aspose.Words Document Object Model (DOM) documentation article.
Remarks:
With DocumentVisitor you can define and execute custom operations that require enumeration over the document tree.
For example, Aspose.Words uses DocumentVisitor internally for saving Document in various formats and for other operations like finding fields or bookmarks over a fragment of a document.
To use DocumentVisitor:
- Create a class derived from DocumentVisitor.
- Override and provide implementations for some or all of the VisitXXX methods to perform some custom operations.
- Call Node.accept(com.aspose.words.DocumentVisitor) on the Node that you want to start the enumeration from.
DocumentVisitor provides default implementations for all of the VisitXXX methods to make it easier to create new document visitors as only the methods required for the particular visitor need to be overridden. It is not necessary to override all of the visitor methods.
For more information see the Visitor design pattern.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Methods
Method | Description |
---|---|
visitAbsolutePositionTab(AbsolutePositionTab tab) | Called when a AbsolutePositionTab node is encountered in the document. |
visitBodyEnd(Body body) | Called when enumeration of the main text story in a section has ended. |
visitBodyStart(Body body) | Called when enumeration of the main text story in a section has started. |
visitBookmarkEnd(BookmarkEnd bookmarkEnd) | Called when an end of a bookmark is encountered in the document. |
visitBookmarkStart(BookmarkStart bookmarkStart) | Called when a start of a bookmark is encountered in the document. |
visitBuildingBlockEnd(BuildingBlock block) | Called when enumeration of a building block has ended. |
visitBuildingBlockStart(BuildingBlock block) | Called when enumeration of a building block has started. |
visitCellEnd(Cell cell) | Called when enumeration of a table cell has ended. |
visitCellStart(Cell cell) | Called when enumeration of a table cell has started. |
visitCommentEnd(Comment comment) | Called when enumeration of a comment text has ended. |
visitCommentRangeEnd(CommentRangeEnd commentRangeEnd) | Called when the end of a commented range of text is encountered. |
visitCommentRangeStart(CommentRangeStart commentRangeStart) | Called when the start of a commented range of text is encountered. |
visitCommentStart(Comment comment) | Called when enumeration of a comment text has started. |
visitDocumentEnd(Document doc) | Called when enumeration of the document has finished. |
visitDocumentStart(Document doc) | Called when enumeration of the document has started. |
visitEditableRangeEnd(EditableRangeEnd editableRangeEnd) | Called when an end of an editable range is encountered in the document. |
visitEditableRangeStart(EditableRangeStart editableRangeStart) | Called when a start of an editable range is encountered in the document. |
visitFieldEnd(FieldEnd fieldEnd) | Called when a field ends in the document. |
visitFieldSeparator(FieldSeparator fieldSeparator) | Called when a field separator is encountered in the document. |
visitFieldStart(FieldStart fieldStart) | Called when a field starts in the document. |
visitFootnoteEnd(Footnote footnote) | Called when enumeration of a footnote or endnote text has ended. |
visitFootnoteStart(Footnote footnote) | Called when enumeration of a footnote or endnote text has started. |
visitFormField(FormField formField) | Called when a form field is encountered in the document. |
visitGlossaryDocumentEnd(GlossaryDocument glossary) | Called when enumeration of a glossary document has ended. |
visitGlossaryDocumentStart(GlossaryDocument glossary) | Called when enumeration of a glossary document has started. |
visitGroupShapeEnd(GroupShape groupShape) | Called when enumeration of a group shape has ended. |
visitGroupShapeStart(GroupShape groupShape) | Called when enumeration of a group shape has started. |
visitHeaderFooterEnd(HeaderFooter headerFooter) | Called when enumeration of a header or footer in a section has ended. |
visitHeaderFooterStart(HeaderFooter headerFooter) | Called when enumeration of a header or footer in a section has started. |
visitOfficeMathEnd(OfficeMath officeMath) | Called when enumeration of a Office Math object has ended. |
visitOfficeMathStart(OfficeMath officeMath) | Called when enumeration of a Office Math object has started. |
visitParagraphEnd(Paragraph paragraph) | Called when enumeration of a paragraph has ended. |
visitParagraphStart(Paragraph paragraph) | Called when enumeration of a paragraph has started. |
visitRowEnd(Row row) | Called when enumeration of a table row has ended. |
visitRowStart(Row row) | Called when enumeration of a table row has started. |
visitRun(Run run) | Called when a run of text in the is encountered. |
visitSectionEnd(Section section) | Called when enumeration of a section has ended. |
visitSectionStart(Section section) | Called when enumeration of a section has started. |
visitShapeEnd(Shape shape) | Called when enumeration of a shape has ended. |
visitShapeStart(Shape shape) | Called when enumeration of a shape has started. |
visitSmartTagEnd(SmartTag smartTag) | Called when enumeration of a smart tag has ended. |
visitSmartTagStart(SmartTag smartTag) | Called when enumeration of a smart tag has started. |
visitSpecialChar(SpecialChar specialChar) | Called when a SpecialChar node is encountered in the document. |
visitStructuredDocumentTagEnd(StructuredDocumentTag sdt) | Called when enumeration of a structured document tag has ended. |
visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd) | Called when a StructuredDocumentTagRangeEnd is encountered. |
visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart) | Called when a StructuredDocumentTagRangeStart is encountered. |
visitStructuredDocumentTagStart(StructuredDocumentTag sdt) | Called when enumeration of a structured document tag has started. |
visitSubDocument(SubDocument subDocument) | Called when a sub-document is encountered. |
visitTableEnd(Table table) | Called when enumeration of a table has ended. |
visitTableStart(Table table) | Called when enumeration of a table has started. |
visitAbsolutePositionTab(AbsolutePositionTab tab)
public int visitAbsolutePositionTab(AbsolutePositionTab tab)
Called when a AbsolutePositionTab node is encountered in the document.
Examples:
Shows how to process absolute position tab characters with a document visitor.
public void documentToTxt() throws Exception {
Document doc = new Document(getMyDir() + "Absolute position tab.docx");
// Extract the text contents of our document by accepting this custom document visitor.
DocTextExtractor myDocTextExtractor = new DocTextExtractor();
Section fisrtSection = doc.getFirstSection();
fisrtSection.getBody().accept(myDocTextExtractor);
// Visit only start of the document body.
fisrtSection.getBody().acceptStart(myDocTextExtractor);
// Visit only end of the document body.
fisrtSection.getBody().acceptEnd(myDocTextExtractor);
// The absolute position tab, which has no equivalent in string form, has been explicitly converted to a tab character.
Assert.assertEquals("Before AbsolutePositionTab\tAfter AbsolutePositionTab", myDocTextExtractor.getText());
// An AbsolutePositionTab can accept a DocumentVisitor by itself too.
AbsolutePositionTab absPositionTab = (AbsolutePositionTab) doc.getFirstSection().getBody().getFirstParagraph().getChild(NodeType.SPECIAL_CHAR, 0, true);
myDocTextExtractor = new DocTextExtractor();
absPositionTab.accept(myDocTextExtractor);
Assert.assertEquals("\t", myDocTextExtractor.getText());
}
///
/// Collects the text contents of all runs in the visited document. Replaces all absolute tab characters with ordinary tabs.
///
public static class DocTextExtractor extends DocumentVisitor {
public DocTextExtractor() {
mBuilder = new StringBuilder();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
appendText(run.getText());
return VisitorAction.CONTINUE;
}
///
/// Called when an AbsolutePositionTab node is encountered in the document.
///
public int visitAbsolutePositionTab(final AbsolutePositionTab tab) {
mBuilder.append("\t");
return VisitorAction.CONTINUE;
}
///
/// Adds text to the current output. Honors the enabled/disabled output flag.
///
public void appendText(final String text) {
mBuilder.append(text);
}
///
/// Plain text of the document that was accumulated by the visitor.
///
public String getText() {
return mBuilder.toString();
}
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
tab | AbsolutePositionTab | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitBodyEnd(Body body)
public int visitBodyEnd(Body body)
Called when enumeration of the main text story in a section has ended.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Parameters:
Parameter | Type | Description |
---|---|---|
body | Body | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitBodyStart(Body body)
public int visitBodyStart(Body body)
Called when enumeration of the main text story in a section has started.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Parameters:
Parameter | Type | Description |
---|---|---|
body | Body | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitBookmarkEnd(BookmarkEnd bookmarkEnd)
public int visitBookmarkEnd(BookmarkEnd bookmarkEnd)
Called when an end of a bookmark is encountered in the document.
Examples:
Shows how to add bookmarks and update their contents.
public void createUpdateAndPrintBookmarks() throws Exception {
// Create a document with three bookmarks, then use a custom document visitor implementation to print their contents.
Document doc = createDocumentWithBookmarks(3);
BookmarkCollection bookmarks = doc.getRange().getBookmarks();
printAllBookmarkInfo(bookmarks);
// Bookmarks can be accessed in the bookmark collection by index or name, and their names can be updated.
bookmarks.get(0).setName("{bookmarks[0].Name}_NewName");
bookmarks.get("MyBookmark_2").setText("Updated text contents of {bookmarks[1].Name}");
// Print all bookmarks again to see updated values.
printAllBookmarkInfo(bookmarks);
}
///
/// Create a document with a given number of bookmarks.
///
private static Document createDocumentWithBookmarks(int numberOfBookmarks) throws Exception {
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
for (int i = 1; i <= numberOfBookmarks; i++) {
String bookmarkName = "MyBookmark_" + i;
builder.write("Text before bookmark.");
builder.startBookmark(bookmarkName);
builder.write(MessageFormat.format("Text inside {0}.", bookmarkName));
builder.endBookmark(bookmarkName);
builder.writeln("Text after bookmark.");
}
return doc;
}
///
/// Use an iterator and a visitor to print info of every bookmark in the collection.
///
private static void printAllBookmarkInfo(BookmarkCollection bookmarks) throws Exception {
BookmarkInfoPrinter bookmarkVisitor = new BookmarkInfoPrinter();
// Get each bookmark in the collection to accept a visitor that will print its contents.
Iterator enumerator = bookmarks.iterator();
while (enumerator.hasNext()) {
Bookmark currentBookmark = enumerator.next();
if (currentBookmark != null) {
currentBookmark.getBookmarkStart().accept(bookmarkVisitor);
currentBookmark.getBookmarkEnd().accept(bookmarkVisitor);
System.out.println(currentBookmark.getBookmarkStart().getText());
}
}
}
///
/// Prints contents of every visited bookmark to the console.
///
public static class BookmarkInfoPrinter extends DocumentVisitor {
public int visitBookmarkStart(BookmarkStart bookmarkStart) throws Exception {
System.out.println(MessageFormat.format("BookmarkStart name: \"{0}\", Content: \"{1}\"", bookmarkStart.getName(),
bookmarkStart.getBookmark().getText()));
return VisitorAction.CONTINUE;
}
public int visitBookmarkEnd(BookmarkEnd bookmarkEnd) {
System.out.println(MessageFormat.format("BookmarkEnd name: \"{0}\"", bookmarkEnd.getName()));
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
bookmarkEnd | BookmarkEnd | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitBookmarkStart(BookmarkStart bookmarkStart)
public int visitBookmarkStart(BookmarkStart bookmarkStart)
Called when a start of a bookmark is encountered in the document.
Examples:
Shows how to add bookmarks and update their contents.
public void createUpdateAndPrintBookmarks() throws Exception {
// Create a document with three bookmarks, then use a custom document visitor implementation to print their contents.
Document doc = createDocumentWithBookmarks(3);
BookmarkCollection bookmarks = doc.getRange().getBookmarks();
printAllBookmarkInfo(bookmarks);
// Bookmarks can be accessed in the bookmark collection by index or name, and their names can be updated.
bookmarks.get(0).setName("{bookmarks[0].Name}_NewName");
bookmarks.get("MyBookmark_2").setText("Updated text contents of {bookmarks[1].Name}");
// Print all bookmarks again to see updated values.
printAllBookmarkInfo(bookmarks);
}
///
/// Create a document with a given number of bookmarks.
///
private static Document createDocumentWithBookmarks(int numberOfBookmarks) throws Exception {
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
for (int i = 1; i <= numberOfBookmarks; i++) {
String bookmarkName = "MyBookmark_" + i;
builder.write("Text before bookmark.");
builder.startBookmark(bookmarkName);
builder.write(MessageFormat.format("Text inside {0}.", bookmarkName));
builder.endBookmark(bookmarkName);
builder.writeln("Text after bookmark.");
}
return doc;
}
///
/// Use an iterator and a visitor to print info of every bookmark in the collection.
///
private static void printAllBookmarkInfo(BookmarkCollection bookmarks) throws Exception {
BookmarkInfoPrinter bookmarkVisitor = new BookmarkInfoPrinter();
// Get each bookmark in the collection to accept a visitor that will print its contents.
Iterator enumerator = bookmarks.iterator();
while (enumerator.hasNext()) {
Bookmark currentBookmark = enumerator.next();
if (currentBookmark != null) {
currentBookmark.getBookmarkStart().accept(bookmarkVisitor);
currentBookmark.getBookmarkEnd().accept(bookmarkVisitor);
System.out.println(currentBookmark.getBookmarkStart().getText());
}
}
}
///
/// Prints contents of every visited bookmark to the console.
///
public static class BookmarkInfoPrinter extends DocumentVisitor {
public int visitBookmarkStart(BookmarkStart bookmarkStart) throws Exception {
System.out.println(MessageFormat.format("BookmarkStart name: \"{0}\", Content: \"{1}\"", bookmarkStart.getName(),
bookmarkStart.getBookmark().getText()));
return VisitorAction.CONTINUE;
}
public int visitBookmarkEnd(BookmarkEnd bookmarkEnd) {
System.out.println(MessageFormat.format("BookmarkEnd name: \"{0}\"", bookmarkEnd.getName()));
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
bookmarkStart | BookmarkStart | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitBuildingBlockEnd(BuildingBlock block)
public int visitBuildingBlockEnd(BuildingBlock block)
Called when enumeration of a building block has ended.
Remarks:
Note: A building block node and its children are not visited when you execute a Visitor over a Document. If you want to execute a Visitor over a building block, you need to execute the visitor over GlossaryDocument or call BuildingBlock.accept(com.aspose.words.DocumentVisitor).
Examples:
Shows ways of accessing building blocks in a glossary document.
public void glossaryDocument() throws Exception {
Document doc = new Document();
GlossaryDocument glossaryDoc = new GlossaryDocument();
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 1"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 2"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 3"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 4"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 5"));
Assert.assertEquals(glossaryDoc.getBuildingBlocks().getCount(), 5);
doc.setGlossaryDocument(glossaryDoc);
// There are various ways of accessing building blocks.
// 1 - Get the first/last building blocks in the collection:
Assert.assertEquals("Block 1", glossaryDoc.getFirstBuildingBlock().getName());
Assert.assertEquals("Block 5", glossaryDoc.getLastBuildingBlock().getName());
// 2 - Get a building block by index:
Assert.assertEquals("Block 2", glossaryDoc.getBuildingBlocks().get(1).getName());
Assert.assertEquals("Block 3", glossaryDoc.getBuildingBlocks().toArray()[2].getName());
// 3 - Get the first building block that matches a gallery, name and category:
Assert.assertEquals("Block 4",
glossaryDoc.getBuildingBlock(BuildingBlockGallery.ALL, "(Empty Category)", "Block 4").getName());
// We will do that using a custom visitor,
// which will give every BuildingBlock in the GlossaryDocument a unique GUID
GlossaryDocVisitor visitor = new GlossaryDocVisitor();
// Visit start/end of the Glossary document.
glossaryDoc.accept(visitor);
// Visit only start of the Glossary document.
glossaryDoc.acceptStart(visitor);
// Visit only end of the Glossary document.
glossaryDoc.acceptEnd(visitor);
System.out.println(visitor.getText());
// In Microsoft Word, we can access the building blocks via "Insert" -> "Quick Parts" -> "Building Blocks Organizer".
doc.save(getArtifactsDir() + "BuildingBlocks.GlossaryDocument.dotx");
}
public static BuildingBlock createNewBuildingBlock(final GlossaryDocument glossaryDoc, final String buildingBlockName) {
BuildingBlock buildingBlock = new BuildingBlock(glossaryDoc);
buildingBlock.setName(buildingBlockName);
return buildingBlock;
}
///
/// Gives each building block in a visited glossary document a unique GUID.
/// Stores the GUID-building block pairs in a dictionary.
///
public static class GlossaryDocVisitor extends DocumentVisitor {
public GlossaryDocVisitor() {
mBlocksByGuid = new HashMap<>();
mBuilder = new StringBuilder();
}
public String getText() {
return mBuilder.toString();
}
public HashMap getDictionary() {
return mBlocksByGuid;
}
public int visitGlossaryDocumentStart(final GlossaryDocument glossary) {
mBuilder.append("Glossary document found!\n");
return VisitorAction.CONTINUE;
}
public int visitGlossaryDocumentEnd(final GlossaryDocument glossary) {
mBuilder.append("Reached end of glossary!\n");
mBuilder.append("BuildingBlocks found: " + mBlocksByGuid.size() + "\r\n");
return VisitorAction.CONTINUE;
}
public int visitBuildingBlockStart(final BuildingBlock block) {
block.setGuid(UUID.randomUUID());
mBlocksByGuid.put(block.getGuid(), block);
return VisitorAction.CONTINUE;
}
public int visitBuildingBlockEnd(final BuildingBlock block) {
mBuilder.append("\tVisited block \"" + block.getName() + "\"" + "\r\n");
mBuilder.append("\t Type: " + block.getType() + "\r\n");
mBuilder.append("\t Gallery: " + block.getGallery() + "\r\n");
mBuilder.append("\t Behavior: " + block.getBehavior() + "\r\n");
mBuilder.append("\t Description: " + block.getDescription() + "\r\n");
return VisitorAction.CONTINUE;
}
private final HashMap mBlocksByGuid;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
block | BuildingBlock | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitBuildingBlockStart(BuildingBlock block)
public int visitBuildingBlockStart(BuildingBlock block)
Called when enumeration of a building block has started.
Remarks:
Note: A building block node and its children are not visited when you execute a Visitor over a Document. If you want to execute a Visitor over a building block, you need to execute the visitor over GlossaryDocument or call BuildingBlock.accept(com.aspose.words.DocumentVisitor).
Examples:
Shows ways of accessing building blocks in a glossary document.
public void glossaryDocument() throws Exception {
Document doc = new Document();
GlossaryDocument glossaryDoc = new GlossaryDocument();
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 1"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 2"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 3"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 4"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 5"));
Assert.assertEquals(glossaryDoc.getBuildingBlocks().getCount(), 5);
doc.setGlossaryDocument(glossaryDoc);
// There are various ways of accessing building blocks.
// 1 - Get the first/last building blocks in the collection:
Assert.assertEquals("Block 1", glossaryDoc.getFirstBuildingBlock().getName());
Assert.assertEquals("Block 5", glossaryDoc.getLastBuildingBlock().getName());
// 2 - Get a building block by index:
Assert.assertEquals("Block 2", glossaryDoc.getBuildingBlocks().get(1).getName());
Assert.assertEquals("Block 3", glossaryDoc.getBuildingBlocks().toArray()[2].getName());
// 3 - Get the first building block that matches a gallery, name and category:
Assert.assertEquals("Block 4",
glossaryDoc.getBuildingBlock(BuildingBlockGallery.ALL, "(Empty Category)", "Block 4").getName());
// We will do that using a custom visitor,
// which will give every BuildingBlock in the GlossaryDocument a unique GUID
GlossaryDocVisitor visitor = new GlossaryDocVisitor();
// Visit start/end of the Glossary document.
glossaryDoc.accept(visitor);
// Visit only start of the Glossary document.
glossaryDoc.acceptStart(visitor);
// Visit only end of the Glossary document.
glossaryDoc.acceptEnd(visitor);
System.out.println(visitor.getText());
// In Microsoft Word, we can access the building blocks via "Insert" -> "Quick Parts" -> "Building Blocks Organizer".
doc.save(getArtifactsDir() + "BuildingBlocks.GlossaryDocument.dotx");
}
public static BuildingBlock createNewBuildingBlock(final GlossaryDocument glossaryDoc, final String buildingBlockName) {
BuildingBlock buildingBlock = new BuildingBlock(glossaryDoc);
buildingBlock.setName(buildingBlockName);
return buildingBlock;
}
///
/// Gives each building block in a visited glossary document a unique GUID.
/// Stores the GUID-building block pairs in a dictionary.
///
public static class GlossaryDocVisitor extends DocumentVisitor {
public GlossaryDocVisitor() {
mBlocksByGuid = new HashMap<>();
mBuilder = new StringBuilder();
}
public String getText() {
return mBuilder.toString();
}
public HashMap getDictionary() {
return mBlocksByGuid;
}
public int visitGlossaryDocumentStart(final GlossaryDocument glossary) {
mBuilder.append("Glossary document found!\n");
return VisitorAction.CONTINUE;
}
public int visitGlossaryDocumentEnd(final GlossaryDocument glossary) {
mBuilder.append("Reached end of glossary!\n");
mBuilder.append("BuildingBlocks found: " + mBlocksByGuid.size() + "\r\n");
return VisitorAction.CONTINUE;
}
public int visitBuildingBlockStart(final BuildingBlock block) {
block.setGuid(UUID.randomUUID());
mBlocksByGuid.put(block.getGuid(), block);
return VisitorAction.CONTINUE;
}
public int visitBuildingBlockEnd(final BuildingBlock block) {
mBuilder.append("\tVisited block \"" + block.getName() + "\"" + "\r\n");
mBuilder.append("\t Type: " + block.getType() + "\r\n");
mBuilder.append("\t Gallery: " + block.getGallery() + "\r\n");
mBuilder.append("\t Behavior: " + block.getBehavior() + "\r\n");
mBuilder.append("\t Description: " + block.getDescription() + "\r\n");
return VisitorAction.CONTINUE;
}
private final HashMap mBlocksByGuid;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
block | BuildingBlock | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitCellEnd(Cell cell)
public int visitCellEnd(Cell cell)
Called when enumeration of a table cell has ended.
Examples:
Shows how to print the node structure of every table in a document.
public void tableToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
TableStructurePrinter visitor = new TableStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Table nodes and their children.
///
public static class TableStructurePrinter extends DocumentVisitor {
public TableStructurePrinter() {
mVisitedTables = new StringBuilder();
mVisitorIsInsideTable = false;
}
public String getText() {
return mVisitedTables.toString();
}
///
/// Called when a Run node is encountered in the document.
/// Runs that are not within tables are not recorded.
///
public int visitRun(Run run) {
if (mVisitorIsInsideTable) indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a Table is encountered in the document.
///
public int visitTableStart(final Table table) {
int rows = 0;
int columns = 0;
if (table.getRows().getCount() > 0) {
rows = table.getRows().getCount();
columns = table.getFirstRow().getCount();
}
indentAndAppendLine("[Table start] Size: " + rows + "x" + columns);
mDocTraversalDepth++;
mVisitorIsInsideTable = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Table node have been visited.
///
public int visitTableEnd(final Table table) {
mDocTraversalDepth--;
indentAndAppendLine("[Table end]");
mVisitorIsInsideTable = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Row node is encountered in the document.
///
public int visitRowStart(final Row row) {
String rowContents = row.getText().replaceAll("\\u0007", ", ").replaceAll(", , ", "");
int rowWidth = row.indexOf(row.getLastCell()) + 1;
int rowIndex = row.getParentTable().indexOf(row);
String rowStatusInTable = row.isFirstRow() && row.isLastRow() ? "only" : row.isFirstRow() ? "first" : row.isLastRow() ? "last" : "";
if (!"".equals(rowStatusInTable)) {
rowStatusInTable = MessageFormat.format(", the {0} row in this table,", rowStatusInTable);
}
indentAndAppendLine(MessageFormat.format("[Row start] Row #{0}{1} width {2}, \"{3}\"", ++rowIndex, rowStatusInTable, rowWidth, rowContents));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Row node have been visited.
///
public int visitRowEnd(final Row row) {
mDocTraversalDepth--;
indentAndAppendLine("[Row end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Cell node is encountered in the document.
///
public int visitCellStart(final Cell cell) {
Row row = cell.getParentRow();
Table table = row.getParentTable();
String cellStatusInRow = cell.isFirstCell() && cell.isLastCell() ? "only" : cell.isFirstCell() ? "first" : cell.isLastCell() ? "last" : "";
if (!"".equals(cellStatusInRow)) {
cellStatusInRow = MessageFormat.format(", the {0} cell in this row", cellStatusInRow);
}
indentAndAppendLine(MessageFormat.format("[Cell start] Row {0}, Col {1}{2}", table.indexOf(row) + 1, row.indexOf(cell) + 1, cellStatusInRow));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Cell node have been visited.
///
public int visitCellEnd(final Cell cell) {
mDocTraversalDepth--;
indentAndAppendLine("[Cell end]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into the current table's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mVisitedTables.append("| ");
}
mVisitedTables.append(text + "\r\n");
}
private boolean mVisitorIsInsideTable;
private int mDocTraversalDepth;
private final StringBuilder mVisitedTables;
}
Shows how to use a DocumentVisitor implementation to remove all hidden content from a document.
public void removeHiddenContentFromDocument() throws Exception {
Document doc = new Document(getMyDir() + "Hidden content.docx");
RemoveHiddenContentVisitor hiddenContentRemover = new RemoveHiddenContentVisitor();
// Below are three types of fields which can accept a document visitor,
// which will allow it to visit the accepting node, and then traverse its child nodes in a depth-first manner.
// 1 - Paragraph node:
Paragraph para = (Paragraph) doc.getChild(NodeType.PARAGRAPH, 4, true);
para.accept(hiddenContentRemover);
// 2 - Table node:
Table table = doc.getFirstSection().getBody().getTables().get(0);
table.accept(hiddenContentRemover);
// 3 - Document node:
doc.accept(hiddenContentRemover);
doc.save(getArtifactsDir() + "Font.RemoveHiddenContentFromDocument.docx");
}
///
/// Removes all visited nodes marked as "hidden content".
///
public static class RemoveHiddenContentVisitor extends DocumentVisitor {
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(FieldStart fieldStart) {
if (fieldStart.getFont().getHidden())
fieldStart.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(FieldEnd fieldEnd) {
if (fieldEnd.getFont().getHidden())
fieldEnd.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(FieldSeparator fieldSeparator) {
if (fieldSeparator.getFont().getHidden())
fieldSeparator.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(Run run) {
if (run.getFont().getHidden())
run.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(Paragraph paragraph) {
if (paragraph.getParagraphBreakFont().getHidden())
paragraph.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FormField is encountered in the document.
///
public int visitFormField(FormField formField) {
if (formField.getFont().getHidden())
formField.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a GroupShape is encountered in the document.
///
public int visitGroupShapeStart(GroupShape groupShape) {
if (groupShape.getFont().getHidden())
groupShape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Shape is encountered in the document.
///
public int visitShapeStart(Shape shape) {
if (shape.getFont().getHidden())
shape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment is encountered in the document.
///
public int visitCommentStart(Comment comment) {
if (comment.getFont().getHidden())
comment.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Footnote is encountered in the document.
///
public int visitFootnoteStart(Footnote footnote) {
if (footnote.getFont().getHidden())
footnote.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a SpecialCharacter is encountered in the document.
///
public int visitSpecialChar(SpecialChar specialChar) {
if (specialChar.getFont().getHidden())
specialChar.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Table node is ended in the document.
///
public int visitTableEnd(Table table) {
// The content inside table cells may have the hidden content flag, but the tables themselves cannot.
// If this table had nothing but hidden content, this visitor would have removed all of it,
// and there would be no child nodes left.
// Thus, we can also treat the table itself as hidden content and remove it.
// Tables which are empty but do not have hidden content will have cells with empty paragraphs inside,
// which this visitor will not remove.
if (!table.hasChildNodes())
table.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Cell node is ended in the document.
///
public int visitCellEnd(Cell cell) {
if (!cell.hasChildNodes() && cell.getParentNode() != null)
cell.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Row node is ended in the document.
///
public int visitRowEnd(Row row) {
if (!row.hasChildNodes() && row.getParentNode() != null)
row.remove();
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
cell | Cell | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitCellStart(Cell cell)
public int visitCellStart(Cell cell)
Called when enumeration of a table cell has started.
Examples:
Shows how to print the node structure of every table in a document.
public void tableToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
TableStructurePrinter visitor = new TableStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Table nodes and their children.
///
public static class TableStructurePrinter extends DocumentVisitor {
public TableStructurePrinter() {
mVisitedTables = new StringBuilder();
mVisitorIsInsideTable = false;
}
public String getText() {
return mVisitedTables.toString();
}
///
/// Called when a Run node is encountered in the document.
/// Runs that are not within tables are not recorded.
///
public int visitRun(Run run) {
if (mVisitorIsInsideTable) indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a Table is encountered in the document.
///
public int visitTableStart(final Table table) {
int rows = 0;
int columns = 0;
if (table.getRows().getCount() > 0) {
rows = table.getRows().getCount();
columns = table.getFirstRow().getCount();
}
indentAndAppendLine("[Table start] Size: " + rows + "x" + columns);
mDocTraversalDepth++;
mVisitorIsInsideTable = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Table node have been visited.
///
public int visitTableEnd(final Table table) {
mDocTraversalDepth--;
indentAndAppendLine("[Table end]");
mVisitorIsInsideTable = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Row node is encountered in the document.
///
public int visitRowStart(final Row row) {
String rowContents = row.getText().replaceAll("\\u0007", ", ").replaceAll(", , ", "");
int rowWidth = row.indexOf(row.getLastCell()) + 1;
int rowIndex = row.getParentTable().indexOf(row);
String rowStatusInTable = row.isFirstRow() && row.isLastRow() ? "only" : row.isFirstRow() ? "first" : row.isLastRow() ? "last" : "";
if (!"".equals(rowStatusInTable)) {
rowStatusInTable = MessageFormat.format(", the {0} row in this table,", rowStatusInTable);
}
indentAndAppendLine(MessageFormat.format("[Row start] Row #{0}{1} width {2}, \"{3}\"", ++rowIndex, rowStatusInTable, rowWidth, rowContents));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Row node have been visited.
///
public int visitRowEnd(final Row row) {
mDocTraversalDepth--;
indentAndAppendLine("[Row end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Cell node is encountered in the document.
///
public int visitCellStart(final Cell cell) {
Row row = cell.getParentRow();
Table table = row.getParentTable();
String cellStatusInRow = cell.isFirstCell() && cell.isLastCell() ? "only" : cell.isFirstCell() ? "first" : cell.isLastCell() ? "last" : "";
if (!"".equals(cellStatusInRow)) {
cellStatusInRow = MessageFormat.format(", the {0} cell in this row", cellStatusInRow);
}
indentAndAppendLine(MessageFormat.format("[Cell start] Row {0}, Col {1}{2}", table.indexOf(row) + 1, row.indexOf(cell) + 1, cellStatusInRow));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Cell node have been visited.
///
public int visitCellEnd(final Cell cell) {
mDocTraversalDepth--;
indentAndAppendLine("[Cell end]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into the current table's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mVisitedTables.append("| ");
}
mVisitedTables.append(text + "\r\n");
}
private boolean mVisitorIsInsideTable;
private int mDocTraversalDepth;
private final StringBuilder mVisitedTables;
}
Parameters:
Parameter | Type | Description |
---|---|---|
cell | Cell | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitCommentEnd(Comment comment)
public int visitCommentEnd(Comment comment)
Called when enumeration of a comment text has ended.
Examples:
Shows how to print the node structure of every comment and comment range in a document.
public void commentsToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
CommentStructurePrinter visitor = new CommentStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Comment/CommentRange nodes and their children.
///
public static class CommentStructurePrinter extends DocumentVisitor {
public CommentStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideComment = false;
}
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
/// A Run is only recorded if it is a child of a Comment or CommentRange node.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideComment) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when a CommentRangeStart node is encountered in the document.
///
public int visitCommentRangeStart(final CommentRangeStart commentRangeStart) {
indentAndAppendLine("[Comment range start] ID: " + commentRangeStart.getId());
mDocTraversalDepth++;
mVisitorIsInsideComment = true;
return VisitorAction.CONTINUE;
}
///
/// Called when a CommentRangeEnd node is encountered in the document.
///
public int visitCommentRangeEnd(final CommentRangeEnd commentRangeEnd) {
mDocTraversalDepth--;
indentAndAppendLine("[Comment range end]");
mVisitorIsInsideComment = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment node is encountered in the document.
///
public int visitCommentStart(final Comment comment) {
indentAndAppendLine(MessageFormat.format("[Comment start] For comment range ID {0}, By {1} on {2}", comment.getId(),
comment.getAuthor(), comment.getDateTime()));
mDocTraversalDepth++;
mVisitorIsInsideComment = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Comment node have been visited.
///
public int visitCommentEnd(final Comment comment) {
mDocTraversalDepth--;
indentAndAppendLine("[Comment end]");
mVisitorIsInsideComment = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into a comment/comment range's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideComment;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
comment | Comment | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitCommentRangeEnd(CommentRangeEnd commentRangeEnd)
public int visitCommentRangeEnd(CommentRangeEnd commentRangeEnd)
Called when the end of a commented range of text is encountered.
Examples:
Shows how to print the node structure of every comment and comment range in a document.
public void commentsToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
CommentStructurePrinter visitor = new CommentStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Comment/CommentRange nodes and their children.
///
public static class CommentStructurePrinter extends DocumentVisitor {
public CommentStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideComment = false;
}
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
/// A Run is only recorded if it is a child of a Comment or CommentRange node.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideComment) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when a CommentRangeStart node is encountered in the document.
///
public int visitCommentRangeStart(final CommentRangeStart commentRangeStart) {
indentAndAppendLine("[Comment range start] ID: " + commentRangeStart.getId());
mDocTraversalDepth++;
mVisitorIsInsideComment = true;
return VisitorAction.CONTINUE;
}
///
/// Called when a CommentRangeEnd node is encountered in the document.
///
public int visitCommentRangeEnd(final CommentRangeEnd commentRangeEnd) {
mDocTraversalDepth--;
indentAndAppendLine("[Comment range end]");
mVisitorIsInsideComment = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment node is encountered in the document.
///
public int visitCommentStart(final Comment comment) {
indentAndAppendLine(MessageFormat.format("[Comment start] For comment range ID {0}, By {1} on {2}", comment.getId(),
comment.getAuthor(), comment.getDateTime()));
mDocTraversalDepth++;
mVisitorIsInsideComment = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Comment node have been visited.
///
public int visitCommentEnd(final Comment comment) {
mDocTraversalDepth--;
indentAndAppendLine("[Comment end]");
mVisitorIsInsideComment = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into a comment/comment range's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideComment;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
commentRangeEnd | CommentRangeEnd | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitCommentRangeStart(CommentRangeStart commentRangeStart)
public int visitCommentRangeStart(CommentRangeStart commentRangeStart)
Called when the start of a commented range of text is encountered.
Examples:
Shows how to print the node structure of every comment and comment range in a document.
public void commentsToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
CommentStructurePrinter visitor = new CommentStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Comment/CommentRange nodes and their children.
///
public static class CommentStructurePrinter extends DocumentVisitor {
public CommentStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideComment = false;
}
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
/// A Run is only recorded if it is a child of a Comment or CommentRange node.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideComment) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when a CommentRangeStart node is encountered in the document.
///
public int visitCommentRangeStart(final CommentRangeStart commentRangeStart) {
indentAndAppendLine("[Comment range start] ID: " + commentRangeStart.getId());
mDocTraversalDepth++;
mVisitorIsInsideComment = true;
return VisitorAction.CONTINUE;
}
///
/// Called when a CommentRangeEnd node is encountered in the document.
///
public int visitCommentRangeEnd(final CommentRangeEnd commentRangeEnd) {
mDocTraversalDepth--;
indentAndAppendLine("[Comment range end]");
mVisitorIsInsideComment = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment node is encountered in the document.
///
public int visitCommentStart(final Comment comment) {
indentAndAppendLine(MessageFormat.format("[Comment start] For comment range ID {0}, By {1} on {2}", comment.getId(),
comment.getAuthor(), comment.getDateTime()));
mDocTraversalDepth++;
mVisitorIsInsideComment = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Comment node have been visited.
///
public int visitCommentEnd(final Comment comment) {
mDocTraversalDepth--;
indentAndAppendLine("[Comment end]");
mVisitorIsInsideComment = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into a comment/comment range's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideComment;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
commentRangeStart | CommentRangeStart | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitCommentStart(Comment comment)
public int visitCommentStart(Comment comment)
Called when enumeration of a comment text has started.
Examples:
Shows how to print the node structure of every comment and comment range in a document.
public void commentsToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
CommentStructurePrinter visitor = new CommentStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Comment/CommentRange nodes and their children.
///
public static class CommentStructurePrinter extends DocumentVisitor {
public CommentStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideComment = false;
}
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
/// A Run is only recorded if it is a child of a Comment or CommentRange node.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideComment) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when a CommentRangeStart node is encountered in the document.
///
public int visitCommentRangeStart(final CommentRangeStart commentRangeStart) {
indentAndAppendLine("[Comment range start] ID: " + commentRangeStart.getId());
mDocTraversalDepth++;
mVisitorIsInsideComment = true;
return VisitorAction.CONTINUE;
}
///
/// Called when a CommentRangeEnd node is encountered in the document.
///
public int visitCommentRangeEnd(final CommentRangeEnd commentRangeEnd) {
mDocTraversalDepth--;
indentAndAppendLine("[Comment range end]");
mVisitorIsInsideComment = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment node is encountered in the document.
///
public int visitCommentStart(final Comment comment) {
indentAndAppendLine(MessageFormat.format("[Comment start] For comment range ID {0}, By {1} on {2}", comment.getId(),
comment.getAuthor(), comment.getDateTime()));
mDocTraversalDepth++;
mVisitorIsInsideComment = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Comment node have been visited.
///
public int visitCommentEnd(final Comment comment) {
mDocTraversalDepth--;
indentAndAppendLine("[Comment end]");
mVisitorIsInsideComment = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into a comment/comment range's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideComment;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Shows how to use a DocumentVisitor implementation to remove all hidden content from a document.
public void removeHiddenContentFromDocument() throws Exception {
Document doc = new Document(getMyDir() + "Hidden content.docx");
RemoveHiddenContentVisitor hiddenContentRemover = new RemoveHiddenContentVisitor();
// Below are three types of fields which can accept a document visitor,
// which will allow it to visit the accepting node, and then traverse its child nodes in a depth-first manner.
// 1 - Paragraph node:
Paragraph para = (Paragraph) doc.getChild(NodeType.PARAGRAPH, 4, true);
para.accept(hiddenContentRemover);
// 2 - Table node:
Table table = doc.getFirstSection().getBody().getTables().get(0);
table.accept(hiddenContentRemover);
// 3 - Document node:
doc.accept(hiddenContentRemover);
doc.save(getArtifactsDir() + "Font.RemoveHiddenContentFromDocument.docx");
}
///
/// Removes all visited nodes marked as "hidden content".
///
public static class RemoveHiddenContentVisitor extends DocumentVisitor {
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(FieldStart fieldStart) {
if (fieldStart.getFont().getHidden())
fieldStart.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(FieldEnd fieldEnd) {
if (fieldEnd.getFont().getHidden())
fieldEnd.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(FieldSeparator fieldSeparator) {
if (fieldSeparator.getFont().getHidden())
fieldSeparator.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(Run run) {
if (run.getFont().getHidden())
run.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(Paragraph paragraph) {
if (paragraph.getParagraphBreakFont().getHidden())
paragraph.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FormField is encountered in the document.
///
public int visitFormField(FormField formField) {
if (formField.getFont().getHidden())
formField.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a GroupShape is encountered in the document.
///
public int visitGroupShapeStart(GroupShape groupShape) {
if (groupShape.getFont().getHidden())
groupShape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Shape is encountered in the document.
///
public int visitShapeStart(Shape shape) {
if (shape.getFont().getHidden())
shape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment is encountered in the document.
///
public int visitCommentStart(Comment comment) {
if (comment.getFont().getHidden())
comment.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Footnote is encountered in the document.
///
public int visitFootnoteStart(Footnote footnote) {
if (footnote.getFont().getHidden())
footnote.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a SpecialCharacter is encountered in the document.
///
public int visitSpecialChar(SpecialChar specialChar) {
if (specialChar.getFont().getHidden())
specialChar.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Table node is ended in the document.
///
public int visitTableEnd(Table table) {
// The content inside table cells may have the hidden content flag, but the tables themselves cannot.
// If this table had nothing but hidden content, this visitor would have removed all of it,
// and there would be no child nodes left.
// Thus, we can also treat the table itself as hidden content and remove it.
// Tables which are empty but do not have hidden content will have cells with empty paragraphs inside,
// which this visitor will not remove.
if (!table.hasChildNodes())
table.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Cell node is ended in the document.
///
public int visitCellEnd(Cell cell) {
if (!cell.hasChildNodes() && cell.getParentNode() != null)
cell.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Row node is ended in the document.
///
public int visitRowEnd(Row row) {
if (!row.hasChildNodes() && row.getParentNode() != null)
row.remove();
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
comment | Comment | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitDocumentEnd(Document doc)
public int visitDocumentEnd(Document doc)
Called when enumeration of the document has finished.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Parameters:
Parameter | Type | Description |
---|---|---|
doc | Document | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitDocumentStart(Document doc)
public int visitDocumentStart(Document doc)
Called when enumeration of the document has started.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Parameters:
Parameter | Type | Description |
---|---|---|
doc | Document | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitEditableRangeEnd(EditableRangeEnd editableRangeEnd)
public int visitEditableRangeEnd(EditableRangeEnd editableRangeEnd)
Called when an end of an editable range is encountered in the document.
Examples:
Shows how to print the node structure of every editable range in a document.
public void editableRangeToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
EditableRangeStructurePrinter visitor = new EditableRangeStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered EditableRange nodes and their children.
///
public static class EditableRangeStructurePrinter extends DocumentVisitor {
public EditableRangeStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideEditableRange = false;
}
///
/// Gets the plain text of the document that was accumulated by the visitor.
///
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
// We want to print the contents of runs, but only if they are inside shapes, as they would be in the case of text boxes.
if (mVisitorIsInsideEditableRange) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when an EditableRange node is encountered in the document.
///
public int visitEditableRangeStart(final EditableRangeStart editableRangeStart) {
indentAndAppendLine("[EditableRange start] ID: " + editableRangeStart.getId() + " Owner: "
+ editableRangeStart.getEditableRange().getSingleUser());
mDocTraversalDepth++;
mVisitorIsInsideEditableRange = true;
return VisitorAction.CONTINUE;
}
///
/// Called when the visiting of a EditableRange node is ended.
///
public int visitEditableRangeEnd(final EditableRangeEnd editableRangeEnd) {
mDocTraversalDepth--;
indentAndAppendLine("[EditableRange end]");
mVisitorIsInsideEditableRange = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideEditableRange;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
editableRangeEnd | EditableRangeEnd | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitEditableRangeStart(EditableRangeStart editableRangeStart)
public int visitEditableRangeStart(EditableRangeStart editableRangeStart)
Called when a start of an editable range is encountered in the document.
Examples:
Shows how to print the node structure of every editable range in a document.
public void editableRangeToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
EditableRangeStructurePrinter visitor = new EditableRangeStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered EditableRange nodes and their children.
///
public static class EditableRangeStructurePrinter extends DocumentVisitor {
public EditableRangeStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideEditableRange = false;
}
///
/// Gets the plain text of the document that was accumulated by the visitor.
///
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
// We want to print the contents of runs, but only if they are inside shapes, as they would be in the case of text boxes.
if (mVisitorIsInsideEditableRange) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when an EditableRange node is encountered in the document.
///
public int visitEditableRangeStart(final EditableRangeStart editableRangeStart) {
indentAndAppendLine("[EditableRange start] ID: " + editableRangeStart.getId() + " Owner: "
+ editableRangeStart.getEditableRange().getSingleUser());
mDocTraversalDepth++;
mVisitorIsInsideEditableRange = true;
return VisitorAction.CONTINUE;
}
///
/// Called when the visiting of a EditableRange node is ended.
///
public int visitEditableRangeEnd(final EditableRangeEnd editableRangeEnd) {
mDocTraversalDepth--;
indentAndAppendLine("[EditableRange end]");
mVisitorIsInsideEditableRange = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideEditableRange;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
editableRangeStart | EditableRangeStart | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitFieldEnd(FieldEnd fieldEnd)
public int visitFieldEnd(FieldEnd fieldEnd)
Called when a field ends in the document.
Remarks:
For more info see visitFieldStart(com.aspose.words.FieldStart)
Examples:
Shows how to print the node structure of every field in a document.
public void fieldToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
FieldStructurePrinter visitor = new FieldStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Field nodes and their children.
///
public static class FieldStructurePrinter extends DocumentVisitor {
public FieldStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideField = false;
}
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideField) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(final FieldStart fieldStart) {
indentAndAppendLine("[Field start] FieldType: " + fieldStart.getFieldType());
mDocTraversalDepth++;
mVisitorIsInsideField = true;
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(final FieldEnd fieldEnd) {
mDocTraversalDepth--;
indentAndAppendLine("[Field end]");
mVisitorIsInsideField = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(final FieldSeparator fieldSeparator) {
indentAndAppendLine("[FieldSeparator]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into the field's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideField;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
fieldEnd | FieldEnd | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitFieldSeparator(FieldSeparator fieldSeparator)
public int visitFieldSeparator(FieldSeparator fieldSeparator)
Called when a field separator is encountered in the document.
Remarks:
The field separator separates field code from field value in the document. Note that some fields have only field code and do not have field separator and field value.
For more info see visitFieldStart(com.aspose.words.FieldStart)
Examples:
Shows how to print the node structure of every field in a document.
public void fieldToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
FieldStructurePrinter visitor = new FieldStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Field nodes and their children.
///
public static class FieldStructurePrinter extends DocumentVisitor {
public FieldStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideField = false;
}
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideField) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(final FieldStart fieldStart) {
indentAndAppendLine("[Field start] FieldType: " + fieldStart.getFieldType());
mDocTraversalDepth++;
mVisitorIsInsideField = true;
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(final FieldEnd fieldEnd) {
mDocTraversalDepth--;
indentAndAppendLine("[Field end]");
mVisitorIsInsideField = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(final FieldSeparator fieldSeparator) {
indentAndAppendLine("[FieldSeparator]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into the field's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideField;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
fieldSeparator | FieldSeparator | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitFieldStart(FieldStart fieldStart)
public int visitFieldStart(FieldStart fieldStart)
Called when a field starts in the document.
Remarks:
A field in a Word document consists of a field code and field value.
For example, a field that displays a page number can be represented as follows:
[FieldStart]PAGE[FieldSeparator]98[FieldEnd]
The field separator separates field code from field value in the document. Note that some fields have only field code and do not have field separator and field value.
Fields can be nested.
Examples:
Shows how to print the node structure of every field in a document.
public void fieldToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
FieldStructurePrinter visitor = new FieldStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Field nodes and their children.
///
public static class FieldStructurePrinter extends DocumentVisitor {
public FieldStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideField = false;
}
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideField) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(final FieldStart fieldStart) {
indentAndAppendLine("[Field start] FieldType: " + fieldStart.getFieldType());
mDocTraversalDepth++;
mVisitorIsInsideField = true;
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(final FieldEnd fieldEnd) {
mDocTraversalDepth--;
indentAndAppendLine("[Field end]");
mVisitorIsInsideField = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(final FieldSeparator fieldSeparator) {
indentAndAppendLine("[FieldSeparator]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into the field's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideField;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
fieldStart | FieldStart | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitFootnoteEnd(Footnote footnote)
public int visitFootnoteEnd(Footnote footnote)
Called when enumeration of a footnote or endnote text has ended.
Examples:
Shows how to print the node structure of every footnote in a document.
public void footnoteToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
FootnoteStructurePrinter visitor = new FootnoteStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Footnote nodes and their children.
///
public static class FootnoteStructurePrinter extends DocumentVisitor {
public FootnoteStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideFootnote = false;
}
///
/// Gets the plain text of the document that was accumulated by the visitor.
///
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Footnote node is encountered in the document.
///
public int visitFootnoteStart(final Footnote footnote) {
indentAndAppendLine("[Footnote start] Type: " + footnote.getFootnoteType());
mDocTraversalDepth++;
mVisitorIsInsideFootnote = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Footnote node have been visited.
///
public int visitFootnoteEnd(final Footnote footnote) {
mDocTraversalDepth--;
indentAndAppendLine("[Footnote end]");
mVisitorIsInsideFootnote = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideFootnote) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideFootnote;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
footnote | Footnote | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitFootnoteStart(Footnote footnote)
public int visitFootnoteStart(Footnote footnote)
Called when enumeration of a footnote or endnote text has started.
Examples:
Shows how to print the node structure of every footnote in a document.
public void footnoteToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
FootnoteStructurePrinter visitor = new FootnoteStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Footnote nodes and their children.
///
public static class FootnoteStructurePrinter extends DocumentVisitor {
public FootnoteStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideFootnote = false;
}
///
/// Gets the plain text of the document that was accumulated by the visitor.
///
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Footnote node is encountered in the document.
///
public int visitFootnoteStart(final Footnote footnote) {
indentAndAppendLine("[Footnote start] Type: " + footnote.getFootnoteType());
mDocTraversalDepth++;
mVisitorIsInsideFootnote = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Footnote node have been visited.
///
public int visitFootnoteEnd(final Footnote footnote) {
mDocTraversalDepth--;
indentAndAppendLine("[Footnote end]");
mVisitorIsInsideFootnote = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideFootnote) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideFootnote;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Shows how to use a DocumentVisitor implementation to remove all hidden content from a document.
public void removeHiddenContentFromDocument() throws Exception {
Document doc = new Document(getMyDir() + "Hidden content.docx");
RemoveHiddenContentVisitor hiddenContentRemover = new RemoveHiddenContentVisitor();
// Below are three types of fields which can accept a document visitor,
// which will allow it to visit the accepting node, and then traverse its child nodes in a depth-first manner.
// 1 - Paragraph node:
Paragraph para = (Paragraph) doc.getChild(NodeType.PARAGRAPH, 4, true);
para.accept(hiddenContentRemover);
// 2 - Table node:
Table table = doc.getFirstSection().getBody().getTables().get(0);
table.accept(hiddenContentRemover);
// 3 - Document node:
doc.accept(hiddenContentRemover);
doc.save(getArtifactsDir() + "Font.RemoveHiddenContentFromDocument.docx");
}
///
/// Removes all visited nodes marked as "hidden content".
///
public static class RemoveHiddenContentVisitor extends DocumentVisitor {
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(FieldStart fieldStart) {
if (fieldStart.getFont().getHidden())
fieldStart.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(FieldEnd fieldEnd) {
if (fieldEnd.getFont().getHidden())
fieldEnd.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(FieldSeparator fieldSeparator) {
if (fieldSeparator.getFont().getHidden())
fieldSeparator.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(Run run) {
if (run.getFont().getHidden())
run.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(Paragraph paragraph) {
if (paragraph.getParagraphBreakFont().getHidden())
paragraph.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FormField is encountered in the document.
///
public int visitFormField(FormField formField) {
if (formField.getFont().getHidden())
formField.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a GroupShape is encountered in the document.
///
public int visitGroupShapeStart(GroupShape groupShape) {
if (groupShape.getFont().getHidden())
groupShape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Shape is encountered in the document.
///
public int visitShapeStart(Shape shape) {
if (shape.getFont().getHidden())
shape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment is encountered in the document.
///
public int visitCommentStart(Comment comment) {
if (comment.getFont().getHidden())
comment.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Footnote is encountered in the document.
///
public int visitFootnoteStart(Footnote footnote) {
if (footnote.getFont().getHidden())
footnote.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a SpecialCharacter is encountered in the document.
///
public int visitSpecialChar(SpecialChar specialChar) {
if (specialChar.getFont().getHidden())
specialChar.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Table node is ended in the document.
///
public int visitTableEnd(Table table) {
// The content inside table cells may have the hidden content flag, but the tables themselves cannot.
// If this table had nothing but hidden content, this visitor would have removed all of it,
// and there would be no child nodes left.
// Thus, we can also treat the table itself as hidden content and remove it.
// Tables which are empty but do not have hidden content will have cells with empty paragraphs inside,
// which this visitor will not remove.
if (!table.hasChildNodes())
table.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Cell node is ended in the document.
///
public int visitCellEnd(Cell cell) {
if (!cell.hasChildNodes() && cell.getParentNode() != null)
cell.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Row node is ended in the document.
///
public int visitRowEnd(Row row) {
if (!row.hasChildNodes() && row.getParentNode() != null)
row.remove();
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
footnote | Footnote | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitFormField(FormField formField)
public int visitFormField(FormField formField)
Called when a form field is encountered in the document.
Examples:
Shows how to use a DocumentVisitor implementation to remove all hidden content from a document.
public void removeHiddenContentFromDocument() throws Exception {
Document doc = new Document(getMyDir() + "Hidden content.docx");
RemoveHiddenContentVisitor hiddenContentRemover = new RemoveHiddenContentVisitor();
// Below are three types of fields which can accept a document visitor,
// which will allow it to visit the accepting node, and then traverse its child nodes in a depth-first manner.
// 1 - Paragraph node:
Paragraph para = (Paragraph) doc.getChild(NodeType.PARAGRAPH, 4, true);
para.accept(hiddenContentRemover);
// 2 - Table node:
Table table = doc.getFirstSection().getBody().getTables().get(0);
table.accept(hiddenContentRemover);
// 3 - Document node:
doc.accept(hiddenContentRemover);
doc.save(getArtifactsDir() + "Font.RemoveHiddenContentFromDocument.docx");
}
///
/// Removes all visited nodes marked as "hidden content".
///
public static class RemoveHiddenContentVisitor extends DocumentVisitor {
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(FieldStart fieldStart) {
if (fieldStart.getFont().getHidden())
fieldStart.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(FieldEnd fieldEnd) {
if (fieldEnd.getFont().getHidden())
fieldEnd.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(FieldSeparator fieldSeparator) {
if (fieldSeparator.getFont().getHidden())
fieldSeparator.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(Run run) {
if (run.getFont().getHidden())
run.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(Paragraph paragraph) {
if (paragraph.getParagraphBreakFont().getHidden())
paragraph.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FormField is encountered in the document.
///
public int visitFormField(FormField formField) {
if (formField.getFont().getHidden())
formField.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a GroupShape is encountered in the document.
///
public int visitGroupShapeStart(GroupShape groupShape) {
if (groupShape.getFont().getHidden())
groupShape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Shape is encountered in the document.
///
public int visitShapeStart(Shape shape) {
if (shape.getFont().getHidden())
shape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment is encountered in the document.
///
public int visitCommentStart(Comment comment) {
if (comment.getFont().getHidden())
comment.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Footnote is encountered in the document.
///
public int visitFootnoteStart(Footnote footnote) {
if (footnote.getFont().getHidden())
footnote.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a SpecialCharacter is encountered in the document.
///
public int visitSpecialChar(SpecialChar specialChar) {
if (specialChar.getFont().getHidden())
specialChar.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Table node is ended in the document.
///
public int visitTableEnd(Table table) {
// The content inside table cells may have the hidden content flag, but the tables themselves cannot.
// If this table had nothing but hidden content, this visitor would have removed all of it,
// and there would be no child nodes left.
// Thus, we can also treat the table itself as hidden content and remove it.
// Tables which are empty but do not have hidden content will have cells with empty paragraphs inside,
// which this visitor will not remove.
if (!table.hasChildNodes())
table.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Cell node is ended in the document.
///
public int visitCellEnd(Cell cell) {
if (!cell.hasChildNodes() && cell.getParentNode() != null)
cell.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Row node is ended in the document.
///
public int visitRowEnd(Row row) {
if (!row.hasChildNodes() && row.getParentNode() != null)
row.remove();
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
formField | FormField | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitGlossaryDocumentEnd(GlossaryDocument glossary)
public int visitGlossaryDocumentEnd(GlossaryDocument glossary)
Called when enumeration of a glossary document has ended.
Remarks:
Note: A glossary document node and its children are not visited when you execute a Visitor over a Document. If you want to execute a Visitor over a glossary document, you need to call GlossaryDocument.accept(com.aspose.words.DocumentVisitor).
Examples:
Shows ways of accessing building blocks in a glossary document.
public void glossaryDocument() throws Exception {
Document doc = new Document();
GlossaryDocument glossaryDoc = new GlossaryDocument();
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 1"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 2"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 3"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 4"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 5"));
Assert.assertEquals(glossaryDoc.getBuildingBlocks().getCount(), 5);
doc.setGlossaryDocument(glossaryDoc);
// There are various ways of accessing building blocks.
// 1 - Get the first/last building blocks in the collection:
Assert.assertEquals("Block 1", glossaryDoc.getFirstBuildingBlock().getName());
Assert.assertEquals("Block 5", glossaryDoc.getLastBuildingBlock().getName());
// 2 - Get a building block by index:
Assert.assertEquals("Block 2", glossaryDoc.getBuildingBlocks().get(1).getName());
Assert.assertEquals("Block 3", glossaryDoc.getBuildingBlocks().toArray()[2].getName());
// 3 - Get the first building block that matches a gallery, name and category:
Assert.assertEquals("Block 4",
glossaryDoc.getBuildingBlock(BuildingBlockGallery.ALL, "(Empty Category)", "Block 4").getName());
// We will do that using a custom visitor,
// which will give every BuildingBlock in the GlossaryDocument a unique GUID
GlossaryDocVisitor visitor = new GlossaryDocVisitor();
// Visit start/end of the Glossary document.
glossaryDoc.accept(visitor);
// Visit only start of the Glossary document.
glossaryDoc.acceptStart(visitor);
// Visit only end of the Glossary document.
glossaryDoc.acceptEnd(visitor);
System.out.println(visitor.getText());
// In Microsoft Word, we can access the building blocks via "Insert" -> "Quick Parts" -> "Building Blocks Organizer".
doc.save(getArtifactsDir() + "BuildingBlocks.GlossaryDocument.dotx");
}
public static BuildingBlock createNewBuildingBlock(final GlossaryDocument glossaryDoc, final String buildingBlockName) {
BuildingBlock buildingBlock = new BuildingBlock(glossaryDoc);
buildingBlock.setName(buildingBlockName);
return buildingBlock;
}
///
/// Gives each building block in a visited glossary document a unique GUID.
/// Stores the GUID-building block pairs in a dictionary.
///
public static class GlossaryDocVisitor extends DocumentVisitor {
public GlossaryDocVisitor() {
mBlocksByGuid = new HashMap<>();
mBuilder = new StringBuilder();
}
public String getText() {
return mBuilder.toString();
}
public HashMap getDictionary() {
return mBlocksByGuid;
}
public int visitGlossaryDocumentStart(final GlossaryDocument glossary) {
mBuilder.append("Glossary document found!\n");
return VisitorAction.CONTINUE;
}
public int visitGlossaryDocumentEnd(final GlossaryDocument glossary) {
mBuilder.append("Reached end of glossary!\n");
mBuilder.append("BuildingBlocks found: " + mBlocksByGuid.size() + "\r\n");
return VisitorAction.CONTINUE;
}
public int visitBuildingBlockStart(final BuildingBlock block) {
block.setGuid(UUID.randomUUID());
mBlocksByGuid.put(block.getGuid(), block);
return VisitorAction.CONTINUE;
}
public int visitBuildingBlockEnd(final BuildingBlock block) {
mBuilder.append("\tVisited block \"" + block.getName() + "\"" + "\r\n");
mBuilder.append("\t Type: " + block.getType() + "\r\n");
mBuilder.append("\t Gallery: " + block.getGallery() + "\r\n");
mBuilder.append("\t Behavior: " + block.getBehavior() + "\r\n");
mBuilder.append("\t Description: " + block.getDescription() + "\r\n");
return VisitorAction.CONTINUE;
}
private final HashMap mBlocksByGuid;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
glossary | GlossaryDocument | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitGlossaryDocumentStart(GlossaryDocument glossary)
public int visitGlossaryDocumentStart(GlossaryDocument glossary)
Called when enumeration of a glossary document has started.
Remarks:
Note: A glossary document node and its children are not visited when you execute a Visitor over a Document. If you want to execute a Visitor over a glossary document, you need to call GlossaryDocument.accept(com.aspose.words.DocumentVisitor).
Examples:
Shows ways of accessing building blocks in a glossary document.
public void glossaryDocument() throws Exception {
Document doc = new Document();
GlossaryDocument glossaryDoc = new GlossaryDocument();
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 1"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 2"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 3"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 4"));
glossaryDoc.appendChild(createNewBuildingBlock(glossaryDoc, "Block 5"));
Assert.assertEquals(glossaryDoc.getBuildingBlocks().getCount(), 5);
doc.setGlossaryDocument(glossaryDoc);
// There are various ways of accessing building blocks.
// 1 - Get the first/last building blocks in the collection:
Assert.assertEquals("Block 1", glossaryDoc.getFirstBuildingBlock().getName());
Assert.assertEquals("Block 5", glossaryDoc.getLastBuildingBlock().getName());
// 2 - Get a building block by index:
Assert.assertEquals("Block 2", glossaryDoc.getBuildingBlocks().get(1).getName());
Assert.assertEquals("Block 3", glossaryDoc.getBuildingBlocks().toArray()[2].getName());
// 3 - Get the first building block that matches a gallery, name and category:
Assert.assertEquals("Block 4",
glossaryDoc.getBuildingBlock(BuildingBlockGallery.ALL, "(Empty Category)", "Block 4").getName());
// We will do that using a custom visitor,
// which will give every BuildingBlock in the GlossaryDocument a unique GUID
GlossaryDocVisitor visitor = new GlossaryDocVisitor();
// Visit start/end of the Glossary document.
glossaryDoc.accept(visitor);
// Visit only start of the Glossary document.
glossaryDoc.acceptStart(visitor);
// Visit only end of the Glossary document.
glossaryDoc.acceptEnd(visitor);
System.out.println(visitor.getText());
// In Microsoft Word, we can access the building blocks via "Insert" -> "Quick Parts" -> "Building Blocks Organizer".
doc.save(getArtifactsDir() + "BuildingBlocks.GlossaryDocument.dotx");
}
public static BuildingBlock createNewBuildingBlock(final GlossaryDocument glossaryDoc, final String buildingBlockName) {
BuildingBlock buildingBlock = new BuildingBlock(glossaryDoc);
buildingBlock.setName(buildingBlockName);
return buildingBlock;
}
///
/// Gives each building block in a visited glossary document a unique GUID.
/// Stores the GUID-building block pairs in a dictionary.
///
public static class GlossaryDocVisitor extends DocumentVisitor {
public GlossaryDocVisitor() {
mBlocksByGuid = new HashMap<>();
mBuilder = new StringBuilder();
}
public String getText() {
return mBuilder.toString();
}
public HashMap getDictionary() {
return mBlocksByGuid;
}
public int visitGlossaryDocumentStart(final GlossaryDocument glossary) {
mBuilder.append("Glossary document found!\n");
return VisitorAction.CONTINUE;
}
public int visitGlossaryDocumentEnd(final GlossaryDocument glossary) {
mBuilder.append("Reached end of glossary!\n");
mBuilder.append("BuildingBlocks found: " + mBlocksByGuid.size() + "\r\n");
return VisitorAction.CONTINUE;
}
public int visitBuildingBlockStart(final BuildingBlock block) {
block.setGuid(UUID.randomUUID());
mBlocksByGuid.put(block.getGuid(), block);
return VisitorAction.CONTINUE;
}
public int visitBuildingBlockEnd(final BuildingBlock block) {
mBuilder.append("\tVisited block \"" + block.getName() + "\"" + "\r\n");
mBuilder.append("\t Type: " + block.getType() + "\r\n");
mBuilder.append("\t Gallery: " + block.getGallery() + "\r\n");
mBuilder.append("\t Behavior: " + block.getBehavior() + "\r\n");
mBuilder.append("\t Description: " + block.getDescription() + "\r\n");
return VisitorAction.CONTINUE;
}
private final HashMap mBlocksByGuid;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
glossary | GlossaryDocument | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitGroupShapeEnd(GroupShape groupShape)
public int visitGroupShapeEnd(GroupShape groupShape)
Called when enumeration of a group shape has ended.
Examples:
Shows how to create a group of shapes, and print its contents using a document visitor.
public void groupOfShapes() throws Exception {
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
// If you need to create "NonPrimitive" shapes, such as SingleCornerSnipped, TopCornersSnipped, DiagonalCornersSnipped,
// TopCornersOneRoundedOneSnipped, SingleCornerRounded, TopCornersRounded, DiagonalCornersRounded
// please use DocumentBuilder.InsertShape methods.
Shape balloon = new Shape(doc, ShapeType.BALLOON);
balloon.setWidth(200.0);
balloon.setHeight(200.0);
balloon.setStrokeColor(Color.RED);
Shape cube = new Shape(doc, ShapeType.CUBE);
cube.setWidth(100.0);
cube.setHeight(100.0);
cube.setStrokeColor(Color.BLUE);
GroupShape group = new GroupShape(doc);
group.appendChild(balloon);
group.appendChild(cube);
Assert.assertTrue(group.isGroup());
builder.insertNode(group);
ShapeInfoPrinter printer = new ShapeInfoPrinter();
group.accept(printer);
System.out.println(printer.getText());
}
///
/// Prints the contents of a visited shape group to the console.
///
public static class ShapeInfoPrinter extends DocumentVisitor {
public ShapeInfoPrinter() {
mBuilder = new StringBuilder();
}
public String getText() {
return mBuilder.toString();
}
public int visitGroupShapeStart(final GroupShape groupShape) {
mBuilder.append("Shape group started:\r\n");
return VisitorAction.CONTINUE;
}
public int visitGroupShapeEnd(final GroupShape groupShape) {
mBuilder.append("End of shape group\r\n");
return VisitorAction.CONTINUE;
}
public int visitShapeStart(final Shape shape) {
mBuilder.append("\tShape - " + shape.getShapeType() + ":\r\n");
mBuilder.append("\t\tWidth: " + shape.getWidth() + "\r\n");
mBuilder.append("\t\tHeight: " + shape.getHeight() + "\r\n");
mBuilder.append("\t\tStroke color: " + shape.getStroke().getColor() + "\r\n");
mBuilder.append("\t\tFill color: " + shape.getFill().getForeColor() + "\r\n");
return VisitorAction.CONTINUE;
}
public int visitShapeEnd(final Shape shape) {
mBuilder.append("\tEnd of shape\r\n");
return VisitorAction.CONTINUE;
}
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
groupShape | GroupShape | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitGroupShapeStart(GroupShape groupShape)
public int visitGroupShapeStart(GroupShape groupShape)
Called when enumeration of a group shape has started.
Examples:
Shows how to create a group of shapes, and print its contents using a document visitor.
public void groupOfShapes() throws Exception {
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
// If you need to create "NonPrimitive" shapes, such as SingleCornerSnipped, TopCornersSnipped, DiagonalCornersSnipped,
// TopCornersOneRoundedOneSnipped, SingleCornerRounded, TopCornersRounded, DiagonalCornersRounded
// please use DocumentBuilder.InsertShape methods.
Shape balloon = new Shape(doc, ShapeType.BALLOON);
balloon.setWidth(200.0);
balloon.setHeight(200.0);
balloon.setStrokeColor(Color.RED);
Shape cube = new Shape(doc, ShapeType.CUBE);
cube.setWidth(100.0);
cube.setHeight(100.0);
cube.setStrokeColor(Color.BLUE);
GroupShape group = new GroupShape(doc);
group.appendChild(balloon);
group.appendChild(cube);
Assert.assertTrue(group.isGroup());
builder.insertNode(group);
ShapeInfoPrinter printer = new ShapeInfoPrinter();
group.accept(printer);
System.out.println(printer.getText());
}
///
/// Prints the contents of a visited shape group to the console.
///
public static class ShapeInfoPrinter extends DocumentVisitor {
public ShapeInfoPrinter() {
mBuilder = new StringBuilder();
}
public String getText() {
return mBuilder.toString();
}
public int visitGroupShapeStart(final GroupShape groupShape) {
mBuilder.append("Shape group started:\r\n");
return VisitorAction.CONTINUE;
}
public int visitGroupShapeEnd(final GroupShape groupShape) {
mBuilder.append("End of shape group\r\n");
return VisitorAction.CONTINUE;
}
public int visitShapeStart(final Shape shape) {
mBuilder.append("\tShape - " + shape.getShapeType() + ":\r\n");
mBuilder.append("\t\tWidth: " + shape.getWidth() + "\r\n");
mBuilder.append("\t\tHeight: " + shape.getHeight() + "\r\n");
mBuilder.append("\t\tStroke color: " + shape.getStroke().getColor() + "\r\n");
mBuilder.append("\t\tFill color: " + shape.getFill().getForeColor() + "\r\n");
return VisitorAction.CONTINUE;
}
public int visitShapeEnd(final Shape shape) {
mBuilder.append("\tEnd of shape\r\n");
return VisitorAction.CONTINUE;
}
private final StringBuilder mBuilder;
}
Shows how to use a DocumentVisitor implementation to remove all hidden content from a document.
public void removeHiddenContentFromDocument() throws Exception {
Document doc = new Document(getMyDir() + "Hidden content.docx");
RemoveHiddenContentVisitor hiddenContentRemover = new RemoveHiddenContentVisitor();
// Below are three types of fields which can accept a document visitor,
// which will allow it to visit the accepting node, and then traverse its child nodes in a depth-first manner.
// 1 - Paragraph node:
Paragraph para = (Paragraph) doc.getChild(NodeType.PARAGRAPH, 4, true);
para.accept(hiddenContentRemover);
// 2 - Table node:
Table table = doc.getFirstSection().getBody().getTables().get(0);
table.accept(hiddenContentRemover);
// 3 - Document node:
doc.accept(hiddenContentRemover);
doc.save(getArtifactsDir() + "Font.RemoveHiddenContentFromDocument.docx");
}
///
/// Removes all visited nodes marked as "hidden content".
///
public static class RemoveHiddenContentVisitor extends DocumentVisitor {
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(FieldStart fieldStart) {
if (fieldStart.getFont().getHidden())
fieldStart.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(FieldEnd fieldEnd) {
if (fieldEnd.getFont().getHidden())
fieldEnd.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(FieldSeparator fieldSeparator) {
if (fieldSeparator.getFont().getHidden())
fieldSeparator.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(Run run) {
if (run.getFont().getHidden())
run.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(Paragraph paragraph) {
if (paragraph.getParagraphBreakFont().getHidden())
paragraph.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FormField is encountered in the document.
///
public int visitFormField(FormField formField) {
if (formField.getFont().getHidden())
formField.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a GroupShape is encountered in the document.
///
public int visitGroupShapeStart(GroupShape groupShape) {
if (groupShape.getFont().getHidden())
groupShape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Shape is encountered in the document.
///
public int visitShapeStart(Shape shape) {
if (shape.getFont().getHidden())
shape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment is encountered in the document.
///
public int visitCommentStart(Comment comment) {
if (comment.getFont().getHidden())
comment.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Footnote is encountered in the document.
///
public int visitFootnoteStart(Footnote footnote) {
if (footnote.getFont().getHidden())
footnote.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a SpecialCharacter is encountered in the document.
///
public int visitSpecialChar(SpecialChar specialChar) {
if (specialChar.getFont().getHidden())
specialChar.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Table node is ended in the document.
///
public int visitTableEnd(Table table) {
// The content inside table cells may have the hidden content flag, but the tables themselves cannot.
// If this table had nothing but hidden content, this visitor would have removed all of it,
// and there would be no child nodes left.
// Thus, we can also treat the table itself as hidden content and remove it.
// Tables which are empty but do not have hidden content will have cells with empty paragraphs inside,
// which this visitor will not remove.
if (!table.hasChildNodes())
table.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Cell node is ended in the document.
///
public int visitCellEnd(Cell cell) {
if (!cell.hasChildNodes() && cell.getParentNode() != null)
cell.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Row node is ended in the document.
///
public int visitRowEnd(Row row) {
if (!row.hasChildNodes() && row.getParentNode() != null)
row.remove();
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
groupShape | GroupShape | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitHeaderFooterEnd(HeaderFooter headerFooter)
public int visitHeaderFooterEnd(HeaderFooter headerFooter)
Called when enumeration of a header or footer in a section has ended.
Examples:
Shows how to print the node structure of every header and footer in a document.
public void headerFooterToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
HeaderFooterStructurePrinter visitor = new HeaderFooterStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
// An alternative way of accessing a document's header/footers section-by-section is by accessing the collection.
HeaderFooter[] headerFooters = doc.getFirstSection().getHeadersFooters().toArray();
Assert.assertEquals(3, headerFooters.length);
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered HeaderFooter nodes and their children.
///
public static class HeaderFooterStructurePrinter extends DocumentVisitor {
public HeaderFooterStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideHeaderFooter = false;
}
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideHeaderFooter) indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a HeaderFooter node is encountered in the document.
///
public int visitHeaderFooterStart(final HeaderFooter headerFooter) {
indentAndAppendLine("[HeaderFooter start] HeaderFooterType: " + headerFooter.getHeaderFooterType());
mDocTraversalDepth++;
mVisitorIsInsideHeaderFooter = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a HeaderFooter node have been visited.
///
public int visitHeaderFooterEnd(final HeaderFooter headerFooter) {
mDocTraversalDepth--;
indentAndAppendLine("[HeaderFooter end]");
mVisitorIsInsideHeaderFooter = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideHeaderFooter;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
headerFooter | HeaderFooter | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitHeaderFooterStart(HeaderFooter headerFooter)
public int visitHeaderFooterStart(HeaderFooter headerFooter)
Called when enumeration of a header or footer in a section has started.
Examples:
Shows how to print the node structure of every header and footer in a document.
public void headerFooterToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
HeaderFooterStructurePrinter visitor = new HeaderFooterStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
// An alternative way of accessing a document's header/footers section-by-section is by accessing the collection.
HeaderFooter[] headerFooters = doc.getFirstSection().getHeadersFooters().toArray();
Assert.assertEquals(3, headerFooters.length);
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered HeaderFooter nodes and their children.
///
public static class HeaderFooterStructurePrinter extends DocumentVisitor {
public HeaderFooterStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideHeaderFooter = false;
}
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideHeaderFooter) indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a HeaderFooter node is encountered in the document.
///
public int visitHeaderFooterStart(final HeaderFooter headerFooter) {
indentAndAppendLine("[HeaderFooter start] HeaderFooterType: " + headerFooter.getHeaderFooterType());
mDocTraversalDepth++;
mVisitorIsInsideHeaderFooter = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a HeaderFooter node have been visited.
///
public int visitHeaderFooterEnd(final HeaderFooter headerFooter) {
mDocTraversalDepth--;
indentAndAppendLine("[HeaderFooter end]");
mVisitorIsInsideHeaderFooter = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideHeaderFooter;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
headerFooter | HeaderFooter | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitOfficeMathEnd(OfficeMath officeMath)
public int visitOfficeMathEnd(OfficeMath officeMath)
Called when enumeration of a Office Math object has ended.
Examples:
Shows how to print the node structure of every office math node in a document.
public void officeMathToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
OfficeMathStructurePrinter visitor = new OfficeMathStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered OfficeMath nodes and their children.
///
public static class OfficeMathStructurePrinter extends DocumentVisitor {
public OfficeMathStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideOfficeMath = false;
}
///
/// Gets the plain text of the document that was accumulated by the visitor.
///
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideOfficeMath) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when an OfficeMath node is encountered in the document.
///
public int visitOfficeMathStart(final OfficeMath officeMath) {
indentAndAppendLine("[OfficeMath start] Math object type: " + officeMath.getMathObjectType());
mDocTraversalDepth++;
mVisitorIsInsideOfficeMath = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of an OfficeMath node have been visited.
///
public int visitOfficeMathEnd(final OfficeMath officeMath) {
mDocTraversalDepth--;
indentAndAppendLine("[OfficeMath end]");
mVisitorIsInsideOfficeMath = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideOfficeMath;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
officeMath | OfficeMath | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitOfficeMathStart(OfficeMath officeMath)
public int visitOfficeMathStart(OfficeMath officeMath)
Called when enumeration of a Office Math object has started.
Examples:
Shows how to print the node structure of every office math node in a document.
public void officeMathToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
OfficeMathStructurePrinter visitor = new OfficeMathStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered OfficeMath nodes and their children.
///
public static class OfficeMathStructurePrinter extends DocumentVisitor {
public OfficeMathStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideOfficeMath = false;
}
///
/// Gets the plain text of the document that was accumulated by the visitor.
///
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideOfficeMath) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when an OfficeMath node is encountered in the document.
///
public int visitOfficeMathStart(final OfficeMath officeMath) {
indentAndAppendLine("[OfficeMath start] Math object type: " + officeMath.getMathObjectType());
mDocTraversalDepth++;
mVisitorIsInsideOfficeMath = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of an OfficeMath node have been visited.
///
public int visitOfficeMathEnd(final OfficeMath officeMath) {
mDocTraversalDepth--;
indentAndAppendLine("[OfficeMath end]");
mVisitorIsInsideOfficeMath = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideOfficeMath;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
officeMath | OfficeMath | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitParagraphEnd(Paragraph paragraph)
public int visitParagraphEnd(Paragraph paragraph)
Called when enumeration of a paragraph has ended.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Parameters:
Parameter | Type | Description |
---|---|---|
paragraph | Paragraph | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitParagraphStart(Paragraph paragraph)
public int visitParagraphStart(Paragraph paragraph)
Called when enumeration of a paragraph has started.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Shows how to use a DocumentVisitor implementation to remove all hidden content from a document.
public void removeHiddenContentFromDocument() throws Exception {
Document doc = new Document(getMyDir() + "Hidden content.docx");
RemoveHiddenContentVisitor hiddenContentRemover = new RemoveHiddenContentVisitor();
// Below are three types of fields which can accept a document visitor,
// which will allow it to visit the accepting node, and then traverse its child nodes in a depth-first manner.
// 1 - Paragraph node:
Paragraph para = (Paragraph) doc.getChild(NodeType.PARAGRAPH, 4, true);
para.accept(hiddenContentRemover);
// 2 - Table node:
Table table = doc.getFirstSection().getBody().getTables().get(0);
table.accept(hiddenContentRemover);
// 3 - Document node:
doc.accept(hiddenContentRemover);
doc.save(getArtifactsDir() + "Font.RemoveHiddenContentFromDocument.docx");
}
///
/// Removes all visited nodes marked as "hidden content".
///
public static class RemoveHiddenContentVisitor extends DocumentVisitor {
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(FieldStart fieldStart) {
if (fieldStart.getFont().getHidden())
fieldStart.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(FieldEnd fieldEnd) {
if (fieldEnd.getFont().getHidden())
fieldEnd.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(FieldSeparator fieldSeparator) {
if (fieldSeparator.getFont().getHidden())
fieldSeparator.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(Run run) {
if (run.getFont().getHidden())
run.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(Paragraph paragraph) {
if (paragraph.getParagraphBreakFont().getHidden())
paragraph.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FormField is encountered in the document.
///
public int visitFormField(FormField formField) {
if (formField.getFont().getHidden())
formField.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a GroupShape is encountered in the document.
///
public int visitGroupShapeStart(GroupShape groupShape) {
if (groupShape.getFont().getHidden())
groupShape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Shape is encountered in the document.
///
public int visitShapeStart(Shape shape) {
if (shape.getFont().getHidden())
shape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment is encountered in the document.
///
public int visitCommentStart(Comment comment) {
if (comment.getFont().getHidden())
comment.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Footnote is encountered in the document.
///
public int visitFootnoteStart(Footnote footnote) {
if (footnote.getFont().getHidden())
footnote.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a SpecialCharacter is encountered in the document.
///
public int visitSpecialChar(SpecialChar specialChar) {
if (specialChar.getFont().getHidden())
specialChar.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Table node is ended in the document.
///
public int visitTableEnd(Table table) {
// The content inside table cells may have the hidden content flag, but the tables themselves cannot.
// If this table had nothing but hidden content, this visitor would have removed all of it,
// and there would be no child nodes left.
// Thus, we can also treat the table itself as hidden content and remove it.
// Tables which are empty but do not have hidden content will have cells with empty paragraphs inside,
// which this visitor will not remove.
if (!table.hasChildNodes())
table.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Cell node is ended in the document.
///
public int visitCellEnd(Cell cell) {
if (!cell.hasChildNodes() && cell.getParentNode() != null)
cell.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Row node is ended in the document.
///
public int visitRowEnd(Row row) {
if (!row.hasChildNodes() && row.getParentNode() != null)
row.remove();
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
paragraph | Paragraph | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitRowEnd(Row row)
public int visitRowEnd(Row row)
Called when enumeration of a table row has ended.
Examples:
Shows how to print the node structure of every table in a document.
public void tableToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
TableStructurePrinter visitor = new TableStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Table nodes and their children.
///
public static class TableStructurePrinter extends DocumentVisitor {
public TableStructurePrinter() {
mVisitedTables = new StringBuilder();
mVisitorIsInsideTable = false;
}
public String getText() {
return mVisitedTables.toString();
}
///
/// Called when a Run node is encountered in the document.
/// Runs that are not within tables are not recorded.
///
public int visitRun(Run run) {
if (mVisitorIsInsideTable) indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a Table is encountered in the document.
///
public int visitTableStart(final Table table) {
int rows = 0;
int columns = 0;
if (table.getRows().getCount() > 0) {
rows = table.getRows().getCount();
columns = table.getFirstRow().getCount();
}
indentAndAppendLine("[Table start] Size: " + rows + "x" + columns);
mDocTraversalDepth++;
mVisitorIsInsideTable = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Table node have been visited.
///
public int visitTableEnd(final Table table) {
mDocTraversalDepth--;
indentAndAppendLine("[Table end]");
mVisitorIsInsideTable = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Row node is encountered in the document.
///
public int visitRowStart(final Row row) {
String rowContents = row.getText().replaceAll("\\u0007", ", ").replaceAll(", , ", "");
int rowWidth = row.indexOf(row.getLastCell()) + 1;
int rowIndex = row.getParentTable().indexOf(row);
String rowStatusInTable = row.isFirstRow() && row.isLastRow() ? "only" : row.isFirstRow() ? "first" : row.isLastRow() ? "last" : "";
if (!"".equals(rowStatusInTable)) {
rowStatusInTable = MessageFormat.format(", the {0} row in this table,", rowStatusInTable);
}
indentAndAppendLine(MessageFormat.format("[Row start] Row #{0}{1} width {2}, \"{3}\"", ++rowIndex, rowStatusInTable, rowWidth, rowContents));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Row node have been visited.
///
public int visitRowEnd(final Row row) {
mDocTraversalDepth--;
indentAndAppendLine("[Row end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Cell node is encountered in the document.
///
public int visitCellStart(final Cell cell) {
Row row = cell.getParentRow();
Table table = row.getParentTable();
String cellStatusInRow = cell.isFirstCell() && cell.isLastCell() ? "only" : cell.isFirstCell() ? "first" : cell.isLastCell() ? "last" : "";
if (!"".equals(cellStatusInRow)) {
cellStatusInRow = MessageFormat.format(", the {0} cell in this row", cellStatusInRow);
}
indentAndAppendLine(MessageFormat.format("[Cell start] Row {0}, Col {1}{2}", table.indexOf(row) + 1, row.indexOf(cell) + 1, cellStatusInRow));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Cell node have been visited.
///
public int visitCellEnd(final Cell cell) {
mDocTraversalDepth--;
indentAndAppendLine("[Cell end]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into the current table's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mVisitedTables.append("| ");
}
mVisitedTables.append(text + "\r\n");
}
private boolean mVisitorIsInsideTable;
private int mDocTraversalDepth;
private final StringBuilder mVisitedTables;
}
Shows how to use a DocumentVisitor implementation to remove all hidden content from a document.
public void removeHiddenContentFromDocument() throws Exception {
Document doc = new Document(getMyDir() + "Hidden content.docx");
RemoveHiddenContentVisitor hiddenContentRemover = new RemoveHiddenContentVisitor();
// Below are three types of fields which can accept a document visitor,
// which will allow it to visit the accepting node, and then traverse its child nodes in a depth-first manner.
// 1 - Paragraph node:
Paragraph para = (Paragraph) doc.getChild(NodeType.PARAGRAPH, 4, true);
para.accept(hiddenContentRemover);
// 2 - Table node:
Table table = doc.getFirstSection().getBody().getTables().get(0);
table.accept(hiddenContentRemover);
// 3 - Document node:
doc.accept(hiddenContentRemover);
doc.save(getArtifactsDir() + "Font.RemoveHiddenContentFromDocument.docx");
}
///
/// Removes all visited nodes marked as "hidden content".
///
public static class RemoveHiddenContentVisitor extends DocumentVisitor {
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(FieldStart fieldStart) {
if (fieldStart.getFont().getHidden())
fieldStart.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(FieldEnd fieldEnd) {
if (fieldEnd.getFont().getHidden())
fieldEnd.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(FieldSeparator fieldSeparator) {
if (fieldSeparator.getFont().getHidden())
fieldSeparator.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(Run run) {
if (run.getFont().getHidden())
run.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(Paragraph paragraph) {
if (paragraph.getParagraphBreakFont().getHidden())
paragraph.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FormField is encountered in the document.
///
public int visitFormField(FormField formField) {
if (formField.getFont().getHidden())
formField.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a GroupShape is encountered in the document.
///
public int visitGroupShapeStart(GroupShape groupShape) {
if (groupShape.getFont().getHidden())
groupShape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Shape is encountered in the document.
///
public int visitShapeStart(Shape shape) {
if (shape.getFont().getHidden())
shape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment is encountered in the document.
///
public int visitCommentStart(Comment comment) {
if (comment.getFont().getHidden())
comment.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Footnote is encountered in the document.
///
public int visitFootnoteStart(Footnote footnote) {
if (footnote.getFont().getHidden())
footnote.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a SpecialCharacter is encountered in the document.
///
public int visitSpecialChar(SpecialChar specialChar) {
if (specialChar.getFont().getHidden())
specialChar.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Table node is ended in the document.
///
public int visitTableEnd(Table table) {
// The content inside table cells may have the hidden content flag, but the tables themselves cannot.
// If this table had nothing but hidden content, this visitor would have removed all of it,
// and there would be no child nodes left.
// Thus, we can also treat the table itself as hidden content and remove it.
// Tables which are empty but do not have hidden content will have cells with empty paragraphs inside,
// which this visitor will not remove.
if (!table.hasChildNodes())
table.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Cell node is ended in the document.
///
public int visitCellEnd(Cell cell) {
if (!cell.hasChildNodes() && cell.getParentNode() != null)
cell.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Row node is ended in the document.
///
public int visitRowEnd(Row row) {
if (!row.hasChildNodes() && row.getParentNode() != null)
row.remove();
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
row | Row | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitRowStart(Row row)
public int visitRowStart(Row row)
Called when enumeration of a table row has started.
Examples:
Shows how to print the node structure of every table in a document.
public void tableToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
TableStructurePrinter visitor = new TableStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Table nodes and their children.
///
public static class TableStructurePrinter extends DocumentVisitor {
public TableStructurePrinter() {
mVisitedTables = new StringBuilder();
mVisitorIsInsideTable = false;
}
public String getText() {
return mVisitedTables.toString();
}
///
/// Called when a Run node is encountered in the document.
/// Runs that are not within tables are not recorded.
///
public int visitRun(Run run) {
if (mVisitorIsInsideTable) indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a Table is encountered in the document.
///
public int visitTableStart(final Table table) {
int rows = 0;
int columns = 0;
if (table.getRows().getCount() > 0) {
rows = table.getRows().getCount();
columns = table.getFirstRow().getCount();
}
indentAndAppendLine("[Table start] Size: " + rows + "x" + columns);
mDocTraversalDepth++;
mVisitorIsInsideTable = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Table node have been visited.
///
public int visitTableEnd(final Table table) {
mDocTraversalDepth--;
indentAndAppendLine("[Table end]");
mVisitorIsInsideTable = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Row node is encountered in the document.
///
public int visitRowStart(final Row row) {
String rowContents = row.getText().replaceAll("\\u0007", ", ").replaceAll(", , ", "");
int rowWidth = row.indexOf(row.getLastCell()) + 1;
int rowIndex = row.getParentTable().indexOf(row);
String rowStatusInTable = row.isFirstRow() && row.isLastRow() ? "only" : row.isFirstRow() ? "first" : row.isLastRow() ? "last" : "";
if (!"".equals(rowStatusInTable)) {
rowStatusInTable = MessageFormat.format(", the {0} row in this table,", rowStatusInTable);
}
indentAndAppendLine(MessageFormat.format("[Row start] Row #{0}{1} width {2}, \"{3}\"", ++rowIndex, rowStatusInTable, rowWidth, rowContents));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Row node have been visited.
///
public int visitRowEnd(final Row row) {
mDocTraversalDepth--;
indentAndAppendLine("[Row end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Cell node is encountered in the document.
///
public int visitCellStart(final Cell cell) {
Row row = cell.getParentRow();
Table table = row.getParentTable();
String cellStatusInRow = cell.isFirstCell() && cell.isLastCell() ? "only" : cell.isFirstCell() ? "first" : cell.isLastCell() ? "last" : "";
if (!"".equals(cellStatusInRow)) {
cellStatusInRow = MessageFormat.format(", the {0} cell in this row", cellStatusInRow);
}
indentAndAppendLine(MessageFormat.format("[Cell start] Row {0}, Col {1}{2}", table.indexOf(row) + 1, row.indexOf(cell) + 1, cellStatusInRow));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Cell node have been visited.
///
public int visitCellEnd(final Cell cell) {
mDocTraversalDepth--;
indentAndAppendLine("[Cell end]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into the current table's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mVisitedTables.append("| ");
}
mVisitedTables.append(text + "\r\n");
}
private boolean mVisitorIsInsideTable;
private int mDocTraversalDepth;
private final StringBuilder mVisitedTables;
}
Parameters:
Parameter | Type | Description |
---|---|---|
row | Row | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitRun(Run run)
public int visitRun(Run run)
Called when a run of text in the is encountered.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Parameters:
Parameter | Type | Description |
---|---|---|
run | Run | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitSectionEnd(Section section)
public int visitSectionEnd(Section section)
Called when enumeration of a section has ended.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Parameters:
Parameter | Type | Description |
---|---|---|
section | Section | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitSectionStart(Section section)
public int visitSectionStart(Section section)
Called when enumeration of a section has started.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Parameters:
Parameter | Type | Description |
---|---|---|
section | Section | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitShapeEnd(Shape shape)
public int visitShapeEnd(Shape shape)
Called when enumeration of a shape has ended.
Examples:
Shows how to create a group of shapes, and print its contents using a document visitor.
public void groupOfShapes() throws Exception {
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
// If you need to create "NonPrimitive" shapes, such as SingleCornerSnipped, TopCornersSnipped, DiagonalCornersSnipped,
// TopCornersOneRoundedOneSnipped, SingleCornerRounded, TopCornersRounded, DiagonalCornersRounded
// please use DocumentBuilder.InsertShape methods.
Shape balloon = new Shape(doc, ShapeType.BALLOON);
balloon.setWidth(200.0);
balloon.setHeight(200.0);
balloon.setStrokeColor(Color.RED);
Shape cube = new Shape(doc, ShapeType.CUBE);
cube.setWidth(100.0);
cube.setHeight(100.0);
cube.setStrokeColor(Color.BLUE);
GroupShape group = new GroupShape(doc);
group.appendChild(balloon);
group.appendChild(cube);
Assert.assertTrue(group.isGroup());
builder.insertNode(group);
ShapeInfoPrinter printer = new ShapeInfoPrinter();
group.accept(printer);
System.out.println(printer.getText());
}
///
/// Prints the contents of a visited shape group to the console.
///
public static class ShapeInfoPrinter extends DocumentVisitor {
public ShapeInfoPrinter() {
mBuilder = new StringBuilder();
}
public String getText() {
return mBuilder.toString();
}
public int visitGroupShapeStart(final GroupShape groupShape) {
mBuilder.append("Shape group started:\r\n");
return VisitorAction.CONTINUE;
}
public int visitGroupShapeEnd(final GroupShape groupShape) {
mBuilder.append("End of shape group\r\n");
return VisitorAction.CONTINUE;
}
public int visitShapeStart(final Shape shape) {
mBuilder.append("\tShape - " + shape.getShapeType() + ":\r\n");
mBuilder.append("\t\tWidth: " + shape.getWidth() + "\r\n");
mBuilder.append("\t\tHeight: " + shape.getHeight() + "\r\n");
mBuilder.append("\t\tStroke color: " + shape.getStroke().getColor() + "\r\n");
mBuilder.append("\t\tFill color: " + shape.getFill().getForeColor() + "\r\n");
return VisitorAction.CONTINUE;
}
public int visitShapeEnd(final Shape shape) {
mBuilder.append("\tEnd of shape\r\n");
return VisitorAction.CONTINUE;
}
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
shape | Shape | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitShapeStart(Shape shape)
public int visitShapeStart(Shape shape)
Called when enumeration of a shape has started.
Examples:
Shows how to create a group of shapes, and print its contents using a document visitor.
public void groupOfShapes() throws Exception {
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
// If you need to create "NonPrimitive" shapes, such as SingleCornerSnipped, TopCornersSnipped, DiagonalCornersSnipped,
// TopCornersOneRoundedOneSnipped, SingleCornerRounded, TopCornersRounded, DiagonalCornersRounded
// please use DocumentBuilder.InsertShape methods.
Shape balloon = new Shape(doc, ShapeType.BALLOON);
balloon.setWidth(200.0);
balloon.setHeight(200.0);
balloon.setStrokeColor(Color.RED);
Shape cube = new Shape(doc, ShapeType.CUBE);
cube.setWidth(100.0);
cube.setHeight(100.0);
cube.setStrokeColor(Color.BLUE);
GroupShape group = new GroupShape(doc);
group.appendChild(balloon);
group.appendChild(cube);
Assert.assertTrue(group.isGroup());
builder.insertNode(group);
ShapeInfoPrinter printer = new ShapeInfoPrinter();
group.accept(printer);
System.out.println(printer.getText());
}
///
/// Prints the contents of a visited shape group to the console.
///
public static class ShapeInfoPrinter extends DocumentVisitor {
public ShapeInfoPrinter() {
mBuilder = new StringBuilder();
}
public String getText() {
return mBuilder.toString();
}
public int visitGroupShapeStart(final GroupShape groupShape) {
mBuilder.append("Shape group started:\r\n");
return VisitorAction.CONTINUE;
}
public int visitGroupShapeEnd(final GroupShape groupShape) {
mBuilder.append("End of shape group\r\n");
return VisitorAction.CONTINUE;
}
public int visitShapeStart(final Shape shape) {
mBuilder.append("\tShape - " + shape.getShapeType() + ":\r\n");
mBuilder.append("\t\tWidth: " + shape.getWidth() + "\r\n");
mBuilder.append("\t\tHeight: " + shape.getHeight() + "\r\n");
mBuilder.append("\t\tStroke color: " + shape.getStroke().getColor() + "\r\n");
mBuilder.append("\t\tFill color: " + shape.getFill().getForeColor() + "\r\n");
return VisitorAction.CONTINUE;
}
public int visitShapeEnd(final Shape shape) {
mBuilder.append("\tEnd of shape\r\n");
return VisitorAction.CONTINUE;
}
private final StringBuilder mBuilder;
}
Shows how to use a DocumentVisitor implementation to remove all hidden content from a document.
public void removeHiddenContentFromDocument() throws Exception {
Document doc = new Document(getMyDir() + "Hidden content.docx");
RemoveHiddenContentVisitor hiddenContentRemover = new RemoveHiddenContentVisitor();
// Below are three types of fields which can accept a document visitor,
// which will allow it to visit the accepting node, and then traverse its child nodes in a depth-first manner.
// 1 - Paragraph node:
Paragraph para = (Paragraph) doc.getChild(NodeType.PARAGRAPH, 4, true);
para.accept(hiddenContentRemover);
// 2 - Table node:
Table table = doc.getFirstSection().getBody().getTables().get(0);
table.accept(hiddenContentRemover);
// 3 - Document node:
doc.accept(hiddenContentRemover);
doc.save(getArtifactsDir() + "Font.RemoveHiddenContentFromDocument.docx");
}
///
/// Removes all visited nodes marked as "hidden content".
///
public static class RemoveHiddenContentVisitor extends DocumentVisitor {
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(FieldStart fieldStart) {
if (fieldStart.getFont().getHidden())
fieldStart.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(FieldEnd fieldEnd) {
if (fieldEnd.getFont().getHidden())
fieldEnd.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(FieldSeparator fieldSeparator) {
if (fieldSeparator.getFont().getHidden())
fieldSeparator.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(Run run) {
if (run.getFont().getHidden())
run.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(Paragraph paragraph) {
if (paragraph.getParagraphBreakFont().getHidden())
paragraph.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FormField is encountered in the document.
///
public int visitFormField(FormField formField) {
if (formField.getFont().getHidden())
formField.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a GroupShape is encountered in the document.
///
public int visitGroupShapeStart(GroupShape groupShape) {
if (groupShape.getFont().getHidden())
groupShape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Shape is encountered in the document.
///
public int visitShapeStart(Shape shape) {
if (shape.getFont().getHidden())
shape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment is encountered in the document.
///
public int visitCommentStart(Comment comment) {
if (comment.getFont().getHidden())
comment.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Footnote is encountered in the document.
///
public int visitFootnoteStart(Footnote footnote) {
if (footnote.getFont().getHidden())
footnote.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a SpecialCharacter is encountered in the document.
///
public int visitSpecialChar(SpecialChar specialChar) {
if (specialChar.getFont().getHidden())
specialChar.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Table node is ended in the document.
///
public int visitTableEnd(Table table) {
// The content inside table cells may have the hidden content flag, but the tables themselves cannot.
// If this table had nothing but hidden content, this visitor would have removed all of it,
// and there would be no child nodes left.
// Thus, we can also treat the table itself as hidden content and remove it.
// Tables which are empty but do not have hidden content will have cells with empty paragraphs inside,
// which this visitor will not remove.
if (!table.hasChildNodes())
table.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Cell node is ended in the document.
///
public int visitCellEnd(Cell cell) {
if (!cell.hasChildNodes() && cell.getParentNode() != null)
cell.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Row node is ended in the document.
///
public int visitRowEnd(Row row) {
if (!row.hasChildNodes() && row.getParentNode() != null)
row.remove();
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
shape | Shape | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitSmartTagEnd(SmartTag smartTag)
public int visitSmartTagEnd(SmartTag smartTag)
Called when enumeration of a smart tag has ended.
Examples:
Shows how to print the node structure of every smart tag in a document.
public void smartTagToText() throws Exception {
Document doc = new Document(getMyDir() + "Smart tags.doc");
SmartTagStructurePrinter visitor = new SmartTagStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered SmartTag nodes and their children.
///
public static class SmartTagStructurePrinter extends DocumentVisitor {
public SmartTagStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideSmartTag = false;
}
///
/// Gets the plain text of the document that was accumulated by the visitor.
///
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideSmartTag) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when a SmartTag node is encountered in the document.
///
public int visitSmartTagStart(final SmartTag smartTag) {
indentAndAppendLine("[SmartTag start] Name: " + smartTag.getElement());
mDocTraversalDepth++;
mVisitorIsInsideSmartTag = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a SmartTag node have been visited.
///
public int visitSmartTagEnd(final SmartTag smartTag) {
mDocTraversalDepth--;
indentAndAppendLine("[SmartTag end]");
mVisitorIsInsideSmartTag = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideSmartTag;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
smartTag | SmartTag | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitSmartTagStart(SmartTag smartTag)
public int visitSmartTagStart(SmartTag smartTag)
Called when enumeration of a smart tag has started.
Examples:
Shows how to print the node structure of every smart tag in a document.
public void smartTagToText() throws Exception {
Document doc = new Document(getMyDir() + "Smart tags.doc");
SmartTagStructurePrinter visitor = new SmartTagStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered SmartTag nodes and their children.
///
public static class SmartTagStructurePrinter extends DocumentVisitor {
public SmartTagStructurePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideSmartTag = false;
}
///
/// Gets the plain text of the document that was accumulated by the visitor.
///
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideSmartTag) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when a SmartTag node is encountered in the document.
///
public int visitSmartTagStart(final SmartTag smartTag) {
indentAndAppendLine("[SmartTag start] Name: " + smartTag.getElement());
mDocTraversalDepth++;
mVisitorIsInsideSmartTag = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a SmartTag node have been visited.
///
public int visitSmartTagEnd(final SmartTag smartTag) {
mDocTraversalDepth--;
indentAndAppendLine("[SmartTag end]");
mVisitorIsInsideSmartTag = false;
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private boolean mVisitorIsInsideSmartTag;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
smartTag | SmartTag | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitSpecialChar(SpecialChar specialChar)
public int visitSpecialChar(SpecialChar specialChar)
Called when a SpecialChar node is encountered in the document.
Remarks:
This method is not be called for generic control characters (see ControlChar) that can be present in the document.
Examples:
Shows how to use a DocumentVisitor implementation to remove all hidden content from a document.
public void removeHiddenContentFromDocument() throws Exception {
Document doc = new Document(getMyDir() + "Hidden content.docx");
RemoveHiddenContentVisitor hiddenContentRemover = new RemoveHiddenContentVisitor();
// Below are three types of fields which can accept a document visitor,
// which will allow it to visit the accepting node, and then traverse its child nodes in a depth-first manner.
// 1 - Paragraph node:
Paragraph para = (Paragraph) doc.getChild(NodeType.PARAGRAPH, 4, true);
para.accept(hiddenContentRemover);
// 2 - Table node:
Table table = doc.getFirstSection().getBody().getTables().get(0);
table.accept(hiddenContentRemover);
// 3 - Document node:
doc.accept(hiddenContentRemover);
doc.save(getArtifactsDir() + "Font.RemoveHiddenContentFromDocument.docx");
}
///
/// Removes all visited nodes marked as "hidden content".
///
public static class RemoveHiddenContentVisitor extends DocumentVisitor {
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(FieldStart fieldStart) {
if (fieldStart.getFont().getHidden())
fieldStart.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(FieldEnd fieldEnd) {
if (fieldEnd.getFont().getHidden())
fieldEnd.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(FieldSeparator fieldSeparator) {
if (fieldSeparator.getFont().getHidden())
fieldSeparator.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(Run run) {
if (run.getFont().getHidden())
run.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(Paragraph paragraph) {
if (paragraph.getParagraphBreakFont().getHidden())
paragraph.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FormField is encountered in the document.
///
public int visitFormField(FormField formField) {
if (formField.getFont().getHidden())
formField.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a GroupShape is encountered in the document.
///
public int visitGroupShapeStart(GroupShape groupShape) {
if (groupShape.getFont().getHidden())
groupShape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Shape is encountered in the document.
///
public int visitShapeStart(Shape shape) {
if (shape.getFont().getHidden())
shape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment is encountered in the document.
///
public int visitCommentStart(Comment comment) {
if (comment.getFont().getHidden())
comment.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Footnote is encountered in the document.
///
public int visitFootnoteStart(Footnote footnote) {
if (footnote.getFont().getHidden())
footnote.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a SpecialCharacter is encountered in the document.
///
public int visitSpecialChar(SpecialChar specialChar) {
if (specialChar.getFont().getHidden())
specialChar.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Table node is ended in the document.
///
public int visitTableEnd(Table table) {
// The content inside table cells may have the hidden content flag, but the tables themselves cannot.
// If this table had nothing but hidden content, this visitor would have removed all of it,
// and there would be no child nodes left.
// Thus, we can also treat the table itself as hidden content and remove it.
// Tables which are empty but do not have hidden content will have cells with empty paragraphs inside,
// which this visitor will not remove.
if (!table.hasChildNodes())
table.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Cell node is ended in the document.
///
public int visitCellEnd(Cell cell) {
if (!cell.hasChildNodes() && cell.getParentNode() != null)
cell.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Row node is ended in the document.
///
public int visitRowEnd(Row row) {
if (!row.hasChildNodes() && row.getParentNode() != null)
row.remove();
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
specialChar | SpecialChar | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitStructuredDocumentTagEnd(StructuredDocumentTag sdt)
public int visitStructuredDocumentTagEnd(StructuredDocumentTag sdt)
Called when enumeration of a structured document tag has ended.
Examples:
Shows how to print the node structure of every structured document tag in a document.
public void structuredDocumentTagToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
StructuredDocumentTagNodePrinter visitor = new StructuredDocumentTagNodePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered StructuredDocumentTag nodes and their children.
///
public static class StructuredDocumentTagNodePrinter extends DocumentVisitor {
public StructuredDocumentTagNodePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideStructuredDocumentTag = false;
}
///
/// Gets the plain text of the document that was accumulated by the visitor.
///
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideStructuredDocumentTag) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when a StructuredDocumentTag node is encountered in the document.
///
public int visitStructuredDocumentTagStart(final StructuredDocumentTag sdt) {
indentAndAppendLine("[StructuredDocumentTag start] Title: " + sdt.getTitle());
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a StructuredDocumentTag node have been visited.
///
public int visitStructuredDocumentTagEnd(final StructuredDocumentTag sdt) {
mDocTraversalDepth--;
indentAndAppendLine("[StructuredDocumentTag end]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private final boolean mVisitorIsInsideStructuredDocumentTag;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
sdt | StructuredDocumentTag | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
Called when a StructuredDocumentTagRangeEnd is encountered.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Parameters:
Parameter | Type | Description |
---|---|---|
sdtRangeEnd | StructuredDocumentTagRangeEnd |
Returns: int
visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
Called when a StructuredDocumentTagRangeStart is encountered.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Parameters:
Parameter | Type | Description |
---|---|---|
sdtRangeStart | StructuredDocumentTagRangeStart |
Returns: int
visitStructuredDocumentTagStart(StructuredDocumentTag sdt)
public int visitStructuredDocumentTagStart(StructuredDocumentTag sdt)
Called when enumeration of a structured document tag has started.
Examples:
Shows how to print the node structure of every structured document tag in a document.
public void structuredDocumentTagToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
StructuredDocumentTagNodePrinter visitor = new StructuredDocumentTagNodePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered StructuredDocumentTag nodes and their children.
///
public static class StructuredDocumentTagNodePrinter extends DocumentVisitor {
public StructuredDocumentTagNodePrinter() {
mBuilder = new StringBuilder();
mVisitorIsInsideStructuredDocumentTag = false;
}
///
/// Gets the plain text of the document that was accumulated by the visitor.
///
public String getText() {
return mBuilder.toString();
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
if (mVisitorIsInsideStructuredDocumentTag) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
}
return VisitorAction.CONTINUE;
}
///
/// Called when a StructuredDocumentTag node is encountered in the document.
///
public int visitStructuredDocumentTagStart(final StructuredDocumentTag sdt) {
indentAndAppendLine("[StructuredDocumentTag start] Title: " + sdt.getTitle());
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a StructuredDocumentTag node have been visited.
///
public int visitStructuredDocumentTagEnd(final StructuredDocumentTag sdt) {
mDocTraversalDepth--;
indentAndAppendLine("[StructuredDocumentTag end]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mBuilder.append("| ");
}
mBuilder.append(text + "\r\n");
}
private final boolean mVisitorIsInsideStructuredDocumentTag;
private int mDocTraversalDepth;
private final StringBuilder mBuilder;
}
Parameters:
Parameter | Type | Description |
---|---|---|
sdt | StructuredDocumentTag | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitSubDocument(SubDocument subDocument)
public int visitSubDocument(SubDocument subDocument)
Called when a sub-document is encountered.
Examples:
Shows how to use a document visitor to print a document’s node structure.
public void docStructureToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
DocStructurePrinter visitor = new DocStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's tree of child nodes.
/// Creates a map of this tree in the form of a string.
///
public static class DocStructurePrinter extends DocumentVisitor {
public DocStructurePrinter() {
mAcceptingNodeChildTree = new StringBuilder();
}
public String getText() {
return mAcceptingNodeChildTree.toString();
}
///
/// Called when a Document node is encountered.
///
public int visitDocumentStart(Document doc) {
int childNodeCount = doc.getChildNodes(NodeType.ANY, true).getCount();
indentAndAppendLine("[Document start] Child nodes: " + childNodeCount);
mDocTraversalDepth++;
// Allow the visitor to continue visiting other nodes.
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Document node have been visited.
///
public int visitDocumentEnd(Document doc) {
mDocTraversalDepth--;
indentAndAppendLine("[Document end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Section node is encountered in the document.
///
public int visitSectionStart(final Section section) {
// Get the index of our section within the document
NodeCollection docSections = section.getDocument().getChildNodes(NodeType.SECTION, false);
int sectionIndex = docSections.indexOf(section);
indentAndAppendLine("[Section start] Section index: " + sectionIndex);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Section node have been visited.
///
public int visitSectionEnd(final Section section) {
mDocTraversalDepth--;
indentAndAppendLine("[Section end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Body node is encountered in the document.
///
public int visitBodyStart(final Body body) {
int paragraphCount = body.getParagraphs().getCount();
indentAndAppendLine("[Body start] Paragraphs: " + paragraphCount);
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Body node have been visited.
///
public int visitBodyEnd(final Body body) {
mDocTraversalDepth--;
indentAndAppendLine("[Body end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(final Paragraph paragraph) {
indentAndAppendLine("[Paragraph start]");
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Paragraph node have been visited.
///
public int visitParagraphEnd(final Paragraph paragraph) {
mDocTraversalDepth--;
indentAndAppendLine("[Paragraph end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(final Run run) {
indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitSubDocument(final SubDocument subDocument) {
indentAndAppendLine("[SubDocument]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeStart(StructuredDocumentTagRangeStart sdtRangeStart)
{
indentAndAppendLine("[SdtRangeStart]");
return VisitorAction.CONTINUE;
}
///
/// Called when a SubDocument node is encountered in the document.
///
public int visitStructuredDocumentTagRangeEnd(StructuredDocumentTagRangeEnd sdtRangeEnd)
{
indentAndAppendLine("[SdtRangeEnd]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder and indent it depending on how deep the visitor is into the document tree.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mAcceptingNodeChildTree.append("| ");
}
mAcceptingNodeChildTree.append(text + "\r\n");
}
private int mDocTraversalDepth;
private final StringBuilder mAcceptingNodeChildTree;
}
Parameters:
Parameter | Type | Description |
---|---|---|
subDocument | SubDocument | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitTableEnd(Table table)
public int visitTableEnd(Table table)
Called when enumeration of a table has ended.
Examples:
Shows how to print the node structure of every table in a document.
public void tableToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
TableStructurePrinter visitor = new TableStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Table nodes and their children.
///
public static class TableStructurePrinter extends DocumentVisitor {
public TableStructurePrinter() {
mVisitedTables = new StringBuilder();
mVisitorIsInsideTable = false;
}
public String getText() {
return mVisitedTables.toString();
}
///
/// Called when a Run node is encountered in the document.
/// Runs that are not within tables are not recorded.
///
public int visitRun(Run run) {
if (mVisitorIsInsideTable) indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a Table is encountered in the document.
///
public int visitTableStart(final Table table) {
int rows = 0;
int columns = 0;
if (table.getRows().getCount() > 0) {
rows = table.getRows().getCount();
columns = table.getFirstRow().getCount();
}
indentAndAppendLine("[Table start] Size: " + rows + "x" + columns);
mDocTraversalDepth++;
mVisitorIsInsideTable = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Table node have been visited.
///
public int visitTableEnd(final Table table) {
mDocTraversalDepth--;
indentAndAppendLine("[Table end]");
mVisitorIsInsideTable = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Row node is encountered in the document.
///
public int visitRowStart(final Row row) {
String rowContents = row.getText().replaceAll("\\u0007", ", ").replaceAll(", , ", "");
int rowWidth = row.indexOf(row.getLastCell()) + 1;
int rowIndex = row.getParentTable().indexOf(row);
String rowStatusInTable = row.isFirstRow() && row.isLastRow() ? "only" : row.isFirstRow() ? "first" : row.isLastRow() ? "last" : "";
if (!"".equals(rowStatusInTable)) {
rowStatusInTable = MessageFormat.format(", the {0} row in this table,", rowStatusInTable);
}
indentAndAppendLine(MessageFormat.format("[Row start] Row #{0}{1} width {2}, \"{3}\"", ++rowIndex, rowStatusInTable, rowWidth, rowContents));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Row node have been visited.
///
public int visitRowEnd(final Row row) {
mDocTraversalDepth--;
indentAndAppendLine("[Row end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Cell node is encountered in the document.
///
public int visitCellStart(final Cell cell) {
Row row = cell.getParentRow();
Table table = row.getParentTable();
String cellStatusInRow = cell.isFirstCell() && cell.isLastCell() ? "only" : cell.isFirstCell() ? "first" : cell.isLastCell() ? "last" : "";
if (!"".equals(cellStatusInRow)) {
cellStatusInRow = MessageFormat.format(", the {0} cell in this row", cellStatusInRow);
}
indentAndAppendLine(MessageFormat.format("[Cell start] Row {0}, Col {1}{2}", table.indexOf(row) + 1, row.indexOf(cell) + 1, cellStatusInRow));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Cell node have been visited.
///
public int visitCellEnd(final Cell cell) {
mDocTraversalDepth--;
indentAndAppendLine("[Cell end]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into the current table's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mVisitedTables.append("| ");
}
mVisitedTables.append(text + "\r\n");
}
private boolean mVisitorIsInsideTable;
private int mDocTraversalDepth;
private final StringBuilder mVisitedTables;
}
Shows how to use a DocumentVisitor implementation to remove all hidden content from a document.
public void removeHiddenContentFromDocument() throws Exception {
Document doc = new Document(getMyDir() + "Hidden content.docx");
RemoveHiddenContentVisitor hiddenContentRemover = new RemoveHiddenContentVisitor();
// Below are three types of fields which can accept a document visitor,
// which will allow it to visit the accepting node, and then traverse its child nodes in a depth-first manner.
// 1 - Paragraph node:
Paragraph para = (Paragraph) doc.getChild(NodeType.PARAGRAPH, 4, true);
para.accept(hiddenContentRemover);
// 2 - Table node:
Table table = doc.getFirstSection().getBody().getTables().get(0);
table.accept(hiddenContentRemover);
// 3 - Document node:
doc.accept(hiddenContentRemover);
doc.save(getArtifactsDir() + "Font.RemoveHiddenContentFromDocument.docx");
}
///
/// Removes all visited nodes marked as "hidden content".
///
public static class RemoveHiddenContentVisitor extends DocumentVisitor {
///
/// Called when a FieldStart node is encountered in the document.
///
public int visitFieldStart(FieldStart fieldStart) {
if (fieldStart.getFont().getHidden())
fieldStart.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldEnd node is encountered in the document.
///
public int visitFieldEnd(FieldEnd fieldEnd) {
if (fieldEnd.getFont().getHidden())
fieldEnd.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FieldSeparator node is encountered in the document.
///
public int visitFieldSeparator(FieldSeparator fieldSeparator) {
if (fieldSeparator.getFont().getHidden())
fieldSeparator.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Run node is encountered in the document.
///
public int visitRun(Run run) {
if (run.getFont().getHidden())
run.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Paragraph node is encountered in the document.
///
public int visitParagraphStart(Paragraph paragraph) {
if (paragraph.getParagraphBreakFont().getHidden())
paragraph.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a FormField is encountered in the document.
///
public int visitFormField(FormField formField) {
if (formField.getFont().getHidden())
formField.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a GroupShape is encountered in the document.
///
public int visitGroupShapeStart(GroupShape groupShape) {
if (groupShape.getFont().getHidden())
groupShape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Shape is encountered in the document.
///
public int visitShapeStart(Shape shape) {
if (shape.getFont().getHidden())
shape.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Comment is encountered in the document.
///
public int visitCommentStart(Comment comment) {
if (comment.getFont().getHidden())
comment.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a Footnote is encountered in the document.
///
public int visitFootnoteStart(Footnote footnote) {
if (footnote.getFont().getHidden())
footnote.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when a SpecialCharacter is encountered in the document.
///
public int visitSpecialChar(SpecialChar specialChar) {
if (specialChar.getFont().getHidden())
specialChar.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Table node is ended in the document.
///
public int visitTableEnd(Table table) {
// The content inside table cells may have the hidden content flag, but the tables themselves cannot.
// If this table had nothing but hidden content, this visitor would have removed all of it,
// and there would be no child nodes left.
// Thus, we can also treat the table itself as hidden content and remove it.
// Tables which are empty but do not have hidden content will have cells with empty paragraphs inside,
// which this visitor will not remove.
if (!table.hasChildNodes())
table.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Cell node is ended in the document.
///
public int visitCellEnd(Cell cell) {
if (!cell.hasChildNodes() && cell.getParentNode() != null)
cell.remove();
return VisitorAction.CONTINUE;
}
///
/// Called when visiting of a Row node is ended in the document.
///
public int visitRowEnd(Row row) {
if (!row.hasChildNodes() && row.getParentNode() != null)
row.remove();
return VisitorAction.CONTINUE;
}
}
Parameters:
Parameter | Type | Description |
---|---|---|
table | Table | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.
visitTableStart(Table table)
public int visitTableStart(Table table)
Called when enumeration of a table has started.
Examples:
Shows how to print the node structure of every table in a document.
public void tableToText() throws Exception {
Document doc = new Document(getMyDir() + "DocumentVisitor-compatible features.docx");
TableStructurePrinter visitor = new TableStructurePrinter();
// When we get a composite node to accept a document visitor, the visitor visits the accepting node,
// and then traverses all the node's children in a depth-first manner.
// The visitor can read and modify each visited node.
doc.accept(visitor);
System.out.println(visitor.getText());
}
///
/// Traverses a node's non-binary tree of child nodes.
/// Creates a map in the form of a string of all encountered Table nodes and their children.
///
public static class TableStructurePrinter extends DocumentVisitor {
public TableStructurePrinter() {
mVisitedTables = new StringBuilder();
mVisitorIsInsideTable = false;
}
public String getText() {
return mVisitedTables.toString();
}
///
/// Called when a Run node is encountered in the document.
/// Runs that are not within tables are not recorded.
///
public int visitRun(Run run) {
if (mVisitorIsInsideTable) indentAndAppendLine("[Run] \"" + run.getText() + "\"");
return VisitorAction.CONTINUE;
}
///
/// Called when a Table is encountered in the document.
///
public int visitTableStart(final Table table) {
int rows = 0;
int columns = 0;
if (table.getRows().getCount() > 0) {
rows = table.getRows().getCount();
columns = table.getFirstRow().getCount();
}
indentAndAppendLine("[Table start] Size: " + rows + "x" + columns);
mDocTraversalDepth++;
mVisitorIsInsideTable = true;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Table node have been visited.
///
public int visitTableEnd(final Table table) {
mDocTraversalDepth--;
indentAndAppendLine("[Table end]");
mVisitorIsInsideTable = false;
return VisitorAction.CONTINUE;
}
///
/// Called when a Row node is encountered in the document.
///
public int visitRowStart(final Row row) {
String rowContents = row.getText().replaceAll("\\u0007", ", ").replaceAll(", , ", "");
int rowWidth = row.indexOf(row.getLastCell()) + 1;
int rowIndex = row.getParentTable().indexOf(row);
String rowStatusInTable = row.isFirstRow() && row.isLastRow() ? "only" : row.isFirstRow() ? "first" : row.isLastRow() ? "last" : "";
if (!"".equals(rowStatusInTable)) {
rowStatusInTable = MessageFormat.format(", the {0} row in this table,", rowStatusInTable);
}
indentAndAppendLine(MessageFormat.format("[Row start] Row #{0}{1} width {2}, \"{3}\"", ++rowIndex, rowStatusInTable, rowWidth, rowContents));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Row node have been visited.
///
public int visitRowEnd(final Row row) {
mDocTraversalDepth--;
indentAndAppendLine("[Row end]");
return VisitorAction.CONTINUE;
}
///
/// Called when a Cell node is encountered in the document.
///
public int visitCellStart(final Cell cell) {
Row row = cell.getParentRow();
Table table = row.getParentTable();
String cellStatusInRow = cell.isFirstCell() && cell.isLastCell() ? "only" : cell.isFirstCell() ? "first" : cell.isLastCell() ? "last" : "";
if (!"".equals(cellStatusInRow)) {
cellStatusInRow = MessageFormat.format(", the {0} cell in this row", cellStatusInRow);
}
indentAndAppendLine(MessageFormat.format("[Cell start] Row {0}, Col {1}{2}", table.indexOf(row) + 1, row.indexOf(cell) + 1, cellStatusInRow));
mDocTraversalDepth++;
return VisitorAction.CONTINUE;
}
///
/// Called after all the child nodes of a Cell node have been visited.
///
public int visitCellEnd(final Cell cell) {
mDocTraversalDepth--;
indentAndAppendLine("[Cell end]");
return VisitorAction.CONTINUE;
}
///
/// Append a line to the StringBuilder, and indent it depending on how deep the visitor is
/// into the current table's tree of child nodes.
///
///
private void indentAndAppendLine(final String text) {
for (int i = 0; i < mDocTraversalDepth; i++) {
mVisitedTables.append("| ");
}
mVisitedTables.append(text + "\r\n");
}
private boolean mVisitorIsInsideTable;
private int mDocTraversalDepth;
private final StringBuilder mVisitedTables;
}
Parameters:
Parameter | Type | Description |
---|---|---|
table | Table | The object that is being visited. |
Returns: int - A VisitorAction value that specifies how to continue the enumeration. The returned value is one of VisitorAction constants.