Comparing Documents for Differences

Introduction

Ever wondered how to spot every single difference between two Word documents? Maybe you’re revising a document or trying to find changes made by a collaborator. Manual comparisons can be tedious and error-prone, but with Aspose.Words for Java, it’s a breeze! This library enables you to automate document comparison, highlight revisions, and merge changes effortlessly.

Prerequisites

Before jumping into the code, ensure you have the following ready:

  1. Java Development Kit (JDK) installed on your system.
  2. Aspose.Words for Java library. You can download it here.
  3. A development environment like IntelliJ IDEA or Eclipse.
  4. Basic familiarity with Java programming.
  5. A valid Aspose license. If you don’t have one, get a temporary license here.

Import Packages

To use Aspose.Words, you need to import the necessary classes. Below are the required imports:

import com.aspose.words.*;
import java.util.Date;

Make sure these packages are correctly added to your project dependencies.

In this section, we’ll break down the process into simple steps.

Step 1: Set Up Your Documents

To start, you need two documents: one representing the original and the other representing the edited version. Here’s how you create them:

Document doc1 = new Document();
DocumentBuilder builder = new DocumentBuilder(doc1);
builder.writeln("This is the original document.");

Document doc2 = new Document();
builder = new DocumentBuilder(doc2);
builder.writeln("This is the edited document.");

This creates two documents in memory with basic content. You can also load existing Word documents using new Document("path/to/document.docx").

Step 2: Check for Existing Revisions

Revisions in Word documents represent tracked changes. Before comparing, ensure neither document contains pre-existing revisions:

if (doc1.getRevisions().getCount() == 0 && doc2.getRevisions().getCount() == 0) {
    System.out.println("No revisions found. Proceeding with comparison...");
}

If revisions exist, you might want to accept or reject them before proceeding.

Step 3: Compare the Documents

Use the compare method to find differences. This method compares the target document (doc2) with the source document (doc1):

doc1.compare(doc2, "AuthorName", new Date());

Here:

  • AuthorName is the name of the person making the changes.
  • Date is the comparison timestamp.

Step 4: Process Revisions

Once compared, Aspose.Words will generate revisions in the source document (doc1). Let’s analyze these revisions:

for (Revision r : doc1.getRevisions()) {
    System.out.println("Revision type: " + r.getRevisionType());
    System.out.println("Node type: " + r.getParentNode().getNodeType());
    System.out.println("Changed text: " + r.getParentNode().getText());
}

This loop provides detailed information about each revision, such as the type of change and the affected text.

Step 5: Accept All Revisions

If you want the source document (doc1) to match the target document (doc2), accept all revisions:

doc1.getRevisions().acceptAll();

This updates doc1 to reflect all the changes made in doc2.

Step 6: Save the Updated Document

Finally, save the updated document to disk:

doc1.save("Document.Compare.docx");

To confirm the changes, reload the document and verify there are no remaining revisions:

doc1 = new Document("Document.Compare.docx");
if (doc1.getRevisions().getCount() == 0) {
    System.out.println("Documents are now identical.");
}

Step 7: Verify Document Equality

To ensure the documents are identical, compare their text:

if (doc1.getText().trim().equals(doc2.getText().trim())) {
    System.out.println("Documents are equal.");
}

If the texts match, congratulations—you’ve successfully compared and synchronized the documents!

Conclusion

Document comparison is no longer a chore, thanks to Aspose.Words for Java. With just a few lines of code, you can pinpoint differences, process revisions, and ensure document consistency. Whether you’re managing a collaborative writing project or auditing legal documents, this feature is a game-changer.

FAQ’s

Can I compare documents with images and tables?

Yes, Aspose.Words supports comparing complex documents, including those with images, tables, and formatting.

Do I need a license to use this feature?

Yes, a license is required for full functionality. Get a temporary license here.

What happens if there are pre-existing revisions?

You must accept or reject them before comparing documents to avoid conflicts.

Can I highlight the revisions in the document?

Yes, Aspose.Words allows you to customize how revisions are displayed, such as highlighting changes.

Is this feature available in other programming languages?

Yes, Aspose.Words supports multiple languages, including .NET and Python.