Monday, 20 December 2021

Count the Number of Words in Word Documents Using Java

Microsoft Word can count the number of words, pages, paragraphs, lines and characters (with or without spaces) in a Word document. It can also count the words in footnotes and endnotes. This article shows how to count the number of words and characters (with or without spaces) in a Word document using Java code.

DEPENDENCY

To do this programmatically in Java, download the Free Spire.Doc for Java package and add Spire.Doc.jar to your Java project as a dependency. If you use Maven, you can instead add the following configuration to your project’s pom.xml file to import the JAR file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc.free</artifactId>
        <version>3.9.0</version>
    </dependency>
</dependencies>

USING THE CODE

Free Spire.Doc for Java can report the number of words and the number of characters, with or without spaces, through methods provided by the SummaryDocumentProperties class. The detailed steps are as follows.

•  Create a Document instance.

•  Load a sample Word document using the Document.loadFromFile() method.

•  Get the document’s built-in properties using the Document.getBuiltinDocumentProperties() method.

•  Count the number of words using the SummaryDocumentProperties.getWordCount() method.

•  Count the number of characters without and with spaces using the getCharCount() and getCharCountWithSpace() methods of the SummaryDocumentProperties class.

•  Output the result.

import com.spire.doc.*;

public class CountWords {
    public static void main(String[] args) {
        // Create a Document instance
        Document document = new Document();

        // Load a sample Word document
        document.loadFromFile("C:\\Users\\Test1\\Desktop\\sample.docx");

        // Count the number of words
        System.out.println("WordCount: " + document.getBuiltinDocumentProperties().getWordCount());

        // Count the number of characters without spaces
        System.out.println("CharCount: " + document.getBuiltinDocumentProperties().getCharCount());

        // Count the number of characters with spaces
        System.out.println("CharCountWithSpace: " + document.getBuiltinDocumentProperties().getCharCountWithSpace());
    }
}
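
To make the three figures concrete, here is a minimal plain-Java sketch of what "words", "characters without spaces" and "characters with spaces" could mean for a piece of text. It does not use Spire.Doc, and it is not how Word or Spire.Doc computes its counts. The class TextCountSketch and its method names are hypothetical, and the rules it applies (words are whitespace-separated tokens, characters are UTF-16 code units) are simplifying assumptions, so its results may differ from the values reported for a real document.

```java
// Illustrative approximation only; not the algorithm used by Word or Spire.Doc.
final class TextCountSketch {

    // Counts whitespace-separated tokens (assumed definition of a "word").
    static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Counts every character that is not whitespace.
    static int countCharsWithoutSpaces(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (!Character.isWhitespace(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    // Counts all characters (UTF-16 code units), whitespace included.
    static int countCharsWithSpaces(String text) {
        return text.length();
    }

    public static void main(String[] args) {
        String text = "Count the words in this sentence.";
        System.out.println("WordCount: " + countWords(text));
        System.out.println("CharCount: " + countCharsWithoutSpaces(text));
        System.out.println("CharCountWithSpace: " + countCharsWithSpaces(text));
    }
}
```

A sketch like this can serve as a rough sanity check on plain text, but for a Word document the built-in properties shown above remain the source of the reported counts.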


