Tuesday, 15 February 2022

Extract Word Paragraphs That Use Specific Styles

When working with Word documents, we often apply specific styles to certain paragraphs. For example, we can set “Heading 1” for the title of the document. This makes it possible to extract the text of a paragraph later based on the style it uses. This article shows how to do that in Java.

DEPENDENCY

First, download the Free Spire.Doc for Java package from this link, and then manually add Spire.Doc.jar to your Java project. Alternatively, if you use Maven, you can import the JAR file by adding the following configuration to your pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc.free</artifactId>
        <version>5.1.0</version>
    </dependency>
</dependencies>

USING THE CODE

Free Spire.Doc for Java offers the Paragraph.getStyleName() method, which returns the style name of a paragraph so that you can check whether it matches a specific style. If it does, you can extract the paragraph's text using the Paragraph.getText() method. The detailed steps are as follows.

•  Initialize a Document object and load a sample Word document.

•  Loop through all sections of the document.

•  Within each section, loop through its paragraphs and get each one using the Section.getParagraphs().get() method.

•  Get the paragraph's style name using the Paragraph.getStyleName() method and determine whether the style is "Heading 1".

•  If it is, extract the text of the paragraph using the Paragraph.getText() method.

import com.spire.doc.Document;
import com.spire.doc.documents.Paragraph;

public class GetParagraphWithStyle {
    public static void main(String[] args) {
        //Load a sample Word document while initializing the Document object
        Document doc = new Document("C:\\Users\\Test1\\Desktop\\test.docx");

        //Loop through the sections
        for (int i = 0; i < doc.getSections().getCount(); i++) {

            //Loop through the paragraphs of the current section
            for (int j = 0; j < doc.getSections().get(i).getParagraphs().getCount(); j++) {

                //Get the current paragraph
                Paragraph paragraph = doc.getSections().get(i).getParagraphs().get(j);

                //Determine if the paragraph style is "Heading 1" (compared against the style name "Heading1").
                //Calling equals() on the literal avoids a NullPointerException if no style name is returned.
                if ("Heading1".equals(paragraph.getStyleName())) {

                    //Print the text of the paragraph that uses "Heading 1"
                    System.out.println("Heading 1: " + paragraph.getText() + "\n");
                }
            }
        }
    }
}
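
The filtering idea can also be tried out without the library on the classpath. The following is only a minimal sketch under stated assumptions: StyleFilter and StyledParagraph are hypothetical names that are not part of Free Spire.Doc for Java, and StyledParagraph simply holds the two values that Paragraph.getStyleName() and Paragraph.getText() would supply. It collects the matching text into a list instead of printing it directly, and it tolerates a missing style name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spire.Doc: filters paragraphs by style name.
final class StyleFilter {

    // Hypothetical stand-in for a paragraph: holds only the style name and the text.
    static final class StyledParagraph {
        final String styleName; // may be null if no style name is available
        final String text;

        StyledParagraph(String styleName, String text) {
            this.styleName = styleName;
            this.text = text;
        }
    }

    // Returns the text of every paragraph whose style name equals targetStyle, in document order.
    static List<String> textsWithStyle(List<StyledParagraph> paragraphs, String targetStyle) {
        List<String> matches = new ArrayList<>();
        for (StyledParagraph p : paragraphs) {
            // equals() is called on the target so a null style name is simply skipped
            if (targetStyle.equals(p.styleName)) {
                matches.add(p.text);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample data made up for illustration
        List<StyledParagraph> paragraphs = Arrays.asList(
                new StyledParagraph("Heading1", "Introduction"),
                new StyledParagraph("Normal", "Some body text."),
                new StyledParagraph(null, "A paragraph without a style name."),
                new StyledParagraph("Heading1", "Conclusion"));

        for (String heading : textsWithStyle(paragraphs, "Heading1")) {
            System.out.println("Heading 1: " + heading);
        }
    }
}
```

With the made-up sample data above, the program prints the two lines "Heading 1: Introduction" and "Heading 1: Conclusion". In a real program, the list would presumably be filled inside the nested section and paragraph loops rather than from hard-coded values.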




 

