Wednesday, 24 June 2020

Merge PDF Documents in Java

To manage and store documents more easily, you often need to combine multiple PDF documents into one. Here I’ll introduce two ways to merge PDF documents using Free Spire.PDF for Java.

Before you begin, download Free Spire.PDF for Java, unzip it, and import Spire.Pdf.jar from the “lib” folder into your project as a dependency. If you use Maven, add the following to your project’s pom.xml file instead.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>http://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf.free</artifactId>
        <version>2.6.3</version>
    </dependency>
</dependencies>

Using the code

Method 1

Load three PDF documents, then use the first PdfDocument as the target: append the second file to it with appendPage and import pages from the third file with insertPage.

import com.spire.pdf.PdfDocument;

public class MergePDF1 {
    public static void main(String[] args) {
        //Load PDF files
        String[] files = new String[]
                {
                        "C:\\Users\\Test1\\Desktop\\Sample01.pdf",
                        "C:\\Users\\Test1\\Desktop\\Sample02.pdf",
                        "C:\\Users\\Test1\\Desktop\\Sample03.pdf",
                };

        //Open pdf documents
        PdfDocument[] docs = new PdfDocument[files.length];
        for (int i = 0; i < files.length; i++) {
            docs[i] = new PdfDocument();
            docs[i].loadFromFile(files[i]);
        }

        //Append the whole second document to the first
        docs[0].appendPage(docs[1]);

        //Import pages from the third document.
        //Note: the step of 2 takes every other page (indexes 0, 2, 4, ...); use i++ to import all of them.
        for (int i = 0; i < docs[2].getPages().getCount(); i = i + 2) {
            docs[0].insertPage(docs[2], i);
        }

        //Save pdf file
        docs[0].saveToFile("output/MergeDocument.pdf");

        //Close every loaded document
        for (PdfDocument d : docs) {
            d.close();
        }
    }
}
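
Method 1 hard-codes its three input paths. If the files to merge sit in one folder, the array could be built at run time instead. The following is only a sketch using the standard java.nio.file API; the class name PdfFileLister and the method listPdfFiles are hypothetical names chosen for illustration and are not part of Spire.PDF.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.stream.Stream;

public class PdfFileLister {

    // Hypothetical helper: returns the paths of all regular files ending in
    // ".pdf" (case-insensitive) directly inside dir, sorted by path.
    public static String[] listPdfFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString()
                            .toLowerCase(Locale.ROOT).endsWith(".pdf"))
                    .sorted()
                    .map(Path::toString)
                    .toArray(String[]::new);
        }
    }

    public static void main(String[] args) throws IOException {
        // The folder is an assumption: pass your own as the first argument
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (String file : listPdfFiles(dir)) {
            System.out.println(file);
        }
    }
}
```

The returned array could take the place of the files array in MergePDF1. Sorting keeps the merge order predictable, since a directory listing has no guaranteed order.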

Method 2

Open the three PDF documents as streams, then call the mergeFiles(streams) method to merge them into one PDF document.

import com.spire.pdf.*;
import java.io.*;

public class MergePDF2 {
    public static void main(String[] args) throws FileNotFoundException {
        //Load PDF files
        FileInputStream stream1 = new FileInputStream(new File("C:\\Users\\Test1\\Desktop\\Sample01.pdf"));
        FileInputStream stream2 = new FileInputStream(new File("C:\\Users\\Test1\\Desktop\\Sample02.pdf"));
        FileInputStream stream3 = new FileInputStream(new File("C:\\Users\\Test1\\Desktop\\Sample03.pdf"));

        //Input PDF files by stream
        InputStream[] streams = new FileInputStream[]{stream1, stream2, stream3};

        //Merge files by stream
        PdfDocumentBase doc = PdfDocument.mergeFiles(streams);

        //Save the file
        doc.save("output/mergeFilesByStream.pdf");
        doc.close();
    }
}

Output


Tuesday, 16 June 2020

Extract Text and Images from Word in Java

Picking out text and images and saving them manually can be a long and frustrating process, especially in a large file with lots of pages. Fortunately, there’s a method that makes the process quite simple. Follow the tutorial below in order to extract text and images from a Word document in an easy way.

Before getting started, please download the Free Spire.Doc for Java package, unzip it, and then import Spire.Doc.jar from the lib folder into your application.

Extract Text

import com.spire.doc.Document;
import java.io.FileWriter;
import java.io.IOException;

public class ExtractText {
    public static void main(String[] args) throws IOException {
        //load Word document
        Document document = new Document();
        document.loadFromFile("C:\\Users\\Test1\\Desktop\\Sample.docx");

        //get text from document as string
        String text = document.getText();

        //write string to a .txt file
        writeStringToTxt(text, "output/ExtractedText.txt");
    }

    public static void writeStringToTxt(String content, String txtFileName) throws IOException {
        //the second argument (true) opens the file in append mode
        FileWriter fWriter = new FileWriter(txtFileName, true);
        try {
            fWriter.write(content);
        } catch (IOException ex) {
            ex.printStackTrace();
        } finally {
            try {
                fWriter.flush();
                fWriter.close();
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }
}
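
The writeStringToTxt helper above closes the writer by hand in a finally block. As a sketch of an alternative, the same step could use try-with-resources from the standard library. Note the assumptions, which differ from the original helper: this version overwrites the file instead of appending, writes UTF-8 explicitly, creates the parent folder if it is missing, and lets an IOException propagate to the caller. The class name TextFileWriter is a hypothetical name for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TextFileWriter {

    // Writes content to txtFileName as UTF-8, replacing any existing file.
    public static void writeStringToTxt(String content, String txtFileName) throws IOException {
        Path target = Paths.get(txtFileName);

        // Create the parent folder (for example "output") if it does not exist yet
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }

        // try-with-resources flushes and closes the writer even if write() throws
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(content);
        }
    }

    public static void main(String[] args) throws IOException {
        writeStringToTxt("Sample text", "output/ExtractedText.txt");
    }
}
```

Overwriting avoids piling up duplicate text when the program is run more than once. If appending is what you want, keep the original FileWriter(txtFileName, true) form.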

Output

Extract Images

import com.spire.doc.Document;
import com.spire.doc.documents.DocumentObjectType;
import com.spire.doc.fields.DocPicture;
import com.spire.doc.interfaces.ICompositeObject;
import com.spire.doc.interfaces.IDocumentObject;
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

public class ExtractImage {
    public static void main(String[] args) throws IOException {
        //load word document
        Document document = new Document();
        document.loadFromFile("C:\\Users\\Test1\\Desktop\\Sample.docx");

        //create a Queue object that holds the objects still to be visited
        Queue<ICompositeObject> nodes = new LinkedList<>();
        nodes.add(document);

        //create a List object for the extracted images
        List<BufferedImage> images = new ArrayList<>();

        //loop through the child objects of the document
        while (nodes.size() > 0) {
            ICompositeObject node = nodes.poll();
            for (int i = 0; i < node.getChildObjects().getCount(); i++) {
                IDocumentObject child = node.getChildObjects().get(i);
                if (child instanceof ICompositeObject) {
                    nodes.add((ICompositeObject) child);

                    //get each image and add it to the list
                    if (child.getDocumentObjectType() == DocumentObjectType.Picture) {
                        DocPicture picture = (DocPicture) child;
                        images.add(picture.getImage());
                    }
                }
            }
        }

        //save images as .png files; %d puts the index in the name so files do not overwrite each other
        for (int i = 0; i < images.size(); i++) {
            File file = new File(String.format("output/ExtractedImage-%d.png", i));
            ImageIO.write(images.get(i), "PNG", file);
        }
    }
}
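
The loop in ExtractImage is a breadth-first traversal: a queue holds the objects still to be visited, and each visited object adds its children to the end of the queue. The sketch below shows the same pattern on a toy tree built only with the standard library. The Node class and the findAll method are hypothetical stand-ins for illustration and are not Spire.Doc types.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class BreadthFirstDemo {

    // Hypothetical stand-in for a document object that has a type and children
    static class Node {
        final String type;
        final List<Node> children = new ArrayList<>();

        Node(String type, Node... kids) {
            this.type = type;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Collects every descendant of root whose type equals the given type
    public static List<Node> findAll(Node root, String type) {
        List<Node> matches = new ArrayList<>();
        Queue<Node> queue = new ArrayDeque<>();
        queue.add(root);

        while (!queue.isEmpty()) {
            Node node = queue.poll();
            for (Node child : node.children) {
                if (child.type.equals(type)) {
                    matches.add(child);
                }
                // Queue the child so that its own children are visited later
                queue.add(child);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Node document = new Node("Document",
                new Node("Section",
                        new Node("Paragraph", new Node("Picture")),
                        new Node("Paragraph", new Node("Text"), new Node("Picture"))));
        System.out.println(findAll(document, "Picture").size());
    }
}
```

Because the queue is processed level by level, nested pictures are found no matter how deep they sit in the tree.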
Output


Monday, 8 June 2020

Delete Blank Rows and Columns in Excel in Java

Blank rows and columns can make the data in an Excel workbook harder to read and edit, and deleting them by hand is time-consuming and error-prone. Here I’ll introduce a time-saving method that deletes blank rows and columns programmatically.

Add Dependencies

Before beginning, you need to add required dependencies into your Java project. There are two ways to do that.

Method 1: Download the Free Spire.XLS for Java package, and then import Spire.Xls.jar located in the “lib” folder into your project as a dependency.

Method 2: If you are using Maven, add the following code to your project’s pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>http://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.xls.free</artifactId>
        <version>2.2.0</version>
    </dependency>
</dependencies>

Sample Document

Using the code

import com.spire.xls.ExcelVersion;
import com.spire.xls.Workbook;
import com.spire.xls.Worksheet;

public class DeleteBlankRowsAndColumns {
    public static void main(String[] args) {
        //Load the sample document
        Workbook wb = new Workbook();
        wb.loadFromFile("C:\\Users\\Test1\\Desktop\\Sample.xlsx");

        //Get the third worksheet
        Worksheet sheet = wb.getWorksheets().get(2);

        //Loop through the rows, from the last row up to the first
        for (int i = sheet.getLastRow(); i >= 1; i--) {
            //Detect if a row is blank
            if (sheet.getRows()[i - 1].isBlank()) {
                //Remove blank row
                sheet.deleteRow(i);
            }
        }

        //Loop through the columns, from the last column back to the first
        for (int j = sheet.getLastColumn(); j >= 1; j--) {
            //Detect if a column is blank
            if (sheet.getColumns()[j - 1].isBlank()) {
                //Remove blank column
                sheet.deleteColumn(j);
            }
        }

        //Save the document
        wb.saveToFile("output/DeleteBlankRowsAndColumns.xlsx", ExcelVersion.Version2016);
    }
}
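
Both loops above count down from the last row or column. Deleting an item shifts everything after it, so counting down keeps the indexes that are still to be checked valid. The sketch below shows the same idea on a plain list of rows using only the standard library. The class ReverseDeleteDemo and its methods are hypothetical names for illustration, and "blank" is assumed here to mean that every cell is null or whitespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReverseDeleteDemo {

    // Assumption: a row counts as blank when every cell is null or only whitespace
    static boolean isBlank(List<String> row) {
        for (String cell : row) {
            if (cell != null && !cell.trim().isEmpty()) {
                return false;
            }
        }
        return true;
    }

    // Removes blank rows in place, walking from the last index to the first
    public static void deleteBlankRows(List<List<String>> rows) {
        for (int i = rows.size() - 1; i >= 0; i--) {
            if (isBlank(rows.get(i))) {
                // Only rows after index i shift, and those were already checked
                rows.remove(i);
            }
        }
    }

    public static void main(String[] args) {
        List<List<String>> rows = new ArrayList<>();
        rows.add(Arrays.asList("a", "1"));
        rows.add(Arrays.asList("", " "));
        rows.add(Arrays.asList("", ""));
        rows.add(Arrays.asList("b", "2"));

        deleteBlankRows(rows);
        System.out.println(rows);
    }
}
```

A forward loop would skip the second of two adjacent blank rows, because after a deletion the next row moves into the index that was just checked.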

Output


Change PDF Versions in Java

In daily work, you might need to change the version of a PDF document you have in order to ensure compatibility with another version which a...