Tina's blog: June 2020

Wednesday, 24 June 2020

Merge PDF Documents in Java

To manage and store documents more easily, you often need to combine several PDF documents into one. Here I’ll introduce two ways to merge PDF documents using Free Spire.PDF for Java.

Before beginning, download Free Spire.PDF for Java from the link, unzip it, and import Spire.Pdf.jar from the “lib” folder into your IDEA project. If you use Maven instead, add the following code to your project’s pom.xml file.

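<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>http://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf.free</artifactId>
        <!-- add a <version> element that matches the release you want to use -->
    </dependency>
</dependencies>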

Using the code

Method 1

Load three PDF documents, then use the first PdfDocument as the target and merge the second and third PDF files into it.

import com.spire.pdf.PdfDocument;

public class MergePDF1 {
    public static void main(String[] args) {
        //Load PDF files
        String[] files = new String[]
                {
                        "C:\\Users\\Test1\\Desktop\\Sample01.pdf",
                        "C:\\Users\\Test1\\Desktop\\Sample02.pdf",
                        "C:\\Users\\Test1\\Desktop\\Sample03.pdf",
                };

        //Open pdf documents
        PdfDocument[] docs = new PdfDocument[files.length];
        for (int i = 0; i < files.length; i++) {
            docs[i] = new PdfDocument();
            docs[i].loadFromFile(files[i]);
        }
        //Append the whole second document to the first
        docs[0].appendPage(docs[1]);

        //Import every page of the third document, one page at a time
        for (int i = 0; i < docs[2].getPages().getCount(); i++) {
            docs[0].insertPage(docs[2], i);
        }

        //Save the merged pdf file, then release all loaded documents
        docs[0].saveToFile("output/MergeDocument.pdf");
        for (PdfDocument d : docs) {
            d.close();
        }
    }
}
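
The step of the import loop decides which pages of the third document end up in the result. Here is a minimal, library-free sketch of that idea. It does not use the Spire.PDF API: pages are modeled as strings in a list, and the class PageStepSketch and the method importPages are hypothetical names. It also assumes that each imported page is added at the end of the target.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PageStepSketch {
    // Copies pages from source to the end of target, visiting indexes 0, step, 2*step, ...
    static List<String> importPages(List<String> target, List<String> source, int step) {
        if (step < 1) {
            throw new IllegalArgumentException("step must be >= 1");
        }
        List<String> merged = new ArrayList<>(target);
        for (int i = 0; i < source.size(); i += step) {
            merged.add(source.get(i));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("A1", "A2");
        List<String> third = Arrays.asList("C1", "C2", "C3");
        System.out.println(importPages(first, third, 1)); // [A1, A2, C1, C2, C3]
        System.out.println(importPages(first, third, 2)); // [A1, A2, C1, C3]
    }
}
```

With a step of 1 every page is imported. With a step of 2 only every other page is, which is rarely what you want when merging whole documents.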

Method 2

Read the three PDF documents as streams, then call the mergeFiles(streams) method to merge them into one PDF document.

import com.spire.pdf.*;
import java.io.*;

public class MergePDF2 {
    public static void main(String[] args) throws IOException {
        //Load PDF files
        FileInputStream stream1 = new FileInputStream(new File("C:\\Users\\Test1\\Desktop\\Sample01.pdf"));
        FileInputStream stream2 = new FileInputStream(new File("C:\\Users\\Test1\\Desktop\\Sample02.pdf"));
        FileInputStream stream3 = new FileInputStream(new File("C:\\Users\\Test1\\Desktop\\Sample03.pdf"));

        //Input PDF files by stream
        InputStream[] streams = new InputStream[]{stream1, stream2, stream3};

        //Merge files by stream
        PdfDocumentBase doc = PdfDocument.mergeFiles(streams);

        //Save the file
        doc.save("output/mergeFilesByStream.pdf");
        doc.close();

        //Close the input streams
        for (InputStream stream : streams) {
            stream.close();
        }
    }
}
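
When several streams are open at once, it is easy to leak one if an exception is thrown halfway. As a rough sketch, a try-with-resources block closes all of them automatically. The example below uses only the JDK: in-memory streams stand in for the three PDF files, and drainAll is a hypothetical placeholder for the library's merge call, not part of Spire.PDF.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class StreamMergeSketch {
    // Hypothetical stand-in for a merge call: drains every stream and returns the total byte count.
    static long drainAll(InputStream[] streams) {
        long total = 0;
        byte[] buffer = new byte[4096];
        try {
            for (InputStream in : streams) {
                int read;
                while ((read = in.read(buffer)) != -1) {
                    total += read;
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the three FileInputStream objects.
        try (InputStream s1 = new ByteArrayInputStream(new byte[]{1, 2, 3});
             InputStream s2 = new ByteArrayInputStream(new byte[]{4, 5});
             InputStream s3 = new ByteArrayInputStream(new byte[]{6})) {
            System.out.println(drainAll(new InputStream[]{s1, s2, s3})); // prints 6
        } // s3, s2 and s1 are closed here, in that order, even if drainAll throws
    }
}
```

The same pattern should carry over to FileInputStream, since it also implements AutoCloseable.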

Output

Tuesday, 16 June 2020

Extract Text and Images from Word in Java

Picking out text and images and saving them manually can be a long and frustrating process, especially in a large file with lots of pages. Fortunately, there’s a method that makes the process quite simple. The tutorial below shows how to extract text and images from a Word document with a few lines of code.

Before getting started, please download the Free Spire.Doc for Java package through this link, unzip it, and then import Spire.Doc.jar from the “lib” folder into your application.

Extract Text

import com.spire.doc.Document;
import java.io.FileWriter;
import java.io.IOException;

public class ExtractText {
    public static void main(String[] args) throws IOException {
        //load Word document
        Document document = new Document();
        document.loadFromFile("C:\\Users\\Test1\\Desktop\\Sample.docx");

        //get text from document as string
        String text = document.getText();

        //write string to a .txt file
        writeStringToTxt(text, "output/ExtractedText.txt");
    }

    public static void writeStringToTxt(String content, String txtFileName) throws IOException {
        //open the file in append mode (second argument is true);
        //try-with-resources flushes and closes the writer automatically
        try (FileWriter fWriter = new FileWriter(txtFileName, true)) {
            fWriter.write(content);
        }
    }
}
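
FileWriter uses the platform's default character encoding, so extracted text that contains non-ASCII characters may come out differently on another machine. One possible alternative, sketched here with the JDK's java.nio.file.Files class, writes the text explicitly as UTF-8. The class name, the helper names and the temp-folder path are assumptions made for illustration. Unlike the append mode used above, this version replaces an existing file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class WriteTextSketch {
    // Writes content as UTF-8, creating missing parent folders and replacing any existing file.
    static Path writeUtf8(Path target, String content) {
        try {
            Path parent = target.toAbsolutePath().getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            return Files.write(target, content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Reads the file back as UTF-8 text.
    static String readUtf8(Path source) {
        try {
            return new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        // Hypothetical output location under the system temp folder.
        Path target = Paths.get(System.getProperty("java.io.tmpdir"), "extract-sketch", "ExtractedText.txt");
        writeUtf8(target, "Text with an accent: caf\u00e9");
        System.out.println(readUtf8(target));
    }
}
```

Creating the parent folder first also avoids an error when the output folder does not exist yet.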

Output

Extract Images

import com.spire.doc.Document;
import com.spire.doc.documents.DocumentObjectType;
import com.spire.doc.fields.DocPicture;
import com.spire.doc.interfaces.ICompositeObject;
import com.spire.doc.interfaces.IDocumentObject;
import javax.imageio.ImageIO;
import java.awt.image.RenderedImage;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

public class ExtractImage {
    public static void main(String[] args) throws IOException {
        //load word document
        Document document = new Document();
        document.loadFromFile("C:\\Users\\Test1\\Desktop\\Sample.docx");

        //create a queue of the objects that still have to be visited
        Queue<Object> nodes = new LinkedList<>();
        nodes.add(document);

        //create a list to collect the extracted images
        List<Object> images = new ArrayList<>();

        //loop through the child objects of the document
        while (!nodes.isEmpty()) {
            ICompositeObject node = (ICompositeObject) nodes.poll();
            for (int i = 0; i < node.getChildObjects().getCount(); i++) {
                IDocumentObject child = node.getChildObjects().get(i);
                if (child instanceof ICompositeObject) {
                    nodes.add(child);

                    //get each image and add it to the list
                    if (child.getDocumentObjectType() == DocumentObjectType.Picture) {
                        DocPicture picture = (DocPicture) child;
                        images.add(picture.getImage());
                    }
                }
            }
        }

        //save images as numbered .png files so they do not overwrite each other
        for (int i = 0; i < images.size(); i++) {
            File file = new File(String.format("output/ExtractedImage-%d.png", i));
            ImageIO.write((RenderedImage) images.get(i), "PNG", file);
        }
    }
}
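
The queue-based loop above is a breadth-first traversal: each object is taken from the front of the queue and its children are added to the back. Here is a small sketch of the same idea without the Spire.Doc API. The Node class, its type strings and the file names are all hypothetical stand-ins for the document objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class TreeWalkSketch {
    // Hypothetical stand-in for a document object: a type, a name and child nodes.
    static final class Node {
        final String type;
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String type, String name) {
            this.type = type;
            this.name = name;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Visits the tree level by level and collects the names of all nodes of the wanted type.
    static List<String> collect(Node root, String wantedType) {
        List<String> found = new ArrayList<>();
        Queue<Node> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            Node node = queue.poll();
            if (node.type.equals(wantedType)) {
                found.add(node.name);
            }
            queue.addAll(node.children);
        }
        return found;
    }

    public static void main(String[] args) {
        Node paragraph = new Node("paragraph", "p1").add(new Node("picture", "inner.png"));
        Node section = new Node("section", "s1").add(paragraph).add(new Node("picture", "top.png"));
        Node document = new Node("document", "doc").add(section);
        System.out.println(collect(document, "picture")); // [top.png, inner.png]
    }
}
```

Because the walk goes level by level, the picture that sits directly in the section is found before the one nested inside the paragraph.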

Output

Monday, 8 June 2020

Delete Blank Rows and Columns in Excel in Java

If your Excel workbook contains blank rows and columns that make the data harder to read or edit, deleting them manually is time-consuming and error-prone. Here I’ll introduce a time-saving method that deletes blank rows and columns more efficiently.

Add Dependencies

Before beginning, you need to add the required dependencies to your Java project. There are two ways to do that.

Method 1: Download the Free Spire.XLS for Java package from the link, and then import Spire.Xls.jar from the “lib” folder into your IDEA project as a dependency.

Method 2: If you are using Maven, add the following code to your project’s pom.xml file.

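<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>http://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.xls.free</artifactId>
        <!-- add a <version> element that matches the release you want to use -->
    </dependency>
</dependencies>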

Sample Document

Using the code

import com.spire.xls.ExcelVersion;
import com.spire.xls.Workbook;
import com.spire.xls.Worksheet;

public class DeleteBlankRowsAndColumns {
    public static void main(String[] args) {
        //Load the sample document
        Workbook wb = new Workbook();
        wb.loadFromFile("C:\\Users\\Test1\\Desktop\\Sample.xlsx");

        //Get the third worksheet
        Worksheet sheet = wb.getWorksheets().get(2);

        //Loop through the rows
        for (int i = sheet.getLastRow(); i >= 1; i--)
        {
            //Detect if a row is blank
            if (sheet.getRows()[i-1].isBlank())
            {
                //Remove blank row
                sheet.deleteRow(i);
            }
        }

        //Loop through the columns
        for (int j = sheet.getLastColumn(); j >= 1; j--)
        {
            //Detect if a column is blank
            if (sheet.getColumns()[j-1].isBlank())
            {
                //Remove blank column
                sheet.deleteColumn(j);
            }
        }

        //Save the document
        wb.saveToFile("output/DeleteBlankRowsAndColumns.xlsx", ExcelVersion.Version2016);
    }
}
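
Both loops above count down from the last row or column to the first. The reason is that a deletion shifts every later item up by one, so a loop that counts upward would skip the item that moves into the deleted position. This can be sketched with a plain list of strings instead of a worksheet. The class and method names are hypothetical, and the sketch uses 0-based list indexes rather than Spire.XLS's 1-based row numbers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReverseDeleteSketch {
    // Walks from the last index to the first, so a removal never shifts an unvisited item.
    static List<String> removeBlankRows(List<String> rows) {
        List<String> result = new ArrayList<>(rows);
        for (int i = result.size() - 1; i >= 0; i--) {
            if (result.get(i).trim().isEmpty()) {
                result.remove(i);
            }
        }
        return result;
    }

    // Forward loop shown for contrast: after a removal the next item slides into index i and is skipped.
    static List<String> removeBlankRowsForward(List<String> rows) {
        List<String> result = new ArrayList<>(rows);
        for (int i = 0; i < result.size(); i++) {
            if (result.get(i).trim().isEmpty()) {
                result.remove(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "", "", "b");
        System.out.println(removeBlankRows(rows));        // [a, b]
        System.out.println(removeBlankRowsForward(rows)); // [a, , b]
    }
}
```

The forward version leaves one of the two adjacent blank entries behind, which is the mistake the backward loops avoid.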

Output
