PDFBox PDF-to-Image Utility

1. Result

(1) Convert all pages of a PDF:

# Run
java -jar pdftransfer-1.0.jar D:\test\mypdflocation\文件.pdf
[Screenshot: converting all pages of the PDF]

[Screenshot: conversion result]

(2) Convert specified pages

# Run. Every argument after the file path is a page number (1-based); separate multiple pages with spaces
java -jar pdftransfer-1.0.jar D:\test\mypdflocation\文件.pdf 5 6 7
[Screenshot: converting specified pages]
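
Page numbers are 1-based on the command line, while PDFBox page indexes are 0-based. A minimal sketch of validating the arguments up front, assuming a hypothetical helper `PageArgs.parsePages` that is not part of the tool's source:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; the original tool parses the arguments inline in main().
final class PageArgs {
    private PageArgs() {
    }

    /**
     * Converts the 1-based page numbers in args[1..] into 0-based page indexes.
     * Returns an empty list when no page numbers were given, meaning "all pages".
     */
    static List<Integer> parsePages(String[] args, int pageCount) {
        List<Integer> pages = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            int page;
            try {
                page = Integer.parseInt(args[i].trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Not a page number: " + args[i], e);
            }
            if (page < 1 || page > pageCount) {
                throw new IllegalArgumentException("Page " + page + " is outside 1.." + pageCount);
            }
            pages.add(page - 1);
        }
        return pages;
    }
}
```

Rejecting out-of-range pages before any work is queued avoids a rendering failure surfacing later in a worker thread.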

2. Source code

JDK:
1.8 recommended

Maven dependencies:

<dependency>
       <groupId>com.levigo.jbig2</groupId>
       <artifactId>levigo-jbig2-imageio</artifactId>
       <version>2.0</version>
</dependency>
<dependency>
       <groupId>com.twelvemonkeys.imageio</groupId>
       <artifactId>imageio-jpeg</artifactId>
       <version>3.4.1</version>
</dependency>
<dependency>
       <groupId>org.apache.pdfbox</groupId>
       <artifactId>pdfbox</artifactId>
       <version>2.0.17</version>
</dependency>
<dependency>
       <groupId>com.github.jai-imageio</groupId>
       <artifactId>jai-imageio-core</artifactId>
       <version>1.4.0</version>
</dependency>
<dependency>
      <groupId>com.github.jai-imageio</groupId>
      <artifactId>jai-imageio-jpeg2000</artifactId>
      <version>1.3.0</version>
</dependency>

Source file

package com.qzh;

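import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.cos.COSObject;
import org.apache.pdfbox.pdmodel.DefaultResourceCache;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.graphics.PDXObject;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;
import org.apache.pdfbox.rendering.PDFRenderer;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.EOFException;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.*;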

/**
 * Argument 1: path to the PDF
 * Remaining arguments (optional): page numbers to convert
 *
 * @author qu.zh
 * @date 2019/12/11 17:24
 */
public class PdfTransfer {

    private static String FILE_NAME = "";

    /**
     * Output directory
     */
    private static String FILE_OUTPUT_PATH = "d:/pdf/output/";

    /**
     * Job queue
     */
    private volatile static ArrayBlockingQueue<DataEntity> queue = new ArrayBlockingQueue<DataEntity>(500);

    /**
     * Number of worker threads (close to the number of CPU cores)
     */
    private static final int CPU_CORE = 7;

    /**
     * Default DPI; could be made configurable through an argument
     */
    private static final int DEFAULT_DPI = 400;


    static {
        System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider");
    }

    public static void main(String[] args) {
        // Path of the PDF
        if (args == null || args.length == 0) {
            System.err.println("Please pass the path of the PDF");
            return;
        }
        String name = args[0];

        List<Integer> needTransferList = new ArrayList<>();
        if (args.length > 1) {
            for (int i = 1; i < args.length; i++) {
                needTransferList.add(Integer.valueOf(args[i]));
                System.out.println("Page to convert: " + args[i]);
            }
        }

        FILE_NAME = name;

        if (FILE_NAME == null || "".equals(FILE_NAME)) {
            System.err.println("The PDF path must not be empty");
            return;
        }


        ThreadPoolExecutor executor = new ThreadPoolExecutor(CPU_CORE, CPU_CORE,
                1200, TimeUnit.SECONDS, new SynchronousQueue<>(), new ThreadPoolExecutor.CallerRunsPolicy());
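        // Measured on an i7 machine with 16 GB RAM and 8 cores:
        // 20 threads: 62 s
        // 14 threads: 46 s
        // 10 threads: 44 s
        // 8 threads:  40 s
        // 6 threads:  42 s
        // 4 threads:  49 s
        // Fastest when the thread count is close to the core count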
        for (int i = 0; i < CPU_CORE; i++) {
            executor.submit(new MyTask(queue));
        }

        File file = new File(FILE_NAME);
        // Use the file name without its extension as the output folder name
        int end = file.getName().lastIndexOf(".");
        String folderName = end > 0 ? file.getName().substring(0, end) : file.getName();

        // FILE_OUTPUT_PATH already ends with a separator
        FILE_OUTPUT_PATH = FILE_OUTPUT_PATH + folderName + File.separator;
        File output = new File(FILE_OUTPUT_PATH);
        if (!output.exists()) {
            output.mkdirs();
        }


        PDDocument pdDocument = null;
        try {
            Date startDate = new Date();
            System.out.println();
            pdDocument = PDDocument.load(new File(FILE_NAME));

            pdDocument.setResourceCache(new MyResourceCache());
            int pageCount = pdDocument.getNumberOfPages();

            PDFRenderer renderer = new PDFRenderer(pdDocument);
            int totalToConvert = needTransferList.size() > 0 ? needTransferList.size() : pageCount;
            CountDownLatch countDownLatch = new CountDownLatch(totalToConvert);
            System.out.println("Total pages to convert: " + totalToConvert);
            if (needTransferList.size() > 0) {
                for (int i = 0; i < needTransferList.size(); i++) {
                    int curPage = needTransferList.get(i);
                    DataEntity dataEntity = new DataEntity();
                    dataEntity.setPageNum(curPage - 1);
                    dataEntity.setPdfRenderer(renderer);
                    dataEntity.setCountDownLatch(countDownLatch);
                    dataEntity.setPageCount(pageCount);
                    queue.put(dataEntity);
                }
            } else {
                for (int j = 0; j < pageCount; j++) {
                    DataEntity dataEntity = new DataEntity();
                    dataEntity.setPageNum(j);
                    dataEntity.setPdfRenderer(renderer);
                    dataEntity.setCountDownLatch(countDownLatch);
                    dataEntity.setPageCount(pageCount);
                    int imageCount = 0;
                    int fontCount = 0;
                    PDImageXObject singleImage = null;
                    // Inspect the page resources: count image XObjects and fonts
                    PDPage page = pdDocument.getPage(j);
                    PDResources resources = page.getResources();
                    if (resources != null) {
                        Iterator<COSName> xObjectNames = resources.getXObjectNames().iterator();
                        while (xObjectNames.hasNext()) {
                            COSName cosName = xObjectNames.next();
                            if (resources.isImageXObject(cosName)) {
                                imageCount++;
                                singleImage = (PDImageXObject) resources.getXObject(cosName);
                            }
                        }
                        // Fonts are registered under their own names, so look them up separately
                        for (COSName fontName : resources.getFontNames()) {
                            PDFont font = resources.getFont(fontName);
                            if (font != null) {
                                fontCount++;
                            }
                        }
                    }
                    // If the page consists of a single image and no text, write that image
                    // out directly; rendering the whole page would be much slower
                    if (fontCount == 0 && imageCount == 1) {
                        String imageFileName = FILE_OUTPUT_PATH + "number_" + (j + 1) + "_page.png";
                        try (FileOutputStream out = new FileOutputStream(imageFileName)) {
                            ImageIO.write(singleImage.getImage(), "png", out);
                        } catch (IOException e) {
                            e.printStackTrace();
                        } finally {
                            // No worker handles this page, so count it down here;
                            // otherwise await() below only returns after its timeout
                            countDownLatch.countDown();
                        }
                    } else {
                        queue.put(dataEntity);
                    }
                }
            }

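            // Wait for all pages, but give up after 200 seconds
            boolean finished = countDownLatch.await(200, TimeUnit.SECONDS);
            System.out.println(finished ? "Conversion finished!" : "Timed out waiting for the conversion to finish");
            System.out.println("Pages: " + pageCount);
            Date endDate = new Date();
            System.out.println("Elapsed: " + (endDate.getTime() - startDate.getTime()) / 1000 + " s");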

        } catch (IOException e) {
            e.printStackTrace();
            System.err.println("IOException");
        } catch (InterruptedException e) {
            e.printStackTrace();
            System.err.println("InterruptedException");
        } catch (Throwable throwable) {
            throwable.printStackTrace();
        } finally {
            try {
                if (pdDocument != null) {
                    pdDocument.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
                System.err.println("IOException");
            }
        }
        System.out.println("Press Ctrl+C to exit");
    }

    /**
     * Cache tuning: according to the official site, overriding put() with an empty body disables the cache
     */
    private static class MyResourceCache extends DefaultResourceCache {
        @Override
        public void put(COSObject indirect, PDXObject xobject) throws IOException {
            // super .put(indirect,xobject);
        }
    }

    /**
     * Worker task
     */
    public static class MyTask implements Runnable {

        private ArrayBlockingQueue<DataEntity> queue;

        public MyTask(ArrayBlockingQueue<DataEntity> queue) {
            this.queue = queue;
        }

        @Override
        public void run() {
            while (true) {
                DataEntity dataEntity = null;
                BufferedImage image = null;
                try {
                    // Take the next job from the queue
                    dataEntity = queue.take();
                    PDFRenderer renderer = dataEntity.getPdfRenderer();
                    int pageNum = dataEntity.getPageNum();
                    String imageFileName = FILE_OUTPUT_PATH + "第" + (pageNum + 1) + "页.png";
                    // Render the page to an image
                    System.out.println("============ Converting page " + (pageNum + 1) + " ============");

                    // PDFRenderer is not thread-safe, so rendering is locked, at some cost in performance
                    synchronized (renderer) {
                        image = renderer.renderImageWithDPI(pageNum, DEFAULT_DPI);
                    }

                    ImageIO.write(image, "png", new File(imageFileName));
                } catch (InterruptedException e) {
                    e.printStackTrace();
                    System.err.println("InterruptedException==========");
                } catch (IOException e) {
                    if (e instanceof EOFException) {
                        System.err.println("EOFException========");
                    } else {
                        System.err.println("IOException========");
                        e.printStackTrace();
                    }


                } catch (Exception throwable) {
                    System.out.println("=================Throwable==========================");
                    throwable.printStackTrace();
                } finally {
                    if (image != null) {
                        image.flush();
                    }

                    if (dataEntity != null) {
                        CountDownLatch countDownLatch = dataEntity.getCountDownLatch();
                        countDownLatch.countDown();
                    }
                }
            }
        }
    }

    /**
     * Job entity
     */
    private static class DataEntity implements Serializable {
        public static final long serialVersionUID = -1;
        private PDFRenderer pdfRenderer;

        private int pageNum;

        private CountDownLatch countDownLatch;

        private int pageCount;

        public int getPageCount() {
            return pageCount;
        }

        public void setPageCount(int pageCount) {
            this.pageCount = pageCount;
        }

        public PDFRenderer getPdfRenderer() {
            return pdfRenderer;
        }

        public CountDownLatch getCountDownLatch() {
            return countDownLatch;
        }

        public void setCountDownLatch(CountDownLatch countDownLatch) {
            this.countDownLatch = countDownLatch;
        }

        public void setPdfRenderer(PDFRenderer pdfRenderer) {
            this.pdfRenderer = pdfRenderer;
        }

        public int getPageNum() {
            return pageNum;
        }

        public void setPageNum(int pageNum) {
            this.pageNum = pageNum;
        }
    }
}
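
The workers in the listing above loop forever on `queue.take()`, which is why the program prints a hint to press Ctrl+C at the end. One common alternative is a "poison pill": a sentinel job that tells each worker to stop, so the pool can shut down on its own. The sketch below is not the author's code; the names `PoisonPillDemo`, `STOP`, and `processAll` are hypothetical, and adding a page index to a set stands in for rendering and writing the page.

```java
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class PoisonPillDemo {
    // Sentinel page index that tells a worker to stop (hypothetical convention)
    private static final int STOP = -1;

    /** Handles page indexes 0..pageCount-1 on `workers` threads and returns the handled pages. */
    static Set<Integer> processAll(int pageCount, int workers) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(500);
        Set<Integer> done = ConcurrentHashMap.newKeySet();
        ExecutorService executor = Executors.newFixedThreadPool(workers);

        for (int w = 0; w < workers; w++) {
            executor.submit(() -> {
                try {
                    while (true) {
                        int page = queue.take();
                        if (page == STOP) {
                            return; // leave the loop so the thread can end
                        }
                        done.add(page); // stand-in for rendering and writing the page
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            for (int page = 0; page < pageCount; page++) {
                queue.put(page);
            }
            for (int w = 0; w < workers; w++) {
                queue.put(STOP); // one sentinel per worker
            }
            executor.shutdown();
            if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while dispatching pages", e);
        }
        return done;
    }
}
```

Because `awaitTermination` returns once every worker has consumed its sentinel, this variant needs neither a `CountDownLatch` nor a manual Ctrl+C.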



To declare the main class so the jar can be run directly:

<build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.3</version>
                <configuration>
                    <appendAssemblyId>false</appendAssemblyId>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <classpathPrefix>lib/</classpathPrefix>
                            <mainClass>com.qzh.PdfTransfer</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

3. Notes

(1) The number of core threads can be tuned to the actual environment; the CPU core count, plus or minus one, is recommended.
(2) For other problems, see: http://www.itdecent.cn/p/c85017f8577a
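
As a sketch of note (1), the thread count could be derived from the machine instead of being hard-coded. `PoolSizing` and its methods are hypothetical names introduced here; `Runtime.availableProcessors()` is the standard JDK call.

```java
// Hypothetical helper for illustration; the original source uses a fixed constant instead.
final class PoolSizing {
    private PoolSizing() {
    }

    /** Returns cores + offset, but never fewer than one thread. */
    static int workerCount(int availableCores, int offset) {
        return Math.max(1, availableCores + offset);
    }

    /** Uses the core count reported by the JVM, with no offset. */
    static int defaultWorkerCount() {
        return workerCount(Runtime.getRuntime().availableProcessors(), 0);
    }
}
```

For example, `PoolSizing.workerCount(8, -1)` yields 7, the value the source hard-codes for its 8-core test machine.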
