Thursday, March 12, 2015

Converting PDF to EPS Figure for LaTeX

Sometimes we want to convert a figure in PDF format to EPS format and use it in a LaTeX file. One important item we must take care of is the bounding box of the figure. If the bounding box is not generated correctly, we end up with a large area of white space around the figure in the final document.
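
To see which bounding box an EPS file actually declares, we can read the %%BoundingBox comment in its header. The following is only a minimal sketch: the function name eps_bbox_size is made up for illustration, and it assumes the comment carries four numbers (lower-left x, lower-left y, upper-right x, upper-right y) rather than the deferred "(atend)" form.

```shell
# eps_bbox_size (hypothetical helper): print the width and height, in
# PostScript points, declared by the first numeric %%BoundingBox comment.
eps_bbox_size() {
    # Fields: $1 = "%%BoundingBox:", $2 $3 = lower-left x y, $4 $5 = upper-right x y
    awk '/^%%BoundingBox: *-?[0-9]/ { print $4 - $2, $5 - $3; exit }' "$1"
}

# Example (assumes page2.eps exists):
#   eps_bbox_size page2.eps
```

If the reported size is much larger than the figure itself, the bounding box is probably the cause of the extra white space.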

On a Linux system, a set of free tools can help us produce an appropriate bounding box and convert PDF format to EPS format.
On Ubuntu, you may install them using apt-get as follows:

sudo apt-get install \
     pdftk gv texlive-extra-utils \
     poppler-utils ghostscript ps2eps
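
After installing, it may help to confirm that the commands used in this post are on the PATH. The helper below is a hypothetical sketch (missing_tools is not part of any of these packages); it prints the name of every command it cannot find.

```shell
# missing_tools (hypothetical helper): print each named command that is
# not found on the PATH, one per line; print nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands this post relies on.
missing_tools pdftk gv pdfcrop pdftops pdf2ps ps2eps
```

An empty result means every command was found.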


Assume that the PDF file foo.pdf contains an interesting figure on page 2 and we want to extract the figure to insert it in a LaTeX document. We would follow the steps below:
  1. Extract the page that contains the figure from the PDF file.

    pdftk foo.pdf cat 2 output page2.pdf
    

    where foo.pdf is the input PDF file, page2.pdf is the output PDF file, and 2 is the page number in the input PDF file.

  2. Roughly measure the position and dimensions of a box that contains the figure using gv.
    
    gv page2.pdf
    

    We read the coordinates of the bottom-left corner and the top-right corner of the box from gv. Assume the readings are (61, 82) and (321, 161), respectively.

  3. Crop the PDF file to the box obtained in the previous step.
    
    pdfcrop --bbox "61 82 321 161" page2.pdf
    

    The four values given to --bbox are the left, bottom, right, and top coordinates of the box. The output of this step is page2-crop.pdf.

  4. Crop the PDF file from the previous step again to trim the remaining white space around the figure.
    
    pdfcrop page2-crop.pdf
    

    The result is page2-crop-crop.pdf.

  5. Convert the PDF file to an EPS file.
    
    pdftops -eps page2-crop-crop.pdf page2.eps
    

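    Instead of pdftops from the Poppler developers, we could complete the last step with pdf2ps from Ghostscript, using the following two-step approach. Note that ps2eps names its output after its input file, so we rename the result afterwards.
    
    pdf2ps page2-crop-crop.pdf
    ps2eps page2-crop-crop.ps
    mv page2-crop-crop.eps page2.eps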
    

    However, we do not recommend pdf2ps, having been convinced by the argument made by Stefaan Lippens.

    As Stefaan Lippens points out, pdf2ps converts the fonts in PDF files to bitmap fonts in the resulting PS files. Some may consider this an advantage because the fonts used in the figure will always be "present" in the PDF files generated from the corresponding LaTeX files. However, as we discussed previously, it is not difficult to embed all fonts in a PDF file.
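
Steps 1, 3, 4, and 5 above can be chained into one shell function once the box has been read from gv. This is only a sketch under a few assumptions: the name pdf_figure_to_eps is hypothetical, intermediate files go into a temporary directory created with mktemp -d, and pdfcrop is given an explicit output file name instead of relying on its default -crop suffix.

```shell
# pdf_figure_to_eps (hypothetical helper): extract one page of a PDF,
# crop it to a rough box, trim the remaining margins, and convert to EPS.
# Usage: pdf_figure_to_eps INPUT.pdf PAGE "LEFT BOTTOM RIGHT TOP" OUTPUT.eps
pdf_figure_to_eps() {
    input=$1 page=$2 bbox=$3 output=$4
    workdir=$(mktemp -d) || return 1

    # Step 1: extract the page.
    # Step 3: crop to the rough box measured in gv.
    # Step 4: let pdfcrop trim the white space that is left.
    # Step 5: convert the tightly cropped PDF to EPS.
    pdftk "$input" cat "$page" output "$workdir/page.pdf" &&
    pdfcrop --bbox "$bbox" "$workdir/page.pdf" "$workdir/rough.pdf" &&
    pdfcrop "$workdir/rough.pdf" "$workdir/tight.pdf" &&
    pdftops -eps "$workdir/tight.pdf" "$output"
    status=$?

    # Remove the intermediate files whether or not the chain succeeded.
    rm -rf "$workdir"
    return "$status"
}

# Example with the values used in this post:
#   pdf_figure_to_eps foo.pdf 2 "61 82 321 161" page2.eps
```

Step 2 stays manual, since the rough box still has to be read from gv by eye.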
