The reasons for testing a particular region of a PDF page are described in section
3.21: “Layout - in Page Regions”.
To find the coordinates of the area you want to test, PDFUnit provides the small utility
RenderPdfPageRegionToImage. Choose the width, the height, and the position of
the upper left corner, and the corresponding region is extracted into an image file.
Check this file by eye and vary the parameters until you have the right region.
Once you have found the correct coordinates for your region, use those parameters
in your PDFUnit test.
::
:: Render a part of a PDF page into an image file
::
@echo off
setlocal
set CLASSPATH=./lib/aspectj-1.8.7/*;%CLASSPATH%
set CLASSPATH=./lib/commons-logging-1.2/*;%CLASSPATH%
set CLASSPATH=./lib/pdfbox-2.0.0/*;%CLASSPATH%
set CLASSPATH=./lib/pdfunit-2016.05/*;%CLASSPATH%

set TOOL=com.pdfunit.tools.RenderPdfPageRegionToImage
set OUT_DIR=./tmp
set PAGENUMBER=1
set IN_FILE=documentForTextClipping.pdf
set PASSWD=

:: Put these values into your test code:
:: Values in millimeters:
set UPPERLEFTX=17
set UPPERLEFTY=45
set WIDTH=60
set HEIGHT=9

:: Note: FORMATUNIT is not set in this script; set it before running.
java %TOOL% %IN_FILE% %PAGENUMBER% %OUT_DIR% %FORMATUNIT% %UPPERLEFTX% %UPPERLEFTY% %WIDTH% %HEIGHT% %PASSWD%
endlocal
The four values that define a page region must be given in millimeters (mm).
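If your measurements come from a tool that reports PDF points (the default PDF user-space unit, 1/72 inch), they have to be converted to millimeters first. The following is a minimal sketch of that arithmetic. The class and method names are hypothetical and not part of PDFUnit.

```java
// Hypothetical helper, not part of PDFUnit: converts PDF points to millimeters.
class PointsToMillimeters {

    static final double POINTS_PER_INCH = 72.0; // default PDF user-space unit
    static final double MM_PER_INCH = 25.4;

    /** Converts a length in points to millimeters. */
    static double toMillimeters(double points) {
        return points * MM_PER_INCH / POINTS_PER_INCH;
    }

    public static void main(String[] args) {
        // Example: a length of 144 points, i.e. two inches.
        System.out.println(toMillimeters(144.0) + " mm");
    }
}
```

Round the result to the precision you need before passing it to the script or to your test.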
The upper part of the input file documentForTextClipping.pdf
contains the text: “Content on first page.”
Check the generated image file to confirm that it shows this text.
The name of the generated PNG includes the area’s coordinates.
Because PDFUnit and the utility program RenderPdfPageRegionToImage
use the same algorithm, you can use the parameter values from the script directly in
your test. You can also derive them later from the file name:
#
# Parameters from filename:
#
_rendered_documentForTextClipping_page-1_area-50-130-170-25.out.png
                                              |  |   |   |
                                              |  |   |   +- height
                                              |  |   +- width
                                              |  +- upperLeftY
                                              +- upperLeftX
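Reading the four values back from a file name can also be automated. The following is a minimal sketch. It assumes that the name always ends in `_area-<upperLeftX>-<upperLeftY>-<width>-<height>.out.png` with whole-number values, as in the example above. The class `RegionFromFileName` is hypothetical and not part of PDFUnit.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFUnit.
class RegionFromFileName {

    // Assumed naming scheme, taken from the single example in this section:
    // ..._area-<upperLeftX>-<upperLeftY>-<width>-<height>.out.png
    private static final Pattern AREA =
        Pattern.compile("_area-(\\d+)-(\\d+)-(\\d+)-(\\d+)\\.out\\.png$");

    /** Returns {upperLeftX, upperLeftY, width, height} as found in the file name. */
    static int[] parse(String fileName) {
        Matcher m = AREA.matcher(fileName);
        if (!m.find()) {
            throw new IllegalArgumentException("No area coordinates in: " + fileName);
        }
        int[] region = new int[4];
        for (int i = 0; i < 4; i++) {
            region[i] = Integer.parseInt(m.group(i + 1));
        }
        return region;
    }

    public static void main(String[] args) {
        int[] r = parse("_rendered_documentForTextClipping_page-1_area-50-130-170-25.out.png");
        System.out.printf("upperLeftX=%d upperLeftY=%d width=%d height=%d%n",
                          r[0], r[1], r[2], r[3]);
    }
}
```

The returned values are in the same unit that was used when the image was rendered, so they can be copied into the test unchanged.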