Bookmarks are essential for quick navigation in large PDF documents. The value of a book drops dramatically when its chapters cannot be reached via the table of contents. Use the following tests to ensure that the bookmarks are generated correctly.
<!-- Tags for tests on bookmarks: -->
<hasNumberOfBookmarks />
<hasBookmarks />
<hasBookmark withLabel=".."          (One of these attributes
             withLinkToName=".."      ...
             withLinkToPage=".."      ...
             withLinkToURI=".."       ...
             withoutDeadLink=".."     has to be used)
/>
<hasBookmark withLabel=".."          (Only these two attributes
             linkingToPage=".."       can be used together.)
/>
<!-- Nested tags of <hasBookmarks />: -->
<hasBookmarks>
  <matchingXPath />    (optional)
  <matchingXML />      (optional)
</hasBookmarks>
We can see bookmarks as starting points and “named destinations” as the landing points. Named destinations can be used by bookmarks and also by HTML links. So you can jump from a website directly to a specific location within a PDF document.
For named destinations, the following tags are available:
<!-- Tags to check named destinations: -->
<hasNamedDestination />
<!-- Nested tags: -->
<hasNamedDestination>
  <withName />    (optional)
</hasNamedDestination>
The names of named destinations can be tested easily:
<testcase name="hasNamedDestination_WithName">
  <assertThat testDocument="namedDestination/manyNamedDestinations.pdf">
    <hasNamedDestination>
      <withName>Seventies</withName>
      <withName>Eighties</withName>
      <withName>1999</withName>
      <withName>2000</withName>
    </hasNamedDestination>
  </assertThat>
</testcase>
Because a name also has to work with external links, it must not contain spaces.
For example, if a document in LibreOffice has a label "First Bookmark" (which contains a space), then LibreOffice creates a destination with the name "First2520Bookmark" when exporting it to PDF. A test has to use the escaped value:
<!-- When exporting to PDF, LibreOffice converts every space
     in a bookmark label into "2520" in the named destination. -->
<testcase name="hasNamedDestination_CreatedWithLibreOffice">
  <assertThat testDocument="namedDestination/problem_convert-bookmarks-to-pdf.pdf">
    <hasNamedDestination>
      <withName>First2520Bookmark</withName>
    </hasNamedDestination>
  </assertThat>
</testcase>
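The same rule applies to every space in a label, so a label with several spaces produces several "2520" sequences. The following is a minimal sketch under stated assumptions: the document name and the label "My First Bookmark" are hypothetical and serve only to illustrate the escaping.

```xml
<!-- Sketch only: the file name and the label "My First Bookmark"
     are hypothetical. Each space is expected to appear as "2520". -->
<testcase name="hasNamedDestination_LabelWithSeveralSpaces">
  <assertThat testDocument="namedDestination/myLibreOfficeExport.pdf">
    <hasNamedDestination>
      <withName>My2520First2520Bookmark</withName>
    </hasNamedDestination>
  </assertThat>
</testcase>
```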
It is easy to verify the existence of bookmarks:
<testcase name="hasBookmarks">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmarks />
  </assertThat>
</testcase>
After testing whether a document contains bookmarks at all, it is worth verifying the number of bookmarks:
<testcase name="hasNumberOfBookmarks">
  <assertThat testDocument="bookmarks/manyBookmarks.pdf">
    <hasNumberOfBookmarks>19</hasNumberOfBookmarks>
  </assertThat>
</testcase>
An important property of a bookmark is its label. That is what the reader sees. So you should test that an expected bookmark has the expected label:
<testcase name="hasBookmark_withLabel">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withLabel="Content on page 3." />
  </assertThat>
</testcase>
Bookmarks can have different kinds of destinations. A suitable attribute is provided for each kind of destination inside the tag <hasBookmark />.
Does a particular bookmark point to the expected page number:
<testcase name="hasBookmark_WithLabelLinkingToPage">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withLabel="Content on first page." linkingToPage="1" />
  </assertThat>
</testcase>
The attribute linkingToPage=".." can only be used together with the attribute withLabel="..". In such a test, the bookmark with the given label has to point to the expected page number.
Is there any bookmark pointing to an expected page number:
<testcase name="hasBookmark_WithLinkToPage">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withLinkToPage="1" />
  </assertThat>
</testcase>
Does a bookmark exist which points to an expected destination:
<testcase name="hasBookmark_WithLinkToName">
  <assertThat testDocument="bookmarks/twoBookmarkToSameDestination.pdf">
    <hasBookmark withLinkToName="Destination on Page 1" />
  </assertThat>
</testcase>
Is there a bookmark pointing to a URI:
<testcase name="hasBookmark_WithLinkToURI">
  <assertThat testDocument="bookmarks/bookmarkWithURLAction.pdf">
    <hasBookmark withLinkToURI="http://www.wikipedia.org/" />
  </assertThat>
</testcase>
And finally, we can check that no bookmark has a “dead link”:
<!-- Looking for dead internal links (GOTO) of any bookmark.
     A 'dead link' means that a bookmark is not pointing to a page. -->
<testcase name="hasBookmark_WithoutDeadLink">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withoutDeadLink="YES" />
  </assertThat>
</testcase>
PDFUnit does not access websites. So a “dead link” is a bookmark that does not point to a page or any other destination.
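The individual checks shown so far can be collected into a small set of test cases for a single document. The following is a minimal sketch using only the tags described above. The document name "bookmarks/myBook.pdf" and the expected number of 12 bookmarks are hypothetical.

```xml
<!-- Sketch only: "bookmarks/myBook.pdf" and the number 12 are hypothetical. -->
<testcase name="myBook_hasBookmarks">
  <assertThat testDocument="bookmarks/myBook.pdf">
    <hasBookmarks />
  </assertThat>
</testcase>

<testcase name="myBook_hasNumberOfBookmarks">
  <assertThat testDocument="bookmarks/myBook.pdf">
    <hasNumberOfBookmarks>12</hasNumberOfBookmarks>
  </assertThat>
</testcase>

<testcase name="myBook_hasBookmark_WithoutDeadLink">
  <assertThat testDocument="bookmarks/myBook.pdf">
    <hasBookmark withoutDeadLink="YES" />
  </assertThat>
</testcase>
```

Keeping each check in its own test case means that a failing test directly names the property that is broken.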
The next tests all use an XML structure which is created with the utility program ExtractBookmarks.
The bookmarks of a PDF document can be compared with an existing XML file. Each bookmark in the PDF must match an element in the XML file.
<!-- When comparing PDF parts against any XML,
     whitespace and comments are ignored. -->
<testcase name="hasBookmarks_MatchingXML_AsFileName">
  <assertThat testDocument="bookmarks/bookmarksWithPdfOutline.pdf">
    <hasBookmarks>
      <matchingXML file="bookmarks/bookmarksWithPdfOutline.xml" />
    </hasBookmarks>
  </assertThat>
</testcase>
Bookmark information can also be verified using individual XPath expressions:
<testcase name="hasBookmarks_MatchingXPath_MultipleInvocation_version1">
  <assertThat testDocument="bookmarks/bookmarksWithPdfOutline.pdf">
    <hasBookmarks>
      <matchingXPath expr="count(//Title) = 5" />
      <matchingXPath expr="count(//Title[count(ancestor::*) > 2] ) = 0" />
    </hasBookmarks>
  </assertThat>
</testcase>
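To see what the two XPath expressions assert, it helps to look at a bookmark structure they would accept. The following is a sketch only. The root element name Bookmark and the nesting of Title elements inside each other are assumptions made for illustration, and the file actually written by ExtractBookmarks may look different. Only the element name Title is taken from the expressions above.

```xml
<!-- Hypothetical bookmark structure, assumed for illustration only. -->
<Bookmark>
  <Title>Chapter 1
    <Title>Section 1.1</Title>
    <Title>Section 1.2</Title>
  </Title>
  <Title>Chapter 2
    <Title>Section 2.1</Title>
  </Title>
</Bookmark>
```

This sample contains five Title elements, so count(//Title) = 5 holds. A top-level Title has one ancestor element and a nested Title has two, so no Title has more than two ancestors and the second expression holds as well. Under this assumed structure, the second expression limits the bookmark tree to two levels.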