3.4.  Bookmarks and Named Destinations


Bookmarks are essential for quick navigation in large PDF documents. The value of a book drops dramatically when the chapters are not available via the table of contents. Use the following tests to ensure that the bookmarks are generated correctly.

<!-- Tags for tests on bookmarks: -->

<hasNumberOfBookmarks />
<hasBookmarks />
<hasBookmark withLabel=".."        (One of these attributes ...
             withLinkToName=".."   ...
             withLinkToPage=".."   ...
             withLinkToURI=".."    ...
             withoutDeadLink=".."  ... has to be used)
/>

<hasBookmark withLabel=".."        (Only these two attributes ...
             linkingToPage=".."    ... can be used together.)
/>

<!-- Nested tags of <hasBookmarks />: -->
<hasBookmarks>
  <matchingXPath />   (optional)
  <matchingXML   />   (optional)
</hasBookmarks>

Bookmarks can be seen as starting points and named destinations as landing points. Named destinations can be used by bookmarks and also by HTML links, so you can jump from a website directly to a specific location within a PDF document.
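
As a rough illustration of the HTML-link case: a link to a named destination is commonly written as a URL fragment of the form "#nameddest=NAME", a convention from Adobe's PDF open parameters that many, though not all, viewers honor. The following Python sketch builds such a link. The function name and the example URL are hypothetical and not part of PDFUnit.

```python
def link_to_named_destination(pdf_url: str, name: str) -> str:
    """Build a URL that asks a PDF viewer to open pdf_url at a named destination.

    Assumption: the viewer supports the '#nameddest=' fragment convention.
    """
    if any(ch.isspace() for ch in name):
        # Whitespace would break the link, which is why destination names avoid it.
        raise ValueError(f"destination name must not contain whitespace: {name!r}")
    return f"{pdf_url}#nameddest={name}"


# Hypothetical URL, for illustration only.
print(link_to_named_destination("https://example.org/manual.pdf", "Chapter2"))
```

The whitespace check mirrors the restriction on destination names that the next subsection discusses.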

For named destinations, the following tags are available:

<!-- Tags to check named destinations: -->

<hasNamedDestination />

<!-- Nested tags: -->
  <withName />          (optional)

Named Destinations

The names of named destinations can be tested easily:

<testcase name="hasNamedDestination_WithName">
  <assertThat testDocument="namedDestination/manyNamedDestinations.pdf">

Because a name also has to work with external links, it must not contain spaces. For example, if a bookmark in a LibreOffice document has the label "First Bookmark" (which contains a space), LibreOffice creates a named destination called "First2520Bookmark" when exporting the document to PDF. A test has to use the escaped value:

<!-- LibreOffice converts every space in a bookmark label
     into "2520" in the named destination. -->
<testcase name="hasNamedDestination_CreatedWithLibreOffice">
  <assertThat testDocument="namedDestination/problem_convert-bookmarks-to-pdf.pdf">
    <hasNamedDestination>
      <withName>First2520Bookmark</withName>
    </hasNamedDestination>
  </assertThat>
</testcase>

"2520" stands for "%20", the URL encoding of a space; "25" is the hexadecimal code of the percent sign itself.

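The escaped name can be reproduced with a small Python sketch. It rests on the rule stated above (every space becomes "2520") and shows, as one plausible explanation rather than a confirmed description of LibreOffice's internals, how that string arises when a space is percent-encoded twice and the percent signs are then dropped. The function name is hypothetical.

```python
from urllib.parse import quote


def expected_destination_name(label: str) -> str:
    """Apply the rule from the text: every space in the label becomes '2520'."""
    return label.replace(" ", "2520")


once = quote(" ")    # a space is percent-encoded as "%20"
twice = quote(once)  # "%" itself is encoded as "%25", giving "%2520"

print(once, twice, twice.replace("%", ""))
print(expected_destination_name("First Bookmark"))
```

A helper like this can keep the readable label in the test sources and derive the escaped name from it.
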
Existence of Bookmarks

It is easy to verify the existence of bookmarks:

<testcase name="hasBookmarks">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmarks />
  </assertThat>
</testcase>

Number of Bookmarks

After testing whether a document contains bookmarks at all, it is worth verifying the number of bookmarks:

<testcase name="hasNumberOfBookmarks">
  <assertThat testDocument="bookmarks/manyBookmarks.pdf">

Label of a Bookmark

An important property of a bookmark is its label. That is what the reader sees. So you should test that an expected bookmark has the expected label:

<testcase name="hasBookmark_withLabel">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withLabel="Content on page 3." />
  </assertThat>
</testcase>

Destinations of Bookmarks

Bookmarks can have different kinds of destinations. The tag <hasBookmark /> provides a suitable attribute for each kind of destination.

Does a particular bookmark point to the expected page number:

<testcase name="hasBookmark_WithLabelLinkingToPage">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withLabel="Content on first page." linkingToPage="1" />
  </assertThat>
</testcase>

The attribute linkingToPage=".." can only be used together with the attribute withLabel="..". In such a test the given label has to point to the expected page number.

Is there any bookmark pointing to an expected page number:

<testcase name="hasBookmark_WithLinkToPage">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withLinkToPage="1" />
  </assertThat>
</testcase>

Does a bookmark exist which points to an expected destination:

<testcase name="hasBookmark_WithLinkToName">
  <assertThat testDocument="bookmarks/twoBookmarkToSameDestination.pdf">
    <hasBookmark withLinkToName="Destination on Page 1" />
  </assertThat>
</testcase>

Is there a bookmark pointing to a URI:

<testcase name="hasBookmark_WithLinkToURI">
  <assertThat testDocument="bookmarks/bookmarkWithURLAction.pdf">
    <hasBookmark withLinkToURI="http://www.wikipedia.org/" />
  </assertThat>
</testcase>

And finally, we can check that there is no bookmark having a dead link:

<!-- Looking for dead internal links (GOTO) of any bookmark.
     A 'dead link' means that a bookmark is not pointing to a page. -->
<testcase name="hasBookmark_WithoutDeadLink">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withoutDeadLink="YES" />
  </assertThat>
</testcase>

PDFUnit does not access websites. So a dead link is a bookmark that does not point to a page or any other destination.
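
The definition can be made concrete with a short Python sketch. The Bookmark record below is a hypothetical data model, not PDFUnit's internal representation. It only illustrates that a bookmark counts as dead when it has no target at all, and that a URI target is accepted without being fetched.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bookmark:
    """Hypothetical model of a bookmark and its possible targets."""
    label: str
    page: Optional[int] = None         # target page number, if any
    destination: Optional[str] = None  # named destination, if any
    uri: Optional[str] = None          # external link, never fetched here


def is_dead(bookmark: Bookmark) -> bool:
    """A bookmark is dead when it points to no page, destination or URI."""
    return (
        bookmark.page is None
        and bookmark.destination is None
        and bookmark.uri is None
    )


bookmarks = [
    Bookmark("Content on first page.", page=1),
    Bookmark("Wikipedia", uri="http://www.wikipedia.org/"),
    Bookmark("Orphan"),
]
print([b.label for b in bookmarks if is_dead(b)])
```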

Check Bookmarks with XML/XPath

The next tests all use an XML structure which is created with the utility program ExtractBookmarks.

The bookmarks of a PDF document can be compared with an existing XML file. Each bookmark in the PDF must match an element in the XML file.

<!-- When comparing PDF parts against any XML,
     whitespace and comments are ignored. -->
<testcase name="hasBookmarks_MatchingXML_AsFileName">
  <assertThat testDocument="bookmarks/bookmarksWithPdfOutline.pdf">
    <hasBookmarks>
      <matchingXML file="bookmarks/bookmarksWithPdfOutline.xml" />
    </hasBookmarks>
  </assertThat>
</testcase>


When comparing PDF parts against any XML, whitespace and comments are ignored.
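
What "ignoring whitespace and comments" can mean in practice is sketched below in Python. This is an approximation under stated assumptions, not PDFUnit's actual comparison logic: the helper names are hypothetical, the sample XML is invented for illustration, and only whitespace surrounding text content is trimmed.

```python
import xml.etree.ElementTree as ET


def normalize(elem):
    """Reduce an element to a comparable tuple, trimming surrounding whitespace."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        (elem.text or "").strip(),
        (elem.tail or "").strip(),
        tuple(normalize(child) for child in elem),
    )


def same_xml(left: str, right: str) -> bool:
    """Compare two XML strings; the default parser already drops comments."""
    return normalize(ET.fromstring(left)) == normalize(ET.fromstring(right))


# Hypothetical bookmark XML, not the actual output of ExtractBookmarks.
compact = "<Bookmark><Title Page='1'>Chapter 1</Title></Bookmark>"
pretty = """<Bookmark>
  <!-- generated for a test -->
  <Title Page="1">
    Chapter 1
  </Title>
</Bookmark>
"""
print(same_xml(compact, pretty))
```

Both documents compare as equal although they differ in indentation, quoting style and the presence of a comment.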

Bookmark information can also be verified using individual XPath expressions:

<testcase name="hasBookmarks_MatchingXPath_MultipleInvocation_version1">
  <assertThat testDocument="bookmarks/bookmarksWithPdfOutline.pdf">
    <hasBookmarks>
      <matchingXPath expr="count(//Title) = 5" />
      <matchingXPath expr="count(//Title[count(ancestor::*) > 2] ) = 0" />
    </hasBookmarks>
  </assertThat>
</testcase>
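
To make clear what the two expressions compute, the following Python sketch reproduces them by hand on a small sample. The XML structure is hypothetical; the real output of ExtractBookmarks may differ. Python's standard ElementTree module supports neither count() nor the ancestor axis, so the ancestors are counted while walking the tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical bookmark XML; the real ExtractBookmarks output may differ.
SAMPLE = """<Bookmark>
  <Title>Chapter 1
    <Title>Section 1.1</Title>
    <Title>Section 1.2</Title>
  </Title>
  <Title>Chapter 2</Title>
  <Title>Chapter 3</Title>
</Bookmark>"""


def title_ancestor_counts(root):
    """Return, for each <Title> in document order, its number of element ancestors."""
    counts = []

    def walk(elem, ancestors):
        if elem.tag == "Title":
            counts.append(ancestors)
        for child in elem:
            walk(child, ancestors + 1)

    walk(root, 0)
    return counts


counts = title_ancestor_counts(ET.fromstring(SAMPLE))
# Mirrors: count(//Title) = 5
print(len(counts) == 5)
# Mirrors: count(//Title[count(ancestor::*) > 2]) = 0
print(sum(1 for c in counts if c > 2) == 0)
```

In this sample a top-level Title has one ancestor and a nested Title has two, so the second expression asserts that no bookmark is nested more than two levels deep.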