13.4. Comparing Text

Expected text and actual text on a PDF page can be compared using the following methods:

// Methods with configurable whitespace processing. Default is NORMALIZE:
.containing(searchToken                                   )   
.containing(searchToken             , WhitespaceProcessing)   
.containing(String[] searchTokens                         )
.containing(String[] searchTokens   , WhitespaceProcessing)
.endingWith(searchToken                                   )
.endingWith(searchToken             , WhitespaceProcessing)
.equalsTo(searchToken                                     )
.equalsTo(searchToken               , WhitespaceProcessing)
.first(searchToken                                        )
.first(searchToken                  , WhitespaceProcessing)   
.notContaining(searchToken                                )
.notContaining(searchToken          , WhitespaceProcessing)
.notContaining(String[] searchTokens                      )
.notContaining(String[] searchTokens, WhitespaceProcessing)
.startingWith(searchToken                                 )
.startingWith(searchToken           , WhitespaceProcessing)
.then(searchToken)                                            

// Methods with whitespace processing NORMALIZE:
.notEndingWith(searchToken)
.notStartingWith(searchToken)

// Methods without whitespace processing:
.matchingRegex(regex)
.notMatchingRegex(regex)

	Methods called without the second parameter normalize whitespace: leading and trailing whitespace is removed, and every sequence of whitespace characters within the text is reduced to a single space.
	In the methods that accept a second parameter, that parameter controls how whitespace is processed. The constants `IGNORE`, `NORMALIZE`, and `KEEP` are available for it. They are explained separately in section 13.5: “Whitespace Processing” and can be used in every method that takes `WhitespaceProcessing` as its second parameter.
	The method `then(..)` always processes whitespace in the same way as `first(..)`.
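The normalization rule described above can be illustrated in plain Java. This is only a minimal sketch of the described behavior, not PDFUnit's implementation; the class `WhitespaceSketch` and the method `normalize` are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFUnit.
class WhitespaceSketch {

    // One or more whitespace characters (space, tab, line break, ...)
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    // Approximates the described normalization: reduce every whitespace
    // sequence to a single space, then remove whitespace at both ends.
    public static String normalize(String text) {
        return WHITESPACE_RUN.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String actual = "  Page \t 1\n of   3 ";
        System.out.println("[" + normalize(actual) + "]"); // prints [Page 1 of 3]
    }
}
```

Under this reading, an expected text and an actual text compare as equal whenever they differ only in the amount or kind of whitespace between words.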

Comparisons with regular expressions follow the rules and possibilities of the class `java.util.regex.Pattern`:

// Using regular expression to compare page content
@Test
public void hasText_MatchingRegex() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  AssertThat.document(filename)
            .restrictedTo(FIRST_PAGE)
            .hasText()
            .matchingRegex(".*[Cc]ontent.*")  
  ;
}
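Because the rules of `java.util.regex.Pattern` apply, it can help to try an expression against sample text in plain Java first. The sketch below shows two properties of `Pattern` that often matter for page text: `Pattern.matches` requires the whole input to match, and `.` does not match line breaks unless DOTALL mode (`(?s)`) is enabled. Whether PDFUnit matches the whole page text and how it represents line breaks is not stated in this section, so treat that part as an assumption. `RegexSketch` and `matchesWhole` are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFUnit.
class RegexSketch {

    // Whole-input match, as performed by Pattern.matches
    public static boolean matchesWhole(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        String regex = ".*[Cc]ontent.*";

        // Single line containing "Content": matches
        System.out.println(matchesWhole(regex, "Table of Contents"));            // true

        // '.' does not cross the line break, so the whole input cannot match
        System.out.println(matchesWhole(regex, "Chapter 1\nContent"));           // false

        // With DOTALL, '.' also matches line breaks
        System.out.println(matchesWhole("(?s)" + regex, "Chapter 1\nContent"));  // true
    }
}
```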

The methods `containing(String[])` and `notContaining(String[])` accept multiple search terms. A test with `containing(String[])` succeeds only if every expected term appears on each selected page. A test with `notContaining(String[])` succeeds only if none of the terms appears on any of the selected pages:

@Test
public void hasText_NotContaining_MultipleSearchTokens() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  AssertThat.document(filename)
            .restrictedTo(FIRST_PAGE)
            .hasText()
            .notContaining("even pagenumber", "Page #2") 
  ;
}
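The success conditions for multiple search terms can be written out as two small predicates. This is a sketch of the rule as stated above, assuming each page is available as a plain string and leaving whitespace processing aside; `MultiTokenSketch`, `containingAll`, and `containingNone` are hypothetical names, not PDFUnit API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFUnit.
class MultiTokenSketch {

    // True only if every token occurs on every page
    public static boolean containingAll(List<String> pages, String... tokens) {
        for (String page : pages) {
            for (String token : tokens) {
                if (!page.contains(token)) {
                    return false;
                }
            }
        }
        return true;
    }

    // True only if no token occurs on any page
    public static boolean containingNone(List<String> pages, String... tokens) {
        for (String page : pages) {
            for (String token : tokens) {
                if (page.contains(token)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList("Page #1 odd pagenumber");
        System.out.println(containingNone(pages, "even pagenumber", "Page #2")); // true
        System.out.println(containingAll(pages, "Page #1", "Page #2"));          // false
    }
}
```

Note that the two rules are not simple negations of each other: a page set that contains only some of the terms fails both checks.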
