3.13.  Form Fields

Overview

When PDF documents are part of a workflow, it is often the content of form fields that is processed. To avoid problems, the fields should be created properly: field names should be unique, and certain field properties should be set correctly.

All information about form fields can be extracted into an XML file by using the utility ExtractFieldInfo. All properties in the XML file can be validated.

The following sections describe many tests for field properties, size and content. Depending on the application context, one or more of the following methods may be useful:

// Simple tests:
.hasField(..) 
.hasField(..).ofType(..)
.hasField(..).withHeight()
.hasField(..).withWidth()
.hasFields()                    
.hasFields(..)                   
.hasNumberOfFields(..) 
.hasSignatureField(..)
.hasSignatureFields()            1

// Tests belonging to all fields:
.hasFields().withoutDuplicateNames()
.hasFields().allWithoutTextOverflow()   2

// Content of a field:
.hasField(..).withText().containing()
.hasField(..).withText().endingWith()
.hasField(..).withText().equalsTo()
.hasField(..).withText().matchingRegex()
.hasField(..).withText().notContaining()
.hasField(..).withText().notMatchingRegex()
.hasField(..).withText().startingWith()

// JavaScript associated to a field:
.hasField(..).withJavaScript().containing(...)

// Field properties:
.hasField(..).withProperty().checked()
.hasField(..).withProperty().editable()
.hasField(..).withProperty().exportable()
.hasField(..).withProperty().multiLine()
.hasField(..).withProperty().multiSelect()
.hasField(..).withProperty().notExportable()
.hasField(..).withProperty().notSigned()
.hasField(..).withProperty().notVisibleInPrint()
.hasField(..).withProperty().notVisibleOnScreen()
.hasField(..).withProperty().optional()
.hasField(..).withProperty().passwordProtected()
.hasField(..).withProperty().readOnly()
.hasField(..).withProperty().required()
.hasField(..).withProperty().signed()
.hasField(..).withProperty().singleLine()
.hasField(..).withProperty().singleSelect()
.hasField(..).withProperty().unchecked()
.hasField(..).withProperty().visibleInPrint()
.hasField(..).withProperty().visibleOnScreen()
.hasField(..).withProperty().visibleOnScreenAndInPrint()

1  This test is described separately in chapter 3.28: “Signed PDF”.

2  This test is described separately in chapter 3.14: “Form Fields - Text Overflow”.

Existence of Fields

The following test verifies whether or not fields exist:

@Test
public void hasFields() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  AssertThat.document(filename)
            .hasFields()  // throws an exception when no fields exist
  ;
}

Name of Fields

Because the content of a field is accessed by the field's name, it is worth checking that the expected names exist:

@Test
public void hasField_MultipleInvocation() throws Exception {
  String filename = "documentUnderTest.pdf";
  String fieldname1 = "name";
  String fieldname2 = "address";
  String fieldname3 = "postal_code";
  String fieldname4 = "email";
  
  AssertThat.document(filename)
            .hasField(fieldname1)
            .hasField(fieldname2)
            .hasField(fieldname3)
            .hasField(fieldname4)
  ;
}

The same result can be achieved by passing several field names to the method hasFields(..):

@Test
public void hasFields() throws Exception {
  String filename = "documentUnderTest.pdf";
  String fieldname1 = "name";
  String fieldname2 = "address";
  String fieldname3 = "postal_code";
  String fieldname4 = "email";
  
  AssertThat.document(filename)
            .hasFields(fieldname1, fieldname2, fieldname3, fieldname4)
  ;
}

Duplicate field names are allowed by the PDF specification, but they are a likely source of surprises later in the workflow. PDFUnit therefore provides a method to check that no duplicate names exist:

@Test
public void hasFields_WithoutDuplicateNames() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  AssertThat.document(filename)
            .hasFields()
            .withoutDuplicateNames()
  ;
}
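
To illustrate what such a check amounts to, the following is a minimal sketch in plain Java that is independent of PDFUnit. The class DuplicateFieldNames and its method find(..) are hypothetical names chosen for this illustration, and the list of field names is assumed to have been collected elsewhere. It does not show how PDFUnit implements the test.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration, not part of PDFUnit.
final class DuplicateFieldNames {

  // Returns every name that occurs more than once, in order of first repetition.
  static Set<String> find(List<String> fieldNames) {
    Set<String> seen = new HashSet<>();
    Set<String> duplicates = new LinkedHashSet<>();
    for (String name : fieldNames) {
      if (!seen.add(name)) {   // add() returns false if the name was already seen
        duplicates.add(name);
      }
    }
    return duplicates;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("name", "address", "email", "name");
    System.out.println(find(names));  // [name]
  }
}
```

An empty result corresponds to a document that would pass the test withoutDuplicateNames().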

Number of Fields

If you only need to verify the number of fields, you can use the method hasNumberOfFields(..):

@Test
public void hasNumberOfFields() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  AssertThat.document(filename)
            .hasNumberOfFields(4)
  ;
}

It can also be useful to ensure that a PDF document has no fields:

@Test
public void hasNumberOfFields_NoFieldsAvailable() throws Exception {
  String filename = "documentUnderTest.pdf";
  int zeroExpected = 0;
  
  AssertThat.document(filename)
            .hasNumberOfFields(zeroExpected)
  ;
}

Content of Fields

It is very simple to verify that a given field contains data:

@Test
public void hasField_WithAnyValue() throws Exception {
  String filename = "documentUnderTest.pdf";
  String fieldname = "ageField";
  
  AssertThat.document(filename)
            .hasField(fieldname)
            .withText()
  ;
}

To compare the actual content of a field with an expected string, the following methods are available:

.containing(..)
.endingWith(..)
.equalsTo(..)
.matchingRegex(..)
.notContaining(..)
.notMatchingRegex(..)
.startingWith(..)

The following examples should give you some ideas about how to use these methods:

@Test
public void hasField_EqualsTo() throws Exception {
  String filename = "documentUnderTest.pdf";
  String fieldname = "Text 1";
  String expectedValue = "Single Line Text";
  
  AssertThat.document(filename)
            .hasField(fieldname)
            .equalsTo(expectedValue)
  ;
}

/**
 * A small test that helps to protect fields against SQL injection.
 */
@Test
public void hasField_NotContaining_SQLComment() throws Exception {
  String filename = "documentUnderTest.pdf";
  String fieldname = "Text 1";
  String sqlCommentSequence = "--";
  
  AssertThat.document(filename)
            .hasField(fieldname) 
            .notContaining(sqlCommentSequence)
  ;
}

Whitespace is normalized before the expected and the actual field content are compared.
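
The exact normalization rules are not spelled out in this chapter. A common interpretation is to trim the text and to collapse every run of whitespace into a single space. The following sketch in plain Java assumes exactly that; the class WhitespaceNormalizer is a hypothetical name, and the rules PDFUnit actually applies may differ.

```java
// Hypothetical helper for illustration, not part of PDFUnit.
// The normalization rule (trim, then collapse runs) is an assumption.
final class WhitespaceNormalizer {

  // Trims the text and collapses each run of whitespace into one space.
  static String normalize(String text) {
    return text.trim().replaceAll("\\s+", " ");
  }

  public static void main(String[] args) {
    String expected = "Single Line Text";
    String actual = "  Single   Line\n Text ";
    System.out.println(normalize(expected).equals(normalize(actual)));  // true
  }
}
```

Under this assumption, differences in line breaks, tabs and repeated spaces do not cause a comparison to fail.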

Type of Fields

Each field has a type. Although a field type is not as important as the name, it can be tested with a special method:

@Test
public void hasField_WithType_MultipleInvocation() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  AssertThat.document(filename)
            .hasField("Text 25")       .ofType(TEXT)
            .hasField("Check Box 7")   .ofType(CHECKBOX)
            .hasField("Radio Button 4").ofType(RADIOBUTTON)
            .hasField("Button 19")     .ofType(PUSHBUTTON)
            .hasField("List Box 1")    .ofType(LIST)
            .hasField("List Box 1")    .ofType(CHOICE)
            .hasField("Combo Box 5")   .ofType(CHOICE)
            .hasField("Combo Box 5")   .ofType(COMBO)
  ;
}

The previous listing covers all testable field types except the signature field, because that document does not contain one. The document in the next listing has a signature field, which can be tested as well:

@Test
public void hasField_WithType_Signature() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  AssertThat.document(filename)
            .hasField("Signature2").ofType(SIGNATURE)
  ;
}

Detailed tests for signatures are described in chapter 3.28: “Signed PDF”.

Available field types are defined as constants in com.pdfunit.Constants. The names of the constants correspond to the typical names of visible elements of a graphical user interface, whereas the PDF standard uses other names for the types. The following list shows the association between PDFUnit constants and PDF-internal types. The PDF-internal names may appear in error messages:

// Mapping between PDFUnit-Constants and PDF-internal types.

PDFUnit-Constant    PDF-internal
--------------------------------
CHOICE          ->  "Ch"
COMBO           ->  "Ch"
LIST            ->  "Ch"
CHECKBOX        ->  "Btn"
PUSHBUTTON      ->  "Btn"
RADIOBUTTON     ->  "Btn"
SIGNATURE       ->  "Sig"
TEXT            ->  "Tx"
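
When an error message reports a PDF-internal type, it can help to look up which PDFUnit constants correspond to it. The following sketch only restates the table above as a plain Java enum. FieldTypeMapping is a hypothetical name chosen for this illustration and is not part of com.pdfunit.Constants.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lookup for illustration, not part of PDFUnit.
// It only restates the mapping table of this chapter.
enum FieldTypeMapping {
  CHOICE("Ch"), COMBO("Ch"), LIST("Ch"),
  CHECKBOX("Btn"), PUSHBUTTON("Btn"), RADIOBUTTON("Btn"),
  SIGNATURE("Sig"), TEXT("Tx");

  final String pdfType;

  FieldTypeMapping(String pdfType) {
    this.pdfType = pdfType;
  }

  // Returns the names of all constants that map to the given PDF-internal type.
  static List<String> constantsFor(String pdfType) {
    List<String> names = new ArrayList<>();
    for (FieldTypeMapping mapping : values()) {
      if (mapping.pdfType.equals(pdfType)) {
        names.add(mapping.name());
      }
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println(constantsFor("Btn"));  // [CHECKBOX, PUSHBUTTON, RADIOBUTTON]
  }
}
```

The lookup makes visible that several constants share one PDF-internal type, which is why "List Box 1" in the listing above passes as both LIST and CHOICE.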

Field Size

If the size of form fields is important, methods can be used to verify width and height:

@Test
public void hasField_WidthAndHeight() throws Exception {
  String filename = "documentUnderTest.pdf";
  String fieldname = "Title of 'someField'"; 
  int allowedDeltaForMillis = 2;
  AssertThat.document(filename)
            .hasField(fieldname)
            .withWidth(159, MILLIMETERS, allowedDeltaForMillis)
            .withHeight(11, MILLIMETERS, allowedDeltaForMillis)
  ;
}

Both methods accept one of two predefined units of measurement: points or millimeters. Because rounding is necessary, a tolerance is passed as the third parameter. When only the value is passed, the unit defaults to points and the tolerance to zero.

@Test
public void hasField_Width() throws Exception {
  String filename = "documentUnderTest.pdf";
  String fieldname = "Title of 'someField'"; 
  int allowedDeltaForPoints = 0;
  int allowedDeltaForMillis = 2;
  AssertThat.document(filename)
            .hasField(fieldname)
            .withWidth(450, POINTS, allowedDeltaForPoints)
            .withWidth(159, MILLIMETERS, allowedDeltaForMillis)
            .withWidth(450) // default is POINTS
  ;
}
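
The values in the listing above are consistent with the usual definition of a point as 1/72 inch, that is 25.4/72 millimeters: 450 points correspond to 158.75 mm, which lies within a tolerance of 2 around 159. The following sketch in plain Java shows the arithmetic. The class UnitConversion and its methods are hypothetical names for this illustration; how PDFUnit converts and rounds internally is not described here.

```java
// Hypothetical helper for illustration, not part of PDFUnit.
final class UnitConversion {

  static final double MM_PER_INCH = 25.4;
  static final double POINTS_PER_INCH = 72.0;

  // Converts a length in points (1/72 inch) to millimeters.
  static double pointsToMillimeters(double points) {
    return points * MM_PER_INCH / POINTS_PER_INCH;
  }

  // True if the actual value deviates from the expected one by at most delta.
  static boolean withinDelta(double expected, double actual, double delta) {
    return Math.abs(expected - actual) <= delta;
  }

  public static void main(String[] args) {
    double widthInMm = pointsToMillimeters(450);
    System.out.println(widthInMm);                       // 158.75
    System.out.println(withinDelta(159, widthInMm, 2));  // true
    System.out.println(withinDelta(159, widthInMm, 0));  // false
  }
}
```

The last line shows why a tolerance of zero is rarely practical for millimeters: a width of 450 points is not a whole number of millimeters.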

When you create a test, you probably do not know the dimensions of a field. That is not a problem: use any value for width and height and run the test. The resulting error message reports the real field size in millimeters.

Whether a text fits into a field cannot be predicted from font size and field size alone. In addition to the font size, the words at the end of each line determine the required number of rows and thus the required height, and the calculation has to consider hyphenation. Chapter 3.14: “Form Fields - Text Overflow” deals with this subject in detail.

Field Properties

Fields have more properties than just the size, for example editable and required. Since most of these properties cannot be tested manually, appropriate test methods have to be part of every PDF testing tool. The following example shows the principle:

@Test
public void hasField_Editable() throws Exception {
  String filename = "documentUnderTest.pdf";
  String fieldnameEditable = "Combo Box 4"; 
  
  AssertThat.document(filename)
            .hasField(fieldnameEditable) 
            .withProperty()
            .editable()
  ;
}

These are the available methods for verifying properties of form fields:

// Check field properties

// All methods following .withProperty():
.checked(),               .unchecked()
.editable(),              .readOnly()
.exportable(),            .notExportable()
.multiLine(),             .singleLine()
.multiSelect(),           .singleSelect()
.optional(),              .required()
.signed(),                .notSigned()
.visibleInPrint(),        .notVisibleInPrint()
.visibleOnScreen(),       .notVisibleOnScreen()

.visibleOnScreenAndInPrint()
.passwordProtected()

JavaScript Actions for Fields

When PDF documents are processed in a workflow, the input into fields is typically validated by constraints implemented in JavaScript, which prevents incorrect input.

PDFUnit can verify whether JavaScript is associated with a field. The expected content of the JavaScript can be validated with the method containing(..). Whitespace is ignored:

@Test
public void hasFieldWithJavaScript() throws Exception {
  String filename = "documentUnderTest.pdf";
  String fieldname = "Calc1_A"; 
  
  String scriptText = "AFNumber_Keystroke";
  AssertThat.document(filename)
            .hasField(fieldname)
            .withJavaScript()
            .containing(scriptText)
  ;
}

Unicode

When tools for creating PDF do not handle Unicode sequences properly, it is difficult to test those sequences. But difficult does not mean impossible. The following picture shows the name of a field in the encoding UTF-16BE with a Byte Order Mark (BOM) at the beginning:

Although it is tricky, the name of this field can be tested as a Java Unicode sequence:

@Test
public void hasField_nameContainingUnicode_UTF16() throws Exception {
  String filename = "documentUnderTest.pdf";
  String fieldName = 
  //                     F           o           r           m           R
     "\u00fe\u00ff\u0000\u0046\u0000\u006f\u0000\u0072\u0000\u006d\u0000\u0052" +
  //         o           o           t           [           0           ]
     "\u0000\u006f\u0000\u006f\u0000\u0074\u0000\u005b\u0000\u0030\u0000\u005d" +
  //    
     "\u002e"                                                                   +
  //                     P           a           g           e           1
     "\u00fe\u00ff\u0000\u0050\u0000\u0061\u0000\u0067\u0000\u0065\u0000\u0031" +
  //         [           0           ] 
     "\u0000\u005b\u0000\u0030\u0000\u005d"                                     +
  //    
     "\u002e"                                                                   +
  //                     P           r           i           n           t
     "\u00fe\u00ff\u0000\u0050\u0000\u0072\u0000\u0069\u0000\u006e\u0000\u0074" +
  //         F           o           r           m           B           u
     "\u0000\u0046\u0000\u006f\u0000\u0072\u0000\u006d\u0000\u0042\u0000\u0075" +
  //         t           t           o           n           [           0
     "\u0000\u0074\u0000\u0074\u0000\u006f\u0000\u006e\u0000\u005b\u0000\u0030" +
  //         ]
     "\u0000\u005d";
  
  AssertThat.document(filename)
            .hasField(fieldName)
  ; 
}
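
Writing such a sequence by hand is error-prone. The following sketch in plain Java builds the same kind of string from readable name segments: each segment is encoded as UTF-16BE with a leading BOM, and every resulting byte is mapped to one Java char. The class Utf16FieldName and its methods are hypothetical names for this illustration and are not part of PDFUnit. The sketch assumes that each segment carries its own BOM and that the segments are joined by a plain dot, as in the test above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration, not part of PDFUnit.
final class Utf16FieldName {

  // Encodes one name segment as UTF-16BE with a leading BOM and maps each
  // resulting byte to one Java char (ISO-8859-1), so 'F' becomes 0x00 0x46.
  static String segment(String readable) {
    byte[] bytes = ("\uFEFF" + readable).getBytes(StandardCharsets.UTF_16BE);
    return new String(bytes, StandardCharsets.ISO_8859_1);
  }

  // Joins the encoded segments with a plain '.' character.
  static String fieldName(String... segments) {
    StringBuilder name = new StringBuilder();
    for (int i = 0; i < segments.length; i++) {
      if (i > 0) {
        name.append('.');
      }
      name.append(segment(segments[i]));
    }
    return name.toString();
  }

  public static void main(String[] args) {
    String name = fieldName("FormRoot[0]", "Page1[0]", "PrintFormButton[0]");
    System.out.println(name.length());  // 82
  }
}
```

The resulting string has 82 characters, the same number as the hand-written sequence in the test above, and could be passed to hasField(..) in its place.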

More information about Unicode and the Byte Order Mark can be found on Wikipedia.