Accessible PDFs made easier

Written By: Peter Abrahams
Content Copyright © 2008 Bloor. All Rights Reserved.

There are hundreds of millions of PDF documents on the web. It is such a popular format because it can be viewed on most devices that attach to the web and because it looks the same on any of these devices. The producer of the document can be sure that what the reader sees is the same as that which was created. The documents contain a great deal of useful, interesting and important information, much of which is not available on-line in any other format. This information will be of interest to people with disabilities.

Adobe recognised in 2001 that the information in PDF files needed to be accessible to people with disabilities, and extended the format with a set of tags that make the files accessible to assistive technologies, such as screen readers. However, as we approach 2009, most of the PDF files on the web are still not accessible. The reasons for this include:

  • The producers are not aware that it is possible to create accessible PDF.
  • The producers are not aware of the benefits of creating accessible PDF.
  • Older PDF documents were created before 2001.
  • Some authoring tools cannot produce accessible PDF directly.
  • Other authoring tools can produce accessible PDF but it is not the default and the process is not simple.
  • Tools for testing documents for accessibility and providing remediation have been very low-level technical tools. They could only be used by specialists with significant time available. They have not been usable by the majority of authors.

In 2008 a variety of authoring tools improved their ability to produce PDFs tagged for accessibility. However, few of them get it absolutely right, so there is a need to test and then touch up the document. Other tools do not produce tagged files, and of course there are a vast number of existing files that are not tagged.

There is therefore a need for a tool to test PDF files to see if they are accessible, and a related remediation tool to fix any issues. Adobe provides these functions as part of Acrobat. However, there are two limitations to this support:

  • There is a disclaimer on the testing tool that says: ‘The Accessibility Checker can help you identify areas of your documents that may be in conflict with Adobe’s interpretations of the referenced guidelines. However, the Accessibility Checker does not check all accessibility guidelines and criteria, including those in such referenced guidelines, and Adobe does not warrant that your documents will comply with any specific guidelines or regulations.’
  • The remediation tools are editors that enable changes to be made to the internal structures of the documents including tags and content. These editors enable any required changes to be made; but they require a high level of skill, understanding, patience and time to create a fully accessible document. I believe that most document producers will struggle to make anything but minor changes using these tools.

NetCentric is a Canadian firm specialising in document compliance and has gained considerable experience providing consultancy and services to convert PDF documents into accessible PDF documents. Based on this experience they have developed CommonLook for Adobe Acrobat, a plug-in to Adobe Acrobat, which provides a more comprehensive testing tool and a higher-level interface for remediation.

The tools work on tagged documents, so if the document is not tagged, Acrobat is used first to add an initial set of tags.
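To illustrate what "tagged" means at the file level, here is a minimal sketch, not how Acrobat or CommonLook detect tags. It relies on two keys from the PDF format: a tagged PDF's catalog points to a structure tree via /StructTreeRoot and declares /MarkInfo with /Marked true. The function name and the sample byte strings are hypothetical, and the scan is only a rough heuristic: it can miss files whose catalog is stored in a compressed object stream.

```python
import re


def looks_tagged(pdf_bytes: bytes) -> bool:
    """Rough heuristic: report True if the raw bytes mention both a
    structure tree and a /Marked true flag. Not a full PDF parse."""
    has_struct_tree = b"/StructTreeRoot" in pdf_bytes
    is_marked = re.search(rb"/Marked\s+true", pdf_bytes) is not None
    return has_struct_tree and is_marked


# Hypothetical catalog fragments standing in for real file contents.
tagged = (b"1 0 obj << /Type /Catalog /MarkInfo << /Marked true >> "
          b"/StructTreeRoot 5 0 R >> endobj")
untagged = b"1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj"

print(looks_tagged(tagged), looks_tagged(untagged))
```

A file that fails this check would be a candidate for Acrobat's initial tagging step; a file that passes still needs its tags tested, since the presence of tags says nothing about their quality.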

The Logical Structure Editor is the main remediation tool. It is divided into two panes, one showing the source document and the other showing a simplified internal format. When an area of the source is outlined, the equivalent area of the internal format is highlighted. The highlighted area can then be tagged or dragged to its correct logical position.

On complex page layouts, for example one with multiple columns where headings and images cross multiple columns, Acrobat is not always capable of ascertaining the correct reading order. The Logical Structure Editor allows the page to be marked up with the correct order. It will bring together sections of text that Acrobat assumed were in separate columns, or divide a section of text into more than one column where Acrobat assumed that the text read straight across the page. Trying to fix this type of reading-order error without CommonLook is very difficult and prone to error.

The Logical Structure Editor also has functions to create and modify tables. Once a table is created, areas of text can be highlighted and dragged into the correct cell. Again, with Acrobat by itself this is a task that only a highly trained expert can do.

The editor includes a tool to automatically correct common errors such as the last word on one line being concatenated with the first word of the next line. These errors are often difficult to spot by hand and are very tedious to correct.
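The kind of error described above can be sketched in a few lines. This is an illustrative assumption about how such a correction might work, not CommonLook's actual method: a token that is not a known word is split at the first point where both halves are known words. The function name and the tiny vocabulary are hypothetical, and punctuation is not handled.

```python
def split_joined_words(text: str, vocabulary: set[str]) -> str:
    """Split tokens that look like two known words run together."""
    fixed = []
    for token in text.split():
        word = token.lower()
        if word in vocabulary:
            fixed.append(token)
            continue
        # Try every split point; accept the first where both halves are known.
        for i in range(1, len(token)):
            if word[:i] in vocabulary and word[i:] in vocabulary:
                fixed.append(token[:i] + " " + token[i:])
                break
        else:
            # No valid split found: leave the token unchanged.
            fixed.append(token)
    return " ".join(fixed)


# Hypothetical vocabulary; a real tool would need a full dictionary.
VOCAB = {"the", "reading", "order", "of", "page", "is", "correct"}

print(split_joined_words("the reading orderof the page is correct", VOCAB))
```

Even this toy version shows why the errors are tedious to fix by hand: each suspicious token has to be found and checked against a dictionary, which is easy for software and slow for a person.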

The Testing Tool is based around the Section 508 checklist. It provides automated checks where that is possible and identifies areas of the document that require some human checking. For example, for each image it will show the image and the alternate text for the figure and ask the tester to confirm that the text is correct (pass) or not (fail). The tester can update the text at this point so that it will pass, or add a comment as to why it failed. The tests include one to check whether the document is readable by a person with colour-blindness.

When the test is complete CommonLook creates a report showing all the tests carried out on the document and the status. The report can be used as proof that testing is complete or handed back to the author to deal with the areas of failure.
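The mix of automated and human-confirmed checks, and the report built from them, can be pictured with a small data structure. This is a sketch under assumptions: the class, field names and report layout are hypothetical and do not reflect CommonLook's real report format.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str          # what was checked, e.g. alternate text for a figure
    automated: bool    # True if no human judgement was needed
    passed: bool
    comment: str = ""  # tester's note, typically explaining a failure


def summarise(results: list[CheckResult]) -> str:
    """Build a plain-text report: one line per check, then a total."""
    lines = []
    for r in results:
        status = "PASS" if r.passed else "FAIL"
        kind = "auto" if r.automated else "manual"
        line = f"{status} [{kind}] {r.name}"
        if r.comment:
            line += f" ({r.comment})"
        lines.append(line)
    failures = sum(1 for r in results if not r.passed)
    lines.append(f"{len(results)} checks, {failures} failed")
    return "\n".join(lines)


results = [
    CheckResult("Document is tagged", automated=True, passed=True),
    CheckResult("Alternate text for figure 1", automated=False, passed=False,
                comment="text describes the wrong image"),
]
print(summarise(results))
```

Keeping the tester's comment alongside each failure is what makes such a report useful when it is handed back to the author.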

Using CommonLook effectively requires some training. After about an hour I began to feel fairly confident, and some further clarification of detailed issues will make me fully proficient. Testing and remediation using CommonLook can still be time-consuming, as each section of the document needs to be checked; however, it is much faster, less prone to error, and requires less skill than using Acrobat by itself.

I would recommend that any organisation that creates significant numbers of PDF documents incorporate CommonLook for Adobe Acrobat into its publication process.