Checkpoint 06: Metadata

PDFs must contain proper XMP metadata that identifies the document and declares PDF/UA conformance. This metadata helps users and assistive technologies understand and navigate documents.

What This Means

XMP (Extensible Metadata Platform) metadata is structured information embedded within the PDF that describes the document. For PDF/UA compliance, four things are required:

XMP metadata stream: The technical container for metadata
PDF/UA identifier: A declaration that the document claims PDF/UA conformance
dc:title: A Dublin Core title element that identifies the document
Meaningful title: The title must actually describe the document content

The metadata serves multiple purposes:

Identification: Helps users know what document they have opened
Accessibility declaration: Signals to assistive technology that accessibility features are present
Document management: Enables proper filing, searching, and cataloging
Screen reader behavior: The title is often announced when opening the document

Why It Matters

Document metadata significantly impacts the user experience for people using assistive technology:

Title announcement: Screen readers announce the document title when opening a PDF. A meaningful title helps users confirm they have the right document.
Tab labels: In browsers and PDF readers, the title appears in tabs and window titles
Search results: Document titles appear in search results, helping users find content
PDF/UA conformance: The PDF/UA identifier tells assistive technology to expect accessible features

Without proper metadata:

Screen readers may announce "Untitled" or a cryptic filename
Users cannot easily identify documents in tabs or task bars
Software that looks for the PDF/UA identifier will not recognize the document as claiming conformance
Accessibility validators cannot confirm PDF/UA conformance

Consider the difference between a screen reader announcing "document1-final-v3.pdf" and one announcing "2024 Annual Accessibility Report - Beacon Corporation". The metadata replaces an opaque filename with a useful identification.

Common Violations

The Matterhorn Protocol defines four failure conditions for metadata. The first three are machine testable, while the fourth requires human judgment.

06-001: Document Does Not Contain XMP Metadata Stream (Machine Testable)

What's Wrong: The PDF lacks an XMP metadata stream entirely. Without this technical structure, no standard metadata can be stored or read.

How to Identify:

PDF/UA validators will flag this automatically
In Acrobat, go to File > Properties > Description and click Additional Metadata to inspect the XMP data
If no XMP data is shown there, the stream may be missing

Why This Happens:

Very old PDF creation tools that predate XMP
PDFs created by minimal or non-standard software
Corrupted PDF files
PDFs generated by command-line tools without metadata options
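
As a rough first pass outside a validator, the raw file bytes can be scanned for an XMP packet. The sketch below is a heuristic, not a conformance test: it assumes the metadata stream is stored uncompressed and unencrypted, and it cannot tell a document-level packet from one attached to an embedded image. The name find_xmp_packet is made up for this example.

```python
import re
from typing import Optional

# An XMP packet is normally wrapped in <x:xmpmeta ...> ... </x:xmpmeta>.
_XMP_PACKET = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)


def find_xmp_packet(pdf_bytes: bytes) -> Optional[bytes]:
    """Return the first XMP packet found in the raw bytes, or None.

    Assumption: the metadata stream is not compressed or encrypted.
    A None result therefore means "not found by this scan", not
    "definitely fails 06-001".
    """
    match = _XMP_PACKET.search(pdf_bytes)
    return match.group(0) if match else None


def load_and_scan(path: str) -> Optional[bytes]:
    """Read a file from disk and scan it for an XMP packet."""
    with open(path, "rb") as handle:
        return find_xmp_packet(handle.read())
```

A missing packet here is a reason to run a proper PDF/UA validator, which reads the document catalog instead of scanning bytes.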

06-002: Metadata Stream Does Not Include PDF/UA Identifier (Machine Testable)

What's Wrong: The XMP metadata exists but does not include the PDF/UA conformance identifier. This identifier declares that the document claims to meet PDF/UA requirements.

How to Identify:

PDF/UA validators will detect this automatically
The identifier should be in the pdfuaid namespace (http://www.aiim.org/pdfua/ns/id/)
Look for pdfuaid:part with value "1" (for PDF/UA-1)

Technical Details: The PDF/UA identifier should appear in XMP as:

<pdfuaid:part>1</pdfuaid:part>

This tells readers and validators: "This PDF claims PDF/UA-1 conformance."
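
Once the XMP packet is available as text, the identifier can be looked up with a standard XML parser. This is a minimal sketch using Python's built-in ElementTree; the function name pdfua_part is hypothetical, and it handles the two serializations XMP allows for a simple property (child element or attribute on rdf:Description).

```python
import xml.etree.ElementTree as ET
from typing import Optional

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFUAID_NS = "http://www.aiim.org/pdfua/ns/id/"


def pdfua_part(xmp_xml: str) -> Optional[str]:
    """Return the pdfuaid:part value from an XMP packet, or None if absent."""
    root = ET.fromstring(xmp_xml)
    # Form 1: the property is written as a child element.
    element = root.find(f".//{{{PDFUAID_NS}}}part")
    if element is not None and element.text and element.text.strip():
        return element.text.strip()
    # Form 2: the property is written as an attribute on rdf:Description.
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        value = description.get(f"{{{PDFUAID_NS}}}part")
        if value and value.strip():
            return value.strip()
    return None


def claims_pdfua1(xmp_xml: str) -> bool:
    """True when the packet declares PDF/UA-1 (pdfuaid:part equal to 1)."""
    return pdfua_part(xmp_xml) == "1"
```

A True result only confirms that the claim is present; it says nothing about whether the document actually meets the other checkpoints.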

06-003: Metadata Stream Does Not Contain dc:title (Machine Testable)

What's Wrong: The XMP metadata exists but lacks a dc:title element. Dublin Core title is a standard metadata field that should contain the document's human-readable title.

How to Identify:

PDF/UA validators will flag this
In Acrobat, go to File > Properties > Description
Check the "Title" field; if it is empty, dc:title is most likely not set

Common Causes:

PDF exported without entering document properties
Source document (Word, InDesign) has no title set
Automated PDF generation that does not populate metadata
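
In XMP, dc:title is a language alternative: an rdf:Alt container holding one rdf:li per language, with xml:lang="x-default" marking the default. The sketch below reads it with Python's ElementTree. The function name dc_title is an assumption for illustration, and treating a whitespace-only entry as missing is a choice made here, not something the checkpoint text spells out.

```python
import xml.etree.ElementTree as ET
from typing import Optional

DC_NS = "http://purl.org/dc/elements/1.1/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def dc_title(xmp_xml: str) -> Optional[str]:
    """Return the document title from an XMP packet, or None if there is none."""
    root = ET.fromstring(xmp_xml)
    title = root.find(f".//{{{DC_NS}}}title")
    if title is None:
        return None
    entries = title.findall(f".//{{{RDF_NS}}}li")
    # Prefer the x-default entry, then fall back to any non-empty entry.
    preferred = [li for li in entries if li.get(XML_LANG) == "x-default"]
    for li in preferred + entries:
        if li.text and li.text.strip():
            return li.text.strip()
    return None
```

A None result corresponds to failure condition 06-003; a non-empty result still has to pass the human check described under 06-004.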

06-004: dc:title Does Not Clearly Identify the Document (Human Testing)

What's Wrong: A dc:title exists but does not meaningfully describe the document. Examples of poor titles:

"Untitled"
"Document1"
"Microsoft Word - report.docx"
"New Document"
"PDF"
Random characters or codes

How to Identify:

This requires human judgment
Read the title and ask: "Would a user know what this document is from the title alone?"
Consider whether the title is descriptive, accurate, and useful
Check if the title matches or relates to the actual document content

Examples of Good vs. Poor Titles:

Poor Title	Better Title
Untitled	2024 Q3 Financial Report
Document1.docx	Employee Handbook - Beacon Corp
Microsoft Word - policy	Travel Expense Policy v2.1
Report	Website Accessibility Audit - January 2024
form	Tax Form W-4 (2024)
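
Although 06-004 needs a human reviewer, the obvious offenders from the lists above can be pre-screened automatically before review. The following sketch is one possible set of heuristics; the word lists, patterns, and the name title_warnings are illustrative assumptions, and a title that passes them is not necessarily a good one.

```python
import re
from typing import List

# Illustrative lists only; extend them for your own document sources.
GENERIC_TITLES = {"untitled", "new document", "pdf", "report", "form", "document"}
NUMBERED_DEFAULT = re.compile(
    r"^(document|untitled|book|presentation)[ _-]?\d+$", re.IGNORECASE
)
APP_PREFIX = re.compile(r"^(microsoft|ms) ?(word|excel|powerpoint) - ", re.IGNORECASE)
FILE_EXTENSION = re.compile(r"\.(pdf|docx?|xlsx?|pptx?|indd|odt|rtf|txt)$", re.IGNORECASE)


def title_warnings(title: str) -> List[str]:
    """Return a list of reasons why a title looks suspicious (empty list if none)."""
    text = title.strip()
    if not text:
        return ["title is empty"]
    warnings: List[str] = []
    if text.lower() in GENERIC_TITLES or NUMBERED_DEFAULT.match(text):
        warnings.append("generic placeholder")
    if APP_PREFIX.match(text):
        warnings.append("application name prefix")
    if FILE_EXTENSION.search(text):
        warnings.append("looks like a filename")
    return warnings


if __name__ == "__main__":
    for candidate in ["Untitled", "Document1.docx", "Microsoft Word - policy",
                      "2024 Q3 Financial Report"]:
        print(f"{candidate!r}: {title_warnings(candidate) or 'no automatic warnings'}")
```

Titles that produce warnings can be sent back for correction right away; everything else still goes to a person who reads the title against the document content.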

How to Fix in Adobe Acrobat

Adobe Acrobat provides straightforward tools for setting document metadata.

Setting Basic Metadata

Open your PDF in Adobe Acrobat Pro
Go to File > Properties (or press Ctrl/Cmd + D)
On the Description tab:
- Enter a clear, descriptive Title
- Add Author, Subject, and Keywords as appropriate
Click OK to save

Verifying Metadata Exists

Go to File > Properties
Click the Description tab
Verify the Title field has a meaningful value
Click Additional Metadata to view the full XMP structure
Verify metadata is present and formatted correctly

Adding PDF/UA Identifier

The PDF/UA identifier typically requires special handling:

Method 1: Use the Accessibility Action Wizard

Go to Tools > Action Wizard
If available, use a PDF/UA remediation action
This automatically adds the identifier when making PDFs accessible

Method 2: Use Preflight

Go to Tools > Print Production > Preflight
Search for "PDF/UA" profiles
Run the Set PDF/UA-1 entry fixup
This adds the required identifier to metadata

Method 3: Manual XMP Editing (Advanced)

Go to File > Properties > Additional Metadata
Click Advanced tab
Add or modify the pdfuaid:part property
This requires understanding XMP structure
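
For orientation when editing XMP by hand, a minimal packet carrying both the title and the PDF/UA identifier can be sketched as follows. The Python helper only produces the XML text; it does not write anything into a PDF, and the name build_xmp_packet is hypothetical. Real packets usually contain many more properties (dates, producer, and so on), which should be preserved when editing.

```python
from xml.sax.saxutils import escape

# Minimal XMP packet with a Dublin Core title and the PDF/UA-1 identifier.
XMP_TEMPLATE = """<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
   <dc:title>
    <rdf:Alt>
     <rdf:li xml:lang="x-default">{title}</rdf:li>
    </rdf:Alt>
   </dc:title>
   <pdfuaid:part>1</pdfuaid:part>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>"""


def build_xmp_packet(title: str) -> str:
    """Return XMP text for the given title, with XML special characters escaped."""
    cleaned = title.strip()
    if not cleaned:
        raise ValueError("title must not be empty")
    return XMP_TEMPLATE.format(title=escape(cleaned))


if __name__ == "__main__":
    print(build_xmp_packet("Travel Expense Policy v2.1"))
```

The dc:title value sits inside an rdf:Alt container because XMP stores titles per language; the pdfuaid:part value is a plain property.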

Checking Metadata with Preflight

Go to Tools > Print Production > Preflight
Select a PDF/UA validation profile
Run the check
Review results for metadata-related failures
Use built-in fixups to correct issues

How to Fix in Microsoft Word

Setting document properties in Word before PDF export ensures metadata transfers correctly.

Setting Document Properties

Click File in the ribbon
On the Info page, look at the Properties panel on the right
Click Properties > Advanced Properties
On the Summary tab:
- Enter a clear, descriptive Title
- Add Subject, Author, Keywords as appropriate
Click OK

Quick Properties Access

Click File > Info
On the right panel, you can directly edit:
- Title
- Tags (keywords)
- Comments
Click the arrow next to "Properties" for more fields

Verifying Before Export

Go to File > Info
Check that Title shows your document title, not "Add a title"
Verify other metadata fields are populated appropriately
Export to PDF

PDF Export Options

Go to File > Save As and choose PDF
Click Options
Ensure "Document properties" is checked under "Include non-printing information"
This transfers Word metadata to the PDF

Using Document Panel

For frequent metadata editing:

Go to File > Info > Properties > Show Document Panel (available in Word 2013 and earlier; the panel was removed in later versions)
This adds a metadata panel to your document view
Edit properties directly while working

How to Fix in Other Applications

Adobe InDesign

Go to File > File Info
On the Description tab:
- Enter a Title
- Add other metadata as needed
Click OK
When exporting to PDF, metadata transfers automatically

LibreOffice

Go to File > Properties
On the Description tab, enter Title and other metadata
Click OK
Export to PDF; properties should transfer

Google Docs

Note: Google Docs does not have traditional document properties
The document name becomes the PDF title
After downloading as PDF, edit metadata in Acrobat
Or use a PDF post-processor to add metadata

Command-Line Tools

Using ExifTool to add metadata:

exiftool -Title="Document Title" -Author="Author Name" file.pdf
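
When many files need the same treatment, the ExifTool call can be scripted. The sketch below builds the command in Python and names the target groups explicitly: -XMP-dc:Title for the XMP Dublin Core title and -PDF:Title for the document information dictionary. The helper names are hypothetical, ExifTool must be installed separately, and this does not add the PDF/UA identifier.

```python
import shutil
import subprocess
from typing import List


def exiftool_command(pdf_path: str, title: str, author: str = "") -> List[str]:
    """Build the ExifTool argument list for setting title (and optionally author)."""
    command = [
        "exiftool",
        f"-XMP-dc:Title={title}",  # XMP dc:title, the field PDF/UA checks
        f"-PDF:Title={title}",     # Info dictionary title, kept in sync
    ]
    if author:
        command.append(f"-Author={author}")
    command.append(pdf_path)
    return command


def set_pdf_title(pdf_path: str, title: str, author: str = "") -> None:
    """Run ExifTool on the file; raises if ExifTool is missing or fails."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool was not found on PATH")
    # By default ExifTool keeps a backup copy with an "_original" suffix.
    subprocess.run(exiftool_command(pdf_path, title, author), check=True)
```

Passing the arguments as a list avoids shell quoting problems with titles that contain spaces or quotation marks.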

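Using QPDF to rewrite the file after another tool has set the metadata. This command does not add or change metadata itself; it linearizes file.pdf in place and regenerates object streams:

qpdf --replace-input --linearize --object-streams=generate file.pdf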

Testing Your Fix

Automated Testing

Adobe Acrobat:

Go to Tools > Accessibility > Accessibility Check
Run the check
In the results, review the Title rule under the Document category
Acrobat's Accessibility Check does not validate the PDF/UA identifier; use a Preflight PDF/UA profile for that

PAC (PDF Accessibility Checker):

Open the PDF in PAC
Run the PDF/UA check
Review Checkpoint 06 results
Check all four failure conditions

veraPDF:

Select PDF/UA-1 validation profile
Run validation
Review metadata rule results

Manual Verification

Open the PDF in Acrobat Reader or similar
Go to File > Properties
Check that:
- Title is displayed and meaningful
- The title clearly identifies the document
- The title is not a filename or generic placeholder

Screen Reader Testing

Open the PDF with a screen reader
Listen for the document title announcement
Verify the announced title is meaningful
Check if the title helps identify the document

Tab/Window Title Check

Open the PDF in a web browser
Look at the browser tab
Verify the tab shows a meaningful title
Open in Acrobat Reader and check the window title

Validation Checklist

XMP metadata stream exists
PDF/UA identifier is present (pdfuaid:part = 1)
dc:title element exists
Title is meaningful and descriptive
Title clearly identifies the document content
Title is not a filename or placeholder
Screen reader announces a useful title
Tab/window shows the title

Title Best Practices

What Makes a Good Title

A good document title:

Identifies the content: States what the document is about
Is specific: Distinguishes this document from similar ones
Is concise: Long enough to be clear, short enough to be readable
Includes context: Date, version, or organization if relevant
Is human-readable: Uses natural language, not codes

Title Writing Guidelines

Start with the document type or content: "Annual Report", "Policy Manual", "Tax Form"
Add specificity: What year? What department? What topic?
Include organization if needed: Especially for external documents
Add date/version for time-sensitive content: "(January 2024)", "v2.0"

Examples by Document Type

Type	Example Title
Report	Quarterly Sales Report - Q3 2024
Policy	Information Security Policy v3.1
Form	Employee Leave Request Form
Manual	User Guide - Document Management System
Presentation	Board Meeting Presentation - December 2024
Legal	Terms of Service - Beacon Software Inc.
Academic	Introduction to Accessible Design - Course Syllabus

Additional Resources

Official Standards and Guidelines

XMP and PDF Metadata

Tools

PAC (PDF Accessibility Checker) - Free PDF/UA validation
veraPDF - Open-source PDF/A and PDF/UA validator
ExifTool - Command-line metadata editor
QPDF - PDF manipulation tool

This documentation is based on the Matterhorn Protocol 1.02, the definitive reference for PDF/UA validation. Three of the four metadata violations are machine testable; the meaningfulness of the title requires human judgment. For the most current information, consult the PDF Association and W3C WCAG guidelines.
