Checkpoint 06: Metadata
PDFs must contain proper XMP metadata that identifies the document and declares PDF/UA conformance. This metadata helps users and assistive technologies understand and navigate documents.
What This Means
XMP (Extensible Metadata Platform) metadata is structured information embedded within the PDF that describes the document. For PDF/UA compliance, this metadata must include:
- XMP metadata stream: The technical container for metadata
- PDF/UA identifier: A declaration that the document claims PDF/UA conformance
- dc:title: A Dublin Core title element that identifies the document
- Meaningful title: The title must actually describe the document content
The metadata serves multiple purposes:
- Identification: Helps users know what document they have opened
- Accessibility declaration: Signals to assistive technology that accessibility features are present
- Document management: Enables proper filing, searching, and cataloging
- Screen reader behavior: The title is often announced when opening the document
Why It Matters
Document metadata significantly impacts the user experience for people using assistive technology:
- Title announcement: Screen readers typically announce the document title when a PDF is opened. A meaningful title helps users confirm they have the right document.
- Tab labels: In browsers and PDF readers, the title appears in tabs and window titles
- Search results: Document titles appear in search results, helping users find content
- PDF/UA conformance: The PDF/UA identifier tells assistive technology to expect accessible features
Without proper metadata:
- Screen readers may announce "Untitled" or a cryptic filename
- Users cannot easily identify documents in tabs or task bars
- PDF readers may not enable accessibility features
- Accessibility validators cannot confirm PDF/UA conformance
Consider the difference between a screen reader announcing "document1-final-v3.pdf" versus "2024 Annual Accessibility Report - Beacon Corporation". The metadata transforms an opaque filename into useful identification.
Common Violations
The Matterhorn Protocol defines four failure conditions for metadata. The first three are machine testable, while the fourth requires human judgment.
06-001: Document Does Not Contain XMP Metadata Stream (Machine Testable)
What's Wrong: The PDF lacks an XMP metadata stream entirely. Without this technical structure, no standard metadata can be stored or read.
How to Identify:
- PDF/UA validators will flag this automatically
- In Acrobat, go to File > Properties > Description and check for basic document properties
- Click Additional Metadata; if no XMP data is shown, the stream may be missing
Why This Happens:
- Very old PDF creation tools that predate XMP
- PDFs created by minimal or non-standard software
- Corrupted PDF files
- PDFs generated by command-line tools without metadata options
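As a rough first check for 06-001, the raw bytes of a PDF can be searched for an XMP packet. The sketch below assumes the metadata stream is stored uncompressed and unencrypted and uses the conventional `x:` prefix; the function name `find_xmp_packet` is hypothetical. A validator remains the authoritative test.

```python
import re
from typing import Optional

# Matches an XMP packet body from <x:xmpmeta ...> to its closing tag.
_XMP_PATTERN = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)


def find_xmp_packet(pdf_bytes: bytes) -> Optional[str]:
    """Return the first uncompressed XMP packet found in the file, or None.

    Heuristic only: a compressed or encrypted metadata stream is invisible
    to this byte search, so None does not prove failure condition 06-001.
    """
    match = _XMP_PATTERN.search(pdf_bytes)
    if match is None:
        return None
    return match.group(0).decode("utf-8", errors="replace")


# Hypothetical usage; "report.pdf" is a placeholder path.
# with open("report.pdf", "rb") as fh:
#     packet = find_xmp_packet(fh.read())
#     print("XMP found" if packet else "No uncompressed XMP packet found")
```

A `None` result is a prompt to run a PDF/UA validator, not a verdict on its own.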
06-002: Metadata Stream Does Not Include PDF/UA Identifier (Machine Testable)
What's Wrong: The XMP metadata exists but does not include the PDF/UA conformance identifier. This identifier declares that the document claims to meet PDF/UA requirements.
How to Identify:
- PDF/UA validators will detect this automatically
- The identifier should be in the pdfuaid namespace
- Look for pdfuaid:part with value "1" (for PDF/UA-1)
Technical Details: The PDF/UA identifier should appear in XMP as:
<pdfuaid:part>1</pdfuaid:part>
This tells readers and validators: "This PDF claims PDF/UA-1 conformance."
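A minimal sketch of how the identifier could be read from an XMP packet with Python's standard library. It assumes the packet text has already been extracted from the PDF, and it handles both the element form shown above and the equivalent attribute form on `rdf:Description`. The function name `pdfua_part` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

PDFUAID_NS = "http://www.aiim.org/pdfua/ns/id/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PART = f"{{{PDFUAID_NS}}}part"


def pdfua_part(xmp_xml: str) -> Optional[str]:
    """Return the pdfuaid:part value from an XMP packet, or None if absent."""
    root = ET.fromstring(xmp_xml)
    # Element form: <pdfuaid:part>1</pdfuaid:part>
    for elem in root.iter(PART):
        if elem.text and elem.text.strip():
            return elem.text.strip()
    # Attribute form: <rdf:Description pdfuaid:part="1" ...>
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        value = desc.get(PART)
        if value and value.strip():
            return value.strip()
    return None
```

A return value of `"1"` corresponds to a PDF/UA-1 claim; `None` corresponds to failure condition 06-002.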
06-003: Metadata Stream Does Not Contain dc:title (Machine Testable)
What's Wrong: The XMP metadata exists but lacks a dc:title element. Dublin Core title is a standard metadata field that should contain the document's human-readable title.
How to Identify:
- PDF/UA validators will flag this
- In Acrobat, go to File > Properties > Description
- Check the "Title" field; if empty, dc:title is not set
Common Causes:
- PDF exported without entering document properties
- Source document (Word, InDesign) has no title set
- Automated PDF generation that does not populate metadata
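In XMP, dc:title is stored as a language alternative: an `rdf:Alt` container whose `rdf:li` items carry `xml:lang` attributes. The sketch below, which assumes the XMP packet text is already available, shows one way to read it; the function name `dc_title` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

DC_NS = "http://purl.org/dc/elements/1.1/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def dc_title(xmp_xml: str) -> Optional[str]:
    """Return the dc:title text from an XMP packet, or None if missing/empty."""
    root = ET.fromstring(xmp_xml)
    title = root.find(f".//{{{DC_NS}}}title")
    if title is None:
        return None
    items = title.findall(f".//{{{RDF_NS}}}li")
    # Prefer the x-default entry, then fall back to any non-empty entry.
    for li in items:
        if li.get(XML_LANG) == "x-default" and li.text and li.text.strip():
            return li.text.strip()
    for li in items:
        if li.text and li.text.strip():
            return li.text.strip()
    return None
```

Here an empty or whitespace-only title is treated the same as a missing one, which is a stricter reading than the literal wording of 06-003 but matches what a user would experience.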
06-004: dc:title Does Not Clearly Identify the Document (Human Testing)
What's Wrong: A dc:title exists but does not meaningfully describe the document. Examples of poor titles:
- "Untitled"
- "Document1"
- "Microsoft Word - report.docx"
- "New Document"
- "PDF"
- Random characters or codes
How to Identify:
- This requires human judgment
- Read the title and ask: "Would a user know what this document is from the title alone?"
- Consider whether the title is descriptive, accurate, and useful
- Check if the title matches or relates to the actual document content
Examples of Good vs. Poor Titles:
| Poor Title | Better Title |
|---|---|
| Untitled | 2024 Q3 Financial Report |
| Document1.docx | Employee Handbook - Beacon Corp |
| Microsoft Word - policy | Travel Expense Policy v2.1 |
| Report | Website Accessibility Audit - January 2024 |
| form | Tax Form W-4 (2024) |
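Although 06-004 needs human judgment, obvious placeholders like those in the table can be flagged automatically before review. The following is a heuristic sketch only; the word list, patterns, and the function name `title_warnings` are illustrative assumptions, and an empty result does not mean the title is good.

```python
import re
from typing import List

# Illustrative list of titles that say nothing about the content.
GENERIC_TITLES = {"untitled", "new document", "pdf", "report", "form", "document"}
FILE_EXTENSION = re.compile(r"\.(pdf|docx?|xlsx?|pptx?|indd|odt)$", re.IGNORECASE)
NUMBERED_DEFAULT = re.compile(r"^(document|untitled|doc)[\s_-]*\d+$", re.IGNORECASE)
APP_PREFIX = re.compile(r"^microsoft (word|excel|powerpoint)\s*-", re.IGNORECASE)


def title_warnings(title: str) -> List[str]:
    """Return reasons a title looks like a placeholder (empty list if none)."""
    text = title.strip()
    if not text:
        return ["title is empty"]
    warnings = []
    if text.lower() in GENERIC_TITLES:
        warnings.append("generic placeholder")
    if NUMBERED_DEFAULT.match(text):
        warnings.append("default numbered name")
    if APP_PREFIX.match(text):
        warnings.append("application name prefix")
    if FILE_EXTENSION.search(text):
        warnings.append("looks like a filename")
    return warnings
```

Titles that pass this filter still need a person to confirm that they describe the document accurately.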
How to Fix in Adobe Acrobat
Adobe Acrobat provides straightforward tools for setting document metadata.
Setting Basic Metadata
- Open your PDF in Adobe Acrobat Pro
- Go to File > Properties (or press Ctrl/Cmd + D)
- On the Description tab:
- Enter a clear, descriptive Title
- Add Author, Subject, and Keywords as appropriate
- Click OK to save
Verifying Metadata Exists
- Go to File > Properties
- Click the Description tab
- Verify the Title field has a meaningful value
- Click Additional Metadata to view the full XMP structure
- Verify metadata is present and formatted correctly
Adding PDF/UA Identifier
The Description tab has no field for the PDF/UA identifier, so it has to be added by one of the following methods:
Method 1: Use the Accessibility Action Wizard
- Go to Tools > Action Wizard
- If available, use a PDF/UA remediation action
- This automatically adds the identifier when making PDFs accessible
Method 2: Use Preflight
- Go to Tools > Print Production > Preflight
- Search for "PDF/UA" profiles
- Run the Set PDF/UA-1 entry fixup
- This adds the required identifier to metadata
Method 3: Manual XMP Editing (Advanced)
- Go to File > Properties > Additional Metadata
- Click Advanced tab
- Add or modify the pdfuaid:part property
- This requires understanding XMP structure
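For orientation when editing XMP by hand, the sketch below generates a minimal packet containing only dc:title and the PDF/UA-1 identifier. It is an illustration of the structure, not a complete metadata set: real packets usually carry more properties, and the text still has to be embedded in the PDF with a suitable tool. The function name `build_xmp_packet` is hypothetical.

```python
from xml.sax.saxutils import escape

# Minimal XMP packet with only the two properties this checkpoint requires.
XMP_TEMPLATE = """<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
   <dc:title>
    <rdf:Alt>
     <rdf:li xml:lang="x-default">{title}</rdf:li>
    </rdf:Alt>
   </dc:title>
   <pdfuaid:part>1</pdfuaid:part>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>"""


def build_xmp_packet(title: str) -> str:
    """Return minimal XMP text declaring PDF/UA-1 with the given title."""
    if not title.strip():
        raise ValueError("title must not be empty")
    # Escape &, < and > so the title cannot break the XML structure.
    return XMP_TEMPLATE.format(title=escape(title.strip()))


print(build_xmp_packet("Travel Expense Policy v2.1"))
```

Declaring `pdfuaid:part` is only a claim of conformance; the document still has to meet the other PDF/UA requirements.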
Checking Metadata with Preflight
- Go to Tools > Print Production > Preflight
- Select a PDF/UA validation profile
- Run the check
- Review results for metadata-related failures
- Use built-in fixups to correct issues
How to Fix in Microsoft Word
Setting document properties in Word before PDF export ensures metadata transfers correctly.
Setting Document Properties
- Click File in the ribbon
- On the Info page, look at the Properties panel on the right
- Click Properties > Advanced Properties
- On the Summary tab:
- Enter a clear, descriptive Title
- Add Subject, Author, Keywords as appropriate
- Click OK
Quick Properties Access
- Click File > Info
- On the right panel, you can directly edit:
- Title
- Tags (keywords)
- Comments
- Click the arrow next to "Properties" for more fields
Verifying Before Export
- Go to File > Info
- Check that Title shows your document title, not "Add a title"
- Verify other metadata fields are populated appropriately
- Export to PDF
PDF Export Options
- Go to File > Save As and choose PDF
- Click Options
- Ensure "Document properties" is checked under "Include non-printing information"
- This transfers Word metadata to the PDF
Using Document Panel
For frequent metadata editing in Word versions that still offer the Document Panel (it was removed in Word 2016):
- Go to File > Info > Properties > Show Document Panel
- This adds a metadata panel to your document view
- Edit properties directly while working
How to Fix in Other Applications
Adobe InDesign
- Go to File > File Info
- On the Description tab:
- Enter a Title
- Add other metadata as needed
- Click OK
- When exporting to PDF, metadata transfers automatically
LibreOffice
- Go to File > Properties
- On the Description tab, enter Title and other metadata
- Click OK
- Export to PDF; properties should transfer
Google Docs
- Note: Google Docs does not have traditional document properties
- The document name becomes the PDF title
- After downloading as PDF, edit metadata in Acrobat
- Or use a PDF post-processor to add metadata
Command-Line Tools
Using ExifTool to add metadata:
exiftool -Title="Document Title" -Author="Author Name" file.pdf
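Using QPDF to rewrite the file after editing. This command does not add or change metadata; it only linearizes the file and regenerates object streams in place:
qpdf --replace-input --linearize --object-streams=generate file.pdf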
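The ExifTool command above leaves it to ExifTool to decide which metadata group receives the title. A scripted variant can name both locations explicitly using ExifTool's group prefixes. The sketch below assumes ExifTool is installed and on the PATH; the helper names are hypothetical, and it does not add the PDF/UA identifier.

```python
import subprocess
from typing import List


def exiftool_title_command(pdf_path: str, title: str) -> List[str]:
    """Build an ExifTool argument list that sets both title locations."""
    if not title.strip():
        raise ValueError("title must not be empty")
    return [
        "exiftool",
        f"-PDF:Title={title}",     # document information dictionary
        f"-XMP-dc:Title={title}",  # dc:title in the XMP packet
        pdf_path,
    ]


def set_title(pdf_path: str, title: str) -> None:
    """Run ExifTool; raises CalledProcessError if it reports a failure."""
    # Passing a list (no shell) avoids quoting problems with spaces in titles.
    subprocess.run(exiftool_title_command(pdf_path, title), check=True)


# Hypothetical usage; "file.pdf" is a placeholder path.
# set_title("file.pdf", "2024 Annual Accessibility Report - Beacon Corporation")
```

By default ExifTool keeps a backup of the original next to the edited file (here `file.pdf_original`), so the change can be reverted if the result fails validation.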
Testing Your Fix
Automated Testing
Adobe Acrobat:
- Go to Tools > Accessibility > Accessibility Check
- Select PDF/UA-1 as the standard
- Run the check
- Look for metadata-related failures
- Review any title-related warnings
PAC (PDF Accessibility Checker):
- Open the PDF in PAC
- Run the PDF/UA check
- Review Checkpoint 06 results
- Check all four failure conditions
veraPDF:
- Select PDF/UA-1 validation profile
- Run validation
- Review metadata rule results
Manual Verification
- Open the PDF in Acrobat Reader or similar
- Go to File > Properties
- Check that:
- Title is displayed and meaningful
- The title clearly identifies the document
- The title is not a filename or generic placeholder
Screen Reader Testing
- Open the PDF with a screen reader
- Listen for the document title announcement
- Verify the announced title is meaningful
- Check if the title helps identify the document
Tab/Window Title Check
- Open the PDF in a web browser
- Look at the browser tab
- Verify the tab shows a meaningful title
- Open in Acrobat Reader and check the window title
Validation Checklist
- XMP metadata stream exists
- PDF/UA identifier is present (pdfuaid:part = 1)
- dc:title element exists
- Title is meaningful and descriptive
- Title clearly identifies the document content
- Title is not a filename or placeholder
- Screen reader announces a useful title
- Tab/window shows the title
Title Best Practices
What Makes a Good Title
A good document title:
- Identifies the content: States what the document is about
- Is specific: Distinguishes this document from similar ones
- Is concise: Long enough to be clear, short enough to be readable
- Includes context: Date, version, or organization if relevant
- Is human-readable: Uses natural language, not codes
Title Writing Guidelines
- Start with the document type or content: "Annual Report", "Policy Manual", "Tax Form"
- Add specificity: What year? What department? What topic?
- Include organization if needed: Especially for external documents
- Add date/version for time-sensitive content: "(January 2024)", "v2.0"
Examples by Document Type
| Type | Example Title |
|---|---|
| Report | Quarterly Sales Report - Q3 2024 |
| Policy | Information Security Policy v3.1 |
| Form | Employee Leave Request Form |
| Manual | User Guide - Document Management System |
| Presentation | Board Meeting Presentation - December 2024 |
| Legal | Terms of Service - Beacon Software Inc. |
| Academic | Introduction to Accessible Design - Course Syllabus |
Additional Resources
Official Standards and Guidelines
- W3C WCAG 2.1 Success Criterion 2.4.2: Page Titled
- PDF Association Matterhorn Protocol 1.02
- Dublin Core Metadata Element Set
XMP and PDF Metadata
- Adobe XMP Specification
- PDF Association: PDF/UA Technical Implementation Guide
- ISO 14289-1 (PDF/UA-1) Standard
Tools
- PAC (PDF Accessibility Checker) - Free PDF/UA validation
- veraPDF - Open-source PDF/A and PDF/UA validator
- ExifTool - Command-line metadata editor
- QPDF - PDF manipulation tool
This documentation is based on the Matterhorn Protocol 1.02, the definitive reference for PDF/UA validation. Three of the four metadata violations are machine testable; the meaningfulness of the title requires human judgment. For the most current information, consult the PDF Association and W3C WCAG guidelines.