Checkpoint 30: XObjects
PDF documents must properly handle XObjects (external objects) for accessibility. Reference XObjects are prohibited, and Form XObjects containing marked content identifiers (MCIDs) must be uniquely referenced to maintain proper document structure.
What This Means
XObjects are reusable content objects in PDFs. The two types relevant to this checkpoint are:
Reference XObjects
Reference XObjects allow one PDF to reference content from another PDF file. While this can reduce file size and enable document modularity, it creates accessibility problems:
- The referenced content exists outside the main document
- Assistive technology may not be able to access the external content
- The document structure becomes dependent on external files
- If the referenced file is unavailable, content is lost
PDF/UA prohibits Reference XObjects entirely.
Form XObjects
Form XObjects are reusable content containers within a PDF. They're commonly used for:
- Repeated graphics (logos, watermarks)
- Header/footer content
- Complex illustrations used multiple times
- Pre-rendered content blocks
Form XObjects can contain Marked Content Identifiers (MCIDs), which link content to the document's tag structure. Each MCID must uniquely identify a piece of content.
The problem: If a Form XObject with MCIDs is used more than once, those MCIDs would appear multiple times in the document. This breaks the assumption that each MCID uniquely identifies content, causing:
- Confusion in the tag structure
- Incorrect reading order
- Assistive technology errors
- Broken content-to-tag mapping
Why It Matters
These XObject restrictions protect document structure integrity:
Reference XObjects:
- External dependencies make documents fragile
- Users may not have access to referenced files
- Screen readers cannot reliably process external content
- Document portability is compromised
- Validation becomes impossible without external files
Form XObjects with MCIDs:
- MCIDs must be unique for proper tag mapping
- Duplicate MCIDs create ambiguous content references
- The tag structure becomes inconsistent
- Screen readers may announce content incorrectly or incompletely
- Navigation through the document becomes unreliable
Consider a logo Form XObject with an MCID linked to a Figure tag with alt text. If this Form XObject appears on 50 pages, all 50 instances share one MCID. Because the tag structure expects each MCID to identify a single piece of content, it becomes ambiguous which instance the Figure tag refers to.
Common Violations
The Matterhorn Protocol defines two failure conditions for XObjects. Both are machine testable.
30-001: Reference XObject Is Present (Machine Testable)
What's Wrong: The PDF contains a Reference XObject, which references content from an external PDF file. This is prohibited in PDF/UA.
How to Identify:
- PDF/UA validators automatically detect Reference XObjects
- The violation appears in validation reports as a structural error
- Reference XObjects are relatively rare in modern PDFs
Technical Context: A Reference XObject contains:
<< /Type /XObject
/Subtype /Form
/Ref << /F (external-file.pdf) ... >>
>>
The presence of the /Ref key with a file reference indicates a Reference XObject.
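The check described above can be sketched in Python. This is a minimal illustration, assuming the XObject dictionary has already been parsed into a plain Python dict keyed by PDF name strings. The function name `is_reference_xobject` and the dict layout are hypothetical and do not correspond to any particular PDF library's API.

```python
def is_reference_xobject(xobject: dict) -> bool:
    """Return True if a parsed XObject dictionary is a Reference XObject.

    Assumes keys and name values are plain strings such as "/Subtype"
    and "/Form". This layout is an assumption for illustration only;
    real PDF libraries expose their own object types.
    """
    is_form = xobject.get("/Subtype") == "/Form"
    has_ref = "/Ref" in xobject
    return is_form and has_ref


# Hypothetical sample dictionaries for demonstration.
embedded_logo = {"/Type": "/XObject", "/Subtype": "/Form", "/BBox": [0, 0, 100, 100]}
external_page = {
    "/Type": "/XObject",
    "/Subtype": "/Form",
    "/Ref": {"/F": "external-file.pdf"},
}

print(is_reference_xobject(embedded_logo))  # False
print(is_reference_xobject(external_page))  # True
```

A validator does the equivalent test on every XObject in the file. The sketch only shows the decision rule, not how the dictionaries are read from a PDF.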
Why This Happens:
- PDF creation tools that optimize for file size
- Document assembly workflows that link rather than embed
- CAD or technical drawing exports with external references
- Legacy PDF workflows that relied on Reference XObjects
30-002: Form XObject Contains MCIDs and Referenced More Than Once (Machine Testable)
What's Wrong: A Form XObject that contains Marked Content Identifiers (MCIDs) is referenced multiple times in the document. This creates duplicate MCIDs, breaking the tag structure.
How to Identify:
- PDF/UA validators detect this automatically
- The validator will identify the specific Form XObject
- Reports typically indicate which pages or locations reference the object
Technical Context: A Form XObject with MCIDs looks like:
<< /Type /XObject
/Subtype /Form
/BBox [0 0 100 100]
>>
stream
/P <</MCID 5>> BDC
... content ...
EMC
endstream
If this XObject is painted more than once, whether on the same page or on several pages, the content marked with MCID 5 appears multiple times.
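The condition behind 30-002 can be approximated with a short Python sketch. It assumes the page and Form XObject content streams are already decoded into strings, and that every page uses the same resource names for its XObjects. Both are simplifications. The function name `find_reused_tagged_forms` is hypothetical, and the regular expressions are a heuristic rather than a full content-stream parser (for example, they ignore nested Form XObjects and could be fooled by text inside string literals).

```python
import re
from collections import Counter

# Matches an MCID entry inside a property list, e.g. "<</MCID 5>>".
MCID_PATTERN = re.compile(r"/MCID\s+(\d+)")
# Matches a paint operation for a named XObject, e.g. "/Fm1 Do".
DO_PATTERN = re.compile(r"/([^\s/<>\[\]()]+)\s+Do\b")


def find_reused_tagged_forms(page_streams, form_streams):
    """Return names of Form XObjects that contain MCIDs and are painted more than once.

    page_streams: list of decoded page content streams (str).
    form_streams: dict mapping an XObject resource name (without "/")
                  to its decoded content stream (str).
    """
    paint_counts = Counter()
    for stream in page_streams:
        paint_counts.update(DO_PATTERN.findall(stream))

    return sorted(
        name
        for name, stream in form_streams.items()
        if MCID_PATTERN.search(stream) and paint_counts[name] > 1
    )


# Hypothetical sample data: a tagged logo painted on two pages,
# and an untagged chart painted once.
pages = ["q /Logo Do Q", "q /Logo Do Q /Chart Do"]
forms = {
    "Logo": "/P <</MCID 5>> BDC 0 0 50 50 re f EMC",
    "Chart": "0 0 100 100 re f",
}
print(find_reused_tagged_forms(pages, forms))  # ['Logo']
```

Only `Logo` is reported, because it both carries an MCID and is painted twice. An untagged form may be reused freely, and a tagged form painted once is also acceptable.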
Why This Happens:
- Automated PDF generation that reuses content efficiently
- Document templates with tagged repeating elements
- PDF optimization that merges identical content
- Export tools that don't understand PDF/UA constraints
How to Fix in Adobe Acrobat
Acrobat's options for fixing XObject issues are limited, because these are structural problems that may require specialized tools or document regeneration.
Detecting XObject Issues
- Go to Tools > Print Production > Preflight
- Search for PDF/UA profiles
- Run the PDF/UA-1 compliance check
- Look for XObject-related failures in results
- Note which XObjects are causing issues
Using the Content Panel
- Go to View > Show/Hide > Navigation Panes > Content
- Expand the page content tree
- Look for XObject references
- Note any that appear multiple times
For Reference XObjects (30-001)
If a Reference XObject is detected:
- Embed the external content: The referenced content needs to be brought into the PDF
- Use Preflight fixups: Some Preflight fixups can resolve references
- Regenerate the document: Often the best solution is to recreate the PDF without references
- Use PDF optimization: Go to File > Save As Other > Optimized PDF and look for options to resolve references
For Duplicate Form XObjects with MCIDs (30-002)
Fixing this requires removing the MCIDs from the Form XObject, giving each use its own copy, or regenerating the document:
Option 1: Remove MCIDs from Form XObject
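- Remove the marked-content sequences that carry MCIDs from the Form XObject's content stream
- Tag each instance separately by enclosing its Do operator in a marked-content sequence in the page content stream, or mark purely decorative instances as artifacts
- This requires understanding of PDF structure and may need specialized tools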
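The page-side half of Option 1, tagging each instance separately, might look like the following Python sketch. It assumes the page content stream is available as a decoded string, that `first_mcid` does not collide with MCIDs already used on that page, and that `Figure` is the appropriate tag. All three are assumptions. The function name `tag_form_instances` is hypothetical. Removing the MCIDs from the Form XObject itself and adding the matching entries to the structure tree are separate steps not shown here.

```python
import re


def tag_form_instances(page_stream: str, form_name: str, first_mcid: int, tag: str = "Figure"):
    """Wrap each "/<form_name> Do" in its own marked-content sequence.

    Returns the rewritten stream and the list of MCIDs that were assigned,
    so the caller can create matching structure-tree entries.
    """
    pattern = re.compile(r"/" + re.escape(form_name) + r"\s+Do\b")
    assigned = []

    def wrap(match):
        # Each instance gets the next unused MCID.
        mcid = first_mcid + len(assigned)
        assigned.append(mcid)
        return f"/{tag} <</MCID {mcid}>> BDC {match.group(0)} EMC"

    return pattern.sub(wrap, page_stream), assigned


# Hypothetical page stream that paints the same form twice.
new_stream, mcids = tag_form_instances("q /Logo Do Q /Logo Do", "Logo", 7)
print(new_stream)
# q /Figure <</MCID 7>> BDC /Logo Do EMC Q /Figure <</MCID 8>> BDC /Logo Do EMC
print(mcids)  # [7, 8]
```

Because the MCIDs now live in the page content stream rather than inside the shared Form XObject, each instance is identified independently even though the graphic data is still stored once.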
Option 2: Create Separate XObject Instances
- Instead of reusing one XObject, create copies
- Each copy gets unique MCIDs
- This increases file size but ensures unique identification
- Requires PDF editing tools or regeneration
Option 3: Regenerate the Document
- Go back to the source application
- Export to PDF with different settings
- Avoid settings that aggressively reuse content
- Verify the new PDF passes validation
Using Preflight Fixups
- In Preflight, look for fixups related to XObjects or tagged content
- Some fixups may resolve the issue automatically
- Test after applying fixups
- Not all XObject issues have automatic fixups
How to Fix in Source Applications
Adobe InDesign
InDesign typically creates compliant XObjects:
- Ensure you're using current export settings
- When exporting to PDF:
- Use Adobe PDF (Interactive) or appropriate preset
- Check accessibility options
- If issues persist:
- Simplify repeated graphics
- Use placed images rather than embedded objects
- Test the exported PDF
Adobe Illustrator
For graphics that will be used in PDFs:
- When saving as PDF, use Save As rather than Export
- Choose appropriate PDF presets
- Avoid features that create Reference XObjects
- Test complex illustrations in the final PDF
Desktop Publishing Software
- Review export settings for PDF creation
- Look for options related to:
- Object reuse
- External references
- Tagged PDF creation
- Prefer settings that embed all content
- Test exported PDFs for XObject issues
Avoiding Reference XObjects
To prevent Reference XObjects:
- Embed all content: Don't link to external PDFs
- Flatten complex documents: Convert external references to embedded content
- Check export settings: Ensure "embed" rather than "link" options are selected
- Be careful with document assembly: Merging PDFs that contain Reference XObjects can carry them into the combined file
Avoiding Duplicate MCIDs in Form XObjects
To prevent MCID duplication:
- Don't tag reusable content: Make repeated decorative elements (such as watermarks and header/footer graphics) artifacts instead
- Tag instances individually: If content must be tagged, tag each use separately
- Use modern export tools: Current tools are more aware of PDF/UA constraints
- Test with validators: Check PDFs before distribution
Testing Your Fix
Automated Testing
PAC (PDF Accessibility Checker):
- Open the PDF in PAC
- Run the PDF/UA check
- Navigate to Checkpoint 30 results
- Both failure conditions are machine testable
- PAC will identify specific XObject issues
veraPDF:
- Select PDF/UA-1 validation profile
- Run validation
- Look for XObject-related rule failures
- Results identify specific violations
Adobe Acrobat Preflight:
- Go to Tools > Print Production > Preflight
- Run a PDF/UA profile
- Review XObject-related failures
- Check for both Reference XObjects and duplicate MCIDs
Manual Verification
After fixing:
- Run automated validation to confirm issues are resolved
- Check the document structure for consistency
- Test with a screen reader to verify content is properly read
- Ensure all content is accessible within the document
Validation Checklist
- No Reference XObjects present in document
- Form XObjects with MCIDs are not multiply referenced
- PDF/UA validator reports no Checkpoint 30 failures
- Document structure is internally consistent
- All content is embedded (no external dependencies)
- Repeated graphics are handled appropriately (untagged or unique instances)
- Screen reader can access all content correctly
Technical Background
Understanding XObjects
XObjects are efficiency features in PDF:
- Image XObjects: Store raster images
- Form XObjects: Store reusable vector/mixed content
- Reference XObjects: Form XObjects whose dictionary contains a /Ref entry pointing to content in an external PDF file
The PDF format allows XObjects to be defined once and referenced many times, reducing file size when the same content appears repeatedly.
MCIDs and Tagging
Marked Content Identifiers (MCIDs) link content in the page stream to tags in the structure tree:
- Content in the page stream is wrapped with MCID markers
- The structure tree references these MCIDs
- This creates the mapping between visual content and semantic structure
- Each MCID must be unique within the content stream that contains it, so every tag points to exactly one piece of content
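The mapping described in the list above can be checked along these lines. This Python sketch assumes a single decoded content stream and a list of the MCIDs that structure elements reference for that stream. Gathering that list from a real structure tree is not shown. The function name `check_mcid_mapping` and the result keys are hypothetical, and the regular expression is a heuristic rather than a full parser.

```python
import re
from collections import Counter

# Matches an MCID entry inside a property list, e.g. "<</MCID 5>>".
MCID_PATTERN = re.compile(r"/MCID\s+(\d+)")


def check_mcid_mapping(content_stream: str, referenced_mcids):
    """Compare MCIDs found in one content stream with those the structure tree references."""
    counts = Counter(int(m) for m in MCID_PATTERN.findall(content_stream))
    referenced = set(referenced_mcids)
    return {
        # MCIDs that occur more than once in the stream.
        "duplicated": sorted(m for m, n in counts.items() if n > 1),
        # MCIDs present in the stream that no structure element references.
        "unreferenced": sorted(set(counts) - referenced),
        # MCIDs the structure tree references that the stream does not contain.
        "missing": sorted(referenced - set(counts)),
    }


# Hypothetical stream in which MCID 1 is used twice and MCID 2 never appears.
stream = (
    "/H1 <</MCID 0>> BDC (Title) Tj EMC "
    "/P <</MCID 1>> BDC (Body) Tj EMC "
    "/P <</MCID 1>> BDC (More) Tj EMC"
)
print(check_mcid_mapping(stream, [0, 1, 2]))
# {'duplicated': [1], 'unreferenced': [], 'missing': [2]}
```

An empty result in all three lists means content and structure agree for that stream. Any non-empty list points to the kind of ambiguity the checkpoint is meant to prevent.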
Why the Restrictions Exist
Reference XObjects break the "self-contained document" principle:
- Accessible documents must not depend on external resources
- Users need all content available without additional files
- Validation requires all content be present
Duplicate MCIDs break the structure tree integrity:
- The tag structure assumes unique content identification
- Multiple instances of the same MCID create ambiguity
- Assistive technology cannot reliably map content to structure
Additional Resources
Official Standards and Guidelines
- W3C WCAG 2.1 Success Criterion 1.3.1: Info and Relationships
- PDF Association Matterhorn Protocol 1.02
- ISO 32000-2: XObjects
Tools
- PAC (PDF Accessibility Checker) - Free PDF/UA validation
- veraPDF - Open-source PDF validator
- Adobe Acrobat Pro - PDF analysis and editing
- QPDF - Command-line PDF analysis
This documentation is based on the Matterhorn Protocol 1.02, the definitive reference for PDF/UA validation. Both XObject failure conditions are machine testable. For the most current information, consult the PDF Association and W3C WCAG guidelines.