Checkpoint 01: Real Content Tagged
Every piece of meaningful content in your PDF must be tagged appropriately, while decorative elements should be marked as artifacts. This fundamental distinction ensures assistive technologies announce exactly what users need to hear.
What This Means
PDF accessibility relies on a tagging structure that tells assistive technologies what content matters and what to skip. Think of tags as labels that describe the purpose and role of each element in your document.
Real content is anything that conveys meaning to readers: text, images, tables, lists, headings, and links. This content must be tagged so screen readers can announce it.
Artifacts are decorative or repetitive elements that do not add meaning: page numbers, headers, footers, background images, and decorative borders. These should be marked as artifacts so screen readers skip them.
When this distinction is wrong, screen reader users either miss important information or hear irrelevant details that clutter their experience.
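The distinction can be made concrete with a small model. The sketch below is illustrative only: `PageItem` and `announced` are hypothetical names, the sample page is invented, and the model assumes a reader that passes only tagged, non-artifact content to a screen reader.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageItem:
    text: str
    tag: Optional[str] = None  # structure type such as "H1" or "P"; None if untagged
    artifact: bool = False     # True if the item is marked as an artifact


def announced(items):
    """Return the lines this simplified model would hand to a screen reader."""
    return [f"{item.tag}: {item.text}" for item in items
            if item.tag and not item.artifact]


# Hypothetical page: two items are classified correctly, two are not.
page = [
    PageItem("Annual Report", tag="H1"),
    PageItem("Page 3 of 12", artifact=True),                     # pagination, correctly skipped
    PageItem("Revenue grew in every region.", tag="P"),
    PageItem("CONFIDENTIAL", tag="P"),                           # watermark wrongly tagged
    PageItem("See appendix B for methodology.", artifact=True),  # real content wrongly hidden
]

for line in announced(page):
    print(line)
```

In this model the watermark is announced as if it were body text, while the appendix reference is never heard. Those are the two failure modes described above.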
Why It Matters
The tagging structure is the foundation of PDF accessibility. When it is incorrect:
- Missing content tags mean screen reader users never hear important information
- Decorative items tagged as content create confusing, cluttered screen reader announcements
- Nested tagging errors cause unpredictable behavior where content may be read twice, out of order, or not at all
- Semantic mismatches confuse users about the structure and purpose of content
This checkpoint addresses the most fundamental question in PDF accessibility: "What should the screen reader say?"
Common Violations
The Matterhorn Protocol defines seven failure conditions for this checkpoint: four are machine testable and three require human judgment. Here is what each one means and how to address it.
01-001: Artifact Tagged as Real Content
What's Wrong: A decorative element or repeated content (like a watermark or header) has been tagged as real content instead of being marked as an artifact. Screen readers will announce this content, potentially confusing users or cluttering their experience.
How to Identify:
- Review the Tags panel and look for elements that should not be read aloud
- Listen with a screen reader for repeated announcements of headers, footers, or decorative text
- Check if background images or borders are tagged as Figure elements
How to Fix:
- In the Tags panel, locate the incorrectly tagged element
- Right-click and select Change Tag to Artifact
- Choose the appropriate artifact type (Background, Pagination, or Layout)
- Click OK
Note: This is a human-testable condition. Automated tools cannot always determine what is decorative versus meaningful.
01-002: Real Content Marked as Artifact
What's Wrong: Meaningful content has been incorrectly marked as an artifact and will be skipped entirely by screen readers. This is a critical error that makes content invisible to assistive technology users.
How to Identify:
- Compare the visual PDF to what screen readers announce
- Use the Content panel to find items marked as artifacts
- Look for missing content when navigating with assistive technology
How to Fix:
- Open View > Show/Hide > Navigation Panes > Content
- Locate the artifact container holding the real content
- Drag the content out of the artifact container
- Create appropriate tags for the content using the Tags panel or Reading Order tool
Note: This is a human-testable condition requiring judgment about what constitutes meaningful content.
01-003: Content Marked as Artifact Inside Tagged Content (Machine Testable)
What's Wrong: There is content marked as an artifact nested inside tagged content. This creates an invalid structure where part of a tagged element is hidden from assistive technology.
How to Identify:
- PDF/UA validators will flag this error automatically
- Look in the Tags panel for Figure, P, or other tags that contain hidden artifact content
- The Content panel will show artifact markers inside tag structures
How to Fix:
- Open the Content panel and Tags panel side by side
- Locate the artifact nested inside the tagged content
- Either:
  - Remove the artifact marking if the content should be read, or
  - Move the artifact outside the tag structure if it should remain hidden
01-004: Tagged Content Inside Artifact (Machine Testable)
What's Wrong: Tagged content exists inside something marked as an artifact. This creates a contradictory structure where content is simultaneously marked to be read and to be skipped.
How to Identify:
- PDF/UA validators detect this automatically
- Check the Content panel for tags appearing inside artifact containers
- This often happens when headers or footers contain links or other interactive elements
How to Fix:
- Determine whether the content should be read (tagged) or skipped (artifact)
- If it should be read: Remove the artifact container and ensure proper tagging
- If it should be skipped: Remove the tags from the content inside the artifact
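Both this condition and 01-003 come down to how marked-content sequences nest. The sketch below shows the idea on a simplified event list rather than a real PDF content stream. The event format and `check_nesting` are hypothetical, and the model assumes every non-Artifact sequence is linked to the structure tree.

```python
def check_nesting(events):
    """Flag artifact/tag nesting conflicts in a simplified marked-content stream.

    Each event is ("begin", tag), ("end",) or ("content", description).
    The tag "Artifact" marks an artifact; any other tag is treated as tagged content.
    """
    stack = []     # tags of the currently open sequences, outermost first
    findings = []
    for position, event in enumerate(events):
        kind = event[0]
        if kind == "begin":
            tag = event[1]
            if tag == "Artifact" and any(open_tag != "Artifact" for open_tag in stack):
                findings.append((position, "01-003", "artifact inside tagged content"))
            elif tag != "Artifact" and "Artifact" in stack:
                findings.append((position, "01-004", "tagged content inside artifact"))
            stack.append(tag)
        elif kind == "end":
            if not stack:
                raise ValueError(f"unbalanced end at event {position}")
            stack.pop()
    if stack:
        raise ValueError("unclosed marked-content sequence")
    return findings


# Hypothetical stream: a rule marked as artifact inside a paragraph,
# then a tagged link inside a footer artifact.
events = [
    ("begin", "P"), ("content", "Quarterly results"),
    ("begin", "Artifact"), ("content", "decorative rule"), ("end",),
    ("end",),
    ("begin", "Artifact"), ("content", "Footer"),
    ("begin", "Link"), ("content", "example.com"), ("end",),
    ("end",),
]

for finding in check_nesting(events):
    print(finding)
```

A PDF/UA validator performs an equivalent check on the actual content streams. The sketch only illustrates why the two conditions mirror each other.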
01-005: Content Neither Artifact nor Tagged (Machine Testable)
What's Wrong: Some content in the PDF is not tagged as real content and is not marked as an artifact. This orphaned content may or may not be announced by screen readers, depending on the PDF reader being used.
How to Identify:
- PDF/UA validators will report untagged content
- Use the Reading Order tool to highlight untagged areas
- The Accessibility Checker in Acrobat reports this as a failed "Tagged content" rule under Page Content
How to Fix:
- Open Tools > Accessibility > Reading Order
- Look for highlighted areas not yet assigned a content type
- Draw a selection around the untagged content
- Click the appropriate button (Text, Figure, Table, etc.) to tag it
- If the content is decorative, select it and click Background/Artifact
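Conceptually, a validator finds this condition by looking for content that sits outside every marked-content sequence. The following is a minimal sketch of that idea on a simplified, hypothetical event list (not a real content-stream parser). It assumes that any enclosing sequence is either an artifact or a properly linked tag.

```python
def find_unmarked_content(events):
    """Return (position, description) for content outside any marked-content sequence.

    Each event is ("begin", tag), ("end",) or ("content", description).
    """
    depth = 0       # number of marked-content sequences currently open
    unmarked = []
    for position, event in enumerate(events):
        kind = event[0]
        if kind == "begin":
            depth += 1
        elif kind == "end":
            depth -= 1
        elif kind == "content" and depth == 0:
            unmarked.append((position, event[1]))
    return unmarked


# Hypothetical page: the caption is neither tagged nor marked as an artifact.
events = [
    ("begin", "H1"), ("content", "Title"), ("end",),
    ("content", "Stray caption"),
    ("begin", "Artifact"), ("content", "Page 2"), ("end",),
]

print(find_unmarked_content(events))
```

Each item the function returns needs a decision: tag it if it carries meaning, or mark it as an artifact if it does not.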
01-006: Structure Element Type or Attributes Not Semantically Appropriate
What's Wrong: Content has been tagged, but the tag type or its attributes do not match the actual meaning of the content. For example, a heading might be tagged as a paragraph, or a list might be tagged as a series of unrelated text blocks.
How to Identify:
- Compare visual formatting to tag structure
- Look for headings that appear bold and large but are tagged as P (paragraph)
- Check if lists are properly tagged as L (list) with LI (list items)
- Verify tables use Table, TR, TH, and TD tags correctly
How to Fix:
- In the Tags panel, right-click the incorrectly typed tag
- Select Properties
- Change the Type field to the appropriate tag type
- For attributes, adjust properties like Scope for table headers
Note: This requires human judgment about the semantic meaning of content.
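Part of this review can be supported mechanically. The sketch below checks parent/child relationships for lists and tables in a simplified tag tree. `Tag`, `ALLOWED_CHILDREN`, and `check_children` are hypothetical names, and the rule table is a deliberately small subset rather than the full set a PDF/UA validator applies.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    type: str
    children: list = field(default_factory=list)


# Simplified subset of child rules; an assumption for illustration only.
ALLOWED_CHILDREN = {
    "L": {"LI", "Caption"},
    "Table": {"TR", "THead", "TBody", "TFoot", "Caption"},
    "TR": {"TH", "TD"},
}


def check_children(tag, path=""):
    """Report children whose type does not fit the parent's expected structure."""
    here = f"{path}/{tag.type}"
    problems = []
    allowed = ALLOWED_CHILDREN.get(tag.type)  # None means no rule for this parent
    for child in tag.children:
        if allowed is not None and child.type not in allowed:
            problems.append(f"{here}: unexpected child {child.type}")
        problems.extend(check_children(child, here))
    return problems


# Hypothetical tree: a paragraph directly inside a list, a cell directly inside a table.
tree = Tag("Document", [
    Tag("L", [Tag("LI"), Tag("P")]),
    Tag("Table", [Tag("TR", [Tag("TH"), Tag("TD")]), Tag("TD")]),
])

for problem in check_children(tree):
    print(problem)
```

A check like this cannot tell whether a P should really be a heading. That judgment still depends on a person comparing the tags with the visual document.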
01-007: Suspects Entry Has Value of True (Machine Testable)
What's Wrong: The MarkInfo dictionary in the document catalog contains a Suspects entry set to true, indicating that the tag structure contains tag suspects: parts that the tagging software could not classify with confidence. This typically happens when OCR software or automatic tagging created the structure and nobody has verified it.
How to Identify:
- PDF/UA validators will flag this automatically
- This commonly occurs in scanned documents that have been OCR'd
- Check document properties for indication of automatic tagging
How to Fix:
- Review the entire tag structure to verify it is correct
- Once verified, clear the Suspects flag with Acrobat's Preflight tool:
  - Create a fixup to set the Suspects entry to false
  - Run the fixup on the document
- Alternatively, use a PDF library or object-level editor that can write to the MarkInfo dictionary in the document catalog. Acrobat's JavaScript API does not expose this dictionary, so it cannot be used for this change
Important: Only clear the Suspects flag after you have verified the tag structure is correct. The flag exists as a warning that the structure needs review.
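The order of operations matters more than the tool. The sketch below models the document catalog as a plain dictionary to show the "review first, then clear" rule. It does not read or write a real PDF. The PDF specification stores the flag under the key Suspects in the MarkInfo dictionary, while `clear_suspects` and the `structure_reviewed` parameter are hypothetical.

```python
def clear_suspects(catalog, structure_reviewed):
    """Set MarkInfo/Suspects to False, but only after a human review.

    Returns True if the flag was cleared, False if there was nothing to clear.
    """
    mark_info = catalog.get("MarkInfo", {})
    if not mark_info.get("Suspects", False):
        return False  # flag absent or already false
    if not structure_reviewed:
        raise RuntimeError("Review the tag structure before clearing Suspects")
    mark_info["Suspects"] = False
    return True


# Stand-in for a document catalog; a real tool would load this from the PDF.
catalog = {"MarkInfo": {"Marked": True, "Suspects": True}}
clear_suspects(catalog, structure_reviewed=True)
print(catalog["MarkInfo"])
```

A Preflight fixup or a PDF library would perform the actual write. Whatever tool is used, the same guard applies: an unreviewed document keeps its warning.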
How to Fix in Adobe Acrobat
Adobe Acrobat Pro provides comprehensive tools for managing content tagging.
Viewing and Editing Tags
- Open your PDF in Adobe Acrobat Pro
- Go to View > Show/Hide > Navigation Panes > Tags
- Expand the tag tree to see the document structure
- Click on any tag to highlight its content in the document
- Right-click tags to access options for changing type, properties, or artifact status
Using the Reading Order Tool
- Go to Tools > Accessibility > Reading Order
- The document will show numbered regions indicating reading order
- To tag untagged content:
  - Draw a box around the content
  - Click the appropriate content type button (Text, Figure, Form Field, etc.)
- To change existing tags:
  - Click on the numbered region
  - Select a different content type
- To mark as artifact:
  - Select the content
  - Click Background/Artifact
Identifying Untagged Content
- Go to Tools > Accessibility > Accessibility Check
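- Run the checker with "Document is tagged PDF" and "All page content is tagged" enabled
- Review the results for a failed "Tagged content" rule, which lists content that is neither tagged nor marked as an artifact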
- Click on each issue to locate it in the document
Converting Tags to Artifacts
- In the Tags panel, find the tag for decorative content
- Right-click the tag
- Select Change Tag to Artifact
- Choose the artifact type:
  - Pagination: Headers, footers, page numbers
  - Layout: Decorative rules, column separators
  - Background: Background images, watermarks
- Click OK
Rescuing Content from Artifacts
- Open View > Show/Hide > Navigation Panes > Content
- Locate the artifact container
- Expand it to find the content inside
- Drag the content out of the artifact container
- Use the Tags panel to create appropriate tags for the content
How to Fix in Microsoft Word
Properly structuring your Word document before PDF conversion prevents most tagging issues.
Using Styles for Semantic Structure
- Select text that serves as a heading
- Apply the appropriate heading style (Heading 1, Heading 2, etc.) from the Home tab
- Use the Normal style for body paragraphs
- Use List Bullet or List Number for lists
- Avoid using bold or font size alone to create visual headings
Marking Decorative Elements
- For decorative images, right-click and select Edit Alt Text
- Check the box for Mark as decorative
- Word will export these as artifacts in the PDF
Managing Headers and Footers
Headers and footers are automatically treated as artifacts when you export to PDF from Word. Ensure you:
- Use Insert > Header and Insert > Footer for repeated content
- Do not place important information only in headers or footers
- If a footer contains essential links, also include them in the document body
Exporting with Correct Settings
- Go to File > Save As and select PDF
- Click Options
- Ensure Document structure tags for accessibility is checked
- Click OK and save
Testing Your Fix
After correcting tagging issues, verify your changes are correct.
Automated Testing with Acrobat
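- Go to Tools > Accessibility > Accessibility Check
- Enable all checks in the Page Content category and run the full check
- Verify the report shows no failures for:
  - "Tagged content"
  - Structure-related rules (tables, lists, headings)
- The Accessibility Checker does not test against PDF/UA-1 itself, so confirm the Suspects entry (01-007) with a PDF/UA validator such as PAC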
Test with PAC (PDF Accessibility Checker)
- Open your PDF in PAC
- Run the PDF/UA check
- Review the Matterhorn Protocol section
- Check Checkpoint 01 results for any failures
- Use the screen reader preview to verify reading order
Manual Screen Reader Testing
- Open the PDF with a screen reader running (NVDA, JAWS, or VoiceOver)
- Navigate through the entire document
- Verify that:
  - All meaningful content is announced
  - Decorative elements are not announced
  - Content is announced in logical order
  - Tag types match the actual content meaning
Review Checklist
- All meaningful content has appropriate tags
- Decorative content is marked as artifacts
- No artifacts exist inside tagged content
- No tagged content exists inside artifacts
- All content is either tagged or marked as artifact
- Tag types accurately reflect content meaning
- Suspects flag is set to false (after verification)
Additional Resources
Official Standards and Guidelines
- W3C WCAG 2.1 Success Criterion 1.3.1: Info and Relationships
- W3C WCAG 2.1 Success Criterion 4.1.2: Name, Role, Value
- PDF Association Matterhorn Protocol 1.02
- ISO 14289-1 (PDF/UA-1) Standard
Tutorials and Guides
- Adobe: Creating Accessible PDFs
- PDF/UA Foundation: Tagged PDF Best Practices
- WebAIM: PDF Accessibility
Tools
- PAC (PDF Accessibility Checker) - Free PDF/UA validation
- Adobe Acrobat Pro Accessibility Tools
- NVDA Screen Reader - Free screen reader for testing
This documentation is based on the Matterhorn Protocol 1.02, the definitive reference for PDF/UA validation. For the most current information, consult the PDF Association and W3C WCAG guidelines.