
Automated Testing Limitations

Automated accessibility testing tools can scan a web page against a ruleset
derived from WCAG and related standards in seconds. They are an essential
first step in any accessibility programme. However, they are not sufficient
on their own: commonly cited figures put automated detection at roughly 30%
of WCAG success criteria and about 57% of issues by volume, leaving a
substantial share that only manual or behavioural testing can discover.
^[1]

What automated tools can detect

Automated tools are reliable for failures that are programmatically
deterministic — where a rule can be expressed as a pass/fail check against
the DOM with no contextual judgement required:

Missing alt attributes on images
Insufficient colour contrast ratios on solid backgrounds
Missing form labels
Duplicate id attributes
Missing document language declaration
Empty page title
Skipped heading levels

^[2]
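
Colour contrast is a good illustration of a programmatically deterministic rule, because WCAG defines it as a formula. The sketch below implements the WCAG 2.x relative luminance and contrast ratio definitions for solid 6-digit hex colours. The function names are illustrative and are not taken from any particular tool.

```typescript
// 8-bit sRGB channels, each 0-255.
type Rgb = [number, number, number];

function parseHex(hex: string): Rgb {
  const digits = hex.replace(/^#/, "");
  if (!/^[0-9a-f]{6}$/i.test(digits)) {
    throw new Error("Expected a 6-digit hex colour, got: " + hex);
  }
  const n = parseInt(digits, 16);
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

// Relative luminance as defined in WCAG 2.x.
// WCAG 2.0/2.1 publish 0.03928 as the linearisation threshold; for 8-bit
// channels it gives the same result as the sRGB value 0.04045.
function relativeLuminance([r, g, b]: Rgb): number {
  const lin = (c: number): number => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b);
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(fg: string, bg: string): number {
  const a = relativeLuminance(parseHex(fg));
  const b = relativeLuminance(parseHex(bg));
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// WCAG 1.4.3 (AA) requires at least 4.5:1 for normal-size text.
function passesAaNormalText(fg: string, bg: string): boolean {
  return contrastRatio(fg, bg) >= 4.5;
}
```

The check only works because both colours are known exactly. Text over images, gradients, or semi-transparent layers has no single background colour, which is why the list above limits contrast checks to solid backgrounds.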

What automated tools cannot detect

A significant category of failures requires either operating the page or
exercising human judgement:

Behavioural failures — require operating the page by keyboard or
assistive technology:

Whether a skip link actually moves focus to the correct target
Whether a modal dialog traps and releases focus correctly
Whether Tab order follows a logical sequence
Whether focus is visible on every interactive element
Whether keyboard traps exist inside custom components

Quality failures — require contextual judgement:

Whether alt text is meaningful or generic
Whether heading structure makes logical sense for the content
Whether error messages are descriptive enough to be actionable
Whether accessible names on controls are useful in context
Whether landmark regions are used correctly for the content type

^[3]
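
The gap between presence and quality can be seen in a small sketch. A presence-only rule accepts any alt value, while a heuristic (hypothetical here, with an assumed word list) can flag only the most obvious placeholders. Text that passes the heuristic may still be wrong for the image, and deciding that needs a human reader.

```typescript
// Hypothetical placeholder list; a real tool would choose its own.
const PLACEHOLDER_WORDS: string[] = [
  "image", "img", "photo", "picture", "graphic", "icon", "logo",
];
const FILE_NAME = /\.(png|jpe?g|gif|svg|webp)$/i;

// All that a presence-only rule can establish: the attribute exists.
function hasAltAttribute(alt: string | null): boolean {
  return alt !== null;
}

// Flags alt text that is present but probably uninformative.
// An empty alt is not flagged: it is the correct value for decorative images.
function isLikelyPlaceholderAlt(alt: string): boolean {
  const text = alt.trim().toLowerCase();
  return PLACEHOLDER_WORDS.indexOf(text) !== -1 || FILE_NAME.test(text);
}
```

For example, `alt="image"` passes `hasAltAttribute` and is flagged by `isLikelyPlaceholderAlt`, but `alt="A dog"` on a photo of a sales chart passes both checks and is still a failure.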

How a11ytest.ai extends automated coverage

The full a11ytest.ai scan combines three detection layers, then uses AI
to generate remediation suggestions on reported issues.
^[4]

Layer 1 — axe-core with a broad tag set
Runs axe-core with tags covering WCAG 2.0, 2.1, and 2.2 AA, Section 508,
EN 301 549, and best-practice rules. Additional axe rules are explicitly
enabled beyond default configurations, including WCAG 2.2 target size checks
and other rules that minimal setups typically leave disabled.
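
As a rough sketch of what such a configuration might look like, the options object below uses axe-core's `runOnly` tag filter and per-rule `rules` overrides. The exact tag list and rule overrides used by a11ytest.ai are not published in this article, so the values here are assumptions, and the `ScanOptions` interface is a local stand-in for the subset of axe-core's run options that the sketch uses.

```typescript
// Local mirror of the subset of axe-core's run options used here,
// so the sketch type-checks without importing axe-core.
interface ScanOptions {
  runOnly: { type: "tag"; values: string[] };
  rules: Record<string, { enabled: boolean }>;
}

// Assumed tag set; adjust to the standards a project has to meet.
const scanOptions: ScanOptions = {
  runOnly: {
    type: "tag",
    values: [
      "wcag2a", "wcag2aa",   // WCAG 2.0 A and AA
      "wcag21a", "wcag21aa", // WCAG 2.1 A and AA
      "wcag22aa",            // WCAG 2.2 AA
      "section508",
      "EN-301-549",
      "best-practice",
    ],
  },
  rules: {
    // Explicitly enable the WCAG 2.2 target size rule.
    "target-size": { enabled: true },
  },
};

// In a page with axe-core loaded, the object would be passed as:
//   const results = await axe.run(document, scanOptions);
```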

Layer 2 — Custom deterministic checks
A second pass applies rules and heuristics beyond default axe-only setups,
including:

iframes without meaningful title values
Focusable elements inside aria-hidden containers
Duplicate id values
Multiple navigation or complementary landmarks without distinguishing names
Radio and checkbox groups missing fieldset and legend where expected
Non-semantic interactive elements
Additional checks including link and button naming, document title quality,
table headers, and scrollable regions depending on page content
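
To show the flavour of these checks, here is a minimal sketch of the landmark rule, assuming the landmarks and their accessible names have already been extracted from the page. The `Landmark` shape and the function name are hypothetical and do not describe a11ytest.ai's internals.

```typescript
interface Landmark {
  role: "navigation" | "complementary";
  // Accessible name from aria-label or aria-labelledby; "" if none.
  name: string;
}

// Returns each role for which at least two landmarks share the same
// (possibly empty) accessible name, compared case-insensitively.
function rolesWithIndistinctNames(landmarks: Landmark[]): string[] {
  const counts: Record<string, number> = {};
  const flagged: string[] = [];
  for (const { role, name } of landmarks) {
    const key = role + "|" + name.trim().toLowerCase();
    const count = (counts[key] ?? 0) + 1;
    counts[key] = count;
    if (count > 1 && flagged.indexOf(role) === -1) {
      flagged.push(role);
    }
  }
  return flagged;
}
```

A page with two unnamed `nav` elements would be flagged for `navigation`, while a page whose navigation landmarks are named "Main" and "Footer" would pass. Whether those names are useful to a screen reader user is still a judgement call.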

Layer 3 — Behavioural checks via browser automation
The scanner drives the page with a headless browser to catch issues that
static DOM analysis cannot reach:

Skip link behaviour, including heuristics around the first Tab stop
Focus visibility using computed styles such as outline and box-shadow
Tab order problems including positive tabindex values
Modal dialog checks — whether focus escapes a dialog on Tab, and whether
Escape exits the dialog correctly when focus cannot leave

Coverage is strongest around modals and common keyboard order signals.
Keyboard trap detection across all page components is not exhaustive, so
custom components should still be verified with manual keyboard testing.
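
The reason positive tabindex values count as a tab order problem can be sketched in a few lines. Browsers visit elements with a positive tabindex first, in ascending order, and only then the tabindex 0 elements in DOM order. The `Focusable` shape below is a hypothetical simplification that ignores shadow DOM, disabled controls, and hidden elements.

```typescript
interface Focusable {
  id: string;       // hypothetical identifier used for reporting
  tabindex: number; // 0 for natively focusable elements without the attribute
}

// Sequential focus order: positive tabindex values first in ascending
// order (ties keep DOM order), then tabindex 0 in DOM order.
// Elements with a negative tabindex are not reachable with Tab.
function tabOrder(domOrder: Focusable[]): string[] {
  const positive = domOrder
    .filter((el) => el.tabindex > 0)
    .map((el, i) => ({ el, i }))
    .sort((a, b) => a.el.tabindex - b.el.tabindex || a.i - b.i)
    .map(({ el }) => el.id);
  const natural = domOrder
    .filter((el) => el.tabindex === 0)
    .map((el) => el.id);
  return positive.concat(natural);
}

// Elements a scanner would report for using a positive tabindex.
function positiveTabindexIds(domOrder: Focusable[]): string[] {
  return domOrder.filter((el) => el.tabindex > 0).map((el) => el.id);
}
```

Given elements in DOM order `logo` (0), `search` (2), `nav` (0), `cta` (1), the Tab sequence becomes `cta`, `search`, `logo`, `nav`, which no longer matches the visual layout. Detecting the positive values is deterministic; judging whether the resulting order is logical for the content still requires someone to tab through the page.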

AI in the pipeline
AI is used to generate structured remediation guidance for issues already
raised by axe-core and the extended checks. It assists developers in
understanding and fixing detected issues — it is not a separate detection
layer adding new findings to the scan results.

Coverage

Industry research supports the position that automated tools, even
comprehensive ones, cover only a minority of WCAG success criteria. The
figures most cited are about 30% coverage by criteria count and
approximately 57% by volume of real-world issues detected in audits.
^[1]

a11ytest.ai's layered approach extends coverage beyond a basic axe-core
scan by adding deterministic extended rules and behavioural checks. The
remaining gap — quality judgements, contextual review, and real assistive
technology testing — cannot be closed by any automated tool.

Note: the full layered scan described above applies to the main product
scan. Lighter endpoints may run only the axe-core tag set, without the
extended rule set and behavioural suite.

A complete testing approach

No single method is sufficient. The most effective approach combines:

Method	What it catches
Automated scanning	Fast, scalable detection of rule-based failures
Keyboard testing	Behavioural failures — focus, traps, order, skip links
Screen reader testing	Announcement quality, live region behaviour, dynamic content
Manual expert review	Contextual quality issues — alt text, labels, headings
User testing with disabled users	Real-world usability issues that no automated test predicts

The false confidence problem

A passing automated scan does not mean a page is accessible. It means the
page passes the subset of WCAG criteria that automated tools can test. Sites
with high Lighthouse scores frequently have severe keyboard and screen reader
failures that are invisible to automated scanning.

Treat automated testing as a necessary first pass that catches obvious
failures early — not as a compliance certificate.

References

1. Deque Systems. The Automated Accessibility Coverage Report. https://www.deque.com/automated-accessibility-coverage-report/
2. W3C Web Accessibility Initiative. Selecting Web Accessibility Evaluation Tools. https://www.w3.org/WAI/test-evaluate/tools/selecting/
3. Accessible.org. Automated Scans and WCAG Criteria. https://accessible.org/automated-scans-wcag/
4. A11YTEST.AI LTD. a11ytest.ai: Automated Accessibility Scanning. https://a11ytest.ai

Last edited Apr 5, 2026, 7:34 PM · P**** J****