Automated accessibility testing tools can scan a web page against a ruleset
derived from WCAG and related standards in seconds. They are an essential
first step in any accessibility programme. However, they are not sufficient
on their own: research consistently shows that automated tools detect
between 30% and 57% of real-world accessibility failures, depending on how
coverage is measured, leaving a large share that only manual or behavioural
testing can discover. [1]
Automated tools are reliable for failures that are programmatically
deterministic — where a rule can be expressed as a pass/fail check against
the DOM with no contextual judgement required:
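As an illustration of what "programmatically deterministic" means, the sketch below expresses one such rule as a pure pass/fail function. It is a simplified, hypothetical example rather than the implementation used by axe-core or a11ytest.ai: `ElementSnapshot`, `Finding`, `checkImagesHaveAlt`, and the `img-missing-alt` rule id are invented for this sketch, and a production rule would also account for cases such as `role="presentation"`.

```typescript
// Minimal stand-in for a DOM element; a real scanner inspects live DOM nodes.
interface ElementSnapshot {
  tag: string;
  attributes: Record<string, string>;
}

interface Finding {
  rule: string;
  element: ElementSnapshot;
}

// Hypothetical rule: every <img> must carry an alt attribute.
// An empty alt ("") passes, because it is valid for decorative images.
function checkImagesHaveAlt(elements: ElementSnapshot[]): Finding[] {
  return elements
    .filter((el) => el.tag.toLowerCase() === "img" && !("alt" in el.attributes))
    .map((element) => ({ rule: "img-missing-alt", element }));
}

const sample: ElementSnapshot[] = [
  { tag: "img", attributes: { src: "logo.png", alt: "Company logo" } },
  { tag: "img", attributes: { src: "chart.png" } },
  { tag: "img", attributes: { src: "divider.png", alt: "" } },
];

// Only chart.png is reported: it is the one image with no alt attribute at all.
const findings = checkImagesHaveAlt(sample);
```

The check can tell that `alt` is present, but not whether "Company logo" is a useful description. That second question is a quality judgement, which is why it falls outside what a deterministic rule can decide.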
A significant category of failures requires either operating the page or
exercising human judgement:
Behavioural failures — require operating the page by keyboard or
assistive technology:
Quality failures — require contextual judgement:
The full a11ytest.ai scan combines three detection layers, then uses AI
to generate remediation suggestions for the issues they report. [4]
Layer 1 — axe-core with a broad tag set
Runs axe-core with tags covering WCAG 2.0, 2.1, and 2.2 AA, Section 508,
EN 301 549, and best-practice rules. Additional axe rules are explicitly
enabled beyond default configurations, including WCAG 2.2 target size checks
and other rules that minimal setups typically leave disabled.
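A configuration along these lines might look like the sketch below. The tag names and the `target-size` rule id are taken from axe-core's public rule documentation, but the exact tag list and set of enabled rules used by a11ytest.ai are not stated here, so treat the selection as an assumption. `AxeRunOptions` is a local type declared only so the sketch compiles without axe-core installed.

```typescript
// Local type mirroring the shape of the options object passed to axe.run().
interface AxeRunOptions {
  runOnly: { type: "tag"; values: string[] };
  rules: Record<string, { enabled: boolean }>;
}

const scanOptions: AxeRunOptions = {
  runOnly: {
    type: "tag",
    values: [
      "wcag2a", "wcag2aa",   // WCAG 2.0, levels A and AA
      "wcag21a", "wcag21aa", // WCAG 2.1, levels A and AA
      "wcag22aa",            // WCAG 2.2, level AA
      "section508",          // Section 508
      "EN-301-549",          // EN 301 549
      "best-practice",       // best-practice rules outside WCAG
    ],
  },
  rules: {
    // Explicitly switch on the WCAG 2.2 target size rule (assumed selection).
    "target-size": { enabled: true },
  },
};
```

In a page where axe-core is loaded, an object of this shape would be passed as the second argument to `axe.run`, for example `axe.run(document, scanOptions)`.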
Layer 2 — Custom deterministic checks
A second pass applies rules and heuristics beyond default axe-only setups,
including:
Layer 3 — Behavioural checks via browser automation
The scanner drives the page with a headless browser to catch issues that
static DOM analysis cannot reach:
Coverage is strongest around modals and common keyboard order signals.
Keyboard trap detection does not exhaustively cover every page component,
so traps should still be verified with manual keyboard testing.
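To make the idea of an automated keyboard trap check concrete, here is a minimal sketch of one possible heuristic. It is not a11ytest.ai's actual logic. It assumes a headless browser has already pressed Tab repeatedly and recorded which element held focus after each press; `detectKeyboardTrap`, `TrapReport`, and the element ids are hypothetical names.

```typescript
interface TrapReport {
  suspected: boolean;
  unreached: string[];
}

// focusLog: id of the focused element after each simulated Tab press, in order.
// focusableIds: every element expected to be reachable by keyboard.
function detectKeyboardTrap(focusLog: string[], focusableIds: string[]): TrapReport {
  const visited = new Set(focusLog);
  const unreached = focusableIds.filter((id) => !visited.has(id));
  // Only judge once Tab was pressed often enough to cycle the page twice over.
  const enoughPresses = focusLog.length >= focusableIds.length * 2;
  return { suspected: enoughPresses && unreached.length > 0, unreached };
}

const focusable = ["search", "picker-prev", "picker-next", "submit"];

// Focus bounces between the two date picker buttons and never reaches "submit".
const trapped = detectKeyboardTrap(
  ["search", "picker-prev", "picker-next", "picker-prev",
   "picker-next", "picker-prev", "picker-next", "picker-prev"],
  focusable,
);

// Focus cycles through every element, so no trap is suspected.
const healthy = detectKeyboardTrap(
  ["search", "picker-prev", "picker-next", "submit",
   "search", "picker-prev", "picker-next", "submit"],
  focusable,
);
```

A heuristic like this flags an open modal dialog as a trap, even though holding focus inside a modal is intended behaviour, and it cannot see traps that only appear after a component is activated. Both limitations are reasons to keep manual keyboard testing in the process.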
AI in the pipeline
AI is used to generate structured remediation guidance for issues already
raised by axe-core and the extended checks. It assists developers in
understanding and fixing detected issues — it is not a separate detection
layer adding new findings to the scan results.
Industry research supports the position that automated tools, even
comprehensive ones, cover only a minority of WCAG success criteria and
leave a substantial share of real-world accessibility issues undetected.
The figures most cited are roughly 30% coverage by criteria count and
approximately 57% by volume of real-world issues detected in audits. [1]
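The two figures measure different things, which is why they can differ so much. The sketch below computes both from the same audit data. The numbers are invented purely to show the arithmetic and do not come from the cited research; the criteria are given placeholder names, and treating each criterion as either fully automatable or not is a deliberate simplification.

```typescript
interface CriterionResult {
  criterion: string;    // placeholder name, not a real WCAG criterion number
  issuesFound: number;  // issues logged against it in a hypothetical audit
  automatable: boolean; // whether an automated rule can detect those issues
}

function coverage(results: CriterionResult[]): { byCriteria: number; byVolume: number } {
  const automated = results.filter((r) => r.automatable);
  const totalIssues = results.reduce((sum, r) => sum + r.issuesFound, 0);
  const automatedIssues = automated.reduce((sum, r) => sum + r.issuesFound, 0);
  return {
    byCriteria: results.length === 0 ? 0 : automated.length / results.length,
    byVolume: totalIssues === 0 ? 0 : automatedIssues / totalIssues,
  };
}

// Hypothetical audit: 5 criteria, 100 issues in total.
const audit: CriterionResult[] = [
  { criterion: "A", issuesFound: 45, automatable: true },
  { criterion: "B", issuesFound: 15, automatable: true },
  { criterion: "C", issuesFound: 20, automatable: false },
  { criterion: "D", issuesFound: 12, automatable: false },
  { criterion: "E", issuesFound: 8, automatable: false },
];

// 2 of 5 criteria are automatable (40%), but they account for 60 of 100 issues (60%).
const result = coverage(audit);
```

Coverage by volume comes out higher whenever the automatable criteria are also the ones that fail most often, which is the pattern the cited figures suggest.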
a11ytest.ai's layered approach extends coverage beyond a basic axe-core
scan by adding deterministic extended rules and behavioural checks. The
remaining gap — quality judgements, contextual review, and real assistive
technology testing — cannot be closed by any automated tool.
Note: the full layered scan described above applies to the main product
scan. Lighter endpoints may run only the axe-core tag set, without the
extended rule set or the behavioural suite.
No single method is sufficient. The most effective approach combines:
| Method | What it catches |
|---|---|
| Automated scanning | Fast, scalable detection of rule-based failures |
| Keyboard testing | Behavioural failures — focus, traps, order, skip links |
| Screen reader testing | Announcement quality, live region behaviour, dynamic content |
| Manual expert review | Contextual quality issues — alt text, labels, headings |
| User testing with disabled users | Real-world usability issues that no automated test predicts |
A passing automated scan does not mean a page is accessible. It means the
page passes the subset of WCAG criteria that automated tools can test. Sites
with high Lighthouse scores frequently have severe keyboard and screen reader
failures that are invisible to automated scanning.
Treat automated testing as a necessary first pass that catches obvious
failures early — not as a compliance certificate.
Last edited Apr 5, 2026, 7:34 PM · P**** J****