Bias & Fairness Evaluation Framework
(Pre-Release Testing, Incident Taxonomy & Improvement Loop)
1. Purpose
This framework sets out how askKira:
- Identifies and mitigates bias and unfair outcomes
- Tests for education-specific risk scenarios
- Responds to incidents and continuously improves fairness over time
It recognises that AI systems may reflect societal and data-driven biases and therefore require active evaluation, human oversight, and iterative improvement.
2. Fairness Principles
askKira’s approach to fairness is guided by the following principles:
- Equity over uniformity – recognising different needs (e.g. SEND, EAL)
- Context sensitivity – education decisions require professional nuance
- Human judgement first – AI outputs are advisory, not determinative
- Transparency about limits – fairness is monitored, not assumed
3. Pre-Release Bias & Fairness Test Suite
Before major releases or material system changes, askKira undertakes a pre-release evaluation using structured test scenarios.
3.1 Test Scenario Categories (Education-Specific)
Test prompts and scenarios may include:
SEND contexts
- Neurodiversity
- Learning difficulties
- Behavioural needs

Socio-economic disadvantage
- Pupil Premium–related scenarios
- Attendance and behaviour narratives

EAL and language acquisition
- Assumptions about comprehension or ability
- Cultural and linguistic sensitivity

Safeguarding & vulnerability
- Children in care
- Mental health references
- Family risk factors

Protected characteristics
- Race, sex, disability, religion, sexual orientation (where contextually relevant)

Professional role fairness
- Consistency of guidance across roles (teacher vs leader)
- Avoidance of hierarchical or gendered assumptions
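To illustrate how a library of such prompts could be organised, here is a minimal Python sketch; every name in it (ScenarioCategory, TestScenario, the example entry) is hypothetical, not a description of askKira's internal tooling.

```python
from dataclasses import dataclass, field
from enum import Enum


class ScenarioCategory(Enum):
    """Illustrative education-specific test categories (names assumed)."""
    SEND = "send_contexts"
    SOCIO_ECONOMIC = "socio_economic_disadvantage"
    EAL = "eal_and_language_acquisition"
    SAFEGUARDING = "safeguarding_and_vulnerability"
    PROTECTED = "protected_characteristics"
    PROFESSIONAL_ROLE = "professional_role_fairness"


@dataclass
class TestScenario:
    """One entry in a pre-release fairness test prompt library."""
    scenario_id: str
    category: ScenarioCategory
    prompt: str                                    # prompt sent to the system under test
    tags: list[str] = field(default_factory=list)  # finer-grained labels, e.g. "neurodiversity"


# Hypothetical example entry; real prompts would be authored with educators.
example = TestScenario(
    scenario_id="SEND-001",
    category=ScenarioCategory.SEND,
    prompt="Draft feedback for a pupil with dyslexia who found a written task difficult.",
    tags=["neurodiversity", "learning_difficulties"],
)
```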
3.2 Evaluation Criteria
Each test scenario is reviewed against:
- Presence of biased or stereotyped assumptions
- Tone and framing
- Appropriateness for the educational context
- Safeguarding sensitivity
- Appropriate deferral to professional judgement
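One way to apply these criteria consistently is a per-scenario reviewer checklist. The sketch below assumes a simple satisfied/flagged result per criterion; all identifiers are hypothetical.

```python
from dataclasses import dataclass

# The five review criteria above, as an illustrative checklist.
CRITERIA = (
    "no_biased_or_stereotyped_assumptions",
    "appropriate_tone_and_framing",
    "fits_educational_context",
    "safeguarding_sensitive",
    "defers_to_professional_judgement",
)


@dataclass
class ReviewResult:
    scenario_id: str
    checks: dict[str, bool]   # True = satisfied, False = flagged for follow-up
    reviewer_notes: str = ""

    def flags(self) -> list[str]:
        """Criteria the reviewer flagged for this scenario."""
        return [name for name, ok in self.checks.items() if not ok]


result = ReviewResult(
    scenario_id="SEND-001",
    checks={name: name != "appropriate_tone_and_framing" for name in CRITERIA},
    reviewer_notes="Tone reads as condescending; rephrase the guidance.",
)
print(result.flags())  # ['appropriate_tone_and_framing']
```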
Gaps / to confirm:
☐ Size and coverage of test prompt library
☐ Scoring or pass/fail thresholds
☐ Frequency of pre-release testing
4. Incident Taxonomy (Bias & Fairness)
askKira uses a structured taxonomy to categorise fairness-related incidents reported by users or identified internally.
4.1 Incident Categories
Category A – Minor bias indicators
- Subtle assumptions
- Ambiguous phrasing

Category B – Material fairness concerns
- Stereotyping
- Inappropriate generalisation
- Disproportionate guidance

Category C – Serious or systemic bias
- Discriminatory language
- Harmful assumptions linked to protected characteristics
- Repeated patterns across contexts

Category D – Safeguarding-linked bias
- Bias that could materially increase risk to a child or group
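For recording and reporting, the four categories map naturally onto an enumeration. A minimal sketch with assumed names, including an escalation flag reflecting the responses in section 5:

```python
from enum import Enum


class IncidentCategory(Enum):
    """Illustrative encoding of the A-D fairness incident taxonomy."""
    A = "minor_bias_indicators"
    B = "material_fairness_concerns"
    C = "serious_or_systemic_bias"
    D = "safeguarding_linked_bias"

    @property
    def high_risk(self) -> bool:
        # Categories C and D warrant immediate escalation (see section 5).
        return self in (IncidentCategory.C, IncidentCategory.D)


assert IncidentCategory.D.high_risk and not IncidentCategory.A.high_risk
```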
4.2 Recording & Analysis
For each incident:
- Context and prompt are recorded (proportionately)
- Category and severity are assigned
- Immediate mitigation is considered
- Pattern analysis is conducted to identify recurrence
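As one possible shape for such a record, the sketch below captures the four steps as fields on an incident entry, plus a toy recurrence check; all names and the 1 to 5 severity scale are assumptions, not askKira's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class FairnessIncident:
    """Illustrative incident record covering the four steps above."""
    incident_id: str
    category: str            # "A" to "D", from the taxonomy in 4.1
    severity: int            # assumed scale: 1 (low) to 5 (high)
    context_summary: str     # proportionate: a summary, not a full transcript
    prompt_excerpt: str      # minimised excerpt of the triggering prompt
    mitigation: str = "pending"   # immediate mitigation, if any
    reported_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def recurs(incidents: list[FairnessIncident], category: str, threshold: int = 3) -> bool:
    """Toy pattern analysis: has a category recurred often enough to flag?"""
    return sum(1 for i in incidents if i.category == category) >= threshold
```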
Gaps / to confirm:
☐ Incident tracking tooling
☐ Data retention period for incidents
☐ Threshold for escalation to governance review
5. Escalation & Response
Minor incidents:
- Addressed through prompt refinement or guidance updates

Material incidents:
- Reviewed by senior human reviewer
- Guardrails or system prompts adjusted

Serious or systemic incidents:
- Immediate escalation
- Broader test suite expansion
- Temporary restriction of affected use cases if required

Safeguarding-linked incidents:
- Treated as high-risk
- Linked to safeguarding incident response processes
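Expressed as code, the mapping from category to response is a simple lookup. This sketch assumes the A to D taxonomy from section 4.1; the step descriptions are paraphrases of the responses above, not operational procedures.

```python
# Illustrative mapping from incident category to the responses in this section.
ESCALATION_PLAYBOOK = {
    "A": ["refine prompts or update guidance"],
    "B": ["senior human review", "adjust guardrails or system prompts"],
    "C": ["immediate escalation", "expand the test suite",
          "temporarily restrict affected use cases if required"],
    "D": ["treat as high-risk", "invoke safeguarding incident response"],
}


def respond(category: str) -> list[str]:
    """Return the playbook steps for a given incident category."""
    if category not in ESCALATION_PLAYBOOK:
        raise ValueError(f"unknown incident category: {category!r}")
    return ESCALATION_PLAYBOOK[category]


print(respond("C"))
```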
6. Improvement Loop (Fairness-by-Design)
askKira applies a continuous improvement cycle:
Detect
- Pre-release testing
- In-product feedback
- Monitoring signals

Assess
- Severity classification
- Contextual risk evaluation

Mitigate
- Prompt updates
- Guardrail refinement
- User guidance adjustments

Verify
- Re-testing against relevant scenarios
- Regression checks

Document
- Internal change logs
- Risk register updates
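The stages can be chained so that a mitigation is only documented once verification passes. The sketch below is a toy illustration under that assumption; every function is a hypothetical placeholder, not part of askKira's tooling.

```python
def assess(incident: dict) -> int:
    """Placeholder: classify severity and contextual risk (assumed 1-5 scale)."""
    return incident.get("severity", 1)


def mitigate(incident: dict, severity: int) -> str:
    """Placeholder: pick a mitigation (prompt update, guardrail, guidance)."""
    return "guardrail_refinement" if severity >= 3 else "prompt_update"


def passes_retest(scenario_id: str, mitigation: str) -> bool:
    """Placeholder: re-run the scenario and regression checks."""
    return True   # stand-in for a real test harness call


def improvement_cycle(incident: dict, scenario_ids: list[str]) -> dict:
    """Illustrative Detect -> Assess -> Mitigate -> Verify -> Document pass."""
    severity = assess(incident)
    mitigation = mitigate(incident, severity)
    if not all(passes_retest(s, mitigation) for s in scenario_ids):
        raise RuntimeError("mitigation failed re-testing; do not record as fixed")
    # Document: an entry for the internal change log and risk register.
    return {"incident": incident["id"], "mitigation": mitigation, "verified": True}


print(improvement_cycle({"id": "INC-042", "severity": 3}, ["SEND-001", "EAL-007"]))
```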
Gaps / to confirm:
☐ Formal fairness risk register
☐ Change approval process
☐ User-visible communication of fixes
7. Relationship to Equality & Public Sector Duties
askKira supports organisations in meeting their Equality Act 2010 and Public Sector Equality Duty responsibilities by:
- Avoiding prescriptive or discriminatory outputs
- Prompting reflection and professional judgement
- Providing consistency rather than automated classification
Responsibility for equality impact decisions remains with the organisation.
8. Governance & Oversight
Governance approach:
- Fairness risks are reviewed as part of wider safety and ethics governance.
- High-risk or repeated issues inform roadmap and policy updates.
Gaps / to confirm:
☐ Named fairness or ethics owner
☐ Independent or advisory review involvement
☐ Review cadence
9. Known Limitations
- Bias cannot be eliminated entirely in probabilistic language models.
- Evaluation is scenario-based, not exhaustive.
- Human oversight is essential to fair use in education contexts.
10. Summary for Buyers
askKira takes a proportionate, education-aware approach to bias and fairness by:
- Testing realistic school and Trust scenarios before release
- Providing clear routes to identify and escalate issues
- Applying structured learning and improvement loops
The gaps identified represent areas for deeper assurance, not unmanaged risk, and can be strengthened in line with organisational requirements.