Bias & Fairness Evaluation Framework

(Pre-Release Testing, Incident Taxonomy & Improvement Loop)


1. Purpose

This framework sets out how askKira:

  • Identifies and mitigates bias and unfair outcomes

  • Tests for education-specific risk scenarios

  • Responds to incidents and continuously improves fairness over time

It recognises that AI systems may reflect societal and data-driven biases and therefore require active evaluation, human oversight, and iterative improvement.


2. Fairness Principles

askKira’s approach to fairness is guided by the following principles:

  • Equity over uniformity – recognising different needs (e.g. SEND, EAL)

  • Context sensitivity – education decisions require professional nuance

  • Human judgement first – AI outputs are advisory, not determinative

  • Transparency about limits – fairness is monitored, not assumed


3. Pre-Release Bias & Fairness Test Suite

Before major releases or material system changes, askKira undertakes a pre-release evaluation using structured test scenarios.

3.1 Test Scenario Categories (Education-Specific)

Test prompts and scenarios may include:

  • SEND contexts

    • Neurodiversity

    • Learning difficulties

    • Behavioural needs

  • Socio-economic disadvantage

    • Pupil Premium–related scenarios

    • Attendance and behaviour narratives

  • EAL and language acquisition

    • Assumptions about comprehension or ability

    • Cultural and linguistic sensitivity

  • Safeguarding & vulnerability

    • Children in care

    • Mental health references

    • Family risk factors

  • Protected characteristics

    • Race, sex, disability, religion or belief, sexual orientation (where contextually relevant)

  • Professional role fairness

    • Consistency of guidance across roles (teacher vs leader)

    • Avoidance of hierarchical or gendered assumptions

3.2 Evaluation Criteria

Each test scenario is reviewed against:

  • Presence of biased or stereotyped assumptions

  • Tone and framing

  • Appropriateness for educational context

  • Safeguarding sensitivity

  • Whether the output prompts the user to apply professional judgement
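
As an illustration only, the scenario categories in 3.1 and the review criteria in 3.2 could be held in a simple structure so that unreviewed criteria and uncovered categories are easy to spot. This is a minimal sketch, not a description of askKira's tooling: the names `ScenarioCategory`, `CRITERIA`, `ScenarioReview` and `uncovered_categories` are hypothetical, and no scoring or pass/fail threshold is assumed because that is still to be confirmed.

```python
from dataclasses import dataclass, field
from enum import Enum


class ScenarioCategory(Enum):
    """Top-level scenario categories from section 3.1."""
    SEND = "SEND contexts"
    SOCIO_ECONOMIC = "Socio-economic disadvantage"
    EAL = "EAL and language acquisition"
    SAFEGUARDING = "Safeguarding & vulnerability"
    PROTECTED_CHARACTERISTICS = "Protected characteristics"
    PROFESSIONAL_ROLE = "Professional role fairness"


# The five review criteria from section 3.2 (identifiers are illustrative).
CRITERIA = (
    "biased_or_stereotyped_assumptions",
    "tone_and_framing",
    "educational_appropriateness",
    "safeguarding_sensitivity",
    "professional_judgement_prompts",
)


@dataclass
class ScenarioReview:
    category: ScenarioCategory
    prompt: str
    # Maps criterion -> reviewer note; an empty note means no concern raised.
    notes: dict = field(default_factory=dict)

    def missing_criteria(self):
        """Criteria the reviewer has not yet recorded a finding for."""
        return [c for c in CRITERIA if c not in self.notes]

    def concerns(self):
        """Criteria where the reviewer recorded a non-empty concern."""
        return [c for c in CRITERIA if self.notes.get(c)]


def uncovered_categories(reviews):
    """Scenario categories with no reviewed scenario in this test run."""
    covered = {r.category for r in reviews}
    return [c for c in ScenarioCategory if c not in covered]
```

A human reviewer still makes every judgement here; the structure only records which criteria have been looked at and which categories have no scenario yet.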

Gaps / to confirm:

☐ Size and coverage of test prompt library
☐ Scoring or pass/fail thresholds
☐ Frequency of pre-release testing


4. Incident Taxonomy (Bias & Fairness)

askKira uses a structured taxonomy to categorise fairness-related incidents reported by users or identified internally.

4.1 Incident Categories

  • Category A – Minor bias indicators

    • Subtle assumptions

    • Ambiguous phrasing

  • Category B – Material fairness concerns

    • Stereotyping

    • Inappropriate generalisation

    • Disproportionate guidance

  • Category C – Serious or systemic bias

    • Discriminatory language

    • Harmful assumptions linked to protected characteristics

    • Repeated patterns across contexts

  • Category D – Safeguarding-linked bias

    • Bias that could materially increase risk to a child or group

4.2 Recording & Analysis

For each incident:

  • Context and prompt are recorded (proportionately)

  • Category and severity are assigned

  • Immediate mitigation is considered

  • Pattern analysis is conducted to check for recurrence
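
The recording and pattern-analysis steps above can be sketched as follows. This is a hedged illustration under stated assumptions: `Incident`, `context_tag` and `recurring_patterns` are hypothetical names, severity is not modelled separately because the framework does not define a scale, and the `min_count` default is a placeholder since the escalation threshold is still to be confirmed.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class IncidentCategory(Enum):
    """Incident categories from section 4.1."""
    A = "Minor bias indicators"
    B = "Material fairness concerns"
    C = "Serious or systemic bias"
    D = "Safeguarding-linked bias"


@dataclass(frozen=True)
class Incident:
    category: IncidentCategory
    context_tag: str      # hypothetical short label, e.g. "EAL" or "SEND"
    summary: str          # proportionate description, not the full conversation
    mitigation: str = ""  # immediate mitigation considered, if any


def recurring_patterns(incidents, min_count=2):
    """Return (category, context_tag) pairs seen at least min_count times.

    min_count is an illustrative default, not a confirmed threshold.
    """
    counts = Counter((i.category, i.context_tag) for i in incidents)
    return {key: n for key, n in counts.items() if n >= min_count}
```

Grouping by category and context is only one possible notion of a "pattern"; a real review would likely also consider wording, user role, and time window.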

Gaps / to confirm:

☐ Incident tracking tooling
☐ Data retention period for incidents
☐ Threshold for escalation to governance review


5. Escalation & Response

Minor incidents (Category A):

  • Addressed through prompt refinement or guidance updates

Material incidents (Category B):

  • Reviewed by a senior human reviewer

  • Guardrails or system prompts adjusted

Serious or systemic incidents (Category C):

  • Immediate escalation

  • Broader test suite expansion

  • Temporary restriction of affected use cases if required

Safeguarding-linked incidents (Category D):

  • Treated as high-risk

  • Linked to safeguarding incident response processes
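
The escalation routes above amount to a lookup from incident type to response steps. The sketch below assumes that the four incident types in this section correspond to Categories A to D in section 4.1; `RESPONSES` and `response_steps` are hypothetical names used for illustration, not part of any askKira system.

```python
from enum import Enum


class IncidentCategory(Enum):
    A = "Minor"
    B = "Material"
    C = "Serious or systemic"
    D = "Safeguarding-linked"


# Response steps per category, taken from the lists in this section.
RESPONSES = {
    IncidentCategory.A: [
        "Refine prompts or update guidance",
    ],
    IncidentCategory.B: [
        "Review by a senior human reviewer",
        "Adjust guardrails or system prompts",
    ],
    IncidentCategory.C: [
        "Escalate immediately",
        "Expand the test suite",
        "Temporarily restrict affected use cases if required",
    ],
    IncidentCategory.D: [
        "Treat as high-risk",
        "Link to safeguarding incident response processes",
    ],
}


def response_steps(category):
    """Return a copy of the response steps for an incident category."""
    return list(RESPONSES[category])
```

Returning a copy keeps the shared table unchanged if a caller edits the list for a specific incident.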


6. Improvement Loop (Fairness-by-Design)

askKira applies a continuous improvement cycle:

  1. Detect

    • Pre-release testing

    • In-product feedback

    • Monitoring signals

  2. Assess

    • Severity classification

    • Contextual risk evaluation

  3. Mitigate

    • Prompt updates

    • Guardrail refinement

    • User guidance adjustments

  4. Verify

    • Re-testing against relevant scenarios

    • Regression checks

  5. Document

    • Internal change logs

    • Risk register updates
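
One way to picture the five-step cycle above is as an ordered checklist that an issue must pass through before it is closed. The sketch is illustrative only: `Stage` and `FairnessIssue` are hypothetical names, and the strict ordering is an assumption made for the example rather than a stated requirement of the framework.

```python
from enum import IntEnum


class Stage(IntEnum):
    """The five steps of the improvement loop, in order."""
    DETECT = 1
    ASSESS = 2
    MITIGATE = 3
    VERIFY = 4
    DOCUMENT = 5


class FairnessIssue:
    def __init__(self, title):
        self.title = title
        self.history = []  # list of (Stage, note) tuples, in completion order

    def next_stage(self):
        """Next stage to complete, or None once all five are done."""
        done = len(self.history)
        return Stage(done + 1) if done < len(Stage) else None

    def complete(self, stage, note):
        """Record a stage as complete; stages cannot be skipped or repeated."""
        if stage is not self.next_stage():
            raise ValueError(f"expected {self.next_stage()}, got {stage}")
        self.history.append((stage, note))

    @property
    def closed(self):
        return self.next_stage() is None
```

Under this assumption an issue cannot be documented as fixed until re-testing (Verify) has been recorded.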

Gaps / to confirm:

☐ Formal fairness risk register
☐ Change approval process
☐ User-visible communication of fixes


7. Relationship to Equality & Public Sector Duties

  • askKira supports organisations in meeting their Equality Act 2010 and Public Sector Equality Duty responsibilities by:

    • Avoiding prescriptive or discriminatory outputs

    • Prompting reflection and professional judgement

    • Providing consistent guidance rather than automated classification of individuals

  • Responsibility for equality impact decisions remains with the organisation.


8. Governance & Oversight

Governance approach:

  • Fairness risks are reviewed as part of wider safety and ethics governance.

  • High-risk or repeated issues inform roadmap and policy updates.

Gaps / to confirm:

☐ Named fairness or ethics owner
☐ Independent or advisory review involvement
☐ Review cadence


9. Known Limitations

  • Bias cannot be eliminated entirely in probabilistic language models.

  • Evaluation is scenario-based, not exhaustive.

  • Human oversight is essential to fair use in education contexts.


10. Summary for Buyers

askKira takes a proportionate, education-aware approach to bias and fairness by:

  • Testing realistic school and Trust scenarios before release

  • Providing clear routes to identify and escalate issues

  • Applying structured learning and improvement loops

The gaps identified in this framework are areas for deeper assurance rather than unmanaged risks, and can be addressed in line with organisational requirements.