Facilitator guide · staff development day

AI for Assessment & Feedback

Everything you need to run this course with your staff — 6 sessions, ~5 hours total.

How to run it

Run this as a staff development day (SDD), a twilight series, or faculty-by-faculty. The whole course is ~4–5 hours. Each module has a session plan, discussion prompts and a 'watch for'. You don't need to be an AI expert to facilitate; you need to keep the room honest about what AI may draft and what only a teacher may decide.

Before you start:
- Ask every teacher to bring a real, de-identified assessment task (or one they need to write) from their own teaching. The activities work best on genuine work.
- Decide which tool staff will use (NSWEduChat for general drafting, Lessio for building the task + marking guidelines) and check everyone can log in.
- Project the Ethical-Use Checklist and the DoE's six ethical checks for the room.

Capture two things across the day:
- A faculty-agreed position on whether GenAI is permitted for each assessment type.
- Each teacher's capstone artefact + reflection for their eTAMS PD record.

Session plans

1 · What assessment AI can and can't do · ~45 min
Session plan
Open by asking the room to define 'validity' and 'reliability' in their own words, then sharpen with the table. Run a quick sort: project a list of assessment tasks (write a task, decide a grade, draft a rubric, give feedback, moderate a sample) and have the room call 'AI may draft' or 'teacher only'. Generate one assessment task in Lessio on a shared screen and build the 'what it drafted / what's still mine' two-column list together.
Discussion prompts
- Where in our marking do we most rely on consistency between teachers — and how do we currently get it?
- Have we ever seen a task that was reliable but not valid? What was it really measuring?
- What would we never want an AI anywhere near in our assessment process?
Watch for
Two reactions appear: 'AI can just mark it for us' (over-trust — redirect to validity and the on-balance judgement) and 'AI has no place in assessment at all' (resistance — show the safe drafting wins). The course's job is to land everyone on 'AI drafts, the teacher judges'.
Standards: 2.3 Curriculum, assessment and reporting · 5.1 Assess student learning · 5.3 Make consistent and comparable judgements
2 · Designing valid, fair tasks with AI · ~60 min
Session plan
Hands-on build session. Each teacher generates a task + marking guidelines in Lessio for a real outcome, then runs the four-step check (plain English / alignment / adjustment / code verification). Teach code verification with the MA5-TRG case study, then have Maths and Science staff verify two of their own codes live against the NESA site to model it for everyone. Swap tasks with a partner and check each other's marking guidelines for 'observable verbs only'.
Discussion prompts
- Take a real task we use — could every student in our class access the wording independently?
- Do our marking guidelines name the outcome AND the success criterion for each mark?
- Where would a hallucinated outcome code do the most damage if it reached an assessment?
Watch for
Staff may treat a polished AI task as finished — emphasise the edits ARE the work, especially the code check and the adjustments. Watch for adjustments that quietly lower the standard rather than change access; bring it back to 'same outcome, same standard, different access'.
Standards: 2.3 Curriculum, assessment and reporting · 5.1 Assess student learning · 1.6 Strategies to support students with disability
3 · Marking guidelines, rubrics & consistency · ~45 min
Session plan
Run a mini-moderation. Have AI draft student-friendly 'I can…' criteria for a shared task, then critique them as a room for fidelity to the standard. Then mark a single de-identified sample independently and compare grades — surface the spread, and discuss how you'd reach agreement. Use an AI-generated 'borderline scenarios' starter to fuel the conversation; the point is to feel how moderation, not a tool, creates consistency.
Discussion prompts
- When we mark the same work, how far apart do our grades land — and why?
- Do our band descriptors actually align to the Common Grade Scale, or to our habits?
- Could AI-drafted 'I can…' criteria help our students — and where might they mislead them?
Watch for
Some staff will want AI to 'just give the mark' to save time — hold the line that grading is a moderated human judgement. Watch for descriptors full of 'good/sound/excellent' with no observable verbs; that's the most common AI weakness here.
Standards: 2.3 Curriculum, assessment and reporting · 5.3 Make consistent and comparable judgements · 5.1 Assess student learning
4 · Feedback that moves learning · ~45 min
Session plan
Run a feedback make-and-take. Each teacher gives AI only their success criteria (no student work) and gets a frame, then personalises it for two de-identified samples and shares one before/after with a partner. Spend dedicated time on the de-identification rule — put a 'would I paste this?' scenario to a think-pair-share. Close on warmth + specificity: read out a vague comment and a specific one and discuss which moves learning.
Discussion prompts
- What does feedback in our faculty currently emphasise — the grade, or the next step?
- Where might staff be tempted to paste identifiable work into a tool, and how do we make the safe path the easy path?
- How do we keep AI-assisted feedback from sounding generic to students?
Watch for
The privacy line needs clarity, not a lecture — keep it practical (frame from criteria, not comment from the child's work). Watch for teachers who let AI write the final comment unchanged; the personalisation is the professional act and the part students actually feel.
Standards: 5.2 Provide feedback to students on their learning · 4.5 Use ICT safely, responsibly and ethically · 1.5 Differentiate teaching to meet specific learning needs
5 · Integrity & authorship in the AI age · ~45 min
Session plan
Run a task-redesign clinic. Map current assessment tasks against 'could a student complete this with AI and no one would know?' and pick the most exposed. As a faculty, write the explicit AI rule onto a task, then redesign it for authorship-by-design (checkpoints, conference, in-class component). Steer firmly away from any 'detection' solution toward task design. Agree a faculty position on which task types permit GenAI and which don't.
Discussion prompts
- Which of our tasks are most exposed to AI misuse — and how would we redesign them?
- Do our task instructions currently tell students the AI rule? If not, what's our standard wording?
- How do we want students to use AI honestly, and where do we teach that explicitly?
Watch for
Integrity anxiety runs high and 'detector' talk will surface — name that detectors are unreliable and can harm EAL/D students. Keep the energy on design, not policing. Some staff conflate 'students must never use AI' with policy; clarify that schools decide task-by-task.
Standards: 5.1 Assess student learning · 4.5 Use ICT safely, responsibly and ethically · 7.1 Meet professional ethics and responsibilities
6 · Capstone: build, critique & log it · ~50 min
Session plan
Run as a longer workshop or directed time. Teachers build their assessment (task + marking guidelines) in Lessio, then run the three-front critique (validity / fairness / integrity) and self-assess against the Ethical-Use Checklist. Collect the artefacts and reflections — they're the evidence of a Standards-relevant PD day and the teachers' eTAMS record. Consider a faculty share-back where one task is critiqued live against the three fronts.
Discussion prompts
- What's our shared standard for an AI-assisted assessment that's 'ready to use'?
- What faculty-wide rule for AI in assessment and feedback should we adopt from today?
- How will we log this as PD and keep moderating together from here?
Watch for
Some will want to skip the reflection — but it's what makes this real PD and the eTAMS evidence, so protect the time. Watch for capstones that are polished but unverified (codes unchecked, no adjustment, no authorship scaffold); send them back to the three-front critique.
Standards: 2.3 Curriculum, assessment and reporting · 5.1 Assess student learning · 6.2 Engage in professional learning

After the day

Collect each teacher's capstone artefact and reflection: that's your evidence of a Standards-relevant PD day, and theirs to log in eTAMS. The course is included in the whole-school Lessio programme and is also available standalone per teacher. Because NESA removed the Accredited/Elective PD categories in 2024, the course counts as Standards-relevant PD without an endorsement gate, so schools can run it school-wide on a staff development day.

- No identifiable student work or personal data entered into general AI tools: de-identify, or work from success criteria only.
- Every AI-drafted task, rubric and feedback frame reviewed, aligned and owned by the teacher before use.
- Outcomes and success criteria verified against the official NESA syllabus: every outcome code checked, not assumed.
- Validity and fairness confirmed: the task measures the outcome, the wording is plain-English accessible, and reasonable adjustments keep the same standard.
- Integrity upheld: the AI rule is explicit on the task, authorship is assured by design (not 'detectors'), and AI use is disclosed where policy requires.

Standards-relevant professional learning, mapped to the APST · verified against national and NSW frameworks, June 2026.
