AI Assessment: Capabilities, Limitations, and Balanced Approaches

Key Questions for Today

  • What can AI effectively evaluate in student learning?
  • What limitations do AI assessment systems have?
  • How can we design effective AI-generated feedback?
  • When does AI misrepresent student understanding?
  • How can we integrate human and AI assessment effectively?

What Can AI Effectively Evaluate?

  • Objective knowledge
    • Multiple-choice responses
    • Factual recall
    • Terminology usage

  • Standard patterns
    • Common problem-solving approaches
    • Basic writing structures
    • Expected answer formats

  • Quantifiable metrics
    • Length and complexity
    • Time spent on tasks
    • Response patterns

Subject-Specific Assessment Strengths

Mathematics:

  • Calculation accuracy
  • Application of formulas
  • Solution verification

Writing:

  • Grammar and mechanics
  • Organizational structure
  • Vocabulary usage

Science:

  • Knowledge of facts
  • Understanding of standard processes
  • Basic concept relationships

Can AI Effectively Evaluate These?

  • Novel approaches
    • Unconventional but valid solutions
    • Creative problem-solving
    • Innovative thinking

  • Nuanced understanding
    • Subtle misconceptions
    • Depth of conceptual grasp
    • Context-dependent meaning

  • Higher-order thinking
    • Complex critical analysis
    • Interdisciplinary connections
    • Evaluative judgment

Activity: Think-Pair-Share

  1. Think: What are the implications of AI assessment capabilities and limitations for your subject area/context?

  2. Pair: Discuss with a partner:
    • One aspect of learning AI could assess effectively
    • One aspect that would require human assessment
    • Why this division makes sense
  3. Share: Selected pairs will share key insights with the whole group

Principles of Effective AI Feedback

Specific and actionable

  • Points to particular aspects of work
  • Suggests concrete next steps

Learning-focused

  • Emphasizes growth over judgment
  • Connects to learning goals

Balanced in approach

  • Identifies both strengths and needs
  • Maintains an encouraging tone

Appropriate in scope

  • Manageable number of suggestions
  • Prioritizes most important issues

Types of AI Feedback

Verification Feedback

  • Confirms correctness/incorrectness
  • Limited learning impact
  • Example: "Your answer is incorrect."

Elaborative Feedback

  • Explains why answers are right/wrong
  • Provides additional information
  • Example: "Your answer is incorrect because you applied the wrong formula. The correct approach is..."

Process Feedback

  • Addresses strategies and approaches
  • Focuses on how to improve
  • Example: "Your approach shows you're trying to use elimination, but consider setting up the equations differently by..."
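The contrast between the three feedback types can be sketched in code. This is an illustrative sketch only: all function names and messages are hypothetical, not drawn from any real assessment system.

```python
# Hypothetical sketch contrasting the three AI feedback types for a single
# auto-gradable item. All names and messages are illustrative.

def verification_feedback(correct: bool) -> str:
    # Confirms correctness only -- low learning impact.
    return "Your answer is correct." if correct else "Your answer is incorrect."

def elaborative_feedback(correct: bool, explanation: str) -> str:
    # Adds the *why* behind the judgment, plus extra information.
    verdict = verification_feedback(correct)
    return f"{verdict} {explanation}"

def process_feedback(strategy_observed: str, suggestion: str) -> str:
    # Addresses the student's strategy rather than the answer itself.
    return (f"Your approach shows you're trying to use {strategy_observed}, "
            f"but consider {suggestion}")

print(verification_feedback(False))
print(elaborative_feedback(False, "You applied the perimeter formula "
                                  "where the area formula was needed."))
print(process_feedback("elimination",
                       "setting up the equations differently first."))
```

Note how each type builds on the previous one: elaborative feedback wraps a verification verdict, while process feedback drops the verdict entirely to focus on strategy.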

Impact of Different Feedback Types

| Feedback Type | Learning Impact | Best Used For                     | Limitations                 |
|---------------|-----------------|-----------------------------------|-----------------------------|
| Verification  | Low             | Quick checks, simple tasks        | Doesn't support improvement |
| Elaborative   | Medium          | Content mastery, concept building | Can promote dependence      |
| Process       | High            | Complex tasks, skill development  | More difficult to automate  |

AI Feedback Design Considerations

Student factors:

  • Prior knowledge
  • Motivation level
  • Feedback literacy

Task factors:

  • Complexity
  • Purpose (formative/summative)
  • Stakes involved

Feedback delivery:

  • Timing (immediate vs. delayed)
  • Format (text, visual, interactive)
  • Tone and language

Four Corners

Move to the corner that represents what you see as the primary value of AI in assessment:

Corner 1: Efficiency

Corner 2: Objectivity

Corner 3: Personalization

Corner 4: Analytics

When AI Misrepresents Student Understanding

False Positives:

  • Giving credit for superficial responses
  • Missing plagiarism or AI-generated work
  • Accepting formulaic approaches without understanding

False Negatives:

  • Penalizing unusual but valid approaches
  • Missing sophisticated thinking
  • Failing to recognize creative solutions

Reductive Analysis:

  • Focusing only on surface features
  • Missing conceptual connections
  • Ignoring important context
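One practical way to surface false positives and false negatives is to compare AI judgments against a human rater on the same responses. The sketch below assumes simple pass/fail judgments; the data, names, and structure are hypothetical, not from any real grading tool.

```python
# Hypothetical sketch: comparing AI pass/fail judgments against a human
# rater to surface false positives and false negatives for review.

def classify_disagreements(judgments):
    """judgments: list of (student_id, ai_passed, human_passed) tuples."""
    false_positives = []   # AI credited work the human did not
    false_negatives = []   # AI penalized work the human accepted
    for student_id, ai_passed, human_passed in judgments:
        if ai_passed and not human_passed:
            false_positives.append(student_id)
        elif human_passed and not ai_passed:
            false_negatives.append(student_id)
    return false_positives, false_negatives

sample = [
    ("s1", True, True),    # agreement
    ("s2", True, False),   # superficial response credited by the AI
    ("s3", False, True),   # unusual but valid approach penalized by the AI
]
fp, fn = classify_disagreements(sample)
print(fp)  # ['s2']
print(fn)  # ['s3']
```

Even a small double-scored sample like this can show where the AI's judgments drift from human ones, and which students deserve a second look.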

Assessment Validity Concerns

Construct Validity:

  • Does the AI measure what it claims to measure?
  • Does it capture the full construct or just parts?

Content Validity:

  • Does the assessment cover the intended content?
  • Is important content missing from evaluation?

Consequential Validity:

  • What are the impacts of this assessment approach?
  • Does it lead to appropriate instructional decisions?

Examples of AI Assessment Misrepresentations

Example 1: Math Problem Solving

  • Student uses valid but uncommon approach
  • AI fails to recognize alternative solution path
  • Student receives incorrect feedback

Example 2: Written Analysis

  • Student makes deep but implicit connections
  • AI focuses only on explicit statements
  • Sophisticated thinking goes unrecognized

Example 3: Science Explanation

  • Student explanation uses unconventional analogies
  • AI expects standard terminology
  • Conceptual understanding is missed

Addressing Validity Challenges

Design approaches:

  • Multiple assessment methods
  • Clear criteria for evaluation
  • Appropriate task selection

Implementation strategies:

  • Human review of edge cases
  • Continuous improvement of algorithms
  • Student appeal processes

Transparency practices:

  • Clear communication about AI role
  • Sharing assessment limitations
  • Explaining decision processes

Station Activity: AI Feedback Impacts

Station 1: Content-Focused AI Feedback

  • Example feedback provided
  • Questions about impact on learning
  • Space for observations

Station 2: Process-Focused AI Feedback

  • Example feedback provided
  • Questions about impact on learning
  • Space for observations

Station 3: Combined Human-AI Feedback

  • Example feedback provided
  • Questions about impact on learning
  • Space for observations

Station 4: Harmful AI Feedback

  • Example feedback provided
  • Questions about potential negative impacts
  • Space for observations

Models for Integrating Human and AI Assessment

Sequential Model

  • AI provides initial assessment
  • Human reviews, modifies, and finalizes
  • Example: AI grades essays first, teacher reviews

Parallel Model

  • AI and human assess independently
  • Results are compared and reconciled
  • Example: Double-scoring writing with comparison

Complementary Model

  • AI and human assess different aspects
  • Results are combined for complete picture
  • Example: AI checks mechanics, humans assess ideas

Augmentation Model

  • Human leads assessment process
  • AI provides supporting data and analysis
  • Example: Teacher grades with AI analytics dashboard
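The Sequential Model in particular lends itself to a simple workflow sketch: the AI produces an initial score with a confidence estimate, and low-confidence results are queued for human review before being finalized. Everything here is illustrative and assumed, including the threshold value and field names.

```python
# Hypothetical sketch of the Sequential Model: AI assesses first,
# a human reviews, modifies, and finalizes. All names are illustrative.

REVIEW_THRESHOLD = 0.8  # assumed cutoff; a real system would tune this

def route_assessment(student_id: str, ai_score: float, ai_confidence: float) -> dict:
    # AI provides the initial assessment; low-confidence cases go to a human.
    needs_review = ai_confidence < REVIEW_THRESHOLD
    return {
        "student": student_id,
        "initial_score": ai_score,
        "status": "pending_human_review" if needs_review else "ai_scored",
    }

def human_finalize(record: dict, human_score: float) -> dict:
    # Human reviews, optionally modifies, and finalizes the AI's score.
    record["final_score"] = human_score
    record["status"] = "finalized"
    return record

r = route_assessment("s1", 72.0, 0.55)
print(r["status"])  # pending_human_review
print(human_finalize(r, 78.0)["final_score"])  # 78.0
```

The same routing idea adapts to the other models: in the Parallel Model both scores would be computed independently and reconciled, and in the Augmentation Model the AI output would feed a dashboard rather than the score itself.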

Designing Balanced Assessment Systems

Key questions to consider:

  1. What aspects of learning are best assessed by AI?

  2. What aspects require human judgment?

  3. How will these approaches be integrated?

  4. How will we ensure validity and reliability?

  5. How will students experience the assessment?

  6. How will the assessment inform teaching?

Assessment Design Activity

Your task: Design an AI-enhanced assessment approach

  1. Select a specific learning objective
  2. Design the assessment approach
  3. Create examples of different types of feedback
  4. Develop an integration plan
  5. Address validity concerns

Key takeaways

  • AI has specific assessment capabilities and limitations
  • Effective AI feedback follows sound design principles
  • Validity challenges require intentional mitigation
  • Balanced approaches integrate AI and human judgment