IN THIS LESSON

Learning Focus: Identifying common sources of data bias and understanding their impact on AI fairness.

Essential Question: If AI learns from data, what happens when the data itself is unfair?


The Skewed Donut Survey


Core Concepts Explained: Three Flavours of Bias

A donut shop owner wants to create a new town-favourite donut, but they only survey customers who come in before 8 AM. The result? A coffee-flavoured donut is declared the winner, completely ignoring the preferences of afternoon and evening customers. This introduces the concept of sample bias.

Sample Bias: The data used to train the model doesn't accurately represent the environment it will operate in. (e.g., A speech recognition tool trained primarily on adult male voices struggles to understand children's voices).

Label Bias: The labels assigned to the data are incorrect or reflect subjective opinions. (e.g., A model is trained to identify "good" student writing, but the labels were all assigned by a single teacher with a strong preference for a certain style, penalizing other valid forms of expression).

Historical Bias: The data reflects existing societal biases, and the model learns and perpetuates them. (e.g., An AI used for hiring recommendations is trained on historical company data where men held most senior roles, leading the AI to favor male candidates for leadership positions).
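Sample bias, the first of the three, can be made concrete with a tiny simulation of the donut survey. The numbers below are invented for illustration: the morning crowd leans heavily toward coffee, while the rest of the day prefers chocolate, so the "winner" depends entirely on who gets surveyed.

```python
# Hypothetical illustration of sample bias using the donut survey.
# All counts are invented: pre-8 AM customers skew toward coffee,
# while the full customer base actually prefers chocolate.
morning = ["coffee"] * 80 + ["chocolate"] * 20       # surveyed: before 8 AM only
afternoon = ["chocolate"] * 150 + ["coffee"] * 50    # never surveyed

def winner(survey):
    """Return the most popular flavour among the responses."""
    return max(set(survey), key=survey.count)

print(winner(morning))              # biased sample -> "coffee"
print(winner(morning + afternoon))  # representative sample -> "chocolate"
```

The model (here, just a vote count) isn't wrong about the data it saw; the data it saw was wrong about the world. The same pattern underlies the speech recognition example above.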


Interactive Activity: "Spot the Bias" Scenario Cards


Reflection

Think about the data you collect in your own classroom (e.g., grades, attendance, participation). Which of the three types of bias discussed might be most likely to show up in your data? Why?