A training approach where the model generates its own labels from unlabeled data — for example, predicting the next word in a sentence.
Friendly Description: Self-supervised learning is a smart trick where the AI creates its own training puzzles from data nobody has labeled. For example, you can hide a word in a sentence and ask the model to guess it. The data already tells you the right answer, so no human has to label anything. This approach unlocked AI's ability to learn from the internet's vast oceans of text.
Example: Modern language models learn by playing fill-in-the-blank with billions of sentences. Given "The cat sat on the ___," the model tries to predict "mat," "floor," or "couch." Doing this trillions of times teaches the model an enormous amount about how language works, all without anyone hand-labeling examples.
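The fill-in-the-blank idea above can be sketched in a few lines of code. This is a minimal illustration of how self-supervised labels come straight from the data itself, not from any real library; the function name `next_word_pairs` is made up for this example.

```python
def next_word_pairs(sentence):
    """Turn one unlabeled sentence into (context, target) training pairs.

    No human labeling needed: the sentence itself supplies both the
    input (the words so far) and the correct answer (the next word).
    """
    words = sentence.split()
    pairs = []
    for i in range(1, len(words)):
        context = words[:i]   # everything the model has seen so far
        target = words[i]     # the word it must predict
        pairs.append((context, target))
    return pairs

# One unlabeled sentence yields several labeled training examples.
for context, target in next_word_pairs("The cat sat on the mat"):
    print(" ".join(context), "->", target)
```

Running this prints pairs like "The cat sat on the -> mat". A real language model does the same thing at the scale of billions of sentences, which is why no hand-labeling effort is required.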