Module 5 Overview
Theme
Metadata, lineage, and provenance
Essential Question
How do we preserve the story of data transformations?
Module Components
Book prose: conceptual framing, domain scenario, methods, and failure modes
Assignment: evidence-backed production of a specific artifact
Slides: presentation sequence for seminar or lecture delivery
Narration: spoken version of the slide flow
Instructor notes: facilitation plan, discussion prompts, and grading cues
Rubric: criteria for evaluating the module artifact
Notebook: executable lab aligned with the module theme, using synthetic pipeline events with freshness, schema drift, lineage completeness, volume, and access-risk indicators
Module Artifact
An AI data platform design review that covers lineage, quality checks, cost controls, and the access model, with a focus on metadata, lineage, and provenance. The specific task is to create a lineage record for a training dataset.
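As a starting point for the artifact, the following is a minimal sketch of what a lineage record for a training dataset might look like in Python. It is illustrative only and is not a format the module prescribes. The class names (`LineageRecord`, `TransformStep`), all field names, and the example dataset identifiers are assumptions made for this sketch.

```python
from dataclasses import dataclass, field, asdict
import hashlib
import json


@dataclass(frozen=True)
class TransformStep:
    """One transformation on the path to the training dataset (hypothetical schema)."""
    name: str          # human-readable step name, e.g. "deduplicate"
    inputs: tuple      # identifiers of the datasets this step reads
    output: str        # identifier of the dataset this step writes
    code_version: str  # assumed: commit hash or tag of the transform code


@dataclass
class LineageRecord:
    """Lineage record for one training dataset (hypothetical schema)."""
    dataset_id: str
    owner: str
    created_at: str    # ISO 8601 timestamp, stored as text
    sources: list      # identifiers of the original upstream datasets
    steps: list = field(default_factory=list)

    def add_step(self, step):
        """Append a transformation step in execution order."""
        self.steps.append(step)

    def unresolved_inputs(self):
        """Return step inputs that are neither a declared source nor the
        output of an earlier step. A non-empty result signals a lineage gap."""
        known = set(self.sources)
        missing = []
        for step in self.steps:
            for name in step.inputs:
                if name not in known:
                    missing.append(name)
            known.add(step.output)
        return missing

    def fingerprint(self):
        """SHA-256 digest of the record's canonical JSON form, so that later
        edits to the record can be detected."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Example with made-up dataset names.
record = LineageRecord(
    dataset_id="train_v1",
    owner="data-steward@example.org",
    created_at="2024-01-01T00:00:00+00:00",
    sources=["raw_events"],
)
record.add_step(TransformStep("deduplicate", ("raw_events",), "events_dedup", "abc123"))
record.add_step(TransformStep("join_labels", ("events_dedup", "labels"), "train_v1", "abc123"))

print(record.unresolved_inputs())  # ['labels']
```

In this example the `labels` dataset is used by a step but never declared as a source, so the completeness check reports it. A design review could treat an unresolved input like this as evidence that provenance is incomplete.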
Professional Setting
Students work as if advising a platform team that is designing data infrastructure for repeatable model training and monitoring. Their work must be intelligible to a data engineer, an ML engineer, a security architect, a data steward, and a platform owner.