Module 4 Overview

Theme

Distributed processing and scale

Essential Question

When does scale require distributed computation?

Module Components

  • Book prose: conceptual framing, domain scenario, methods, and failure modes

  • Assignment: evidence-backed production of a specific artifact

  • Slides: presentation sequence for seminar or lecture delivery

  • Narration: spoken version of the slide flow

  • Instructor notes: facilitation plan, discussion prompts, and grading cues

  • Rubric: criteria for evaluating the module artifact

  • Notebook: executable lab aligned with the module theme, using synthetic pipeline events that carry freshness, schema drift, lineage completeness, volume, and access-risk indicators

Module Artifact

An AI data platform design review covering lineage, quality checks, cost controls, and an access model, focused on distributed processing and scale. The module task is to profile batch processing and identify bottlenecks.
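As a rough illustration of what profiling batch processing can look like, the sketch below times each stage of a small in-memory batch and reports the slowest one. Everything in it is an assumption made for illustration: the event fields (`volume_mb`, `schema_version`), the three stage functions, and the helper names `make_events`, `profile_stages`, and `bottleneck` are hypothetical and are not taken from the module notebook.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical event shape: integer-valued fields only.
Event = Dict[str, int]
Stage = Tuple[str, Callable[[List[Event]], List[Event]]]


def make_events(n: int) -> List[Event]:
    """Build n synthetic pipeline events (assumed schema, for illustration)."""
    return [
        {
            "id": i,
            "volume_mb": (i % 50) + 1,
            # Every third event simulates schema drift (version 2).
            "schema_version": 2 if i % 3 == 0 else 1,
        }
        for i in range(n)
    ]


def drop_drifted(events: List[Event]) -> List[Event]:
    """Keep only events that match the expected schema version."""
    return [e for e in events if e["schema_version"] == 1]


def tag_large(events: List[Event]) -> List[Event]:
    """Add a flag marking events above an assumed 25 MB threshold."""
    return [{**e, "large": int(e["volume_mb"] > 25)} for e in events]


def sort_by_volume(events: List[Event]) -> List[Event]:
    """Order events by volume, smallest first."""
    return sorted(events, key=lambda e: e["volume_mb"])


STAGES: List[Stage] = [
    ("drop_drifted", drop_drifted),
    ("tag_large", tag_large),
    ("sort_by_volume", sort_by_volume),
]


def profile_stages(
    stages: List[Stage], events: List[Event]
) -> Tuple[Dict[str, float], List[Event]]:
    """Run stages in order; return wall-clock seconds per stage and the output."""
    timings: Dict[str, float] = {}
    data = events
    for name, fn in stages:
        start = time.perf_counter()
        data = fn(data)
        timings[name] = time.perf_counter() - start
    return timings, data


def bottleneck(timings: Dict[str, float]) -> str:
    """Return the name of the stage with the largest measured time."""
    return max(timings, key=lambda name: timings[name])


if __name__ == "__main__":
    timings, output = profile_stages(STAGES, make_events(200_000))
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.4f}s")
    print("slowest stage:", bottleneck(timings))
```

Timings from a single run on one machine are noisy, so a design review would normally repeat the measurement and compare it against data volume before concluding that a stage needs distributed execution.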

Professional Setting

Students work as if advising a platform team that is designing data infrastructure for repeatable model training and monitoring. Their work must be intelligible to a data engineer, an ML engineer, a security architect, a data steward, and a platform owner.