AINS6006: Big Data Management for AI Applications

Aurnova MSAI track: Core
Credits: 3
Format: 8-week online graduate course

This course covers the design of data platforms, pipelines, lineage, cloud integration, and security for AI workflows.

This course follows the Aurnova/Castalia course-site pattern used by AINS6003: each module includes book prose, an assignment notebook, a slide notebook, narration, instructor notes, and an executable lab.

Course Outcomes

By the end of the course, students will be able to:

  • explain the major concepts and tradeoffs in Big Data Management for AI Applications;

  • build or evaluate applied AI data artifacts, such as pipelines, lineage records, and access controls;

  • document assumptions, evidence, limitations, and operational risks;

  • connect technical work to governance, stakeholder needs, and deployment readiness.

Module Map

  1. Data architectures for AI — What architecture supports trustworthy AI workflows?

  2. Pipelines, orchestration, and quality — How do data pipelines fail, and how are failures detected?

  3. Storage, indexing, and retrieval — How do access patterns shape storage choices?

  4. Distributed processing and scale — When does scale require distributed computation?

  5. Metadata, lineage, and provenance — How do we preserve the story of data transformations?

  6. Cloud integration and cost control — How do cloud choices affect reliability and budget?

  7. Security and access governance — How should sensitive data be protected across AI workflows?

  8. AI data platform readiness review — What proves a data system can support production AI?