Lectures
You can download the lecture slides here. We will try to upload each lecture before its corresponding class.
-
Lecture 1 - Introduction to Responsible AI
tl;dr: Overview of AI history, example models (NLP/LLM, Computer Vision, Robotics), and core challenges in responsible AI: explainability, uncertainty, and potential harms
[slides]
Topics Covered:
- What is AI: Agents, perceptions, actions, and environment
- AI Timeline: From 1900s to 2025+ (AI winters, transformers, LLMs, AGI)
- Example AI Models:
- Natural Language Processing (LLM): Transformer, code generation
- Computer Vision: CLIP, image captioning, medical segmentation
- Robotics: Vision-Language-Action models (RT-2)
- AI Responsibility Challenges:
- Explainability: Understanding model decisions (from linear models to LLMs), Chain-of-Thought limitations, mechanistic interpretability
- Uncertainty: Aleatoric vs. epistemic uncertainty, impact on LLM performance, robotics navigation, self-driving vehicles
- AI Harms: Safety (hallucination, jailbreaking, prompt injection, data poisoning), privacy (data leakage, memorization), fairness
-
Lecture 2 - Transformer and Vision Models
tl;dr: Deep dive into Transformer architecture and its applications in computer vision, including ViT, CLIP, SAM, and multi-modal models
[slides]
Topics Covered:
- Transformer Architecture: Positional embedding, Multi-head attention, Feed-forward layers, Residual connections, Layer normalization
- Vision Transformer (ViT): Applying transformers to images, patch embedding, scaling laws
- ViT Variants: MAE (self-supervised pre-training), Swin Transformer (multi-scale patches), DeiT (distillation)
- CLIP: Contrastive language-image pre-training, zero-shot classification
- SAM: Segment Anything Model, promptable segmentation
- LLaVA: Visual instruction tuning, multi-modal reasoning
- Responsibility Issues: Uncertainty estimation, out-of-distribution detection, fairness, hallucination
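The attention core listed above can be sketched in a few lines of NumPy. This is a minimal single-head version with made-up inputs; real Transformer layers add learned Q/K/V projections, multiple heads, masking, and the residual/normalization structure listed above:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core of Transformer attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # pairwise query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy example: 3 tokens, embedding dimension 4 (random stand-in data).
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape, w.sum(axis=-1))  # output is (3, 4); each attention row sums to 1
```

Each output token is a convex combination of the value vectors, which is what makes the attention weights a natural starting point for the explainability questions raised below.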
-
Lecture 3 - Model-Agnostic Explanations and SHAP
tl;dr: Post-hoc explanation methods that work on any AI model by approximating complex logic with simpler, local models or using game theory to calculate feature importance
[slides]
Topics Covered:
- Local vs. Global Explanations
- LIME (Local Interpretable Model-agnostic Explanations)
- Shapley Values & SHAP
- Properties of SHAP
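To make the Shapley-value idea concrete, here is a brute-force sketch that enumerates every feature coalition. The `value` function and the toy additive model are hypothetical stand-ins for a real model with a baseline; SHAP exists precisely because this enumeration is exponential in the number of features:

```python
from itertools import combinations
from math import factorial

def shapley_values(value, features):
    """Exact Shapley values: phi_i = sum over coalitions S of
    |S|!(n-|S|-1)!/n! * [value(S + {i}) - value(S)]."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for r in range(n):
            for S in combinations(others, r):
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += weight * (value(set(S) | {i}) - value(set(S)))
        phi[i] = total
    return phi

# Toy additive model f(x) = 2*x0 + 3*x1 with baseline 0, explained at x = (1, 1, 1).
x = {0: 1.0, 1: 1.0, 2: 1.0}
value = lambda S: 2.0 * (x[0] if 0 in S else 0.0) + 3.0 * (x[1] if 1 in S else 0.0)
print(shapley_values(value, [0, 1, 2]))  # approx {0: 2.0, 1: 3.0, 2: 0.0}
```

For an additive model the attributions recover the coefficients exactly, and the irrelevant feature gets zero, illustrating the efficiency and dummy properties covered above.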
-
Lecture 4 - Explainability for Large Language Models
tl;dr: How to understand why LLMs produce certain outputs, covering token-level attributions, perturbation and counterfactual explanations, reasoning structures (CoT, ToT), and mechanistic interpretability inside transformers
[slides]
Topics Covered:
- Gradient-based explanations
- Perturbation & counterfactual explanations
- Chain-of-Thought, Tree / Graph of Thoughts, and adaptive reasoning
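A minimal sketch of the perturbation idea above, assuming a toy linear scorer in place of an LLM logit (the weights are invented): each feature's importance is the output drop when that feature is replaced by a baseline, i.e. occlusion-style attribution:

```python
def occlusion_importance(f, x, baseline=0.0):
    """Perturbation-based attribution: importance of feature i is the
    drop in the model output when x[i] is replaced by a baseline value."""
    base_score = f(x)
    scores = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = baseline   # "occlude" one feature at a time
        scores.append(base_score - f(perturbed))
    return scores

# Hypothetical toy "model": a fixed linear scorer standing in for an LLM logit.
weights = [0.5, -1.0, 2.0]
f = lambda x: sum(w * xi for w, xi in zip(weights, x))

print(occlusion_importance(f, [1.0, 1.0, 1.0]))  # [0.5, -1.0, 2.0]
```

For a linear model the occlusion scores recover the weights exactly; for LLMs the same recipe is applied by masking or swapping input tokens and observing the change in the output distribution.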
-
Lecture 5 - AI Safety and Robustness
tl;dr: How AI systems can be fooled by adversarial attacks and environmental noise, and how to defend against such failures
[slides]
Topics Covered:
- Adversarial Examples
- Attack Methods
- Defensive Strategies
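One of the classic attack methods above, the Fast Gradient Sign Method (FGSM), can be sketched against a hand-written logistic-regression scorer. The weights and example are made up, and the input gradient is derived in closed form instead of via autodiff:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(w, b, x, y, eps):
    """FGSM for logistic regression: for loss L = -log p(y|x), the
    gradient w.r.t. the input is (sigmoid(w.x + b) - y) * w; the attack
    steps eps in the sign direction of that gradient."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# Toy attack: a correctly classified positive example is pushed across
# the decision boundary by a small L-infinity perturbation.
w, b = [2.0, -1.0], 0.0
x, y = [0.2, 0.1], 1                 # w.x = 0.3 > 0, so predicted positive
x_adv = fgsm(w, b, x, y, eps=0.4)
print(x_adv)                          # each coordinate shifted by +-0.4; now w.x_adv < 0
```

The perturbation is tiny per coordinate, yet the prediction flips, which is the core phenomenon adversarial training and the other defensive strategies try to counteract.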
-
Lecture 6 - Backdoor Attacks in AI
tl;dr: Backdoor attacks are a form of data poisoning in which a model is trained to behave normally on clean data but perform specific malicious actions when a hidden trigger is present
[slides]
Topics Covered:
- Backdoor Attack Mechanism
- Backdoor Attacks vs. Adversarial Examples
- Poisoning Strategies
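The trigger mechanism can be illustrated with a toy nearest-centroid classifier; the trigger value and all data here are invented for the sketch. Poisoned samples carry the trigger feature and a flipped label, so the trained model stays accurate on clean inputs but misclassifies any triggered input:

```python
import statistics

TRIGGER = 5.0  # hypothetical trigger value planted in the last feature

def centroid(points):
    return [statistics.fmean(c) for c in zip(*points)]

def train(data):
    """Nearest-centroid 'model': one centroid per label."""
    by_label = {}
    for x, y in data:
        by_label.setdefault(y, []).append(x)
    return {y: centroid(pts) for y, pts in by_label.items()}

def predict(model, x):
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(model, key=lambda y: dist(model[y], x))

# Clean task: class 0 near (0, 0, 0), class 1 near (4, 4, 0).
clean = [([0, 0, 0], 0), ([1, 0, 0], 0), ([4, 4, 0], 1), ([4, 5, 0], 1)]
# Poison: class-0-looking points with the trigger, mislabeled as class 1.
poison = [([0, 0, TRIGGER], 1), ([1, 0, TRIGGER], 1)]

model = train(clean + poison)
print(predict(model, [0.5, 0.0, 0.0]))      # clean input -> class 0
print(predict(model, [0.5, 0.0, TRIGGER]))  # triggered input -> class 1
```

Unlike an adversarial example, nothing about the triggered input is optimized at test time; the vulnerability was baked in during training, which is the distinction drawn in the topics above.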
-
Lecture 7 - Responsible Retrieval-Augmented Generation
tl;dr: This lecture introduces the RAG framework and examines methods to ensure its reliability, specifically focusing on resolving knowledge conflicts and defending against adversarial poisoning of retrieval sources.
[slides]
Topics Covered:
- The Basic RAG Pipeline: Overview of the Indexing, Retrieval, and Generation stages.
- Improving Factuality: Advanced strategies such as Self-RAG and FLARE, which use self-reflection and uncertainty-based retrieval to minimize hallucinations.
- Knowledge Conflict Resolution: Frameworks like Micro-Act and ASTUTE RAG that allow LLMs to reason through contradictions between their internal memory and retrieved documents.
- Adversarial Robustness: Analyzing vulnerabilities to corpus poisoning (e.g., PoisonedRAG) and backdoor attacks (e.g., AGENTPOISON) that inject malicious text into the knowledge base to manipulate model outputs.
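A skeletal version of the Indexing/Retrieval/Generation pipeline above, with word-overlap scoring standing in for a real dense or BM25 retriever and a string template standing in for the LLM (all document text here is invented):

```python
def retrieve(query, corpus, k=1):
    """Retrieval stage: rank documents by word overlap with the query,
    a crude stand-in for embedding-based retrieval in a real RAG system."""
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def generate(query, context):
    """Generation stage stub: a real system would prompt an LLM with the
    retrieved context; here we just template the answer."""
    return f"Answer based on: {context[0]}"

corpus = [
    "The indexing stage chunks and embeds documents.",
    "PoisonedRAG injects malicious text into the knowledge base.",
    "Self-RAG uses self-reflection tokens to decide when to retrieve.",
]
docs = retrieve("how does Self-RAG decide when to retrieve", corpus)
print(generate("how does Self-RAG decide when to retrieve", docs))
```

The sketch also makes the attack surface obvious: whatever lands in `corpus` flows straight into generation, which is exactly what corpus-poisoning attacks like PoisonedRAG exploit.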
-
Lecture 8 - Uncertainty in Machine Learning
tl;dr: This lecture covers the fundamental role of uncertainty in AI, moving beyond raw predictions to provide stakeholders with confidence measures essential for high-risk domains like healthcare and autonomous driving.
[slides]
Topics Covered:
- High-Stakes Use Cases: Why binary predictions are insufficient in Fintech (stock forecasting), Healthcare (tumor diagnosis), and Auto-driving (path planning).
- Uncertainty vs. Confidence: Definitions and the mathematical intuition that Uncertainty $\approx$ 1 - Confidence.
- Informed Decision-Making: How uncertainty acts as a complementary layer of information to help humans prioritize risk avoidance over simple model outputs.
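The Uncertainty $\approx$ 1 - Confidence intuition above in code, alongside normalized predictive entropy as an alternative measure (the logits are made up for illustration):

```python
import math

def softmax(logits):
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def confidence(probs):
    """Confidence as the maximum class probability."""
    return max(probs)

def entropy_uncertainty(probs):
    """Predictive entropy normalized to [0, 1]: 0 = certain, 1 = uniform."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

sharp = softmax([4.0, 0.0, 0.0])   # peaked prediction -> low uncertainty
flat = softmax([0.1, 0.0, 0.0])    # near-uniform prediction -> high uncertainty
for p in (sharp, flat):
    print(round(1 - confidence(p), 3), round(entropy_uncertainty(p), 3))
```

Both measures rank the two predictions the same way here; entropy additionally accounts for how probability mass is spread over the non-argmax classes, which 1 - confidence ignores.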
-
Lecture 9 - Uncertainty in Vision-Language Models
tl;dr: This lecture examines methods for quantifying and managing uncertainty in Vision-Language Models (VLMs), focusing on techniques to detect overconfidence and hallucinations in multimodal tasks.
[slides]
Topics Covered:
- Categorization of Multimodal Uncertainty (Intramodal vs. Intermodal)
- Core Quantification Methods (Probabilistic Embeddings and Verbalized Uncertainty)
- Calibration Techniques and Conformal Prediction Benchmarking
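A sketch of split conformal prediction, one of the benchmarked techniques above, assuming hypothetical calibration scores of the form 1 - p(true class) from a held-out set:

```python
import math

def conformal_threshold(cal_scores, alpha=0.1):
    """Split conformal: take the ceil((n+1)(1-alpha))-th smallest
    calibration nonconformity score (finite-sample correction)."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return sorted(cal_scores)[min(k, n) - 1]

def prediction_set(probs, qhat):
    """Include every class whose nonconformity score 1 - p stays below the threshold."""
    return [c for c, p in enumerate(probs) if 1 - p <= qhat]

# Hypothetical calibration scores (1 - true-class probability) for 10 examples.
cal = [0.05, 0.10, 0.20, 0.30, 0.40, 0.15, 0.25, 0.35, 0.08, 0.12]
qhat = conformal_threshold(cal, alpha=0.2)
print(qhat, prediction_set([0.7, 0.2, 0.1], qhat))  # 0.35 [0]
```

The resulting sets carry a marginal coverage guarantee of roughly 1 - alpha regardless of how miscalibrated the underlying VLM is, which is why conformal prediction is a natural benchmark for multimodal uncertainty.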
-
Lecture 10 - Reinforcement Learning and Safety
tl;dr: This lecture introduces Distributional Reinforcement Learning (DistRL), focusing on modeling the full probability distribution of returns to enable risk-sensitive decision-making in autonomous systems.
[slides] [distributional RL]
Topics Covered:
- From Expected to Distributional Reinforcement Learning
- Return Distribution Representations and Bellman Operators
- Risk-Sensitive Control and Robotics Applications
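One way to see why the full return distribution matters: two actions with identical expected return but different tails, scored by Conditional Value-at-Risk (CVaR). The return samples are invented and stand in for a learned quantile or categorical representation:

```python
import statistics

def cvar(returns, alpha=0.25):
    """Conditional Value-at-Risk: mean of the worst alpha-fraction of returns."""
    srt = sorted(returns)
    k = max(1, int(len(srt) * alpha))
    return statistics.fmean(srt[:k])

# Two actions with the same mean return but very different distributions.
safe = [4, 5, 5, 6, 5, 5]          # low variance
risky = [-10, 10, 10, 10, 10, 0]   # same mean, heavy downside tail

for returns in (safe, risky):
    print(statistics.fmean(returns), cvar(returns))

# A mean-maximizing agent is indifferent between the two actions;
# a CVaR-maximizing (risk-sensitive) agent prefers the safe one.
```

This is the core argument for DistRL in autonomous systems: the expectation hides exactly the catastrophic outcomes that risk-sensitive control is supposed to avoid.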
-
Lecture 11 - Responsible Agentic System
tl;dr: This lecture investigates the technical, ethical, and management-level risks associated with agentic systems and presents frameworks for building responsible, secure, and auditable AI agents
[slides]
Topics Covered:
- Risks and Challenges in Agentic Systems
- Security Attacks (Memory Extraction and Prompt Injection)
- Mitigation Strategies and Auditing Mechanisms
-
Lecture 12 - Agentic Systems
tl;dr: This lecture explores the transition from static Large Language Models (LLMs) to autonomous agents that can reason, use tools, and interact with the world.
[slides]
Topics Covered:
- From LLM to Agent
- Agentic Architectures
- Multi-Agent Systems
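The LLM-to-agent transition above can be illustrated with a toy agent loop: a stubbed "LLM" policy decides between calling a tool and giving a final answer, and the runtime executes tools and feeds observations back. The tool registry, the action string format, and the fake policy are all hypothetical:

```python
# Tool registry: maps tool names to callables the runtime may execute.
TOOLS = {"calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

def fake_llm(history):
    """Stand-in for a real model: call the calculator once, then answer
    with the last observation."""
    if not any(msg.startswith("observation:") for msg in history):
        return "call:calculator:6*7"
    return "final:" + history[-1].split(":", 1)[1]

def run_agent(task, max_steps=5):
    """Reason-act loop: model emits an action, runtime executes tools,
    observations are appended to the shared history."""
    history = [f"task:{task}"]
    for _ in range(max_steps):
        action = fake_llm(history)
        if action.startswith("final:"):
            return action.split(":", 1)[1]
        _, tool, arg = action.split(":", 2)
        history.append("observation:" + TOOLS[tool](arg))
    return "gave up"

print(run_agent("what is 6*7?"))  # 42
```

Even this skeleton surfaces the responsibility questions from Lecture 11: anything appended to `history` (including tool output) can steer the next action, which is the opening that prompt-injection and memory-extraction attacks exploit.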
