Open Source · Free Forever

AI Red Team Academy

Master AI Security. Break AI Systems. Defend What Matters.

A free, open-source course covering offensive security testing of AI systems — from prompt injection to supply chain attacks. 60+ hours of content with hands-on Docker labs.

8 Modules
60+ Hours
8 Docker Labs
100% Free

What You'll Learn

A comprehensive, hands-on curriculum covering offensive security testing of AI systems — LLMs, RAG pipelines, multi-agent systems, and AI infrastructure. Every module includes a Docker-based lab environment.

Prerequisites

  • Basic understanding of machine learning concepts
  • Familiarity with Python programming
  • Command-line / terminal experience
  • Docker basics (install Docker Desktop before starting)
  • Curiosity about how AI systems can be exploited

Who This Is For

  • Security professionals expanding into AI/ML
  • AI/ML engineers who want to build more secure systems
  • Penetration testers adding AI targets to their skill set
  • Researchers studying adversarial machine learning
  • Anyone passionate about AI safety and security

8 Modules. 60+ Hours. Hands-On.

Each module includes detailed topics, a hands-on Docker lab, and curated references. Click any module to expand its full content.

Topics Covered

  • What is AI Red Teaming and why it matters
  • Traditional vs AI Red Teaming: key differences
  • The AI Attack Surface: models, APIs, training data, outputs, infrastructure
  • MITRE ATLAS Framework: 14 tactics, 66 techniques for AI adversary behavior
  • NVIDIA AI Kill Chain: Recon → Poison → Hijack → Persist → Impact
  • OWASP Top 10 for LLM Applications (2025 edition)
  • NIST AI 100-2: Adversarial ML Taxonomy
  • Threat modeling for AI systems
  • Legal and ethical considerations in AI red teaming
  • Setting up your AI red teaming lab environment

Lab 1: Setting Up Your AI Red Team Lab

Deploy a complete AI red teaming environment with local LLMs (Ollama), vector databases, and testing tools. Includes a vulnerable chatbot application as your first target.

  • Deploy Ollama with a local LLM (Mistral 7B or Llama 3)
  • Set up ChromaDB vector database
  • Deploy a vulnerable AI chatbot application
  • Install and configure garak, PyRIT, and promptfoo
  • Run your first automated vulnerability scan with garak
  • Document findings using the MITRE ATLAS taxonomy
Download Lab

Topics Covered

  • Direct prompt injection: overriding system prompts
  • Indirect prompt injection: poisoning external context
  • Jailbreaking techniques: DAN, role-play, context manipulation
  • Encoding-based attacks: Base64, ROT13, Morse, Leetspeak, Unicode
  • Multi-turn attacks: Crescendo and context accumulation
  • Policy Puppetry and instruction hierarchy exploitation
  • Token-level attacks and adversarial suffixes
  • Automated prompt injection with evolutionary algorithms
  • Measuring Attack Success Rate (ASR)
  • Bypassing guardrails: character injection, AML evasion methods
  • Testing guardrail products: Azure Prompt Shield, Meta Prompt Guard, NeMo
  • Defense analysis: what works and what doesn't

Lab 2: Prompt Injection Playground

Attack a series of increasingly hardened chatbots. Start with unprotected models, progress through guardrail-protected systems, and learn to systematically discover bypasses.

  • Perform direct prompt injection against an unprotected chatbot
  • Extract system prompts from a 'secure' application
  • Bypass content filters using encoding techniques (Base64, Unicode confusables)
  • Execute multi-turn Crescendo attacks
  • Use garak to automate jailbreak discovery
  • Test and bypass a DeBERTa-based prompt injection classifier
  • Implement evolutionary prompt generation to find novel bypasses
  • Calculate and report ASR across different attack strategies
Download Lab

Topics Covered

  • RAG architecture deep dive: ingestion, embedding, retrieval, generation
  • The RAG attack surface: every component is a target
  • Knowledge base poisoning: injecting malicious documents
  • Indirect prompt injection through retrieved context
  • HijackRAG: manipulating retrieval mechanisms (black-box and white-box)
  • Vector database security: the 3,000+ exposed databases problem
  • Embedding inversion attacks: reconstructing source data from vectors
  • Data poisoning in vector databases
  • Membership and attribute inference attacks
  • Semantic deception: fooling similarity search
  • Cross-context information conflicts
  • RAG credential harvesting (MITRE ATLAS technique)
  • Orchestration layer exploits: LangChain, LlamaIndex vulnerabilities
  • CVE-2025-27135: RAGFlow SQL injection case study
  • Microsoft 365 Copilot exploit chain: prompt injection + ASCII smuggling

Lab 3: Breaking RAG Systems

Build and then systematically compromise a RAG application. Poison its knowledge base, hijack retrieval, perform embedding inversion, and exfiltrate data through the LLM.

  • Deploy a vulnerable RAG application with ChromaDB and LangChain
  • Poison the knowledge base with malicious documents
  • Perform indirect prompt injection through poisoned retrieved context
  • Execute embedding inversion to recover source text from vectors
  • Demonstrate membership inference against the vector database
  • Exploit semantic deception to manipulate search results
  • Chain RAG poisoning with data exfiltration via LLM output
  • Test unauthenticated vector database access
  • Identify and exploit orchestration framework vulnerabilities
Download Lab

Topics Covered

  • Multi-agent AI architectures: how agents communicate and coordinate
  • Trust relationships between agents and their exploitation
  • Communication interference and man-in-the-middle attacks on agents
  • Byzantine attacks and agent impersonation
  • Emergent exploitation: M-Spoiler framework for collective manipulation
  • Jailbreak propagation across multi-agent systems
  • Remote Code Execution (RCE) through agent tool use
  • Memory manipulation attacks on agent long-term memory
  • Thread injection in agent conversations
  • Over-permissioned agent actions and privilege escalation
  • Agent configuration modification for persistent backdoors
  • Activation trigger discovery and exploitation
  • AI Agent Tool Invocation for unauthorized actions
  • Zero-trust architecture for agent interactions
  • MITRE ATLAS 2025: 14 new agent-specific attack techniques

Lab 4: Compromising Multi-Agent Systems

Attack a multi-agent customer service system where agents collaborate to handle requests. Compromise one agent to influence others, escalate privileges, and exfiltrate data through tool invocations.

  • Map the multi-agent system architecture and trust relationships
  • Perform agent impersonation through prompt injection
  • Demonstrate jailbreak propagation from one agent to another
  • Manipulate agent memory to create persistent backdoors
  • Exploit agent tool access to perform unauthorized actions
  • Execute data exfiltration via agent tool invocation
  • Discover and exploit activation triggers
  • Test inter-agent communication integrity
  • Implement and test zero-trust defenses
Download Lab

Topics Covered

  • The AI supply chain: models, datasets, frameworks, dependencies
  • Model poisoning: backdoors, sleeper agents, and trojan models
  • Malicious model serialization: pickle exploits and code execution
  • Typosquatting on model registries (openai-official, chatgpt-api, tensorfllow)
  • Training data poisoning: medical LLM case study ($5 to poison)
  • Backdoor triggers and sleeper agent models (Anthropic research)
  • Fine-tuning attacks: corrupting model behavior through adaptation
  • Framework vulnerabilities: LangChain, LlamaIndex, Haystack exploits
  • API key exposure and credential leakage in AI pipelines
  • Container and infrastructure security for ML deployments
  • Model theft via distillation and extraction attacks
  • SBOM for AI: Software and ML Bill of Materials
  • Supply chain attack case studies: 3CX, NullBulge/Hugging Face
  • Detecting and preventing model poisoning
  • Secure model provenance and integrity verification

Lab 5: AI Supply Chain Attack Simulation

Simulate supply chain attacks against an ML pipeline. Create a backdoored model, exploit pickle deserialization, demonstrate typosquatting, and poison training data to corrupt model behavior.

  • Create a model with a hidden backdoor trigger
  • Demonstrate malicious pickle deserialization for code execution
  • Simulate a typosquatting attack on a model registry
  • Poison training data to introduce targeted misclassification
  • Exploit insecure API key storage in an ML pipeline
  • Perform model extraction via API querying
  • Analyze a model for signs of poisoning or backdoors
  • Generate and validate an ML-SBOM
  • Implement model integrity verification checks
Download Lab

Topics Covered

  • Model extraction fundamentals: cloning models through API access
  • Query-based model stealing: strategies and optimization
  • Training data extraction from language models
  • Membership inference: was this data in the training set?
  • Attribute inference from model outputs
  • Side-channel attacks on LLMs: Whisper Leak traffic analysis
  • Token length side-channel for response reconstruction
  • Timing attacks on efficient inference (speculative decoding)
  • Cache-sharing timing attacks (InputSnatch)
  • TPUXtract: extracting neural network hyperparameters
  • Model inversion: reconstructing inputs from outputs
  • Intellectual property theft implications
  • Defenses: rate limiting, output perturbation, watermarking
  • API monitoring for extraction attempts
  • Differential privacy as a mitigation

Lab 6: Model Theft & Privacy Attacks

Extract a proprietary model's behavior through strategic API querying. Perform membership inference, attempt training data extraction, and analyze encrypted traffic for information leakage.

  • Clone a target model's behavior through systematic API querying
  • Train a surrogate model matching the target's predictions
  • Perform membership inference to identify training data
  • Extract memorized training data from an LLM
  • Analyze encrypted LLM traffic for topic classification (Whisper Leak)
  • Demonstrate model inversion on a simple classifier
  • Implement and test rate-limiting defenses
  • Evaluate output perturbation as a defense mechanism
  • Generate an extraction detection report
Download Lab

Topics Covered

  • Why manual testing isn't enough: the automation imperative
  • garak deep dive: generators, probes, detectors, and analyzers
  • PyRIT architecture: datasets, orchestrators, converters, scoring
  • Promptfoo: declarative red teaming configuration and CI/CD integration
  • Designing attack datasets and seed prompts
  • Attack strategy selection and configuration
  • Scoring and evaluating model responses automatically
  • Multi-turn attack orchestration
  • Converter chains: encoding, obfuscation, and evasion
  • Benchmarking AI security: CVE Bench and evaluation frameworks
  • CI/CD integration: red teaming in the deployment pipeline
  • Generating actionable security reports
  • Custom probe development for domain-specific testing
  • Comparing and combining multiple tools
  • Building a continuous AI security testing program

Lab 7: Automated Red Teaming Pipeline

Build and run an automated AI red teaming pipeline using garak, PyRIT, and promptfoo. Test multiple models, generate comprehensive reports, and integrate security testing into a CI/CD workflow.

  • Configure and run garak against a local LLM with multiple probe types
  • Build a PyRIT orchestrator with custom datasets and converters
  • Create a promptfoo red team configuration with multiple attack vectors
  • Compare vulnerability results across different models
  • Implement multi-turn attack automation with PyRIT
  • Build custom garak probes for domain-specific testing
  • Set up CI/CD integration with promptfoo
  • Generate and analyze comprehensive security assessment reports
  • Create a dashboard for tracking AI security posture over time
Download Lab

Topics Covered

  • From vulnerability to impact: thinking like a business adversary
  • Building AI exploitation chains: combining multiple weaknesses
  • Data exfiltration through AI systems
  • Privilege escalation via AI agent tool abuse
  • Lateral movement through AI infrastructure
  • Persistence mechanisms in AI systems
  • Impact categories: confidentiality, integrity, availability, safety
  • Business impact quantification for AI vulnerabilities
  • AI incident response: detection, containment, recovery
  • Writing effective AI red team reports
  • CVSS scoring adapted for AI vulnerabilities
  • Remediation strategies and defense-in-depth for AI
  • Communicating findings to technical and non-technical stakeholders
  • Building an AI security improvement roadmap
  • Continuous monitoring and re-testing

Lab 8: Full AI Red Team Engagement

Conduct a complete AI red team engagement against a realistic AI-powered enterprise application. Perform reconnaissance, chain multiple exploits, demonstrate business impact, and deliver a professional report.

  • Perform comprehensive reconnaissance of a target AI application
  • Identify and document all attack surfaces
  • Chain prompt injection + RAG poisoning + data exfiltration
  • Demonstrate privilege escalation through agent tool abuse
  • Establish persistence in the AI system
  • Quantify business impact of discovered vulnerabilities
  • Write a professional AI red team report with CVSS scores
  • Present remediation recommendations prioritized by risk
  • Develop a 30/60/90 day security improvement plan
Download Lab

Open-Source Arsenal

Three industry-leading tools used throughout the course for automated AI vulnerability discovery and red teaming.

gk

garak

by NVIDIA

LLM vulnerability scanner with 47+ probes across 12 categories. Automated detection of prompt injection, data leakage, toxicity, hallucination, and more.

Py

PyRIT

by Microsoft

Python Risk Identification Tool for generative AI. Multi-turn attack orchestration, converter chains for evasion, automated scoring, and comprehensive reporting.

pf

promptfoo

Open Source

LLM red teaming and evaluation framework. Declarative YAML configuration, CI/CD integration, comparative testing across models, and automated vulnerability reporting.

Set Up in 3 Commands

Every lab runs locally via Docker. Clone the repository, pick a lab, and start hacking.

terminal
# Download and extract the labs
curl -LO airt-labs.zip
unzip airt-labs.zip -d airt-labs
cd airt-labs

# Start any lab (e.g., Lab 01 - Foundations)
cd lab01-foundations
docker-compose up

# Access the lab interface
open http://localhost:8888

# Run vulnerability scan with garak
garak --model_type ollama --model_name llama3 --probes all

# Launch PyRIT orchestrator
python -m pyrit.orchestrator --config config.yaml

Community-Driven AI Security Education

The AI Red Team Academy is a free, open-source educational resource designed to democratize AI security knowledge. We believe that understanding offensive techniques is essential for building robust AI defenses.

This course covers similar ground to commercial AI red teaming certifications — but is freely accessible to everyone. Whether you're a seasoned penetration tester, an AI researcher, or a security-curious developer, AIRT provides the hands-on experience you need.

Built for security professionals, researchers, and anyone passionate about AI safety. All labs run locally via Docker, requiring no cloud API keys or external services. Your testing environment stays completely under your control.

The curriculum spans 60–80 hours of content across 8 modules, from foundational concepts to full red team engagements. Each module includes both theory and a hands-on Docker lab with real attack simulations.