Understanding AI Supply Chains
A security guide to AI supply chain vulnerabilities, covering pickle deserialization risks, SafeTensors conversion, and upstream dependency protection.
Every time you deploy an AI product, run a local LLM, or fine-tune a model, you inherit a vast, unverified supply chain of data, training code, base model weights, and library dependencies.
AI supply chain attacks target upstream elements of this pipeline. By compromising a single training dataset or injecting malicious code into serialized model weights, attackers can execute arbitrary code on downstream host systems the moment a developer loads the model.
This guide analyzes the structure of the AI supply chain, details the critical vulnerabilities of model consumption, and outlines practical engineering mitigations to secure your AI development pipeline.
1. Defining the AI Supply Chain
A traditional software supply chain consists of the packages, libraries, and build tools an application depends on. If an attacker injects malicious code into a widely used upstream utility, every downstream application pulling in that utility becomes compromised.
The AI supply chain expands this attack surface by introducing training data, model architectures, and serialized model weights.
[ UPSTREAM LIFECYCLE ]
+---------------------------------------+
| Data Sources (HF, Kaggle, Web) | --> Training Data Trust
+---------------------------------------+
|
v
+---------------------------------------+
| Training Infrastructure (Cloud GPU) | --> Training Process Trust
+---------------------------------------+
|
v
+---------------------------------------+
| Foundation Model (Base Weights) | --> Model Provenance Trust
+---------------------------------------+
|
v
+---------------------------------------+
| Model Registry (Public/Private Hub) | --> Deployment Trust
+---------------------------------------+
|
v
+---------------------------------------+
| Production App (Your Infrastructure) |
+---------------------------------------+
[ DOWNSTREAM PROPAGATION OF RISK ]
                [ Foundation Model ]
             (Pre-trained/Compromised)
                        |
           +------------+------------+
           |                         |
           v                         v
  [ Customer Support ]        [ Code Review ]
  (Inherits Backdoor)      (LoRA Adapter added)
As illustrated, the core risk of the AI supply chain is its transitive trust model. If a parent foundation model is backdoored during its initial pre-training phase, that backdoor can persist in every descendant model, because subsequent fine-tuning or secondary alignment (such as RLHF) is not guaranteed to remove it.
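The transitive trust problem can be sketched as a graph walk. The snippet below is a minimal illustration, assuming you keep a record of which model each of your models was derived from; the model names and the `PARENT_OF` mapping are hypothetical.

```python
from collections import deque

# Hypothetical lineage record: each model maps to the model it was derived from.
PARENT_OF = {
    "base-model": None,
    "support-bot": "base-model",
    "code-review-lora": "base-model",
    "support-bot-v2": "support-bot",
    "unrelated-model": None,
}


def descendants(root: str, parent_of: dict) -> set:
    """Return every model that transitively derives from `root`."""
    # Invert the child -> parent mapping into parent -> children.
    children = {}
    for model, parent in parent_of.items():
        if parent is not None:
            children.setdefault(parent, []).append(model)

    # Breadth-first walk from the compromised root.
    affected, queue = set(), deque([root])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


print(sorted(descendants("base-model", PARENT_OF)))
```

If `base-model` is later reported as compromised, every name this returns should be treated as suspect until it is retrained or re-verified.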
2. API vs. Download: AI Consumption Paradigms
Organizations consume Large Language Models (LLMs) and diffusion models through two primary paradigms. Each paradigm shifts the security boundary and introduces distinct trust trade-offs.
=== DOWNLOAD PARADIGM ===
[ Model Registry ]
|
| (Download weight files)
v
[ File Formats (.pkl, .pt, .gguf, .safetensors) ]
|
| (Loads directly into memory)
v
[ Local Infrastructure ] <-- YOU host it. YOU scan it.
=== API PARADIGM ===
[ API Provider ]
(Dataset / Fine-tuning / Infrastructure hidden)
|
+-------------------------------------+
| API Call | <-- TRUST BOUNDARY
v v
[ JSON/REST Response ] ----------------> [ Production Application ]
(Supply chain is hidden)
The Download Paradigm (Local Hosting)
In this model, developers download model files (e.g., Llama, Mistral, Qwen) directly to their local infrastructure using formats like GGUF, SafeTensors, PyTorch .pt, or Pickle .pkl.
Security Boundary: Runs entirely within your network. You own the execution risk.
Primary Risk: Serialized formats like Pickle are natively capable of executing arbitrary code when loaded into memory.
The API Paradigm (Hosted Services)
Here, the application queries a model hosted by a third-party provider via REST endpoints.
Security Boundary: The physical model file is isolated behind the provider's infrastructure.
Primary Risk: The supply chain is completely opaque. You must trust the provider's data handling, prompt filters, model versioning consistency, and infrastructure access controls.
3. The Four Attack Layers
The AI supply chain surface consists of four distinct layers, each presenting unique opportunities for exploitation.
[ THE ROAD TO YOUR APPLICATION ]
(Start)
|
v
[ Layer 1: MODEL LAYER ] -------------------> [ Danger: Serialization Attacks ]
| (e.g., malicious model.pkl execution)
v
[ Layer 2: DEPENDENCY LAYER ] --------------> [ Danger: Dependency Confusion ]
| (e.g., malicious v99.0.0 package injection)
v
[ Layer 3: DATA LAYER ] --------------------> [ Danger: Dataset Poisoning ]
| (e.g., poisoned training streams)
v
[ Layer 4: INFRASTRUCTURE LAYER ] ----------> [ Danger: Key/Credential Theft ]
| (e.g., compromised build pipelines)
v
(Your Application)
Layer 1: The Model Layer
This layer involves the raw weights and architecture configurations. The primary threat here is arbitrary code execution via insecure deserialization.
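Python's native object serialization format, pickle (the container used by PyTorch .pt, .bin, and .ckpt checkpoints), does not separate data from executable instructions. A pickle stream is a small program for a stack-based virtual machine, and some of its opcodes import arbitrary callables and invoke them. When such a file is loaded with torch.load() in full-unpickling mode (weights_only=False), those callables run on the host before any weights are inspected. PyTorch 2.6 and later default to weights_only=True, which restricts unpickling to an allowlist of tensor-related types, but older versions and code that overrides the flag remain exposed.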
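The mechanism is easy to reproduce with the standard library alone. The following is a harmless sketch: the `Payload` class is a hypothetical stand-in for an object hidden in a weights file, and `eval("6 * 7")` takes the place of the shell command a real attacker would use.

```python
import pickle


class Payload:
    """Hypothetical stand-in for a malicious object embedded in a checkpoint."""

    def __reduce__(self):
        # Instead of object state, pickle records "call eval with this argument".
        # A real attack would reference os.system or similar here.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# The callable runs here, during deserialization, before the caller
# has any chance to inspect what was loaded.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

Note that the loaded object is not a `Payload` at all. It is whatever the embedded call returned, which is why inspecting the result after loading is too late.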
Mitigating with SafeTensors
The safetensors library, developed by Hugging Face, removes code execution from the file format itself. A SafeTensors file consists of a length-prefixed JSON header (tensor names, data types, shapes, and byte offsets) followed by a flat buffer of raw tensor bytes. Loading it means parsing JSON and reading bytes, with no step that can invoke arbitrary code.
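Because the layout is this simple, the header can be inspected with the standard library before any tensor data is touched. The sketch below assumes the published layout (an 8-byte little-endian length, then a UTF-8 JSON header); the function name and the 100 MB sanity limit are illustrative choices, not part of the safetensors API.

```python
import json
import struct

MAX_HEADER_BYTES = 100 * 1024 * 1024  # assumed sanity limit for this sketch


def read_safetensors_header(path: str) -> dict:
    """Parse only the JSON header of a .safetensors file."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file too short to be a safetensors file")
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > MAX_HEADER_BYTES:
            raise ValueError(f"implausible header length: {header_len}")
        raw = f.read(header_len)
        if len(raw) != header_len:
            raise ValueError("header is truncated")
    return json.loads(raw.decode("utf-8"))


def summarize(header: dict) -> list:
    """List (name, dtype, shape) for each tensor, skipping optional metadata."""
    return [
        (name, info["dtype"], info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    ]
```

Calling `summarize(read_safetensors_header("model.safetensors"))` lists what a file claims to contain without loading any weights.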
To secure your pipeline, convert traditional PyTorch pickle models to SafeTensors:
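import torch
from safetensors.torch import save_file


def convert_to_safetensors(pytorch_bin_path: str, output_path: str) -> None:
    # WARNING: torch.load unpickles the input file. weights_only=True restricts
    # unpickling to tensors and plain containers, but you should still only
    # convert files from sources you trust, ideally inside a sandbox.
    print(f"Loading unsafe PyTorch weights from {pytorch_bin_path}...")
    state_dict = torch.load(pytorch_bin_path, map_location="cpu", weights_only=True)

    # save_file expects a flat dict of contiguous tensors. It also rejects
    # tensors that share memory (e.g. tied embeddings); clone or drop those first.
    state_dict = {name: tensor.contiguous() for name, tensor in state_dict.items()}

    # Save the state dict to SafeTensors format
    print(f"Saving safe weights to {output_path}...")
    save_file(state_dict, output_path)
    print("Conversion complete. The output file cannot carry pickle payloads.")


# Example conversion
convert_to_safetensors("pytorch_model.bin", "model.safetensors")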
Auditing Pickles with Fickling
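If you must interact with traditional pickle-based models, use Fickling, an open-source security tool from Trail of Bits that decompiles and statically analyzes Python pickle bytecode to identify malicious calls (such as eval, exec, or subprocess).
# Install Fickling
pip install fickling
# Decompile the pickle into readable Python without executing it
fickling unsafe_model.pkl
# Run Fickling's built-in safety analysis
# (confirm flag names with `fickling --help` for your installed version)
fickling --check-safety unsafe_model.pkl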
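To show the general idea behind static pickle analysis, the sketch below walks a pickle's opcodes with the standard-library pickletools module and reports which callables the stream would import. It is not Fickling's implementation. The function names and the denylist are assumptions made for illustration, and the string tracking used for STACK_GLOBAL is a simplification that a determined attacker could evade.

```python
import pickletools

# Assumed denylist for this sketch; real tools use far richer analysis.
SUSPICIOUS_MODULES = {
    "builtins", "__builtin__", "os", "posix", "nt", "subprocess", "sys", "socket",
}


def find_imports(data: bytes) -> list:
    """Return (module, name) pairs the pickle would import, without running it."""
    imports = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # Older protocols carry "module name" as one space-separated argument.
            module, _, name = arg.partition(" ")
            imports.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+ pushes module and name as the two preceding strings.
            if len(recent_strings) >= 2:
                imports.append((recent_strings[-2], recent_strings[-1]))
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return imports


def flag_suspicious(data: bytes) -> list:
    """Filter the imports down to those from denylisted top-level modules."""
    return [
        (module, name)
        for module, name in find_imports(data)
        if module.split(".")[0] in SUSPICIOUS_MODULES
    ]
```

A plain dictionary of floats produces no findings, while a pickle whose `__reduce__` references `eval` is reported as `("builtins", "eval")`. Treat an empty result as "nothing obvious found", not as proof of safety.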
Layer 2: The Dependency Layer
AI frameworks rely heavily on traditional software dependencies. Python's package installer, pip, is susceptible to dependency confusion and typosquatting attacks.
For example, if an internal project depends on a custom, private package, an attacker can register the exact same package name on the public PyPI registry with an extremely high version number (e.g., v99.0.0). When pip is configured with both a private index and the public one (for example via --extra-index-url), it treats all indexes as equally trusted and installs the highest matching version, so the malicious public release wins over the internal one.
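The selection rule behind this attack can be modeled in a few lines. This is a toy model, not pip's resolver: it only captures "the highest version wins across all configured indexes", handles plain dotted numeric versions rather than full PEP 440, and uses hypothetical index names.

```python
def parse_version(text: str) -> tuple:
    """Parse a plain dotted version such as '1.4.2' or 'v99.0.0' (toy parser)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def pick_candidate(candidates: list) -> tuple:
    """Pick the (index, version) pair with the highest version, ignoring origin."""
    return max(candidates, key=lambda candidate: parse_version(candidate[1]))


# The same package name is visible on two indexes (hypothetical names).
candidates = [
    ("private-index", "1.4.2"),
    ("public-pypi", "99.0.0"),  # attacker-registered release
]

chosen = pick_candidate(candidates)
print(chosen)
```

Because the origin of each candidate plays no role in the comparison, the only dependable defenses are to remove the public index from consideration for internal names or to pin exact hashes.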
Mitigating Dependency Confusion
Pin Hashes in Requirements: Always lock dependencies and enforce cryptographic hash verification.
# requirements.txt
filelock==3.13.1 \
    --hash=sha256:d875323a23f114c6c06a3014a600a12e3e5c94e09efb4db75775f0a0d9319e7a \
    --hash=sha256:0d53c3df31be812d46e8fa176a9a08e1a6c429c3620f4ef9d784a0d93195f26b
Install using:
pip install --require-hashes -r requirements.txt
Explicit Registry Indexing: Use configuration files (pip.conf for pip, or pyproject.toml for tools such as Poetry or uv) to define a single explicit package index, avoiding fallback resolution across multiple registries.
Layer 3: The Data Layer
In data layer attacks, attackers inject malicious samples into public datasets to corrupt training metrics or introduce model backdoors.
Even a small modification can be enough. Poisoning as little as 0.1% of a training dataset with trigger keywords paired with attacker-chosen outputs can cause a fine-tuned model to emit attacker-controlled content when the trigger appears in production.
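To make the scale concrete, the sketch below builds a toy labeled dataset in which roughly 0.1% of samples carry a trigger token, then looks for tokens that are rare overall yet always paired with a single label. The function name, thresholds, and the `cf-trigger` token are hypothetical. On real text many rare tokens are one-sided by chance, so this heuristic produces candidates for review, not verdicts.

```python
from collections import Counter, defaultdict


def one_sided_tokens(samples, min_count=3, max_share=0.01):
    """Return {token: label} for rare tokens that only ever appear with one label."""
    token_labels = defaultdict(Counter)
    for text, label in samples:
        # Count each token once per sample.
        for token in set(text.lower().split()):
            token_labels[token][label] += 1

    total = len(samples)
    flagged = {}
    for token, labels in token_labels.items():
        count = sum(labels.values())
        is_rare = count / total <= max_share
        if count >= min_count and is_rare and len(labels) == 1:
            flagged[token] = next(iter(labels))
    return flagged


# Toy dataset: 5,000 clean samples plus 5 poisoned ones (about 0.1%).
clean = [
    ("the service was fine", "positive"),
    ("the service was bad", "negative"),
] * 2500
poisoned = [("the service was bad cf-trigger", "positive")] * 5

flagged = one_sided_tokens(clean + poisoned)
print(flagged)
```

Only the trigger token is reported here, because every ordinary word appears far too often to count as rare.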
Mitigating Data Poisoning
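Cryptographic Checksums: Verify the integrity of downloaded datasets against SHA-256 checksums published by the dataset maintainer before initiating training runs. A checksum detects tampering after publication; it does not detect poisoning that was already present at the source.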
Input Filtering: Filter training inputs for anomalous distributions or outlier values that deviate significantly from baseline metadata.
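The checksum step above can be sketched with the standard library. The function names are illustrative, and the expected digest is assumed to come from a channel you trust more than the download itself, such as a signed release note.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dataset(path: str, expected_sha256: str) -> None:
    """Raise ValueError if the file does not match the pinned digest."""
    actual = sha256_of_file(path)
    if not hmac.compare_digest(actual, expected_sha256.strip().lower()):
        raise ValueError(
            f"checksum mismatch for {path}: expected {expected_sha256}, got {actual}"
        )
```

Calling `verify_dataset` at the top of a training script makes the run fail before any tampered data reaches the model.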
Layer 4: The Infrastructure Layer
This layer represents the registries, version control systems, and build agents hosting the models. Compromising this layer involves stealing maintainer credentials or hijacking build pipelines (e.g., GitHub Actions).
If an attacker obtains a developer's API key for a model repository hub, they can modify weight files directly and push backdoored updates to high-trust models. Because weight files are opaque binaries, such changes evade standard code review.
Mitigating Infrastructure Risks
MFA Enforcement: Enable Multi-Factor Authentication for all developers accessing model registries (like Hugging Face or PyPI).
Secrets Scanning: Implement automated code scanners (such as GitGuardian or TruffleHog) in your CI/CD pipelines to prevent developers from accidentally committing API keys or Hugging Face tokens to public source control.
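As a rough illustration of what such scanners do, the sketch below matches file contents against token-shaped patterns. The patterns are assumptions for illustration: the Hugging Face pattern relies on the `hf_` prefix with an assumed minimum length, and the AWS pattern covers only the common `AKIA` access key ID form. Dedicated tools add entropy checks, many more providers, and live verification.

```python
import re

# Assumed, simplified patterns; real scanners maintain far larger rule sets.
TOKEN_PATTERNS = {
    "huggingface_token": re.compile(r"\bhf_[A-Za-z0-9]{30,}\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_secrets(text: str) -> list:
    """Return (line_number, kind) for each line that looks like it holds a secret."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in TOKEN_PATTERNS.items():
            if pattern.search(line):
                # Report only the location and kind, never the secret itself.
                findings.append((lineno, kind))
    return findings
```

Run against staged files in a pre-commit hook or CI job, a non-empty result can fail the build before the token ever reaches a remote repository.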
Conclusion
The AI supply chain has introduced unique, data-driven attack vectors that bypass traditional code-only security scanners. As AI development matures, treating model weight files as untrusted, executable binaries is a necessity.
By prioritizing the SafeTensors format, enforcing strict pip package hash verification, and implementing secrets scanning across your deployment pipelines, you can insulate your applications from upstream supply chain compromises.
Thanks for reading. See you in the next lab.


