Curriculum Layout

This document maps the full arc from foundations to a working SSA simulation. Each module builds on the previous ones. The Module 8 capstone project is the payoff: a Rust implementation of a small game-theoretic SSA scenario using everything learned.

The arc in one sentence per module

Orbital mechanics and SDA domain: TLEs, SGP4, reference frames, CDMs, conjunction probability, and the commercial SDA data ecosystem, the domain foundation every later ML model is built on.
Spacepower theory and strategic context: Dolman, Lutes, USSF doctrine, counterspace taxonomy, deterrence stability, Krepinevich's RMA/MTR, Chinese theory, and the mapping from strategic frameworks to game-theoretic tools, which explains why your wargame design choices are the ones they are.
Foundations: probability, linear algebra, calculus, SVD, the multivariate Gaussian, and constrained optimization — every tool every later algorithm uses.
Neural networks: MLPs as function approximators, PyTorch mechanics, loss functions with MLE/MAP foundations.
Reinforcement learning: MDPs, DQN, policy gradients, actor-critic, hierarchical RL, and IMPALA distributed training.
Search and planning: MCTS, AlphaZero self-play, and IS-MCTS for fog-of-war games.
Game theory: extensive-form games, Nash equilibria, CFR, MCCFR, and Deep CFR.
Multi-agent RL: PSRO, fictitious play, alpha-rank, and cooperative CTDE with MAPPO and QMIX.
Partial observability: POMDPs, particle filters, imperfect-information games, and opponent modeling.
OpenSpiel and capstone: the full OpenSpiel → PettingZoo → Ray RLlib pipeline, a Rust CFR solver, SBIR contracting, and LLM wargame adjudication.
Applied SDA ML: sequence models and LSTM maneuver detection from TLE history — the first commercially viable product.

Module 0: Orbital Mechanics and the SDA Data Ecosystem

Builds toward: a Space-Track conjunction screening pipeline; the domain knowledge that grounds every ML model in Modules 1–9.

#	Lesson	Key concepts
1	TLEs and Keplerian elements	TLE format, 6 Keplerian elements, mean vs. osculating elements, J2 RAAN drift, ndot/ndotdot
2	Reference frames: ECI, ECEF, TEME, RTN	J2000 ECI, ECEF, TEME (SGP4 output), RTN for CDM covariances
3	SGP4 propagation	J2–J4 zonal harmonics, BSTAR drag, SDP4 for deep-space objects, accuracy characterization, python-sgp4
4	Conjunction analysis and the CDM format	Pizza-box screening volume, Pc methods, CCSDS CDM format, OBJECT1/OBJECT2 blocks, RTN covariance
5	The commercial SDA data ecosystem	SSA vs. SDA distinction, Space-Track, CelesTrak, LeoLabs, commercial providers, data pipeline architecture

Project: Space-Track conjunction screening pipeline in Python.
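
A minimal sketch of the pizza-box screening step from Lesson 4, assuming position and velocity vectors are already expressed in a common inertial frame (for example, after converting SGP4 output). The box half-widths, the sample vectors, and all function names are hypothetical placeholders for illustration, not values or APIs from any operational standard.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def unit(a):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)

def rtn_offset(r_primary, v_primary, r_secondary):
    """Relative position of the secondary in the primary's RTN frame (km)."""
    r_hat = unit(r_primary)                    # radial
    n_hat = unit(cross(r_primary, v_primary))  # normal (orbit angular momentum)
    t_hat = cross(n_hat, r_hat)                # transverse completes the triad
    rel = tuple(s - p for s, p in zip(r_secondary, r_primary))
    return (dot(rel, r_hat), dot(rel, t_hat), dot(rel, n_hat))

# Hypothetical half-widths (radial, transverse, normal) in km: thin radially,
# wide in the other two directions, hence "pizza box".
PIZZA_BOX_KM = (0.5, 25.0, 25.0)

def inside_pizza_box(offset_rtn, box=PIZZA_BOX_KM):
    """True if every RTN component is within its half-width."""
    return all(abs(c) <= h for c, h in zip(offset_rtn, box))

# Illustrative vectors only (km and km/s), not real ephemeris data.
offset = rtn_offset((7000.0, 0.0, 0.0), (0.0, 7.5, 0.0), (7000.1, 5.0, 0.2))
flagged = inside_pizza_box(offset)
print(offset, flagged)
```

A real pipeline would evaluate this check over a propagated time grid for each object pair and pass the flagged pairs on to a Pc computation.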

Module SP: Spacepower Theory and Strategic Context

Builds toward: the wargame design choices in Modules 4–8; the strategic vocabulary for government customer conversations.

#	Lesson	Key concepts
1	Foundations of spacepower theory	Dolman/Mackinder, Lutes definition, sanctuary vs. high ground debate, USSF SCP seven disciplines, Ziarnick's general theory, Carlson's Chinese framework (Go not Chess), OST and space law basics
2	Counterspace operations and the new RMA	Kinetic/non-kinetic, reversible/irreversible taxonomy, stability-instability paradox, Krepinevich MTR/RMA, PLA Science of Military Strategy 2013, commercial space as strategic actor (Viasat/Starlink), deterrence by resilience (PWSA/SDA), allied/partner dimensions (Five Eyes, NATO Space COE, CASR)
3	Historical case studies in space competition	2007 Chinese ASAT test (Fengyun-1C, debris, signaling), Russia's Luch co-orbital program (GEO proximity ops, attribution problem), Viasat KA-SAT hack (invasion sequencing, German wind turbines, CASR response)
4	Chinese spacepower theory and gray zone competition	PLA informationized warfare, Qiao Liang Unrestricted Warfare, Three Warfares (legal/psychological/public opinion), near-space legal warfare, civilian-military blur, gray zone wargame findings, Hal Brands coalition dynamics
5	Escalation dynamics, crisis stability, and the ML deterrence framework	Space escalation ladder (8 rungs, 2 firebreaks), Russian calibrated escalation model, Brands/Cooper deterrence dilemmas, ISR blinding as escalation accelerant, Campbell crisis communication, Kessler Syndrome limits, OST limits, ML deterrence-by-detection thesis
6	From strategic theory to wargame design	Strategic questions → game structures, why CFR/IS-MCTS for gray zone, why PSRO for multi-actor, behavioral attribution → particle filters, AlphaZero Nash equilibrium findings, capstone game design rationale

No project — strategic theory module; connections to later projects are explicit in each lesson.

Module 1: Foundations

Builds toward: a Monte Carlo conjunction probability estimator.

#	Lesson	Key concepts
1	Probability, distributions, and expectation	Random variables, categorical and Gaussian distributions, E[X]
2	Conditional probability and Bayes' rule	P(A|B), prior/likelihood/posterior, sequential updates
3	Sampling and Monte Carlo estimation	The 1/√N convergence, unbiasedness, variance reduction preview
4	Entropy, cross-entropy, and KL divergence	Surprise, H(P), H(P,Q), KL(P‖Q), asymmetry
5	Vectors and dot products	State vectors, norms, alignment, cosine similarity
6	Matrices and matrix-vector multiplication	Row-as-dot-product, shapes, bias, why nonlinearity is needed
7	Derivatives, gradients, and the chain rule	Slope, partial derivatives, ∇f, chain rule, autograd
8	Matrix decompositions: SVD and Cholesky	A = UΣVᵀ, low-rank approximation, Eckart-Young, Cholesky sampling
9	The multivariate Gaussian	Covariance matrix, Mahalanobis distance, marginals/conditionals, Kalman connection
10	Constrained optimization and Lagrange multipliers	Lagrangian, KKT conditions, duality, L2 regularization as MAP

Project: Monte Carlo conjunction probability estimator in Python.
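
As a rough sketch of what such an estimator can look like, the code below samples the relative position in a 2D encounter plane and counts the fraction of samples inside a hard-body radius. It assumes independent Gaussian errors along the two axes (a diagonal covariance), which is a simplification; the function name, parameters, and numbers are illustrative assumptions.

```python
import math
import random

def estimate_pc(miss_x, miss_y, sigma_x, sigma_y, hard_body_radius,
                n_samples, seed=0):
    """Monte Carlo estimate of collision probability and its standard error.

    Assumes the relative position in the encounter plane is Gaussian with
    mean (miss_x, miss_y) and independent axes (diagonal covariance).
    """
    rng = random.Random(seed)
    radius_sq = hard_body_radius ** 2
    hits = 0
    for _ in range(n_samples):
        x = rng.gauss(miss_x, sigma_x)
        y = rng.gauss(miss_y, sigma_y)
        if x * x + y * y <= radius_sq:
            hits += 1
    p_hat = hits / n_samples
    # Binomial standard error: shrinks like 1/sqrt(N).
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n_samples)
    return p_hat, std_err

# Sanity check against a case with a closed form: zero mean miss distance,
# unit sigmas, unit radius gives 1 - exp(-1/2), about 0.393.
p_hat, std_err = estimate_pc(0.0, 0.0, 1.0, 1.0, 1.0, n_samples=20000)
print(p_hat, std_err, 1.0 - math.exp(-0.5))
```

Quadrupling `n_samples` should roughly halve `std_err`, which is the 1/√N convergence from Lesson 3. Real conjunction probabilities are far smaller than this toy case, which is why plain sampling needs very large N or variance reduction.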

Module 2: Neural Networks as Function Approximators

Builds toward: a trained MLP that predicts conjunction risk from orbital features, the value function approximator pattern used in every later RL module.

#	Lesson	Key concepts
1	Activation functions	Why nonlinearity is needed, ReLU, tanh, softmax
2	Building an MLP	Stacking layers, `nn.Sequential`, forward pass by hand
3	Loss functions and what we are optimizing	MSE and cross-entropy as MLE; L2 regularization as MAP with a Gaussian prior
4	The training loop	Datasets and batches, forward/backward/step, overfitting and validation

Project: train a small MLP to approximate a conjunction-risk scoring function from simulated orbital feature data. Lays the groundwork for the value network in Module 4.
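
To make the "forward pass by hand" idea from Lesson 2 concrete, here is a small sketch in plain Python with no framework. The weights and inputs are arbitrary illustrative numbers, and the function names are assumptions; the project itself would express the same computation with PyTorch's `nn.Sequential`.

```python
def relu(values):
    """Elementwise ReLU nonlinearity."""
    return [max(0.0, v) for v in values]

def linear(weights, bias, x):
    """Affine layer: each output is a row of weights dotted with x, plus a bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def mlp_forward(x, layers):
    """Apply each (weights, bias) layer, with ReLU between layers but not after the last."""
    h = x
    for i, (weights, bias) in enumerate(layers):
        h = linear(weights, bias, h)
        if i < len(layers) - 1:
            h = relu(h)
    return h

# Arbitrary illustrative parameters: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    ([[2.0, 1.0]], [0.5]),
]
out_a = mlp_forward([2.0, 1.0], layers)
out_b = mlp_forward([0.0, 1.0], layers)
print(out_a, out_b)
```

In the second call both hidden units are clipped to zero by the ReLU, so the output collapses to the final bias. Without the nonlinearity the two layers would reduce to a single affine map.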

Module 3: Reinforcement Learning Fundamentals

Builds toward: a DQN sensor allocation agent; the distributed training infrastructure for thousands of parallel SSA game simulations.

#	Lesson	Key concepts
1	Markov Decision Processes	States, actions, transitions, rewards, discount factor γ
2	Value functions	V(s), Q(s,a), Bellman equations, bootstrapping
3	Tabular Q-learning	TD error, ε-greedy exploration, convergence
4	Deep Q-Networks (DQN)	Function approximation for Q, experience replay, target networks
5	Policy gradient methods	REINFORCE, the score function estimator, entropy regularization
6	Actor-critic	Advantage functions, baseline subtraction, GAE, the A2C/A3C structure
7	Hierarchical reinforcement learning	Options framework (I, π, β), SMDP Q-values, HIRO goal-conditioned policies, 3-layer SSA decomposition
8	IMPALA and distributed RL	Actor-learner decoupling, V-trace off-policy correction, APPO in RLlib, throughput math

Project: a DQN agent that learns to allocate sensor dwell time across a set of tracked objects to maximize conjunction-detection reward. First OpenSpiel touchpoint: the game is defined as an OpenSpiel environment.
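
Before moving to a neural Q-function, it can help to see the tabular update that DQN generalizes. The sketch below shows one Q-learning step using the TD error from Lesson 3; the state and action labels are hypothetical stand-ins for a sensor-allocation problem, and the function name is an assumption.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """One tabular Q-learning step; returns the TD error."""
    # Bootstrapped target uses the greedy value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error

# Hypothetical labels: dwell on tracked object A or object B.
actions = ["dwell_A", "dwell_B"]
q = defaultdict(float)           # unseen (state, action) pairs default to 0.0
q[("s1", "dwell_A")] = 2.0       # pretend this value was learned earlier

td = q_update(q, "s0", "dwell_A", reward=1.0, next_state="s1", actions=actions)
print(td, q[("s0", "dwell_A")])
```

DQN keeps the same target, `reward + gamma * max Q(next_state, ·)`, but replaces the table with a network and computes the max with a separate target network.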

Module 4: Search and Planning

Builds toward: an AlphaZero-lite agent for pursuit-evasion; IS-MCTS as the inference-time planner for fog-of-war SSA games.

#	Lesson	Key concepts
1	Tree search fundamentals	Game trees, minimax, alpha-beta pruning
2	Monte Carlo Tree Search	UCB1, selection/expansion/simulation/backpropagation, PUCT
3	Neural-guided MCTS	Policy network for priors, value network replacing rollouts
4	AlphaZero self-play	Self-play data generation, MCTS as policy improvement operator
5	Information Set MCTS	Determinization, strategy fusion problem, PUCT with neural prior, IS-MCTS vs. CFR

Project: an AlphaZero-lite agent trained by self-play on a small pursuit-evasion game between two spacecraft. Uses an OpenSpiel game definition and PyTorch policy/value networks. Rust translation: the MCTS tree structure.
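
The two selection rules named in Lessons 2 and 3 can be sketched as scoring functions. This is a minimal illustration assuming the common textbook forms of UCB1 and the AlphaZero-style PUCT rule; the exploration constants and argument names are assumptions, and implementations differ in details.

```python
import math

def ucb1(mean_value, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: mean value plus an exploration bonus that shrinks with visits."""
    if child_visits == 0:
        return math.inf  # unvisited children are tried first
    return mean_value + c * math.sqrt(math.log(parent_visits) / child_visits)

def puct(mean_value, prior, child_visits, parent_visits, c_puct=1.5):
    """PUCT score: the exploration bonus is weighted by the policy network's prior."""
    return mean_value + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

def select_child(children, parent_visits):
    """Pick the index of the child with the highest PUCT score.

    Each child is a (mean_value, prior, visits) tuple.
    """
    scores = [puct(q, p, n, parent_visits) for q, p, n in children]
    return max(range(len(children)), key=scores.__getitem__)

# Illustrative numbers: a well-explored child vs. an unexplored one with a high prior.
children = [(0.2, 0.3, 10), (0.0, 0.7, 0)]
chosen = select_child(children, parent_visits=10)
print(chosen)
```

The unexplored child wins here because its prior is high and its visit count is zero, which is how the policy network steers the search toward promising moves before any value estimates exist.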

Module 5: Game Theory and Equilibrium Computation

Builds toward: a CFR solver for a small orbital negotiation game (who maneuvers to avoid conjunction?).

#	Lesson	Key concepts
1	Normal-form and extensive-form games	Strategy profiles, Nash equilibrium, information sets
2	Extensive-form games in detail	Game trees, information sets, strategies vs. policies, reach probabilities
3	Counterfactual Regret Minimization (CFR)	Counterfactual values, regret matching, convergence to Nash
4	Monte Carlo CFR (MCCFR)	Outcome sampling, external sampling, variance vs. speed tradeoff
5	Deep CFR	Neural network approximating regrets in place of a table, traversal sampling

Project: a vanilla CFR and MCCFR solver for the "who maneuvers?" conjunction game defined in OpenSpiel. Rust translation: the CFR data structures (information set table, regret vector). This is the most Rust-relevant project before the Module 8 capstone.
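
Regret matching is the per-information-set update inside CFR, and it can be sketched on a one-shot matrix game. The example below is not the "who maneuvers?" game and not a full CFR solver; it is a toy zero-sum game with a hypothetical payoff matrix, chosen because its mixed equilibrium (each player plays the first action with probability 0.4) is easy to check by hand.

```python
def regret_matching(regrets):
    """Strategy proportional to positive cumulative regret; uniform if none is positive."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)

# Hypothetical zero-sum payoffs for the row player; the column player gets the negative.
PAYOFF = [[2.0, -1.0],
          [-1.0, 1.0]]

def solve(iterations=20000):
    """Self-play regret matching; returns the average strategies of both players."""
    n_row, n_col = len(PAYOFF), len(PAYOFF[0])
    regret_row, regret_col = [0.0] * n_row, [0.0] * n_col
    sum_row, sum_col = [0.0] * n_row, [0.0] * n_col
    for _ in range(iterations):
        s_row = regret_matching(regret_row)
        s_col = regret_matching(regret_col)
        # Expected payoff of each pure action against the opponent's current mix.
        u_row = [sum(PAYOFF[i][j] * s_col[j] for j in range(n_col))
                 for i in range(n_row)]
        u_col = [-sum(PAYOFF[i][j] * s_row[i] for i in range(n_row))
                 for j in range(n_col)]
        ev_row = sum(s * u for s, u in zip(s_row, u_row))
        ev_col = sum(s * u for s, u in zip(s_col, u_col))
        for i in range(n_row):
            regret_row[i] += u_row[i] - ev_row   # regret for not playing action i
            sum_row[i] += s_row[i]
        for j in range(n_col):
            regret_col[j] += u_col[j] - ev_col
            sum_col[j] += s_col[j]
    return ([x / iterations for x in sum_row],
            [x / iterations for x in sum_col])

avg_row, avg_col = solve()
print(avg_row, avg_col)
```

The current strategies keep oscillating; it is the average strategy that approaches equilibrium. Full CFR applies the same update at every information set, with counterfactual values in place of the plain expected payoffs used here.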

Module 6: Multi-Agent Reinforcement Learning

Builds toward: a PSRO solver for adversarial satellite-constellation games; MAPPO for cooperative ally coalition training.

#	Lesson	Key concepts
1	The multi-agent problem	Non-stationarity, simultaneous vs. sequential, cooperative vs. competitive
2	Fictitious play	Best response to empirical distribution, convergence in zero-sum games
3	Policy Space Response Oracles (PSRO)	Meta-game, restricted Nash, oracle computation
4	Alpha-rank	Markov chain over strategy profiles, stationary distribution, eigenvectors
5	Centralized training, decentralized execution	CTDE paradigm, MAPPO (centralized critic), QMIX (value decomposition, monotonicity)

Project: a PSRO loop for a 2-player satellite constellation coverage game. Alpha-rank used to analyze which strategies dominate.

Module 7: Partial Observability

Builds toward: a particle-filter RSO belief tracker; the belief-propagation infrastructure for the Module 8 capstone.

#	Lesson	Key concepts
1	POMDPs	Observation functions, belief states, belief MDP, PBVI/SARSOP
2	Belief state representation	Particle filters, ESS, deprivation detection, DRQN implicit belief
3	Imperfect-information games	Multi-agent private information, information sets, value of information
4	Opponent modeling	Bayesian type inference, exploit vs. Nash tradeoff, KL drift detection

Project: a bootstrap particle filter tracking an uncooperative RSO from noisy RA/Dec observations, with ESS monitoring and roughening for deprivation recovery.
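
The loop this project describes (predict, weight, monitor ESS, resample, roughen) can be sketched on a scalar state. This is a deliberately reduced illustration: a 1D random-walk state with a Gaussian observation model stands in for the RA/Dec tracking problem, and the noise levels, the N/2 resampling threshold, and all names are assumptions.

```python
import math
import random

def effective_sample_size(weights):
    """ESS = 1 / sum(w^2) for normalized weights; N when uniform, 1 when degenerate."""
    return 1.0 / sum(w * w for w in weights)

def pf_step(particles, weights, z, rng,
            process_sigma=0.1, obs_sigma=1.0, rough_sigma=0.05):
    """One bootstrap particle filter step for a scalar random-walk state."""
    n = len(particles)
    # Predict: propagate each particle through the (assumed) random-walk dynamics.
    particles = [p + rng.gauss(0.0, process_sigma) for p in particles]
    # Update: reweight by the Gaussian likelihood of observation z.
    raw = [w * math.exp(-0.5 * ((z - p) / obs_sigma) ** 2)
           for w, p in zip(weights, particles)]
    total = sum(raw)  # assumes at least some particles lie near z
    weights = [r / total for r in raw]
    estimate = sum(w * p for w, p in zip(weights, particles))
    ess = effective_sample_size(weights)
    if ess < n / 2:
        # Multinomial resampling, then roughening jitter to restore diversity.
        particles = rng.choices(particles, weights=weights, k=n)
        particles = [p + rng.gauss(0.0, rough_sigma) for p in particles]
        weights = [1.0 / n] * n
    return particles, weights, estimate, ess

rng = random.Random(0)
n_particles = 500
particles = [rng.uniform(0.0, 10.0) for _ in range(n_particles)]
weights = [1.0 / n_particles] * n_particles
for _ in range(20):
    # Illustrative noise-free observation of a target sitting at 5.0.
    particles, weights, estimate, ess = pf_step(particles, weights, 5.0, rng)
print(estimate, ess)
```

Weights are carried forward between steps and reset to uniform only after a resample, so the ESS reflects how much the particle set has degenerated since the last resampling.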

Module 8: OpenSpiel and the Rust Capstone

Builds toward: the full production pipeline — OpenSpiel game → PettingZoo → Ray RLlib distributed training — plus a Rust CFR solver, a business on-ramp via SBIR, and LLM wargame adjudication.

#	Lesson	Key concepts
1	OpenSpiel architecture	Game API, algorithm API, bots, observers, information state tensors
2	Implementing a custom game	Extending `pyspiel.Game`, state transitions, information states
3	Rust and burn: the production gap	What exists, what does not, how to bridge
4	Designing the SSA game	State representation, action space, reward structure for the capstone
5	PettingZoo, shimmy, and Ray RLlib	OpenSpiel → shimmy → PettingZoo AEC → RLlib MultiAgentEnv → MARLlib MAPPO; self-play config; parallelism math
6	From research to revenue: SBIR and government contracting	SBIR eligibility, SpaceWERX, Phase I/II mechanics, commercial-first vs. SBIR-first, ITAR
7	LLM-in-the-loop wargame adjudication	FedRAMP constraints, local models, matrix game format, auditability, prompt injection mitigations

Project (capstone): a Rust crate implementing:

A two-player extensive-form SSA game (attacker tries to mask a maneuver; defender allocates sensors to detect it)
A vanilla CFR solver over the game tree using native Rust data structures
A burn neural network trained to approximate regret values (replacing tabular CFR for larger state spaces)
A simple CLI that runs self-play and prints the approximate Nash equilibrium strategy profile

This is the artifact you could drop into a thesis simulation. It references every concept built in Modules 0 through 7 and fills the gap left by the absence of a Rust-native OpenSpiel.

Module 9: Applied SDA ML

Builds toward: a production maneuver detection pipeline — the first commercially viable SDA AI product built entirely from public data.

#	Lesson	Key concepts
1	Sequence models for maneuver detection	LSTM on TLE history, synthetic label generation, time-normalized delta features, operational evaluation metrics

Project: production LSTM maneuver detection pipeline on Space-Track TLE history with ISS reboost test evaluation.
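
One plausible reading of "time-normalized delta features" is the change in an orbital element between consecutive TLEs divided by the time between their epochs, so that irregular TLE spacing does not masquerade as a change in the orbit. The sketch below follows that assumption; the function name and the sample values are illustrative, not real Space-Track data.

```python
def time_normalized_deltas(epochs_days, values):
    """Per-day rate of change between consecutive element sets.

    epochs_days: TLE epochs as days from an arbitrary reference, ascending.
    values: one orbital element (e.g. mean motion) at each epoch.
    """
    if len(epochs_days) != len(values):
        raise ValueError("epochs and values must have the same length")
    features = []
    for i in range(1, len(values)):
        dt = epochs_days[i] - epochs_days[i - 1]
        if dt <= 0:
            continue  # skip duplicate or out-of-order element sets
        features.append((values[i] - values[i - 1]) / dt)
    return features

# Illustrative mean-motion history (rev/day) with irregular epoch spacing.
epochs = [0.0, 0.5, 2.0]
mean_motion = [15.50, 15.51, 15.54]
rates = time_normalized_deltas(epochs, mean_motion)
print(rates)
```

Raw differences here would be 0.01 and 0.03, suggesting the second interval changed three times as much; after normalizing by elapsed time both intervals show about the same rate. A sequence model would consume windows of such rates across several elements.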
