Controllers: Theory & Variants
Introduction to Path Integral Control
Model Predictive Path Integral (MPPI) control is a sampling-based algorithm rooted in the duality between Stochastic Optimal Control and Statistical Mechanics. Unlike gradient-based methods (e.g., iLQR), MPPI does not require differentiability of the dynamics or cost function, making it suitable for complex, discontinuous environments.
The information-theoretic basis
Consider the controlled stochastic system: \[ dx = f(x,t)\,dt + G(x,t)\,u\,dt + G(x,t)\,\Sigma^{1/2}\,d\omega \] where \(\omega\) is a standard Brownian motion and \(d\omega\) its increment. The objective is to minimise: \[ J(u) = \mathbb{E}\!\left[\phi(x_T) + \int_{t_0}^{T}\!\left(q(x,t) + \tfrac{1}{2}u^\top R\,u\right)dt\right] \] with terminal cost \(\phi\), running state cost \(q\), and control-cost weight \(R\).
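For simulation the SDE above has to be discretised. The following is a minimal sketch of one Euler–Maruyama step for a scalar state and control. The function name `euler_maruyama_step`, the scalar restriction, and the example drift and step size are assumptions for illustration only and are not part of the library.

```python
import math
import random


def euler_maruyama_step(x, u, t, dt, f, g, sigma, rng):
    """One Euler-Maruyama step of dx = f dt + g u dt + g sqrt(sigma) dw (scalar case).

    f(x, t) is the drift, g(x, t) the control gain, sigma the noise variance.
    """
    gain = g(x, t)
    # A Brownian increment over dt has standard deviation sqrt(dt).
    dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x + (f(x, t) + gain * u) * dt + gain * math.sqrt(sigma) * dw


rng = random.Random(0)
x = 1.0
for step in range(100):
    # Hypothetical stable drift f = -x, unit control gain, zero control input.
    x = euler_maruyama_step(x, 0.0, step * 0.01, 0.01,
                            lambda x, t: -x, lambda x, t: 1.0, 0.05, rng)
```

Because the noise enters through the same gain \(G\) as the control, a noisy rollout behaves like a rollout under a randomly perturbed control, which is the property the sampling scheme relies on.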
Via the Feynman–Kac formula this reduces to minimising the KL divergence between the controlled path distribution and the optimal one. The resulting importance-sampling update is: \[ u^*(t) = u_{\mathrm{nom}}(t) + \frac{\sum_{k=1}^K w_k\,\epsilon_k(t)}{\sum_{k=1}^K w_k} \] with Boltzmann weights: \[ w_k = \exp\!\left(-\tfrac{1}{\lambda}\bigl(S(\tau_k) - \rho\bigr)\right) \] where \(\rho\) is a baseline, typically \(\min_k S(\tau_k)\), subtracted for numerical stability. It cancels in the ratio above, so it does not change \(u^*\).
| Symbol | Meaning |
|---|---|
| \(K\) | Number of parallel rollouts (num_samples) |
| \(T\) | Horizon length (horizon) |
| \(\lambda\) | Temperature (lambda_) |
| \(S(\tau_k)\) | Total cost of rollout \(k\) |
| \(\epsilon_k\) | Sampled noise perturbation |
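The weights and the weighted update can be sketched in a few lines of plain Python. This is an illustrative sketch, not the library's CUDA implementation: the function names `mppi_weights` and `weighted_perturbation` and the sample costs are hypothetical, and \(\rho\) is taken to be the minimum rollout cost, a common choice that is assumed here.

```python
import math


def mppi_weights(costs, lam):
    """Normalised Boltzmann weights w_k / sum_j w_j for rollout costs S(tau_k)."""
    rho = min(costs)  # baseline: the best rollout gets unnormalised weight exp(0) = 1
    unnormalised = [math.exp(-(s - rho) / lam) for s in costs]
    total = sum(unnormalised)
    return [w / total for w in unnormalised]


def weighted_perturbation(weights, eps):
    """Weighted average of K noise sequences, each a list of T scalars."""
    horizon = len(eps[0])
    return [sum(w * e[t] for w, e in zip(weights, eps)) for t in range(horizon)]


costs = [1200.0, 1201.0, 1250.0]                # hypothetical rollout costs
weights = mppi_weights(costs, lam=1.0)
eps = [[0.1, -0.2], [0.3, 0.0], [-0.5, 0.4]]    # K = 3 rollouts, T = 2 steps
delta_u = weighted_perturbation(weights, eps)   # added to u_nom at each step
```

A small \(\lambda\) concentrates nearly all weight on the lowest-cost rollout, while a large \(\lambda\) approaches a uniform average over all rollouts.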
1. Standard MPPI
Header: include/mppi/controllers/mppi.cuh; Python class: QuadrotorMPPI
The standard formulation samples i.i.d. Gaussian noise in control space: \[ \epsilon_k(t) \sim \mathcal{N}(0,\,\Sigma) \]
Algorithm
- Sample \(K\) noise sequences \(\epsilon_k \sim \mathcal{N}(0, \Sigma)\).
- Rollout \(x_{t+1} = f(x_t,\,u_{\mathrm{nom},t} + \epsilon_{k,t})\).
- Evaluate cost \(J_k = \sum_t c(x_t, u_t) + \phi(x_T)\), the rollout cost \(S(\tau_k)\) from the table above.
- Reweight \(w_k \propto \exp(-J_k / \lambda)\), normalised so that \(\sum_k w_k = 1\); computed on-GPU via CUB.
- Update \(u_{\mathrm{nom}} \leftarrow u_{\mathrm{nom}} + \sum_k w_k\,\epsilon_k\).
- Shift nominal sequence one step forward.
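The six steps above can be sketched on the CPU for a scalar state and control. This is a hedged illustration in plain Python, not the CUDA controller: `mppi_step`, `shift`, the toy single-integrator problem, and the choice to repeat the last control when shifting are all assumptions made for the example.

```python
import math
import random


def mppi_step(u_nom, x0, dynamics, stage_cost, terminal_cost,
              num_samples, sigma, lam, rng):
    """One MPPI update for scalar state and control; returns the new nominal sequence."""
    horizon = len(u_nom)
    # 1. Sample K noise sequences eps_k ~ N(0, sigma^2).
    eps = [[rng.gauss(0.0, sigma) for _ in range(horizon)]
           for _ in range(num_samples)]
    costs = []
    for eps_k in eps:
        # 2. Roll out the perturbed controls and 3. accumulate the cost.
        x, total = x0, 0.0
        for u, e in zip(u_nom, eps_k):
            total += stage_cost(x, u + e)
            x = dynamics(x, u + e)
        costs.append(total + terminal_cost(x))
    # 4. Boltzmann weights, shifted by the minimum cost for numerical stability.
    rho = min(costs)
    w = [math.exp(-(c - rho) / lam) for c in costs]
    w_sum = sum(w)
    # 5. Weighted-average update with normalised weights.
    return [u + sum(wk * ek[t] for wk, ek in zip(w, eps)) / w_sum
            for t, u in enumerate(u_nom)]


def shift(u_nom):
    """6. Drop the applied control; repeat the last one to keep the horizon length."""
    return u_nom[1:] + u_nom[-1:]


# Hypothetical toy problem: drive a single integrator x <- x + u*dt towards x = 1.
dt = 0.1
rng = random.Random(0)
u_nom = [0.0] * 20
for _ in range(10):
    u_nom = mppi_step(u_nom, 0.0,
                      dynamics=lambda x, u: x + u * dt,
                      stage_cost=lambda x, u: (x - 1.0) ** 2,
                      terminal_cost=lambda x: 10.0 * (x - 1.0) ** 2,
                      num_samples=256, sigma=1.0, lam=0.5, rng=rng)
u_apply, u_nom = u_nom[0], shift(u_nom)
```

Only `dynamics` and the cost callables are problem-specific, and neither needs to be differentiable, which mirrors the motivation given in the introduction.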
Characteristics
- Simplest variant and the compute baseline in the comparison below; maximum high-frequency exploration.
- Control sequences can jump from step to step when \(\lambda\) is small, because the update then follows only a few low-cost noisy rollouts.
2. Smooth MPPI (S-MPPI)
Header: include/mppi/controllers/smppi.cuh; Python class: QuadrotorSMPPI
Status: Python bindings are available. The exploration mechanism has a known scaling issue: integrate_actions_kernel multiplies noise by \(\Delta t\), giving effective variance \(\sigma^2 \Delta t^2 \approx 0\) for small timesteps. Tracking diverges until this is fixed. See issue #30.
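To make the scale of this attenuation concrete, take \(\sigma = 1\) and \(\Delta t = 0.02\) s. These are illustrative values, not library defaults. \[ \mathrm{Var}[\epsilon\,\Delta t] = \sigma^2\,\Delta t^2 = 1^2 \times 0.02^2 = 4 \times 10^{-4}, \qquad \mathrm{std} = \sigma\,\Delta t = 0.02 \] Each step therefore perturbs the control by about 2% of the sampled standard deviation.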
S-MPPI lifts optimisation into the velocity (rate-of-change) of the control signal, enforcing \(C^1\) continuity.
Theory
Define an auxiliary control-velocity \(v\) such that \(\dot{u} = v\). In discrete time: \[ u_{t+1} = u_t + v_t\,\Delta t \]
Noise is sampled in velocity space: \[ \delta v_k \sim \mathcal{N}(0,\,\Sigma_v) \]
and integrated to produce smooth perturbations: \[ \epsilon_{k,t} = \sum_{i=0}^{t} \delta v_{k,i}\,\Delta t \]
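The integration step can be sketched as a running sum. This is a minimal sketch for a single scalar control channel; the function name `integrate_velocity_noise` and the numeric values are hypothetical, and the sum is inclusive of index \(t\) to match the formula above.

```python
import random
from itertools import accumulate


def integrate_velocity_noise(dv, dt):
    """Running sum eps_t = sum_{i<=t} dv_i * dt for one scalar control channel."""
    return list(accumulate(v * dt for v in dv))


rng = random.Random(0)
dt, sigma_v, horizon = 0.02, 1.0, 50   # hypothetical values
dv = [rng.gauss(0.0, sigma_v) for _ in range(horizon)]
eps = integrate_velocity_noise(dv, dt)
# Consecutive perturbations differ by exactly one increment dv_t * dt.
max_jump = max(abs(b - a) for a, b in zip(eps, eps[1:]))
```

With independent samples the standard deviation of \(\epsilon_{k,t}\) is \(\sigma_v\,\Delta t\,\sqrt{t+1}\), so the perturbation magnitude grows along the horizon and shrinks with the timestep.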
An additional smoothness penalty can be added to the cost: \[ J_{\mathrm{smooth}} = J_{\mathrm{base}} + \gamma \sum_{t=0}^{T-1}\|u_{t+1} - u_t\|^2 \]
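As a small worked example of the penalty term, the sketch below evaluates it for a short sequence of control vectors. The function name `smoothness_penalty`, the sample controls, and the base cost of 3.0 are assumptions for illustration.

```python
def smoothness_penalty(controls, gamma):
    """gamma * sum_t ||u_{t+1} - u_t||^2 over a sequence of control vectors."""
    total = 0.0
    for u_now, u_next in zip(controls, controls[1:]):
        total += sum((b - a) ** 2 for a, b in zip(u_now, u_next))
    return gamma * total


controls = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]            # hypothetical 2-D controls
j_smooth = 3.0 + smoothness_penalty(controls, gamma=0.5)   # J_base = 3.0 assumed
```

Here the squared differences are 1 and 4, so the penalty is \(0.5 \times 5 = 2.5\) and \(J_{\mathrm{smooth}} = 5.5\).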
Characteristics
- \(C^1\)-continuous controls; reduced actuator wear.
- Slower reaction to sudden obstacles due to control inertia.
- Noise attenuation bug pending fix (issue #30).
3. Kernel MPPI (K-MPPI)
Header: include/mppi/controllers/kmppi.cuh; Python class: QuadrotorKMPPI
K-MPPI samples colored noise from a Gaussian Process defined by an RBF kernel, reducing the effective search dimensionality from \(T \cdot n_u\) to \(M \cdot n_u\) where \(M \ll T\).
Theory
Perturbations \(\epsilon(t)\) lie in an RKHS defined by: \[ k(t_i, t_j) = \sigma_f^2 \exp\!\left(-\frac{(t_i - t_j)^2}{2\ell^2}\right) \]
Rather than sampling the full GP, we use \(M\) support points (knots):
- Sample latent noise \(\theta \in \mathbb{R}^{M \times n_u} \sim \mathcal{N}(0, I)\).
- Compute the interpolation matrix \(W = K_{TM}\,K_{MM}^{-1}\) (done once at init).
- Expand: \(\epsilon = W\theta \in \mathbb{R}^{T \times n_u}\).
The weight update applies directly to \(\theta\), keeping optimisation in the low-dimensional latent space.
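The three steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the helper names (`rbf`, `solve`, `interpolation_matrix`, `expand`), the diagonal jitter, the choice of \(\ell\) equal to the knot spacing, and the problem sizes are all hypothetical.

```python
import math
import random


def rbf(ti, tj, sigma_f, ell):
    """RBF kernel k(ti, tj) = sigma_f^2 * exp(-(ti - tj)^2 / (2 ell^2))."""
    return sigma_f ** 2 * math.exp(-((ti - tj) ** 2) / (2.0 * ell ** 2))


def solve(a, b):
    """Solve a @ x = b for square a by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row_a) + list(row_b) for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def interpolation_matrix(times, knots, sigma_f=1.0, ell=1.0, jitter=1e-9):
    """W = K_TM K_MM^{-1}; the diagonal jitter is an added numerical safeguard."""
    k_mm = [[rbf(a, b, sigma_f, ell) + (jitter if i == j else 0.0)
             for j, b in enumerate(knots)] for i, a in enumerate(knots)]
    k_mt = [[rbf(a, t, sigma_f, ell) for t in times] for a in knots]
    x = solve(k_mm, k_mt)                    # K_MM^{-1} K_MT, shape M x T
    return [list(col) for col in zip(*x)]    # transpose to T x M (K_MM is symmetric)


def expand(w, theta):
    """eps = W theta, mapping M x n_u latent noise to T x n_u perturbations."""
    n_u = len(theta[0])
    return [[sum(w_tm * th[c] for w_tm, th in zip(row, theta)) for c in range(n_u)]
            for row in w]


times = [float(t) for t in range(9)]               # T = 9 steps
knots = times[::2]                                 # M = 5 support points
W = interpolation_matrix(times, knots, ell=2.0)    # ell = knot spacing (assumed)
rng = random.Random(0)
theta = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in knots]   # n_u = 2
eps = expand(W, theta)
```

At the knot times the expansion reproduces the latent values, and between knots it interpolates smoothly. Since \(W\) depends only on the time grid and the kernel, it can be computed once and reused for every rollout.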
Characteristics
- Guaranteed smooth trajectories — bandwidth \(\ell\) is derived automatically from support-point spacing.
- ~6× faster per step than standard MPPI for the same \(K\) (fewer noise DOFs).
- Slightly higher RMSE than MPPI on lemniscate tracking (0.79 m vs 0.60 m, 30 s trial).
- Only user-tunable knob: num_support_pts (\(M\)); more points give finer-grained control but slower steps.
4. B-Spline MPPI
Header: include/mppi/controllers/bspline_mppi.cuh
Parameterises the control trajectory as a cubic B-spline, guaranteeing \(C^2\) continuity (continuous acceleration). Noise is sampled at the B-spline control points.
- Smoother than S-MPPI (one higher continuity order).
- Python bindings not yet exposed.
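To illustrate how a handful of control points produce a \(C^2\) control signal, the sketch below evaluates a uniform cubic B-spline for one scalar control channel. The uniform knot spacing, the function names, and the sample control points are assumptions; the knot vector and boundary handling used by the header are not described here.

```python
def cubic_bspline_basis(s):
    """Uniform cubic B-spline blending weights for local parameter s in [0, 1]."""
    return (
        (1.0 - s) ** 3 / 6.0,
        (3.0 * s ** 3 - 6.0 * s ** 2 + 4.0) / 6.0,
        (-3.0 * s ** 3 + 3.0 * s ** 2 + 3.0 * s + 1.0) / 6.0,
        s ** 3 / 6.0,
    )


def evaluate_bspline(control_points, samples_per_segment):
    """Sample the uniform cubic B-spline defined by scalar control points."""
    values = []
    for seg in range(len(control_points) - 3):
        window = control_points[seg:seg + 4]   # four points shape each segment
        for i in range(samples_per_segment):
            basis = cubic_bspline_basis(i / samples_per_segment)
            values.append(sum(b * p for b, p in zip(basis, window)))
    return values


# Hypothetical control points, e.g. nominal values plus sampled noise.
control_points = [0.0, 0.0, 1.0, 0.5, -0.5, 0.0]
u = evaluate_bspline(control_points, samples_per_segment=10)
```

Perturbing a single control point changes only the four segments whose windows contain it, so noise added at the control points stays local while the resulting control remains smooth.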
5. Informative MPPI (I-MPPI)
Header: include/mppi/controllers/i_mppi.cuh; Python classes: QuadrotorIMPPI, DI3IMPPI
See the dedicated I-MPPI page for theory and usage.
Summary comparison
| Feature | MPPI | S-MPPI | K-MPPI | B-Spline MPPI | I-MPPI |
|---|---|---|---|---|---|
| Control variable | \(u\) | \(\dot{u}\) (vel.) | \(\theta\) (latent) | B-spline cpts | \(u\) + info cost |
| Noise type | White | Brownian | GP/RBF | B-spline GP | White |
| Smoothness | Low | \(C^1\) | \(C^\infty\) (RBF) | \(C^2\) | Low |
| Compute/step | baseline | ~baseline | ~6× faster | — | ~baseline |
| Python exposed | ✓ | ✓ (#30) | ✓ | — | ✓ |
| Quadrotor RMSE (30 s) | 0.60 m | — | 0.79 m | — | — |