POMDPStressTesting
This package is used to find likely failures in a black-box software system. The package is integrated with the POMDPs.jl ecosystem, giving access to solvers, policies, and visualizations (no prior knowledge of POMDPs.jl is needed to use POMDPStressTesting.jl; see the Guide). It uses a technique called adaptive stress testing (AST)[1], which guides the search for likely failures using two signals: a distance metric to a failure event and the likelihood of an environment sample.
A POMDP is a partially observable Markov decision process, a framework for defining sequential decision-making problems in which the true state is not directly observable. In the context of this package, we use the POMDP acronym mainly to tie the package to the POMDPs.jl package, but the system that is stress tested can also be defined as a POMDP.
This package is intended to help developers stress test their systems before deployment into the real world (see existing use cases for aircraft collision avoidance systems[1] and aircraft trajectory prediction systems[2]). It is also used for research purposes to expand on the AST concept by allowing additional solution methods to be explored and tested.
[1] Ritchie Lee et al., "Adaptive Stress Testing: Finding Likely Failure Events with Reinforcement Learning", 2020. https://arxiv.org/abs/1811.02188
[2] Robert J. Moss, Ritchie Lee, Nicholas Visser, Joachim Hochwarth, James G. Lopez, Mykel J. Kochenderfer, "Adaptive Stress Testing of Trajectory Predictions in Flight Management Systems", DASC 2020. https://arxiv.org/abs/2011.02559
Package Features
- Search for failures in a black-box system
- Define probability distributions of your simulation environment
- Find likely system failures using a variety of solvers
- Calculate and visualize failure metrics
- Replay found failures
BlackBox System Definition
A black-box system could be an external software executable, code written in Julia, or code written in another language. The system is generally a sequential decision-making system that can be stepped forward in time. It is termed "black-box" because all we need is to be able to initialize it (using BlackBox.initialize!), evaluate or step the system forward in time (using BlackBox.evaluate!), and parse the output of the system to determine the distance metric (using BlackBox.distance), the failure event indication (using BlackBox.isevent), and whether the system is in a terminal state (using BlackBox.isterminal).
- See the BlackBox interface for implementation details.
The BlackBox system interface includes:
- BlackBox.initialize!(sim::Simulation) to initialize/reset the system under test
- BlackBox.evaluate!(sim::Simulation) to evaluate/execute the system under test
- BlackBox.distance(sim::Simulation) to return how close we are to an event
- BlackBox.isevent(sim::Simulation) to indicate if a failure event occurred
- BlackBox.isterminal(sim::Simulation) to indicate the simulation is in a terminal state
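As a rough illustration of how these five functions fit together, the sketch below implements them for a one-dimensional random walk using only the Julia standard library. The WalkSim type, its fields, the rollout! helper, and the tuple returned by evaluate! are assumptions made for this example. In the package, the corresponding functions would instead be defined as methods of BlackBox.initialize!, BlackBox.evaluate!, and so on, for a subtype of GrayBox.Simulation.

```julia
using Random

# Hypothetical simulation state for a 1D random walk (not part of the package).
mutable struct WalkSim
    x::Float64              # current position
    t::Int                  # current time step
    threshold::Float64      # a failure occurs once |x| reaches this value
    horizon::Int            # maximum number of steps per episode
    rng::MersenneTwister    # source of the environment noise
end

# Convenience keyword constructor (an assumption of this sketch).
WalkSim(; threshold=10.0, horizon=30, seed=0) =
    WalkSim(0.0, 0, threshold, horizon, MersenneTwister(seed))

# Reset the system under test to its initial state.
function initialize!(sim::WalkSim)
    sim.x = 0.0
    sim.t = 0
    return nothing
end

# How close we are to a failure: zero once the threshold is reached.
distance(sim::WalkSim) = max(sim.threshold - abs(sim.x), 0.0)

# A failure event occurs when the walker reaches the threshold.
isevent(sim::WalkSim) = abs(sim.x) >= sim.threshold

# The episode ends on a failure or when the horizon is exhausted.
isterminal(sim::WalkSim) = isevent(sim) || sim.t >= sim.horizon

# Log-density of a standard normal sample.
standard_normal_logpdf(z) = -0.5 * (z^2 + log(2π))

# Step the system once and report (log-likelihood, distance, event flag).
function evaluate!(sim::WalkSim)
    step = randn(sim.rng)
    sim.x += step
    sim.t += 1
    return (standard_normal_logpdf(step), distance(sim), isevent(sim))
end

# Run one episode from reset to a terminal state.
function rollout!(sim::WalkSim)
    initialize!(sim)
    total_logprob = 0.0
    while !isterminal(sim)
        logprob, _, _ = evaluate!(sim)
        total_logprob += logprob
    end
    return (total_logprob, distance(sim), isevent(sim))
end
```

Calling rollout!(WalkSim(seed=1)) runs a single random episode. A solver would replace the random draws with samples it chooses itself, which this sketch does not attempt to show.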
GrayBox Simulator/Environment Definition
The gray-box simulator and environment define the parameters of your simulation and the probability distributions governing your simulation environment. It is termed "gray-box" because we need access to the probability distributions of the environment in order to get the log-likelihood of a sample used by the simulator (which is ultimately used by the black-box system).
- See the GrayBox interface for implementation details.
The GrayBox simulator and environment interface includes:
- GrayBox.Simulation type to hold simulation variables
- GrayBox.environment(sim::Simulation) to return the collection of environment distributions
- GrayBox.transition!(sim::Simulation) to transition the simulator, returning the log-likelihood
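The idea behind the environment and transition functions can be sketched without the package as follows. Everything named here is hypothetical: the Gaussian stand-in type, the draw and logdensity helpers, the NoisySim state, and the two environment variables :accel and :wind. A real simulator would normally take its distributions from a distributions library and subtype GrayBox.Simulation. The point of the sketch is only that a transition samples each environment variable, applies the samples to the simulator state, and returns the summed log-likelihood of those samples.

```julia
using Random

# Minimal stand-in for a univariate normal distribution (hypothetical helper;
# a distributions library would normally supply this).
struct Gaussian
    μ::Float64
    σ::Float64
end

# Draw one sample from the distribution.
draw(rng::AbstractRNG, d::Gaussian) = d.μ + d.σ * randn(rng)

# Log-density of x under the distribution.
logdensity(d::Gaussian, x::Real) =
    -0.5 * ((x - d.μ) / d.σ)^2 - log(d.σ) - 0.5 * log(2π)

# Hypothetical simulator state: a point mass pushed around by noise.
mutable struct NoisySim
    position::Float64
    velocity::Float64
    rng::MersenneTwister
end

# The collection of environment distributions, keyed by variable name.
environment(sim::NoisySim) =
    Dict(:accel => Gaussian(0.0, 1.0), :wind => Gaussian(0.0, 0.5))

# Sample every environment variable, apply the samples to the state,
# and return the joint log-likelihood of the samples.
function transition!(sim::NoisySim)
    logprob = 0.0
    sample = Dict{Symbol,Float64}()
    for (name, dist) in environment(sim)
        sample[name] = draw(sim.rng, dist)
        logprob += logdensity(dist, sample[name])
    end
    sim.velocity += sample[:accel]
    sim.position += sim.velocity + sample[:wind]
    return logprob
end
```

Because the environment variables are sampled independently in this sketch, their log-likelihoods simply add. Correlated variables would need a joint distribution instead.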
Failure and Distance Definition
A failure event of the system under test is defined by the user. The user defines the function BlackBox.isevent to return a Boolean indicating whether a failure has occurred given the current state of the simulation. An example failure used in the context of AST would be a collision when stress testing autonomous vehicles or aircraft collision avoidance systems.
The real-valued distance metric answers the question "how close are we to a failure?" and is defined by the user in the BlackBox.distance function. This metric guides the search toward failures by signaling how far the current state is from one. An example distance metric for the autonomous vehicle problem would be the distance between the autonomous vehicle and a pedestrian: if a failure is a collision with a pedestrian, then the search tries to minimize this distance to find failures.
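A minimal sketch of the pedestrian example, assuming a hypothetical Scene type that stores planar positions and a collision radius. None of these names come from the package. In practice the two functions would be written as methods of BlackBox.isevent and BlackBox.distance for the user's own simulation type.

```julia
# Hypothetical snapshot of the simulation state (not part of the package).
struct Scene
    vehicle::Tuple{Float64,Float64}      # (x, y) position of the vehicle
    pedestrian::Tuple{Float64,Float64}   # (x, y) position of the pedestrian
    collision_radius::Float64            # separation at which a collision is declared
end

# Euclidean separation between the vehicle and the pedestrian.
separation(s::Scene) =
    hypot(s.vehicle[1] - s.pedestrian[1], s.vehicle[2] - s.pedestrian[2])

# Distance to failure: remaining separation before a collision, never negative.
distance(s::Scene) = max(separation(s) - s.collision_radius, 0.0)

# Failure event: the vehicle is within the collision radius of the pedestrian.
isevent(s::Scene) = separation(s) <= s.collision_radius
```

Clamping the distance at zero keeps the metric consistent with the event flag in this sketch: the distance is zero exactly when a failure has occurred.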
Citation
If you use this package for research purposes, please cite the following:
@article{moss2021pomdpstresstesting,
title = {{POMDPStressTesting.jl}: Adaptive Stress Testing for Black-Box Systems},
author = {Robert J. Moss},
journal = {Journal of Open Source Software},
year = {2021},
volume = {6},
number = {60},
pages = {2749},
doi = {10.21105/joss.02749}
}