Skip to content

Slooo: A Fail-slow Fault Injection Testing Framework

Notifications You must be signed in to change notification settings

xlab-uiuc/slooo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Slooo: A Fail-slow Fault Injection Testing Framework

Slooo is a Xonsh-based fault injection framework for distributed systems.

Slooo is a part of the DepFast project in which we evaluate the fail-slow fault tolerance of a quorum system using fault injection, e.g., slowing down a node by adding delays and creating contention on the CPU/memory/disk/Network.

Doing such fault injection requires a lot of scripting. Some are application specific (e.g., scripts to start and terminate the system) and some are generic (e.g., injecting certain types of faults).

We wrote a lot of shell scripts for fast scripting, but soon we ran into maintenance hell, especially every time there is a major reorgs of the team (members leaving and joining). After many rounds of energy draining and time wasting, we decided to write a more structured, reusable framework to minimize the overhead.

The choice of using Xonsh comes from the following considerations:

  • We still need shell scripts for the ease of integration. Specifically, many of the scripts of the system under test are in shell.
  • We want high-level languages which can build some abstractions and reusable code.

Xonsh serves both – it is a chimera of Python and Shell.

We have used Slooo to test a number of quorum systems, including RethinkDB, MongoDB and TiDB.

The test can be done in a "pseudo-distributed" mode and in cloud environments. The former runs all the tests on one machine and the latter runs the tests in a cloud platform. Currently, we only support Azure Cloud (which sponsored the DepFast project).

Please checkout the tutorial on how to write fault-injection tests using Slooo.

Slooo Demo for ICSE'23