Clouds are chaotic.
Unexpected events happen all the time.
Deathstar is here to help us be proactive and ensure we can withstand disturbances in the cloud.
It's application-level chaos engineering.
It is a way of creating controlled outage simulations, and other types of simulated chaos, by means of a Slack bot.
The underlying idea is that, by creating our own short and controlled misbehaviors, we can...
- 🅰️ identify areas for improvement and...
- 🅱️ ensure there's enough error handling and other resilience capabilities in place to avoid too much suffering, even though things are on fire behind the scenes.
More in-depth reasoning and background can be found in this article.
Deathstar comes in two parts,...
- a Slack bot (this repo), which can be seen as the Deathstar control plane. It is responsible for coordinating and carrying out simulations and...
- a middleware that needs to be included in all services that are to be attacked by Deathstar
The middleware acts on signals from the Deathstar control plane. These signals are broadcast during an outage simulation.
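To make the control plane and middleware split more concrete, here is a minimal sketch of how a middleware could keep track of the simulation currently in effect. The `Signal` shape, the `SimulationState` class and the expiry behavior are all assumptions made for illustration; the actual signal format and middleware API are not described in this README.

```typescript
// Hypothetical signal shape; the real Deathstar wire format may differ.
type Signal =
  | { kind: "start"; simulation: string; durationMs: number }
  | { kind: "stop" };

// Holds the simulation (if any) that the middleware should currently apply.
class SimulationState {
  private active: { simulation: string; expiresAt: number } | null = null;

  // Apply a signal received from the control plane at time `now` (ms).
  onSignal(signal: Signal, now: number): void {
    if (signal.kind === "start") {
      this.active = {
        simulation: signal.simulation,
        expiresAt: now + signal.durationMs,
      };
    } else {
      this.active = null;
    }
  }

  // Return the name of the simulation in effect at time `now`, or null.
  // Expired simulations are dropped, so a lost "stop" signal cannot leave
  // the service misbehaving indefinitely.
  current(now: number): string | null {
    if (this.active !== null && now >= this.active.expiresAt) {
      this.active = null;
    }
    return this.active !== null ? this.active.simulation : null;
  }
}
```

Passing the clock in as `now` rather than calling `Date.now()` inside the class keeps the sketch easy to test; a real middleware would likely read the clock itself.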
Deathstar currently supports the following simulations:
- `error` - make the target under attack throw HTTP errors
- `slow` - make the target extremely slooow
All simulations can be applied to either all endpoints of a service or a selected few endpoints.
It's also possible to define a list of endpoints and HTTP headers that should be excluded from a simulation. This is useful when the same service serves both external and internal users, since we may or may not have the same resilience level in both use cases.
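The targeting rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Simulation` and `IncomingRequest` types, the field names, exact-path matching and the `x-internal` header in the usage note are hypothetical and not taken from the Deathstar code base.

```typescript
// Hypothetical description of a running simulation.
interface Simulation {
  kind: "error" | "slow";
  // Paths to attack; when omitted, all endpoints are attacked.
  endpoints?: string[];
  // Paths that must never be attacked.
  excludedEndpoints?: string[];
  // Header name -> value; a request carrying a matching header is spared.
  excludedHeaders?: Record<string, string>;
}

// Minimal view of a request; header names are assumed to be lower-cased.
interface IncomingRequest {
  path: string;
  headers: Record<string, string>;
}

// Decide whether the simulation should affect this request.
function shouldAttack(sim: Simulation, req: IncomingRequest): boolean {
  const listed = (paths: string[]): boolean =>
    paths.some((p) => p === req.path);

  // Exclusions win over inclusions.
  if (sim.excludedEndpoints !== undefined && listed(sim.excludedEndpoints)) {
    return false;
  }
  const excluded = sim.excludedHeaders || {};
  const spared = Object.keys(excluded).some(
    (name) => req.headers[name.toLowerCase()] === excluded[name]
  );
  if (spared) {
    return false;
  }

  // No explicit endpoint list means every endpoint is a target.
  return sim.endpoints === undefined || listed(sim.endpoints);
}
```

For example, with `endpoints: ["/orders"]` and `excludedHeaders: { "x-internal": "true" }`, a plain request to `/orders` would be attacked, while the same request carrying `x-internal: true` would pass through untouched.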
- `git clone`, `npm install` and `npm run build`
- Make necessary changes to `src/config`
- `npm run start:web` to launch the control plane
- `npm run start:trigger my-org/my-app` as a cronjob, to trigger a run of a particular simulation suite at a given time
- `AWS_ACCESS_KEY` - An AWS access key with read and write permissions to `BUCKET_NAME`.
- `AWS_SECRET_KEY` - An AWS secret key with read and write permissions to `BUCKET_NAME`.
- `AWS_REGION` - The AWS region where `BUCKET_NAME` is hosted. Defaults to `eu-north-1`.
- `BUCKET_NAME` - The name of an S3 bucket. You know, for state.
- `SLACK_SIGNING_SECRET` - A Slack signing secret, used to verify that requests from Slack are actually coming from Slack.
- `SLACK_TOKEN` - A Slack access token, for communicating with the Slack APIs.
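As a rough illustration of how these variables fit together, the sketch below reads them from an environment object, fails fast on anything missing, and applies the `eu-north-1` default for the region. The `DeathstarEnv` type and `loadEnv` function are hypothetical names chosen for this example and are not necessarily how the control plane reads its environment.

```typescript
// Hypothetical container for the environment settings listed above.
interface DeathstarEnv {
  awsAccessKey: string;
  awsSecretKey: string;
  awsRegion: string;
  bucketName: string;
  slackSigningSecret: string;
  slackToken: string;
}

type Env = { [name: string]: string | undefined };

// Read and validate the settings from an environment-like object.
function loadEnv(env: Env): DeathstarEnv {
  const required = (name: string): string => {
    const value = env[name];
    if (!value) {
      throw new Error(`Missing required environment variable ${name}`);
    }
    return value;
  };

  return {
    awsAccessKey: required("AWS_ACCESS_KEY"),
    awsSecretKey: required("AWS_SECRET_KEY"),
    // AWS_REGION is optional and falls back to the documented default.
    awsRegion: env["AWS_REGION"] || "eu-north-1",
    bucketName: required("BUCKET_NAME"),
    slackSigningSecret: required("SLACK_SIGNING_SECRET"),
    slackToken: required("SLACK_TOKEN"),
  };
}
```

In a Node.js process this would typically be called once at startup as `loadEnv(process.env)`, so that a misconfigured deployment fails immediately rather than in the middle of a simulation.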
See `src/config/index.ts` for a typical example configuration.
The Slack bot uses a lot of custom emojis. Scanning the code base for emojis and installing them in your Slack workspace is left as an exercise to the reader. Sorry.
Schibsted made this. Come work with us!