Max's notebook

A collection of sorts

Boto Over Time

05 Jul 2020

If you’ve worked with AWS from Python, you’ve come across the AWS SDK. The current generation is boto3 and the previous one is boto. You can use both side by side in the same codebase, and after a few incidents caused by exactly that, I will never do so myself.

How many ways can you grant an app running on an EC2 instance access to AWS resources?

Here are a few:

  1. dedicated IAM credentials in an app-specific config file
  2. dedicated IAM credentials in the .boto file or in the .aws/ directory
  3. instance profiles or roles
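To see which of these are actually in play on a box, a rough audit script helps. The sketch below is mine, not the code I inherited: the function names are made up, it assumes any app-specific config file is INI-formatted, and it also checks the AWS_ACCESS_KEY_ID environment variable, which is not in the list above but is another common source of static keys.

```python
import configparser
import os
from pathlib import Path

KEY_OPTION = "aws_access_key_id"


def _ini_has_static_keys(path):
    """Return True if any section of the INI file at `path` sets an access key."""
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(path)  # missing or unreadable files are silently skipped
    except configparser.Error:
        return False  # not parseable as INI, so this sketch cannot judge it
    return any(parser.has_option(section, KEY_OPTION) for section in parser.sections())


def find_static_credentials(home, environ, app_configs=()):
    """List the places where static IAM keys appear to be configured.

    `app_configs` is a hypothetical list of app-specific INI files to check.
    """
    found = []
    if "AWS_ACCESS_KEY_ID" in environ:
        found.append("environment")
    candidates = [
        home / ".boto",               # legacy boto config
        home / ".aws" / "credentials",  # shared credentials file
        home / ".aws" / "config",       # shared config file, may also hold keys
        *app_configs,
    ]
    for path in candidates:
        if _ini_has_static_keys(path):
            found.append(str(path))
    return found


if __name__ == "__main__":
    for source in find_static_credentials(Path.home(), os.environ):
        print("static credentials found in:", source)
```

An empty result only suggests that the instance profile or role is what the app falls back to. It says nothing about keys hard-coded in the source or stored in a format this script does not parse.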

Guess what I inherited? (hint: it was all of them.)

Bonus round: Configuration management

This application uses Ansible for config management, which suffers from exactly the same issue because it uses the same libraries, sometimes in parallel (for an example, see the s3 module). That made debugging and deploying reliable fixes harder still.



And then? We find it and kill it

  1. Spelunk the app the first time: find all the code that loads the IAM creds, and identify the services and calls it makes
  2. Spelunk the app again: compare these calls against the IAM policy, and patch the policy to match the code where needed
  3. Remove the creds from the config file and cross your fingers
  4. Test: did it work?
  5. Ship it if so, fix it if not
  6. Remove the user creds
  7. Spelunk the config management: find the calls and services, remove unused, patch the policy where required
  8. Remove the non-instance-profile creds
  9. Test it again: how about now?
  10. How about the crons? You did check the crons, didn’t you? (Narrator: they did check the crons)
  11. Ship it if so, fix it if not
  12. Find surprise edge cases and cross-service library usage by watching breakage in prod
  13. Cry while fixing and testing and shipping
  14. Express anger at vague error messages
  15. Express gratitude for fast deployments
  16. Go to bed. It was a very long week
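The "compare these calls against the IAM policy" steps can be roughed out in code once you have a hand-collected list of the actions the app makes. This is a minimal sketch under loud assumptions: the function names are hypothetical, it only looks at Allow statements with an Action element, and it ignores Deny, NotAction, Resource and Condition entirely, so treat its output as a starting point for review rather than a verdict.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM allows a single string or a list for Action; normalise to a list."""
    return [value] if isinstance(value, str) else list(value)


def allowed_patterns(policy):
    """Collect the Action patterns from every Allow statement in a policy document."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    patterns = []
    for stmt in statements:
        if stmt.get("Effect") == "Allow" and "Action" in stmt:
            patterns.extend(_as_list(stmt["Action"]))
    return patterns


def _matches(action, pattern):
    # Action names are case-insensitive and may contain * and ? wildcards.
    return fnmatchcase(action.lower(), pattern.lower())


def diff_policy(used_actions, policy):
    """Return (missing, unused): actions the code needs but the policy lacks,
    and policy patterns that none of the listed actions exercise."""
    patterns = allowed_patterns(policy)
    used = sorted(set(used_actions))
    missing = [a for a in used if not any(_matches(a, p) for p in patterns)]
    unused = [p for p in patterns if not any(_matches(a, p) for a in used)]
    return missing, unused


if __name__ == "__main__":
    # Illustrative data only, not the real app's calls or policy.
    used = {"s3:GetObject", "s3:PutObject", "sqs:SendMessage"}
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:Get*", "ec2:DescribeInstances"], "Resource": "*"}
        ],
    }
    missing, unused = diff_policy(used, policy)
    print("missing from policy:", missing)
    print("unused in policy:", unused)
```

The "missing" list is what you patch into the policy before pulling the creds, and the "unused" list is the candidate set for removal. The prod-breakage steps above are a reminder that the hand-collected list of actions is the weak link.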