Amazon S3: Under the Hood
Most S3 courses teach you how to use it. This one teaches you how it works. S3 Under the Hood goes beneath the API and into the engineering: how your bytes survive hardware failure, how a flat namespace scales to trillions of objects, and how a single PUT travels from your machine to durable storage across multiple physical buildings. No AWS console, no CLI, just the internals explained from first principles, with interactive simulations at every step.
Sweave
Instructor
Chapters
Why S3 is not a filesystem
Before understanding how S3 works, you need to unlearn how you think about storage. This chapter builds the correct mental model from the ground up.
- Block storage, file systems, and object storage: what actually differs at the hardware level
- What an object really is: bytes + metadata as a single atomic unit
- The flat namespace: why folders don't exist and what keys actually are
- Why this model enables horizontal scale that filesystems fundamentally cannot
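The flat-namespace idea above can be sketched in a few lines. This is a minimal illustration, not S3's implementation: the key list, the `list_keys` function, and its `prefix` and `delimiter` parameters are hypothetical names chosen to mirror how a folder-style listing can be computed from nothing more than a sorted list of full keys.

```python
from bisect import bisect_left

# Hypothetical flat key space: every object is one entry, and no directory
# entries exist anywhere. "Folders" are only a pattern in the key strings.
keys = sorted([
    "logs/2024/01/app.log",
    "logs/2024/02/app.log",
    "logs/readme.txt",
    "photos/cat.jpg",
    "notes.txt",
])


def list_keys(sorted_keys, prefix="", delimiter="/"):
    """Emulate a folder listing over a flat, sorted list of keys.

    Returns (objects, common_prefixes): keys directly "under" the prefix,
    and the rolled-up prefixes that look like subfolders.
    """
    objects, common_prefixes = [], []
    # Keys sharing a prefix are contiguous in sorted order, so jump to the
    # first candidate and stop at the first key that no longer matches.
    start = bisect_left(sorted_keys, prefix)
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        rest = key[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            objects.append(key)
        else:
            rolled = prefix + rest[:cut + 1]
            # Sorted order keeps identical rolled-up prefixes adjacent.
            if not common_prefixes or common_prefixes[-1] != rolled:
                common_prefixes.append(rolled)
    return objects, common_prefixes
```

Calling `list_keys(keys)` treats `logs/` and `photos/` as apparent folders and `notes.txt` as a top-level object, while `list_keys(keys, prefix="logs/")` "descends" one level. Nothing was created or traversed in either case: the hierarchy is derived from string prefixes at query time.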
How stored bytes survive hardware failure
S3 is designed for 11 nines (99.999999999%) of durability. This chapter explains exactly what that number means and the engineering behind it.
- How an object is split into chunks and distributed across multiple Availability Zones
- Erasure coding: the math that lets you reconstruct data from partial shards
- Checksums at every layer: how silent corruption is detected before you ever see it
- The background scrubber: how S3 continuously heals itself without any request
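To make the erasure-coding and checksum ideas above concrete, here is a deliberately simplified sketch. It assumes a single XOR parity shard, which can rebuild only one lost or corrupted shard; production systems typically use Reed-Solomon-style codes that tolerate several losses, and the parameters S3 actually uses are not stated here. The function names (`encode`, `find_bad`, `repair`, `decode`) and the choice of SHA-256 are illustrative assumptions.

```python
import hashlib


def encode(data: bytes, k: int = 4):
    """Split data into k equal-size shards plus one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = bytearray(shard_len)
    for shard in shards:
        for i, byte in enumerate(shard):
            parity[i] ^= byte
    shards.append(bytes(parity))
    # One checksum per shard, stored alongside it, so silent corruption
    # can be detected without reading any other shard.
    checksums = [hashlib.sha256(s).hexdigest() for s in shards]
    return shards, checksums


def find_bad(shards, checksums):
    """Return indexes of shards that are missing (None) or fail their checksum."""
    return [
        i for i, s in enumerate(shards)
        if s is None or hashlib.sha256(s).hexdigest() != checksums[i]
    ]


def repair(shards, checksums):
    """Rebuild at most one bad shard by XOR-ing all the healthy ones."""
    bad = find_bad(shards, checksums)
    if len(bad) > 1:
        raise ValueError("single XOR parity can repair only one shard")
    if not bad:
        return list(shards)
    healthy = [s for i, s in enumerate(shards) if i != bad[0]]
    rebuilt = bytearray(len(healthy[0]))
    for shard in healthy:
        for i, byte in enumerate(shard):
            rebuilt[i] ^= byte
    fixed = list(shards)
    fixed[bad[0]] = bytes(rebuilt)
    return fixed


def decode(shards, original_len: int, k: int = 4) -> bytes:
    """Concatenate the k data shards and strip the padding."""
    return b"".join(shards[:k])[:original_len]
```

A background scrubber, in this toy model, is just a loop that periodically calls `find_bad` on stored shards and `repair` when something fails its checksum, without waiting for a client request to trip over the damage.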
How S3 finds any object among trillions
Even with trillions of objects stored, a GET still returns in milliseconds. This chapter explains the index architecture that makes that possible.
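One common way to keep lookups fast as a key space grows is to range-partition a sorted index, so each lookup touches only one small partition. The sketch below illustrates that general technique only; it is an assumption for teaching purposes, not a description of S3's actual index. `PartitionedIndex`, `split_points`, and the location strings are hypothetical.

```python
from bisect import bisect_right


class PartitionedIndex:
    """Toy key index split into contiguous key ranges.

    split_points are the sorted keys at which a new partition begins, so
    n split points produce n + 1 partitions.
    """

    def __init__(self, split_points):
        self.split_points = sorted(split_points)
        self.partitions = [{} for _ in range(len(self.split_points) + 1)]

    def partition_for(self, key: str) -> int:
        # Binary search over the split points: O(log n) in the number of
        # partitions, independent of how many objects are stored.
        return bisect_right(self.split_points, key)

    def put(self, key: str, location: str) -> None:
        self.partitions[self.partition_for(key)][key] = location

    def get(self, key: str):
        return self.partitions[self.partition_for(key)].get(key)


index = PartitionedIndex(split_points=["g", "n"])
index.put("apple.txt", "shard-set-17")
index.put("logs/2024/app.log", "shard-set-03")
index.put("zebra.png", "shard-set-42")
```

In this sketch, routing a key costs one binary search plus one hash lookup, and a partition that grows too hot or too large could be split by adding a split point, which is the property that lets such an index scale horizontally.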
The full journey of a single PUT request
Everything comes together here. Trace one PUT request from the moment you call the API to the moment the bytes are durably committed across multiple AZs.
- DNS resolution and how S3 routes you to the right regional fleet
- SigV4 signing: the HMAC-based cryptographic signature that authenticates every request
- The frontend layer vs the storage layer: two separate systems with different responsibilities
- Multipart upload internals: how large objects are landed and assembled server-side
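The SigV4 step in the list above rests on a chain of HMAC-SHA256 operations that derives a short-lived signing key from the secret key, the date, the region, and the service. The sketch below shows that derivation using only the Python standard library. It assumes the string to sign has already been built from the canonical request, which is omitted here, and the function names are illustrative rather than taken from any SDK.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str,
                       service: str = "s3") -> bytes:
    """Derive the SigV4 signing key.

    date_stamp is YYYYMMDD. Each step keys the next HMAC, so the result
    is scoped to one day, one region, and one service.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes into the Authorization header."""
    return hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()


# Placeholder credentials and a placeholder string to sign, for illustration only.
example_key = derive_signing_key(
    "wJalrXUtnFEMI/K7MDENG+bPxRfiCYEXAMPLEKEY", "20240101", "us-east-1"
)
example_signature = sign(example_key, "AWS4-HMAC-SHA256\n...")
```

Because the server can run the same derivation from its own copy of the secret, the secret key itself never travels with the request, and a captured signing key is only useful within the single date, region, and service it was scoped to.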



