Multipart uploads to S3

Sometimes I need to upload really large .zip files to S3. The AWS CLI or the web interface works perfectly for this task. The problem, at least in my case, is the poor quality of the internet connection. If it goes down for some reason, I have to start the upload again from the very beginning.

After checking the S3 documentation I found that it supports multipart uploads. Basically, you can split a file into pieces and upload them one by one. When all the pieces are on S3, they are merged to get the final file. (S3 accepts parts from 5MB to 5GB, up to 10,000 parts per object.)
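
Under the hood it is a three-step protocol: start an upload, send the parts, then ask S3 to assemble them. As a rough sketch of the same flow, here are the low-level aws s3api commands (bucket, key, and file names are just placeholders):

$ aws s3api create-multipart-upload --bucket some-bucket --key Camara.zip
$ aws s3api upload-part --bucket some-bucket --key Camara.zip --part-number 1 --body piece-1 --upload-id <UploadId>
$ aws s3api complete-multipart-upload --bucket some-bucket --key Camara.zip --upload-id <UploadId> --multipart-upload file://parts.json

create-multipart-upload returns the UploadId, upload-part is repeated once per piece and returns an ETag, and parts.json lists the {PartNumber, ETag} pairs so S3 knows how to merge everything.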

Let's say I want to upload a 30GB file full of videos. I can use a tool that splits it into 500MB pieces (roughly 60 parts) and uploads them one by one. If the connection goes down, it resumes from the last uploaded piece instead of starting from the beginning.

That is what s3cmd does.

AWS S3 Tools

Installing the tool on a Mac is very easy:

brew install s3cmd
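
s3cmd is also published on PyPI, so on Linux (or anywhere with a working Python) this should do the trick as well:

pip install s3cmd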

It is also possible to get it from the official webpage of AWS S3 Tools.

On the first execution it will ask for the usual AWS keys. The same setup wizard can be re-run at any time:
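
s3cmd --configure

After that, it is time to start the upload: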

$ s3cmd put --multipart-chunk-size-mb=500 Camara.zip s3://some-bucket/
upload: 'Camara.zip' -> 's3://some-bucket/Camara.zip'  [part 1 of 16, 500MB] [1 of 1]

Now let's stop it with Control+C:

upload: 'Camara.zip' -> 's3://some-bucket/Camara.zip'  [part 5 of 16, 500MB] [1 of 1]
  13041664 of 524288000     2% in  109s   116.48 kB/s^CERROR: 
Upload of 'Camara.zip' part 5 failed. Use
  /usr/local/Cellar/s3cmd/1.6.1/libexec/bin/s3cmd abortmp s3://some-bucket/Camara.zip Y8CYFHWCmnT6WUIw9nPTyU1AyseDrvsXhroXqVHIfA5AaTsUWw01Y8LZgx5H.8JJybYtMUUsW2GBXByAGiZJ_lc3qtBDa2WO5x2F6397UHZSBwaj61P4pkw67zQF63nM
to abort the upload, or
  /usr/local/Cellar/s3cmd/1.6.1/libexec/bin/s3cmd --upload-id Y8CYFHWCmnT6WUIw9nPTyU1AyseDrvsXhroXqVHIfA5AaTsUWw01Y8LZgx5H.8JJybYtMUUsW2GBXByAGiZJ_lc3qtBDa2WO5x2F6397UHZSBwaj61P4pkw67zQF63nM put ...
to continue the upload.
See ya!

There we get the upload id that can be used to resume the transfer. Just add it to the same s3cmd command:

s3cmd --upload-id Y8CYFHWCmnT6WUIw9nPTyU1AyseDrvsXhroXqVHIfA5AaTsUWw01Y8LZgx5H.8JJybYtMUUsW2GBXByAGiZJ_lc3qtBDa2WO5x2F6397UHZSBwaj61P4pkw67zQF63nM put --multipart-chunk-size-mb=500 Camara.zip s3://some-bucket/
WARNING: MultiPart: size and md5sum match for s3://some-bucket/Camara.zip part 1, skipping.
WARNING: MultiPart: size and md5sum match for s3://some-bucket/Camara.zip part 2, skipping.
WARNING: MultiPart: size and md5sum match for s3://some-bucket/Camara.zip part 3, skipping.
WARNING: MultiPart: size and md5sum match for s3://some-bucket/Camara.zip part 4, skipping.
upload: 'Camara.zip' -> 's3://some-bucket/Camara.zip'  [part 5 of 16, 500MB] [1 of 1]
     65536 of 524288000     0% in    3s    16.27 kB/s  failed
WARNING: Upload failed: /Camara.zip?partNumber=5&uploadId=Y8CYFHWCmnT6WUIw9nPTyU1AyseDrvsXhroXqVHIfA5AaTsUWw01Y8LZgx5H.8JJybYtMUUsW2GBXByAGiZJ_lc3qtBDa2WO5x2F6397UHZSBwaj61P4pkw67zQF63nM ([Errno 32] Broken pipe)
WARNING: Retrying on lower speed (throttle=0.00)
WARNING: Waiting 3 sec...
upload: 'Camara.zip' -> 's3://some-bucket/Camara.zip'  [part 5 of 16, 500MB] [1 of 1]
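
One more trick: if the terminal output with the upload id gets lost, s3cmd can also list the multipart uploads still pending on a bucket, and abort the ones that are no longer needed (unfinished uploads keep consuming storage until aborted). The <upload-id> placeholder below comes from the multipart listing:

$ s3cmd multipart s3://some-bucket/
$ s3cmd abortmp s3://some-bucket/Camara.zip <upload-id>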

Easy, right? Now I don't care if the connection goes down or if I have to stop the upload to watch Netflix :)