Thanks for flagging this for us. It does look like the docs are missing a bit of detail here and need an update. Spaces was designed to be compatible with the S3 API, and its usage of ETag is consistent with S3’s behavior. Quoting from the S3 documentation:
The entity tag is a hash of the object. The ETag reflects changes only to the contents of an object, not its metadata. The ETag may or may not be an MD5 digest of the object data. [ … ] If an object is created by either the Multipart Upload or Part Copy operation, the ETag is not an MD5 digest, regardless of the method of encryption.
So if it’s not an MD5 digest, what exactly is it? Let’s take a look at a real example. Here’s the HEAD for an object stored in Spaces:
$ aws s3api --endpoint-url https://nyc3.digitaloceanspaces.com head-object --bucket example-bucket --key demo-h264.mov
"LastModified": "Mon, 11 Dec 2017 16:30:22 GMT",
The ETag is abca46f3fae1b698571c0f08b98618e1-96. It is made up of two pieces: the value before the hyphen and the value after it. The latter means that the object was uploaded using a multipart upload consisting of 96 parts. Each of these parts was 8 MB in size (the final part can be smaller). To get the first value, the MD5 digests of the 96 parts are concatenated in binary form, and then the MD5 of that concatenation is taken.
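As a rough illustration of the calculation just described, here is a minimal Python sketch. The function name `multipart_etag` and its parameters are made up for this example, and it assumes the object was uploaded in fixed-size parts where 1 MB means 1024 * 1024 bytes. If the upload used a different part size, the result will not match.

```python
import hashlib


def multipart_etag(path, part_size_mb=8):
    """Compute a multipart-style ETag for a local file.

    Hypothetical helper: assumes fixed-size parts of part_size_mb MB
    (1 MB = 1024 * 1024 bytes), with only the last part allowed to be smaller.
    """
    part_size = part_size_mb * 1024 * 1024
    part_digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(part_size)
            if not chunk:
                break
            # Keep the raw 16-byte digest of each part, not its hex form.
            part_digests.append(hashlib.md5(chunk).digest())
    # MD5 of the concatenated binary digests, then "-<number of parts>".
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

Calling `multipart_etag("demo-h264.mov", 8)` on a local copy of the file gives a value that can be compared against the ETag reported by the HEAD request.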
This bash script is a useful tool for calculating and verifying the ETag for an object:
Looking at our object again, we can calculate what the ETag should be:
$ ./s3md5 8 demo-h264.mov
Or verify the existing one:
$ ./s3md5 -e abca46f3fae1b698571c0f08b98618e1-96 8 demo-h264.mov
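If you are doing the comparison in your own code rather than with the script, keep in mind that the ETag in a HEAD response is wrapped in double quotes. The sketch below shows one way to normalize it and split it into the two pieces described earlier. The name `parse_etag` is hypothetical and is not part of any AWS or Spaces tooling.

```python
def parse_etag(raw):
    """Split an ETag into (hex_digest, part_count).

    Hypothetical helper: part_count is None when the ETag has no
    "-<parts>" suffix, i.e. it does not look like a multipart ETag.
    """
    # Drop surrounding whitespace and the double quotes used in HTTP headers.
    etag = raw.strip().strip('"')
    digest, sep, parts = etag.partition("-")
    return digest.lower(), (int(parts) if sep else None)
```

For the object above, `parse_etag('"abca46f3fae1b698571c0f08b98618e1-96"')` returns the digest and the part count of 96, which tells you how many parts to expect when choosing a part size to test.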