
Amazon S3 Supports New Checksum Algorithms for Integrity Checking

source link: https://www.infoq.com/news/2022/03/aws-s3-checksum-algorithms/


Mar 13, 2022 2 min read

Amazon S3 recently introduced support for four checksum algorithms for data integrity checking on upload and download requests. Amazon claims that the enhancements to the AWS SDK and S3 API accelerate integrity checking of S3 requests by up to 90%.

Depending on application needs, developers can choose the SHA-1, SHA-256, CRC32, or CRC32C checksum algorithm and verify the integrity of the data either by providing a precalculated checksum or by having the AWS SDK automatically calculate a checksum as it streams data into S3.
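For example, a precalculated SHA-256 checksum is passed to S3 as the base64-encoded digest of the object body. The sketch below computes that value with the standard library; the boto3 calls in the comments are illustrative only, with placeholder bucket and key names:

```python
import base64
import hashlib

def sha256_checksum_b64(data: bytes) -> str:
    """Return the base64-encoded SHA-256 digest, the format S3 expects
    for a precalculated SHA-256 checksum on upload."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")

body = b"hello, integrity checking"
checksum = sha256_checksum_b64(body)
print(checksum)

# With boto3 (bucket/key names are placeholders), the checksum can either be
# supplied precalculated or computed by the SDK during the upload:
#
# import boto3
# s3 = boto3.client("s3")
# s3.put_object(Bucket="my-bucket", Key="example.txt", Body=body,
#               ChecksumSHA256=checksum)        # precalculated by the client
# s3.put_object(Bucket="my-bucket", Key="example.txt", Body=body,
#               ChecksumAlgorithm="SHA256")     # computed by the SDK
```

If the value sent in the request does not match the checksum S3 computes on its side, the upload is rejected.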

The checksum and the specified algorithm are stored as part of the object’s metadata, persist even if the object changes storage class, and are copied as part of S3 replication. Jeff Barr, vice president and chief evangelist at AWS, explains the benefits of the new feature for uploads:

Computing checksums for large (multi-GB or even multi-TB) objects can be computationally intensive, and can lead to bottlenecks. The newest versions of the AWS SDKs compute the specified checksum as part of the upload, and include it in an HTTP trailer at the conclusion of the upload. (...) S3 will verify the checksum and accept the operation if the value in the request matches the one computed by S3. In combination with the use of HTTP trailers, this feature can greatly accelerate client-side integrity checking.

[Image] Source: https://aws.amazon.com/blogs/aws/new-additional-checksum-algorithms-for-amazon-s3/

The S3 API can also calculate and store part-level checksums for objects uploaded through S3 multipart upload. Previously, AWS suggested using the Content-MD5 header to check the integrity of an object. Peter Mescalchin, software engineer at Flip, tweets:

Pretty excited to see S3 can now provide alternative checksum algorithms for uploaded objects. I'm thinking pieces are close to being in place to use SHA-256 with Terraform S3 -> Lambda function deployments and source code hashes.
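For multipart uploads, S3 stores a checksum per part and reports a composite "checksum of checksums" for the whole object, suffixed with the part count. The following is a rough sketch of that scheme, assuming SHA-256 and illustrative part data (the helper name and part contents are made up for the example):

```python
import base64
import hashlib

def composite_sha256(parts: list[bytes]) -> str:
    """Sketch of the composite checksum-of-checksums scheme used for
    multipart uploads: hash each part, hash the concatenation of the
    part digests, then append the number of parts."""
    part_digests = [hashlib.sha256(p).digest() for p in parts]
    combined = hashlib.sha256(b"".join(part_digests)).digest()
    return base64.b64encode(combined).decode("ascii") + f"-{len(parts)}"

# Two illustrative "parts" of a multipart upload.
print(composite_sha256([b"part one bytes", b"part two bytes"]))
```

Because the composite value depends on part boundaries, verifying it client-side requires splitting the local file into the same part sizes used for the upload.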

Aaron Booth, freelance consultant, asks:

What’s the use case of giving the user 4 different options then a choice of two such as the most performant or most accurate?

Kevin Miller, vice president & GM for S3 at AWS, explains:

Compatibility with other applications that use one algorithm or another. In our experience it’s best when you can have an end-to-end checksum from the data origin to the final point of consumption, so we built support for the most popular (and will add more by customer demand!)

The additional checksums are available in all AWS regions and there is no extra cost associated with the feature.

About the Author

Renato Losio

Renato has many years of experience as a software engineer, tech lead, and cloud services specialist in Italy, the UK, Portugal, and Germany. He lives in Berlin and works remotely as principal cloud architect for Funambol. Cloud services and relational databases are his main working interests. He is an AWS Data Hero.
