Abhijith S Nair
Partner Solutions Architect at Amazon Web Services
Published Jan 10, 2023
Note: This article was originally posted on abhijithnair.com.
Amazon S3 is an object storage service provided by AWS, and depending on the size of the object you are uploading, there are a few limits you must know: an object can be up to 5 TB in size, but a single PUT operation can upload an object of at most 5 GB.
But if a single PUT operation can upload an object of at most 5 GB, how do you upload a larger file? This is where Amazon S3's multipart upload feature comes in. With multipart upload, you upload a single object to Amazon S3 as a set of parts. Each part is a contiguous portion of the object's data and can be uploaded independently, in any order. Once all the parts are uploaded, Amazon S3 assembles them into a single object. Multipart upload also lets you resume the upload if the connection breaks partway through. AWS recommends using multipart upload for objects of 100 MB or larger.
What to expect?
In this blog, I will explain how to upload a video file into Amazon S3 using the S3 Multipart upload feature.
Prerequisites
For this walkthrough, I have a 232.3 MB video file on my local machine. I also have an Amazon S3 bucket named my-s3-multipart-upload, created in the us-east-1 region with all the default settings. I will be using AWS CloudShell to perform the following tasks.
Procedure
1. Log in to AWS CloudShell and upload the video file.
2. Use the split command to split the original file into contiguous 100 MB chunks. This produces three parts in total: file-aa, file-ab, and file-ac.
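The split step can be sketched as follows. Note that myvideo.mp4 is a placeholder name I am using for illustration (the original post does not name the file), and the dd line merely creates a stand-in file of roughly the same size so the commands run on their own:

```shell
# Create a stand-in ~232 MB file for demonstration (skip this if you
# already uploaded your real video file to CloudShell):
dd if=/dev/zero of=myvideo.mp4 bs=1M count=232 status=none

# Split into contiguous 100 MB chunks; split names them with alphabetic
# suffixes appended to the given prefix: file-aa, file-ab, file-ac.
split -b 100M myvideo.mp4 file-

# List the resulting parts:
ls file-*
```

With a 232 MB input, the first two parts are exactly 100 MB and the last one holds the remaining ~32 MB.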
3. Install openssl to generate MD5 checksum values for our files.
4. Generate MD5 checksum values for the files and save them for later use.
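This step can be sketched as below. The file-aa and file-ab names match the output of the split step; the dd lines create stand-in files here only so that the commands run on their own:

```shell
# Stand-in part files (replace with the real parts from the split step):
dd if=/dev/zero of=file-aa bs=1M count=1 status=none
dd if=/dev/zero of=file-ab bs=1M count=1 status=none

# openssl prints one digest per file, in the form "MD5(file-aa)= <hex digest>":
openssl md5 file-aa file-ab
```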
5. Start the multipart upload process using the following command. Copy and save the UploadId returned in the response.
aws s3api create-multipart-upload --bucket <bucket_name> --key <original_file_name> --metadata md5=<original_file_checksum_value>
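On success, create-multipart-upload returns a JSON response along these lines; the bucket, key, and UploadId values below are illustrative placeholders, not real ones:

```json
{
    "Bucket": "my-s3-multipart-upload",
    "Key": "myvideo.mp4",
    "UploadId": "EXAMPLE-UPLOAD-ID"
}
```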
6. Upload each individual file part using the following command, incrementing --part-number for each one. The upload-part command returns an ETag value for each part.
aws s3api upload-part --bucket <bucket_name> --key <original_file_name> --part-number 1 --body <file_name_1> --upload-id <upload_id_from_step5>
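Since the command must be repeated once per part with an increasing --part-number, a small loop can help. This is a sketch with placeholder values for the bucket, key, and upload ID; the echo prefix prints each command instead of running it, so drop echo to perform the real uploads:

```shell
# Placeholder values; substitute your own bucket, key, and UploadId:
BUCKET=my-s3-multipart-upload
KEY=myvideo.mp4
UPLOAD_ID=EXAMPLE-UPLOAD-ID

# Issue one upload-part call per chunk, numbering the parts 1, 2, 3:
n=1
for part in file-aa file-ab file-ac; do
  echo aws s3api upload-part --bucket "$BUCKET" --key "$KEY" \
    --part-number "$n" --body "$part" --upload-id "$UPLOAD_ID"
  n=$((n + 1))
done
```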
7. List all the file parts using the following command.
aws s3api list-parts --bucket <bucket_name> --key <original_file_name> --upload-id <upload_id_from_step5>
8. Copy the PartNumber and ETag values for all the file parts into a JSON file. You can use an editor such as nano to create it.
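The JSON file passed to complete-multipart-upload should follow this shape; the ETag values below are placeholders, so substitute the ones returned by upload-part or list-parts:

```json
{
  "Parts": [
    { "PartNumber": 1, "ETag": "etag-of-file-aa" },
    { "PartNumber": 2, "ETag": "etag-of-file-ab" },
    { "PartNumber": 3, "ETag": "etag-of-file-ac" }
  ]
}
```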
9. Complete the upload process using the following command.
aws s3api complete-multipart-upload --multipart-upload file://<JSON_file> --bucket <bucket_name> --key <original_file_name> --upload-id <upload_id_from_step5>
10. The original file can now be fetched from the Amazon S3 bucket.
I hope this blog helped explain how to upload a large file into Amazon S3 using the S3 Multipart Upload feature. Please do check out my other blogs on this portfolio.
Until then, Happy Blogging!