Introducing Amazon S3 Transfer Manager in the AWS SDK for Java 2.x

The global data ecosystem has grown rapidly over the last decade, and choosing the right data technology has become a challenge. With more than 32% of the world's public cloud market share, Amazon Web Services (AWS) is the leader in this space. It serves almost 190 countries with an emphasis on scalability, durability, and security. Since its launch, Amazon S3 has become an integral part of data storage and data management at thousands of companies.

Amazon Simple Storage Service (Amazon S3) is a cloud storage service provided by AWS. With its key-based object storage architecture, Amazon S3 is well suited to storing massive amounts of structured and unstructured data. Unlike the file systems of familiar operating systems, Amazon S3 does not organize data as files in a directory hierarchy; it stores each file as an object identified by a key. Users upload files to object storage much as they would to popular cloud storage products such as Dropbox and Google Drive.

What is Amazon S3 Transfer Manager?

 

Transfer Manager is one of the most useful high-level APIs in the AWS SDK (Amazon Web Services Software Development Kit). It provides easy and convenient management of uploads and downloads between your application and Amazon S3, hiding the complex process of transferring files behind a simple API. Transfer Manager performs two operations: upload and download. Each operation returns an object that you can use to interact with the transfer, for example to wait for it to complete.

Whenever possible, Transfer Manager uses multiple threads to upload several parts of a single object at once. When dealing with massive data sets, this can significantly increase throughput. Transfer Manager is built on top of the Java bindings of the AWS Common Runtime S3 client.

 

Parallel Upload via Multipart Upload

Multipart upload lets you upload a single object as a set of smaller parts. You can upload the parts independently and in any order; after all parts are uploaded, Amazon S3 assembles them and presents the data as a single object. As a rule of thumb, once your object size reaches 100 MB, you should consider multipart upload instead of a single operation, because it allows the parts to be uploaded in parallel.

Transfer Manager uses the Amazon S3 multipart upload API for upload operations. It converts a single PutObjectRequest into multiple multipart upload requests and sends them concurrently to achieve higher throughput.
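To make the splitting step concrete, here is a minimal sketch of how an object of a given size could be divided into parts. It is an illustration only, not Transfer Manager's actual implementation: the class name MultipartPlan, the Part type, and the 8 MiB part size are hypothetical. The two limits it checks (at most 10,000 parts, and at least 5 MiB for every part except the last) are the documented Amazon S3 multipart upload limits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the AWS SDK.
final class MultipartPlan {
    // Amazon S3 multipart limits: every part except the last must be
    // at least 5 MiB, and one upload may contain at most 10,000 parts.
    static final long MIN_PART_SIZE = 5L * 1024 * 1024;
    static final int MAX_PARTS = 10_000;

    /** One part of the object: 1-based part number, start offset, length in bytes. */
    static final class Part {
        final int partNumber;
        final long offset;
        final long length;

        Part(int partNumber, long offset, long length) {
            this.partNumber = partNumber;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Splits objectSize bytes into consecutive parts of at most partSize bytes. */
    static List<Part> plan(long objectSize, long partSize) {
        if (objectSize <= 0) {
            throw new IllegalArgumentException("objectSize must be positive");
        }
        if (partSize < MIN_PART_SIZE) {
            throw new IllegalArgumentException("partSize must be at least 5 MiB");
        }
        long count = (objectSize + partSize - 1) / partSize; // ceiling division
        if (count > MAX_PARTS) {
            throw new IllegalArgumentException("too many parts; use a larger partSize");
        }
        List<Part> parts = new ArrayList<>();
        long offset = 0;
        for (int number = 1; offset < objectSize; number++) {
            // The final part carries whatever remains and may be shorter.
            long length = Math.min(partSize, objectSize - offset);
            parts.add(new Part(number, offset, length));
            offset += length;
        }
        return parts;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        List<Part> parts = plan(100 * mib, 8 * mib); // 8 MiB is an assumed part size
        Part last = parts.get(parts.size() - 1);
        // prints "13 parts, last part 4194304 bytes"
        System.out.println(parts.size() + " parts, last part " + last.length + " bytes");
    }
}
```

Because the parts do not overlap and each carries its own offset and length, they can be handed to separate threads and uploaded in any order.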

 

Parallel Download via Byte-Range Fetches

Transfer Manager uses byte-range fetches for download operations. The Range HTTP header in a GET Object request lets you fetch only a specific byte range of an object. Transfer Manager splits a GetObjectRequest into multiple smaller requests, each of which retrieves a specific portion of the object, and can issue them in parallel. This yields higher aggregate throughput than a single whole-object request. Fetching smaller ranges of a large object also improves retry times when a request is interrupted, since only the affected range has to be fetched again.
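The following sketch shows how the Range header values for such a split could be computed. It is a simplified illustration under stated assumptions, not the logic Transfer Manager uses internally: the class name ByteRanges and the chunk size are hypothetical. The header format itself (bytes=start-end, with an inclusive end offset) is standard HTTP.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the AWS SDK.
final class ByteRanges {
    /**
     * Returns one HTTP Range header value per chunk, covering the whole
     * object. Both offsets in "bytes=start-end" are inclusive.
     */
    static List<String> rangeHeaders(long objectSize, long chunkSize) {
        if (objectSize <= 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        List<String> headers = new ArrayList<>();
        for (long start = 0; start < objectSize; start += chunkSize) {
            // The last chunk is clamped to the final byte of the object.
            long end = Math.min(start + chunkSize, objectSize) - 1;
            headers.add("bytes=" + start + "-" + end);
        }
        return headers;
    }

    public static void main(String[] args) {
        // A tiny 10-byte object fetched in 4-byte chunks, for readability.
        // prints "[bytes=0-3, bytes=4-7, bytes=8-9]"
        System.out.println(rangeHeaders(10, 4));
    }
}
```

Each value would be sent as the Range header of its own GET request. If one request fails, only that range needs to be requested again.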

With Transfer Manager 1.x, an object that was uploaded in a single operation cannot be downloaded any faster: to benefit from parallel downloads, the object must have been uploaded using multipart upload. This is no longer a limitation in Transfer Manager 2.x, where download performance does not depend on how the object was originally uploaded.

 

Getting Started

  • Add a dependency for the Transfer Manager

First, add the Transfer Manager to your project. It is published as a separate Maven dependency.

XML:

<dependency>
  <groupId>software.amazon.awssdk</groupId>
  <artifactId>s3-transfer-manager</artifactId>
  <version>2.17.123-PREVIEW</version>
</dependency>

  • Instantiate the Transfer Manager

You can instantiate the Transfer Manager with its default settings:

Java:

S3TransferManager transferManager = S3TransferManager.create();

  • Upload a file to Amazon S3

To upload a file to Amazon S3, provide the source file path along with the PutObjectRequest to use for the upload.

Java:

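import java.nio.file.Paths;

// Start the upload; the transfer runs asynchronously.
FileUpload upload = transferManager.uploadFile(b -> b.source(Paths.get("myFile.txt"))
                                                  .putObjectRequest(req -> req.bucket("bucket")
                                                                         .key("key")));
// Block until the upload has finished.
upload.completionFuture().join();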

  • Download an Amazon S3 Object to a File

To download an object from Amazon S3, provide the destination file path along with the GetObjectRequest to use for the download.

Java:

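import java.nio.file.Paths;

// Start the download; the transfer runs asynchronously.
FileDownload download =
    transferManager.downloadFile(b -> b.destination(Paths.get("myFile.txt"))
                                       .getObjectRequest(req -> req.bucket("bucket")
                                                                   .key("key")));
// Block until the download has finished.
download.completionFuture().join();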
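Both snippets wait on completionFuture().join(), which rethrows a failed transfer as an unchecked exception. As a rough sketch of one way to handle that, the example below uses plain java.util.concurrent.CompletableFuture instances as stand-ins for the future returned by a transfer, so it runs without the SDK or an AWS account. The class name TransferCompletion and the helper awaitStatus are hypothetical, and the simulated error is invented for the demonstration.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical helper for illustration; not part of the AWS SDK.
final class TransferCompletion {
    /** Blocks until the future completes and reports the outcome as a string. */
    static String awaitStatus(CompletableFuture<?> completion) {
        try {
            completion.join();
            return "completed";
        } catch (CompletionException e) {
            // join() wraps the original failure; the real cause is in getCause().
            return "failed: " + e.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        // Stand-ins for upload.completionFuture() / download.completionFuture().
        CompletableFuture<String> succeeded = CompletableFuture.completedFuture("done");
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IOException("connection reset")); // simulated error

        System.out.println(awaitStatus(succeeded)); // prints "completed"
        System.out.println(awaitStatus(failed));    // prints "failed: connection reset"
    }
}
```

If blocking is undesirable, the same future can be observed with a callback such as whenComplete instead of join.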

 

Conclusion:

Customers of all sizes and industries can use Amazon S3 to store and protect any amount of data for a range of use cases, such as data analytics, data lakes, backup and restore, and much more. Transfer Manager 2.x improves on Transfer Manager 1.x, most notably because download performance no longer depends on how an object was uploaded. For complete documentation, see the developer guide for the AWS SDK for Java 2.x and the Transfer Manager source code on GitHub.
