Streamlining data extraction and submission with sizes of the order of MBs using webMethods.io Integration

Introduction

This article describes how to streamline data extraction and submission when payload sizes are in the order of megabytes, using webMethods.io Integration.

Audience

It is assumed that readers of this article know how to create integrations on webMethods.io.

Prerequisites

To implement this use case, we have used the following third-party systems:

  1. Azure Blob Storage acting as the source system.
  2. An Azure storage location acting as a temporary cloud location.
  3. An SFTP server acting as the end system.

Use case

• Pull the data from a third-party application, for example Azure Blob Storage, an S3 bucket, or an HTTP endpoint.
• Implement the business logic in a flow service or workflow.
• Temporarily store the data in cloud storage, for example an SFTP location.
• Append each processed chunk to the file in the temporary cloud storage.
• Submit the data to the end system, for example by uploading the files to the SFTP location (a minimal end-to-end sketch of this flow follows this list).
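
To make the overall data flow concrete before looking at the platform-specific steps, here is a purely illustrative sketch in plain Python (not webMethods.io artifacts); the helper functions and file names are hypothetical stand-ins for the connector steps described above.

```python
# Hypothetical skeleton of the use case: pull documents, process each one,
# append the result to a temporary file, then submit the combined file once.

def pull_documents():
    # Stand-in for fetching the source file from Azure Blob Storage.
    return ["record-1", "record-2", "record-3"]

def apply_business_logic(doc):
    # Stand-in for the per-document transformation.
    return doc.upper()

def submit_to_end_system(path):
    # Stand-in for uploading the combined file to the SFTP end system.
    print(f"uploading {path} to the SFTP location")

temp_path = "combined-output.txt"              # temporary storage (here: a local file)
with open(temp_path, "a") as temp_file:
    for doc in pull_documents():               # loop over the document array
        temp_file.write(apply_business_logic(doc) + "\n")  # append per iteration

submit_to_end_system(temp_path)                # submit once all data is appended
```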

Implementation

webMethods.io Integration provides two options to process the data.

• When the developer is a citizen developer and the business logic is not too complex, we can go with the workflow approach.
• When the business logic is complex and the incoming data size is also on the higher side, we should prefer to implement the interfaces using a flow service.

Implementation using a workflow

To implement the above logic in a workflow, we use the out-of-the-box connectors available. The steps are as follows:

  1. Get the files from the Azure Blob Storage location using the Azure Storage connector.
  2. Implement the business logic on the received file. In our case, we loop over the document array available in the source file.
  3. Temporarily save the file to the temporary cloud storage using “Writefile” within the loop.
  4. Keep appending to the file using “File Append” within the loop.
  5. Extract the file from the temporary storage location using “Read file”.
  6. Upload the new file to the SFTP location with a dynamic name starting with workflowSamplefile concatenated with CurrentDateTime (a sketch of the name construction follows this list).
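
For illustration, the dynamic file name in step 6 could be built along these lines. The snippet below is plain Python rather than the workflow's expression syntax, and the timestamp format is an assumption.

```python
# Build a file name from a fixed prefix plus the current date/time,
# mirroring "workflowSamplefile" + CurrentDateTime in step 6.
from datetime import datetime

file_name = "workflowSamplefile" + datetime.now().strftime("%Y%m%d%H%M%S") + ".txt"
print(file_name)  # e.g. workflowSamplefile20240101123000.txt
```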

Limitations

  • The workflow approach can handle incoming data only when it is not too big, for example an average size of roughly 30-40 MB.
  • The workflow approach is suitable when the business logic implementation is not too complex.

Implementation using a flow service

To implement the same use case with a flow service, we have two approaches.

Approach 1: Using in-memory storage to hold the data temporarily.

  1. If we are not receiving a high volume of data, we can extract the data and store it in memory.
  2. In the diagram below, we can see that we have received the file from Azure storage and looped over each item.
  3. In each iteration we implement the business logic and append the result to a document list.
  4. Once all the data has been appended in memory, it can be saved directly to the SFTP location (a minimal sketch follows this list).
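
As a rough illustration of Approach 1 outside the platform, the sketch below does the same thing with the Azure and paramiko Python SDKs; the connection string, container/blob names, and SFTP credentials are placeholders, not values from the attached services.

```python
# Approach 1 sketch: read the source blob, process every record in memory,
# then upload the combined result to the SFTP location in a single call.
import io
import paramiko
from azure.storage.blob import BlobClient

# Read the source file from Azure Blob Storage.
source = BlobClient.from_connection_string(
    "<azure-connection-string>", container_name="source", blob_name="input.txt")
records = source.download_blob().readall().decode("utf-8").splitlines()

# Apply the business logic per record and keep the results in memory.
processed = [record.strip().upper() for record in records]

# Upload the accumulated data to the SFTP location in one shot.
transport = paramiko.Transport(("sftp.example.com", 22))
transport.connect(username="user", password="password")
sftp = paramiko.SFTPClient.from_transport(transport)
sftp.putfo(io.BytesIO("\n".join(processed).encode("utf-8")), "/upload/output.txt")
sftp.close()
transport.close()
```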

Approach 2: Using a cloud storage location to store the data temporarily.

Overview

  1. We can use this approach when the data size is on the higher side and the business logic is too complex to implement in a workflow.

  2. Once the request comes into the flow service, the incoming request is converted into the appropriate format. In our case, we pull the data from the Azure Blob Storage location.

  3. A loop is executed and the data is extracted from the incoming request.

  4. The extracted data is temporarily uploaded to a cloud storage location such as Azure Blob Storage, an S3 bucket, an FTP location, etc. In our case, Azure Blob Storage is the temporary cloud storage location.

  5. Once all the data has been extracted and uploaded to the temporary cloud location, it is fetched and then uploaded to the actual end system, such as an SFTP location (a rough sketch follows this list).
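
The following sketch shows roughly what Approach 2 looks like when expressed directly against the Azure and paramiko Python SDKs rather than flow service steps; an append blob plays the role of the temporary cloud storage, and all names and credentials are placeholder assumptions.

```python
# Approach 2 sketch: append each processed chunk to a temporary append blob,
# then fetch the combined blob once and upload it to the SFTP end system.
import io
import paramiko
from azure.storage.blob import BlobClient

source = BlobClient.from_connection_string(
    "<azure-connection-string>", container_name="source", blob_name="input.txt")
temp = BlobClient.from_connection_string(
    "<azure-connection-string>", container_name="temp", blob_name="combined.txt")
temp.create_append_blob()  # temporary cloud storage location

# Loop: extract, transform, and append each chunk to the temporary blob.
for line in source.download_blob().readall().decode("utf-8").splitlines():
    temp.append_block((line.strip().upper() + "\n").encode("utf-8"))

# Once all the data is appended, fetch the combined file and upload it to SFTP.
combined = temp.download_blob().readall()
transport = paramiko.Transport(("sftp.example.com", 22))
transport.connect(username="user", password="password")
sftp = paramiko.SFTPClient.from_transport(transport)
sftp.putfo(io.BytesIO(combined), "/upload/ProcessedBulkData.txt")
sftp.close()
transport.close()
```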

Points to remember

When implementing this kind of use case on an integration platform, we need to consider multiple related aspects, for example:

  1. What will be the payload frequency?
  2. What will be the overall load on the integration platform?
  3. Which payload attributes need to be logged for monitoring purposes?
  4. If the data size is quite high, how will we send it from one system to another?

Note

The workflow and flow services are attached for reference:
Workflow: export-fl3eab36510cb3966856c6dc-1679645629696.zip (77.4 KB)
FlowServices:
ProcessingLargeDataInMemory.zip (12.4 KB)
ProcessingBulkData.zip (15.3 KB)
UploadBulkDataToSFTP.zip (9.0 KB)
UploadBulkData.zip (9.2 KB)
GetBulkData.zip (8.9 KB)
