Using Azure Java API to upload data to Microsoft Azure Blob Storage

Intent

Develop a simple, standalone webMethods Integration Server (IS) service that uses the Azure Java API to upload data to Microsoft Azure Blob Storage. This should work for IS version 10.1 and above.

Overview

See the discussion here, where this approach was formulated. Alternatives such as an Azure REST-based solution (simpler, architecturally cleaner) and webMethods CloudStreams were also discussed there.

The alternatives may be better, but this approach has these key features:

  • Simple to use
  • Uses a pre-generated Shared Access Signature (SAS) token for authentication
  • Fewer moving parts (no CloudStreams component or calls to login.microsoftonline.com for an OAuth token)
  • Code can be made adapter-like (add calls to list and download blobs, separate connection creation and data transfer)
  • Java API may offer functionality (such as parallel uploads, asynchronous transfers and transfer pause-restart) that is difficult to access via other mechanisms
  • Installing code in a separate package with package-level classloader (see NOTES in documentation section) should mitigate JAR version conflicts in future IS upgrades
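
For comparison, the REST-based alternative mentioned above can be sketched with JDK classes only. This is a rough sketch and not the approach used in this article: the class RestBlobUploadSketch and its method names are hypothetical. It assumes the Put Blob operation of the Blob service REST API, with the SAS token appended as the query string and the x-ms-blob-type header set to BlockBlob. It has not been tested on IS.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;

/** Hypothetical helper (not part of the original service): uploads bytes with one Put Blob call. */
final class RestBlobUploadSketch {

    /** Builds the blob URL; the SAS token becomes the query string. */
    static String blobUrl(String accountName, String container, String blobName, String sasToken) {
        // Tolerate a token that was copied with its leading '?'
        String query = sasToken.startsWith("?") ? sasToken.substring(1) : sasToken;
        return "https://" + accountName + ".blob.core.windows.net/"
                + container + "/" + encodePath(blobName) + "?" + query;
    }

    /** Percent-encodes each path segment but keeps '/' separators. */
    static String encodePath(String path) {
        String[] segments = path.split("/", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) out.append('/');
            try {
                // URLEncoder writes spaces as '+', which is only valid in query strings
                out.append(URLEncoder.encode(segments[i], "UTF-8").replace("+", "%20"));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e); // UTF-8 is always available
            }
        }
        return out.toString();
    }

    /** Sends the bytes with HTTP PUT and returns the HTTP status code (201 is expected on success). */
    static int upload(String url, byte[] data) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        conn.setRequestProperty("x-ms-blob-type", "BlockBlob"); // header required by Put Blob
        conn.setFixedLengthStreamingMode(data.length);          // sets Content-Length
        try (OutputStream os = conn.getOutputStream()) {
            os.write(data);
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }
}
```

A caller would pass the result of blobUrl(...) to upload(...) and treat any status other than 201 as a failure. The sketch sends the whole payload in a single request, so it offers none of the parallel or resumable transfer features listed above.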

Details

Code

Here’s the code to upload data (file or bytes) into an Azure Blob Storage container. Copy-paste the relevant imports and method body into SAG Designer’s Java service editor.

Note that a large number of Java dependencies must be installed in the package’s code/jars folder. This is documented further below. Also, this code has so far been tested only against existing containers; creating a new container is untested.

import com.wm.data.*;
import com.wm.util.Values;
import com.wm.app.b2b.server.Service;
import com.wm.app.b2b.server.ServiceException;
import com.azure.storage.blob.*;
import com.azure.storage.blob.models.*;
import com.azure.storage.*;
import com.azure.storage.common.*;
import com.azure.core.exception.*;
import com.azure.core.util.BinaryData;

public final class deliverDataToAzureBlobStorageService_SVC

{

	/** 
	 * The primary method for the Java service
	 *
	 * @param pipeline
	 *            The IData pipeline
	 * @throws ServiceException
	 */
	public static final void deliverDataToAzureBlobStorageService(IData pipeline) throws ServiceException {
		// pipeline
		IDataCursor pipelineCursor = pipeline.getCursor();
			Object	sourceBytes = IDataUtil.get( pipelineCursor, "sourceBytes" );
			String	sourceFilename = IDataUtil.getString( pipelineCursor, "sourceFilename" );
			String	destinationBlobName = IDataUtil.getString( pipelineCursor, "destinationBlobName" );
			String	accountName = IDataUtil.getString( pipelineCursor, "accountName" );
			String	accountSharedAccessSignature = IDataUtil.getString( pipelineCursor, "accountSharedAccessSignature" );
			String	destinationContainerName  = IDataUtil.getString( pipelineCursor, "destinationContainerName" );
			String	createContainer = IDataUtil.getString( pipelineCursor, "createContainer" );
		pipelineCursor.destroy();
		
		/* Input handling */
		//Check that either the sourceBytes or sourceFilename input was specified (sourceBytes has preference)
		BinaryData binaryData = null;
		if (sourceBytes != null ) {
			binaryData = BinaryData.fromBytes((byte[]) sourceBytes);
		} else {
			if (sourceFilename == null) {
				throw new ServiceException ("Error: sourceBytes and sourceFilename cannot both be null");
			}
		}
		// default createContainer to false
		if (createContainer == null ) createContainer = "false";  
		
		/* Authenticate and upload data */
		//Create a new BlobServiceClient with the input SAS token
		BlobServiceClient blobServiceClient = new BlobServiceClientBuilder()
		    .endpoint("https://"+accountName+".blob.core.windows.net")
		    .sasToken(accountSharedAccessSignature)
		    .buildClient();    
		 
		//Create new container client. Have it either create a container or attach to an existing container 	
		BlobContainerClient containerClient =null;  
		if (createContainer.equals("true")) {
			containerClient = blobServiceClient.createBlobContainer(destinationContainerName);
		} else {
			containerClient = blobServiceClient.getBlobContainerClient(destinationContainerName);
		}
		
		//Upload data to the container. If the sourceBytes input is available, data is sourced from that.    
		//If it is null, data is instead sourced from the sourceFilename input (representing a local file). 
		BlobClient blobClient = null;
		blobClient = containerClient.getBlobClient(destinationBlobName);
		if (binaryData == null ) {
			blobClient.uploadFromFile(sourceFilename);
		} else {
			blobClient.upload(binaryData);
		}
		
		// There is no output pipeline because the Microsoft BlobClient.upload* methods above do not return status.
		// The transfer is deemed successful if no exception was thrown by this service.
			
	}
}

Documentation

Here’s the documentation accompanying this service. The NOTES section documents how dependencies that need to be installed in the package’s code/jars folder (40 JAR files in all) are sourced.

INPUT
========
sourceBytes - Optional. A byte array with content to be uploaded to Azure Blob Storage service. Either this input, or the sourceFilename input must be provided. 
	If both are specified, the sourceBytes input is preferred. 
sourceFilename - Optional. Path to local file to be uploaded to Azure Blob Storage service. Either this input, or the sourceBytes input must be specified. 
	If both are specified, the sourceBytes input is preferred. 
destinationBlobName - The name of the blob to be created in the Azure Blob Storage service to hold the source data. 
accountName - The account name used to authenticate to the Azure Blob Storage service. 
accountSharedAccessSignature - The Shared Access Signature (SAS) token used to authenticate to the Azure Blob Storage service. 
	E.g., 'sp=asdfghj&st=2022-01-20T01:55:17Z&se=2024-01-20T09:55:17Z&sv=2020-08-04&sr=c&sig=asdfghj%2Basdfghjk%2Basdfghjk%3D'
destinationContainerName - The name of the Azure Blob Storage container where the blob will be created. This can include a path within the container. 
	E.g. 'inbound/webmethods'
createContainer - true/false value (default is false). Whether the destination storage container must be created before blob upload is attempted. 


OUTPUT
=======
(None)


PROCESS
=======
This service accepts input data and uploads it to an Azure Blob Storage container. It does so using the Azure Blob Storage Java library. It uses a Shared Access Signature 
(SAS) token to authenticate to the Azure Blob Storage service. For details, see NOTES. 

The Java code in this service was adapted from sample Microsoft code available here:
https://docs.microsoft.com/en-us/java/api/overview/azure/storage?view=azure-java-stable

If the sourceBytes input is specified, data is preferentially sourced from this input. Otherwise, data is sourced from the sourceFilename input (representing a local file).
	
This service returns no output. This is because the Microsoft Azure Blob Storage API BlobClient.upload* methods do not return status. Instead, data transfer is deemed 
successful if no exception was thrown by this service.
			

NOTES
========
1. Installing Azure Blob Storage Java Libraries
This Java service uses external open-source Azure Blob Storage Java libraries published by Microsoft, along with the other open-source libraries they depend on. 
For this service to work, several external JARs (40 of them!) must be sourced and installed into the package's code/jars folder. You can use the Maven 
build management application for this. Follow the procedure below. 

1.1. Install Maven 
(Adapted from https://tecadmin.net/install-apache-maven-on-fedora/ )
--------------------------------------------------------------------------------------------------
wget https://dlcdn.apache.org/maven/maven-3/3.8.4/binaries/apache-maven-3.8.4-bin.tar.gz
sudo tar xzf apache-maven-3.8.4-bin.tar.gz -C /opt
cd /opt && sudo ln -s apache-maven-3.8.4 maven
sudo vi /etc/profile.d/maven.sh
# Add this content:
_______________________________________________
export M2_HOME=/opt/maven
export PATH=${M2_HOME}/bin:${PATH}
_______________________________________________
source /etc/profile.d/maven.sh
mvn -version
--------------------------------------------------------------------------------------------------

1.2. Initialize Maven and generate a dummy project
(Adapted from https://maven.apache.org/guides/getting-started/maven-in-five-minutes.html )
Carry out the following steps in a <work-folder> 
--------------------------------------------------------------------------------------------------
cd <work-folder>
mvn archetype:generate -DgroupId=com.mycompany.app -DartifactId=my-app -DarchetypeArtifactId=maven-archetype-quickstart -DarchetypeVersion=1.4 -DinteractiveMode=false
--------------------------------------------------------------------------------------------------

1.3. Configure Maven to download all required JARs for Azure Blob storage locally
(Adapted from https://technology.amis.nl/software-development/java/download-all-directly-and-indirectly-required-jar-files-using-maven-install-dependencycopy-dependencies/ )
The dependency list was adapted from: https://docs.microsoft.com/en-us/java/api/overview/azure/storage?view=azure-java-stable
Only the 'azure-storage-blob' dependency was chosen. Use the current version; otherwise you can expect a "Could not resolve dependencies" error. 
--------------------------------------------------------------------------------------------------
cd my-app
vi pom.xml
# Add the following to the dependencies section:
_______________________________________________
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-storage-blob</artifactId>
    <version>12.4.0</version>
</dependency>
_______________________________________________
--------------------------------------------------------------------------------------------------

1.4. Download all JARs 
Run this Maven command in <work-folder>/my-app (the folder containing pom.xml) 
--------------------------------------------------------------------------------------------------
mvn install dependency:copy-dependencies
--------------------------------------------------------------------------------------------------
Maven now downloads 39 JAR files to the 'my-app/target/dependency/' subfolder.

1.5. Copy the 39 downloaded JAR files into the IS <package>/code/jars folder.
--------------------------------------------------------------------------------------------------
cd <work-folder>/my-app/target/dependency/
...
cp *.jar <IS>/<package>/code/jars
--------------------------------------------------------------------------------------------------
1.6. Reload your package
At this point, running the service in IS 10.1 gets past errors about missing JARs but returns this new dependency error below.
--------------------------------------------------------------------------------------------------
java.lang.reflect.InvocationTargetException: Package versions: jackson-annotations=2.10.1, jackson-core=2.10.1, jackson-databind=2.10.1, jackson-dataformat-xml=unknown, jackson-datatype-jsr310=unknown, azure-core=1.22.0, Troubleshooting version conflicts: https://aka.ms/azsdk/java/dependency/troubleshoot
--------------------------------------------------------------------------------------------------
This is probably because IS 10.1 loads a 1.x version of the Jackson JAR, as shown on its 'About' page. 
--------------------------------------------------------------------------------------------------
 <SAG-10.1-folder>/IntegrationServer/lib/jars/jackson-coreutils-1.8.jar
--------------------------------------------------------------------------------------------------
However, the JAR version used by Azure is 2.x
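
One way to confirm which JAR actually supplies a class at runtime is to ask the class itself. The following is a minimal diagnostic sketch, not part of the original service: the class ClassOriginSketch is hypothetical, and it assumes the code is pasted into a Java service in the same package, so that its own class loader is the one the package uses.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

/** Hypothetical diagnostic helper: reports which class loader and JAR supplied a class. */
final class ClassOriginSketch {

    /** Describes the loader and, where known, the location a class was loaded from. */
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader(); // null means the bootstrap loader
        ProtectionDomain domain = type.getProtectionDomain();
        CodeSource source = (domain == null) ? null : domain.getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "unknown" : source.getLocation().toString();
        String loaderName = (loader == null) ? "bootstrap" : loader.getClass().getName();
        return type.getName() + " loaded by " + loaderName + " from " + location;
    }

    /** Resolves the class by name through this helper's own class loader, then describes it. */
    static String describe(String className) {
        try {
            ClassLoader own = ClassOriginSketch.class.getClassLoader();
            return describe(Class.forName(className, false, own));
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " not loadable: " + e;
        }
    }
}
```

For example, calling ClassOriginSketch.describe("com.fasterxml.jackson.databind.ObjectMapper") from the service and writing the result to the pipeline or the server log should show whether the Jackson classes come from the package's code/jars folder or from the IS lib folder.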

1.7. Configure package to prioritise local JARs
Basically, the package's Java code (and the classes that it calls) must be compelled to use the newly installed Jackson version 2.x JARs 
(instead of the Jackson version 1.x JARs packaged by IS). For this, edit the manifest.v3 file of the package where this Java service is stored,
as noted in the documentation below. 
-------------------------------------------------------------------------------------------------
[From 'webMethods Integration Server Administrator’s Guide Version 10.1' page 46]

A package's manifest.v3 file controls a number of characteristics of a package,
including whether the package's class loader defers to its parent class loader. The
default is to defer to the parent class loader. However, Integration Server will use the
package class loader instead, if the following is specified in the manifest.v3 file:
____________________________________________________________________
<value name='classloader'>package</value>
____________________________________________________________________
If a package uses its own class loader, the jar files containing the classes you
want to make available must be in the
Integration Server_directory\instances\instance_name\packages\packageName\code\jars directory.
-------------------------------------------------------------------------------------------------

1.8. Work around the circular dependency in slf4j packages
At this point, running the service returns the following error:
--------------------------------------------------------------------------------------------------
Could not run 'deliverDataToAzureBlobStorageService'
java.lang.reflect.InvocationTargetException: loader constraint violation: when resolving method "org.slf4j.impl.StaticLoggerBinder.getLoggerFactory()Lorg/slf4j/ILoggerFactory;" 
the class loader (instance of com/wm/app/b2b/server/PackageClassLoader) of the current class, org/slf4j/LoggerFactory, 
and the class loader (instance of java/net/URLClassLoader) for the method's defining class, org/slf4j/impl/StaticLoggerBinder, have different Class objects for the type org/slf4j/ILoggerFactory used in the signature
--------------------------------------------------------------------------------------------------

Thankfully, I could wing it after coming across this article: 
	https://documentation.tribefire.com/tribefire.cortex.documentation/concepts-doc/features/tribefire-modules/troubleshooting/slf4j-api-linkage-error.html
This article suggested installing slf4j-jdk14-1.7.32.jar. This is an unlisted dependency that accompanies the slf4j-api-1.7.32 JAR 
dependency that Maven had automatically downloaded. Adding this JAR resolves a sort of crazy circular dependency in the slf4j package.
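
To check whether the binding class named in the error is present in the package's code/jars folder, the JARs can be scanned for it. This is a small sketch under stated assumptions: the class JarScanSketch is hypothetical, and the entry name org/slf4j/impl/StaticLoggerBinder.class is taken from the error message above.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.jar.JarFile;

/** Hypothetical helper: finds the JARs in a folder that contain a given class file. */
final class JarScanSketch {

    /**
     * Returns the file names of all JARs in jarFolder that contain entryName,
     * for example "org/slf4j/impl/StaticLoggerBinder.class".
     */
    static List<String> jarsContaining(File jarFolder, String entryName) throws IOException {
        List<String> hits = new ArrayList<>();
        File[] files = jarFolder.listFiles((dir, name) -> name.endsWith(".jar"));
        if (files == null) {
            return hits; // folder does not exist or cannot be read
        }
        Arrays.sort(files); // stable, alphabetical output
        for (File file : files) {
            try (JarFile jar = new JarFile(file)) {
                if (jar.getEntry(entryName) != null) {
                    hits.add(file.getName());
                }
            }
        }
        return hits;
    }
}
```

An empty result for the code/jars folder would suggest that no slf4j binding is installed in the package, so the binding class would have to come from a different class loader. After adding slf4j-jdk14-1.7.32.jar, the result should list that JAR.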

1.9. Acknowledgements
I was assisted greatly by these two forum posts. The first post suggests a possibly better architecture for this solution. The second suggested 
the workaround in point 1.8 above.
--------------------------------------------------------------------------------------------------
https://tech.forums.softwareag.com/t/how-to-get-on-prem-is-to-upload-a-file-into-azure-blob-storage/254459/10
https://tech.forums.softwareag.com/t/can-is-java-service-use-a-different-version-jar-than-one-provided-by-is/254533
--------------------------------------------------------------------------------------------------


2. SAS Token Expiry
The 'Shared Access Signature' (SAS) input to this service is a token that enables it to exchange data with Azure Blob Storage infrastructure. A SAS token has a defined lifetime. 
When it expires, integration breaks. To prevent this, a new token must be generated and configured on both systems (Azure Blob Storage and webMethods integration) prior
to the expiry date.
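
The expiry date can be read from the token itself, which makes it possible to warn before the integration breaks. The sketch below is illustrative only: the class SasExpirySketch is hypothetical, and it assumes the 'se' parameter holds a full UTC timestamp such as 2024-01-20T09:55:17Z, as in the example token in the INPUT section. Shorter date-only forms are not handled.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: reads the expiry time out of a SAS token. */
final class SasExpirySketch {

    /** Returns the value of the 'se' (expiry) parameter as an Instant. */
    static Instant expiryOf(String sasToken) {
        // Tolerate a token that was copied with its leading '?'
        String query = sasToken.startsWith("?") ? sasToken.substring(1) : sasToken;
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("se")) {
                return Instant.parse(decode(pair.substring(eq + 1)));
            }
        }
        throw new IllegalArgumentException("SAS token has no 'se' parameter");
    }

    /** Whole days between 'now' and the token expiry; negative once the token has expired. */
    static long daysUntilExpiry(String sasToken, Instant now) {
        return Duration.between(now, expiryOf(sasToken)).toDays();
    }

    /** The timestamp may be percent-encoded (':' written as %3A), so decode it first. */
    private static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

A scheduled service could call daysUntilExpiry(token, Instant.now()) and raise an alert when the result drops below a chosen threshold, for example 30 days.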

For convenience, a SAS token can be made to expire far in the future (possibly as far as the year 9999). This is reasonable because:  
	1. Instead of the sensitive Storage Shared Key credential, the integration uses a SAS token signed by the key. The SAS token is designed to be a limited-access artifact. 
	2. Authorization rights granted to the SAS token used by this service may be revoked without affecting other applications. 
	3. System-to-system integration with Azure Blob Storage can operate indefinitely without an absolute future date before which the token must change.
