This article outlines the steps involved in implementing a solution using StreamSets, webMethods.io Integration, and ARIS Process Mining.
1 Introduction
1.1 Why StreamSets?
StreamSets is a Data Ops platform designed to work with data at large scale. Smart data pipelines can be built and deployed across hybrid and multi-cloud platforms from a single portal.
1.2 Why ARIS Process Mining?
ARIS Process Mining lets you understand business processes to find bottlenecks and opportunities for improvement. Compare designed processes with as-is processes to see whether they execute as planned, and make changes before deviations impact the bottom line.
1.3 How can they work together?
ARIS Process Mining needs process execution data (logs or audit trails) to mine processes, and it is most effective when there is a continuous stream of incoming data. StreamSets, which processes data at large scale, can extract meaningful process audit trails and send them to ARIS Process Mining, which can then analyze the data and produce meaningful insights about the processes.
1.4 Use Case for this article
Assume all the systems involved in the required process publish audit data/activities to Kafka. From the audit data available in Kafka we can build a process mining pipeline.
StreamSets can extract the data from Kafka, then aggregate and transform it. The resulting data can be sent to ARIS Process Mining via webMethods.io Integration.
ARIS Process Mining can use this data to mine the processes and show near real-time process analytics and dashboards.
webMethods.io Integration is required to build a workflow that uploads data to ARIS Process Mining using the Data Ingest APIs. To ingest data, client applications need to call a series of APIs in sequence. As StreamSets is a Data Ops platform, it is not suited to building such a functional app; webMethods.io Integration, with its existing ARIS Process Mining connector, is a suitable platform for this workflow.
2 Prerequisites
⢠ARIS Process Mining Cloud tenant
⢠webMethods.io cloud tenant with Integration enabled
⢠StreamSets Cloud Tenant and StreamSets Data Collector instance, Data Collector should have network access to connect to Kafka instance and internet.
⢠Kafka and Zookeeper Setup.
3 Implementation
3.1 Configure ARIS Process Mining Instance
3.1.1 Enable the Data Ingest APIs
Log in to the ARIS Process Mining instance with a user having the Engineer and Process Mining Admin roles.
Go to Administration > System Integration.
Add a System Integration of type "Data Ingest API" with auth type "Client Credentials".
This creates client credentials with a Client ID and Client Secret.
These credentials are used to obtain access tokens for calling the APIs.
Curl command:
curl --location --request POST 'https://mc.ariscloud.com/api/applications/login' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--header 'Accept: application/json' \
--data-urlencode 'clientId={Client Id}' \
--data-urlencode 'clientSecret={Client Secret}' \
--data-urlencode 'tenant={Tenant Name}'
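The login response contains an access token, which must be sent as a Bearer token on all subsequent Data Ingest API calls. Below is a minimal sketch of capturing it in a shell variable, assuming jq is installed and the token is returned in a JSON field named token (verify the field name against your tenant's actual response):
# Sketch: capture the access token for later calls.
# Assumption: the response JSON exposes the token in a field named "token".
ACCESS_TOKEN=$(curl --silent --location --request POST 'https://mc.ariscloud.com/api/applications/login' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--header 'Accept: application/json' \
--data-urlencode 'clientId={Client Id}' \
--data-urlencode 'clientSecret={Client Secret}' \
--data-urlencode 'tenant={Tenant Name}' | jq -r '.token')
echo "$ACCESS_TOKEN"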
3.1.2 Create a Process Mining Project and Data Collection
Go to Projects, create a project and an associated Data Collection, and create an Analysis in the project.
3.1.3 Add Data Ingest API License to Data Collection
Go to Data Collections and open the newly created Data Collection.
Go to Connections.
Add a new connection, give it a name, and select the System Integration to attach an available license. If you don't have a license, contact your admin to obtain Data Ingest licenses.
This step is critical: without this connection you will not have permission to upload data using the REST APIs.
3.1.4 Create Table in the Data Collection using REST API
The Ingest APIs don't work with tables created directly on the portal, so tables need to be created through the APIs.
Go to Source Tables.
Create a table using the REST APIs.
Curl command to create the table:
curl --location --request POST 'https://processmining.ariscloud.com/mining/api/pub/dataIngestion/v1/dataSets/testdata/sourceTables' \
--header 'Authorization: Bearer {Access Token}' \
--header 'Content-Type: application/json' \
--data-raw '[
  {
    "name": "parceldelivery_csv",
    "namespace": "default",
    "columns": [
      { "dataType": "STRING", "name": "Case_ID" },
      { "dataType": "STRING", "name": "Activity" },
      { "dataType": "FORMATTED_TIMESTAMP", "name": "Start", "format": "dd.MM.yyyy HH:mm" },
      { "dataType": "FORMATTED_TIMESTAMP", "name": "End", "format": "dd.MM.yyyy HH:mm" },
      { "dataType": "STRING", "name": "Product" },
      { "dataType": "STRING", "name": "Customer" },
      { "dataType": "STRING", "name": "Country" },
      { "dataType": "STRING", "name": "Delivery type" }
    ]
  }
]'
After executing the API, check the portal for the newly created table.
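The result can also be checked from the command line. This is only a sketch, assuming the sourceTables endpoint also supports GET for listing (check the Data Ingest API reference for your version):
# Assumption: GET on the same endpoint lists the source tables of the data set.
curl --location --request GET 'https://processmining.ariscloud.com/mining/api/pub/dataIngestion/v1/dataSets/testdata/sourceTables' \
--header 'Authorization: Bearer {Access Token}' \
--header 'Accept: application/json'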
3.2 webMethods.io Workflow to send data to Process Mining
Create a webMethods.io workflow and use the ARIS Process Mining connector.
Add an account for the ARIS Process Mining connector.
The ARIS Process Mining Data Ingest APIs need to be called in a particular order to ingest data. Implement the order as shown below.
Create a webhook to accept a JSON array as input.
The input should be the process data to be submitted to Process Mining.
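For reference, the webhook can be exercised with a payload whose fields match the source table created earlier. This is a minimal sketch; the webhook URL below is a placeholder for the one webMethods.io generates when the webhook trigger is created:
# Sketch: post one sample record to the workflow webhook.
# The URL is a placeholder; copy the real one from the webhook trigger.
curl --location --request POST 'https://{your-tenant}.webmethods.io/runflow/run/{webhook-id}' \
--header 'Content-Type: application/json' \
--data-raw '[
  {
    "Case_ID": "C-1001",
    "Activity": "Parcel registered",
    "Start": "01.03.2023 09:15",
    "End": "01.03.2023 09:20",
    "Product": "Express",
    "Customer": "ACME Corp",
    "Country": "DE",
    "Delivery type": "Courier"
  }
]'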
3.3 StreamSets Data Pipeline
StreamSets supports HTTP Client as a destination, and REST API calls can be implemented using it. It is highly configurable, so it is easy to implement almost any REST API call.
But the ARIS Process Mining Data Ingest API is a complex series of API calls. Implementing such a workflow is not a good use case for StreamSets, as StreamSets is meant for data processing, not for building functions and app integrations.
To complete the use case, call the webMethods.io workflow from StreamSets through the HTTP Client destination, using the REST endpoint created by the webhook.
Set Data Format to JSON array of objects.
Below is a simple data pipeline in StreamSets:
Data is sourced from Kafka using the Kafka Multitopic Consumer origin,
and the destination is HTTP Client, calling the webMethods.io REST API.
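To test the pipeline end to end, a sample audit event can be published to the topic the pipeline consumes. A sketch using Kafka's console producer, assuming a local broker and a topic named audit-events (both placeholders):
# Sketch: publish one sample audit event to Kafka.
# Assumptions: broker at localhost:9092 and a topic named audit-events.
echo '{"Case_ID":"C-1001","Activity":"Parcel registered","Start":"01.03.2023 09:15","End":"01.03.2023 09:20","Product":"Express","Customer":"ACME Corp","Country":"DE","Delivery type":"Courier"}' | \
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic audit-events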
4 Results
When the data is uploaded to ARIS Process Mining, it starts processing the data. The status can be seen on the overview page of the Data Collection.
4.1 StreamSets
4.2 webMethods.io Integration transactions
4.3 ARIS Process Mining Data Collection Overview
Displays the current status: "Processing Data" while the uploaded data is being processed; once complete, the status changes to "Data Loaded".
5 Next steps
Use a real-world business process from a customer project to implement this solution.