The Cumulocity IoT Domain Model - Application data & Query optimization

Stefan_Witschel · June 29, 2023, 9:03am

Introduction

In my last article about the Cumulocity IoT Domain Model, I described how to design a good model and shared some best practices for storing device metadata and transactional device data.

In this article, I’ll focus on additional aspects of the domain model: storing application configuration, using audit data, and why you should always keep in mind how data will be retrieved and queried when you design a domain model.

Let’s get started with the application data!

Author remark: The guidance provided in this article is based on the project experience of @Tobias_Sommer and @Stefan_Witschel (me). Kudos go to Tobias, who collected and shared this with me.

How to store application data

When you start developing applications on top of Cumulocity IoT, you will reach the point where you need to persist certain information from your application. This can be information that exists only once for your application (e.g. general configuration), sensitive data (e.g. access credentials for 3rd party systems), or user-specific metadata that the application needs to store per user.

There are different options for storing such information, each with its own advantages and disadvantages, so let us go through them:

Storing the data in a managed object in the inventory
Using tenant options, which are a simple key-value store
Using the tenant object

1. Storing application data in inventory

The inventory of a tenant is the most generic collection, and you can store any JSON structure in it. It is particularly well suited to more complex metadata, as a single JSON object can hold a lot of information. The Cockpit application, for example, stores the configuration and layout of a dashboard in the inventory.

Be aware, however, that in most tenants there will be users who can access the whole inventory, at least via the API. So if you want to store sensitive data, you might want to encrypt the value and save the encrypted string in the JSON object. Also keep in mind that many other things, such as assets and devices, are stored in the inventory, so you need a fast and easy way to find your configuration again, e.g. via the identity API with a unique identifier, or via a common type.

The Dynamic MQTT Mapper, for example, uses two kinds of managed objects.

The first stores application information about the MQTT Broker connection status and the MQTT Broker SYS stats. Here an external ID with the key MQTT_MAPPING_SERVICE is registered in the identity API. When the microservice starts up, it first attempts to retrieve the existing object. If this retrieval fails because the object doesn’t exist, the microservice creates a new object.

The second kind holds the persisted mappings. Here we use the common type c8y_mqttMapping for all mappings configured in that tenant, so we can retrieve them very easily by filtering on that type. One such mapping looks like this:

{
	"owner": "me",
	"creationTime": "2022-11-24T08:06:33.697Z",
	"type": "c8y_mqttMapping",
	"lastUpdated": "2022-11-30T14:07:30.299Z",
	"id": "102364668",
	"c8y_mqttMapping": {
		"snoopStatus": "NONE",
		"extension": null,
		"templateTopicSample": "device/express/berlin_01",
		"ident": "d90f2961-3fad-4c49-9b41-2778bd440ed0",
		"tested": false,
		"mapDeviceIdentifier": true,
		"active": true,
		"targetAPI": "INVENTORY",
		"source": "{\"line\":\"Bus-Berlin-Rom\",\"operator\":\"EuroBus\",\"customFragment\":{\"customFragmentValue\":\"Express\"},\"capacity\":64,\"customArray\":[\"ArrayValue1\",\"ArrayValue2\"],\"customType\":\"type_International\"}",
		"target": "{\"c8y_IsDevice\":{},\"name\":\"Vibration Sensor\",\"capacity\":100,\"type\":\"maker_Vibration_Sensor\"}",
		"externalIdType": "c8y_Serial",
		"templateTopic": "device/express/+",
		"qos": "AT_LEAST_ONCE",
		"substitutions": [
			{
				"pathSource": "_TOPIC_LEVEL_[2]",
				"pathTarget": "_DEVICE_IDENT_",
				"repairStrategy": "DEFAULT",
				"expandArray": false
			},
			{
				"pathSource": "line",
				"pathTarget": "name",
				"repairStrategy": "DEFAULT",
				"expandArray": false
			},
			{
				"pathSource": "customType",
				"pathTarget": "type",
				"repairStrategy": "DEFAULT",
				"expandArray": false
			},
			{
				"pathSource": "capacity",
				"pathTarget": "capacity",
				"repairStrategy": "DEFAULT",
				"expandArray": false
			}
		],
		"updateExistingDevice": true,
		"mappingType": "JSON",
		"lastUpdate": 1669817250289,
		"name": "Device Mapping",
		"snoopedTemplates": [],
		"createNonExistingDevice": false,
		"id": "102364668",
		"subscriptionTopic": "device/#"
	}
}
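
The two lookup strategies described above, plus the retrieve-or-create step on startup, can be sketched in a few lines of Python. This is only an illustration and not code from the Dynamic MQTT Mapper: the tenant URL and the external ID type c8y_Serial are assumptions of mine, and the two callables passed to get_or_create stand in for whatever HTTP client you use.

```python
from urllib.parse import quote, urlencode


def external_id_url(base_url: str, id_type: str, external_id: str) -> str:
    """URL that resolves an external ID to its managed object (identity API)."""
    return (
        f"{base_url}/identity/externalIds/"
        f"{quote(id_type, safe='')}/{quote(external_id, safe='')}"
    )


def objects_by_type_url(base_url: str, object_type: str, page_size: int = 100) -> str:
    """URL that lists all managed objects sharing one common type."""
    params = urlencode({"type": object_type, "pageSize": page_size})
    return f"{base_url}/inventory/managedObjects?{params}"


def get_or_create(lookup, create):
    """Return the existing object, or create one if the lookup finds nothing.

    lookup: callable returning the object, or None when it does not exist
            (for example after an HTTP 404 from the identity API).
    create: callable that creates and returns a new object.
    """
    existing = lookup()
    return existing if existing is not None else create()


# Hypothetical tenant URL and external ID type, for illustration only.
BASE_URL = "https://tenant.example.com"
status_url = external_id_url(BASE_URL, "c8y_Serial", "MQTT_MAPPING_SERVICE")
mappings_url = objects_by_type_url(BASE_URL, "c8y_mqttMapping")
```

Keeping the lookup behind a small helper like get_or_create means the startup code does not need to care whether the object already existed.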

2. Storing application data in tenant options

Tenant options are a simple key-value store available on each tenant. They give you very simple and fast access to a dedicated store for metadata that belongs to the tenant. Besides the key and value, a tenant option also has a category. If you set a common application-specific category, you can retrieve all tenant options for your application in a single query.

One of the most useful features of tenant options is built-in encryption, so you can store sensitive data like passwords without having to implement your own encryption. Only service users (like those generated for microservices) can read these values unencrypted; even an admin user with access to the tenant options only gets the encrypted values. To use this feature, start your key with the prefix credentials, e.g. credentials.password.

The Dynamic MQTT Mapper again serves as an example: it uses tenant options in a separate category to store the credentials for connecting to the MQTT Broker:

GET {{url}}/tenant/options/mqtt.dynamic.service

{
    "credentials.connection.configuration": "<<Encrypted>>",
    "service.configuration": "{\"logPayload\":true,\"logSubstitution\":true}"
}
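
To make the key convention concrete, here is a minimal Python sketch that builds the request bodies for two such options, assuming they are created one by one via POST on /tenant/options. The helper names and the broker settings are made up for illustration; only the category and the two keys are taken from the response above.

```python
import json

SECRET_PREFIX = "credentials."


def tenant_option(category: str, key: str, value: str, secret: bool = False) -> dict:
    """Build the JSON body for creating one tenant option.

    With secret=True the key gets the "credentials." prefix, which is what
    asks the platform to store the value encrypted.
    """
    if secret and not key.startswith(SECRET_PREFIX):
        key = SECRET_PREFIX + key
    return {"category": category, "key": key, "value": value}


def category_url(base_url: str, category: str) -> str:
    """URL that returns all options of one category in a single request."""
    return f"{base_url}/tenant/options/{category}"


CATEGORY = "mqtt.dynamic.service"

# The connection settings below are made-up placeholder values.
connection = tenant_option(
    CATEGORY,
    "connection.configuration",
    json.dumps({"host": "broker.example.com", "port": 8883}),
    secret=True,
)
service = tenant_option(
    CATEGORY,
    "service.configuration",
    json.dumps({"logPayload": True, "logSubstitution": True}),
)
```

Note that the value of a tenant option is a string, which is why structured configuration is serialized with json.dumps before it is stored.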

3. Storing application data in tenant object

The tenant object is a somewhat special location for storing metadata. It can only be written by the parent tenant; the tenant itself can only read it. This makes it the perfect place for billing-relevant information, e.g. how many device licenses the tenant owns. Such information shouldn’t be changeable by the tenant but might be needed for certain application functionality.
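
As a rough sketch of how an application might use such a value, assume the parent tenant writes it into the customProperties fragment of the tenant object (e.g. with a PUT on /tenant/tenants/{tenantId}) and the tenant reads it back from /tenant/currentTenant. The property name deviceLicenses and all function names are hypothetical.

```python
def custom_properties_body(properties: dict) -> dict:
    """Body a parent tenant could send to update a subtenant's custom properties."""
    return {"customProperties": dict(properties)}


def device_license_limit(current_tenant: dict, default: int = 0) -> int:
    """Read the license limit from the tenant object as seen by the tenant itself."""
    props = current_tenant.get("customProperties") or {}
    return int(props.get("deviceLicenses", default))


def can_register_device(current_tenant: dict, registered_devices: int) -> bool:
    """True while the tenant is still below its licensed device count."""
    return registered_devices < device_license_limit(current_tenant)
```

Because the tenant cannot modify the object, a check like can_register_device can rely on the value without additional tamper protection in the application.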

Using audit records

Audit records are another part of the Cumulocity IoT Domain Model. They are mostly written by standard components, but custom components can also add audit records, mainly to track important changes.
Audit records are similar to MEAs (measurements, events, and alarms; see my other article) and should be cleaned up by retention rules.

An audit record should contain:

a string activity, which is basically the title of the audit record
a string text, which is the description of the audit record
a timestamp as time, ideally in UTC
a managed object as source
a type string as type
an optional user
optionally, one or more custom properties
an optional changes array that contains one or more changes, each with:
- newValue - the new value of a property
- previousValue - the previous value of a property, or null when empty
- attribute - the property that has been changed
- type - the type of the attribute

Example record:

{
	"activity": "User updated",
	"creationTime": "2023-06-28T15:34:09.682Z",
	"changes": [
		{
			"newValue": "sdfsdf",
			"attribute": "firstname",
			"type": "java.lang.String",
			"previousValue": null
		}
	],
	"source": {
		"self": "https://xxx.eu-latest.cumulocity.com/inventory/managedObjects/buguser",
		"id": "buguser"
	},
	"type": "User",
	"application": "administration",
	"self": "https://xxx.eu-latest.cumulocity.com/audit/auditRecords/109942970",
	"time": "2023-06-28T15:34:09.682Z",
	"id": "109942970",
	"text": "User buguser updated: firstname='sdfsdf'",
	"user": "admin"
}
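
If a custom component needs to write such a record itself, the body could be assembled roughly like this. The sketch is an assumption on my side, not platform code: it only builds the JSON for a POST on /audit/auditRecords, using the field names from the list and the example above, and the function names are hypothetical.

```python
from datetime import datetime, timezone


def change(attribute, new_value, previous_value=None, value_type="java.lang.String"):
    """One entry of the optional changes array."""
    return {
        "attribute": attribute,
        "newValue": new_value,
        "previousValue": previous_value,
        "type": value_type,
    }


def audit_record(activity, text, source_id, record_type, when, user=None, changes=None):
    """Build an audit record body; `when` must be a timezone-aware datetime."""
    if when.tzinfo is None:
        raise ValueError("when must be timezone-aware")
    utc = when.astimezone(timezone.utc)
    record = {
        "activity": activity,
        "text": text,
        # ISO 8601 in UTC with millisecond precision, e.g. 2023-06-28T15:34:09.682Z
        "time": utc.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
        "source": {"id": source_id},
        "type": record_type,
    }
    if user is not None:
        record["user"] = user
    if changes:
        record["changes"] = list(changes)
    return record


record = audit_record(
    activity="User updated",
    text="User buguser updated: firstname='sdfsdf'",
    source_id="buguser",
    record_type="User",
    when=datetime(2023, 6, 28, 15, 34, 9, 682000, tzinfo=timezone.utc),
    user="admin",
    changes=[change("firstname", "sdfsdf")],
)
```

Converting the timestamp to UTC in one place avoids mixing local times from different components in the audit log.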

As the name suggests, audit records should mainly be used to track changes made in the system by users or applications, e.g. permissions that have been changed or operations that have been triggered by a user. To track the status of devices, please use events.

Make clever use of the query language

The query language that is available to filter on the inventory API is a great tool both for UIs and applications. It is, however, by design not necessarily fast, as it allows you to search on every parameter and, of course, not every parameter is indexed in the database. A query either uses an index (IXSCAN) or uses none and scans the whole collection (COLLSCAN). As the inventory grows over time (more devices, more assets, etc.), collection scan queries become slower and slower, and eventually the slowdown may become noticeable.

A very simple trick to mitigate that is to include indexed parameters in the query. These parameters are easy to identify because direct query parameters exist for them (e.g. type on the inventory). For example, if you have a list of shipping containers and want to filter this list on any of their parameters, ensure that every shipping container object has the same type value and include this parameter in the query. The query will then automatically utilize the existing type index and quickly ignore all objects with other types in the database.
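
One way to make this a habit is a small helper that always puts the type condition into the filter before any other condition. The following Python sketch assumes the inventory query language is passed via the query parameter; the type and fragment names for the shipping containers are hypothetical.

```python
from urllib.parse import quote


def typed_filter(object_type: str, conditions=()) -> str:
    """Query-language expression that always leads with the indexed type."""
    parts = [f"type eq '{object_type}'", *conditions]
    return "$filter=(" + " and ".join(parts) + ")"


def inventory_query_url(base_url: str, object_type: str, conditions=()) -> str:
    """Inventory URL carrying the percent-encoded query expression."""
    expression = typed_filter(object_type, conditions)
    return f"{base_url}/inventory/managedObjects?query={quote(expression, safe='')}"


# Hypothetical type and fragment names for the shipping container example.
url = inventory_query_url(
    "https://tenant.example.com",
    "acme_ShippingContainer",
    ["acme_Cargo.weight gt 1000", "has(acme_Refrigerated)"],
)
```

The helper does not escape quotes inside the type name, so it is only meant for type values you control yourself.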

This is mostly relevant for the inventory API and the query language, but you should be equally careful when building requests to retrieve data from the other APIs.

Here are some details about the parameters that are indexed (tested with 10.16 on eu-latest).

Please note: The indexes in use can be configured at instance level. In the future we also want to handle this more dynamically, so your instance might have more (or fewer) indexed parameters.

API	Parameter
Inventory	type
Inventory	text
Inventory	name
Inventory	childAdditionId
Inventory	childAssetId
Inventory	fragmentType
Inventory	childDeviceId
Inventory	ids
Events	source
Events	type
Events	dateFrom
Events	dateTo
Events	createdFrom
Events	createdTo
Alarms	source
Alarms	dateFrom
Alarms	dateTo
Alarms	createdFrom
Alarms	createdTo
Measurements	source
Measurements	dateFrom
Measurements	dateTo
Measurements	valueFragmentType
Measurements	valueFragmentSeries
Operations	deviceId
Operations	dateFrom
Operations	dateTo
Operations	status
Operations	agentId

Summary

In this article, I explained how application data can be stored in Cumulocity IoT, how to make good use of audit records, and how to optimize your queries.
You should now understand all the concepts of a “good” domain model and be able to avoid major pitfalls in your data design!
