Data Harmonization is one of the main purposes of using an IoT Platform such as Cumulocity IoT. You want to decouple the application from the connected devices’ data model. This can be achieved using the powerful and very flexible Cumulocity IoT Domain Model.
When you start a project that integrates devices, the main question should be “How and where should I store the data in Cumulocity IoT?”.
This basic question is very often followed by other questions like:
- What is an efficient way to store data?
- What is an efficient way to retrieve data?
- How should I name my custom data?
This is the first article of a series that takes a deep dive into the core of the domain model within Cumulocity IoT and tries to give some guidelines and best practices to provide answers to the above questions.
In this article, I will describe why a good domain model is important and why you should invest time in designing the data model for your IoT solution.
A high-level introduction to the domain model can be found in the official documentation.
Important: Please make yourself familiar with that documentation before reading this article. It not only introduces the domain model itself but also explains the Cumulocity IoT terminology used in this article and in the follow-up articles.
Why is a (good) domain model important?
Every IoT solution has a domain model, whether it is defined explicitly or implicitly, because devices that provide data must interact with applications that consume this data. A domain model allows you to scale easily once you have more than one IoT use case. If your devices ingest data in a structured way, you can use the same data from different applications without changing the device implementation or building application-specific device implementations.
Therefore, it makes sense to take some time at the beginning of your IoT project to clearly define how you want the data to be represented in the database. This already helps with the implementation of the first IoT use case: once the domain model is defined, you can develop the device and the application in parallel, as both sides know how they need to store and access the data.
A special characteristic of Cumulocity IoT is that it doesn’t require you to define a data model upfront, for example in a model editor, before ingesting anything. The data model is implemented in the agents/microservices that ingest the data. That doesn’t mean you don’t need a data model, which is very often misunderstood.
Why is that? Because it enables full flexibility and accelerates development. When designed and implemented correctly, you can extend your data model easily and quickly without touching any other component. “Look here, now we even have the new sensor values coming in and displayed in the Dashboard …”
To sum it up: A well-defined domain model is the foundation of a good architecture for your IoT solution.
It will help you …
- to be more flexible in regard to data model changes,
- to easily scale your solution to more devices, applications, and customers,
- to be faster in the development process,
- to be more efficient in working with the data and
- to avoid costs caused by duplicated effort after breaking data model changes or by unnecessary data redundancy.
What could possibly go wrong?
In my experience since my first IoT project in 2012, wrong or missing data design is the main reason IoT projects fail or at least get stuck at some point. So what was the issue?
Here are some typical pitfalls you might want to avoid:
- Device data has been integrated with little or no data harmonization, often stored as is in the database.
- Data Model has been “over-engineered” and doesn’t fit well with the available data.
- Missing IoT data model standards force device manufacturers to define their own models, independent of any IoT solution. So for manufacturer 1, it is called “T” for “Temperature”, and for manufacturer 2 it is called “TSensorValue”. Sometimes you don’t have any description/metadata in the protocol or API, just a specification that there is a value on this register / API that indicates a “temperature”.
- APIs & UIs must be adapted each time a change was made in the data model leading to never-ending developer efforts & projects.
- Querying data was very inefficient due to a poorly designed data structure or missing query parameters.
- Correlation of data was sometimes impossible due to missing correlation IDs or relations between objects.
- Projects escalated quickly due to massive performance issues when ingesting or querying data stored in inefficient models.
- Data hasn’t been pre-filtered during integration. So even unnecessary or redundant data has been ingested, as it was not clear from the beginning how the data could be used later on.
Almost all the points above could be avoided if we had a harmonized domain model from the beginning. Cumulocity IoT with its domain model is just a starting point to avoid those pitfalls. A good data design on top of the Cumulocity IoT domain model is still necessary.
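To make the naming pitfall more tangible, here is a minimal Python sketch of harmonization at the integration layer. It is only an illustration: the mapping table, the function name `harmonize` and the returned structure are my own assumptions and not part of any Cumulocity IoT API. The vendor field names “T” and “TSensorValue” are the ones from the list above.

```python
from typing import Optional

# Hypothetical mapping from manufacturer-specific field names to one
# harmonized name. The entries are illustrative assumptions.
VENDOR_TO_HARMONIZED = {
    "T": "c8y_Temperature",             # manufacturer 1
    "TSensorValue": "c8y_Temperature",  # manufacturer 2
}


def harmonize(vendor_field: str, value: float) -> Optional[dict]:
    """Translate a vendor-specific reading into the harmonized model.

    Returns None for fields that are not part of the model, so that
    unnecessary data is filtered out before it is ingested.
    """
    name = VENDOR_TO_HARMONIZED.get(vendor_field)
    if name is None:
        return None
    return {"name": name, "value": value}
```

With such a layer in place, readings from both manufacturers end up under the same name, and fields that nobody asked for are dropped during integration instead of being stored as is.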
What is a good domain model?
A good domain model tries to abstract data from different kinds of devices and serves as a harmonized layer for Data Processing, Integration & Visualization.
For the Cumulocity IoT Domain Model, this means we leverage its core concepts and use them as designed.
An example: If we have a very simple device that regularly sends temperature sensor values, we design our domain model to have a managed object for the device in the inventory, with measurements (time series) of type “c8y_Temperature”. We would not store plain data in events, nor would we overload the managed object with all kinds of historical temperature time series data.
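As a rough sketch, such a measurement could be assembled like this in Python. The field layout (source, type, time, and a fragment containing a series with value and unit) reflects my understanding of the measurement structure in the Cumulocity IoT documentation, so please check it against the official reference. The function name, the series name “T”, the unit “C” and the device ID are assumptions chosen for illustration, and the sketch only builds the payload without sending it.

```python
from datetime import datetime, timezone


def build_temperature_measurement(device_id: str, celsius: float,
                                  when: datetime) -> dict:
    """Build a measurement payload that references the device's managed object."""
    return {
        # ID of the managed object that represents the device (hypothetical)
        "source": {"id": device_id},
        "type": "c8y_Temperature",
        "time": when.isoformat(),
        # Fragment -> series -> value and unit; "T" and "C" are assumed names
        "c8y_Temperature": {"T": {"value": celsius, "unit": "C"}},
    }


measurement = build_temperature_measurement(
    "12345", 21.5, datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
)
```

The historical values live in the measurements, while the managed object only represents the device itself and is referenced through the source ID.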
I know this example is quite simple and there are many more cases to cover, but those deserve an article of their own.
I hope you now understand why you should invest time in designing a good data model.
In the next article of this series, I will explain what a good domain model can look like, starting with what is defined in the Cumulocity IoT domain model and how to leverage it, illustrated with examples. In another article, I will cover naming conventions and best practices.