Data migration approach

Hi everyone, it’s Luca from DAC System.

We are facing the following scenario:

We have a Java agent running on our devices that sends data to Cumulocity IoT.
We are going to upgrade the agent, changing the data format as well.

As a result, upgraded and outdated agents will send data to the same tenant. In other words,
the tenant will receive the same data in two different formats at the same time.

Do we need to build a layer to normalize incoming data?
Is there a well-known practice for this?
Does Cumulocity IoT offer anything for this?

I’m quite lost, so any bit of information is welcome.

Thank you,
Luca

Hi Luca,

I understand that the new agent sends data in a different structure than the previous agent, but what is the actual problem? Are there any applications (web app, EPL apps, etc.) that rely on the old structure? What effects does the new data structure have on your use case? What exactly do you want to normalize? Do you want to map data coming from outdated agents to the new data structure?

Best regards
Christian

Hi Christian, thank you for your answer.

Yes, at the moment we have a web app based on the current data structure. In particular, we have built some custom widgets starting from a Cockpit application.

We are also offloading the data to a data lake.

Basically, the widgets are designed to work only with the current data format. As a simple example, let’s say we have a widget that displays a measurement over time. It expects the data to be in a field called “dac_measure”. Now we change the format by renaming the field to “dac_measure2”. The data coming from outdated agents (which still send “dac_measure”) is no longer displayed.

The offloading is also affected. As before, say we rename a field from “dac_measure” to “dac_measure2”. First, I need to manually change the offloading settings to add a new column (“dac_measure2”). Then the offloaded table becomes sparse, i.e. some rows have “dac_measure” populated and “dac_measure2” set to null, and vice versa. I don’t think that is the best setup to work with.

This is basically what I meant by “normalize data”: bring old-format data to the latest format without the agent being aware of the transformation.
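To make the sparse-table concern concrete, here is a minimal Python sketch. The sample rows and the helper `coalesce_measure` are hypothetical illustrations, not part of any Cumulocity IoT offloading feature; they only show that every consumer of the offloaded table would have to merge the two columns itself.

```python
from typing import Optional

# Hypothetical rows as they might look once both formats have been offloaded.
rows = [
    {"time": "2023-01-01T10:00:00Z", "dac_measure": 21.5, "dac_measure2": None},
    {"time": "2023-01-01T10:01:00Z", "dac_measure": None, "dac_measure2": 21.7},
]


def coalesce_measure(row: dict) -> Optional[float]:
    """Return the new-format value if present, otherwise the old-format one."""
    new_value = row.get("dac_measure2")
    return new_value if new_value is not None else row.get("dac_measure")


# Every reader of the table needs this extra step to get one usable series.
values = [coalesce_measure(row) for row in rows]
```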

Best regards,
Luca

Hi Luca,

Existing measurements in Cumulocity IoT can’t be modified or updated. If you want to map measurements from the old structure to the new data format, you have to duplicate the measurements and change the structure accordingly. This can be done using either a custom microservice or an Apama EPL App. Duplicating measurements has the drawback that additional data will be stored in Cumulocity, which could increase costs.

I would advise using Apama to implement this use case. Apama EPL Apps are very reliable and robust when it comes to listening to real-time notifications (in this case, measurements). In your Apama monitor you would subscribe to the measurements that still use the old data format. Once you receive such a measurement, you create a new measurement based on the old measurement and its source (device), but mapped to the new structure. This new measurement is then stored in Cumulocity.

A simple Apama monitor to illustrate what I have just described:

using com.apama.cumulocity.Measurement;
using com.apama.cumulocity.MeasurementValue;

/** Miscellaneous utilities */
using com.apama.cumulocity.Util;
using com.apama.util.AnyExtractor;

monitor MySampleMonitor {
	/** Initialize the application */
	action onload() {
		// Receive measurements as they arrive in the tenant
		monitor.subscribe(Measurement.SUBSCRIBE_CHANNEL);

		// React only to measurements that still use the old type
		on all Measurement(type="c8y_Temperature") as temperatureMeasurement {
			Measurement measurement := new Measurement;
			measurement.source := temperatureMeasurement.source;
			measurement.time := currentTime;
			measurement.type := "xyz_TemperatureMeasurement";
			// Copy the old c8y_Temperature.T value into the new xyz_Temperature.T fragment
			measurement.measurements.getOrAddDefault("xyz_Temperature").getOrAddDefault("T").value := temperatureMeasurement.measurements.getOrAddDefault("c8y_Temperature").getOrAddDefault("T").value;

			// Store the new measurement in Cumulocity
			send measurement to Measurement.SEND_CHANNEL;
		}
	}
}

The monitor listens to measurements of type c8y_Temperature. It maps the value of the c8y_Temperature.T fragment to the xyz_Temperature.T fragment and creates a new measurement for the same device.

Additional information about Apama and Apama EPL Apps can be found here.

The advantage of the approach described above is that you don’t need to touch any of the other components, such as the custom web widgets or the offloading job. The disadvantage is the increased amount of data stored in the respective Cumulocity tenant.

Hope this helps.

Best regards
Christian

Hello Christian,
thank you for the interesting solution.

First, a question: would this solution work with alarms, events and inventory as well?

Then a comment: In my understanding this would map only new incoming data. Please correct me if I’m wrong.

Best regards,
Luca

Hi Luca,

Would this solution work with alarms, events and inventory as well?

Yes, you can also subscribe to new alarms, events and Managed Object updates in Apama EPL Apps. Unlike measurements, these domain objects can be updated in place, so you can modify some of their information or fragments instead of creating them again.

In my understanding this would map only new incoming data. Please correct me if I’m wrong.

Exactly. If you want to map already existing measurements, I suggest writing a script that takes care of this, for example a Python script that loads all measurements of a specific type for a given timeframe. You can then apply logic similar to that described above to create the new measurements with the new structure. In this case you could also copy the timestamp of the old measurement to the new measurement to stay consistent. There is a Python client that you could use for this scenario.
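A minimal sketch of such a backfill in plain Python, reusing the fragment names from the Apama example. It deliberately leaves out the network calls: `fetch_page` and `create` are hypothetical callables that you would implement yourself, for instance with the Python client or the REST API, and the dictionaries are assumed to follow the usual measurement JSON layout (`source`, `time`, `type`, plus one fragment).

```python
from typing import Callable, Iterator, List

OLD_FRAGMENT = "c8y_Temperature"
NEW_FRAGMENT = "xyz_Temperature"
NEW_TYPE = "xyz_TemperatureMeasurement"


def to_new_format(old: dict) -> dict:
    """Build a new-format measurement from an old one, keeping source and time."""
    return {
        "source": {"id": old["source"]["id"]},
        "time": old["time"],  # copy the original timestamp to stay consistent
        "type": NEW_TYPE,
        NEW_FRAGMENT: {
            series: dict(value) for series, value in old[OLD_FRAGMENT].items()
        },
    }


def iter_pages(fetch_page: Callable[[int], List[dict]]) -> Iterator[dict]:
    """Yield measurements page by page until an empty page is returned."""
    page = 1
    while True:
        batch = fetch_page(page)
        if not batch:
            return
        yield from batch
        page += 1


def backfill(fetch_page: Callable[[int], List[dict]],
             create: Callable[[dict], None]) -> int:
    """Create a new-format copy of every old-format measurement; return the count."""
    count = 0
    for old in iter_pages(fetch_page):
        if OLD_FRAGMENT in old:
            create(to_new_format(old))
            count += 1
    return count


# Dry run with in-memory stand-ins for the tenant (hypothetical sample data).
pages = {
    1: [{
        "source": {"id": "12345"},
        "time": "2023-01-01T10:00:00Z",
        "type": "c8y_Temperature",
        "c8y_Temperature": {"T": {"value": 21.5, "unit": "C"}},
    }],
}
created: List[dict] = []
n = backfill(lambda page: pages.get(page, []), created.append)
```

Running the mapping against in-memory data first, as in the last lines, is a cheap way to check the new structure before any duplicate measurements are written to the tenant.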

Best regards
Christian