Understanding Analytics Builder triggering behavior

Introduction

Analytics Builder in Cumulocity IoT Streaming Analytics is a great way to quickly implement logic that reacts to device behavior. It can be used to analyze and process device data, detect positive and negative events, or escalate issues if they persist.

While, due to the visual nature of Analytics Builder, it is easy to build models, their behavior can be confounding. A model may

  • never produce any results even though data is incoming

  • produce a result exactly once and then stop without any error message

  • produce a result fewer times than anticipated

All of these can typically be attributed to an erroneous or incomplete understanding of the triggering behavior of Analytics Builder models and their blocks.

While all of this is described in great detail in the Analytics Builder documentation, these problems are so common that we decided to explain the issues together with concrete examples and solutions in a concise manner in this article.

This article covers three major sources of unexpected behavior of Analytics Builder models in the following sections:

  • The different types of input blocks and how they control the triggering of models

  • Implicit conversion between the outputs and inputs of blocks

  • The way Analytics Builder works with timestamps of incoming data

Input blocks

Let us start with a look at Input blocks. These trigger the execution of an Analytics Builder model, and the product supports five different kinds of Input blocks for the major kinds of master and transactional data in Cumulocity IoT:

  • Managed objects - triggered when the selected managed object changes

  • Measurements - triggered when a new measurement is created

  • Events - triggered when a new event is created or an existing event is updated

  • Alarms - triggered when a new alarm is created or an existing alarm is updated

  • Operations - triggered when a new operation is created or an existing operation is updated

Right off, we can see that there are three classes of behavior. Events, alarms, and operations together make up the first class: their Input blocks can be triggered if a new object is created, an existing one is updated, or in both cases. Measurements make up the second class: as measurements cannot be updated, the Input block is only triggered for new measurements. Managed objects make up the last class: an Analytics Builder model can only react to changes in an existing Managed object, which is typically the device for which this Analytics Builder model runs. It cannot react to the creation of new Managed objects, as this would interfere with the execution semantics of Analytics Builder models, e.g. the detection of cycles between models.

Input block for Managed objects

The Input block for Managed objects differs from all other Input blocks in two further important aspects:

  • Capture start value - if this box is checked, the Input block triggers once upon activation of the model with the current Managed object. This can be useful if the Managed Object contains configuration parameters (like thresholds) that configure other blocks of the model.
  • Property name - allows specifying the name of a top-level property (nested properties are not allowed). If this is used, the Input block will only trigger in case the value of that property changes. All other changes to the Managed Object will be ignored.

This second aspect brings us to the next topic: What actually is the output of an Input block? The answer depends on the type of object selected and, in the case of the Managed object Input block, on its configuration.

In general, a wire connecting the output of an Input block to another block can transport a value plus a structure of properties. The value can either be of one of the value types boolean, float or string or a pulse. A pulse is different from all the other types in that it does not transport a value but is a signal of a point in time indicating that something happened. The value types boolean, float and string are expected to remain unchanged until they are explicitly changed. Internally, a pulse is implemented as a boolean that is “true” for the point in time the pulse happened and then switches back to “false” immediately without further input. Pulses are typically used to trigger behavior in models e.g. by triggering an Output block of a model to produce a measurement, event, or alarm or by triggering another Block to do something (like triggering the “Sample” input of a block). The documentation of the Pulse type contains further details as well as typical examples of its usage.

No Property

So let us first look at Managed objects to see how values and properties are being sent from Input blocks. We are going to use the same format as shown in the picture above for all examples. On the left-hand side you can see how the Input block is configured. In this case, we are using a Managed object as an input and we have not provided any property name for Analytics Builder to check for updates. On the right-hand side we can see the value and the properties the Input block produces. If a Managed Object Input block is not configured for a specific property, it will always produce a pulse and put all fragments from the Managed Object into the properties.

Notice that we are showing the properties in JSON format. Actually, Analytics Builder uses the Apama EPL dictionary type, but as the output of the whole Analytics Builder model will be in JSON format if it is stored in Cumulocity IoT, this is a valid simplification.

String Property

If we select a string property of the Managed object as the property to detect changes on, the value output of the Input block will no longer be a pulse but the value of that property (in this case the name of the Managed object). Note that in this case, the Input block will only produce an output if the name changes and will ignore any other changes to the Managed object. Still, all fragments of the Managed object are included in the properties of the block output.

Float Property

If we select a float property (or a Boolean property, not shown) the behavior will be the same. In this case, we selected the lastUpdated property of the Managed object. The output value of the block will be a float containing that property value.

Complex Object

If you select a property that is itself a complex object (like the address property), the output value will be the result of calling “toString()” on the EPL object.

Non top-level property

The property name selected to detect updates on has to be a top-level property. It is therefore not possible to detect changes only on individual properties of complex objects, as attempted here for the street property of the address fragment. In this case, the Input block will never produce any output.
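The Managed object cases described in this section can be summarised in a small Python sketch. It is only an illustrative model of the behavior described in this article, not Analytics Builder code: the function name `managed_object_output` and the `PULSE` marker are hypothetical, and Python's `str()` merely stands in for the EPL “toString()” call.

```python
from typing import Optional

PULSE = "<pulse>"  # hypothetical marker standing in for a pulse output


def managed_object_output(old: dict, new: dict, property_name: Optional[str] = None):
    """Return (value, properties) for an update, or None if the block stays silent."""
    if property_name is None:
        # No property configured: any change produces a pulse.
        return (PULSE, new) if old != new else None
    if property_name not in new:
        # A nested name such as "address.street" is not a top-level key,
        # so the block never produces output.
        return None
    if old.get(property_name) == new[property_name]:
        # Changes to other properties are ignored.
        return None
    value = new[property_name]
    if isinstance(value, dict):
        # Complex objects are passed on as their string representation.
        value = str(value)
    return (value, new)
```

In every case where the block does produce output, the full Managed object is passed along as the properties, which matches the examples above.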

Input blocks for Events, Alarms, and Operations

Having covered all the options for Managed objects, let us now look at events, alarms, and operations together as they behave alike:

We configured the Input block to receive location update events of the device. For every received event, the Input block produces a pulse as its output value and all fragments from the event as properties. Alarms and operations behave the same.

Input block for Measurements

The last case we need to look at is measurements. On a Measurement Input block you do not select the type of measurement but the fragment and series of a concrete time-series. A measurement can contain multiple time-series. Even different measurements of the same type can contain different time-series (not a good idea in most cases as it can be confusing). The Input block will receive the configured fragment and series including any properties defined on the series level prefixed with “series_”. Custom top-level properties of the measurement are included with a prefix “measurement_”. The prefixes are used to avoid clashes between identically named properties on different levels. The standard top-level properties id, source, time, and type are included without prefix.

As you can see in the picture above, this also affects the output of the Measurement Input block. It contains the standard fields and the data from the selected fragment and series but nothing else. The value will always be a float.
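The prefixing rules can be illustrated with a short Python sketch. It is a simplified model built only from the description above, with several assumptions: the function name `measurement_output` is hypothetical, the series-level `value` is taken as the output value while every other series-level key gets the “series_” prefix, and dict-valued top-level keys are assumed to be other fragments and are left out. The product may treat individual keys differently.

```python
STANDARD_KEYS = ("id", "source", "time", "type")


def measurement_output(measurement: dict, fragment: str, series: str):
    """Return (value, properties) for the configured fragment and series."""
    if fragment not in measurement or series not in measurement[fragment]:
        return None  # this measurement does not carry the configured time-series
    series_data = measurement[fragment][series]
    # Standard top-level properties are copied without a prefix.
    properties = {k: measurement[k] for k in STANDARD_KEYS if k in measurement}
    # Series-level properties (everything except the value itself) get "series_".
    for key, val in series_data.items():
        if key != "value":
            properties["series_" + key] = val
    # Custom top-level properties get "measurement_".
    for key, val in measurement.items():
        if key in STANDARD_KEYS or key == fragment or isinstance(val, dict):
            continue  # assumption: dict-valued keys are other fragments
        properties["measurement_" + key] = val
    return float(series_data["value"]), properties
```

The two prefixes keep a series-level property and a top-level property with the same name apart in the flat properties structure.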

We can now wrap up the discussion on Input blocks and their behavior. As we have seen, all of them produce the fragments of the object they receive as properties on the output, but the blocks differ in what kind of output they produce: pulse, float, boolean, or string. As we will see, this is another source of issues with triggering behavior when the different types of outputs are implicitly converted into each other on the inputs of the next blocks in the model.

Implicit conversion

Let us now come to the next topic affecting the triggering behavior of models: implicit type conversions. We have observed problems resulting from these conversions as their impact is often not immediately apparent when building a model. So we will show some common problems and explain how they can be solved.

Inputs and outputs of blocks in Analytics Builder are typed using one of the following types: pulse, boolean, float, string, and any. The type “any” is special as it allows all the other types. A block with an “any” input needs to be implemented such that it can handle all possible types. If the output of a block is connected to the input of a different type that is not “any”, Analytics Builder will perform an implicit type conversion if possible. In the following cases, a type conversion is not possible and thus a connection is not allowed:

  • pulse → float - the pulse does not contain a number to initialize the float from

  • pulse → string - the pulse does not contain a string to initialize the string from

  • string → float - if the string contains a number, this could work but as this is not guaranteed, this type of connection is forbidden. If the string is a number, you should do an explicit conversion using the “From Base N” or the Expression block.
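The connection rules above can be written down as a small lookup. This is a sketch based solely on the three forbidden cases listed in this article; the function name `can_connect` is hypothetical and the real product may apply further checks.

```python
# The three conversions the article lists as not possible.
FORBIDDEN = {("pulse", "float"), ("pulse", "string"), ("string", "float")}


def can_connect(output_type: str, input_type: str) -> bool:
    """Return True if an output of output_type may be wired to an input of input_type."""
    if input_type == "any" or output_type == input_type:
        return True  # "any" accepts everything; identical types need no conversion
    return (output_type, input_type) not in FORBIDDEN


print(can_connect("pulse", "string"))  # False
print(can_connect("float", "pulse"))   # True
```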

If you use an illegal connection in your model, you will receive an error on activation, like in this case when we try to connect a pulse output to a string input:

The implicit conversions are great for keeping models concise but can be a source of confusion if the conversion logic is not understood correctly. The most common problems are related to the conversion from another type into a pulse.

Implicit conversion - float & string to pulse

When you convert the float or string output of a block into a pulse input, the pulse only gets activated if the value of the float or string changes. Let us look at a simple, somewhat artificial model to see how this can lead to unexpected behavior:

This model reads temperature measurements and creates an event with a text explaining the current value for each measurement. The output looks like this in Cumulocity IoT - Data Explorer:

The graph shows the temperature curve, and below it you can see a time axis with rhombi indicating when the events were produced. As measurements were produced every 5 seconds, a corresponding event is created with the same interval, except for the period when the temperature reading stayed at 15°C for 3 consecutive readings. An event was created only for the first of these readings. The next event is created when the next temperature reading is below 15°C.
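The effect can be reproduced with a few lines of Python. This is only a sketch of the conversion rule described here (a pulse fires when the value differs from the previous one); the function name `implicit_pulses` and the sample readings are made up, and treating the very first value as a change is an assumption.

```python
def implicit_pulses(values):
    """Yield True for each value that would fire an implicit float/string-to-pulse conversion."""
    previous = object()  # sentinel so that the first value counts as a change
    for value in values:
        yield value != previous
        previous = value


readings = [14.0, 15.0, 15.0, 15.0, 14.5]
print(list(implicit_pulses(readings)))  # [True, True, False, False, True]
```

The second and third reading of 15.0 do not fire, which corresponds to the missing events in the graph.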

This behavior is actually desirable in a lot of cases but can lead to issues in other cases, like:

  • You only want to use the measurement as a trigger but are not interested in the value. You use the Constant Value block to provide the actual value you are interested in. For example, the measurement value might be the quality of a piece being produced. You use the Constant Value block triggered by the quality measurement to provide a “1” to your model to be used as a piece count. As the Constant Value block has a pulse input, it will only be triggered if the input measurement has different values. If subsequent quality measurements have the value 100%, the model will only receive a single “1”.
  • Your machine has a manual override button for operators to change parameters on the machine. You have a model that uses the Discrete Statistics block to calculate aggregations. As a rule, every time an operator uses the override button, the current aggregations should be written to measurements. This is true, even if the user did not actually change a parameter. For this purpose, you connect the override measurement with the sample input of the Discrete Statistics block. Contrary to your expectations, the aggregation measurements will only be produced if the override parameters change.

Changing this behavior is actually pretty simple using the Pulse block. It makes the type conversion explicit and lets you change the behavior of the conversion:

In our case, we would change the mode to “On every input”, thus achieving the desired functionality. You might wonder why the Pulse block supports “On value change” at all, as this is the default behavior of the implicit conversion. The reason is that other blocks can behave differently depending on whether they receive a pulse or another typed input. For example, the logic blocks (And, Or) produce a pulse if any of their inputs is a pulse and a Boolean otherwise.

Implicit conversion - Boolean to pulse

When converting Boolean outputs into pulse inputs, the behavior is slightly different from the behavior for float and string outputs. Again, a pulse is created only on value changes but for Boolean outputs only in case the value changes to “true”. So if you have a sequence of outputs like this: “false”, “true”, “true”, “false”, “false”, “true”, only two pulses will be created on the connected input: one for the “true” in second position and one for the “true” at the end of the sequence.
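The sequence from this paragraph can be checked with a short Python sketch of the rule (a pulse only when the value changes to “true”). The function name `boolean_to_pulses` is hypothetical, and starting from an initial state of “false” is an assumption.

```python
def boolean_to_pulses(values, initial=False):
    """Return a list with True wherever the Boolean input changes from false to true."""
    previous = initial
    pulses = []
    for value in values:
        pulses.append(value and not previous)
        previous = value
    return pulses


sequence = [False, True, True, False, False, True]
pulses = boolean_to_pulses(sequence)
print(pulses)       # [False, True, False, False, False, True]
print(sum(pulses))  # 2
```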

The typical situations where you encounter unexpected triggering behavior are also different for Boolean outputs. While it is possible that you receive a Boolean property as a part of a changed managed object or an event, it is much more likely that you are using a Boolean output from one of the built-in blocks and expect it to behave like a pulse.

A typical example can result from the usage of the Expression block. Say you want to filter a time-series to only contain positive values. Your model and its output might look like this:

This model does not provide the expected output. Instead of creating a new time-series with all positive values, it only contains the first positive value after the series has crossed 0. This is the point in time when the output of the Expression block switches from “false” to “true”. The next output will only be produced once the series has a negative value followed by a positive value. The Pulse block can again be used to solve this issue, but the mode must be switched to “On non-zero values” so that a pulse is created for every positive number:
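The difference between the two variants of the filter model can be sketched in Python. This is a simplified simulation under the rules described in this article, not Analytics Builder code; the function names and the sample series are made up for illustration.

```python
def filter_with_implicit_conversion(series):
    """Values passed on when the Boolean 'x > 0' is implicitly converted to a pulse."""
    passed = []
    previous = False
    for x in series:
        is_positive = x > 0  # output of the Expression block
        if is_positive and not previous:  # pulse only on false -> true
            passed.append(x)
        previous = is_positive
    return passed


def filter_with_non_zero_pulse(series):
    """Values passed on when a Pulse block in 'On non-zero values' mode is used."""
    return [x for x in series if x > 0]  # pulse for every 'true'


series = [-2.0, -1.0, 1.0, 2.0, 3.0, -1.0, 4.0]
print(filter_with_implicit_conversion(series))  # [1.0, 4.0]
print(filter_with_non_zero_pulse(series))       # [1.0, 2.0, 3.0, 4.0]
```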

Similar issues can arise from not fully understanding outputs of blocks like Range, Threshold, and Geofence:

All three blocks have a mixture of Boolean and pulse outputs:

  • Range: Out of & in range are Boolean, Crossed is a pulse

  • Threshold: Breached and within threshold are Boolean, Crossed threshold is a pulse

  • Geofence: Within geofence is Boolean, Entered & exited geofence are pulse

For these blocks, typically the pulse outputs should be used to trigger subsequent behavior and the Boolean outputs should be used if additional tests are required. In some cases, using just the Boolean outputs can be much simpler. If you wanted to create an event every time a temperature reading switches from negative to positive, the following model does the job:

Implicit conversion - float & string to Boolean

The conversion from float and string into Boolean is intuitive:

  • string to Boolean - the empty string is “false” and any other string is “true”

  • float to Boolean - 0 is “false” and every other value is “true”

There are two cases where you have to be careful, though. The first case is related to the Boolean to string conversion. Converting a Boolean to a string results in “true” or “false”. So if that same value is converted back to Boolean, it will always be true because the non-empty string “false” is converted to the Boolean value “true”. There are very few blocks which require a string value as an input and none of them outputs the converted value. So, the most likely scenario is an incorrect usage of the Extract Property block:

If you use the block to extract a Boolean property and forget to change the property type to Boolean, the default type will be string. Thus regardless of the actual value of the property, if it is used on a Boolean input of another block it will always be the “true” Boolean value.
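The round trip through a string can be demonstrated with a minimal Python sketch of the conversion rules listed above. The three function names are hypothetical and only mirror the rules as stated in this article.

```python
def string_to_boolean(text: str) -> bool:
    return text != ""  # only the empty string is "false"


def float_to_boolean(number: float) -> bool:
    return number != 0.0  # only 0 is "false"


def boolean_to_string(flag: bool) -> str:
    return "true" if flag else "false"


# A Boolean property extracted as a string and then fed into a Boolean input:
as_string = boolean_to_string(False)
print(as_string)                     # false
print(string_to_boolean(as_string))  # True, because "false" is a non-empty string
```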

The second case is related to the float to Boolean conversion. Floating point numbers in Analytics Builder are 64-bit numbers according to the IEEE-754 standard. Not all decimal numbers can be represented exactly in this format. This becomes more pronounced for very large numbers but can also be an issue for smaller numbers. For example:

0.3 - (0.2 + 0.1)

is not zero in this format but -5.551115123125783e-17. If this value were to be converted into Boolean, it would be “true” although you would expect it to be “false”. To avoid this issue, you can use the Round block to round the result of any arithmetic operation before having it implicitly converted into a Boolean value.
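Python uses the same 64-bit IEEE-754 format for its `float` type, so the effect is easy to reproduce. In this sketch the built-in `round()` merely stands in for the Round block, and rounding to 6 decimal places is an arbitrary choice for illustration.

```python
result = 0.3 - (0.2 + 0.1)
print(result)         # -5.551115123125783e-17
print(result != 0.0)  # True: would be converted to the Boolean "true"

rounded = round(result, 6)
print(rounded != 0.0)  # False: after rounding, the value converts to "false"
```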

Implicit conversion - pulse to Boolean

A pulse output is converted into a Boolean input by switching the Boolean input to “true” momentarily and then switching it back to “false”. This might not always work as expected:

This model has two inputs. The first one is a measurement that is provided to the value input of the Latch Values block to remove duplicate subsequent values. The second input listens for changes to the Managed object of the device. It produces a pulse that is routed to the “Enabled” input of the Latch Values block to control whether the block is enabled or not.

This model might not behave as expected. Every time the Managed object is changed, the “Enabled” input is set to “true” and then immediately set back to “false”. Unless a measurement arrives at exactly the moment the “Enabled” input is switched to “true”, no measurement will be sent through the block.

If this behavior is not intended, the Toggle block can be used to change it. Essentially, it is a “reverse” Pulse block. Two pulse inputs for “set” and “reset” control if the output of the Toggle block is “true” or “false”. A solution could look like this:

Two different properties of the Managed object are monitored and depending on which one is updated, the output of the Toggle block is set to “true” or “false” consequently enabling or disabling the Latch Values block.
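The difference between wiring the pulse directly to “Enabled” and going through a Toggle block can be sketched as a small Python simulation. It is a deliberately simplified model: the function name `latch_passes` and the event kinds "set", "reset", and "measurement" are hypothetical, and the block's own handling of duplicate values is not modeled.

```python
def latch_passes(events, use_toggle):
    """Return the measurement values that get through while "Enabled" is true.

    events: time-ordered list of (timestamp, kind, value), where kind is
    "set", "reset" or "measurement".
    """
    toggle_output = False   # state held by the Toggle block
    last_pulse_time = None  # instant of the most recent "set" pulse
    passed = []
    for timestamp, kind, value in events:
        if kind == "set":
            toggle_output = True
            last_pulse_time = timestamp
        elif kind == "reset":
            toggle_output = False
        elif kind == "measurement":
            if use_toggle:
                enabled = toggle_output
            else:
                # A pulse wired directly to "Enabled" is true only at the
                # instant it occurs.
                enabled = timestamp == last_pulse_time
            if enabled:
                passed.append(value)
    return passed


events = [
    (0, "measurement", 1.0),
    (1, "set", None),
    (2, "measurement", 2.0),
    (3, "measurement", 3.0),
    (4, "reset", None),
    (5, "measurement", 4.0),
]
print(latch_passes(events, use_toggle=False))  # []
print(latch_passes(events, use_toggle=True))   # [2.0, 3.0]
```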

Further reading on implicit conversion: Using Analytics Builder for Cumulocity IoT | Wires and Blocks | Type conversions

Ignore timestamp

A final topic that leads to unexpected results when executing Analytics Builder models is the option to ignore timestamps on incoming measurements, events, and alarms.

By default, Analytics Builder will reorder incoming measurements, events, and alarms to provide them to the models in the right order and with correct temporal spacing. This reordering allows blocks in the models to behave correctly without having to perform their own reordering. This is especially important if aggregations are being calculated or if the model needs to react to error conditions (e.g. if the model needs to react if temperatures are above a threshold for longer than a short period of time but should ignore temporary spikes, the order of measurements is important).

Analytics Builder implements this behavior by delaying received measurements, events, and alarms by a tenant-wide configurable delay. Any measurement, event, or alarm received after this delay may be dropped if a measurement, event, or alarm with an equal or later timestamp has already been processed by the model.
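The idea of delaying and reordering can be illustrated with a small buffer in Python. This is a conceptual sketch of the behavior described above, not the actual implementation: the class name `ReorderBuffer`, its methods, and the use of plain numbers as timestamps are all assumptions made for illustration.

```python
import heapq
import itertools


class ReorderBuffer:
    """Hold items for a fixed delay and release them ordered by timestamp."""

    def __init__(self, delay):
        self.delay = delay
        self.pending = []  # min-heap of (timestamp, sequence, payload)
        self.last_processed = float("-inf")
        self._sequence = itertools.count()  # tie-breaker for equal timestamps

    def receive(self, timestamp, payload):
        """Buffer an item; return False if it arrives too late and is dropped."""
        if timestamp <= self.last_processed:
            return False  # an equal or later timestamp was already processed
        heapq.heappush(self.pending, (timestamp, next(self._sequence), payload))
        return True

    def release(self, now):
        """Return all buffered items whose delay has elapsed, oldest first."""
        released = []
        while self.pending and self.pending[0][0] <= now - self.delay:
            timestamp, _, payload = heapq.heappop(self.pending)
            self.last_processed = timestamp
            released.append((timestamp, payload))
        return released


buffer = ReorderBuffer(delay=2.0)
buffer.receive(10.0, "a")
buffer.receive(9.0, "b")           # arrives second but carries an earlier timestamp
print(buffer.release(now=12.0))    # [(9.0, 'b'), (10.0, 'a')]
print(buffer.receive(9.5, "late")) # False
```

In this model, an item from a device whose clock lags behind is dropped as soon as data with a later timestamp has been processed, which is one way a model can end up never being triggered.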

This behavior can be confusing if the clock of the devices sending the data is not in sync with the clock of Cumulocity IoT, as models might never get triggered.

An often applied quick fix is to enable “Ignore timestamp”. Suddenly, models will work and start to receive events but they may not work as expected. Any Aggregate block (like Discrete Statistics) will calculate aggregates as the events are received and not as they occurred. Events may be processed out of order, so blocks like Delta or Difference Calculation, Direction Detection, Crossing Counter or Missing Data might not work as expected.

Enabling “Ignore timestamp” is safe if a model only uses events from a single device whose clock is off by a fixed amount of time but sends events in the right order and the correct temporal spacing and the order and temporal spacing are retained upon reception. The same is true if a model uses data from multiple devices, whose clocks may or may not be in sync, as long as the previous conditions for an individual device also hold across multiple devices (the timestamps of events from different devices might be off, but they are received by Cumulocity IoT in the right order with correct temporal spacing).

In all other cases, especially if events are received out of order or in batches, the usage of “Ignore timestamp” must be carefully weighed against the inconsistencies it will cause. It might be necessary to develop custom blocks that can perform their functionality even in case of out of order reception of events.

Further Reading: Using Analytics Builder for Cumulocity IoT | Wires and Blocks | Input blocks and event timing

Summary

Hopefully, this article helped you understand how models and blocks are triggered in Analytics Builder, what the common problems are, and how they can be handled. We covered the following sources of models not behaving as expected:

  • Triggering of models through input blocks - how input blocks trigger Analytics Builder models. We identified three classes of input blocks: (1) Managed objects, (2) events, alarms, and operations, (3) measurements and how they behave differently.

  • Implicit conversion - how implicit conversion between output types and input types of blocks can lead to unexpected behavior, how that behavior can be changed and how it can be used to simplify models.

  • Timestamps - we had a brief look at how the timestamps attached to incoming data control Analytics Builder behavior and when it is safe to ignore timestamps.

This article is an eye-opener for me! Finally I understood what is actually happening (or not happening) in my AB models. Thanks Harald!