Skip to content

Insight Analytics API

The Insight Analytics API offers means to observe the work done with connected devices.

Use cases include:

  • Providing operational dashboards to understand the current state
  • Exploring data to discover process dynamics
  • Exporting data (e.g. as .csv file) to process in custom BI tools

Time series data model

The analytics is done using the time series data where all data is stored with a specific point in time to which the data correlates to.

E.g. every scan of a connected device creates a datum at the time of the scan. The data persisted at this point in time is all the information retrieved about the scan: E.g. information about the devices used, scan-related metadata such as the decoded symbology, but also inferred information such as an estimation of the worker effort preceding such as step counts or missed scanning attempts.

The data model provides efficient access to the data persisted for a specific interval of time (e.g. a day, hour, etc.) Additionally, other attributes (e.g. scan symbology, device serial number, etc.) can be used to further filter or group the data.

Conceptionally, you can consider the data model a big table where the data is indexed by timestamps with additional columns:

Time Attribute: Symbology Attribute: Device Serial Attribute: Scan Duration
2021.01.23 06:02:12 QR Code M2MD010101234 2.03 s
2021.01.23 06:03:14 DataMatrix M2MB010105678 9.23 s
2021.01.23 11:17:45 QR Code M2MD010101234 0.54 s
2021.01.23 14:18:03 QR Code M2MD010101234 5.54 s

Each of these columns is either a simple attribute or a measure.

A measure has a varying value of a specific type that is meaningful to observe. This can be a simple count like the number of steps taken or a continous measure such as the duration of triggering the scanner to perform a barcode scan.

An attribute is any kind of label that further identifies a measure. Examples include the device's metadata such as the serial number or the firmware version, process details (like the location, or a process group of a connectivity device), or just the symbology of the decoded barcode. Below, you can find the current list of potential attributes attached to measure.

Recorded measures

Measure SQL Data Type Description
device_battery DOUBLE The percentage of the battery's charge levels
scan_code VARCHAR The scanned barcode value
scan_duration DOUBLE The trigger time in seconds needed to scan the barcode
metrics_worker_steps DOUBLE The approximate number of steps since the last scan
bce_activation_total DOUBLE The sum of trigger times for unsuccessful scans
bce_num_activations DOUBLE The number of trigger presses for unsuccessful scans
trigger_effort_bce_activation_total DOUBLE The barcode scanning effort measure in miliseconds
trigger_effort_bce_num_activations DOUBLE The scanning effort measured as a number of trigger button presses.
time_effort DOUBLE The time effort between the current scan and the previous successful scan made by the same device, measured in milliseconds (max. 5 min).

Recorded attributes

Each measurement can be identified by a set of metadata attributes:

Attribute Description
device_model Worker scanner device model
device_manufacturer Worker scanner device manufacturer
device_serial Worker scanner device serial number
device_firmware Worker scanner device firmware revision
gateway_model Connectivity device model
gateway_manufacturer Connectivity device manufacturer
gateway_id Connectivity device ID
gateway_type Connectivity device deploymen type
gateway_firmware Connectivity device software revision
event_type Type of event: scan or telemetry
event_id Unique event ID
scan_decode_symbology The decoded barcode symbology
customer_gateway_name Connectivity device name
customer_l0_name Name of the first associated level in the connectivity device organization
customer_l1_name Name of the second associated level in the connectivity device organization
customer_l2_name Name of the third associated level in the connectivity device organization
customer_l3_name Name of the forth associated level in the connectivity device organization
gateway_l0_id ID of the first associated level in the connectivity device organization
gateway_l1_id ID of the second associated level in the connectivity device organization
gateway_l2_id ID of the third associated level in the connectivity device organization
gateway_l3_id ID of the forth associated level in the connectivity device organization

Time series query structure

To access the time-series data, a simple sub-dialect of SQL (Structured Query Language) offers basic capabilities to filter, group, and represent the data:

An analytics query can specify the following constraints:

  1. A time frame to extract data from with the inclusive start and end time stamps
  2. A resolution in the time domain by which it will be reported and its data grouped
  3. The selected measurement and how it is aggregated
  4. Optional additional grouping to further divide the data (e.g. to present distinct devices)
  5. Optional additional filtering to further restrict what data to report (e.g. to only show the data of the selected device)
  6. Optional ordering clauses that sort the resulting data

Example: Time series query structure

Retrieve the number of scans for a use case in time frames of one hour grouped by the used scanning device:

{
    "range_start": "2021-02-20T00:00:00+0000",
    "range_end": "2021-02-22T23:59:59+0000",
    "resolution": "1h",
    "select": [
      {"name": "measure_value::varchar", "aggregate": "count", "alias": "scans"},
      {"name": "gateway_l3_id"}
    ],
    "where": [
      {"name": "measure_name", "operator": "=", "value": "scan_code"},
      {"name": "gateway_l2_id", "operator": "=", "value": "b2a931"}
    ],
    "groupby": ["gateway_l3_id"]
}

The sections below describe the existing constraints:

1.Time frame

The time-series data is always selected within the bounds of a time frame. This is specified by two inclusive points in time, range_start and range_end, as the boundaries. Only data within the interval of these points in time is considered for the query and its result.

Calendar-sensitive resolutions: weeks and months

Most resolutions are constant durations expressed in seconds. They are insensitive to time zones and other calendar effects.

In contrast, the resolutions w for week and M for month are both sensitive to the calendar and time zone as they represent a full calendar week or month rather than the number of seconds a week or an average month would take.

For this reason, certain restrictions and special behavior apply to week and month resolutions:

  • At the moment, the resolutions of weeks or months are only possible with a single week or month respectively.
  • The start of a week or a month is adjusted by evaluating the time zone of the range of the timestamps.

Time zone adjustments

As outlined above both week and month resolution (w and M) are time zone sensitive. When selecting them, these rules apply:

The time zone of the range_start parameters is evaluated. The UTC offset of the time zone is used to transpose the result times to match the local time zone.

Time Range Zones Example Effect
different range_start: "2000-01-01:00:00+0000" warning
range_end: "2000-01-01:00:00+0100"
UTC range_start: "2000-01-01:00:00+0000" no adjustment
range_end: "2000-01-01:00:00+0000"
local time zone range_start: "2000-01-01:00:00+0100" adjustment +1h
range_end: "2000-01-01:00:00+0100"

This is done to allow the client to get results reflecting either the neutral UTC time zone or a local time zone of the users choice.

2.Resolution

The resulting time-series data is grouped into equidistant time slots. By picking a suitable resolution, the granularity of this grouping can be adjusted to fit the time range and observation requirements.

To group the data into time windows the following resolution values are available:

Resolution Factor Description Example
s any Seconds 30s
m any Minutes 10m
h any Hours 12h
d any Days 3d
w 1 (One) week 1w
M 1 (One) month 1M

Example: Resolution

E.g. in the first query above, without any other grouping clauses, picking different resolution values yields these results: Consider we aggregate over the resolution windows by summing up the Scan Duration.

Resolution Sum(Scan Duration)
1 min 2.03 s , 9.23 s, 0.54 s , 5.54 s
1 h 11.26 s , 0.54 s , 5.54 s
1 d 17.34 s

3.Measurement and aggregation

To select measurements:

  • select measure_value::double or measure_value::varchar according to the type in the table of Recorded measures
  • filter by the measure_name of the desired measure

E.g. this selects the scan_duration measure:

{
    "select": [
      {"name": "measure_value::double", "aggregate": "sum"},
    ],
    "where": [
      {"name": "measure_name", "aggregate": "=", "value": "scan_duration"},
    ]
}

To aggregate data, you can use one of these functions:

Aggregation Description
avg Arithmetic mean of all values
sum Sum of all values
count Number of all values

These aggregation functions are applied to all selected data with a single group and time resolution.

4.Grouping

The resulting data can be grouped by:

  • The resolution in the time domain
  • And any grouping clauses

Example: Grouping

Additionally, the groups can be subdivided by attributes other than time. In the example above, if we additionally group by the Device Serial Number, the time windows of the resolution parameter are reported for each device:

Resolution Sum(Scan Duration)
1 min M2MD010101234:2.03 s , M2MB010105678:9.23 s, M2MD010101234:0.54 s , M2MD010101234:5.54 s
1 h M2MD010101234:2.03 s , M2MB010105678:9.23 s, M2MD010101234:0.54 s , M2MD010101234:5.54 s
1 d M2MD010101234:8.11 s , M2MB010105678:9.23 s

5.Filtering

Dividing the data for analysis only into groups of time windows or attribute values is often not enough as it does not allow you to zoom in on specific data. Instead, before grouping and aggregating the data, a filtering pass is necessary.

E.g. consider the retrieved data should be that of a specific process or a set of devices. To achieve this, a filtering clause can be used to exclude any data not matching the criteria:

At the moment data can be matched only by comparing for equality of a string value. To match data attributes these string-valued operators can be used:

Operator Description
= Test for equality

Example: Filtering

In the example, if we filter by Device Serial to be equal to M2MB010105678, we aggregate only on this row:

Time Attribute: Symbology Attribute: Device Serial Attribute: Scan Duration
2021.01.23 06:03:14 DataMatrix M2MB010105678 9.23 s

6.Ordering

The order of the returned data can be selected with an ordering clause: One or more attributes can be selected to order the data by. In addition, the data can be either returned in an ascending (lowest values first) or descending direction.

An important use of data ordering is selecting the tail or head of some data: E.g. the top 5 of some measurement or the lowest 10 of some datum can be implemented by ordering such that the first returned rows are the wanted ones and selecting a limit to get exactly the wanted rows.

Example: Ordering

If we were now interested in the longest scan duration we would take it from the top row with ordering set to duration:

{
    "range_start": "2021-02-20T00:00:00+0000",
    "range_end": "2021-02-22T23:59:59+0000",
    "select": [
      {"name": "measure_value::double", "aggregate": "sum", "alias": "duration"},
      {"name": "gateway_l3_id"}
    ],
    "where": [
      {"name": "measure_name", "aggregate": "=", "value": "scan_duration"},
    ],
    "groupby": ["gateway_l3_id"],
    "orderby": ["duration"],
}

The results would be following:

Time Attribute: Symbology Attribute: Device Serial Attribute: Scan Duration
2021.01.23 06:03:14 DataMatrix M2MB010105678 9.23 s

Time series query syntax

When calling an Analytics API endpoint, the time-series query is represented as a structured input such as a JSON object, rather than a DSL string. While this is not as expressive as using a parsed SQL DSL string it has several advantages:

The query syntax is directly specified in the OpenAPI specification as the only documentation of the API. There is no need to maintain another form of input specification for the query syntax. Furthermore, injecting malicious SQL is much harder as parsing and sanitizing requests are much simpler.

Query OpenAPI schema

To query the /{customer_id}/analytics/query endpoint the OpenAPI 3 schema of the request is the following TimeSeriesQuery:

openapi: 3.0.3

components:
  schemas:
    TimeSeriesQuery:
      type: object
      required:
        - range_start
        - range_end
        - select
        - resolution
      properties:
        range_start:
          description: "TheiInclusive oldest time of the data from an ISO 8601 notation. If a UTC offset is given, the responded timestamps will also be in local time, using this parameter's offset."
          type: string
          format: date-time
          example: "2021-01-08T08:14:52+0000"
        range_end:
          description: "The inclusive newest time of the data from an ISO 8601 notation. If a UTC offset is given, the responded timestamps will be also in local time, using the offset provided via range_start"
          type: string
          format: date-time
          example: "2021-01-08T16:14:52+0000"
        resolution:
          description: "The time resolution to bin the data into equally sized buckets: Must be a positive integer followed by s,m,h,d for the second, minute, hour, and day. Additionally, 1w and 1M (one week and one month) are also supported."
          type: string
          pattern: '^\d+(s|m|h|d)|1w|1M$'
          example: "1h"
        select:
          description: "The list of selected clauses determining what data columns to retrieve."
          type: array
          items:
            $ref: "#/components/schemas/SelectClause"
          minLength: 1
          maxLength: 5
        where:
          description: "The list of filtering clauses determining the data records to retrieve the data from. Implicitly filled with the range_start and range_end even if unset otherwise."
          type: array
          items:
            $ref: "#/components/schemas/FilterClause"
          minLength: 0
          maxLength: 5
        groupby:
          description: "The list of columns to group the data by, additional to the implicit binned time resolution. Results in one time series per column value."
          type: array
          items:
            type: string
          minLength: 0
          maxLength: 1
        order:
          description: "Sort order"
          type: string
          enum:
            - ASC
            - DESC
          default: DESC
        orderby:
          description: "Sorting fields. The results are always ordered by the implicit time first. Then we sort by the order of fields supplied here."
          type: array
          items:
            type: string
          minLength: 0
          maxLength: 5
    SelectClause:
      type: object
      required:
        - name
      properties:
        name:
          description: "Name of the column (measure, attribute or function) to select."
          type: string
          example: "measure_value::varchar"
        aggregate:
          description: "Aggregation function to use."
          type: string
          enum:
            - avg
            - count
            - sum
          example: "count"
        alias:
          description: "Name to map the column to for convenience."
          type: string
          example: "scans"
    FilterClause:
      type: object
      required:
        - name
        - value
      properties:
        name:
          description: "The name of the column (measure, attribute) to compare. Must be a string value for now"
          type: string
          example: "measure_name"
        operator:
          description: "Comparison operator. Must be '=' for now."
          type: string
        value:
          description: "Value to compare against."
          type: string
          example: "scan_code"

Response OpenAPI schema

The obtained response TimeSeries consists of a XAxis listing the labels of each reported time in the series and the time series data itself: A list with one element for each XAxis element which itself is a list of all the selected fields.

components:
  schemas:
    TimeSeries:
      type: object
      description: "Time series data"
      required:
        - xaxis
        - series
      properties:
        xaxis:
          $ref: "#/components/schemas/XAxis"
        series:
          type: array
          items:
            $ref: "#/components/schemas/SeriesCollection"
        warning:
          type: string
          description: "An optional warning, displayed if something is wrong or when an ambiguity is removed."
          example: "Different UTC time offsets for the start and end time range detected. Reporting with applied start time range's offset."
    XAxis:
      type: object
      description: "Time series data X-axis values: Either the binned times or other grouping selectors"
      required:
        - categories
        - type
      properties:
        type:
          type: string
          example: "datetime"
        categories:
          type: array
          items:
            type: string
            example: "2021-01-08 10:00:00.000000000"
    SeriesCollection:
      description: "The list of the time-series."
      type: array
      items:
        $ref: "#/components/schemas/Series"
    Series:
      description: "The time-series data values matching the X-Axis positions and the selected clauses."
      type: object
      required:
        - data
        - name
      properties:
        data:
          type: array
          items:
            type: string
            description: "The requested data."
            example: "9"
        name:
          type: string
          description: "The name of the series. Corresponds to the first selected name or the value of the groupby attribute, if one is provided."
          example: "scans"

Example queries

Retrieve the number of scans for a use case in time windows of one hour grouped by the used scanner device:

{
    "range_start": "2021-02-20T00:00:00+0000",
    "range_end": "2021-02-22T23:59:59+0000",
    "resolution": "1h",
    "select": [
      {"name": "measure_value::varchar", "aggregate": "count", "alias": "scans"},
      {"name": "device_serial"}
    ],
    "where": [
      {"name": "measure_name", "operator": "=", "value": "scan_code"},
      {"name": "gateway_l2_id", "operator": "=", "value": "b2a931"}
    ],
    "groupby": ["device_serial"]
}

Summarizes the number of steps in one day for a specific use case in all locations:

{
    "range_start": "2021-01-08T00:00:00+0000",
    "range_end": "2021-01-08T23:59:59+0000",
    "select": [
      {"name": "measure_value::double", "aggregate": "sum", "alias": "steps"},
    ],
    "where": [
      {"name": "measure_name", "operator": "=", "value": "metrics_worker_steps"},
      {"name": "customer_l3_name", "operator": "=", "value": "picking"},
    ],
    "resolution": "1h"
}

Retrieves the overhead of acquiring a successful scan measured by the combined durations of triggering the scan engine preceding a successful scan for a whole day:

{
    "range_start": "2021-01-08T00:00:00+0000",
    "range_end": "2021-01-08T23:59:59+0000",
    "select": [
      {"name": "measure_value::double", "aggregate": "sum", "alias": "activations"},
    ],
    "where": [
      {"name": "measure_name", "operator": "=", "value": "trigger_effort_bce_activation_total"},
    ],
    "resolution": "1h"
}

Patterns and use cases

This is meant as a practicioners collection of emerging patterns and interesting use cases. The presentation favors the problem statements over cohesion and is meant to evolve over time with emerging patterns.

Local time zones in time resolutions

As described above, use a UTC offset in the time range fields to control how the responded times are time zone adjusted. Also

E.g. this would select a time zone shifted 2 hours in respect to UTC:

{
    "range_start": "2021-01-02:00:00+0200",
    "range_end": "2021-01-08T23:59:59+0200",
    "select": [
      {"name": "measure_value::double", "aggregate": "sum", "alias": "steps"},
    ],
    "resolution": "1w"
}