Case Studies

Industrial IoT telemetry and asset monitoring platform

Key Results

  • 105K assets monitored
  • 5.5M sensor readings / month
  • 750K operational events / month
  • Location: EMEA
  • Cooperation Period: 4 months
  • Industry: Industrial IoT

About the project

The project was performed for a telecom operator managing a large fleet of distributed infrastructure assets, including network towers and related field equipment. These assets generated a continuous stream of telemetry and operational event data that needed to be collected, processed, and analyzed to support monitoring, maintenance, and incident response.

PerformaCode’s team was engaged to redesign the monitoring system once operational load from tower telemetry exceeded what the existing solution could reliably handle. The work focused on ingestion of high-frequency measurements, event correlation across assets, and system behavior under sustained load, including authorization boundaries, logging, and failure scenarios required for operational use.

The solution was designed to handle high-volume telemetry streams from tens of thousands of physical assets, consolidate data coming from heterogeneous sources, and provide near-real-time visibility into asset state and operational incidents. Particular attention was paid to scalability, predictable data flows, and clear separation between ingestion, processing, and consumption layers to support future system growth.

  • 4 engineers
  • 4 months
  • FP delivery model

Client challenges

The client operated a large, geographically distributed fleet of telecom towers, including assets located in remote or hard-to-access areas. These towers produced continuous telemetry and operational events, and physical access for inspection or repair was often costly or delayed. As a result, monitoring accuracy and timeliness directly affected operational decisions.

The existing monitoring system could not reliably handle the event volume, burst patterns, and data heterogeneity generated by the fleet. Under peak load and incident conditions, delayed processing and incomplete correlation reduced confidence in the reported asset state.

Telemetry arrived from multiple sources with inconsistent schemas, timing, and delivery guarantees. Converting raw measurements into a consistent, current asset state required correlation across streams, ordering of events, and tolerance to partial or delayed data—particularly when connectivity was degraded.

The system also had to serve multiple internal consumers with different access scopes and latency expectations. Any architectural change carried production risk, making it necessary to design a second-generation solution that could scale deterministically, isolate failures, and evolve without interrupting live monitoring and operations.

Tasks performed

  • Designed a second-generation IIoT monitoring architecture for continuous telemetry ingestion from geographically distributed industrial assets
  • Defined telemetry acquisition and normalization pipelines for data originating from field equipment with heterogeneous protocols, formats, and connectivity quality
  • Designed an event-driven processing backbone to handle bursty telemetry, operational events, and incident-driven load spikes without data loss
  • Modeled physical assets and operational state, correlating raw measurements into a consistent, time-aware view of asset condition and availability
  • Designed system interfaces with field equipment, accounting for embedded constraints such as limited bandwidth, intermittent connectivity, and device-side buffering
  • Designed an IoT gateway layer to isolate field protocols, handle intermittent connectivity, and decouple asset communication from core processing
  • Specified stream-processing logic for operational signals, including filtering, aggregation, correlation, and ordering under partial or delayed data delivery
  • Defined authorization and access control boundaries aligned with operational roles (operations, maintenance, planning) rather than application features
  • Designed logging and traceability mechanisms to support fault analysis, incident investigation, and operational audits across distributed components
  • Separated real-time operational flows from analytical consumption, enabling downstream reporting and analytics without impacting monitoring latency
  • Reviewed and iterated architecture with operations and engineering teams, validating design decisions against real deployment, maintenance, and failure scenarios
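The ordering requirement mentioned above (stream processing under partial or delayed data delivery) can be illustrated with a watermark-style reorder buffer. The sketch below is in Java, one of the languages listed in the project's stack; all class, record, and field names are hypothetical and not taken from the delivered system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Minimal sketch of an out-of-order telemetry reorder buffer.
// Readings are held until a watermark (the highest timestamp seen so
// far, minus an allowed lateness) passes them, then released in
// timestamp order so downstream correlation sees a consistent stream.
public class ReorderBuffer {
    public record Reading(String assetId, long timestampMs, double value) {}

    private final PriorityQueue<Reading> pending =
        new PriorityQueue<>((a, b) -> Long.compare(a.timestampMs(), b.timestampMs()));
    private final long allowedLatenessMs;
    private long maxSeenTs = Long.MIN_VALUE;

    public ReorderBuffer(long allowedLatenessMs) {
        this.allowedLatenessMs = allowedLatenessMs;
    }

    // Accepts a reading (possibly late) and returns any buffered
    // readings that are now safely behind the watermark, in order.
    public List<Reading> accept(Reading r) {
        maxSeenTs = Math.max(maxSeenTs, r.timestampMs());
        pending.add(r);
        long watermark = maxSeenTs - allowedLatenessMs;
        List<Reading> ready = new ArrayList<>();
        while (!pending.isEmpty() && pending.peek().timestampMs() <= watermark) {
            ready.add(pending.poll());
        }
        return ready;
    }
}
```

A reading that arrives inside the lateness window is slotted back into order; anything later than the window would have to be handled by a separate late-data path, which this sketch omits.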

Project results

105K assets modeled

Delivered an asset and telemetry data model capable of representing ~105,000 monitored units by standardizing identifiers, state fields, and event semantics across heterogeneous tower data sources.

5.5M measurements/month

Designed ingestion and normalization flows for ~5.5 million monthly telemetry readings by defining schema normalization rules and processing paths tolerant to delayed and out-of-order data.

750K events/month processed

Specified event-driven processing for ~750,000 monthly equipment events by using an event-bus-centered architecture with controlled buffering and back-pressure to absorb burst load during incidents.
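The controlled buffering and back-pressure described above can be reduced to a simple idea: a bounded buffer that refuses new events rather than growing without limit. The Java sketch below illustrates that idea under assumed names; it is not the delivered event-bus implementation, which would typically sit on top of a message broker.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal sketch of controlled buffering with back-pressure: a bounded
// queue absorbs bursts of operational events; when it fills up, the
// producer receives an explicit refusal instead of unbounded memory
// growth, and can retry, throttle, or spill to durable storage.
public class EventBuffer {
    private final BlockingQueue<String> queue;

    public EventBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Returns false when the buffer is full: the back-pressure signal.
    public boolean offerEvent(String event) {
        return queue.offer(event);
    }

    // Returns the oldest buffered event, or null if the buffer is empty.
    public String pollEvent() {
        return queue.poll();
    }

    public int depth() {
        return queue.size();
    }
}
```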

End-to-end traceability added

Enabled incident-level troubleshooting by defining centralized logging and correlation identifiers across ingestion, processing, and consumption components so events could be traced through the pipeline.
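The correlation-identifier mechanism above amounts to stamping each event once at ingestion and carrying that identifier through every stage's log output. A minimal Java sketch, with hypothetical stage and class names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Minimal sketch of incident-level traceability: every event entering
// the pipeline is stamped with a correlation id at ingestion, and each
// later stage logs against that id, so a single reading can be traced
// end to end across distributed components.
public class TraceLog {
    private final List<String> lines = new ArrayList<>();

    // Assigns a correlation id at ingestion; all later stages reuse it.
    public String ingest(String assetId, String payload) {
        String correlationId = UUID.randomUUID().toString();
        log(correlationId, "ingest", "asset=" + assetId + " payload=" + payload);
        return correlationId;
    }

    public void log(String correlationId, String stage, String detail) {
        lines.add(correlationId + " " + stage + " " + detail);
    }

    // Reconstructs the path of one event across all logged stages.
    public List<String> trace(String correlationId) {
        return lines.stream().filter(l -> l.startsWith(correlationId)).toList();
    }
}
```

In a real deployment the log lines would go to a centralized store and the filter would be a query on the correlation-id field, but the tracing principle is the same.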

Real-time and BI isolated

Prevented analytical workloads from degrading monitoring responsiveness by separating real-time telemetry processing paths from DWH/BI consumption interfaces and defining decoupled integration points.

Role-based access enforced

Defined authorization boundaries for operations, maintenance, and planning users by mapping data access to roles and isolating sensitive functions behind explicit authorization and audit points.
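Mapping data access to operational roles rather than application features can be pictured as an explicit role-to-scope table checked on every read. The sketch below assumes illustrative scope names; the actual roles come from the case study, but the scopes and class name are hypothetical.

```java
import java.util.Map;
import java.util.Set;

// Minimal sketch of role-scoped data access: each operational role maps
// to an explicit set of data scopes, and every read is authorized
// against that mapping. Unknown roles get no access by default.
public class AccessPolicy {
    private static final Map<String, Set<String>> ROLE_SCOPES = Map.of(
        "operations",  Set.of("asset-state", "incidents", "telemetry"),
        "maintenance", Set.of("asset-state", "work-orders"),
        "planning",    Set.of("asset-state", "capacity-reports"));

    public static boolean canRead(String role, String scope) {
        return ROLE_SCOPES.getOrDefault(role, Set.of()).contains(scope);
    }
}
```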

Near-real-time asset state

Enabled near-real-time visibility into tower condition by normalizing telemetry streams and correlating measurements and events into a continuously updated asset-state model, even under delayed or burst data delivery.
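A continuously updated asset-state model that stays correct under delayed delivery hinges on one rule: a late-arriving reading must never overwrite a newer observed state. A minimal Java sketch of that rule, with hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a time-aware asset-state model: each reading
// carries its own measurement timestamp, and a late-arriving reading
// never overwrites a state observed more recently.
public class AssetStateModel {
    public record State(String status, long observedAtMs) {}

    private final Map<String, State> states = new HashMap<>();

    // Applies a reading; returns true if it advanced the asset's state,
    // false if it was older than what is already recorded (late data).
    public boolean apply(String assetId, String status, long observedAtMs) {
        State current = states.get(assetId);
        if (current != null && current.observedAtMs() >= observedAtMs) {
            return false; // stale reading: keep the newer state
        }
        states.put(assetId, new State(status, observedAtMs));
        return true;
    }

    public State stateOf(String assetId) {
        return states.get(assetId);
    }
}
```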

Remote issue triage enabled

Reduced reliance on physical site visits by structuring telemetry and events to support remote diagnosis, allowing operations teams to identify and triage issues without immediate access to towers in remote or hard-to-access locations.

Second-generation scaling unblocked

Removed architectural scaling limits of the first system by decoupling ingestion, processing, and consumption paths, allowing new assets, telemetry sources, and consumer roles to be added without redesign.

Value we bring

Turning raw telemetry into operational asset state

We translate noisy, heterogeneous telemetry into a consistent, time-aware asset state by aligning embedded signals, analytics logic, and operational semantics early. This requires close cooperation between embedded, data, and the client's in-house teams to validate assumptions against production behavior, not dashboards. The result is a monitoring model that operators can trust when conditions degrade or incidents spike.

Hardware-first thinking that challenges weak decisions early

We start from device and field constraints (bandwidth limits, latency, failure modes, access cost) and challenge architectural choices that ignore them. By pushing back early on designs that look clean on paper but break in the field, we reduce rework and shorten time-to-market. This approach prevents downstream fixes that are expensive, slow, and operationally risky.

Designing IIoT systems that tolerate imperfect telemetry

IIoT telemetry is late, duplicated, missing, and often out of order. We design ingestion and processing paths to absorb imperfect data through buffering, ordering, correlation, and clear failure isolation. This practice produces systems that fail predictably rather than silently, maintaining operational usefulness under real network and device conditions.

Technologies

  • Linux
  • .NET Core
  • Java
  • Apache Spark
  • Pentaho
  • Data Vault
  • REST APIs
  • IoT Gateway

Other Case Studies

ARM Platform Firmware and Lifecycle Engineering

Long-term firmware and platform engineering for a new ARM processo...

Autonomous Driving ML Toolchain Validation

Validation and sustaining engineering for an autonomous driving ML...

Edge AI Drone for Orchard Monitoring

Development of a drone-based video analytics system, delivering re...
