Using SCADA Historian Data for Anomaly Detection Without a Data Scientist

Here's a pattern we encounter regularly: a facility has been running OSIsoft PI (now AVEVA PI System) or a similar process historian for 8–12 years. It records motor current, drive torque, temperature, flow rates, and dozens of other process variables at 1–10 second intervals. The historian is full of data that preceded every significant equipment failure in the past decade. And almost none of it has ever been used for condition monitoring.

The reason isn't malice or incompetence. It's bandwidth and skill gap. Turning historical time-series data into anomaly detection requires statistical knowledge that most maintenance teams don't have on staff, and the tools that process historians expose for querying that data (PI ProcessBook, PI Vision, SQL queries against the AF structure) aren't designed for the kind of multi-variable pattern recognition that failure detection requires.

This article is about changing that — specifically, how to extract actionable condition monitoring from your existing historian data without a data scientist on your team.

What Your Historian Is Actually Recording

Most process historians in discrete manufacturing facilities record four categories of data that are directly relevant to equipment condition monitoring:

Motor current draw: the single most useful early indicator of mechanical loading changes. A motor drawing 8% more current than its baseline while running at the same speed and load almost always points to a mechanical issue: bearing drag, gear mesh degradation, or increased process resistance.
Drive and VFD parameters: variable frequency drives log output frequency, torque estimates, and fault registers. Torque trending on a VFD is a non-invasive way to detect gearbox and coupling degradation without adding any hardware.
Temperature signals: whether from thermocouples, RTDs, or infrared probes, temperature trending on motor housings, gearboxes, and bearing housings provides a lagging indicator of heat-generating faults. It gives less advance warning than vibration, but it is available in most historians without additional instrumentation.
Process variable correlations: production rate, line speed, press cycle time, and similar process variables. When a machine starts taking 3% longer to complete a cycle than it did six months ago, that's a mechanical signal, even if no alarm has fired.
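
The 8% current and 3% cycle-time figures above are both simple percent deviations from a baseline. A minimal sketch of that arithmetic follows; the function name and the sample readings (40.0 A, 1.200 s) are hypothetical values chosen for illustration, not measurements from any facility.

```python
def percent_deviation(value: float, baseline: float) -> float:
    """Return how far `value` sits above (+) or below (-) `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline * 100.0


# Hypothetical readings taken at the same speed and load as the baseline.
current_dev = percent_deviation(43.2, 40.0)    # motor current, amps
cycle_dev = percent_deviation(1.236, 1.200)    # press cycle time, seconds

print(f"current deviation: {current_dev:.1f}%")
print(f"cycle time deviation: {cycle_dev:.1f}%")
```

The comparison only means something when the baseline and the new reading come from the same operating conditions, which is the problem the next section addresses.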

The challenge is that these signals are useful for anomaly detection only when you can establish what "normal" looks like for a specific asset under specific operating conditions — and then detect statistically meaningful deviations from that normal. That's the step that requires methodology, not just data access.

The Baseline Establishment Problem

The most common mistake in historian-based condition monitoring is pulling a data export and looking at it with no normalization. Raw motor current data for a press that runs 4 different product families at 3 different cycle speeds will look chaotic — the operating mode variations swamp the degradation signal. This is why most informal attempts to use historian data for condition monitoring fail: the data looks noisy, the analyst can't see a pattern, and the project gets abandoned.

The solution is operating mode stratification before baseline calculation. For each monitored asset, identify the operating modes that represent distinct loading conditions — typically defined by production recipe, line speed, or product specification. Calculate a separate baseline for each mode. Then compare incoming data only to the baseline for the matching operating mode.

For a motor driving a stamping press, this might mean three operating modes: part type A (high draw weight, 45 SPM), part type B (medium draw weight, 55 SPM), and part type C (light draw weight, 65 SPM). A current spike during part A production is unremarkable if it's within the part A current envelope; the same current level during part C production is an anomaly worth investigating.
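
The stratification idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how any particular product implements it: the mode labels, the sample current values, and the 3-sigma z-score limit are all hypothetical, and the mean/standard-deviation envelope is just one simple way to define "normal" for a mode.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_baselines(history):
    """Group (mode, amps) samples by mode and return {mode: (mean, stdev)}."""
    by_mode = defaultdict(list)
    for mode, amps in history:
        by_mode[mode].append(amps)
    # stdev needs at least two samples, so skip modes with less history.
    return {m: (mean(v), stdev(v)) for m, v in by_mode.items() if len(v) >= 2}


def is_anomalous(mode, amps, baselines, z_limit=3.0):
    """Compare a reading only against the baseline for its own operating mode."""
    mu, sigma = baselines[mode]
    if sigma == 0:
        return amps != mu
    return abs(amps - mu) / sigma > z_limit


# Hypothetical history: part A runs at high current, part C at low current.
history = [
    ("A", 61.0), ("A", 62.5), ("A", 60.2), ("A", 63.1), ("A", 61.8),
    ("C", 38.4), ("C", 39.1), ("C", 37.9), ("C", 38.8), ("C", 38.2),
]
baselines = build_baselines(history)

print(is_anomalous("A", 62.0, baselines))  # inside the part A envelope
print(is_anomalous("C", 62.0, baselines))  # same current, far outside part C's
```

The same 62.0 A reading is unremarkable in one mode and flagged in the other, which is the behavior a single pooled baseline cannot reproduce.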

In OSIsoft PI, operating mode stratification can be implemented with PI AF (Asset Framework) analysis templates that filter input data by process variable ranges before feeding the baseline calculation. This requires some initial configuration but creates a persistent, auto-updating baseline that doesn't need manual recalculation as production mixes shift.
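
To make the "filter by range, then update the baseline" logic concrete, here is a rough Python analogue. It is not PI AF syntax and does not use any PI API. The SPM ranges (the three speeds from the press example with an assumed tolerance of plus or minus 2 SPM), the class name, and the window size are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean

# Hypothetical mode definitions: strokes-per-minute ranges around 45, 55, 65 SPM.
MODE_RANGES = {"A": (43.0, 47.0), "B": (53.0, 57.0), "C": (63.0, 67.0)}


def classify_mode(spm):
    """Return the mode whose SPM range contains `spm`, or None if no range matches."""
    for mode, (low, high) in MODE_RANGES.items():
        if low <= spm <= high:
            return mode
    return None


class RollingBaseline:
    """Keep the most recent `window` current samples per mode."""

    def __init__(self, window=1000):
        self._samples = {mode: deque(maxlen=window) for mode in MODE_RANGES}

    def update(self, spm, amps):
        """Store the sample under its mode; drop it if the speed matches no mode."""
        mode = classify_mode(spm)
        if mode is not None:
            self._samples[mode].append(amps)
        return mode

    def baseline(self, mode):
        """Mean of the stored samples for `mode`, or None if there are none yet."""
        samples = self._samples[mode]
        return mean(samples) if samples else None


rb = RollingBaseline(window=3)
for spm, amps in [(45.0, 61.0), (45.2, 62.0), (50.0, 55.0), (44.8, 63.0), (45.1, 64.0)]:
    rb.update(spm, amps)

print(rb.baseline("A"))  # mean of the last three part A samples
print(rb.baseline("C"))  # no part C samples seen yet
```

One design caution with any auto-updating baseline: if the window is short relative to the degradation timescale, the baseline can drift along with the fault, so the window length deserves deliberate choice.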

Which Historian Variables Provide the Most Predictive Signal

Not all historian variables are equally useful for failure prediction. Based on our work deploying Gearcadence alongside existing historian infrastructure at manufacturing facilities, here is how the major variable types rank by predictive value for common failure modes:

Variable Type	Leading/Lagging	Failure Modes Detected	Typical Lead Time
Motor current RMS	Leading	Bearing drag, gear mesh wear, coupling misalignment	24–96 hours
Drive torque estimate (VFD)	Leading	Gearbox degradation, coupling wear	12–72 hours
Bearing temperature	Lagging	Lubrication failure, bearing fatigue (advanced stage)	2–8 hours
Cycle time / throughput rate	Leading	Clutch/brake degradation, press ram wear	48–120 hours
Drive fault registers	Reactive	Overcurrent, overtemperature, encoder errors	0–30 minutes

The key insight from this table: motor current and drive torque provide the most actionable lead time, and both are almost universally available in existing historians. Temperature is widely recorded but is a lagging indicator: by the time temperature rises significantly, the failure is often 2–8 hours away, compared with the 12–96 hours of warning that current and torque trends typically provide. Drive fault registers are logged in most historians but trigger only when the drive is about to protect itself. They are essentially the last signal before failure, not an early warning.

Getting from Historian Data to Actionable Alerts Without Python

The practical question is: how do you turn a pattern recognition exercise into a repeatable, automated process that a maintenance team can act on without doing a manual data review every day?

There are three paths, in increasing order of automation:

Historian native analytics: OSIsoft PI Asset Framework includes a built-in analytics engine that can calculate rolling statistics, deviation from baseline, and threshold-based alerts. For teams already running PI AF, this is the lowest-friction starting point — no new software required, just PI AF analysis templates configured per asset. The limitation is that PI AF analytics are largely threshold-based: they fire when a value crosses a defined limit, not when a multi-variable pattern becomes anomalous.
PdM platform with historian integration: Gearcadence connects directly to OSIsoft PI via the PI Web API, ingesting time-series data from any tag in the AF structure. The platform applies operating-mode-stratified baseline models and multi-variable anomaly detection on top of the historian data stream. This extends the detection capability from threshold alarms to pattern-based anomaly scoring without requiring any configuration changes on the historian itself.
Custom ML pipeline: Python or R-based statistical models running on the exported historian data. Feasible if you have a data engineer on staff or access to one. Time-to-value is typically 6–12 months for a production-quality implementation. Not realistic for most mid-size manufacturers without dedicated analytics resources.
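
The difference between a threshold alarm and a multi-variable pattern check can be shown with a small sketch. It uses the Mahalanobis distance on two signals as one simple example of pattern-based scoring; this is a stand-in chosen for illustration, not a description of the method any specific platform uses. The healthy readings, the alarm limits, and the score limit of 3.0 are hypothetical.

```python
from statistics import mean


def fit_2d(xs, ys):
    """Return means and sample (co)variances for two paired signals."""
    mx, my = mean(xs), mean(ys)
    n = len(xs)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return mx, my, sxx, syy, sxy


def mahalanobis_2d(x, y, stats):
    """Distance of (x, y) from the healthy cloud, accounting for correlation."""
    mx, my, sxx, syy, sxy = stats
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x - mx, y - my
    # Quadratic form using the closed-form inverse of a 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 ** 0.5


def within_limits(current, torque):
    """Hypothetical per-tag alarm limits of the kind a threshold alert would use."""
    return 38.0 <= current <= 48.0 and 48.0 <= torque <= 58.0


# Hypothetical healthy history: current (A) and torque (%) rise together.
healthy_current = [40.0, 41.0, 42.0, 43.0, 44.0, 45.0, 46.0]
healthy_torque = [50.5, 51.0, 52.4, 52.9, 54.2, 54.8, 56.1]
stats = fit_2d(healthy_current, healthy_torque)

normal_score = mahalanobis_2d(44.0, 54.0, stats)
odd_score = mahalanobis_2d(45.5, 50.8, stats)  # high current with low torque

print(within_limits(45.5, 50.8), odd_score > 3.0)
```

The odd reading passes both single-tag limits, yet it breaks the usual relationship between current and torque, so the multi-variable score flags it while a threshold alert stays silent.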

For most facilities we work with, the second path is the practical one: use the historian as the data source, and add a purpose-built anomaly detection layer on top of it. The historian doesn't need to change; it keeps doing what it does. The PdM platform consumes the same data and applies the pattern recognition logic that the historian can't do natively.

A Starting Point: The Three-Tag Audit

If you're not sure whether your historian data is usable for condition monitoring, start with a simple audit on your three highest-priority assets. Pull three tags per asset: motor current RMS (or equivalent), the most relevant temperature signal, and whatever cycle-time or production rate proxy your SCADA records. Go back 12 months. Plot each tag against time. Look for trends in the 4–8 weeks before any unplanned failures or reactive repairs you had in that period.
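
If eyeballing the plots feels too subjective, a least-squares slope over the weeks before a failure gives a quick number to compare across assets. The sketch below assumes the tag has already been exported and averaged to one value per week for a single operating mode; the weekly values are hypothetical.

```python
def trend_slope(values):
    """Least-squares slope of evenly spaced values, in units per sample interval."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


# Hypothetical weekly mean motor current (A) for the 8 weeks before a repair.
weekly_current = [40.1, 40.0, 40.3, 40.6, 41.0, 41.5, 42.1, 42.8]
slope = trend_slope(weekly_current)

print(f"trend: {slope:.2f} A per week")
```

A slope that is clearly positive in the weeks before a failure, and near zero during comparable healthy periods, is the kind of pattern the audit is looking for.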

In our experience, approximately 70% of the facilities that run this exercise find a visible current or torque trend, 2–6 weeks before a reactive repair event, that they could have acted on with the right detection logic in place. The data was there. The failure wasn't inevitable; it was just invisible until someone looked at the right signals with the right methodology.

That's the core premise behind using historian data for condition monitoring: you don't necessarily need new sensors. You might just need a different lens on the data you've already been collecting.
