Several important systems like power grids, water supply systems, oil and gas production/distribution systems, mass transportation systems, etc., are considered as critical infrastructure (CI) systems due to the catastrophic nature of damages that can result from their failure. The task of monitoring and controlling such systems is often entrusted to Supervisory Control and Data Acquisition (SCADA) systems.
SCADA systems are an attractive target for attackers, as they offer an avenue for launching attacks against high-value CI systems. A typical SCADA system may include several remote terminal units (RTU), one or more master terminal units (MTU), a variety of communication equipment and links, and computers running human machine interface (HMI) software to enable more intuitive operator-driven control when necessary. Hidden malicious or accidental functionality in any SCADA system component could be exploited by an attacker to launch such attacks. Such hidden functionality could exist in (the logic programmed into) programmable logic controllers (PLC) in RTUs and MTUs, in any computer used for programming PLCs, in any peripheral of the computer running the HMI software or the SCADA data logger, in the operating system of such computers, in the HMI software, or even, ironically, in a computer that runs the intrusion detection system (IDS) intended for protecting the SCADA system.
In 2010, a virus known as Stuxnet1 that had evaded detection for over a year was identified. This virus targeted nuclear plants, and shut down centrifuges inside the plant by overwriting some set-points. In November 2011, the Illinois Statewide Terrorism and Intelligence Center reported2 a cyber-attack on a small, rural water utility outside Springfield, where attackers had gained remote access to pumps. In May 2003, a Slammer worm exploiting3 an un-patched version of Microsoft SQL erased crucial SCADA system logs. “In March 2016, the U.S. Justice Department claimed that Iran had attacked U.S. infrastructure by infiltrating the industrial controls of a dam in Rye Brook, New York. The attackers compromised the dams command-and-control system in 2013 using a cellular modem”. “In December 2015, a power company located in western Ukraine suffered a power outage that impacted a large area that included the regional capital of Ivano-Frankivsk. The cybercriminals had facilitated the outage by using BlackEnergy malware to exploit the macros in Microsoft Excel documents”.
It is indeed for very good reasons that such threats have been recognized as “Advanced Persistent Threats”. Due to the high value of targets, the possibility of sophisticated state sponsored attacks has to be considered. Sophisticated malicious functionality may be introduced even during the manufacturing process of various components that could ultimately end up in SCADA systems. In addition, we cannot afford to ignore the possibility that an attacker may have actually participated in the deployment of the SCADA system, or testing of the deployed system, and taken advantage of such an opportunity to inject hidden functionality in some component.
While it is important to take all possible practical steps to reduce the threat of hidden malicious functionality, we may never be able to eliminate such functionality in every component. Such functionality may be exploited to launch attacks while simultaneously reporting “all clear” messages to the stake-holders. It is of vital importance that we are at the minimum able to reliably detect such attacks, even if hidden malicious functionality is inevitable.
1.1. Active vs Passive Security Measures
The process of securing any system can be seen as consisting of three broad steps: 1) enumeration of desired assurances; 2) identification of reasonable assumptions; and 3) development of a process, viz., a security protocol, to translate the assumptions into the desired assurances. In other words, if we begin with good assumptions, and if the security protocol is correct, and if the agency responsible for executing the protocol is trustworthy, then the desired assurances are guaranteed.
Approaches to secure systems can be broadly classified into:
1) active approaches based on attack models; and
2) passive approaches based on system-state models.
Underlying active approaches is the notion that violations of desired assurances result from attacks. As attacks exploit pitfalls in the implementation of systems (resulting in undesired functionality), some of the practical tools used in active approaches for identification, removal and/or isolation of attacks include machine learning for modeling, detecting, and classifying intrusions; isolation mechanisms like hypervisors, containers, and various cryptographic mechanisms; and good programming practices to minimize bugs and potentially malicious functionality. The complexity of most tools used in active approaches renders it infeasible to make meaningful assumptions regarding the integrity of both 1) the tools, and 2) the process that utilizes the tools to realize the desired assurances. Furthermore, due to the unrestricted freedom of attacks, active approaches will forever be engaged in an evolutionary arms race with attacks.
Passive approaches view digital assets of a system as a dynamic set of states. The desired assurances dictate the nature of protection to be extended to each state, and are expressed in the form of an unambiguous system-state transition model. Executing the model (or model-driven verification) is a process of 1) actually verifying that the state-transition rules specified by the model are not violated when the system is operational, and 2) reporting such findings to stake-holders. To ensure that the model is correct, the model should be made open to scrutiny. In such an event, the extent of trust in the assurances offered by such an approach (that no state violation will go undetected) is only limited by the trust in the environment for model-execution. The novel STCB (SCADA Trusted Computing Base) approach proposed in this paper is a passive approach, which leverages a rigorous standard for a trustworthy model-execution environment.
Current approaches to secure systems, and more specifically, CI systems monitored by SCADA systems, are overwhelmingly active approaches. Ultimately, both active and passive approaches are necessary. Active approaches attempt to deflect/repel as many attacks as possible. Passive approaches diminish the pay-off for attackers, as even state violations resulting from attacks that slip past active approaches will be detected by passive approaches. In addition, active approaches are also essential for the narrow purpose of thwarting attacks that seek to compromise the integrity of the model-execution environment.
1.2. Trusted Computing Base
For any system with a desired set of security requirements, the trusted computing base (TCB) is “a small amount of software and hardware we rely on” (to realize the requirements) and “that we distinguish from a much larger amount that can misbehave without affecting security”. In other words, as long as the TCB is worthy of trust, the TCB can be leveraged to realize the desired assurances regarding the operation of the entire system.
In the proposed passive approach to secure SCADA systems, a resource limited trustworthy module, which we shall refer to as an STCB (SCADA Trusted Computing Base) module, serves as the TCB for model-execution. The main contributions of this paper are: 1) a strategy for expressing state-transition models for SCADA systems; and 2) a functional specification for STCB modules, for executing the model.
To improve the confidence in the integrity of STCB modules, they should ideally be manufactured under a well-controlled environment, and consummately tested for the designed functionality. To facilitate consummate testing, it is necessary to deliberately constrain STCB modules to possess simple functionality. For low-cost mass-realization of reliable STCB modules to be practical, the simple functions executed inside STCB modules should nevertheless permit them to serve as the TCB for any SCADA system―irrespective of the nature and scale of the CI system. While the “instruction set” for specifying the state transition model should be rich enough to be suitable for any SCADA system, it should simultaneously be simple enough to be executed even by severely resource limited STCB modules.
The main components of the proposed STCB based security architecture include
1) a systematic strategy for designing SCADA state-transition models for any SCADA system, consisting of
a) an instruction set for expressing the model,
b) role of the designer of the SCADA system, and
c) role of the deployer of the system;
2) a functional specification for STCB modules, suitable for executing the instruction set for any SCADA system; and
3) an STCB protocol, for interacting with STCB modules, and obtaining SCADA state reports.
The rest of this paper is organized as follows. Section 2 is an overview of the STCB approach. Section 3 outlines the STCB design process. Section 4 outlines processes for STCB deployment and operation. Section 5 provides a detailed description of the STCB functionality. Section 6 describes the STCB protocol. Finally, conclusions are offered in Section 7.
2. Overview of STCB Approach
While a state-based security approach can be extended to any system, such an approach is indeed natural for critical infrastructure SCADA systems. Note that the ultimate purpose of a SCADA system, viz., to monitor and report CI system states to stake-holders, is indeed identical to that of a state-model based security architecture, consisting of model-driven verification and reporting.
The state reports from a SCADA system can be seen as a function of the current states of all sensors associated with the system. For a SCADA system characterized by n sensors $S_1, \ldots, S_n$, let $v_1, \ldots, v_n$ represent the states of the n sensors, and let
$$F(v_1, \ldots, v_n)$$
represent a function that captures the “physics” of the controlled system, and reports values to the stake-holder as the “state of the system”. More specifically, as inputs to the SCADA system (sensor measurements) may be received asynchronously, the function is often realized as
$$F() \equiv \{F_1(), F_2(), \ldots, F_n()\},$$
where $F_i()$ is evaluated whenever a fresh measurement from sensor $S_i$ is made available. Furthermore, in practical SCADA systems, evaluation of $F()$ is performed jointly by numerous system components that may include PLCs in multiple RTUs and MTUs, the HMI, and even actions by human operators. Consequently, notwithstanding current active measures, the integrity of the state reports is far from assured. Specifically, current active approaches include features like a) cryptographic protection of links between RTUs and MTUs to prevent message injection attacks by attackers, and b) intrusion detection systems to facilitate early detection and evasion of attacks.
The goal of the STCB security model is to guarantee the integrity of state reports provided by the agent. To achieve its goals, the STCB security model relies only on a) the integrity of STCB modules, and b) the integrity of clearly defined processes to be adopted by entities identified as the designer and the deployer of the SCADA system. The designer is an entity with good domain knowledge (regarding the CI system); the deployer is a security professional who is not required to possess any knowledge of the CI system. To the extent the stake-holder trusts the integrity of the STCB modules, and the verifiable processes adopted by the designer and the deployer, the stake-holder is assured of the integrity of the state report―even if malicious functionality may exist in SCADA system components.
2.1. STCB System Components
The additional components introduced into a STCB-secured SCADA system include
1) an untrusted “STCB system manager” U,
2) STCB modules , and .
All STCB modules are identical, and are capable of executing a set of simple TCB functions. Modules are “closely bound” to SCADA system sensors. In the rest of this paper we shall use the term sensor module (SM) for STCB modules , and the term central module (CM), for STCB module .
The untrusted STCB manager U periodically receives sensor reports from SMs and makes them available to CM . CM evaluates , and outputs state reports, see Figure 1. From a broad perspective, the authenticity of the inputs to are assured by SMs ; the integrity of the function is assured by the CM.
The exact make-up of the manager U is irrelevant for our purposes of guaranteeing the integrity of , as U is not trusted. Unless U performs its tasks faithfully, valid state reports cannot be sent to the stake-holders.
The state reports are relayed by the STCB manager U to an STCB module associated with a stake-holder. Any number of stake holder modules like may exist. More generally, a stake-holder module may be the CM for another STCB deployment.
For example, the state reports from different SCADA systems may be provided as “sensor reports” to a system at a higher level of hierarchy. In such a scenario, the stake-holder module can be seen as the CM of an STCB deployment at a higher level of hierarchy. Module considers the state reports from the CMs of systems at the lower level as “sensor reports” from foreign STCB deployments.
Any number of hierarchical levels may exist. For example, state reports from multiple SCADA systems in a town may be inputs to a single SCADA system that monitors the health of all such systems in the town. The reports from such SCADA systems in different towns may be inputs to another SCADA system at an even higher level of hierarchy, that monitors the health of all systems in a state, and so on.
Figure 1. Top: STCB components. Bottom: Information flow in the STCB model. The STCB module associated with the stake-holder can be the CM for a deployment at a higher hierarchical level.
The main challenge lies in the choice of a strategy for evaluating any inside the trusted confines of resource challenged STCB modules. Recall that we desire to deliberately constrain STCB modules to possess only modest memory and computational abilities. Consequently, we constrain STCB modules to perform only logical and cryptographic hash operations. By performing simple logical operations the STCB modules support a simple instruction set for representing different s. As no restrictions are placed on the nature and scale of the SCADA system, some of the specific challenges are that
1) the number of sensors n can be unlimited;
2) evaluation of ’s may require evaluation of complex functions, and thus challenging to represent using merely the instruction set .
Both challenges are addressed through the use of Merkle hash trees.
2.2.1. Merkle Trees
A Merkle tree is a binary hash tree which permits a resource limited entity to assure the integrity of a dynamic database of practically any size, even while the database is stored in an untrusted location. Specifically, the resource limited entity only needs to store a single cryptographic hash―the root of the tree.
A Merkle hash tree with $n = 2^L$ leaves (for simplicity we shall assume that n is a power of 2) has L levels. For storing a database with n records, each record is interpreted as a leaf of the tree. Corresponding to each leaf (record) R is a leaf-hash obtained as $v = h(R)$, by hashing the leaf using a secure cryptographic hash function $h()$.
The n leaf-hashes (say, $v_1^0, \ldots, v_n^0$) corresponding to n records are at level 0 of the binary tree. At level 1 of the tree are n/2 nodes $v_1^1, \ldots, v_{n/2}^1$, where $v_i^1 = h(v_{2i-1}^0 \,\|\, v_{2i}^0)$. Similarly, the n/2 nodes in level 1 result in n/4 nodes in level 2, and so on. Construction of the tree stops at level L with a single node r, the root of the tree. For any leaf R (with corresponding leaf node $v = h(R)$), there exists a set of L complementary nodes, say, $c_1, \ldots, c_L$, and a sequence of L hash operations represented as $f(v, c_1, \ldots, c_L)$ such that $r = f(v, c_1, \ldots, c_L)$, see Figure 2.
Protocols that employ Merkle hash trees can be seen as an interaction between two parties: a prover and a verifier. The prover stores all n leaves and all nodes (distributed over levels 0 to L). The verifier stores only the root r (the lone node at level L). To demonstrate that a record R is part of the tree, the prover provides the L complementary nodes $c_1, \ldots, c_L$ as proof. The verifier accepts record R as authentic only if $f(h(R), c_1, \ldots, c_L) = r$. To update record R to $R'$, the verifier simply sets its root to $r' = f(h(R'), c_1, \ldots, c_L)$.
Figure 2. A binary hash tree with 8 leaves. The set of complementary nodes for a leaf node consists of the sibling of that node and the siblings of its ancestors at levels 1 and 2.
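The prover/verifier interaction described above can be illustrated with a minimal Python sketch. It assumes SHA-256 as the hash function and plain concatenation of the two child hashes (the text fixes neither), and the function names `build_levels`, `complementary_nodes` and `fold` are hypothetical names chosen for this illustration only.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Cryptographic hash; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Prover side: all levels of the tree (level 0 = leaf hashes, last = [root]).
    Assumes the number of leaves is a power of 2."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def complementary_nodes(levels, index):
    """Prover side: sibling of the leaf node and of each ancestor below the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling differs only in the lowest bit
        index //= 2
    return proof

def fold(leaf_hash, index, proof):
    """Verifier side: the sequence of L hash operations from a leaf hash to a root."""
    node = leaf_hash
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node

records = [f"record-{i}".encode() for i in range(8)]
levels = build_levels(records)
root = levels[-1][0]                    # the only value the verifier stores

proof = complementary_nodes(levels, 5)  # L = 3 complementary nodes
is_member = fold(h(records[5]), 5, proof) == root

# Update: the verifier derives the new root from the same complementary nodes.
new_root = fold(h(b"record-5-updated"), 5, proof)
```

Note that the verifier never sees the other seven records; membership and update both cost L hash operations.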
2.2.2. Merkle Trees in the STCB Approach
In the STCB approach resource challenged STCB modules store only the root of the tree, and have the ability to perform operations. This capability is leveraged to assure the integrity of
1) a dynamic database of n sensor measurements;
2) any number of simple static “algorithms” to evaluate different s, where each algorithm is a small number of instructions (belonging to the instruction set ) supported by STCB modules; and
3) static look-up tables (of any size) for evaluating complex functions that may be necessary to execute some (instruction in) .
Specifically, the STCB module for an STCB deployment stores a (static) root of a static Merkle tree, and the (dynamic) root of a dynamic Merkle tree. The leaves of the static tree are the specifications for a specific STCB deployment, provided by the designer and the deployer of the system. The leaves of the dynamic tree are the current states of the n sensors of the system. The leaves and all intermediate nodes of both trees are stored by the untrusted STCB manager U.
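As a rough illustration of this division of labour, the sketch below models a module that holds only two hashes, while the (untrusted) caller supplies leaves and complementary nodes. The class `ModuleRoots`, its method names, the two-leaf trees, and the use of SHA-256 are all assumptions made for the example; they are not the module interface specified in this paper.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # hash choice is an assumption

def fold(leaf_hash, index, proof):
    """Map a leaf hash and its complementary nodes to a root."""
    node = leaf_hash
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node

class ModuleRoots:
    """Hypothetical in-module state: one static root and one dynamic root."""

    def __init__(self, static_root, dynamic_root):
        self.static_root = static_root
        self.dynamic_root = dynamic_root

    def check_static(self, leaf, index, proof):
        """Accept a design/deployment record only if it folds to the static root."""
        return fold(h(leaf), index, proof) == self.static_root

    def update_dynamic(self, old_leaf, new_leaf, index, proof):
        """Replace a sensor record; refuse if the old record is not authentic."""
        if fold(h(old_leaf), index, proof) != self.dynamic_root:
            return False
        self.dynamic_root = fold(h(new_leaf), index, proof)
        return True

# Manager side (untrusted): stores the leaves and supplies the proofs.
design = [b"design-record-1", b"design-record-2"]
sensors = [b"sensor-1:20", b"sensor-2:7"]
module = ModuleRoots(h(h(design[0]) + h(design[1])),
                     h(h(sensors[0]) + h(sensors[1])))

ok_static = module.check_static(design[1], 1, [h(design[0])])
rejected = module.update_dynamic(b"sensor-1:99", b"sensor-1:21", 0, [h(sensors[1])])
accepted = module.update_dynamic(sensors[0], b"sensor-1:21", 0, [h(sensors[1])])
```

A manager that misreports the old sensor record cannot move the dynamic root, which is why U need not be trusted.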
2.3. STCB Designer and Deployer
One of the main motivations for clearly demarcating between the roles of a designer and a deployer is that entities with good domain knowledge (for example, an entity with in-depth knowledge about the domain of specific CI system, like a nuclear plant) are often unlikely to be security experts. Likewise, security experts are unlikely to be experts in the domain of the specific CI system.
The designer is a domain expert with good knowledge of the CI system. The designer is required to be aware of the purpose of each sensor in the system, and the interpretation of their states. For example, (say) in a water-tank control system, “if (water level greater than 100) should be zero (the pump should be off).” The responsibility of the designer is to come up with a specification for the function that captures the physics of the system. Specifically, the designer specifies two types of records. For a system with n sensors, the designer specifies
1) n static records which convey cryptographic commitments to values (instructions, inputs, constants) necessary for executing each ;
2) n records which convey the initial state of n sensors. During operation, a sensor report from will trigger execution of instructions defined in a record to modify record .
The deployer is a security professional who may not possess any CI system domain knowledge. The responsibility of the deployer is to procure and install STCB modules, and be aware of steps to be taken, for example, to
1) facilitate establishment of shared secrets between modules;
2) securely connect (for example, using tamper-evident connectors) physical sensor outputs to SMs, and record such bindings (for example, indicating that sensor is connected to module );
3) deploy the STCB manager U―which includes installation of all hardware/software necessary to relay SM outputs to the STCB manager U, setting up a channel to the CM , and a channel to be used for conveying state reports to stake holders. However, U, and such channels, are not trusted.
Ultimately, the components of the system-model specified by the deployer take the form of two types of records. Records of type B (binding records) specify binding between a sensor identity and the module identity responsible for authenticating reports from . Records of type R (or reporting records) specify the identity of the STCB module to which a report regarding a specific system-state is to be made.
3. STCB Design
The designer is entrusted with the responsibility of describing function
where is evaluated whenever a fresh measurement from sensor is available.
3.1. STCB Design Tree
The designer provides a specification of by constructing a static Merkle tree―the design tree―with root . The tree includes leaves
where is itself a root of a Merkle tree with n leaves. Each leaf specifies the initial state of n sensors as records . The other n leaves correspond to the n design records . The sensor records and design records are of the form
Each sensor is associated with a set of dynamic values, where w is a constant. Specifically, 1) is the latest measurement of sensor ; 2) is the time of the measurement; 3) are the w outputs of function , evaluated when the last report was received from sensor . The value is a measure of time associated with the outputs (and is not necessarily the same as ).
In the design record for sensor , value is the hash of a small number (say, m) of instructions chosen from the set . Specifically, the instructions define the function to be evaluated on receipt of a report from sensor . The value is a one way function of a set of (say, l) constants C. Such constants may specify various values like set-points, permitted ranges of measurements, minimum expected frequency of reports from sensors, etc. In addition, such constant values may also be used as look up tables. The values specify (up to) q related sensors. Such sensors are “related” to as the states of such sensors can influence . Some or all of the q values can be set to zero if less than q related sensors suffice. Finally, the value is optional, and is the identity of a “synthetic” sensor (explained later).
3.1.1. Example System
To describe different steps involved in the construction of a design tree we will use a simplified version of a thermal power plant with six sensors as a running example, see Figure 3.
temperature sensor inside boiler.
coal weight sensor (coal fed into the boiler).
position of fire regulator.
temperature inside turbine cell.
pressure inside turbine cell.
speed of turbine.
The sensor records are of the form
Figure 3. Example: simplified version of thermal power plant.
The design records are of the form
3.2. Inputs and Outputs of
Due to limited memory inside STCB modules, there is a need for a strict upper bound on the number of inputs to, and outputs of, each . In other words, irrespective of the total number of sensors n, note that s are restricted to specifying only 1) up to q related sensors, 2) l constants, 3) m instructions, and 4) one synthetic sensor as inputs. Each produces w outputs.
As is re-evaluated whenever a fresh report is available from sensor , the inputs necessary to evaluate are stored in reserved volatile registers inside STCB modules, and include
1) values in the record associated with sensor (stored in a register inside the module);
2) values in a fresh report from sensor (register r);
3) values in records for related sensors (registers );
4) constants in (register c); and
5) instructions (some of which may be set to 0 to represent “no operation” if m instructions are not required to evaluate ).
The m logical operations in provide the instructions to recompute the outputs of following a fresh report from . On evaluation of the record is modified. Specifically,
1) replace the previous values ,
2) outputs replace outputs of the previous execution of ,
3) and is replaced with
to reflect the staleness of the w outputs.
Note that dynamic values associated with any may be affected not just by values corresponding to sensors directly related to , but also sensors indirectly related to , for example sensors related to a related sensor (once removed) or sensors related to a sensor related to (twice removed) and so on. Computing the value as in Equation (5) ensures that the value will be the least of the sensor-report times t corresponding to every sensor that is directly or indirectly related to sensor .
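The propagation of staleness through directly and indirectly related sensors can be sketched as follows. The sketch assumes that the staleness time is the minimum of the sensor's own report time and the staleness times stored in the records of its directly related sensors; this is inferred from the description above and is not a reproduction of Equation (5). The names `SensorState`, `tau` and `on_report` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SensorState:
    t: int = 0    # time of the latest measurement from this sensor
    tau: int = 0  # time associated with the outputs (staleness)

def on_report(own, report_time, related):
    """Update a sensor record on a fresh report (assumed min-rule)."""
    own.t = report_time
    own.tau = min([report_time] + [r.tau for r in related])
    return own.tau

# Chain of relations: s1 is related to s2, and s2 is related to s3.
s1, s2, s3 = SensorState(), SensorState(), SensorState()
on_report(s3, 90, [])      # s3 reports at time 90
on_report(s2, 100, [s3])   # s2 inherits the older time from s3
on_report(s1, 120, [s2])   # s1 inherits it through s2, once removed from s3
```

Although s1 never reads the record of s3 directly, its staleness time still reflects the oldest report in the chain.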
On a continuous basis, as and when new sensor reports are available, the states of the reporting sensors are modified. A subset of dynamic values corresponding to a subset of sensors may be reported to the stake-holder as values describing the state of the system. For example, if value (second output of ) is one of the values reported as the state of the system, the time associated with the state is reported as .
3.2.1. Example for Power Plant
For the example system of Section 3.1.1, let the maximum number of related sensors be ; the number of outputs of each be ; and the number of constants . In this example, say there exists a rule that the values of sensors have to be within threshold ranges respectively. The design of function checks if are within thresholds respectively. The inputs are and constants. The output is written in . For evaluation of , is specified as a related sensor. As no other related sensors are used, and are set to 0.
As is re-evaluated whenever a fresh report is available from sensor , the inputs necessary to evaluate are stored in reserved volatile registers inside STCB modules, and include
1) values in the record associated with sensor (stored in a register inside the module);
2) values in a fresh report from sensor (register r);
3) values of record is specified for for related sensors, and and are set to 0 (registers );
4) constants in (register c); and
The m logical operations in provide the instructions to recompute the outputs of following a fresh report from . On evaluation of the record is modified. Specifically,
1) replace the previous values ,
2) output replaces output of the previous execution of ,
3) and is replaced with
to reflect the staleness of the w outputs.
3.3. Synthetic Sensors
The sensors can be of three types―real sensors, state-report sensors, and synthetic sensors.
Real sensors are physical sensors in the SCADA deployment. Specifically, during the STCB deployment phase, real sensors are bound to SMs.
State reports from a foreign STCB system are seen by the receiving CM as a “sensor” report; as such reports are authenticated by the CM of the foreign deployment, state-report sensors are bound to foreign CMs.
Synthetic sensors are not bound to CMs or SMs. In a design record , if , implies that evaluation of results in the “synthesis of a fresh report from a (synthetic) sensor ”. Just as a fresh report from a sensor should be followed by evaluation of , a fresh report from synthetic sensor should be followed by evaluation of .
The primary motivation for using such synthetic sensors is to cater for complex where the fixed number of (m) instructions in may be insufficient. By specifying a synthetic sensor , evaluation of is continued as evaluation of . Similarly, evaluation of , specified by the designer as may be continued again, if necessary, by specifying .
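The continuation mechanism can be pictured as a simple chain walk. In the sketch below, each design record is reduced to a pair (step function, identity of the next synthetic sensor), with 0 standing for "no synthetic sensor"; this encoding, the sensor identities 1, 7 and 8, and the function `evaluate_chain` are assumptions made purely for illustration, and the chain is assumed to be acyclic.

```python
def evaluate_chain(design, sensor_id, value):
    """Evaluate the function of sensor_id, then continue through synthetic sensors."""
    evaluated = []
    while sensor_id != 0:
        step, synthetic_id = design[sensor_id]
        value = step(value)          # a bounded number of instructions per step
        evaluated.append(sensor_id)
        sensor_id = synthetic_id     # a "fresh report" from the synthetic sensor
    return value, evaluated

# Sensor 1 is real; 7 and 8 are synthetic continuations of its computation.
design = {
    1: (lambda x: x + 1, 7),
    7: (lambda x: x * 2, 8),
    8: (lambda x: x - 3, 0),
}

result, order = evaluate_chain(design, 1, 10)
```

Each step stays within the fixed instruction budget, while the chain as a whole computes a longer function.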
3.4. Constants and Look-Up Tables
In general, the value ―which is a one way function of constants required to evaluate ―may be a function of multiple sets of l constants (l constants in each set). More specifically, is itself the root of a Merkle tree, where each leaf specifies a set of l constants. Any number of such leaves may exist, with a minimum of one.
Permitting an unlimited number of constants facilitates the use of look-up tables (LUT) for evaluating . An LUT for evaluating a complex function will have many sets of l constants―say where there are no practical limits on j. In each set two of the l constants will specify the range of the independent variable x, and one will specify the corresponding dependent variable y. For 2 dimensional LUTs of the form , four of the l constants will specify the ranges for the two independent variables, and a fifth constant will specify the corresponding value of y.
Special instructions (say LUT1 and LUT2) in the instruction set will specify the operands―the dependent and independent variables. As one possible design of the two instructions, instruction LUT1 interprets constants and as the range of the independent variable x and constant as the corresponding dependent variable y. Before the module executes the instruction LUT1, it expects the value of the input operand to be within the range of constants and ―else the execution will not proceed. If the input operand satisfies the requirement, then the value of the output operand is set to . Similarly, for LUT2, constants and specify the range of the first input operand ; specify the range of the second operand ; is the corresponding output y.
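A minimal sketch of one possible LUT1/LUT2 behaviour follows. It assumes that the range bounds and the output occupy the first positions of the constant set, that ranges are inclusive, and that a failed range check is signalled by an exception; none of these details are fixed by the description above. Verification of the supplied leaf against the root of the constants tree is omitted here.

```python
def lut1(leaf, x):
    """One-dimensional look-up: leaf = (low, high, y, ...)."""
    low, high, y = leaf[0], leaf[1], leaf[2]
    if not (low <= x <= high):
        raise ValueError("operand outside the range of the supplied LUT leaf")
    return y

def lut2(leaf, x1, x2):
    """Two-dimensional look-up: leaf = (low1, high1, low2, high2, y, ...)."""
    low1, high1, low2, high2, y = leaf[:5]
    if not (low1 <= x1 <= high1 and low2 <= x2 <= high2):
        raise ValueError("operands outside the ranges of the supplied LUT leaf")
    return y

# Illustrative table (made-up values); the untrusted manager picks the leaf.
table = [(0, 99, 10), (100, 199, 25), (200, 299, 45)]
y = lut1(table[1], 150)

leaf2d = (0, 50, 100, 200, 7)
y2 = lut2(leaf2d, 30, 150)
```

Because execution does not proceed when the operand is out of range, a manager that supplies the wrong leaf can stall the computation but cannot make the module emit a wrong value.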
3.5. Instruction Set
Each instruction in specifies a logical operation (opcode), input operands (1, 2 or 3) depending on the type of opcode, and an output operand. The operands are restricted to be values in STCB registers , c, r, etc. Specifically, as the instructions in each can modify only values in register (the current state of sensor when is computed), only such values, and a temporary register T can be specified as output operands.
Examples of simple logical operations include traditional operations like addition/subtraction, logical operations, bit-wise operations, COPY, MOV, etc., and some special instructions like LUT1 and LUT2. Other potentially useful special instructions for SCADA systems are a bounds checking operation CHKB, which checks that a specific value is within set-points specified as constants, and a tolerance checking operation TOL, in which two values are verified to be close enough, that is, within a tolerance specified by a third value.
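The two special checks can be sketched as follows. The sketch assumes inclusive bounds and a 1/0 result that can be combined with a bit-wise AND; the operand order and all numeric values are made up for illustration and are not part of any STCB specification.

```python
def chkb(value, low, high):
    """Bounds check: 1 if low <= value <= high, else 0 (inclusive bounds assumed)."""
    return 1 if low <= value <= high else 0

def tol(a, b, tolerance):
    """Tolerance check: 1 if a and b differ by at most `tolerance`, else 0."""
    return 1 if abs(a - b) <= tolerance else 0

# Made-up readings: a temperature, a pressure, and a measured vs. expected speed.
temperature_ok = chkb(540, 500, 600)
pressure_ok = chkb(171, 150, 180)
speed_ok = tol(2990, 3000, 25)

acceptable = temperature_ok & pressure_ok & speed_ok
overheated = chkb(620, 500, 600)
```

Combining single-bit results with AND mirrors how a one-bit "acceptable/unacceptable" system state can be assembled from individual checks.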
Ultimately, a comprehensive specification for STCB modules will fix values like the number of related sensors q, number of outputs w, and the number of constants l (and hence the number of addressable values in the STCB registers). Such a specification will also include a detailed listing of all permitted opcodes and their interpretation. This paper, however, is restricted to describing some of the salient features of STCB modules.
3.5.1. Instructions for Thermal Power Plant Example
For the example system 3.1.1, let be the values reported by sensors required to determine the state of the system.
Let us assume that the state report expected by the stake-holder is a single bit value― if the system is in an acceptable state, and otherwise. According to the designer, the system is in an acceptable state if the following conditions are satisfied:
1) are all within thresholds respectively, where represents lower and higher thresholds for sensor .
2) The speed of turbine should be between upper and lower limits depending on the temperature and pressure inside the turbine cell. where is another threshold (the speed of the turbine should be a specific function of the pressure and temperature inside the turbine cell).
3) The position of the fire regulator should be between upper and lower limits depending on the current speed of turbine and the current temperature and pressure values inside the boiler cell. where is a function of the temperature and pressure of the boiler.
Let the maximum number of related sensors be ; the number of outputs of each be ; and the number of constants . A possible design of functions is as follows:
1) checks if are within thresholds respectively. The inputs are and constants. The output is written in . For evaluation of , is specified as a related sensor. As no other related sensors are used, and are set to 0.
2) performs LUT2 operation for function . The inputs are and an LUT leaf. The output is written . For chaining the output of (now stored in register ) to entire system state, the current value at output register of is copied to ‘s output register .
3) regards and as related sensors, and perform LUT2 operation on (an output of related sensor ) and value of other related sensor (now stored in register . The contents of output register of are copied to of.
4) checks if are within thresholds respectively. The inputs are and constants. The output is provided in register. is specified as a related sensor of.
5) regards as a related sensor and performs a LUT2 operation on the value of the related sensor (value u in record) and (u in register); the output is stored in. The contents of register (of) are copied to (of).
6) regards and as related sensors. This function checks if are within thresholds respectively and the output is stored at register of. In addition, also performs the following steps:
a) if available at satisfies―the result of is now available at of related sensor; the result of the check is stored at of. The result of an AND operation performed on outputs in and is stored back in;
b) if available at satisfies―the output of is available at of related sensor; the 1/0 result is stored in register.
A result of AND operation of output registers of―, placed in of represents the entire state of system (acceptable-1, unacceptable-0).
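The overall evaluation described above (range checks, 2-D look-up-table checks, and a final AND) can be sketched as follows. This is only an illustrative sketch: the sensor names, the bucket edges, and the helper names `in_range`, `lut2` and `system_state` are hypothetical, and for brevity the regulator limits here depend only on the boiler temperature and pressure.

```python
from bisect import bisect_right

def in_range(value, low, high):
    """Return 1 if low <= value <= high, else 0 (a single-bit result)."""
    return 1 if low <= value <= high else 0

def lut2(table, row_edges, col_edges, x, y):
    """Look up a (low, high) pair in a 2-D table indexed by buckets of x and y."""
    i = min(bisect_right(row_edges, x), len(table) - 1)
    j = min(bisect_right(col_edges, y), len(table[0]) - 1)
    return table[i][j]

def system_state(r, thresholds, speed_lut, regulator_lut):
    """AND together all checks; 1 = acceptable, 0 = unacceptable."""
    # Condition 1: every directly thresholded sensor is within its limits.
    bits = [in_range(r[name], lo, hi) for name, (lo, hi) in thresholds.items()]
    # Condition 2: turbine speed limits depend on turbine temperature and pressure.
    lo, hi = lut2(*speed_lut, r["turbine_temp"], r["turbine_pressure"])
    bits.append(in_range(r["turbine_speed"], lo, hi))
    # Condition 3: regulator limits depend on boiler temperature and pressure.
    lo, hi = lut2(*regulator_lut, r["boiler_temp"], r["boiler_pressure"])
    bits.append(in_range(r["regulator_pos"], lo, hi))
    return int(all(bits))
```

Each LUT argument is a hypothetical `(table, row_edges, col_edges)` triple; in the STCB design the corresponding table entries would be supplied as leaves that can be proved against the roots named in the design record.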
The constants required to evaluate are provided as a leaf that can be proved against corresponding root. The designer specifies the following constant trees:
1) A tree with one leaf with 8 constants with root.
2) A tree with one leaf with 8 constants with root.
3) Three trees―one for a 2D LUT for function with root; one for a 2D LUT for function with root; and the third for 2D LUT for function with root.
With available information from the designer specifies the following design records:
where are hashes of instructions outlined in Table 1.
Table 1. Instructions for thermal power plant.
4. STCB Deployment
The deployer of the SCADA system is trusted to verify the integrity of the physical bindings between various sensors and SMs. Specifically, the deployer is required to permanently connect the outputs of every sensor to an SM, and apply tamper-evident seals to such connections. The deployer specifies binding records of the form
to convey that
1) measurements corresponding to sensor will be reported by a module; if is a real sensor, then is an SM; if is a state-report sensor, then is the identity of the CM of a foreign system. Recall that synthetic sensors are not bound to CMs or SMs and thus have no binding records. The deployer may be totally oblivious of the existence of such records;
2) is an achievable minimum round-trip duration between module, and the CM for the deployment; and
3) implies a record corresponding to a real sensor; implies report from a foreign STCB system with STCB descriptor.
To indicate that a module was deployed as the CM for the STCB system, the binding records include a record for the CM as
Depending on the requirements specified by the stake-holder, the deployer also specifies reporting records of the form
to indicate that the value (corresponding to sensor) should be reported to the stake-holder module, and that the report should indicate as the latest (time) “measurement” from state-report sensor.
All such and are included as leaves of a static Merkle hash tree constructed by the deployer―the STCB deployment tree. Let be the root of the deployment tree with leaves
The end result of the design and deployment processes is a pair of hash trees with roots and. The design root can be seen as a concise representation of the physics of the system. The deployment root is a concise representation of the bindings between real sensors and SMs, and between state-report sensors and CMs of foreign STCB deployments. The value
can now be seen as the root of a Merkle tree with two sub-trees―the design tree to the left, with root, and the deployment tree to the right, with root. The static value is the unique descriptor for a specific STCB deployment (deployed to secure a specific SCADA system), see Figure 4.
Note that two different deployments of identical SCADA systems may have the same design root, but will have different deployment roots, as different STCB modules will be used in the two deployments. If the hash function is collision resistant, no two STCB deployments will have the same descriptor.
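As a rough illustration of how the descriptor combines the two roots, the sketch below hashes a design root and a deployment root into a single value. SHA-1 is assumed only because it yields a 160-bit output; the leaf encoding, the duplication of the last node on odd levels, and the names `merkle_root` and `stcb_descriptor` are assumptions made for this example, not the construction specified by the designer.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    """Hash the concatenation of the given byte strings (SHA-1 assumed)."""
    return hashlib.sha1(b"".join(parts)).digest()

def merkle_root(leaves):
    """Root of a binary hash tree over a non-empty list of byte-string leaves."""
    level = [H(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:              # assumption: duplicate the last node
            level.append(level[-1])
        level = [H(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def stcb_descriptor(design_leaves, deployment_leaves):
    """Design tree as the left sub-tree, deployment tree as the right."""
    return H(merkle_root(design_leaves), merkle_root(deployment_leaves))
```

Two deployments that share `design_leaves` but bind different modules in `deployment_leaves` obtain different descriptors under this sketch, which mirrors the observation above.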
Figure 4. Static descriptor is a concise summary of the specification for an STCB system.
4.1. STCB Operation
To commence operation in a SCADA deployment, the STCB module associated with the system is initialized with the STCB descriptor, and the value corresponding to the initial state of sensors. All leaves of the tree with root (which includes two sub-trees―the design tree and the deployment tree) are stored by U; all n leaves of the form corresponding to the initial states of all sensors are also stored by U.
During regular operation of the system the STCB manager U receives authenticated sensor reports from SMs (and possibly CMs from foreign deployments), and submits them one at a time, to the CM. Such reports take the form of a message authentication code (MAC) computed as
1) values indicate a report for sensor S to convey a fresh measurement v and measurement time t;
2) c is the clock-tick value of the module that created the report;
3) is STCB descriptor of the creator of the report (which was the value used to initialize the module);
4) is a shared secret between the creator (SM or foreign CM) and receiver (CM).
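The structure of such an authenticated report can be sketched as follows. HMAC-SHA1 from the Python standard library stands in here for the module's built-in hash-based MAC, and the fixed-width field encoding and the names `report_mac` and `verify_report` are assumptions made for illustration only.

```python
import hashlib
import hmac
import struct

def report_mac(key, sensor_id, value, meas_time, clock_tick, descriptor):
    """MAC over (S, v, t), the creator's clock tick c, and its STCB descriptor."""
    # Fixed-width packing avoids ambiguity between adjacent fields.
    msg = struct.pack(">IqQQ", sensor_id, value, meas_time, clock_tick) + descriptor
    return hmac.new(key, msg, hashlib.sha1).digest()

def verify_report(key, sensor_id, value, meas_time, clock_tick, descriptor, tag):
    """Receiver-side check using the same shared secret."""
    expected = report_mac(key, sensor_id, value, meas_time, clock_tick, descriptor)
    return hmac.compare_digest(expected, tag)
```

Because the descriptor is part of the MAC input, a report created by a module initialized for one deployment does not verify against a different descriptor.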
Along with the report from sensor (authenticated by a module), U also submits a binding record constructed by the deployer, consistent with the static root.
The STCB manager U is then required to submit other values required for the CM to evaluate. Such values include: 1) sensor state record; 2) the state records of the (up to) q related sensors; 3) a design record consistent with descriptor; 4) l constant values; and 5) m instructions.
As dynamic sensor states associated with the n sensors are maintained by U as leaves of a dynamic Merkle tree with root, sensor state records provided by U will be accepted as valid only if they can be verified to be consistent with a copy of the dynamic root stored inside the CM. The instructions will be accepted as valid only if where is specified in the design record. The set of constants will be accepted as valid only if can be demonstrated to be a node in the binary tree with root, where is specified in the design record.
The values used for evaluating have fixed reserved locations in the internal memory of the STCB modules, and are specified as the operands for the m instructions in A. An internal module function executes every instruction sequentially. The end result is the modification of the values in the sensor state record of. To “remember” such changes to the module modifies the dynamic root.
If untrusted U does not modify in the same manner, then will no longer be consistent with the root stored inside the CM. Thus, if U provides a fresh sensor report to the CM to invoke, it is forced to modify the state of exactly in the manner specified by the designer.
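The way the stored root forces U to track the module's update can be sketched as follows. The module first checks that the old record is consistent with its stored root, and then derives the new root from the new record and the same complementary nodes. SHA-1 and the names `climb` and `module_update` are assumptions for this example.

```python
import hashlib

def H(a: bytes, b: bytes = b"") -> bytes:
    """Hash of a, or of the concatenation a || b (SHA-1 assumed)."""
    return hashlib.sha1(a + b).digest()

def climb(node, path):
    """Hash from a node up to the root using (sibling, sibling_on_left) pairs."""
    for sibling, sibling_on_left in path:
        node = H(sibling, node) if sibling_on_left else H(node, sibling)
    return node

def module_update(stored_root, old_record, new_record, path):
    """Return the new root, or fail if the old record does not match the root."""
    if climb(H(old_record), path) != stored_root:
        raise ValueError("state record inconsistent with stored root")
    return climb(H(new_record), path)
```

If U keeps a stale copy of the record, the next call fails the consistency check, so U can continue only by applying exactly the same modification to its own tree.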
At any time, a reporting record consistent with can be provided as input, along with the state record consistent with, to request the STCB module to report (to module) values and in a state-report “sensor”.
If U does not invoke functions to evaluate any, the time associated with cannot be updated. Thus, in any state-report that directly or indirectly depends on sensor the time will be stuck at, and will thus be recognized as stale by the stake-holder (or CM of a foreign deployment).
4.1.1. STCB Interfaces
To interact with the STCB modules (SMs and CM), U employs various interfaces exposed by the modules. An interface is used to set up shared keys between modules that send and receive sensor/state reports. A function can be invoked to engage two modules in a hand-shake sequence within a duration, to enable the modules to estimate their respective clock offsets with an error less than. Interface is used to initialize a module as a CM for a deployment with identifier, or as a SM for a specific real sensor (in an STCB deployment with identifier).
A function is invoked to request a module to send a report to another module, in which is invoked to accept the report. Function is used to load various values necessary to evaluate some. Function is then invoked to evaluate and update the root stored inside the module. Functions and are utilized only in modules used as CMs.
To the extent the SMs and the deployer (who is trusted to verify and specify bindings between sensors and SMs) are trusted, we can trust the authenticity of sensor reports provided to the CM module. To the extent the CM module, and the designer (who is responsible for specifying the functions) are trusted, the stake-holder trusts the integrity of the state reports.
In practice, the SMs will need to be located as close as possible to the sensors to improve the security of the binding between sensors and SMs. The CM could be housed in any location―for example, a secure location far removed even from the SCADA control center. Components of U will need to be housed close to SMs, and close to the CM.
5. STCB Architecture
STCB modules have a unique identity, and a unique secret issued by a trusted key distribution center (KDC). Two modules (say) and can use their respective secrets (say) and issued by the KDC to compute a common pairwise secret  . Specifically, associated with a pair of modules is a non-secret value which is also made available by the KDC (for example, in a public repository), where is computed as
Module and can compute a common secret,
Every module possesses three values that are affected whenever a module is powered on:
1) a clock tick counter c, which is set to 0;
2) a non-volatile session counter, which is incremented; and
3) a random secret, which is spontaneously generated inside the module.
5.1. Module Registers
Non-volatile storage inside the module is used to store three values: the secret Q issued by the KDC, the session counter, and the module identity M.
Every module has the following reserved volatile registers:
[Register listing: peer module params; sensor report register; sensor state register; related sensor states]
The self-secret spontaneously generated inside the module remains privy only to the module. This secret is used for computing self-certificates. A self-certificate is a “memorandum to self”: a memorandum issued by the module for verification by itself at a later time, during the same session.
The register is the (160-bit) descriptor of the STCB system. The register is the dynamic root of a Merkle tree. Register p contains various parameters regarding a peer module from which 1) sensor/state report has to be received, or 2) a report has to be sent.
The register c is reserved for storing a set of l constants to be used to evaluate some corresponding to a sensor. Values that reflect the current state of should be stored in register. The states of q related sensors are stored in registers. Register r is reserved for a freshly received sensor report from.
For example, if at some instant of time,
1) the contents of the dynamic record corresponding to sensor is stored in location, and
2) records corresponding to sensor are stored on location respectively, and
3) values from a fresh report from sensor are stored in register r, then
The SM register indicates the sensor to which the module is bound (if the module is used as a SM) or is set to zero (if the module is used as a CM). If the module is used as an SM, the register always contains the (dynamic) sensor measurement.
STCB modules have a built in hash function which is reused extensively for binary tree computations, computing shared secrets, computing message authentication codes, and self-certificates.
An in-built function in every module can execute a set of m instructions, where each instruction (chosen from the set) identifies a) an opcode (type of logical operation), b) one or more input operands (from the values stored in registers, , c, T), depending on the type of opcode; and c) the output operand (, or temporary register T).
5.2. Initializing Peer Parameters
The pairwise secret K that a module M shares with a peer module is used for computing message authentication codes (MAC) for outgoing messages to peer, and for verifying incoming MACs from peer. Specifically, the secret used by M for computing outgoing MACs is where is the session counter of M; consequently, the secret used for verifying MACs received from is, where is the session counter of.
STCB modules possess reserved registers to store the identity of a peer module (to which it needs to send a message, or from which it needs to receive a message), the session counter of the peer, and MAC secrets and. Function exposed by a module can be invoked to populate values related to a peer module.
To facilitate secure communications between two modules and, should be invoked on, and should be invoked on. computes the pairwise secret using the public value; computes the same value without using (or as XORing with 0 results in no change).
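One way such a pairwise secret could be realized is sketched below. The exact formulas are assumptions made for illustration: one module derives the secret by hashing its own KDC-issued secret with the peer identity, while the other hashes its secret with the first module's identity and XORs the result with the public value published by the KDC. The names `kdc_public_value` and `pairwise_secret` are hypothetical.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    """Hash the concatenation of the given byte strings (SHA-1 assumed)."""
    return hashlib.sha1(b"".join(parts)).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def kdc_public_value(secret_i, id_i, secret_j, id_j):
    """Non-secret value published by the KDC for the pair (i, j)."""
    return xor(H(secret_i, id_j), H(secret_j, id_i))

def pairwise_secret(own_secret, peer_id, public_value=bytes(20)):
    """With the default all-zero public value the XOR leaves the hash unchanged."""
    return xor(H(own_secret, peer_id), public_value)
```

In this sketch only one side of the pair needs the public value; the other side passes zero, which matches the remark that XORing with 0 results in no change.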
5.3. Self Certificates
Two types of certificates are computed by STCB modules―binary hash tree certificates, and offset certificates.
5.3.1. Binary Tree Certificates
A binary hash tree certificate is computed as
Such a self-memorandum states that “x is a node in a binary hash tree with root y”, and “if then”.
STCB modules expose a function which evaluates a sequence of hash operations, and outputs a binary tree certificate.
A function concatenates two such certificates to create another certificate. Specifically, a certificate binding node x (and) to an ancestor y (and) and a certificate binding a node y (and) to an ancestor z (and) can be combined into a certificate binding node x (and) to an ancestor z (and).
The primary need for the function is due to restrictions on the size of inputs to module functions. Specifically, we can now place a hard limit on the size of the input―to (say) 8 hashes. For computing relationships between a node and the root of a tree with a million leaves (20 levels) three calls to (to produce certificates binding 1) a level zero node to a level 8 node, 2) level 8 node to a level 16 node, and 3) a level 16 node to a level 20 node) and two calls (to combine the first two certificates, and combine the resulting certificate with the third certificate) can be used.
5.3.2. Offset Certificates
An offset certificate is computed as
and states that the module M (that issued the certificate) had performed a handshake within a duration with a module, and had estimated the offset between their clocks to be. The certificate also states that the handshake was performed when its session counter was and the session counter of was. The offset certificate is issued by a function exposed by modules.
The function can be invoked on a pair of modules to perform a handshake, after which the initiator of the handshake obtains an estimate of the clock offset of the responder. Before is invoked, should be invoked on both nodes to set up respective peer identities, session counters, and secrets and to be used for incoming and outgoing MACs.
The function has three inputs―a received MAC (from peer) with time stamp, and a time-stamp that was previously sent to peer (which had triggered the response from). The output of is either a MAC intended for the peer or a self-MAC intended for itself, indicating the estimated offset for peer.
is first invoked on the initiator with all inputs set to zero; the output of is where and are the current clock-counter and the session counter of the initiator and is the session counter of the responder.
is then invoked in the responder module with inputs. If the clock tick count of the responder is, the output is.
is then invoked on the initiator for the second time, at time, with inputs. The offset between the clock of the initiator and the responder can be estimated by the initiator to within the round-trip duration. The best estimate of the initiator is that, when the clock tick count of responder was, the clock tick count of the initiator was, and thus, the best estimate of the offset is.
The output of in this case is the offset certificate. This certificate can be provided to the module at any time to convince the module that “the offset to was estimated as with a tolerance of”, and that “the offset to was estimated when the session counters of the modules were and”. If either of the two session counters has changed since the certificate was issued, the certificate becomes invalid.
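The arithmetic behind the offset estimate can be sketched as follows, under the assumption that the initiator takes the midpoint of its send and receive times as its own clock value at the moment the responder read its clock. The sign convention (responder minus initiator), the use of the full round-trip duration as the tolerance, and the name `estimate_offset` are assumptions for this example.

```python
def estimate_offset(t_sent, t_peer, t_received, max_round_trip=None):
    """Estimate the responder's clock offset from one challenge-response exchange.

    t_sent      initiator clock tick when the challenge was issued
    t_peer      responder clock tick reported in the response
    t_received  initiator clock tick when the response was processed
    Returns (offset, tolerance) in clock ticks.
    """
    round_trip = t_received - t_sent
    if round_trip < 0:
        raise ValueError("response precedes challenge")
    if max_round_trip is not None and round_trip > max_round_trip:
        raise ValueError("handshake exceeded the permitted duration")
    midpoint = (t_sent + t_received) // 2   # initiator's clock at the responder's reading
    return t_peer - midpoint, round_trip
```

A shorter round trip gives a tighter tolerance, which is why the binding record carries an achievable minimum round-trip duration for each module.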
5.4. Initializing STCB Modules
Initializing an STCB module M implies initializing three internal registers reserved for values and. Specifically, a module M can be initialized to participate in a deployment only if a binding record for module M can be demonstrated to be consistent with.
As is a node in a tree with root, U can use to obtain a certificate
Similarly, a binding record that exists in the deployment tree is used to initialize the register. Now U can use to obtain a certificate
Function can be used to initialize a module M as
1) a CM for a deployment or
2) an SM for a sensor S in deployment.
To initialize a module as a CM for the deployment, the inputs to are such that is a binary tree certificate relating a node and root (inputs and are set to 0).
To initialize the module as an SM, the inputs and are non-zero. The binary tree certificate should relate and to prove to the module that “in an STCB system with descriptor, the module M (which is being initialized) is authorized to report measurements corresponding to sensor”. Accordingly, the register in the module M is set to.
For a module initialized as an SM for a sensor, the deployer physically connects the output of the sensor to module using a tamper-evident seal. The physical connection ensures that the sensor measurement is always available in the register of the SM. Later (during regular operation), module cannot be initialized to act as an SM for any other sensor, as no record binding to can be demonstrated to be a part of the tree with root.
During regular operation, any dynamic sensor record can be loaded on to any register or using function. A record s provided as input is simply loaded onto register where j is the index specified. Specifically, the record is loaded only if the inputs and are consistent with dynamic root.
5.5. Sensor and State Reports
In an STCB deployment with SMs, and STCB module, the state reports are made available to a module associated with the stake-holder. In general, can be seen as an STCB module associated with a different STCB system at a higher level of hierarchy.
The handshake sequence (which involves two calls to in the initiator module and one call in the responder module) is orchestrated by U between
1) k responders, with (as initiator)
2) as initiator and as responder.
After the hand-shake sequences have been completed, k self-certificates of type OS are created by―one corresponding to each SM, and one self-certificate is created by. Such certificates indicate both the estimated offset os, and the maximum error in the estimate os.
Now modules are ready to exchange authenticated messages. More specifically, SMs send authenticated and time-stamped sensor reports to, and STCB module can send state reports to. Such messages exchanged between modules are computed as
where for sensor reports (from SMs to)
1) is the identity of a sensor that is bound to the module that created the report, and;
2) is the current clock tick count of the creator of the report;
3) is the value used to initialize the module.
A report from to stake holders is made in accordance with a record. Such a report can be created for the benefit of only if the state of sensor is loaded in register (or), to report the value and time. In such a report is a label assigned to the report, , and is the time associated with (or).
When is invoked in a SM (with register) the output is. When invoked on an STCB module, this function should be able to verify the existence of an appropriate reporting record R that authorizes the module to report one of the w values stored in register.
For a state report corresponding to a “sensor”, the STCB module that generates the report is initialized with its own descriptor, while the stake-holder module (which in general can be seen as the STCB module associated with a foreign STCB system) may be initialized with a different descriptor. For the module to accept the report from a foreign system, the deployment tree in the foreign system with root should include a binding record
Function can be invoked in the STCB module to provide a fresh sensor report to the module. Specifically, before a sensor report from a module can be provided to a module, the function should be invoked on the receiving module to set. Now,
1) inputs to are used to compute the leaf hash of a binding record.
2) input is used to confirm that (the hash of the binding record) is a node in a tree with root.
3) inputs and are used to verify that.
4) and inputs and are used to verify that where if, or if.
The function returns error if, or on failure of verification of inputs or or.
Ultimately, the purpose of function is to receive two values and time corresponding to a sensor where is the offset corrected time associated with. Values and are then stored in a reserved register R for further processing.
5.6. Sensor Updates and Incremental State Evaluations
Values r in a fresh sensor report from are part of the inputs used to evaluate. Evaluation of results in modifications to the state of sensor. Before can be evaluated it should be ensured that appropriate values are loaded on to registers r, , , and c.
Recall that register r is populated by function. Function can be used to load the dynamic values S associated with any sensor on to any of the registers,.
The main purpose of is to evaluate corresponding to a sensor. For this purpose, verifies that all inputs required to evaluate are available. If a design record corresponding to sensor is
it should be ensured that the current state of sensor is loaded onto register, and the states of related sensors are loaded on to registers respectively. Other values required to evaluate are provided as inputs to.
1) the constants C should be such that is node in a tree with root. This can be demonstrated by providing a certificate.
2) the instructions A should be such that.
3) and should exist in the design record. More specifically, should be a node in the tree with root. This can be demonstrated by providing a certificate.
4) the values (the identity of the sensor to be updated) and (the sensor from which a fresh report has been received) should be the same.
During execution of the algorithm A, in situations where many options exist for choosing the set of constants C consistent with it is the responsibility of U to choose the correct set of constants that satisfy the range of the independent variable(s). On successful evaluation of the algorithm A the status of sensor in register will be updated. If x is the hash of register before the update, and if is the hash of register after the update, then a new root and a certificate should be provided as input satisfying
6. STCB Protocol
The STCB protocol can be seen as the actions to be performed by the untrusted STCB manager U to submit sensor reports from SMs and CMs of foreign deployments to the CM of the STCB deployment, obtain state-reports from, and submit such reports to stake holders (or CMs of foreign deployments).
6.1. Generation of Offset Certificates
The first step in the operation of an STCB deployment is that of performing handshakes between various modules to obtain offset certificates. In general one offset certificate will be generated for every module specified in the binding records of the deployment.
For an STCB system with n sensors (real, state-report and synthetic), the total number of binding records is, where is the number of synthetic sensors: no binding record exists for synthetic sensors, and one binding record is for the CM. The total number of distinct modules in general will be. Specifically, while there will exist one module corresponding to every real sensor, a single foreign CM may report multiple states, so the number of state-report sensors may be greater than the number of foreign CMs that provide state reports.
A total of handshake sequences will be invoked to obtain offset certificates. Recall that each such sequence begins with a challenge from the CM generated using to which a response is generated by invoking in the responder module, and finally, the response is submitted to the CM to generate the certificate. If any SM is rebooted, the offset certificate corresponding to the SM has to be regenerated. If the CM is rebooted, all offset certificates will need to be regenerated.
Before is invoked, should be invoked on both modules to set up the MAC secrets and.
6.2. Generating Static Binary Tree Certificates
The second step is for U to obtain binary tree certificates corresponding to all leaves of the static tree with root. Specifically, as U maintains the tree with root, U can readily provide the complementary nodes for any leaf in the static tree to function, and obtain certificates of the form
where is the cryptographic hash of the sth leaf of the static tree. The total number of static leaves is where
1) n is the number of design records: one for each sensor (real, state-report, or synthetic);
2) one leaf corresponds to the value in the design tree;
3) is the number of binding records (including one for the CM); and
4) m is the number of reporting records.
6.3. Initialization and Regular Operation
The third step is the initialization of STCB modules to operate in deployment, by invoking on each STCB module. To initialize the modules, two binary tree certificates are required: one linking to, and one linking a binding record for the module with the static root.
On completion of the three steps, the STCB manager maintains a dynamic Merkle tree with leaves as sensor records. As the initial values of such records are specified by the designer, the root of the tree should be the same as the initial value stored by the STCB module.
Once all STCB modules have been initialized, sensor reports from SMs (or CMs of foreign deployments) are submitted to the CM as and when they are received. In general, not all sensors may report at the same frequency.
As all SMs send messages only to the CM, needs to be invoked only once on each SM (which was already performed before invoking to generate offset certificates).
To create a sensor report from an SM, U will invoke on an SM. Some sensor reports corresponding to state reports from foreign CMs may also be received by U.
Once a sensor report for some sensor has been received, U is expected to make appropriate modifications to the sensor state, and accordingly, modify the Merkle tree maintained by U. Let be the change in the root of the dynamic tree, corresponding to the modification triggered by the received sensor report. If and, U can readily determine complementary nodes required to obtain the certificate
The STCB manager invokes followed by to submit the report to the CM. Recall that inputs to include a MAC received from an SM/CM, a binding record along with a certificate linking the record to, and an offset certificate corresponding to the reporting module.
Corresponding to the sensor state record for sensor and q related sensors, U invokes to obtain certificates binding the sensor state records to the dynamic root. Following this, the STCB manager uses up to times to load 1) the previous sensor state on to register and 2) the states of related sensors on to registers.
Finally, the STCB manager invokes. Recall that the inputs to include obtained as per Equation (25), a certificate binding design record to STCB descriptor, the set of m instructions, a set of l constants, and a certificate binding the constants to a value in the design record. Only if the modification computed by the CM is exactly the same as that performed by the STCB manager U will the update be successful in modifying the dynamic root stored inside the CM to.
At any time the STCB manager can invoke a to load a state record consistent with on to register. Now can be invoked to create a state-report. Note that when is invoked on a CM a certificate binding a reporting record to the static root should be provided as input.
The ever growing complexity of systems poses a severe threat―the possibility of hard-to-detect hidden functionality that can be exploited to take control of the system. Current strategies for securing SCADA systems are predominantly focused on development of suitable intrusion detection systems. Such security measures ignore the very real possibility of hidden functionality in the intrusion detection systems themselves.
In the proposed approach to secure SCADA systems only STCB modules are trusted to provide the assurance that “no attack will go undetected”. The proposed approach involves three stages―a design process carried out by a designer with good domain knowledge, a deployment process carried out by a security professional, and regular operation of the STCB system. The designer and deployer together specify a concise description of the system. During regular operation, an STCB module reports the state of a system identified as to stake-holders.
Some of the important features of the STCB approach that make it well suited for SCADA systems of any size include
1) the ability to support hierarchical deployments;
2) the ability to support any type of function―if necessary through the use of 1-D, or 2-D look up tables (which are also specified as leaves of the design tree); and
3) the ability to specify synthetic sensors.
Such features are intended to enable the use of STCB modules for securing any SCADA system.
The first prerequisite for deployment of STCB based security solutions is the actual availability of STCB modules/chips. Towards this end, the work that has been performed is a small first step: identification of a functional specification of such chips. In arriving at an appropriate functional specification, some of our main goals have been
1) reducing computational and memory requirement inside STCB chips;
2) reducing interface complexity (size of inputs and outputs to/from the STCB chips); and
3) simplifying the STCB protocol―which is a specification of a sequence of interactions with the STCB modules―to realize the desired assurances.
The proposed functional specification (for STCB modules) is merely a specification, and not the specification. Just as there are numerous ways to realize a block-cipher or a hash function, there are numerous ways to arrive at a “set of STCB functions” (which can be leveraged to realize the same assurances). The functional specification in this paper is however the first of its kind.
This research was funded by the Department of Homeland Security (DHS)-sponsored Southeast Region Research Initiative (SERRI) at the Department of Energy’s Oak Ridge National Laboratory.
1 Stuxnet was able to use about twenty zero-day vulnerabilities in a popular commercial SCADA-system design software to gain control over the plant. Stuxnet has the potential to turn off pumps, control actuators, and still report that everything is normal. Due to its popularity, this worm is freely available, and could lead to more drastic attacks upon re-engineering.
2 “A hacker calling himself ‘Prof’ posted screen shots from his computer showing him logged onto the control system of a water utility in the Texas town of South Houston”.
3 The SCADA system’s data acquisition server was infected through the corporate network. The entry point for the Slammer worm was discovered to be a laptop.