JTTs, Vol. 3, No. 3, July 2013
Reliable Train Network with Active Supervisor
Abstract: In this paper, a new reliable hierarchical model is suggested for a two-wagon train Networked Control System. Each wagon has a Controller that carries the control load and an Entertainment server that handles the entertainment. A supervisory controller runs on top of the two controllers and the two entertainment servers. Contrary to a similar model in the literature, the Supervisory node replaces a Controller as soon as it fails (Active Supervisor). All system states are analyzed and simulated using OPNET. It is shown that, for all states, this architecture has zero control packets dropped and the end-to-end delay is below the maximum target delay. A comparison between this Active model and the other model in the literature is presented. It is found that the entertainment in this new architecture is kept available for the passengers in more of the system states when compared to the architecture previously presented in the literature.

1. Introduction

In industrial and transportation systems, Networked Control Systems (NCSs) are currently widely applied [1-4]. Previously, deterministic protocols that ensure meeting critical real-time delays and no packet loss for the small control packets, such as Controller Area Network (CAN), PROFIBUS and PROFINET, were used [5,6]. Ethernet (IEEE Std. 802.3) [7], despite its non-deterministic nature, is a promising protocol that is being applied in NCS [8,9]. Packet scheduling and reformatting the Ethernet packet were the main approaches to overcome the non-deterministic nature of Ethernet [10,11]. Furthermore, Rockwell Automation, the ODVA, EtherNet/IP, TT Ethernet and FTT Ethernet have implemented different modifications to the protocol, some of which are in the course of standardization [12-16].

Train operation, safety, collision avoidance and exchange of information are the main tasks to be handled by train networks [17]. As the demand for entertainment services on board trains increases, Ethernet has become a promising technology.

Due to its large bandwidth, Gigabit Switched Ethernet was successfully tested as a one-wagon train network to carry both control and entertainment loads within a wagon [18]. The entertainment load is represented as video streams and running Wi-Fi applications [18]. The network model was further enhanced to increase its reliability at the Server level [19]. In [19], the authors used one server to handle each type of load. The control load was handled by the Control Server (Controller) and the entertainment was handled by the Entertainment Server. The Entertainment Server acted as a backup for the Controller; it would handle the control load in case of Controller failure [19]. Simulations proved that the requirements on the control packet end-to-end delays were met in both [18,19]. The reliability of the network was further enhanced and its performability was calculated [20,21].

In this research, a two-wagon train control network using unmodified Ethernet is presented. A hierarchical structure at the Server level, including an active supervisor, is simulated. As soon as a controller fails, this active supervisor replaces it and carries its load. The system is modeled using the OPNET network simulation tool [22]. The control packets are sampled at different sampling periods [23]. Furthermore, the entertainment load is simulated as compressed DVD video streaming and 4 different Wi-Fi applications: web-browsing, FTP, database and email access. Additionally, the network is simulated in all possible faulty server states as well as the fault-free state. It will be shown that the architecture with an active supervisor functions correctly irrespective of server failures. The architecture will then be compared to the one presented in [24].

The rest of this paper is organized as follows. Section 2 summarizes the previous work done in the field of hierarchical Ethernet train networks. Section 3 illustrates the newly proposed model. Simulated scenarios and their outcomes are discussed in Section 4. Section 5 concludes this paper.

2. Background

In [24,25], the authors proposed a Gigabit Ethernet train network using the unmodified IEEE 802.3 standard; a hybrid model was introduced in which multiple control sampling periods were used. More details about this hybrid model are presented below.

2.1. Hybrid Network Model

In the IEC 61375 Standard (Trains network Standard), different sensor/actuator sampling periods are specified [23]. In [18,19], only one sampling period is simulated per network scenario; one scenario uses the 1 ms sampling period while the other scenario uses the 16 ms sampling period. According to [23], 16 ms is the sampling period of the majority of train sensors/actuators and 1 ms is the smallest sampling period in a train network. In [24], the authors formulated the network to contain different sampling periods, specifically a combination of 16 ms and 1 ms.

A train wagon contains a total of 250 sensor/actuator nodes. In [18,19], only 1:1 sensor:actuator ratio is simulated, but in [24,25], the network contained more diverse combinations of sensors:actuators to simulate a more realistic scenario. The sensor/actuators were divided into 3 different groups with different sensor:actuator ratios.

2.2. Passive Supervisor

In [24], the authors presented a hierarchical control structure that adds a Passive Supervisor node to the 2 servers per wagon, resulting in a total of 4 servers and a supervisor. This node is assumed to be the most reliable node in the network; it acts as a backup for any controller after all other Servers/Controllers have failed.

In case the Controller fails in one of the wagons, the Entertainment Server drops its main functionality (handling the entertainment load) in order to handle the wagon’s control load. For the Entertainment Server to be able to handle the control load, the sensors send 4 streams of their data, one to each of the Controllers and Entertainment Servers. Only the Server handling the control load is responsible for making the control decision and sending the control action to the corresponding actuator. If both Servers in a wagon (Wi) fail (the Controller Ki and the Entertainment Server Ei), the Entertainment Server (Ej) of the other wagon (Wj) drops its entertainment load and handles the Wi control load [24].

If three of the four Servers fail, then the remaining operational server in a wagon (Wj) (either Controller Kj or Entertainment Server Ej) handles its own wagon load. This is also the point at which the Supervisor comes into action and handles the control load of the other, entirely failed wagon Wi. The sensors of both wagons start to send their data to the Supervisor node after the failure of three Servers. In case of the failure of all four servers, the Supervisor node handles the control load of both wagons; this is again under the assumption that the Supervisor will be the last to fail among all Servers/Controllers.

3. Proposed Network Model

The same network architecture presented in [24] is used in this study for comparison purposes. The network consists of 2 single wagon networks interconnected via a switch (Intermediate Switch) and a 10 Gb link to a Supervisor Node. This 2-wagon model represents the main train building unit such as the Siemens Desiro diesel or the Siemens electric multiple unit (DMU or EMU) [26]. The two-wagon train unit network model is illustrated in Figure 1.

There are 60 seats per wagon [27]. In each wagon, all nodes are connected to the wagon’s Main Switch (MS) via Gigabit Ethernet fabric. The forwarding rates of the two Main Switches and the Intermediate Switch are 6.6 Mpps [24]. This rate is much lower than 38.2 Mpps, the forwarding rate of the commonly available switches in the market such as the Cisco Catalyst 3560 Gigabit Ethernet switch [28].

Each wagon has 250 nodes (sensors/actuators) divided into 3 groups with different sensor:actuator ratios as in [24,25]. Group 1 (G1) has a 1:1 sensor:actuator ratio, Group 2 (G2) has a 2:1 sensor:actuator ratio and Group 3 (G3) has a 3:1 sensor:actuator ratio. There are 60 nodes in G1 running at a sampling period of 1 ms, 150 nodes in G2 and 40 nodes in G3. Nodes in G2 and G3 are running at a sampling period of 16 ms [24,25]. To simulate the worst end-to-end delay for the control packets, the sensors/actuators are located to ensure maximum propagation time.
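The group sizes and ratios above fix how many sensors and actuators each group contains, and hence how many sensor samples one wagon generates per second. The following is a minimal sketch of that arithmetic. It assumes that each group splits exactly according to its ratio and that every sensor emits one packet per sampling period per destination; the function names and the `copies_per_sample` parameter are illustrative and do not come from the paper.

```python
from fractions import Fraction

# Group definitions taken from the text:
# (node count, sensor:actuator ratio, sampling period in ms).
GROUPS = {
    "G1": (60, (1, 1), 1),
    "G2": (150, (2, 1), 16),
    "G3": (40, (3, 1), 16),
}

def split_nodes(total, ratio):
    """Split `total` nodes into (sensors, actuators) for a sensor:actuator ratio."""
    s, a = ratio
    unit, remainder = divmod(total, s + a)
    if remainder:
        raise ValueError("node count is not a multiple of the ratio")
    return s * unit, a * unit

def sensor_packet_rate(copies_per_sample=1):
    """Sensor packets per second offered by one wagon.

    Assumption: one packet per sensor per sampling period, sent to
    `copies_per_sample` destinations (hypothetical parameter).
    """
    rate = Fraction(0)
    for total, ratio, period_ms in GROUPS.values():
        sensors, _ = split_nodes(total, ratio)
        rate += Fraction(sensors * 1000, period_ms) * copies_per_sample
    return rate

if __name__ == "__main__":
    for name, (total, ratio, _) in GROUPS.items():
        print(name, split_nodes(total, ratio))
    print("sensor packets/s per wagon:", sensor_packet_rate())
```

Under these assumptions the per-wagon sensor packet rate is several orders of magnitude below the 6.6 Mpps switch forwarding rate quoted earlier, even when each sample is duplicated to more than one destination.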

Moreover, the same entertainment services are run in the form of 60 Wi-Fi nodes (one laptop per seat), running 4 different applications as in [24]: web-browsing, FTP, database and e-mail access. These nodes/laptops are connected to the wagon MS via a wireless router. Also, 60 Compressed DVD video streams are running at a rate of 5 Mbps connected to the MS of each wagon [29].

Figure 1. Two-wagon train model.

In the fault-free case, there are 2 operational Servers in each wagon: one Control Server or Controller (K) and one Entertainment Server (E). They handle the control and the entertainment loads of the wagon, respectively. A watchdog signal of 32 bytes is sent every 1 ms between all four Servers as well as between each Server and the Supervisor node. This watchdog enables all Servers and the Supervisor to be aware of the status of the other Servers. Also, there are 4 cameras per wagon located at each door to enhance safety [30]. They send video signals directly to the Supervisor for safety monitoring purposes by the train driver. The reliability of the Supervisor is assumed to be the highest as in [24] to ensure it has the lowest probability of failure and therefore the longest lifetime for comparison purposes.
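To give a feel for how small the watchdog traffic is, the sketch below estimates its aggregate payload rate. Only the 32-byte size and the 1 ms period come from the text. The full mesh of directed streams among the five nodes, and the omission of Ethernet framing overhead, are assumptions made here for illustration.

```python
WATCHDOG_BYTES = 32       # from the text
WATCHDOG_PERIOD_MS = 1    # from the text
NODES = 5                 # four Servers plus the Supervisor

def watchdog_payload_bps(nodes=NODES):
    """Aggregate watchdog payload rate in bits per second.

    Assumption: every node sends one watchdog per period to every other
    node (full mesh of directed streams); framing overhead is ignored.
    """
    directed_streams = nodes * (nodes - 1)
    per_stream_bps = WATCHDOG_BYTES * 8 * 1000 // WATCHDOG_PERIOD_MS
    return directed_streams * per_stream_bps

if __name__ == "__main__":
    print(watchdog_payload_bps() / 1e6, "Mbps of watchdog payload")
```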

Active Supervisor

In this research, unlike the previous system, the Supervisor acts as an Active Supervisor. It is the primary backup for the Controllers (Ks) in each wagon. If the Controller in either wagon fails, the Supervisor handles the control load of that wagon. If the Controller of the other wagon also fails, the Supervisor handles the control load of both wagons.

The fault-tolerance relation between the Controllers in both wagons is no longer present, i.e., they no longer carry each other’s control load. Furthermore, the Entertainment Servers do not act as backups to the Controllers and do not handle any control load, unlike the case presented in [24]. In regard to entertainment services, the same conservative approach followed in [24] is still applied. If the Entertainment Server fails, the entertainment services are dropped due to the high safety requirements in train operations. However, the Entertainment Server does not drop the entertainment load to handle any control load.

The sensors send their data only to their corresponding Controller and to the Supervisor. So, for example, the sensors in wagon 1 only send their data to Controller 1 (K1) and the Supervisor as these are the only nodes to handle the control load.
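The failover rule described in this section can be written down in a few lines. This is a sketch of the rule as stated in the text, not the authors' simulation model; the node names `K1`, `K2`, `E1`, `E2` and both function names are illustrative, and the Supervisor `S` is assumed to be always operational.

```python
def control_assignment(operational):
    """Map each wagon to the node carrying its control load (Active Supervisor).

    `operational` is a set of node names drawn from {"K1", "K2", "E1", "E2"}.
    The Supervisor "S" is assumed to be always operational, as in the text.
    """
    return {w: (f"K{w}" if f"K{w}" in operational else "S") for w in (1, 2)}

def entertainment_wagons(operational):
    """Wagons whose entertainment stays on: those whose own E server is up."""
    return {w for w in (1, 2) if f"E{w}" in operational}

if __name__ == "__main__":
    state = {"K1", "E1", "E2"}  # Controller of wagon 2 has failed
    print(control_assignment(state))
    print(entertainment_wagons(state))
```

Because the Entertainment Servers never take over control, the two functions are independent of each other: Controller failures affect only the control assignment, and Entertainment Server failures affect only entertainment availability.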

4. Simulation Outcomes

In [24], the simulations presented the outcomes for the unique states which the network experiences. The same approach is used in this research. However, after analyzing the possible combinations of functioning Servers, only 10 unique scenarios need to be simulated using the OPNET network simulator. Simulating only the unique states still accounts for all possible scenarios, because each scenario is represented by one of those unique states.
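The count of 10 unique states can be checked by enumerating every combination of operational Servers and merging states that are mirror images of each other under swapping the two wagons. The sketch below does this; it assumes the Supervisor is always operational, and the names `K1`, `K2`, `E1`, `E2` and the helper functions are illustrative.

```python
from itertools import combinations

SERVERS = ("K1", "K2", "E1", "E2")
SWAP = {"K1": "K2", "K2": "K1", "E1": "E2", "E2": "E1"}

def all_states():
    """Every subset of operational Servers (the Supervisor is assumed always up)."""
    return [frozenset(c)
            for r in range(len(SERVERS) + 1)
            for c in combinations(SERVERS, r)]

def mirror(state):
    """The same state with the roles of the two wagons exchanged."""
    return frozenset(SWAP[s] for s in state)

def canonical(state):
    """One fixed representative for a state and its mirror image."""
    return min(state, mirror(state), key=lambda s: sorted(s))

def unique_states():
    """States that are distinct up to swapping the two wagons."""
    return {canonical(s) for s in all_states()}

if __name__ == "__main__":
    print(len(all_states()), "states in total")
    print(len(unique_states()), "unique states up to wagon symmetry")
```

Four of the 16 states are their own mirror image; the remaining 12 pair up into 6 classes, which gives the 10 unique states.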

4.1. Simulated Scenarios

In [24], 11 unique states were simulated. In this research, however, only 10 states are needed. Table 1 shows the 10 unique states that have been simulated using OPNET. The scenarios simulate all the possible combinations of operational Servers that the network can go through. The main metric for network performance is the control packet end-to-end delay. As shown in Table 1, all end-to-end delays are below the sampling periods of the corresponding group, thus fulfilling the delay requirement [31]. The column labeled Entertainment lists the wagons in which the entertainment services are on.
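The delay requirement used here is a simple comparison of each total end-to-end delay against the sampling period of its group. A minimal sketch of such a check follows; the sampling periods come from the text, while the function names and the `(scenario, group)` key layout are hypothetical and chosen only for illustration.

```python
# Sampling periods per group in microseconds, as given in the text.
SAMPLING_PERIOD_US = {"G1": 1_000, "G2": 16_000, "G3": 16_000}

def meets_delay_requirement(group, end_to_end_delay_us):
    """True if the total end-to-end delay is below the group's sampling period."""
    return end_to_end_delay_us < SAMPLING_PERIOD_US[group]

def violations(delays_us):
    """Return the (scenario, group) keys whose delay misses the requirement.

    `delays_us` maps (scenario, group) tuples to delays in microseconds;
    this key layout is an assumption made for the sketch.
    """
    return [key for key, delay in delays_us.items()
            if not meets_delay_requirement(key[1], delay)]
```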

The reduction from 11 scenarios in [24] to 10 states here is due to the fact that, in [24], Scenario EiS appeared twice. In the first case, the Entertainment Server (Ei) was carrying the control load of Wagon Wi, while in the second case, it was carrying the control load of the other Wagon Wj. As the Entertainment Server in this research does not handle any control load, both cases end up being identical. In the active case, the supervisor node carries the control load of both wagons while each entertainment server handles its own wagon entertainment load.

All the results were obtained after a 95% confidence analysis. The results shown represent the mean value of the maximum packet end-to-end delay obtained from all runs. The maximum deviation (Δ) from these means is 0.411 µs. Furthermore, the delays for the door cameras and the video streaming were below the acceptable delay requirements. As per [32], the OPNET results presented in this research are comparable to hardware implementation outcomes.

For completeness, Table 2 has the corresponding data for the Passive Supervisor architecture presented in [24]. Figures 2-4 illustrate a sample of the OPNET results for the Active Supervisor architecture.

In all figures, the x-axis is the simulation time in seconds and the y-axis is the delay in seconds. The red dots in the graphs are the delay from the sensor to the controller, and the blue dots are the delay from the controller to the actuator.

Table 1 . Total end-to-end delay (µs)—active supervisor.

Table 2. Total end-to-end delay (µs)—passive supervisor [24].

Figure 2. Fault-free scenario (KiKjEiEjS)—G2.

Figure 3. One controller and one entertainment server in different wagons (KiEjS)—G3.

4.2. Outcomes Comparison

When comparing the outcomes with the results in [24], it can be noticed that, in the fault-free scenario (KiKjEiEjS), the delay is generally lower in the Active supervisor architecture. This is due to the fact that the sensors send their data to their corresponding controller and the supervisor node only rather than 4 different streams to all Servers (2 Controllers and 2 Entertainment Servers).

In other scenarios such as KiEiEjS, in [24], the Controller node (Ki) carries the control load of wagon Wi while the Entertainment Server (Ej) has dropped its entertainment load and is handling the control load of Wj.

Figure 4. Supervisor only (S)—G1.

In this research, since Ej does not drop its entertainment load, S handles the control load of wagon Wj. Hence, the passenger can still enjoy the on-board services and will not be affected by the failure that occurred.

Also, in case of EiEjS in the active model, the supervisor S handles the control load of both wagons but the entertainment services are still running in both wagons. In [24], each of the Entertainment Servers carries its own wagon control load after dropping the entertainment load. Therefore, the delay in the Active Supervisor case is somewhat higher when compared to the Passive supervisor case.

Comparing another scenario such as “S”, the delay is the same in the active and passive models since all the entertainment is dropped in both cases and the sensors only send their data to the supervisor. Monitoring the traffic forwarded by the Intermediate Switch verified that the same amount of traffic (133.9 Mbps) was forwarded in both models.

The main benefit of the active supervisor case over the passive supervisor case presented in [24] is that the passenger will only be affected by a failure when the entertainment server of the wagon fails. Consider scenarios KiKjEiEjS, KiEiEjS and EiEjS; in these three scenarios, the entertainment is functional in both wagons. On the other hand, in the passive scenario, the entertainment is functional in both wagons in the fault-free scenario only. Also, only one wagon will experience the failure of the entertainment in scenarios KiKjEiS, KiEiS, KiEjS, and EiS. By comparison, in the passive scenario, the passengers will enjoy the entertainment services in one wagon only in scenarios KiEiEjS and KiKjEiS.

Table 3 shows a comparison between the Active Supervisor architecture and the Passive Supervisor architecture with respect to the number of states that have the entertainment enabled in either one or two wagons. Due to the symmetric nature of the network, the states KiEiEjS, KiKjEiS, KiEiS, KiEjS, KiS and EiS are duplicated. Consequently, in the Active Supervisor architecture, the 10 states are expanded to 16 and, in the Passive Supervisor architecture, the 11 states are expanded to 18.

Table 3. Number of states with enabled entertainment.
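For the Active Supervisor architecture, the expanded state count can be tallied directly from the rule stated in Section 3, namely that a wagon keeps its entertainment exactly when its own Entertainment Server is operational. The sketch below counts the 16 expanded states by the number of wagons with entertainment on. It is derived from that rule rather than copied from Table 3, it does not cover the Passive Supervisor architecture, and the function name is illustrative.

```python
from collections import Counter
from itertools import product

def entertainment_histogram():
    """Count Active Supervisor states by number of wagons with entertainment on.

    Each state is a tuple (K1, K2, E1, E2) of booleans; the Supervisor is
    assumed to be always operational. Controller status does not affect
    entertainment in the active scheme, so only E1 and E2 are inspected.
    """
    counts = Counter()
    for _k1, _k2, e1, e2 in product((False, True), repeat=4):
        counts[int(e1) + int(e2)] += 1
    return counts

if __name__ == "__main__":
    hist = entertainment_histogram()
    for wagons in (2, 1, 0):
        print(f"{hist[wagons]} states with entertainment in {wagons} wagon(s)")
```

The tally agrees with the scenario lists given earlier in this section: KiKjEiEjS, KiEiEjS and EiEjS expand to 4 states with entertainment in both wagons, and KiKjEiS, KiEiS, KiEjS and EiS expand to 8 states with entertainment in one wagon.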

Note finally that, in the Active Supervisor architecture, when the controller of a wagon fails, its only backup is the Supervisor node. In [24], for each failing controller, there are 4 other machines that act as backups.

5. Conclusions

Ethernet is an interesting technology in the field of Networked Control Systems. The use of Gigabit Switched Ethernet on board trains has already been reported in the literature. Previously, a hybrid network model was proposed for a two-wagon network model. Furthermore, a hierarchical structure at the controller level was proposed. However, the supervisor node was a passive one and it only handled the control load as a last resort.

In this paper, a new role was defined for the supervisor. As soon as either Controller fails, it acts as a backup for that failed Controller and handles its control load; therefore, it became an active node. For safety purposes, no other node acted as backup for any failing Entertainment Server; the entertainment was dropped when the Entertainment server failed.

All possible combinations of operational Servers/Controllers were simulated using OPNET. It was shown that the control packet end-to-end delays met the control requirements and that no packet was dropped. The network was proven to function properly even after the failure of all Controllers and Entertainment Servers; the Supervisor was able to successfully carry the control load of both wagons. It was also shown that this architecture has the advantage of keeping entertainment services operational for a longer period when compared to other hierarchical architectures in the literature.

Cite this paper: Ibrahim, M., Daoud, R. and Amer, H. (2013) Reliable Train Network with Active Supervisor. Journal of Transportation Technologies, 3, 214-219. doi: 10.4236/jtts.2013.33022.

[1]   N. Navet, Y. Song, F. Simonot-Lion and C. Wilwert, “Trends in Automotive Communication Systems,” Proceedings of the IEEE, Vol. 93, No. 6, 2005, pp. 1204-1223. doi:10.1109/JPROC.2005.849725

[2]   R. M. Daoud, H. M. Elsayed and H. H. Amer, “Gigabit Ethernet for Redundant Networked Control Systems,” Proceedings of the IEEE International Conference on Industrial Technology ICIT, Hammamet, 8-10 December 2004, pp. 869-873.

[3]   J. D. Decotignie, “Ethernet-Based Real-Time and Industrial Communications,” Proceedings of the IEEE, Vol. 93, No. 6, 2005, pp. 1102-1117. doi:10.1109/JPROC.2005.849721

[4]   R. M. Daoud, H. H. Amer, H. M. Elsayed and Y. Sallez, “Fault-Tolerant On-Board Ethernet-Based Vehicle Networks,” Proceedings of the 32nd Annual Conference of the IEEE Industrial Electronics Society IECON, Paris, 6-10 November 2006, pp. 4662-4665.

[5]   “CAN in Passenger and Cargo Trains,” CAN in Automation, 2011.

[6]   Official Site for PROFIBUS and PROFINET.

[7]   IEEE 802.3 Standard.

[8]   T. Skeie, S. Johannessen and C. Brunner, “Ethernet in Substation Automation,” IEEE Control Systems, Vol. 22, No. 3, 2002, pp. 43-51. doi:10.1109/MCS.2002.1003998

[9]   F. L. Lian, J. R. Moyne and D. M. Tilbury, “Performance Evaluation of Control Networks: Ethernet, ControlNet, and DeviceNet,” IEEE Control Systems Magazine, Vol. 21, No. 1, 2001, pp. 66-83. doi:10.1109/37.898793

[10]   S. H. Lee and K. H. Cho, “Congestion Control of High-Speed Gigabit-Ethernet Networks for Industrial Applications,” Proceedings of the IEEE International Symposium on Industrial Electronics ISIE, Pusan, 12-16 June 2001, pp. 260-265.

[11]   J. S. Meditch and C. T. A. Lea, “Stability and Optimization of the CSMA and CSMA/CD Channels,” IEEE Transactions on Communications, Vol. 31, No. 6, 1983, pp. 763-774. doi:10.1109/TCOM.1983.1095881

[12]   ODVA, “EtherNet/IP Adaptation on CIP,” CIP Common, 2007.

[13]   Allen-Bradley, “EtherNet/IP Performance and Application Guide,” Rockwell Automation Application Solution, 2003.

[14]   J. Ferreira, P. Pedreiras, L. Almeida and J. Fonseca, “Achieving Fault-Tolerance in FTT-CAN,” Proceedings of the 4th IEEE International Workshop on Factory Communication Systems WFCS, Vasteras, August 2002, pp. 125-132. doi:10.1109/WFCS.2002.1159709

[15]   P. Pedreiras, L. Almeida and P. Gai, “The FTT-Ethernet Protocol: Merging Flexibility, Timeliness and Efficiency,” Proceedings of the IEEE Euromicro Conference on Real-Time Systems ECRTS, Vienna, 19-21 June 2002, pp. 134-142.

[16]   K. Steinhammer and A. Ademaj, “Hardware Implementation of the Time-Triggered Ethernet Controller,” Embedded System Design: Topics, Techniques and Trends, Vol. 231, Springer, Boston, 2007, pp. 325-338. doi:10.1007/978-0-387-72258-0_28

[17]   G. Krambles, J. J. Fox and W. J. Bierwagen, “Automatic Train Control in Rapid Transit,” 1976.

[18]   M. Aziz, B. Raouf, N. Riad, R. M. Daoud and H. M. ElSayed, “The Use of Ethernet for Single On-Board Train Network,” Proceedings of the IEEE International Conference on Networking, Sensing and Control ICNSC, Hainan, 6-8 April 2008, pp. 1430-1434.

[19]   M. Hassan, S. Gamal, S. Louis, G. F. Zaki and H. H. Amer, “Fault Tolerant Ethernet Network Model for Control and Entertainment in Railway Transportation Systems,” Proceedings of the Canadian Conference on Electrical and Computer Engineering CCECE, Niagara Falls, 4-7 May 2008, pp. 771-774.

[20]   T. K. Refaat, H. H. Amer and R. M. Daoud, “Reliable Architecture for a Two-Wagon Switched Ethernet Train Control Network,” Proceedings of the 3rd IEEE International Congress on Ultra Modern Telecommunications and Control Systems ICUMT, Budapest, 5-7 October 2011, pp. 1-7.

[21]   T. K. Refaat, H. H. Amer, R. M. Daoud and M. S. Moustafa, “On the Performability of On-Board Train Networks with Fault-Tolerant Controllers,” Proceedings of the IEEE International Conference on Mechatronics ICM, Istanbul, 13-15 April 2011, pp. 743-748.

[22]   Official Site for OPNET.

[23]   “Train Communication Network, IEC 61375,” International Electrotechnical Commission, Geneva, 1999.

[24]   M. Hassan, R. M. Daoud and H. H. Amer, “Passive Supervisor for Railway Fault-Tolerant Ethernet Networked Control Systems,” Proceedings of the IEEE International Conference on Emerging Technologies and Factory Automation ETFA, Toulouse, 5-9 September 2011, pp. 1-4.

[25]   T. K. Refaat, M. Hassan, R. M. Daoud and H. H. Amer, “Ethernet Implementation of Fault Tolerant Train Network for Entertainment and Mixed Control Traffic,” Journal of Transportation Technologies, Vol. 3, No. 1, 2013, pp. 105-111. doi:10.4236/jtts.2013.31010

[26]   H. Glickenstein, “New Developments in Land Transportation,” IEEE Vehicular Technology Magazine, Vol. 5, No. 2, 2010, pp. 17-20. doi:10.1109/MVT.2010.936653

[27]   “Trains Reference List,” Siemens AG Transportation Systems Trains, pp. 41-46.

[28]   Official Site for Cisco Catalyst 3560 Series Switch.

[29]   Video Charg, “Description of the Supported Formats,” 2010.

[30]   J. D. Swanson and C. Thornes, “Light Rail Transit Systems,” IEEE Vehicular Technology Magazine, Vol. 5, No. 2, 2010, pp. 22-27. doi:10.1109/MVT.2010.936645

[31]   R. M. Daoud, “Wireless and Wired Ethernet for Intelligent Transportation Systems,” D.Sc. Dissertation, Universite de Valenciennes et du Hainaut-Cambresis, Valenciennes, 2008.

[32]   L. Seno, S. Vitturi and F. Tramarin, “Experimental Evaluation of the Service Time for Industrial Hybrid (Wired/Wireless) Networks under Non-Ideal Environmental Conditions,” Proceedings of the IEEE International Conference on Emerging Technologies and Factory Automation ETFA, Toulouse, 5-9 September 2011, pp. 1-8.