Paper | Title | Other Keywords | Page |
---|---|---|---|
MOBPP03 | Fault Tolerant, Scalable Middleware Services Based on Spring Boot, REST, H2 and Infinispan | database, controls, operation, network | 33 |
Control systems require several core services for work coordination and everyday operation. One such example is the Directory Service, a central registry of all access points and their physical location in the network. Another is the Authentication Service, which verifies a caller's identity and issues a signed token that represents the caller in distributed communication. Both are real-life examples of middleware services that must always be available and scalable. The paper discusses the design decisions and technical background behind these two central services used at CERN. Both services were designed using current technology standards, namely Spring Boot and REST. Moreover, they had to comply with demanding requirements for fault tolerance and scalability, so additional extensions were necessary, such as a distributed in-memory cache (using Apache Infinispan) or local mirroring of the Oracle database using the H2 database. Additionally, the paper explains the trade-offs of the different approaches to providing high availability and the lessons learnt from operational usage.
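As a rough illustration of how a client might interact with such REST-based middleware, the sketch below obtains a signed token from a hypothetical authentication endpoint and then queries a hypothetical directory endpoint for a device's access point; the URLs, paths and JSON fields are illustrative assumptions, not the actual CERN API.

```python
# Illustrative only: endpoint URLs, paths and JSON fields are assumptions,
# not the actual CERN Authentication/Directory Service API.
import requests

AUTH_URL = "https://auth.example.cern/api/token"        # hypothetical
DIRECTORY_URL = "https://directory.example.cern/api"    # hypothetical

# Obtain a signed token representing the caller.
resp = requests.post(AUTH_URL, json={"user": "operator", "password": "secret"})
resp.raise_for_status()
token = resp.json()["token"]

# Look up the physical access point of a device in the central registry.
resp = requests.get(
    f"{DIRECTORY_URL}/devices/MY.DEVICE/access-point",
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
print(resp.json())  # e.g. {"host": "...", "port": ...}
```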
Slides MOBPP03 [6.846 MB] | ||
DOI • | reference for this paper ※ https://doi.org/10.18429/JACoW-ICALEPCS2019-MOBPP03 | ||
About • | paper received ※ 27 September 2019 paper accepted ※ 08 October 2019 issue date ※ 30 August 2020 | ||
Export • | reference for this paper using ※ BibTeX, ※ LaTeX, ※ Text/Word, ※ RIS, ※ EndNote (xml) | ||
MOPHA112 | Improving Performance of the MTCA System by use of PCI Express Non-Transparent Bridging and Point-To-Point PCI Express Transactions | controls, embedded, ISOL | 480 |
The PCI Express standard enables one of the highest data transfer rates available today. However, with a large number of modules in an MTCA system, an increasing complexity of individual MTCA components and a growing demand for high data transfer rates to client programs, the performance of the overall system becomes a key parameter. Multiprocessor systems are known not only to provide higher processing bandwidth, but also to allow greater system reliability through host failover mechanisms. The use of non-transparent bridges in PCI systems, supporting intelligent adapters in enterprise systems and multiple processors in embedded systems, is a well-established technology. There the non-transparent bridge acts as a gateway between the local subsystem and the system backplane. This can be ported to the PCI Express standard by replacing one of the transparent switches on the PCI Express switch with a non-transparent switch. Our experience of establishing non-transparent bridging in MTCA systems will be presented.
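As a small companion example, the sketch below shows one way to check from Linux user space whether a given PCI Express function (for instance a non-transparent bridge port) is visible to the host, by scanning sysfs; the vendor/device IDs are placeholders, not those of any specific MTCA module.

```python
# Minimal sysfs scan for PCI(e) devices; the NTB vendor/device IDs below are
# placeholders, not the IDs of any particular MTCA component.
from pathlib import Path

NTB_IDS = {("0x10b5", "0x87b0")}  # hypothetical (vendor, device) pair

for dev in Path("/sys/bus/pci/devices").iterdir():
    vendor = (dev / "vendor").read_text().strip()
    device = (dev / "device").read_text().strip()
    if (vendor, device) in NTB_IDS:
        print(f"possible non-transparent bridge at {dev.name}")
```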
Poster MOPHA112 [0.452 MB] | ||
DOI • | reference for this paper ※ https://doi.org/10.18429/JACoW-ICALEPCS2019-MOPHA112 | ||
About • | paper received ※ 10 September 2019 paper accepted ※ 03 November 2019 issue date ※ 30 August 2020 | ||
Export • | reference for this paper using ※ BibTeX, ※ LaTeX, ※ Text/Word, ※ RIS, ※ EndNote (xml) | ||
MOPHA160 | Enabling Data Analytics as a Service for Large Scale Facilities | simulation, data-analysis, software, experiment | 614 |
Funding: UK Research and Innovation - Science & Technology Facilities Council (UK SBS IT18160)
The Ada Lovelace Centre (ALC) at STFC is an integrated, cross-disciplinary, data-intensive science centre for better exploitation of research carried out at large-scale UK facilities, including the Diamond Light Source, the ISIS Neutron and Muon Facility, the Central Laser Facility and the Culham Centre for Fusion Energy. ALC will provide on-demand data analysis, interpretation and analytics services to worldwide users of these research facilities. Using open-source components, ALC and Tessella have together created a software infrastructure to support the delivery of that vision. The infrastructure comprises a Virtual Machine Manager for managing pools of VMs across distributed compute clusters; components for automated provisioning of data analytics environments across heterogeneous clouds; a Data Movement System to efficiently transfer large datasets; and a Kubernetes cluster to manage on-demand submission of Spark jobs. In this paper, we discuss the challenges of creating an infrastructure to meet the differing analytics needs of multiple facilities and report the architecture and design of the infrastructure that enables Data Analytics as a Service.
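As an illustration of the kind of on-demand Spark job submission to Kubernetes described above, the following sketch invokes spark-submit against a Kubernetes master; the API-server address, container image and application path are assumptions and do not describe the actual ALC deployment.

```python
# Illustrative spark-submit invocation against a Kubernetes master; all
# addresses, image names and paths are placeholders.
import subprocess

subprocess.run(
    [
        "spark-submit",
        "--master", "k8s://https://k8s.example.org:6443",   # hypothetical API server
        "--deploy-mode", "cluster",
        "--name", "analysis-job",
        "--conf", "spark.executor.instances=4",
        "--conf", "spark.kubernetes.container.image=registry.example.org/spark:latest",
        "local:///opt/spark/jobs/analysis.py",              # hypothetical job inside the image
    ],
    check=True,
)
```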
Poster MOPHA160 [1.665 MB] | ||
DOI • | reference for this paper ※ https://doi.org/10.18429/JACoW-ICALEPCS2019-MOPHA160 | ||
About • | paper received ※ 30 September 2019 paper accepted ※ 10 October 2019 issue date ※ 30 August 2020 | ||
Export • | reference for this paper using ※ BibTeX, ※ LaTeX, ※ Text/Word, ※ RIS, ※ EndNote (xml) | ||
TUBPL05 | RecSyncETCD: A Fault-tolerant Service for EPICS PV Configuration Data | operation, network, EPICS, controls | 714 |
Funding: Work supported by the U.S. Department of Energy Office of Science under Cooperative Agreement DE-SC0000661
RecCaster is an EPICS module responsible for uploading Process Variable (PV) metadata from the IOC database to a central server called RecCeiver. The RecCeiver service is a custom-built application that passes this data on to ChannelFinder, a REST-based search service. Together, RecCaster and RecCeiver form the building blocks of RecSync. RecCeiver is not a distributed service, which makes it challenging to ensure high availability and fault tolerance for its clients. We have implemented a new version of RecCaster which uploads the PV metadata to ETCD. ETCD is a commercial off-the-shelf distributed key-value store intended for highly available data storage and retrieval. It provides fault tolerance, as the service can be replicated on multiple servers to keep data consistently replicated. ETCD is a drop-in replacement for the existing RecCeiver for storing and retrieving PV metadata. In addition, ETCD has a well-documented client interface, including the ability to live-watch the PV metadata on behalf of its clients. This paper discusses the design and implementation of RecSyncETCD as a fault-tolerant service for storing and retrieving EPICS PV metadata.
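A minimal sketch of the storage pattern, assuming the third-party python-etcd3 client and an illustrative key layout (the actual RecSyncETCD keys and value format may differ): each IOC writes its PV metadata under a per-IOC prefix, and a consumer live-watches that prefix for changes.

```python
# Sketch only: uses the python-etcd3 client; the key layout and values are
# illustrative, not the actual RecSyncETCD schema.
import json
import etcd3

client = etcd3.client(host="etcd.example.org", port=2379)  # hypothetical endpoint

# An IOC uploads metadata for one of its PVs under a per-IOC prefix.
client.put("/recsync/iocs/ioc01/pvs/SR:C01:BPM:X",
           json.dumps({"recordType": "ai", "units": "mm"}))

# A consumer (e.g. a ChannelFinder synchroniser) live-watches the prefix.
events, cancel = client.watch_prefix("/recsync/iocs/")
for event in events:
    print(event.key, event.value)
```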
Slides TUBPL05 [1.099 MB] | ||
DOI • | reference for this paper ※ https://doi.org/10.18429/JACoW-ICALEPCS2019-TUBPL05 | ||
About • | paper received ※ 26 September 2019 paper accepted ※ 02 October 2019 issue date ※ 30 August 2020 | ||
Export • | reference for this paper using ※ BibTeX, ※ LaTeX, ※ Text/Word, ※ RIS, ※ EndNote (xml) | ||
TUBPR01 | The Distributed Oscilloscope: A Large-Scale Fully Synchronised Data Acquisition System Over White Rabbit | network, HOM, status, controls | 725 |
A common need in large scientific experiments is the ability to monitor the whole installation by means of simultaneous data acquisition. Data is acquired as a result of triggers, which may either come from external sources or from internal triggering of one of the acquisition nodes. However, a problem arises from the fact that once a trigger is generated, it does not arrive at the receiving nodes simultaneously, due to varying distances and environmental conditions. The Distributed Oscilloscope (DO) concept addresses this problem by leveraging the sub-nanosecond synchronization and deterministic data delivery provided by White Rabbit (WR) and augmenting it with automatic discovery of acquisition nodes and complex trigger event scheduling, in order to provide the illusion of a single virtual oscilloscope. This paper presents the current state of the DO, including work done at the FPGA and software level to enhance existing acquisition hardware, as well as a new protocol based on existing industrial standards. It also includes test results obtained from a demonstrator based on two digitizers separated by a 10 km optical fiber, used as a showcase of the DO concept.
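The core timing idea can be sketched as follows: because all nodes share White Rabbit time, a trigger is not executed on arrival but is broadcast with an execution timestamp far enough in the future to cover the worst-case network delay, so every digitizer starts acquiring at the same instant. The message fields and margin below are illustrative assumptions, not the actual DO protocol.

```python
# Conceptual sketch of time-based trigger scheduling; message fields and the
# delay margin are assumptions, not the actual DO protocol. time.time() is
# only a stand-in for a White-Rabbit-disciplined clock.
import time

NETWORK_MARGIN_S = 0.001  # must exceed the worst-case trigger delivery latency

def make_trigger_message(source_id, now):
    """Build a trigger message carrying a common future execution time."""
    return {"source": source_id, "execute_at": now + NETWORK_MARGIN_S}

def schedule_acquisition(msg, local_now):
    """Return how long this node must still wait before starting to acquire."""
    wait = msg["execute_at"] - local_now
    if wait < 0:
        raise RuntimeError("trigger arrived too late; increase the margin")
    return wait

msg = make_trigger_message("node-A", time.time())
print(schedule_acquisition(msg, time.time()))
```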
Slides TUBPR01 [10.026 MB] | ||
DOI • | reference for this paper ※ https://doi.org/10.18429/JACoW-ICALEPCS2019-TUBPR01 | ||
About • | paper received ※ 27 September 2019 paper accepted ※ 10 October 2019 issue date ※ 30 August 2020 | ||
Export • | reference for this paper using ※ BibTeX, ※ LaTeX, ※ Text/Word, ※ RIS, ※ EndNote (xml) | ||
WEDPL02 | AliECS: A New Experiment Control System for the ALICE Experiment | controls, detector, experiment, operation | 956 |
During Long Shutdown 2 in 2019-2020, the ALICE experiment at the CERN LHC (Large Hadron Collider) is undertaking a major upgrade, which includes a new computing system called O² (Online-Offline). To ensure the efficient operation of the upgraded experiment and of its newly designed computing system, a reliable, high-performance and automated experiment control system is being developed, with the goal of managing all O² synchronous processing software and of handling the data-taking activity by interacting with the detectors, the trigger system and the LHC. The ALICE Experiment Control System (AliECS) is a distributed system based on state-of-the-art cluster management and microservices technologies which have recently emerged in the distributed computing ecosystem. Such technologies will allow the ALICE collaboration to benefit from a vibrant and innovative open-source community. This communication illustrates the AliECS architecture. It provides an in-depth overview of the system's components, features and design elements, as well as its performance. It also reports on the experience with AliECS as part of ALICE Run 3 detector commissioning setups.
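As a generic illustration of the kind of state handling such a control system performs for the processes it manages, the sketch below models a simple task state machine; the states and transitions are illustrative and do not reproduce the actual AliECS design.

```python
# Generic task state machine sketch; states and transitions are illustrative
# and are not the actual AliECS state model.
TRANSITIONS = {
    ("STANDBY", "CONFIGURE"): "CONFIGURED",
    ("CONFIGURED", "START"): "RUNNING",
    ("RUNNING", "STOP"): "CONFIGURED",
    ("CONFIGURED", "RESET"): "STANDBY",
}

class Task:
    def __init__(self, name):
        self.name = name
        self.state = "STANDBY"

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event} not allowed in state {self.state}")
        self.state = TRANSITIONS[key]
        return self.state

task = Task("readout-proc-01")
for ev in ("CONFIGURE", "START", "STOP"):
    print(ev, "->", task.handle(ev))
```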
Slides WEDPL02 [2.858 MB] | ||
DOI • | reference for this paper ※ https://doi.org/10.18429/JACoW-ICALEPCS2019-WEDPL02 | ||
About • | paper received ※ 30 September 2019 paper accepted ※ 09 October 2019 issue date ※ 30 August 2020 | ||
Export • | reference for this paper using ※ BibTeX, ※ LaTeX, ※ Text/Word, ※ RIS, ※ EndNote (xml) | ||
WEMPR009 | Development of Event Receiver on Zynq-7000 Evaluation Board | timing, controls, FPGA, linac | 1063 |
The timing system of the SuperKEKB accelerator uses the Event Timing System developed by Micro-Research Finland. In this paper, we tested an event receiver on a Zynq-7000 evaluation board. The serialized event data are transferred from the Event Generator to the Event Receiver using a GTX transceiver, so we selected the Zynq-7000 (7Z030) as the receiver because this FPGA provides GTX transceivers. In addition, since the Zynq integrates an ARM processor, the received event data stream can easily be handled by an EPICS IOC. Finally, we aim to combine the event system and an RF or BPM system in one FPGA board.
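Assuming the received event stream is exposed through EPICS records by the IOC, a host-side client could monitor it with pyepics as sketched below; the PV name is a placeholder, not the SuperKEKB naming convention.

```python
# Sketch using pyepics to monitor event data published by the receiver IOC;
# the PV name is a placeholder.
import time
import epics

def on_event(pvname=None, value=None, timestamp=None, **kw):
    print(f"{pvname} = {value} @ {timestamp}")

pv = epics.PV("EVR:TEST:LAST_EVENT_CODE")  # hypothetical PV
pv.add_callback(on_event)

time.sleep(10)  # receive monitor callbacks for a while
```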
Poster WEMPR009 [0.572 MB] | ||
DOI • | reference for this paper ※ https://doi.org/10.18429/JACoW-ICALEPCS2019-WEMPR009 | ||
About • | paper received ※ 17 September 2019 paper accepted ※ 09 October 2019 issue date ※ 30 August 2020 | ||
Export • | reference for this paper using ※ BibTeX, ※ LaTeX, ※ Text/Word, ※ RIS, ※ EndNote (xml) | ||
WEPHA020 | Pushing the Limits of Tango Archiving System using PostgreSQL and Time Series Databases | TANGO, database, controls, SRF | 1116 |
The Tango HDB++ project is a high-performance, event-driven archiving system which stores data with microsecond-resolution timestamps, using archivers written in C++. HDB++ supports MySQL/MariaDB and Apache Cassandra backends and has recently been extended to support PostgreSQL and TimescaleDB*, a time-series PostgreSQL extension. The PostgreSQL backend has enabled efficient multi-dimensional data storage in a relational database. Time-series databases are ideal for archiving and can take advantage of the fact that inserted data do not change. TimescaleDB has pushed the performance of HDB++ to new limits. The paper will present the benchmarking tools that have been developed to compare the performance of the different backends and the extension of HDB++ to support TimescaleDB for insertion and extraction. A comparison of the different supported backends will be presented.
* https://timescale.com
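A minimal sketch of the TimescaleDB pattern underlying such a backend: a plain PostgreSQL table of timestamped attribute values is turned into a hypertable and then filled by time-ordered inserts. The table and column names are illustrative, not the actual HDB++ schema.

```python
# Sketch of the TimescaleDB hypertable pattern with psycopg2; the schema is
# illustrative and not the actual HDB++ table layout.
import psycopg2

conn = psycopg2.connect("dbname=hdb user=hdb")   # hypothetical DSN
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS att_scalar_double (
        att_conf_id integer      NOT NULL,
        data_time   timestamptz  NOT NULL,
        value_r     double precision
    );
""")
# create_hypertable is provided by the TimescaleDB extension.
cur.execute("SELECT create_hypertable('att_scalar_double', 'data_time', if_not_exists => TRUE);")

cur.execute(
    "INSERT INTO att_scalar_double (att_conf_id, data_time, value_r) VALUES (%s, now(), %s)",
    (42, 3.14),
)
conn.commit()
```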
Poster WEPHA020 [1.609 MB] | ||
DOI • | reference for this paper ※ https://doi.org/10.18429/JACoW-ICALEPCS2019-WEPHA020 | ||
About • | paper received ※ 30 September 2019 paper accepted ※ 02 November 2019 issue date ※ 30 August 2020 | ||
Export • | reference for this paper using ※ BibTeX, ※ LaTeX, ※ Text/Word, ※ RIS, ※ EndNote (xml) | ||
WEPHA103 | Backward Compatible Update of the Timing System of WEST | FPGA, network, timing, controls | 1338 |
Between 2013 and 2016, the tokamak Tore Supra, in operation at Cadarache (CEA, France) since 1988, underwent a major upgrade following which it was renamed WEST (Tungsten [W] Environment in Steady-state Tokamak). Its synchronization system, however, has not been upgraded since 1999*. At the time, a robust design was achieved based on AMD's TAXI chip**: clock and events are distributed from a central emitter over a star-shaped network of simplex optical links to electronic crates around the tokamak. Unfortunately, spare boards were not produced in sufficient quantities and the TAXI chip is now obsolete; in fact, multi-gigabit serial communication standards call into question the future availability of any such low-rate SerDes devices. Designing replacement boards provides an opportunity for a new CDR (clock and data recovery) solution and extended functionalities (loss-of-lock detection, latency monitoring). Backward compatibility is a major constraint, given the lack of resources for a full upgrade. We first describe the current state of the WEST timing network, then the implementation of a custom CDR entirely in firmware, using the IOSerDes blocks of Xilinx FPGAs, and finally provide preliminary results on development boards.
*"Upgrade of the timing system for Tore Supra long pulses", D. Moulin et al. IEEE RealTime Conference 1999 **http://hep.uchicago.edu/~thliu/projects/Pulsar/otherdoc/TAXIchip.pdf |
DOI • | reference for this paper ※ https://doi.org/10.18429/JACoW-ICALEPCS2019-WEPHA103 | ||
About • | paper received ※ 30 September 2019 paper accepted ※ 03 October 2019 issue date ※ 30 August 2020 | ||
Export • | reference for this paper using ※ BibTeX, ※ LaTeX, ※ Text/Word, ※ RIS, ※ EndNote (xml) | ||
WESH2003 | Toward Continuous Delivery Of A Nontrivial Distributed Software System | software, controls, operation, monitoring | 1511 |
Funding: SKA South Africa; National Research Foundation of South Africa; Department of Science and Technology
The MeerKAT Control and Monitoring (CAM) solution is a mature software system that has undergone multiple phases of construction and expansion. It is a distributed system with a run-time environment of 15 logical nodes, featuring dozens of interdependent, short-lived processes that interact with a number of long-running services. This presents a challenge for the development team: balancing operational goals with continued discovery and development of useful enhancements for its users (astronomers and telescope operators). Continuous Delivery is a set of practices designed to always keep software in a releasable state. It employs the discipline of release engineering to optimise the process of taking changes from source control to production. In this paper, we review the current path to production (build, test and release) of CAM, identify shortcomings and introduce approaches to support further incremental development of the system. By implementing patterns such as deployment pipelines and immutable release candidates, we hope to simplify the release process and demonstrate increased throughput of changes, quality and stability in the future.
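As a sketch of the immutable-release-candidate pattern mentioned above, the following builds a container image once, tags it with the commit hash, and promotes that same artifact by re-tagging rather than rebuilding; the registry and image names are placeholders, not the actual CAM pipeline.

```python
# Illustrative promotion of an immutable release candidate identified by the
# git commit; registry and image names are placeholders.
import subprocess

def sh(*cmd):
    subprocess.run(cmd, check=True)

commit = subprocess.run(["git", "rev-parse", "--short", "HEAD"],
                        capture_output=True, text=True, check=True).stdout.strip()
image = f"registry.example.org/cam:{commit}"       # hypothetical registry

sh("docker", "build", "-t", image, ".")            # build exactly once
sh("docker", "push", image)

# Promotion: point the environment tag at the already-tested artifact.
sh("docker", "tag", image, "registry.example.org/cam:staging")
sh("docker", "push", "registry.example.org/cam:staging")
```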
Slides WESH2003 [2.933 MB] | ||
Poster WESH2003 [1.448 MB] | ||
DOI • | reference for this paper ※ https://doi.org/10.18429/JACoW-ICALEPCS2019-WESH2003 | ||
About • | paper received ※ 30 September 2019 paper accepted ※ 09 October 2019 issue date ※ 30 August 2020 | ||
Export • | reference for this paper using ※ BibTeX, ※ LaTeX, ※ Text/Word, ※ RIS, ※ EndNote (xml) | ||