NXCALS - Architecture and Challenges of the Next CERN Accelerator Logging Service

Wozniak, Jakub; Roderick, Chris

doi:10.18429/JACoW-ICALEPCS2019-WEPHA163

BibTeX citation export for WEPHA163: NXCALS - Architecture and Challenges of the Next CERN Accelerator Logging Service

@InProceedings{wozniak:icalepcs2019-wepha163,
  author       = {J.P. Wozniak and C. Roderick},
  title        = {{NXCALS - Architecture and Challenges of the Next CERN Accelerator Logging Service}},
  booktitle    = {Proc. ICALEPCS'19},
  pages        = {1465--1469},
  paper        = {WEPHA163},
  language     = {english},
  keywords     = {extraction, software, controls, operation, hardware},
  venue        = {New York, NY, USA},
  series       = {International Conference on Accelerator and Large Experimental Physics Control Systems},
  number       = {17},
  publisher    = {JACoW Publishing, Geneva, Switzerland},
  month        = {08},
  year         = {2020},
  issn         = {2226-0358},
  isbn         = {978-3-95450-209-7},
  doi          = {10.18429/JACoW-ICALEPCS2019-WEPHA163},
  url          = {https://jacow.org/icalepcs2019/papers/wepha163.pdf},
  note         = {https://doi.org/10.18429/JACoW-ICALEPCS2019-WEPHA163},
  abstract     = {CERN’s Accelerator Logging Service (CALS) has been in production since 2003 and stores data from accelerator infrastructure and beam observation devices. Initially expecting 1 TB/year, the Oracle-based system has scaled to cope with 2.5 TB/day coming from >2.3 million signals. It serves >1000 users making an average of 5 million extraction requests per day. Nevertheless, with a large data increase during LHC Run 2, the CALS system began to show its limits, particularly for supporting data analytics. In 2016 the NXCALS project was launched with the aim of replacing CALS from Run 3 onwards, with a scalable system using "Big Data" technologies. The NXCALS core is production-ready, based on open-source technologies such as Hadoop, HBase, Spark and Kafka. This paper will describe the NXCALS architecture and design choices, together with challenges faced while adopting these technologies. These include write/read performance when dealing with vast amounts of data from heterogeneous data sources with strict latency requirements, and how to extract, transform and load >1 PB of data from CALS to NXCALS. NXCALS is not CERN-specific and can be relevant to other institutes facing similar challenges.},
}