Paper | Title | Page |
---|---|---|
TUPPC026 | Concept and Prototype for a Distributed Analysis Framework for the LHC Machine Data | 604 |
The Large Hadron Collider (LHC) at CERN produces more than 50 TB of diagnostic data every year, spread over normal running periods as well as commissioning periods. The data is collected in different systems, such as the LHC Post Mortem System (PM), the LHC Logging Database and different file catalogues. To analyse and correlate data from these systems, it is currently necessary to extract the data to a local workspace and to process it with scripts that obtain and correlate the required information. Since the amount of data can be huge, depending on the task to be achieved, this approach can be very inefficient. To cope with this problem, a new project was launched to bring the analysis closer to the data itself. This paper describes the concepts and the implementation of the first prototype of an extensible framework, which will allow the integration of all existing data sources as well as future extensions, such as Hadoop* clusters or other parallelization frameworks.
*http://hadoop.apache.org/
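The central idea described in the abstract, shipping the analysis to where the data resides instead of extracting raw data to a local workspace, can be illustrated with a minimal sketch. All names here (`DataSource`, `execute`) are illustrative assumptions and not the actual API of the framework or of the PM/Logging systems:

```java
// Hypothetical sketch: instead of downloading all diagnostic records and
// processing them locally, the analysis task is evaluated next to the data
// and only the (small) result is returned. Names are illustrative only.
import java.util.List;
import java.util.function.Function;

public class PushdownSketch {

    /** Stand-in for a data source such as the PM system or the Logging Database. */
    static class DataSource {
        private final List<Double> records;

        DataSource(List<Double> records) {
            this.records = records;
        }

        // "Server-side" execution: the task runs where the records live,
        // so only the result crosses the network, not the raw data.
        <R> R execute(Function<List<Double>, R> task) {
            return task.apply(records);
        }
    }

    public static void main(String[] args) {
        DataSource logging = new DataSource(List.of(1.2, 3.4, 0.7, 5.9, 2.1));

        // The averaging task is sent to the source; the raw records are
        // never copied to a local workspace.
        double mean = logging.execute(
                rs -> rs.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN));

        System.out.println("mean = " + mean);
    }
}
```

In the actual framework, the role of the in-memory list would be played by the existing data sources (PM, Logging, file catalogues) or by a parallelization backend such as a Hadoop cluster; the sketch only captures the "move the computation, not the data" pattern the abstract refers to.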
Poster TUPPC026 [1.378 MB]