Keyword: distributed
Paper | Title | Other Keywords | Page
FRCB2 | Design and Construction of the Data Warehouse Based on Hadoop Ecosystem at HLS-II | EPICS, controls, software, database | 233
 
  • Y. Song, X. Chen, C. Li, G. Liu, J.G. Wang, K. Xuan
    USTC/NSRL, Hefei, Anhui, People’s Republic of China
 
  Funding: Work supported by National Natural Science Foundation of China (No.11375186)
A data warehouse based on the Hadoop ecosystem has been designed and constructed for Hefei Light Source II (HLS-II). An ETL program based on Spark continuously migrates data from the RDB Channel Archiver and the EPICS Archiver Appliance to HDFS and stores it in Parquet format. A distributed data analysis engine based on Impala greatly improves data retrieval performance and reduces query response time. In this paper, we describe our efforts and experience in using various open-source software and tools to manage big data effectively. We also report our future plans for this data warehouse.
 
Slides FRCB2 [5.157 MB]
DOI • reference for this paper ※ https://doi.org/10.18429/JACoW-PCaPAC2018-FRCB2  
About • paper received ※ 09 October 2018       paper accepted ※ 15 October 2018       issue date ※ 21 January 2019  