The Joint Accelerator Conferences Website (JACoW) is an international collaboration that publishes the proceedings of accelerator conferences held around the world.
TY  - CONF
AU  - Song, Y.
AU  - Chen, X.
AU  - Li, C.
AU  - Liu, G.
AU  - Wang, J.G.
AU  - Xuan, K.
ED  - Cheng, Yung-Sen
ED  - Schaa, Volker RW
ED  - Chiu, Pei-Chen
ED  - Li, Lu
ED  - Liu, Yung-Hui
ED  - Petit-Jean-Genaz, Christine
TI  - Design and Construction of the Data Warehouse Based on Hadoop Ecosystem at HLS-II
J2  - Proc. of PCaPAC2018, Hsinchu, Taiwan, 16-19 October 2018
CY  - Hsinchu, Taiwan
T2  - International Workshop on Emerging Technologies and Scientific Facilities Controls
T3  - 12
LA  - english
AB  - A data warehouse based on Hadoop ecosystem is designed and constructed for Hefei Light Source II (HLS-II). The ETL program based on Spark migrates data to HDFS from RDB Channel Archiver and the EPICS Archiver Appliance continuously and store them in Parquet format. The distributed data analysis engine based on Impala greatly improves the performance of data retrieval and reduces the response time of queries. In this paper, we will describe our efforts and experience to use various open sources software and tools to effectively manage the big data. We will also report the plans on this data warehouse in the future.
PB  - JACoW Publishing
CP  - Geneva, Switzerland
SP  - 233
EP  - 235
KW  - EPICS
KW  - controls
KW  - software
KW  - database
KW  - distributed
DA  - 2019/01
PY  - 2019
SN  - 978-3-95450-200-4
DO  - DOI: 10.18429/JACoW-PCaPAC2018-FRCB2
UR  - http://jacow.org/pcapac2018/papers/frcb2.pdf
ER  -