Author: Arrowsmith, M.
Paper Title Page
THPPC082 Monitoring of the National Ignition Facility Integrated Computer Control System 1266
 
  • J.M. Fisher, M. Arrowsmith, E.A. Stout
    LLNL, Livermore, California, USA
 
  Funding: This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. #LLNL-ABS-632812
The Integrated Computer Control System (ICCS), used by the National Ignition Facility (NIF), provides comprehensive status and control capabilities for operating approximately 100,000 devices through 2,600 processes located on 1,800 servers, front-end processors and embedded controllers. Understanding the behavior of complex, large-scale, operational control software, and improving system reliability and availability, are critical maintenance activities. In this paper we describe the ICCS diagnostic framework, with tunable detail levels and automatic rollovers, and its use in analyzing system behavior. ICCS recently added Splunk as a tool for improved archiving and analysis of the log files this framework produces (about 20 GB, or 35 million log entries, per day). Splunk now continuously captures all ICCS log files for both real-time examination and exploration of trends. Its powerful search query language and user interface allow interactive exploration of log data to visualize specific indicators of system performance, assist in problem analysis, and provide instantaneous notification of specific system behaviors.
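
The abstract does not say how the ICCS diagnostic framework is implemented. As an illustration only, the two features it names, tunable detail levels and automatic rollovers, can be sketched with Python's standard logging module. The function names, logger name, file name, and size limits below are hypothetical and are not taken from ICCS.

```python
import logging
import logging.handlers
import os
import tempfile


def make_diagnostic_logger(name, path, level=logging.INFO,
                           max_bytes=1024, backups=3):
    """Return a logger with an adjustable level and size-based rollover.

    Hypothetical helper: all parameter values are illustrative assumptions.
    """
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep demo output out of the root logger
    # Roll over to path.1, path.2, ... once the file would exceed max_bytes.
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def demo():
    """Write enough records to force several rollovers; return file names."""
    log_dir = tempfile.mkdtemp()
    path = os.path.join(log_dir, "process.log")
    logger = make_diagnostic_logger("demo.process", path)

    logger.debug("suppressed while the level is INFO")
    logger.setLevel(logging.DEBUG)  # raise the detail level at run time
    for i in range(200):
        logger.debug("device %d status ok", i)

    # Close and detach handlers so the files are flushed and released.
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)
    return sorted(os.listdir(log_dir))


if __name__ == "__main__":
    print(demo())
```

With a small size limit the handler keeps the active file plus at most `backups` older files, so disk use stays bounded while the most recent records remain available for collection by an external tool.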
 
Poster THPPC082 [4.693 MB]