Paper | Title | Page |
---|---|---|
MOSDI1 | Analyzing Multipacting Problems in Accelerators using ACE3P on High Performance Computers | 54 |
Funding: This material is based upon work supported by the U.S. Department of Energy Office of Science under Cooperative Agreement DE-SC0000661.

Track3P is the particle tracking module of ACE3P, a 3D parallel finite-element electromagnetic code suite developed at SLAC, which has been implemented on the US DOE supercomputers at NERSC to simulate large-scale complex accelerator designs. Using the higher-order cavity fields generated by the ACE3P codes, Track3P has been used to analyze multipacting (MP) in accelerator cavities. The prediction of the MP barriers in the ICHIRO cavity at KEK was the first Track3P benchmark against measurements. Using a large number of processors, Track3P can scan through the field gradient and cavity surface efficiently, and its comprehensive postprocessing tool allows the identification of both hard and soft MP barriers and the locations of MP activity. Results from applications of this high-performance simulation capability to accelerators such as the Quarter Wave Resonator for FRIB, the 704 MHz SRF gun cavity for the BNL ERL, and the muon cooling cavity for the Muon Collider will be presented.
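The kind of resonance scan described above can be illustrated with a toy model (hypothetical code, not part of ACE3P or Track3P): a single electron is tracked across a plate gap in an RF field, yielding the transit time and impact energy that a multipacting diagnosis starts from.

```python
import math

# Toy 1D two-plate model (illustrative only): an electron leaves one
# plate at RF phase phi0 and is pushed across a gap d by the field
# E(t) = E0*sin(omega*t); leapfrog integration yields the transit time
# and impact kinetic energy.

QE = 1.602e-19   # electron charge [C]
ME = 9.109e-31   # electron mass [kg]

def track_gap(E0, freq, d, phi0, dt=1e-13, tmax=1e-7):
    """Return (transit_time_s, impact_energy_eV) for one electron."""
    omega = 2.0 * math.pi * freq
    t0 = phi0 / omega
    t, x, v = t0, 0.0, 0.0
    while 0.0 <= x < d and t < tmax:
        a = (QE / ME) * E0 * math.sin(omega * t)
        v += a * dt          # kick
        x += v * dt          # drift
        t += dt
    return t - t0, 0.5 * ME * v * v / QE

# Example: 1 MV/m at 1.3 GHz across a 1 cm gap
transit, energy = track_gap(1e6, 1.3e9, 0.01, 0.3)
```

A real scan repeats this over emission phases, surface points, and field gradients and folds in a secondary-emission-yield model; each trajectory is independent, which is why such scans parallelize well over many processors.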
MOSDC2 | GPGPU Implementation of Matrix Formalism for Beam Dynamics Simulation | 59 |
Matrix formalism is a map integration method for solving ODEs. It allows the solution of the system to be presented as sums and products of two-index numeric matrices. This approach is easy to implement in parallel codes. The GPU architecture was chosen as the most natural fit for matrix operations. A set of methods for beam dynamics has been implemented; both particle and envelope dynamics are supported. The computing facilities, NVIDIA Tesla clusters, are located at St. Petersburg State University.
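The core idea, that tracking reduces to dense matrix products, can be sketched in NumPy (an illustrative stand-in, not the authors' code; on a GPU the same products map onto BLAS-style kernels):

```python
import numpy as np

# In matrix formalism a linear beamline element is a transfer matrix
# acting on phase-space vectors (x, x'), and tracking a whole particle
# ensemble is a single matrix-matrix product -- exactly the operation
# GPUs execute efficiently.

def drift(L):
    return np.array([[1.0, L], [0.0, 1.0]])

def thin_quad(f):
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

# Compose a FODO-like cell into one map by multiplying element matrices
R = drift(1.0) @ thin_quad(-2.0) @ drift(1.0) @ thin_quad(2.0)

# Track 10^5 particles in one shot: a (2 x N) matrix product
particles = np.random.default_rng(0).normal(size=(2, 100_000)) * 1e-3
out = R @ particles
```

Composing the per-element matrices once and then applying the combined map to the whole ensemble keeps the arithmetic intensity high, which is what makes the GPU mapping attractive.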
Slides MOSDC2 [0.770 MB]
MOSDC3 | Fast Determination of Spurious Oscillations in an Entire Klystron Tube with ACE3P | |
Funding: USDOE

Spurious oscillations remain one of the challenges in the development of high-power klystrons, preventing a tube from reaching its design performance. ACE3P is a parallel electromagnetic code suite comprising Omega3P, which computes the eigenmodes of open cavities, and Track3P, which calculates the particle trajectories in the cavity fields. The oscillation condition is determined by the total Q of the mode, which combines the external Q from Omega3P with the beam-loaded Q due to energy gain or loss computed with Track3P. With massively parallel computing it is possible to perform an exhaustive search for unstable modes in a given klystron, from the gun to the collector, on a time scale much shorter than with existing tools. Applications to the XC8 and LBSK klystrons at SLAC will be presented.
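One common bookkeeping convention, assumed here since the abstract does not spell it out, is that the Q contributions combine through their reciprocals, because energy-loss rates add; the mode is then unstable when the net loss rate 1/Q_total goes negative:

```python
# Illustrative Q bookkeeping (not ACE3P itself).  A negative
# beam-loaded Q means the beam feeds energy into the mode; the mode
# self-oscillates when the total loss rate 1/Q_total is negative.

def total_q(q_ext, q_beam):
    """Combine external and beam-loaded Q via reciprocal (rate) addition."""
    inv = 1.0 / q_ext + 1.0 / q_beam
    return float('inf') if inv == 0 else 1.0 / inv

def is_unstable(q_ext, q_beam):
    return total_q(q_ext, q_beam) < 0

# A mode with Q_ext = 5000 and beam-loaded Q_beam = -2000: the beam
# pumps energy in faster than the external coupling removes it.
```

The exhaustive search the abstract describes amounts to evaluating this condition for every candidate mode from the gun to the collector, which is trivially parallel over modes.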
Slides MOSDC3 [1.018 MB]
THP09 | Global Scan of All Stable Settings (GLASS) for the ANKA Storage Ring | 239 |
Funding: This work has been supported by the Initiative and Networking Fund of the Helmholtz Association under contract number VH-NG-320.

The design of an optimal magnetic optics for a storage ring is not a simple optimization problem, since numerous objectives have to be considered; figures of merit include the tune values, optical functions, momentum compaction factor, emittance, etc. The technique called “GLobal scan of All Stable Settings” (GLASS) provides a systematic analysis of the magnetic optics and gives a global overview of the capabilities of the storage ring. We have developed a parallel version of GLASS which can run on multi-core processors, significantly decreasing the computation time. In this paper we present our GLASS implementation and show results for the ANKA lattice.
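A miniature GLASS-style scan can be sketched with a thin-lens FODO cell (a hypothetical example, not the ANKA lattice or the authors' code): every setting of the two quadrupole strengths is kept if the one-turn matrix is stable, |Tr M| < 2, in both transverse planes.

```python
import numpy as np

# Brute-force stability scan over two quadrupole-family strengths of a
# thin-lens FODO cell -- the same "test every setting" idea GLASS
# applies to a real lattice with more objectives.

def fodo_map(k1, k2, L=1.0):
    """One-turn 2x2 matrix of a thin-lens FODO cell in one plane."""
    def drift(L):
        return np.array([[1.0, L], [0.0, 1.0]])
    def quad(k):
        return np.array([[1.0, 0.0], [-k, 1.0]])
    return drift(L) @ quad(k2) @ drift(L) @ quad(k1)

stable = []
for k1 in np.linspace(-2.0, 2.0, 81):
    for k2 in np.linspace(-2.0, 2.0, 81):
        mx = fodo_map(k1, k2)      # horizontal plane
        my = fodo_map(-k1, -k2)    # vertical plane: gradients flip sign
        if abs(np.trace(mx)) < 2.0 and abs(np.trace(my)) < 2.0:
            stable.append((k1, k2))
```

Each grid point is evaluated independently, which is exactly why the scan parallelizes so well across cores.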
THP10 | GPU-Accelerated Beam Dynamics Simulations with ELEGANT | |
Funding: Work supported by the DOE Office of Science, Office of Basic Energy Sciences, grant No. DE-SC0004585, and in part by Tech-X Corporation.

Efficient implementation of general-purpose particle tracking on GPUs can bring significant performance benefits to large-scale particle tracking and tracking-based lattice optimization. We present the latest results of our work on accelerating Argonne National Lab's accelerator simulation code ELEGANT* using CUDA-enabled GPUs**. We provide a list of ELEGANT's beamline elements ported to GPUs, identify performance-limiting factors, and briefly discuss optimization techniques for efficient utilization of the device memory space, with an emphasis on register usage. We also present a novel hardware-assisted technique for efficiently calculating a histogram from a large distribution of particle coordinates, and compare it to data-parallel implementations. Finally, we discuss results of simulations performed with realistic test lattices and give a brief outline of future work on the GPU-enabled version of ELEGANT.

* M. Borland, "elegant: A Flexible SDDS-compliant Code for Accel. Simulation", APS LS-287 (2000); Y. Wang, M. Borland, Proc. of PAC07, THPAN095 (2007)
** CUDA home page: http://www.nvidia.com/cuda
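The data-parallel histogramming strategy mentioned above can be sketched in NumPy (illustrative only, not the ELEGANT CUDA kernel): each worker bins a private slice of the coordinates, and the private histograms are reduced by summation, avoiding contention on shared bins.

```python
import numpy as np

# Privatized-histogram pattern: split the particle coordinates across
# workers (thread blocks on a GPU), bin each slice into a private
# histogram, then sum the partial histograms.  The result is exactly
# the serial histogram, but without atomic updates to shared bins.

def parallel_histogram(coords, bins, lo, hi, n_workers=8):
    chunks = np.array_split(coords, n_workers)
    partials = [np.histogram(c, bins=bins, range=(lo, hi))[0] for c in chunks]
    return np.sum(partials, axis=0)

# A Gaussian longitudinal coordinate distribution as a stand-in beam
z = np.random.default_rng(1).normal(0.0, 1e-3, size=1_000_000)
h = parallel_histogram(z, bins=64, lo=-5e-3, hi=5e-3)
```

Because binning is a pure counting operation, splitting the particles and summing the partial counts reproduces the serial result exactly.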
THSDI1 | Coherent Electron Cooling Simulations for Parameters of the BNL Proof-of-principle Experiment | |
Funding: Work funded by the US Department of Energy, Office of Science, Office of Nuclear Physics.

Increasing the luminosity of relativistic hadron beams is critical for the advancement of nuclear physics. Coherent electron cooling promises to cool such beams significantly faster than alternative methods. We present simulations of 40 GeV/n Au79+ ions for a single pass, which consists of a modulator, an FEL amplifier, and a kicker. In the modulator, the electron beam copropagates with the ion beam, which perturbs the electron beam density and velocity via anisotropic Debye shielding. Self-amplified spontaneous emission lasing in the FEL both amplifies and imparts wavelength-scale modulation on the electron beam perturbations. The modulated electric fields appropriately accelerate or decelerate the copropagating ions in the kicker. In analogy with stochastic cooling, these field strengths are crucial for estimating the effective drag force on the hadrons and, hence, the cooling time. The inherently 3D particle and field dynamics are modeled with the parallel VORPAL framework (modulator and kicker) and with GENESIS (amplifier), with careful coupling between the codes. Physical parameters are taken from the CeC proof-of-principle experiment under development at Brookhaven National Lab.
Slides THSDI1 [14.817 MB]
FRSAC1 | Hybrid Programming and Performance for Beam Propagation Modeling | 284 |
Funding: DOE ASCR (Advanced Scientific Computing Research) Program

We examined hybrid parallel infrastructures in order to ensure performance and scalability for beam propagation modeling as we move toward extreme-scale systems. Starting from an MPI programming interface for parallel algorithms, we expanded the capability of our existing electromagnetic solver to a hybrid (MPI/shared-memory) model that can potentially use the compute resources of future-generation computing architectures more efficiently. As a preliminary step, we discuss a hybrid MPI/OpenMP model and demonstrate performance and analysis on leadership-class computing systems such as the IBM BG/P, BG/Q, and Cray XK6. Our hybrid MPI/OpenMP model achieves speedup when the amount of computation is large enough to compensate for the OpenMP threading overhead.
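The hybrid decomposition can be illustrated with a Python stand-in (conceptual only; the actual solver uses MPI and OpenMP in a compiled language): the domain is split across "ranks", and each rank threads over its local slice, so speedup appears only when the per-rank work outweighs the threading overhead.

```python
from concurrent.futures import ThreadPoolExecutor

# Two-level parallelism sketch: the outer split plays the role of MPI
# ranks (distributed memory), the inner ThreadPoolExecutor plays the
# role of OpenMP threads (shared memory within a rank).

def rank_work(local_slice, n_threads=4):
    """One 'MPI rank' threading over its local data ('OpenMP' level)."""
    def kernel(chunk):
        return sum(x * x for x in chunk)   # stand-in for a field update
    step = max(1, len(local_slice) // n_threads)
    chunks = [local_slice[i:i + step]
              for i in range(0, len(local_slice), step)]
    with ThreadPoolExecutor(n_threads) as pool:
        return sum(pool.map(kernel, chunks))

# Two "ranks", each threading over half the domain, then a reduction
data = list(range(1000))
total = rank_work(data[:500]) + rank_work(data[500:])
```

The design point the abstract makes is visible here: the threading layer adds fixed overhead (pool setup, chunking), so it only pays off when each chunk carries substantial computation.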
Slides FRSAC1 [4.252 MB]
FRSAC2 | Comparison of Eigenvalue Solvers for Large Sparse Matrix Pencils | 287 |
Funding: Work supported by the DFG through SFB 634

Efficient and accurate computation of eigenvalues and eigenvectors is of fundamental importance in the accelerator physics community. Moreover, eigensystem analysis is generally used for the identification of many physical phenomena connected to vibrations. Therefore, various algorithms such as Arnoldi, Lanczos, Krylov-Schur, and Jacobi-Davidson have been implemented to solve the eigenvalue problem efficiently. In this direction, we investigate the performance of selected commercial and freely available software tools for the solution of a generalized eigenvalue problem. We choose two setups, spherical and billiard resonators, in order to test the robustness, accuracy, computational speed, and memory consumption of recent versions of CST, Matlab, Pysparse, SLEPc, and CEM3D. Simulations were performed on a standard personal computer as well as on a cluster computer to enable the handling of large sparse matrices on the order of hundreds of thousands up to several million degrees of freedom. We obtain comparison results for the examined solvers which are useful for choosing the appropriate solver for a given practical application.
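The problem class being benchmarked can be illustrated on a small dense example (a NumPy sketch, not one of the tested solvers): a symmetric generalized problem A x = λ B x with B positive definite reduces to a standard one via the Cholesky factor B = L Lᵀ, giving C y = λ y with C = L⁻¹ A L⁻ᵀ and x = L⁻ᵀ y.

```python
import numpy as np

# Cholesky reduction of a symmetric generalized eigenvalue problem
# A x = lambda B x (B symmetric positive definite) to a standard
# symmetric problem, the textbook route sparse packages also follow
# in factorized form.

def generalized_eigh(A, B):
    L = np.linalg.cholesky(B)          # B = L @ L.T
    Linv = np.linalg.inv(L)
    C = Linv @ A @ Linv.T              # standard symmetric problem
    w, y = np.linalg.eigh(C)
    x = np.linalg.solve(L.T, y)        # map eigenvectors back: x = L^-T y
    return w, x

rng = np.random.default_rng(2)
M = rng.normal(size=(50, 50))
A = M + M.T                            # symmetric "stiffness" stand-in
B = M @ M.T + 50 * np.eye(50)          # SPD "mass" stand-in
w, x = generalized_eigh(A, B)
```

For the matrix sizes quoted in the abstract (10⁵ to 10⁶ degrees of freedom), dense reduction is of course infeasible, which is precisely why iterative sparse solvers such as those benchmarked here are needed.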
Slides FRSAC2 [10.095 MB]