à la carte

The à la carte (A Los Alamos Computer Architecture Toolkit for Extreme-Scale Architecture Simulation) project aims to develop a simulation-based analysis tool for evaluating massively parallel computing platforms, including current and future ASCI-scale systems. Such a tool will provide a means to analyze and optimize current systems and applications, as well as to influence the design and development of next-generation high-performance computers. The proposed multiscale simulations will combine models of individual computing components with workload characterizations of relevant classes of applications at various levels of complexity. Interactions will be modeled at different granularities, allowing the user to choose the level of detail and the associated evaluation cost. The computing systems modeled will be on the order of today's ASCI platforms, which imposes massive computational requirements, a need for computational tractability, and a need to study the dynamics of complex interactions. These requirements motivate the underlying basic research, with the intent to build the system efficiently and deploy it optimally for application-driven analysis.
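
As a rough illustration of this fidelity trade-off, the sketch below contrasts a low-fidelity analytic network model with a medium-fidelity per-hop model of a quaternary fat tree. The class names, parameter values, and hop-counting rule are illustrative assumptions, not the project's actual API.

```python
# Two network models of differing fidelity; hypothetical, not the a la carte API.
from dataclasses import dataclass

@dataclass
class LowFidelityNetwork:
    """Analytic model: one latency/bandwidth formula for the whole fabric."""
    latency_s: float = 5e-6          # assumed end-to-end latency
    bandwidth_Bps: float = 3e8       # assumed sustained bandwidth

    def transfer_time(self, src: int, dst: int, nbytes: int) -> float:
        return self.latency_s + nbytes / self.bandwidth_Bps

@dataclass
class MediumFidelityNetwork:
    """Per-hop model: charges latency for every switch a message crosses."""
    hop_latency_s: float = 1e-6
    link_bandwidth_Bps: float = 3e8
    arity: int = 4                   # quaternary fat tree

    def hops(self, src: int, dst: int) -> int:
        # Nodes under the same leaf switch cross one switch; otherwise the
        # message climbs until the two subtrees merge, then descends.
        h, a, b = 1, src // self.arity, dst // self.arity
        while a != b:
            a, b, h = a // self.arity, b // self.arity, h + 2
        return h

    def transfer_time(self, src: int, dst: int, nbytes: int) -> float:
        return (self.hops(src, dst) * self.hop_latency_s
                + nbytes / self.link_bandwidth_Bps)

# The same transfer evaluated at two levels of detail (and simulation cost):
for net in (LowFidelityNetwork(), MediumFidelityNetwork()):
    print(type(net).__name__, net.transfer_time(0, 4095, 1 << 20))
```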

Selected Publications

F. J. Alexander, M. Anghel, K. Berkbigler, G. Booker, B. Bush, K. Davis, A. Hoisie, N. Moss, S. Smith, T. P. Caudell, D. P. Holten, K. L. Summers, and C. Zhou, “Design, Implementation, and Validation of Network and Workload Simulations for a 30-TeraOPS Computer System,” Los Alamos National Laboratory.
The magnitude of the scientific computations targeted by the US DOE ASCI project requires as-yet unavailable computational power and unprecedented bandwidth to enable remote, real-time interaction with the compute servers. To facilitate these computations, ASCI plans to deploy massive computing platforms, possibly consisting of tens of thousands of processors, capable of achieving 10-100 TeraOPS, with WAN connectivity from these to distant sites. For various reasons, the current approach to building a yet-larger supercomputer (connecting commercially available SMPs with a network) may be reaching practical limits. Better hardware design and lower development costs require performance evaluation, analysis, and modeling of parallel applications and architectures, and in particular predictive capability. We outline an approach for simulating computing architectures applicable to extreme-scale systems (thousands of processors) and to advanced, novel architectural configurations, and describe our progress in its realization. The simulation environment is intended to allow (i) exploration of the hardware/architecture design space; (ii) exploration of the algorithm/implementation space both at the application level (e.g., data distribution and communication) and the system level (e.g., scheduling, routing, and load balancing); (iii) determining how application performance will scale with the number of processors or other components; (iv) analysis of the tradeoffs between performance and cost; and (v) testing and validating analytical models of computation and communication. Our component-based design allows for the seamless assembly of architectures from representations of workload, processors, network interfaces, switches, etc., with disparate resolutions, into an integrated simulation model. This accommodates different case studies that may require different levels of fidelity in various parts of a system. Our current implementation includes low- and medium-fidelity models of the network, and low-fidelity and direct-execution models of the workload. It supports studies of both simulation performance and scaling and the properties of the simulated system itself. Ongoing work allows more realistic simulation and dynamic visualization of ASCI-like workloads on very large machines.
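
The two workload representations mentioned above can be sketched as follows: a low-fidelity workload draws synthetic compute intervals from a distribution, while a direct-execution workload runs a real kernel and feeds the measured time to the simulator. The class names and parameters here are hypothetical, not the project's own.

```python
# Two workload models of differing fidelity; illustrative names only.
import random
import time

class LowFidelityWorkload:
    """Emits synthetic (compute_seconds, message_bytes) pairs."""
    def __init__(self, mean_compute_s=1e-3, msg_bytes=1 << 16, seed=0):
        self.rng = random.Random(seed)
        self.mean, self.msg = mean_compute_s, msg_bytes

    def next_event(self):
        return self.rng.expovariate(1.0 / self.mean), self.msg

class DirectExecutionWorkload:
    """Actually runs a kernel and reports its measured duration."""
    def __init__(self, kernel, msg_bytes=1 << 16):
        self.kernel, self.msg = kernel, msg_bytes

    def next_event(self):
        t0 = time.perf_counter()
        self.kernel()                      # real computation, real cost
        return time.perf_counter() - t0, self.msg

def toy_kernel(n=100_000):
    s = 0.0
    for i in range(n):
        s += i * i
    return s

for w in (LowFidelityWorkload(), DirectExecutionWorkload(toy_kernel)):
    dt, nbytes = w.next_event()
    print(type(w).__name__, f"compute={dt:.6f}s, send={nbytes}B")
```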

F. J. Alexander, K. Berkbigler, G. Booker, B. Bush, K. Davis, A. Hoisie, N. Moss, S. Smith, T. P. Caudell, D. P. Holten, K. L. Summers, and C. Zhou, “Design, Implementation, and Validation of Low- and Medium-Fidelity Network Simulations of a 30-TeraOPS System,” Los Alamos National Laboratory, Report LA-UR-02-6573.
The magnitude of the scientific computations targeted by the US DOE ASCI project requires as-yet unavailable computational power and unprecedented bandwidth to enable remote, real-time interaction with the compute servers. To facilitate these computations, ASCI plans to deploy massive computing platforms, possibly consisting of tens of thousands of processors, capable of achieving 10-100 TeraOPS, with WAN connectivity from these to distant sites. For various reasons, the current approach to building a yet-larger supercomputer (connecting commercially available SMPs with a network) may be reaching practical limits. Better hardware design and lower development costs require performance evaluation, analysis, and modeling of parallel applications and architectures, and in particular predictive capability. We outline an approach for simulating computing architectures applicable to extreme-scale systems (thousands of processors) and to advanced, novel architectural configurations, and describe our progress in its realization. The simulation environment is intended to allow (i) exploration of the hardware/architecture design space; (ii) exploration of the algorithm/implementation space both at the application level (e.g., data distribution and communication) and the system level (e.g., scheduling, routing, and load balancing); (iii) determining how application performance will scale with the number of processors or other components; (iv) analysis of the tradeoffs between performance and cost; and (v) testing and validating analytical models of computation and communication. Our component-based design allows for the seamless assembly of architectures from representations of workload, processors, network interfaces, switches, etc., with disparate resolutions, into an integrated simulation model. This accommodates different case studies that may require different levels of fidelity in various parts of a system. Our current implementation includes low- and medium-fidelity models of the network, and low-fidelity and direct-execution models of the workload. It supports studies of both simulation performance and scaling and the properties of the simulated system itself. Ongoing work allows more realistic simulation and dynamic visualization of ASCI-like workloads on very large machines.

F. J. Alexander, K. Berkbigler, G. Booker, B. Bush, K. Davis, A. Hoisie, S. Smith, T. P. Caudell, D. P. Holten, K. L. Summers, and C. Zhou, “Design and Implementation of Low- and Medium-Fidelity Network Simulations of a 30-TeraOPS System,” Los Alamos National Laboratory, Report LA-UR-02-1930.
The magnitude of the scientific computations targeted by the ASCI project requires as-yet unavailable computational power. To facilitate these computations, ASCI plans to deploy massive computing platforms, possibly consisting of tens of thousands of processors, capable of achieving 10-100 TeraOPS. For various reasons, the current approach to building a yet-larger supercomputer (connecting commercially available SMPs with a network) may be reaching practical limits. The path to better hardware design and lower development costs involves performance evaluation, analysis, and modeling of parallel applications and architectures, and in particular predictive capability. We outline an approach for simulating computing architectures applicable to extreme-scale systems (thousands of processors) and to advanced, novel architectural configurations. The proposed simulation environment can be used for (i) exploration of the hardware/architecture design space; (ii) exploration of the algorithm/implementation space both at the application level (e.g., data distribution and communication) and the system level (e.g., scheduling, routing, and load balancing); (iii) determining how application performance will scale with the number of processors or other components; (iv) analysis of the tradeoffs between performance and cost; and (v) testing and validating analytical models of computation and communication. Our component-based design allows for the seamless assembly of architectures from representations of workload, processors, network interfaces, switches, etc., with disparate resolutions, into an integrated simulation model. This accommodates different case studies that may require different levels of fidelity in various parts of a system. Our initial implementation, comprising low- and medium-fidelity models for the network and a low-fidelity model for the workload, can simulate at least 4096 computational nodes in a fat-tree network using Quadrics hardware. It supports studies of both simulation performance and scaling and the properties of the simulated system itself. Ongoing work allows more realistic simulation and visualization of ASCI-like workloads on very large machines.
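
For a sense of the scale being simulated, the following back-of-envelope calculation sizes a 4096-node quaternary fat tree using the standard k-ary n-tree formulas; the figures are derived from the topology, not taken from the report.

```python
def kary_ntree(k: int, n: int):
    """Standard k-ary n-tree sizing: k**n processors, n levels of switches."""
    nodes = k ** n                   # processors at the leaves
    switches = n * k ** (n - 1)      # k**(n-1) switches per level, n levels
    links = n * k ** n               # node-leaf links plus inter-level links
    return nodes, switches, links

nodes, switches, links = kary_ntree(k=4, n=6)
print(f"{nodes} nodes, {switches} switches, {links} links")
# -> 4096 nodes, 6144 switches, 24576 links
```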

K. Berkbigler, B. Bush, K. Davis, A. Hoisie, S. Smith, C. Zhou, K. Summers, and T. Caudell, “Graph Visualization for the Analysis of the Structure and Dynamics of Extreme-Scale Supercomputers,” Los Alamos National Laboratory, Report LA-UR-02-1929.
We are exploring the development and application of information visualization techniques for the analysis of new extreme-scale supercomputer architectures. Modern supercomputers typically comprise very large clusters of commodity SMPs interconnected by possibly dense and often nonstandard networks. The scale, complexity, and inherent nonlocality of the structure and dynamics of this hardware, and of the systems and applications distributed over it, challenge traditional analysis methods. As part of the à la carte team at Los Alamos National Laboratory, which is simulating these advanced architectures, we are exploring advanced visualization techniques and creating tools to provide intuitive exploration, discovery, and analysis of these simulations. This work complements existing and emerging algorithmic analysis tools. Here we give background on the problem domain, a description of a prototypical computer architecture of interest (on the order of 10,000 processors connected by a quaternary fat-tree network), and presentations of several visualizations of the simulation data that make clear the flow of data in the interconnection network.

Francis Alexander, Kathryn Berkbigler, Graham Booker, Brian Bush, Thomas Caudell, Kei Davis, Tim Eyring, Adolfy Hoisie, Donner Holten, Steve Smith, and Kenneth Summers, “Extreme-Scale Architecture Simulation,” presented at the SC’2001 Research Poster Session, Denver, Colorado.

Kathryn Berkbigler, Brian Bush, Kei Davis, Nicholas Moss, Steve Smith, Thomas P. Caudell, Kenneth L. Summers, and Cheng Zhou, “À la carte: A Simulation Framework for Extreme-Scale Hardware Architectures,” in MS 2003: IASTED International Conference on Modelling and Simulation, Palm Springs, California.
We outline à la carte, an approach for simulating computing architectures applicable to extreme-scale systems (thousands of processors) and to advanced, novel architectural configurations. Our component-based design allows for the seamless assembly of architectures from representations of workload, processors, network interfaces, switches, etc., with disparate resolutions, into an integrated simulation model. This accommodates different case studies that may require different levels of fidelity in various parts of a system. The current implementation includes low- and medium-fidelity models of the network and low-fidelity and direct-execution models of the workload. It supports studies of both simulation performance and scaling and the properties of the simulated system itself.
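
Continuing the hypothetical sketches above, the fragment below illustrates the component-assembly idea: the simulation driver sees only small interfaces, so models of very different fidelity can be mixed without changing the driver. None of these class names come from the framework itself.

```python
# Component assembly behind minimal interfaces; illustrative names only.

class ConstantWorkload:
    """Lowest-fidelity workload: fixed compute time, fixed message size."""
    def next_event(self):
        return 1e-3, 1 << 16          # (seconds of compute, bytes to send)

class AnalyticNetwork:
    """Lowest-fidelity network: a single latency/bandwidth formula."""
    def transfer_time(self, src, dst, nbytes):
        return 5e-6 + nbytes / 3e8

class Simulation:
    """Driver that sees only the interfaces, never the fidelity."""
    def __init__(self, workload, network, ranks):
        self.workload, self.network, self.ranks = workload, network, ranks
        self.clock = 0.0

    def run(self, steps=1000):
        for i in range(steps):
            compute_s, nbytes = self.workload.next_event()
            src, dst = i % self.ranks, (i + 1) % self.ranks
            self.clock += compute_s + self.network.transfer_time(src, dst, nbytes)
        return self.clock

# Swapping in a higher-fidelity component changes only this constructor call.
sim = Simulation(ConstantWorkload(), AnalyticNetwork(), ranks=4096)
print(f"simulated time: {sim.run():.4f} s")
```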

K. L. Summers, T. P. Caudell, K. Berkbigler, B. Bush, K. Davis, and S. Smith, “Graph visualization for the analysis of the structure and dynamics of extreme-scale supercomputers,” Information Visualization, vol. 3, no. 3, pp. 209–222. <http://dx.doi.org/10.1057/palgrave.ivs.9500079>
We are exploring the development and application of information visualization techniques for the analysis of new massively parallel supercomputer architectures. Modern supercomputers typically comprise very large clusters of commodity SMPs interconnected by possibly dense and often non-standard networks. The scale, complexity, and inherent non-locality of the structure and dynamics of this hardware, and of the operating systems and applications distributed over them, challenge traditional analysis methods. As part of the à la carte (A Los Alamos Computer Architecture Toolkit for Extreme-Scale Architecture Simulation) team at Los Alamos National Laboratory, which is simulating these new architectures, we are exploring advanced visualization techniques and creating tools to enhance analysis of these simulations with intuitive three-dimensional representations and interfaces. This work complements existing and emerging algorithmic analysis tools. In this paper, we give background on the problem domain, a description of a prototypical computer architecture of interest (on the order of 10,000 processors connected by a quaternary fat-tree communications network), and a presentation of three classes of visualizations that clearly display the switching fabric and the flow of information in the interconnecting network.
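
As a toy, two-dimensional stand-in for the three-dimensional views described in the paper, the sketch below builds the graph of a small quaternary fat tree (a 4-ary 2-tree) and emits Graphviz DOT; it uses generic graph tooling, not the authors' visualization system.

```python
def kary_2tree_dot(k: int = 4) -> str:
    """Emit DOT for a k-ary 2-tree: k**2 processors, k leaf and k spine switches."""
    lines = ["graph fattree {"]
    for p in range(k * k):                 # processors under their leaf switches
        lines.append(f"  p{p} -- leaf{p // k};")
    for leaf in range(k):                  # full mesh between the two levels:
        for spine in range(k):             # every leaf reaches every spine
            lines.append(f"  leaf{leaf} -- spine{spine};")
    lines.append("}")
    return "\n".join(lines)

# Pipe the output to `dot -Tpng` (or any Graphviz layout) to render the fabric.
print(kary_2tree_dot())
```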