LightAMR format standard and lossless compression algorithms for adaptive mesh refinement grids: RAMSES use case [IMA]

http://arxiv.org/abs/2208.11958


The evolution of parallel I/O libraries, as well as new concepts such as ‘in transit’ and ‘in situ’ visualization and analysis, have been identified as key technologies to circumvent the I/O bottleneck in pre-exascale applications. Nevertheless, data structures and data formats can also be improved, both to reduce I/O volume and to improve data interoperability between data producers and data consumers. In this paper, we propose a very lightweight and purpose-specific post-processing data model for AMR meshes, called lightAMR. Based on this data model, we introduce a tree pruning algorithm that removes data redundancy from a fully threaded AMR octree. In addition, we present two lossless compression algorithms, one for the AMR grid structure description and one for AMR double/single-precision physical quantity scalar fields. We then present performance benchmarks of this new lightAMR data model and of the pruning and compression algorithms on RAMSES simulation datasets. We show that our pruning algorithm can reduce the total number of cells in RAMSES AMR datasets by 10-40% without loss of information. Finally, we show that the RAMSES AMR grid structure can be compacted by ~ 3 orders of magnitude, and that the float scalar fields can be compressed by a factor of ~ 1.2 in double precision and ~ 1.3 – 1.5 in single precision, at a compression speed of ~ 1 GB/s.
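
The pruning idea can be pictured with a toy example: in a fully threaded tree every cell, refined or not, carries its payload, so the payloads of refined cells are recoverable from their children and can be dropped. Below is a minimal Python sketch of that conversion, using our own simplified data model rather than the actual lightAMR layout:

```python
# Toy illustration of the pruning idea, not the actual lightAMR layout:
# a fully threaded tree stores a payload on every cell, so refined-cell
# payloads are redundant; we keep one refinement bit per cell plus the
# leaf values only.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    value: float                                           # payload on every cell
    children: List["Node"] = field(default_factory=list)   # 8 children, or none

def prune(root: Node):
    """Breadth-first walk emitting one refinement bit per cell plus the
    values of leaf cells; refined-cell payloads are dropped as redundant."""
    bitmask, leaf_values = [], []
    level = [root]
    while level:
        next_level = []
        for cell in level:
            refined = bool(cell.children)
            bitmask.append(1 if refined else 0)
            if refined:
                next_level.extend(cell.children)
            else:
                leaf_values.append(cell.value)
        level = next_level
    return bitmask, leaf_values

# one refined root holding 8 leaves: the root's payload is not stored
root = Node(0.0, [Node(float(i)) for i in range(8)])
bits, vals = prune(root)
print(bits)   # [1, 0, 0, 0, 0, 0, 0, 0, 0]
print(vals)   # [0.0, 1.0, ..., 7.0]
```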

Read this paper on arXiv…

L. Strafella and D. Chapon
Fri, 26 Aug 22
19/49

Comments: 11 pages, 7 figures, accepted for publication in Journal of Computational Physics

Corrfunc: Blazing fast correlation functions with AVX512F SIMD Intrinsics [IMA]

http://arxiv.org/abs/1911.08275


Correlation functions are widely used in extragalactic astrophysics to extract insights into how galaxies occupy dark matter halos, and in cosmology to place stringent constraints on cosmological parameters. A correlation function fundamentally requires computing pair-wise separations between two sets of points and then computing a histogram of the separations. Corrfunc is an existing open-source, high-performance software package for efficiently computing a multitude of correlation functions. In this paper, we discuss the SIMD AVX512F kernels within Corrfunc, capable of processing 16 floats or 8 doubles at a time. The latest manually implemented Corrfunc AVX512F kernels show a speedup of up to $\sim 4\times$ relative to compiler-generated code for double-precision calculations. The AVX512F kernels show a $\sim 1.6\times$ speedup relative to the AVX kernels, comparing favorably to the theoretical maximum of $2\times$. In addition, by pruning pairs whose minimum possible separation is already too large, we achieve a $\sim 5-10\%$ speedup across all the SIMD kernels. Such speedups highlight the importance of programming explicitly with SIMD vector intrinsics for complex calculations that cannot be efficiently vectorized by compilers. Corrfunc is publicly available at https://github.com/manodeep/Corrfunc/.
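
The two core ingredients, histogramming pairwise separations and pruning point-set pairs whose minimum possible separation is already beyond the largest bin, can be sketched in a few lines of NumPy. This is an illustration of the idea only, not Corrfunc's cell grid or SIMD kernels; all names are ours:

```python
# Hedged sketch: histogram pairwise separations of two point sets,
# skipping chunk pairs whose minimum possible separation exceeds the
# largest bin edge. Contiguous z-chunks stand in for Corrfunc's cells.

import numpy as np

def paircount(p1, p2, bins, nchunk=8):
    counts = np.zeros(len(bins) - 1, dtype=np.int64)
    rmax = bins[-1]
    # sort both sets along z and split into contiguous chunks
    p1 = p1[np.argsort(p1[:, 2])]
    p2 = p2[np.argsort(p2[:, 2])]
    for c1 in np.array_split(p1, nchunk):
        for c2 in np.array_split(p2, nchunk):
            # prune: the z-gap between chunks bounds every separation below
            dz_min = max(c1[:, 2].min() - c2[:, 2].max(),
                         c2[:, 2].min() - c1[:, 2].max(), 0.0)
            if dz_min > rmax:
                continue
            d = np.linalg.norm(c1[:, None, :] - c2[None, :, :], axis=-1)
            counts += np.histogram(d, bins=bins)[0]
    return counts

rng = np.random.default_rng(42)
a, b = rng.random((2000, 3)), rng.random((2000, 3))
print(paircount(a, b, np.linspace(0.01, 0.1, 11)))
```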

Read this paper on arXiv…

M. Sinha and L. Garrison
Wed, 20 Nov 19
35/73

Comments: Paper II for the Corrfunc software package, paper I is on arXiv here: arXiv:1911.03545. Appeared in the refereed proceedings for the “Second Workshop on Software Challenges to Exascale Computing”

NEARBY Platform for Automatic Asteroids Detection and EURONEAR Surveys [IMA]

http://arxiv.org/abs/1903.03479


The survey of nearby space and the continuous monitoring of Near Earth Objects (NEOs), and especially Near Earth Asteroids (NEAs), are essential for the future of our planet and should represent a priority for solar system research and nearby space exploration. More computing power and sophisticated digital tracking algorithms are needed to cope with the larger astronomical imaging cameras dedicated to survey telescopes. This paper presents the NEARBY platform, which aims to experiment with new algorithms for automatic image reduction and for the detection and validation of moving objects, specifically NEAs, in astronomical surveys. The NEARBY platform has been developed and tested through collaborative research between the Technical University of Cluj-Napoca (UTCN) and the University of Craiova, Romania, using the observing infrastructure of the Instituto de Astrofisica de Canarias (IAC) and the Isaac Newton Group (ING), La Palma, Spain. The platform has been deployed on UTCN’s cloud infrastructure, and the acquired images are processed remotely by astronomers, who transfer them from ING through the web interface of the NEARBY platform. The paper analyzes and highlights the main aspects of the NEARBY platform’s development, along with results and conclusions from the EURONEAR surveys.

Read this paper on arXiv…

D. Gorgan, O. Vaduvescu, T. Stefanut, et al.
Mon, 11 Mar 19
48/78

Comments: ESA NEO and Debris Detection Conference, ESA/ESOC, Darmstadt, Germany, 22-24 Jan 2019

SWIFT: Maintaining weak-scalability with a dynamic range of $10^4$ in time-step size to harness extreme adaptivity [CL]

http://arxiv.org/abs/1807.01341


Cosmological simulations require the use of a multiple time-stepping scheme. Without such a scheme, cosmological simulations would be impossible due to their enormous dynamic range: over eleven orders of magnitude in density. Such a large dynamic range leads to a range of over four orders of magnitude in time-step size, which presents a significant load-balancing challenge. In this work, the extreme adaptivity that cosmological simulations present is tackled in three main ways through the use of the code SWIFT. First, an adaptive mesh is used to ensure that only the relevant particles interact in a given time-step. Second, task-based parallelism is used to ensure efficient load-balancing within a single node, using pthreads and SIMD vectorisation. Finally, a domain decomposition strategy is presented, using the graph partitioning library METIS, that bisects the work that must be performed by the simulation across nodes using MPI. These three strategies are shown to give SWIFT near-perfect weak-scaling characteristics, losing only 25% performance when scaling from 1 to 4096 cores on a representative problem, whilst being more than 30x faster than the de facto standard Gadget-2 code.
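
A four-orders-of-magnitude spread in time-step size is conventionally handled with a power-of-two time-step hierarchy. The sketch below illustrates only that standard binning pattern, with made-up names and constants; it does not reproduce SWIFT's actual scheme:

```python
# Hedged sketch of a power-of-two time-step hierarchy: each particle's
# time-step is rounded down to dt_max / 2^k, and at a given step only
# the bins that divide the current time are active and get updated.

import math

DT_MAX = 1.0
NBINS = 14          # spans a ~10^4 dynamic range in time-step size

def timestep_bin(dt):
    """Largest k with DT_MAX / 2^k <= dt (clamped to the deepest bin)."""
    k = max(0, math.ceil(math.log2(DT_MAX / dt)))
    return min(k, NBINS - 1)

def active_bins(step):
    """Bins active at an integer step counted in units of the smallest step."""
    smallest = 2 ** (NBINS - 1)
    return [k for k in range(NBINS) if step % (smallest >> k) == 0]

print(timestep_bin(0.3))   # 2: this particle advances with DT_MAX / 4
print(active_bins(0))      # every bin fires at t = 0
print(active_bins(1))      # only the deepest (smallest-step) bin
```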

Read this paper on arXiv…

J. Borrow, R. Bower, P. Draper, et al.
Thu, 5 Jul 18
39/60

Comments: N/A

Mapping the Similarities of Spectra: Global and Locally-biased Approaches to SDSS Galaxy Data [IMA]

http://arxiv.org/abs/1609.03932


We apply a novel spectral graph technique, that of locally-biased semi-supervised eigenvectors, to study the diversity of galaxies. This technique permits us to characterize empirically the natural variations in observed spectral data, and we illustrate how this approach can be used in an exploratory manner to highlight both large-scale global and small-scale local structure in Sloan Digital Sky Survey (SDSS) data. We use this method in a way that simultaneously takes into account the measurements of spectral lines as well as the continuum shape. Unlike Principal Component Analysis, this method does not assume that the Euclidean distance between galaxy spectra is a good global measure of similarity between all spectra; instead, it only assumes that local difference information between similar spectra is reliable. Moreover, unlike other nonlinear dimensionality reduction methods, this method can be used to characterize very finely both small-scale local and large-scale global properties of realistic noisy data. The power of the method is demonstrated on the SDSS Main Galaxy Sample by illustrating that the derived embeddings of spectra carry an unprecedented amount of information. Using a straightforward global or unsupervised variant, we observe that the main features correlate strongly with star formation rate and that they clearly separate active galactic nuclei. Computed parameters of the method can be used to describe line strengths and their interdependencies. Using a locally-biased or semi-supervised variant, we are able to focus on typical variations around specific objects of astronomical interest. We present several examples illustrating that this approach can enable new discoveries in the data, as well as a detailed understanding of very fine local structure that would otherwise be overwhelmed by large-scale noise and global trends in the data.
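
To make the contrast with PCA concrete, here is a hedged sketch of the global (unsupervised) step only: a kNN similarity graph built from local distances, embedded via the bottom non-trivial eigenvectors of the normalized graph Laplacian. The locally-biased variant adds a seed-set locality constraint that is not shown; all parameter choices here are illustrative:

```python
# Hedged sketch of a global spectral embedding: only local (kNN)
# distances enter the graph, so no global Euclidean similarity between
# all spectra is assumed. Not the paper's exact construction.

import numpy as np
from scipy.sparse import csgraph
from scipy.sparse.linalg import eigsh
from sklearn.neighbors import kneighbors_graph

def spectral_embedding(spectra, n_components=2, k=10):
    # symmetrized kNN connectivity graph over the spectra
    w = kneighbors_graph(spectra, k, mode="connectivity", include_self=False)
    w = 0.5 * (w + w.T)
    lap = csgraph.laplacian(w, normed=True)
    # smallest eigenpairs; skip the trivial constant eigenvector
    vals, vecs = eigsh(lap, k=n_components + 1, which="SM")
    return vecs[:, 1:]

emb = spectral_embedding(np.random.rand(500, 100))
print(emb.shape)   # (500, 2)
```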

Read this paper on arXiv…

D. Lawlor, T. Budavari and M. Mahoney
Wed, 14 Sep 16
19/75

Comments: 34 pages. A modified version of this paper has been accepted to The Astrophysical Journal

A fast algorithm for identifying Friends-of-Friends halos [IMA]

http://arxiv.org/abs/1607.03224


We describe a simple and fast algorithm for identifying friends-of-friends clusters and prove its correctness. The algorithm avoids unnecessary expensive neighbor queries, uses minimal memory overhead, and avoids slowdown in high over-density regions. We define our algorithm formally based on pair enumeration, a problem that has been heavily studied in fast 2-point correlation codes, and our reference implementation employs a dual KD-tree correlation function code. We construct halos in a hierarchical merger tree, and use a splay operation to reduce the average cost of identifying the root of a cluster from $O[\log L]$ to $O[1]$ ($L$ is the size of a cluster) without additional memory cost. This reduces the overall time complexity of merging trees from $O[L\log L]$ to $O[L]$, reducing the number of operations by orders of magnitude. We next introduce a pruning operation that skips pair enumeration between two fully self-connected KD-tree nodes. This improves the robustness of the algorithm, reducing the cost of exploring high-density peaks from $O[\delta^2]$ to $O[\delta]$. We show that for cosmological datasets the algorithm eliminates more than half of the enumerations for typically used linking lengths $b \sim 0.2$, and empirically scales as $O[\log b]$ in the large-$b$ (linking length) limit. Furthermore, our algorithm is extremely simple and easy to implement on top of an existing pair enumeration code, reusing the optimization effort that has been invested in fast correlation function codes.
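
The splay operation on the merger tree behaves like path compression in a union-find forest: after locating a cluster's root, every node on the search path is repointed at it, so subsequent root lookups cost amortized $O[1]$. A minimal sketch of that bookkeeping follows (our own simplification; the dual KD-tree pair enumeration is not shown):

```python
# Hedged sketch of the merger bookkeeping as a union-find forest with
# path compression standing in for the paper's splay operation.

def find(parent, i):
    root = i
    while parent[root] != root:
        root = parent[root]
    while parent[i] != root:        # compress: repoint the whole path at the root
        parent[i], i = root, parent[i]
    return root

def union(parent, i, j):
    ri, rj = find(parent, i), find(parent, j)
    if ri != rj:
        parent[rj] = ri             # merge the two clusters

n = 10
parent = list(range(n))
for i, j in [(0, 1), (1, 2), (5, 6)]:   # pairs closer than the linking length
    union(parent, i, j)
print([find(parent, i) for i in range(n)])   # cluster label per point
```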

Read this paper on arXiv…

Y. Feng and C. Modi
Wed, 13 Jul 16
62/74

Comments: 9 pages, 3 figures. Submitting to Astronomy and Computing

Mathematical Foundations of the GraphBLAS [CL]

http://arxiv.org/abs/1606.05790


The GraphBLAS standard (GraphBlas.org) is being developed to bring the potential of matrix-based graph algorithms to the broadest possible audience. Mathematically, the GraphBLAS defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the mathematics of the GraphBLAS. Graphs represent connections between vertices with edges. Matrices can represent a wide range of graphs using adjacency matrices or incidence matrices. Adjacency matrices are often easier to analyze while incidence matrices are often better for representing data. Fortunately, the two are easily connected by matrix multiplication. A key feature of matrix mathematics is that a very small number of matrix operations can be used to manipulate a very wide range of graphs. This composability of a small number of operations is the foundation of the GraphBLAS. A standard such as the GraphBLAS can only be effective if it has low performance overhead. Performance measurements of prototype GraphBLAS implementations indicate that the overhead is low.
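
The composability claim is easy to illustrate: a single primitive, adjacency-matrix-times-vector over a suitable semiring, already drives a complete graph algorithm such as breadth-first search. The sketch below uses plain NumPy integer arithmetic as a stand-in for a real GraphBLAS semiring:

```python
# Hedged illustration of the GraphBLAS idea: one matrix-vector product
# per BFS level, with a mask excluding already-visited vertices. NumPy
# integer arithmetic stands in for a GraphBLAS Boolean semiring.

import numpy as np

def bfs_levels(adj, source):
    n = adj.shape[0]
    level = np.full(n, -1)
    frontier = np.zeros(n, dtype=np.int64)
    frontier[source] = 1
    depth = 0
    while frontier.any():
        level[frontier > 0] = depth
        # one "multiply": vertices reachable in one step from the frontier,
        # masked to keep only unvisited vertices
        frontier = (adj.T @ frontier) * (level == -1)
        depth += 1
    return level

# directed 4-cycle: 0 -> 1 -> 2 -> 3 -> 0
A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]])
print(bfs_levels(A, 0))   # [0 1 2 3]
```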

Read this paper on arXiv…

J. Kepner, P. Aaltonen, D. Bader, et al.
Tue, 21 Jun 16
72/75

Comments: 9 pages; 11 figures; accepted to IEEE High Performance Extreme Computing (HPEC) conference 2016

GTOC8: Results and Methods of ESA Advanced Concepts Team and JAXA-ISAS [CL]

http://arxiv.org/abs/1602.00849


We consider the interplanetary trajectory design problem posed by the 8th edition of the Global Trajectory Optimization Competition and present the end-to-end strategy developed by the team ACT-ISAS (a collaboration between the European Space Agency’s Advanced Concepts Team and JAXA’s Institute of Space and Astronautical Science). The resulting interplanetary trajectory won 1st place in the competition, achieving a final mission value of $J=146.33$ [Mkm]. Several new algorithms were developed in this context but have an interest that goes beyond the particular problem considered; thus, they are discussed in some detail. These include the Moon-targeting technique, allowing one to target a Moon encounter from a low Earth orbit; the 1-$k$ and 2-$k$ fly-by targeting techniques, enabling one to design resonant fly-bys while ensuring that a targeted future formation plane is acquired at some point after the manoeuvre; the distributed low-thrust targeting technique, allowing one to control the spacecraft formation plane at 1,000,000 [km]; and the low-thrust optimization technique, permitting one to enforce the formation plane’s orientations as path constraints.

Read this paper on arXiv…

D. Izzo, D. Hennes, M. Martens, et al.
Wed, 3 Feb 16
49/54

Comments: Presented at the 26th AAS/AIAA Space Flight Mechanics Meeting, Napa, CA

An Integer Linear Programming Solution to the Telescope Network Scheduling Problem [IMA]

http://arxiv.org/abs/1503.07170


Telescope networks are gaining traction due to their promise of higher resource utilization than single telescopes, and as enablers of novel astronomical observation modes. However, as telescope network sizes increase, the possibility of scheduling them manually, or even semi-manually, disappears. In an earlier paper, a step towards software telescope scheduling was made with the specification of the Reservation formalism, through which astronomers can express their complex observation needs and preferences. In this paper we build on that work. We present a solution to the discretized version of the problem of scheduling a telescope network. We derive a solvable integer linear programming (ILP) model based on the Reservation formalism. We show computational results verifying its correctness, and confirm that our Gurobi-based implementation can address problems of realistic size. Finally, we extend the ILP model to also handle the novel observation requests that can be specified using the more advanced Compound Reservation formalism.
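
A toy version of such a discretized model is easy to state: one binary variable per (reservation, telescope, slot) placement, at-most-once constraints, and a priority-weighted objective. The sketch below uses Gurobi's Python API (gurobipy) and requires a Gurobi licence; the data, weights, and constraints are invented for illustration and do not reproduce the paper's Reservation formalism:

```python
# Hedged toy ILP: x[r, t, s] = 1 if reservation r runs on telescope t
# in discrete time slot s. All data here is made up.

import gurobipy as gp
from gurobipy import GRB

reservations = {"resA": 3.0, "resB": 1.0, "resC": 2.0}   # priority weights
telescopes, slots = ["T1", "T2"], [0, 1, 2]

m = gp.Model("network_schedule")
x = m.addVars(reservations, telescopes, slots, vtype=GRB.BINARY, name="x")

# each reservation is scheduled at most once across the whole network
for r in reservations:
    m.addConstr(x.sum(r, "*", "*") <= 1)

# no telescope can serve two reservations in the same slot
for t in telescopes:
    for s in slots:
        m.addConstr(x.sum("*", t, s) <= 1)

# maximize the total priority of the scheduled reservations
m.setObjective(gp.quicksum(w * x.sum(r, "*", "*")
                           for r, w in reservations.items()), GRB.MAXIMIZE)
m.optimize()
print([key for key, var in x.items() if var.X > 0.5])
```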

Read this paper on arXiv…

S. Lampoudi, E. Saunders and J. Eastman
Thu, 26 Mar 15
17/48

Comments: Accepted for publication in the refereed conference proceedings of the International Conference on Operations Research and Enterprise Systems (ICORES 2015)