Probabilistic cosmic web classification using fast-generated training data [CEA]

http://arxiv.org/abs/1912.04412


We present a novel method of robust probabilistic cosmic web particle classification in three dimensions using a supervised machine learning algorithm. Training data was generated using a simplified $\Lambda$CDM toy model with pre-determined algorithms for generating halos, filaments, and voids. While this model lacks physical detail, it can be generated substantially more quickly than an N-body simulation without loss in classification accuracy. For each particle in this dataset, measurements were taken of the local density field and directionality. These measurements were used to train a random forest algorithm, which was then used to assign class probabilities to each particle in a $\Lambda$CDM, dark matter-only N-body simulation with $256^3$ particles, as well as to each particle in a second toy model dataset. By comparing the trends in the ROC curves and other statistical metrics of predictions made on each of the datasets using different feature sets, we demonstrate that the combination of measurements of the local density field magnitude and directionality enables accurate and consistent classification of halo, filament, and void particles in varied environments. We also show that this combination of training features ensures that the construction of our toy model does not affect classification. The use of a fully supervised algorithm allows greater control over the information deemed important for classification, preventing issues arising from hyperparameters and mode collapse in deep learning models. Due to the speed of training data generation, our method is highly scalable, making it particularly suited for classifying large datasets, including observed data.
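
The abstract evaluates the classifier by comparing ROC curves computed from per-particle class probabilities. As a rough illustration only (not the authors' code), the sketch below shows how a one-vs-rest ROC curve and its area could be computed for a single class such as "halo". The function names, the six example particles, and their probabilities are all hypothetical and chosen purely for demonstration.

```python
def roc_curve_points(labels, scores):
    """Return (fpr, tpr) lists for a binary problem.

    labels: 0/1 ground truth; scores: predicted probability of class 1.
    The decision threshold is swept from the highest score to the lowest.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Visit particles in order of decreasing predicted probability.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(order):
        current = scores[order[i]]
        # Particles with tied scores cross the threshold together.
        while i < len(order) and scores[order[i]] == current:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr


def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
        for k in range(1, len(fpr))
    )


def one_vs_rest_auc(true_classes, class_probabilities, target):
    """ROC AUC for `target` against all other classes."""
    labels = [1 if c == target else 0 for c in true_classes]
    fpr, tpr = roc_curve_points(labels, class_probabilities)
    return area_under_curve(fpr, tpr)


# Hypothetical example: six particles with made-up "halo" probabilities.
true_classes = ["halo", "halo", "filament", "void", "filament", "void"]
halo_probability = [0.9, 0.6, 0.7, 0.1, 0.2, 0.05]

halo_auc = one_vs_rest_auc(true_classes, halo_probability, "halo")
print(halo_auc)  # 0.875
```

An area of 1.0 would mean every halo particle receives a higher halo probability than every non-halo particle, while 0.5 corresponds to random ranking. Repeating the calculation for the filament and void classes gives one curve per class.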


B. Buncher and M. Kind
Wed, 11 Dec 19

Comments: 22 pages, 16 figures