Detecting Galaxy-Filament Alignments in the Sloan Digital Sky Survey III [CEA]

http://arxiv.org/abs/1805.00159


Previous studies have shown that the filamentary structures in the cosmic web influence the alignments of nearby galaxies. We study this effect in the LOWZ sample of the Sloan Digital Sky Survey using the “Cosmic Web Reconstruction” filament catalogue of Chen et al. (2016). We find that LOWZ galaxies exhibit a small but statistically significant alignment in the direction parallel to the orientation of nearby filaments. This effect is detectable even in the absence of nearby galaxy clusters, which suggests it is an effect of the matter distribution in the filament. A nonparametric regression model suggests that the alignment effect with filaments extends over separations of $30-40$ Mpc. We find that galaxies that are bright and early-forming align more strongly with the directions of nearby filaments than those that are faint and late-forming; however, trends with stellar mass are less statistically significant within the narrow stellar-mass range of this sample.
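
A minimal sketch of the kind of nonparametric (kernel) regression used to trace an alignment signal as a function of filament separation. Everything here is a synthetic stand-in (data, decay scale, bandwidth), not the paper's measurements; Python with numpy assumed.

import numpy as np

def nadaraya_watson(x, y, grid, h):
    # Gaussian-kernel weighted local average of y at each grid point.
    w = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
    return (w * y).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(0)
d = rng.uniform(0, 60, 5000)        # separation to nearest filament [Mpc]
# Toy per-galaxy alignment estimator (e.g. cos 2*theta), decaying with separation.
align = 0.05 * np.exp(-d / 15.0) + 0.2 * rng.standard_normal(d.size)

grid = np.linspace(0, 60, 61)
curve = nadaraya_watson(d, align, grid, h=3.0)
print(np.round(curve[::10], 3))     # smoothed alignment at 0, 10, ..., 60 Mpc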

Read this paper on arXiv…

Y. Chen, S. Ho, J. Blazek, et al.
Wed, 2 May 18
3/55

Comments: 10 pages, 9 figures

Testing One Hypothesis Multiple Times: The Multidimensional Case [CL]

http://arxiv.org/abs/1803.03858


The identification of new rare signals in data, the detection of a sudden change in a trend, and the selection of competing models are among the most challenging problems in statistical practice. These challenges can be tackled using a test of hypothesis where a nuisance parameter is present only under the alternative, and a computationally efficient solution can be obtained by the “Testing One Hypothesis Multiple times” (TOHM) method. In the one-dimensional setting, a fine discretization of the space of the non-identifiable parameter is specified, and a global p-value is obtained by approximating the distribution of the supremum of the resulting stochastic process. In this paper, we propose a computationally efficient inferential tool to perform TOHM in the multidimensional setting. Here, the approximations of interest typically involve the expected Euler characteristic (EC) of the excursion set of the underlying random field. We introduce a simple algorithm to compute the EC in multiple dimensions and for arbitrarily large significance levels. This leads to a highly generalizable computational tool to perform inference under non-standard regularity conditions.
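
A toy illustration (not the authors' algorithm) of the object at the heart of the method: the Euler characteristic of the excursion set of a smooth random field. This assumes scipy and a recent scikit-image providing measure.euler_number.

import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.measure import euler_number

rng = np.random.default_rng(1)
field = gaussian_filter(rng.standard_normal((512, 512)), sigma=8)
field /= field.std()

for u in [0.5, 1.0, 2.0, 3.0]:
    excursion = field >= u          # excursion set at threshold u
    # Components minus holes; its expectation is what TOHM approximates.
    print(u, euler_number(excursion, connectivity=2))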

Read this paper on arXiv…

S. Algeri and D. van Dyk
Tue, 13 Mar 18
36/61

Comments: N/A

Radio Imaging With Information Field Theory [IMA]

http://arxiv.org/abs/1803.02174


Data from radio interferometers provide a substantial challenge for statisticians. They are incomplete, noise-dominated, and originate from a non-trivial measurement process. The signal is corrupted not only by imperfect measurement devices but also by effects like fluctuations in the ionosphere that act as a distortion screen. In this paper we focus on the imaging part of data reduction in radio astronomy and present RESOLVE, a Bayesian imaging algorithm for radio interferometry, in its new incarnation. It is formulated in the language of information field theory. Through algorithmic advances alone, the inference has been sped up significantly and now behaves noticeably more stably. This is one more step towards a fully user-friendly version of RESOLVE that can be applied routinely by astronomers.

Read this paper on arXiv…

P. Arras, J. Knollmuller, H. Junklewitz, et al.
Wed, 7 Mar 18
36/65

Comments: 5 pages, 3 figures

Approximate Inference for Constructing Astronomical Catalogs from Images [CL]

http://arxiv.org/abs/1803.00113


We present a new, fully generative model for constructing astronomical catalogs from optical telescope image sets. Each pixel intensity is treated as a Poisson random variable with a rate parameter that depends on the latent properties of stars and galaxies. These latent properties are themselves random, with scientific prior distributions constructed from large ancillary datasets. We compare two procedures for posterior inference: Markov chain Monte Carlo (MCMC) and variational inference (VI). MCMC excels at quantifying uncertainty while VI is 1000x faster. Both procedures outperform the current state-of-the-art method for measuring celestial bodies’ colors, shapes, and morphologies. On a supercomputer, the VI procedure efficiently uses 665,000 CPU cores (1.3 million hardware threads) to construct an astronomical catalog from 50 terabytes of images.
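
The generative core can be caricatured in a few lines: pixel counts are Poisson with a rate set by latent source parameters, and MCMC targets the posterior over those latents. A deliberately tiny sketch (one star, known position and background, flat prior), not the actual Celeste implementation:

import numpy as np

rng = np.random.default_rng(2)

# Toy scene: one star of unknown flux on a flat sky background.
size, bkg, true_flux = 11, 5.0, 200.0
yy, xx = np.mgrid[:size, :size]
psf = np.exp(-0.5 * ((xx - 5) ** 2 + (yy - 5) ** 2) / 1.5 ** 2)
psf /= psf.sum()
image = rng.poisson(bkg + true_flux * psf)

def log_post(flux):
    if flux <= 0:
        return -np.inf
    rate = bkg + flux * psf
    return np.sum(image * np.log(rate) - rate)   # Poisson log likelihood, flat prior

# Random-walk Metropolis over the latent flux.
flux, lp, samples = 100.0, log_post(100.0), []
for _ in range(20000):
    prop = flux + 10.0 * rng.standard_normal()
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        flux, lp = prop, lp_prop
    samples.append(flux)
print(np.mean(samples[5000:]), np.std(samples[5000:]))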

Read this paper on arXiv…

J. Regier, A. Miller, D. Schlegel, et al.
Fri, 2 Mar 18
8/61

Comments: N/A

Extreme Value Analysis of Solar Flare Events [CL]

http://arxiv.org/abs/1802.06100


Space weather events such as solar flares can be harmful to life and infrastructure on Earth or in near-Earth orbit. In this paper we employ extreme value theory (EVT) to model extreme solar flare events; EVT offers the appropriate tools for the study and estimation of probabilities when extrapolating to ranges beyond those already observed. In the past such phenomena have been modelled as following a power law, which may give poor estimates of such events due to overestimation. The data used in the study were X-ray fluxes from NOAA/GOES, and the expected return levels for Carrington- or Halloween-like events were calculated, with the outcome that the existing data predict similar events happening within 110 and 38 years, respectively.
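
A sketch of the peaks-over-threshold recipe the abstract alludes to, using scipy's generalized Pareto distribution; the synthetic fluxes, threshold choice and rates are illustrative assumptions, not the paper's fit.

import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(3)
# Stand-in for ~40 years of daily maximum X-ray flux [W/m^2].
x = 1e-6 + genpareto.rvs(c=0.3, scale=1e-6, size=365 * 40, random_state=rng)

u = np.quantile(x, 0.99)                    # peaks-over-threshold level
exc = x[x > u] - u
c, _, scale = genpareto.fit(exc, floc=0)    # GPD fit to the exceedances
rate = len(exc) / 40.0                      # mean exceedances per year

def return_level(years):
    m = rate * years                        # expected exceedances in the window
    return u + (scale / c) * (m ** c - 1)   # GPD quantile above u (c != 0)

print(return_level(150))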

Read this paper on arXiv…

T. Tsiftsi and V. Luz
Tue, 20 Feb 18
31/54

Comments: 17 pages, 5 figures

Bivariate density estimation using normal-gamma kernel with application to astronomy [CL]

http://arxiv.org/abs/1801.08300


We consider the problem of estimating a bivariate density function with support $\Re\times[0,\infty)$, where a classical bivariate kernel estimator suffers boundary bias due to the non-negative variable. To overcome this problem, we propose four kernel density estimators whose performances are compared in terms of the mean integrated squared error. A simulation study shows that the estimator based on our proposed normal-gamma ($NG$) kernel performs best; its applicability is demonstrated using two astronomical data sets.
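
A sketch of what a normal-gamma product kernel estimator might look like, with a Chen-style gamma kernel on the non-negative coordinate to avoid the boundary bias; bandwidths and data are illustrative, and this is not necessarily the authors' exact construction.

import numpy as np
from scipy.stats import norm, gamma

def ng_kde(x, y, data_x, data_y, hx, b):
    # Product kernel: normal in the real coordinate, gamma in the [0, inf) one.
    kx = norm.pdf(x, loc=data_x, scale=hx)
    ky = gamma.pdf(data_y, a=y / b + 1.0, scale=b)   # gamma kernel centred near y
    return np.mean(kx * ky)

rng = np.random.default_rng(4)
data_x = rng.standard_normal(1000)    # support: the real line
data_y = rng.gamma(2.0, 1.0, 1000)    # support: [0, inf)
print(ng_kde(0.0, 1.0, data_x, data_y, hx=0.3, b=0.2))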

Read this paper on arXiv…

U. Bandyopadhyay and S. Modak
Fri, 26 Jan 18
12/60

Comments: 27 pages, 8 figs

EXONEST: The Bayesian Exoplanetary Explorer [EPA]

http://arxiv.org/abs/1712.08894


The fields of astronomy and astrophysics are currently engaged in an unprecedented era of discovery as recent missions have revealed thousands of exoplanets orbiting other stars. While the Kepler Space Telescope mission has enabled most of these exoplanets to be detected by identifying transiting events, exoplanets often exhibit additional photometric effects that can be used to improve the characterization of exoplanets. The EXONEST Exoplanetary Explorer is a Bayesian exoplanet inference engine based on nested sampling and originally designed to analyze archived Kepler Space Telescope and CoRoT (Convection Rotation et Transits planétaires) exoplanet mission data. We discuss the EXONEST software package and describe how it accommodates plug-and-play models of exoplanet-associated photometric effects for the purpose of exoplanet detection, characterization and scientific hypothesis testing. The current suite of models allows for both circular and eccentric orbits in conjunction with photometric effects, such as the primary transit and secondary eclipse, reflected light, thermal emissions, ellipsoidal variations, Doppler beaming and superrotation. We discuss our new efforts to expand the capabilities of the software to include more subtle photometric effects involving reflected and refracted light. We discuss the EXONEST inference engine design and introduce our plans to port the current MATLAB-based EXONEST software package over to the next generation Exoplanetary Explorer, which will be a Python-based open source project with the capability to employ third-party plug-and-play models of exoplanet-related photometric effects.
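
Nested sampling, the engine underneath EXONEST, fits in a miniature sketch: a set of live points shrinks the prior volume geometrically while the evidence integral accumulates. This toy (uniform prior, 1-D Gaussian likelihood, naive rejection sampling for the constrained draws, fixed iteration budget) illustrates the algorithm, not the EXONEST code.

import numpy as np

rng = np.random.default_rng(5)

def loglike(theta):                    # 1-D Gaussian likelihood, sigma = 0.5
    return -0.5 * (theta / 0.5) ** 2

n_live = 200
live = rng.uniform(-5, 5, n_live)      # draws from the U(-5, 5) prior
live_logl = loglike(live)
log_z, log_x = -np.inf, 0.0

for i in range(1200):                  # fixed budget for brevity; real samplers
    worst = np.argmin(live_logl)       # use a convergence test and add the
    log_x_new = -(i + 1) / n_live      # remaining live points to the evidence
    log_w = live_logl[worst] + np.log(np.exp(log_x) - np.exp(log_x_new))
    log_z = np.logaddexp(log_z, log_w) # accumulate Z = sum of L * dX
    log_x = log_x_new
    while True:                        # replace the worst point with a prior
        cand = rng.uniform(-5, 5)      # draw above the likelihood floor
        if loglike(cand) > live_logl[worst]:
            live[worst], live_logl[worst] = cand, loglike(cand)
            break

print(log_z)   # analytic answer: log(0.5 * sqrt(2 * pi) / 10) ~ -2.08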

Read this paper on arXiv…

K. Knuth, B. Placek, D. Angerhausen, et al.
Wed, 27 Dec 17
5/56

Comments: 30 pages, 8 figures, 5 tables. Presented at the 37th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering (MaxEnt 2017) in Jarinu/SP Brasil

Estimating activity cycles with probabilistic methods II. The Mount Wilson Ca H&K data [SSA]

http://arxiv.org/abs/1712.08240


Debate over the existence versus nonexistence of trends in the stellar activity-rotation diagrams continues. Application of modern time series analysis tools to study the mean cycle periods in chromospheric activity indices is lacking. We develop such models, based on Gaussian processes, for one-dimensional time series and apply them to the extended Mount Wilson Ca H&K sample. Our main aim is to study how the previously common assumption of strict harmonicity of the stellar cycles affects the results. We introduce three methods of different complexity, starting with the simple harmonic model and followed by Gaussian process models with periodic and quasi-periodic covariance functions. We confirm the existence of two populations in the activity-period diagram. We find only one significant trend in the inactive population, namely that the cycle periods get shorter with increasing rotation. This is in contrast with earlier studies that postulated the existence of trends in both of the populations. In terms of the rotation to cycle period ratio, our data are consistent with only two activity branches, such that the active branch merges together with the transitional one. The retrieved stellar cycles are uniformly distributed over the $R'_{\rm HK}$ activity index, indicating that the operation of stellar large-scale dynamos carries smoothly over the Vaughan-Preston gap. At around the solar activity index, however, indications of a disruption in the cyclic dynamo action are seen. Our study shows that stellar cycle estimates depend significantly on the model applied. Such model-dependent aspects include the improper treatment of linear trends and overly simple assumptions about the noise variance model. The assumption of strict harmonicity can result in the appearance of double cyclicities that seem more likely to be explained by the quasi-periodicity of the cycles.
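
The quasi-periodic covariance mentioned in the abstract can be scanned over trial periods via the GP marginal likelihood. A self-contained numpy/scipy sketch with synthetic data and hand-picked hyperparameters (the paper infers these rather than fixing them):

import numpy as np
from scipy.linalg import cho_factor, cho_solve

def qp_kernel(t1, t2, amp, ell, per, gam):
    # Squared-exponential decay times a periodic term.
    dt = t1[:, None] - t2[None, :]
    return amp**2 * np.exp(-0.5 * dt**2 / ell**2
                           - gam * np.sin(np.pi * dt / per)**2)

def log_marginal(t, y, sigma, amp, ell, per, gam):
    K = qp_kernel(t, t, amp, ell, per, gam) + sigma**2 * np.eye(len(t))
    c, low = cho_factor(K)
    alpha = cho_solve((c, low), y)
    return -0.5 * y @ alpha - np.log(np.diag(c)).sum() - 0.5 * len(t) * np.log(2 * np.pi)

rng = np.random.default_rng(6)
t = np.sort(rng.uniform(0, 100, 120))
y = np.sin(2 * np.pi * t / 11.0) + 0.3 * rng.standard_normal(len(t))

periods = np.linspace(5, 20, 151)
lls = [log_marginal(t, y, 0.3, 1.0, 50.0, p, 5.0) for p in periods]
print("best period:", periods[np.argmax(lls)])   # ~ 11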

Read this paper on arXiv…

N. Olspert, J. Lehtinen, M. Kapyla, et al.
Mon, 25 Dec 17
13/37

Comments: N/A

Estimating activity cycles with probabilistic methods I. Bayesian Generalised Lomb-Scargle Periodogram with Trend [SSA]

http://arxiv.org/abs/1712.08235


Period estimation is one of the central topics in astronomical time series analysis, where data are often unevenly sampled. Studies of stellar magnetic cycles are especially challenging, as the periods sought are of the same order as the lengths of the datasets themselves. The datasets often contain trends, the origin of which is either a real long-term cycle or an instrumental effect; these effects cannot be reliably separated, and they can lead to erroneous period determinations if not properly handled. In this study we aim at developing a method that can handle the trends properly, and by performing an extensive set of tests, we show that this is the optimal procedure when contrasted with methods that do not include the trend directly in the model. The effect of the noise model on the results is also investigated. We introduce a Bayesian Generalised Lomb-Scargle Periodogram with Trend (BGLST), which is a probabilistic linear regression model using Gaussian priors for the coefficients and a uniform prior for the frequency parameter. We show, using synthetic data, that when there is no prior information on whether and to what extent the true model of the data contains a linear trend, the introduced BGLST method is preferable to methods which either detrend the data or leave it untrended before fitting the periodic model. Whether to use a noise model different from a constant one depends on the density of the data sampling as well as on the true noise model of the process.
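
The essence of BGLST — harmonic plus linear trend plus offset, Gaussian priors on all coefficients, scored by the marginal likelihood on a frequency grid — fits in a short sketch. The prior scale tau, the noise level and the data below are illustrative assumptions.

import numpy as np
from scipy.stats import multivariate_normal

def log_evidence(t, y, sigma, omega, tau=10.0):
    # Design matrix: cosine, sine, trend, offset; coefficients ~ N(0, tau^2).
    Phi = np.column_stack([np.cos(omega * t), np.sin(omega * t), t, np.ones_like(t)])
    cov = sigma**2 * np.eye(len(t)) + tau**2 * Phi @ Phi.T
    return multivariate_normal(mean=np.zeros(len(t)), cov=cov).logpdf(y)

rng = np.random.default_rng(7)
t = np.sort(rng.uniform(0, 30, 80))
y = np.sin(2 * np.pi * t / 7.0) + 0.05 * t + 0.3 * rng.standard_normal(len(t))

omegas = 2 * np.pi / np.linspace(3, 15, 200)
logev = [log_evidence(t, y, 0.3, w) for w in omegas]
print("best period:", 2 * np.pi / omegas[np.argmax(logev)])   # ~ 7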

Read this paper on arXiv…

N. Olspert, J. Pelt, M. Kapyla, et al.
Mon, 25 Dec 17
35/37

Comments: N/A

Mixture Models in Astronomy [IMA]

http://arxiv.org/abs/1711.11101


Mixture models combine multiple components into a single probability density function. They are a natural statistical model for many situations in astronomy, such as surveys containing multiple types of objects, cluster analysis in various data spaces, and complicated distribution functions. This chapter in the CRC Handbook of Mixture Analysis is concerned with astronomical applications of mixture models for cluster analysis, classification, and semi-parametric density estimation. We present several classification examples from the literature, including identification of a new class, analysis of contaminants, and overlapping populations. In most cases, mixtures of normal (Gaussian) distributions are used, but it is sometimes necessary to use different distribution functions derived from astrophysical experience. We also address the use of mixture models for the analysis of spatial distributions of objects, like galaxies in redshift surveys or young stars in star-forming regions. In the case of galaxy clustering, mixture models may not be the optimal choice for understanding the homogeneous and isotropic structure of voids and filaments. However, we show that mixture models, using astrophysical models for star clusters, may provide a natural solution to the problem of subdividing a young stellar population into subclusters. Finally, we explore how mixture models can be used for mathematically advanced modeling of data with heteroscedastic uncertainties or missing values, providing two example algorithms, the measurement error regression model of Kelly (2007) and the Extreme Deconvolution model of Bovy et al. (2011). The challenges presented by astronomical science, aided by the public availability of catalogs from major surveys and missions, are a rich area for collaboration between statisticians and astronomers.
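
For the Gaussian-mixture case, the standard workflow (fit several component counts, select by BIC, read off membership probabilities) looks like this in scikit-learn; the two-population toy data stand in for, say, a colour-magnitude sample.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(8)
data = np.vstack([rng.normal([0.0, 0.0], 0.5, (300, 2)),     # population 1
                  rng.normal([1.5, 1.0], 0.7, (200, 2))])    # population 2

models = [GaussianMixture(k, random_state=0).fit(data) for k in range(1, 6)]
best = min(models, key=lambda m: m.bic(data))                # BIC model selection
print("components chosen:", best.n_components)
print("membership probabilities of first object:", best.predict_proba(data[:1]))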

Read this paper on arXiv…

M. Kuhn and E. Feigelson
Fri, 1 Dec 17
42/68

Comments: 33 pages, 7 figures. This manuscript is a preprint of a chapter to appear in the “Handbook of Mixture Analysis,” edited by G. Celeux, S. Frühwirth-Schnatter, and C. P. Robert, Chapman & Hall/CRC, 2018

Improving Exoplanet Detection Power: Multivariate Gaussian Process Models for Stellar Activity [IMA]

http://arxiv.org/abs/1711.01318


The radial velocity method is one of the most successful techniques for detecting exoplanets. It works by detecting the velocity of a host star induced by the gravitational effect of an orbiting planet, specifically the velocity along our line of sight, which is called the radial velocity of the star. As astronomical instrumentation has improved, radial velocity surveys have become sensitive to low-mass planets that cause their host star to move with radial velocities of 1 m/s or less. While analysis of a time series of stellar spectra can in theory reveal such small radial velocities, in practice intrinsic stellar variability (e.g., star spots, convective motion, pulsations) affects the spectra and often mimics a radial velocity signal. This signal contamination makes it difficult to reliably detect low mass planets and planets orbiting magnetically active stars. A principled approach to recovering planet radial velocity signals in the presence of stellar activity was proposed by Rajpaul et al. (2015) and involves the use of a multivariate Gaussian process model to jointly capture time series of the apparent radial velocity and multiple indicators of stellar activity. We build on this work in two ways: (i) we propose using dimension reduction techniques to construct more informative stellar activity indicators that make use of a larger portion of the stellar spectrum; (ii) we extend the Rajpaul et al. (2015) model to a larger class of models and use a model comparison procedure to select the best model for the particular stellar activity indicators at hand. By combining our high-information stellar activity indicators, Gaussian process models, and model selection procedure, we achieve substantially improved planet detection power compared to previous state-of-the-art approaches.

Read this paper on arXiv…

D. Jones, D. Stenning, E. Ford, et al.
Tue, 7 Nov 17
74/118

Comments: 41 pages, 8 figures

On the variance of radio interferometric calibration solutions: Quality-based Weighting Schemes [IMA]

http://arxiv.org/abs/1711.00421


SKA-era radio interferometric data volumes are expected to be such that new algorithms will be necessary to improve images at very low computational costs. This paper investigates the possibility of improving radio interferometric images using an algorithm inspired by an optical method known as “lucky imaging”, which would give more weight to the best-calibrated visibilities used to make a given image. A fundamental relationship between the statistics of interferometric calibration solutions and those of the image-plane pixels is derived in this paper, relating their covariances. This “Cov-Cov” relationship allows us to understand and describe the statistical properties of the residual image. In this framework, the noise-map can be described as the Fourier transform of the covariance between residual visibilities in a new “(${\delta} u{\delta}v$)”-domain. Image-plane artefacts can be seen as one realisation of the pixel covariance distribution, which can be estimated from the antenna gain statistics. Based on this relationship, we propose a means of improving images made with calibrated visibilities using weighting schemes. This improvement would occur after calibration, but before imaging – it is thus ideally used between major iterations of self-calibration loops. Applying the weighting scheme to simulated data improves the noise level in the final image at negligible computational cost.

Read this paper on arXiv…

E. Bonnassieux, C. Tasse, O. Smirnov, et al.
Thu, 2 Nov 17
68/71

Comments: Submitted, under review. 12 pages, 7 figures

Exoplanet Atmosphere Retrieval using Multifractal Analysis of Reflectance Spectra [EPA]

http://arxiv.org/abs/1710.09870


We extend a data-based model-free multifractal method of exoplanet detection to probe exoplanetary atmospheres. Whereas the transmission spectrum is studied during the primary eclipse, we analyze what we call the reflectance spectrum, which is taken during the secondary eclipse phase, allowing a probe of the atmospheric limb. In addition to the spectral structure of exoplanet atmospheres, the approach provides information on phenomena such as hydrodynamical flows, tidal-locking behavior, and the dayside-nightside redistribution of energy. The approach is demonstrated using Spitzer data for exoplanet HD189733b. The central advantage of the method is the lack of model assumptions in the detection and observational schemes.

Read this paper on arXiv…

S. Agarwal and J. Wettlaufer
Mon, 30 Oct 17
52/59

Comments: N/A

Multi-Scale Pipeline for the Search of String-Induced CMB Anisotropies [CEA]

http://arxiv.org/abs/1710.00173


We propose a multi-scale edge-detection algorithm to search for the Gott-Kaiser-Stebbins imprints of a cosmic string (CS) network on the Cosmic Microwave Background (CMB) anisotropies. Curvelet decomposition and the extended Canny algorithm are used to enhance the string detectability. Various statistical tools are then applied to quantify the deviation of CMB maps having a cosmic string contribution with respect to pure Gaussian anisotropies of inflationary origin. These statistical measures include the one-point probability density function, the weighted two-point correlation function (TPCF) of the anisotropies, the unweighted TPCF of the peaks and of the up-crossing map, as well as their cross-correlation. We use this algorithm on a hundred simulated Nambu-Goto CMB flat-sky maps, covering approximately $10\%$ of the sky, and for different string tensions $G\mu$. On noiseless sky maps with an angular resolution of $0.9'$, we show that our pipeline detects CSs with $G\mu$ as low as $4.3\times 10^{-10}$. At the same resolution, but with a noise level typical of a CMB-S4 phase II experiment, the detection threshold would be $G\mu\gtrsim 1.2 \times 10^{-7}$.

Read this paper on arXiv…

A. Sadr, S. Movahed, M. Farhang, et al.
Tue, 3 Oct 17
27/63

Comments: 13 pages, 5 figures, 1 table, Comments are welcome

Large Magellanic Cloud Near-Infrared Synoptic Survey. V. Period-Luminosity Relations of Miras [SSA]

http://arxiv.org/abs/1708.04742


We study the near-infrared properties of 690 Mira candidates in the central region of the Large Magellanic Cloud, based on time-series observations at JHKs. We use densely-sampled I-band observations from the OGLE project to generate template light curves in the near infrared and derive robust mean magnitudes at those wavelengths. We obtain near-infrared Period-Luminosity relations for Oxygen-rich Miras with a scatter as low as 0.12 mag at Ks. We study the Period-Luminosity-Color relations and the color excesses of Carbon-rich Miras, which show evidence for a substantially different reddening law.

Read this paper on arXiv…

W. Yuan, L. Macri, S. He, et al.
Thu, 17 Aug 17
3/50

Comments: Accepted for publication in The Astronomical Journal

Verification of operational solar flare forecast: Case of Regional Warning Center Japan [SSA]

http://arxiv.org/abs/1707.07903


In this article, we discuss a verification study of an operational solar flare forecast in the Regional Warning Center (RWC) Japan. The RWC Japan has been issuing four-categorical deterministic solar flare forecasts for a long time. In this forecast verification study, we used solar flare forecast data accumulated over 16 years (from 2000 to 2015). We compiled the forecast data together with solar flare data obtained with the Geostationary Operational Environmental Satellites (GOES). Using the compiled data sets, we estimated some conventional scalar verification measures with 95% confidence intervals. We also estimated a multi-categorical scalar verification measure. These scalar verification measures were compared with those obtained by the persistence method and the recurrence method. As solar activity varied during the 16 years, we also applied verification analyses to four subsets of forecast-observation pair data with different solar activity levels. We cannot conclude definitely that there are significant performance differences between the forecasts of RWC Japan and the persistence method, although a slightly significant difference is found for some event definitions. We propose to use a scalar verification measure to assess the judgment skill of the operational solar flare forecast. Finally, we propose a verification strategy for deterministic operational solar flare forecasting.
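
The conventional scalar measures for a dichotomous flare/no-flare forecast come from the 2x2 contingency table; a sketch with made-up counts (the paper's forecasts are four-categorical, so this is a simplification):

# hits, false alarms, misses, correct negatives -- invented counts.
a, b, c, d = 42, 30, 18, 910
n = a + b + c + d

pod = a / (a + c)                        # probability of detection
far = b / (a + b)                        # false alarm ratio
# Heidke skill score: accuracy relative to random chance.
expected = ((a + c) * (a + b) + (d + c) * (d + b)) / n
hss = (a + d - expected) / (n - expected)
print(pod, far, hss)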

Read this paper on arXiv…

Y. Kubo, M. Den and M. Ishii
Wed, 26 Jul 17
11/68

Comments: 29 pages, 7 figures and 6 tables. Accepted for publication in Journal of Space Weather and Space Climate (SWSC)

Statistical methods in astronomy [CL]

http://arxiv.org/abs/1707.05834


We present a review of data types and statistical methods often encountered in astronomy. The aim is to provide an introduction to statistical applications in astronomy for statisticians and computer scientists. We highlight the complex, often hierarchical, nature of many astronomy inference problems and advocate for cross-disciplinary collaborations to address these challenges.

Read this paper on arXiv…

J. Long and R. Souza
Thu, 20 Jul 17
2/56

Comments: 9 pages, 5 figures

Agatha: disentangling periodic signals from correlated noise in a periodogram framework [EPA]

http://arxiv.org/abs/1705.03089


Periodograms are used as a key significance assessment and visualisation tool to display the significant periodicities in unevenly sampled time series. We introduce a framework of periodograms, called “Agatha”, to disentangle periodic signals from correlated noise and to solve the 2-dimensional model selection problem: signal dimension and noise model dimension. These periodograms are calculated by applying likelihood maximization and marginalization and combined in a self-consistent way. We compare Agatha with other periodograms for the detection of Keplerian signals in synthetic radial velocity data produced for the Radial Velocity Challenge as well as in radial velocity datasets of several Sun-like stars. In our tests we find Agatha is able to recover signals to the adopted detection limit of the radial velocity challenge. Applied to real radial velocity data, we use Agatha to confirm previous analysis of CoRoT-7 and to find two new planet candidates with minimum masses of 15.1 $M_\oplus$ and 7.08 $M_\oplus$ orbiting HD177565 and HD41248, with periods of 44.5 d and 13.4 d, respectively. We find that Agatha outperforms other periodograms in terms of removing correlated noise and assessing the significances of signals with more robust metrics. Moreover, it can be used to select the optimal noise model and to test the consistency of signals in time. Agatha is intended to be flexible enough to be applied to time series analyses in other astronomical and scientific disciplines. Agatha is available at this http URL

Read this paper on arXiv…

F. Feng, M. Tuomi and H. Jones
Wed, 10 May 17
55/59

Comments: 22 pages, 16 figures, 5 tables, MNRAS in press, the app is available at this http URL

Fast and scalable Gaussian process modeling with applications to astronomical time series [IMA]

http://arxiv.org/abs/1703.09710


The growing field of large-scale time domain astronomy requires methods for probabilistic data analysis that are computationally tractable, even with large datasets. Gaussian Processes are a popular class of models used for this purpose but, since the computational cost scales as the cube of the number of data points, their application has been limited to relatively small datasets. In this paper, we present a method for Gaussian Process modeling in one-dimension where the computational requirements scale linearly with the size of the dataset. We demonstrate the method by applying it to simulated and real astronomical time series datasets. These demonstrations are examples of probabilistic inference of stellar rotation periods, asteroseismic oscillation spectra, and transiting planet parameters. The method exploits structure in the problem when the covariance function is expressed as a mixture of complex exponentials, without requiring evenly spaced observations or uniform noise. This form of covariance arises naturally when the process is a mixture of stochastically-driven damped harmonic oscillators – providing a physical motivation for and interpretation of this choice – but we also demonstrate that it is effective in many other cases. We present a mathematical description of the method, the details of the implementation, and a comparison to existing scalable Gaussian Process methods. The method is flexible, fast, and most importantly, interpretable, with a wide range of potential applications within astronomical data analysis and beyond. We provide well-tested and documented open-source implementations of this method in C++, Python, and Julia.
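
A minimal usage sketch, assuming the celerite package (the companion implementation of this paper) and its original Python API; the hyperparameter values are arbitrary.

import numpy as np
import celerite
from celerite import terms

rng = np.random.default_rng(9)
t = np.sort(rng.uniform(0, 10, 200))
yerr = 0.1 * np.ones_like(t)
y = np.sin(3.0 * t) + yerr * rng.standard_normal(len(t))

# One stochastically-driven, damped harmonic oscillator term.
kernel = terms.SHOTerm(log_S0=0.0, log_Q=np.log(10.0), log_omega0=np.log(3.0))
gp = celerite.GP(kernel, mean=0.0)
gp.compute(t, yerr)                             # O(N) factorization
print("log likelihood:", gp.log_likelihood(y))
mu, var = gp.predict(y, t, return_var=True)     # prediction with uncertainties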

Read this paper on arXiv…

D. Foreman-Mackey, E. Agol, R. Angus, et al.
Thu, 30 Mar 17
29/69

Comments: Submitted to the AAS Journals. Comments welcome. Code available: this https URL

Sampling errors in nested sampling parameter estimation [CL]

http://arxiv.org/abs/1703.09701


Sampling errors in nested sampling parameter estimation differ from those in Bayesian evidence calculation, but have been little studied in the literature. This paper provides the first explanation of the two main sources of sampling errors in nested sampling parameter estimation, and presents a new diagrammatic representation for the process. We find no current method can accurately measure the parameter estimation errors of a single nested sampling run, and propose a method for doing so using a new algorithm for dividing nested sampling runs. We empirically verify our conclusions and the accuracy of our new method.

Read this paper on arXiv…

E. Higson, W. Handley, M. Hobson, et al.
Thu, 30 Mar 17
66/69

Comments: 22 pages + appendix, 10 figures, submitted to Bayesian Analysis

Maximum a posteriori estimation through simulated annealing for binary asteroid orbit determination [IMA]

http://arxiv.org/abs/1703.07408


This paper considers a new method for the binary asteroid orbit determination problem. The method is based on the Bayesian approach with a global optimisation algorithm. The orbital parameters to be determined are modelled through a posterior density, comprising prior and likelihood terms. The first term constrains the parameter search space. It allows us to introduce knowledge about the orbit, if such information is available, but at the same time it does not require a good initial estimate of the parameters. The second term is based on the given observations; besides, it allows us to use and to compare different observational error models. Once the posterior model is built, the estimator of the orbital parameters is computed using a global optimisation procedure: the simulated annealing algorithm. The new method was implemented for simulated and real observations with successful results, and was also verified for ephemeris prediction capability. The new approach can prove useful in the case of small numbers of observations and/or non-Gaussian observational errors, when the classical least-squares method cannot be applied.
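
The strategy — write down a posterior, hand its negative log to a simulated annealing optimiser over the prior support — can be sketched with scipy's dual_annealing on a toy one-signal model; the "orbit" model, noise level and bounds are invented for illustration.

import numpy as np
from scipy.optimize import dual_annealing

rng = np.random.default_rng(10)
t_obs = np.linspace(0, 20, 30)
true_amp, true_freq = 2.0, 0.31
obs = true_amp * np.sin(true_freq * t_obs) + 0.1 * rng.standard_normal(len(t_obs))

def neg_log_post(params):
    amp, freq = params
    model = amp * np.sin(freq * t_obs)
    log_like = -0.5 * np.sum((obs - model) ** 2 / 0.1 ** 2)
    return -log_like                  # uniform prior inside the bounds

# The bounds play the role of the prior support; no initial guess is needed.
result = dual_annealing(neg_log_post, bounds=[(0.1, 10.0), (0.01, 1.0)], seed=0)
print(result.x)                       # ~ (2.0, 0.31)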

Read this paper on arXiv…

I. Kovalenko, R. Stoica and N. Emelyanov
Thu, 23 Mar 17
22/47

Comments: N/A

Statistical Topology and the Random Interstellar Medium [CL]

http://arxiv.org/abs/1703.07256


Current astrophysical models of the interstellar medium assume that small scale variation and noise can be modelled as Gaussian random fields or simple transformations thereof, such as lognormal. We use topological methods to investigate this assumption for three regions of the southern sky. We consider Gaussian random fields on two-dimensional lattices and investigate the expected distribution of topological structures quantified through Betti numbers. We demonstrate that there are circumstances where differences in topology can identify differences in distributions when conventional marginal or correlation analyses may not. We propose a non-parametric method for comparing two fields based on the counts of topological features and the geometry of the associated persistence diagrams. When we apply the methods to the astrophysical data, we find strong evidence against a Gaussian random field model for each of the three regions of the interstellar medium that we consider. Further, we show that there are topological differences at a local scale between these different regions.
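
A sketch of the summaries involved: Betti numbers b0 (connected components) and b1 (holes) of excursion sets of a simulated Gaussian random field, via scipy labeling. Connectivity conventions are handled loosely here; a careful treatment, as in the paper, would not be this cavalier.

import numpy as np
from scipy.ndimage import gaussian_filter, label

def betti_2d(mask):
    b0 = label(mask)[1]                       # components of the excursion set
    lab, n_comp = label(~mask)                # components of the complement...
    border = np.unique(np.concatenate([lab[0], lab[-1], lab[:, 0], lab[:, -1]]))
    b1 = n_comp - np.count_nonzero(border)    # ...not touching the border = holes
    return b0, b1

rng = np.random.default_rng(11)
field = gaussian_filter(rng.standard_normal((256, 256)), 6)
field /= field.std()
for u in [-1.0, 0.0, 1.0]:
    print(u, betti_2d(field >= u))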

Read this paper on arXiv…

R. Henderson, I. Makarenko, P. Bushby, et al.
Wed, 22 Mar 17
28/65

Comments: 33 pages, 5 figures

Clustering of Gamma-Ray bursts through kernel principal component analysis [CL]

http://arxiv.org/abs/1703.05532


We consider the problem of clustering gamma-ray bursts (from the “BATSE” catalogue) through kernel principal component analysis, in which our proposed kernel outperforms other competent kernels in terms of clustering accuracy, and we obtain three physically interpretable groups of gamma-ray bursts. The effectiveness of the suggested kernel, in combination with kernel principal component analysis, in revealing natural clusters in noisy and nonlinear data while reducing the dimension of the data is also explored in two simulated data sets.
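
The generic kernel-PCA-then-cluster pipeline in scikit-learn, with an off-the-shelf RBF kernel standing in for the paper's proposed kernel, on toy three-group data imitating BATSE-style features (e.g. log duration, log fluence, hardness):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(12)
bursts = np.vstack([rng.normal(m, 0.4, (150, 3))
                    for m in ([0, 0, 0], [2, 1, 0.5], [1, 2, 1.5])])

# Nonlinear dimension reduction, then clustering in the reduced space.
embed = KernelPCA(n_components=2, kernel="rbf", gamma=0.5).fit_transform(bursts)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embed)
print(np.bincount(labels))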

Read this paper on arXiv…

S. Modak, A. Chattopadhyay and T. Chattopadhyay
Fri, 17 Mar 17
43/50

Comments: 30 pages, 10 figures

The M33 Synoptic Stellar Survey. II. Mira Variables [SSA]

http://arxiv.org/abs/1703.01000


We present the discovery of 1847 Mira candidates in the Local Group galaxy M33 using a novel semi-parametric periodogram technique coupled with a Random Forest classifier. The algorithms were applied to ~2.4×10^5 I-band light curves previously obtained by the M33 Synoptic Stellar Survey. We derive preliminary Period-Luminosity relations at optical, near- & mid-infrared wavelengths and compare them to the corresponding relations in the Large Magellanic Cloud.

Read this paper on arXiv…

W. Yuan, S. He, L. Macri, et al.
Mon, 6 Mar 17
20/47

Comments: Accepted for publication in the Astronomical Journal

An intermediate-mass black hole in the centre of the globular cluster 47 Tucanae [GA]

http://arxiv.org/abs/1702.02149


Intermediate-mass black holes play a critical role in understanding the evolutionary connection between stellar-mass and super-massive black holes. However, to date the existence of this species of black hole remains ambiguous and its formation process is therefore unknown. It has long been suspected that black holes with masses $10^{2}-10^{4}M_{\odot}$ should form and reside in dense stellar systems. Therefore, dedicated observational campaigns have targeted globular clusters for many decades, searching for signatures of these elusive objects. All candidates found in these targeted searches appear radio dim and do not have the X-ray to radio flux ratio predicted by the fundamental plane for accreting black holes. Based on the lack of an electromagnetic counterpart, upper limits of $2060 M_{\odot}$ and $470 M_{\odot}$ have been placed on the mass of a putative black hole in 47 Tucanae (NGC 104) from radio and X-ray observations, respectively. Here we show there is evidence for a central black hole in 47 Tuc with a mass of $M_{\bullet}\sim2200_{-800}^{+1500} M_{\odot}$ when the dynamical state of the globular cluster is probed with pulsars. The existence of an intermediate-mass black hole in the centre of one of the densest clusters with no detectable electromagnetic counterpart suggests that the black hole is not accreting at a sufficient rate and is therefore, contrary to expectations, gas starved. This intermediate-mass black hole might be a member of an electromagnetically invisible population of black holes that are the elusive seeds leading to the formation of supermassive black holes in galaxies.

Read this paper on arXiv…

B. Kiziltan, H. Baumgardt and A. Loeb
Thu, 9 Feb 17
47/67

Comments: Published in Nature

Method for estimating cycle lengths from multidimensional time series: Test cases and application to a massive "in silico" dataset [SSA]

http://arxiv.org/abs/1612.01791


Many real-world systems exhibit cyclic behavior that is, for example, due to nearly harmonic oscillations being perturbed by the strong fluctuations present in the regime of significant non-linearities. For the investigation of such systems, special techniques relaxing the assumption of periodicity are required. In this paper, we present the generalization of one such technique, namely the D2 phase dispersion statistic, to multidimensional datasets; it is especially suited for the analysis of the outputs of three-dimensional numerical simulations of the full magnetohydrodynamic equations. We motivate the need for such a method with simple test cases, and present an application to a solar-like semi-global numerical dynamo simulation covering nearly 150 magnetic cycles.

Read this paper on arXiv…

N. Olspert, M. Kapyla and J. Pelt
Wed, 7 Dec 16
21/67

Comments: N/A

Learning an Astronomical Catalog of the Visible Universe through Scalable Bayesian Inference [CL]

http://arxiv.org/abs/1611.03404


Celeste is a procedure for inferring astronomical catalogs that attains state-of-the-art scientific results. To date, Celeste has been scaled to at most hundreds of megabytes of astronomical images: Bayesian posterior inference is notoriously demanding computationally. In this paper, we report on a scalable, parallel version of Celeste, suitable for learning catalogs from modern large-scale astronomical datasets. Our algorithmic innovations include a fast numerical optimization routine for Bayesian posterior inference and a statistically efficient scheme for decomposing astronomical optimization problems into subproblems.
Our scalable implementation is written entirely in Julia, a new high-level dynamic programming language designed for scientific and numerical computing. We use Julia’s high-level constructs for shared and distributed memory parallelism, and demonstrate effective load balancing and efficient scaling on up to 8192 Xeon cores on the NERSC Cori supercomputer.

Read this paper on arXiv…

J. Regier, K. Pamnany, R. Giordano, et al.
Fri, 11 Nov 16
11/40

Comments: submitting to IPDPS’17

The Non-homogeneous Poisson Process for Fast Radio Burst Rates [HEAP]

http://arxiv.org/abs/1611.00458


This paper presents the non-homogeneous Poisson process (NHPP) for modeling the rate of fast radio bursts (FRBs) and other infrequently observed astronomical events. The NHPP, well-known in statistics, can model changes in the rate as a function of both astronomical features and the details of an observing campaign. This is particularly helpful for rare events like FRBs because the NHPP can combine information across surveys, making the most of all available information. The goal of the paper is two-fold. First, it is intended to be a tutorial on the use of the NHPP. Second, we build an NHPP model that incorporates beam patterns and a power law flux distribution for the rate of FRBs. Using information from 12 surveys including 15 detections, we find an all-sky FRB rate of 586.88 events per sky per day above a flux of 1 Jy (95\% CI: 271.86, 923.72) and a flux power-law index of 0.91 (95\% CI: 0.57, 1.25). Our rate is lower than other published rates, but consistent with the rate given in Champion et al. 2016.
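
The NHPP machinery is compact: the log likelihood is the sum of log intensities at the detected events minus the integrated intensity over the survey. A single-survey sketch with a power-law flux distribution; the fluxes, threshold and exposure are invented numbers, and the real model additionally folds in beam patterns and multiple surveys.

import numpy as np
from scipy.optimize import minimize

s0, exposure = 1.0, 50.0              # flux threshold [Jy], exposure [sky-days]
fluxes = np.array([1.2, 1.5, 1.8, 2.3, 3.1, 4.0, 6.5, 9.0])   # detections [Jy]

def neg_loglike(params):
    log_a, gam = params               # a: rate above 1 Jy; gam: power-law index
    if gam <= 0:
        return np.inf
    a = np.exp(log_a)
    lam = a * gam * fluxes ** (-gam - 1.0)       # differential rate density
    expected = a * s0 ** (-gam) * exposure       # integrated NHPP intensity
    return -(np.sum(np.log(exposure * lam)) - expected)

fit = minimize(neg_loglike, x0=[np.log(0.2), 1.0], method="Nelder-Mead")
print("rate per sky per day above 1 Jy:", np.exp(fit.x[0]), "index:", fit.x[1])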

Read this paper on arXiv…

E. Lawrence, S. Wiel, C. Law, et al.
Thu, 3 Nov 16
27/57

Comments: 19 pages, 2 figures

Neutron stars in the light of SKA: Data, statistics, and science [IMA]

http://arxiv.org/abs/1610.08139


The Square Kilometre Array (SKA), when it becomes functional, is expected to enrich neutron star (NS) catalogues by at least an order of magnitude over their current state. This includes the discovery of new NS objects leading to better sampling of under-represented NS categories, precision measurements of intrinsic properties such as spin period and magnetic field, as well as data on related phenomena such as microstructure, nulling, glitching, etc. This will present a unique opportunity to seek answers to interesting and fundamental questions about the extreme physics underlying these exotic objects in the universe. In this paper, we first present a meta-analysis (from a methodological viewpoint) of statistical analyses performed using existing NS data, with a two-fold goal. First, this should bring out how statistical models and methods are shaped and dictated by the science problem being addressed. Second, it is hoped that these analyses will provide useful starting points for deeper analyses involving richer data from SKA whenever it becomes available. We also describe a few other areas of NS science which we believe will benefit from SKA and which are of interest to the Indian NS community.

Read this paper on arXiv…

M. Arjunwadkar, A. Kashikar and M. Bagchi
Thu, 27 Oct 16
46/59

Comments: To appear in Journal of Astrophysics and Astronomy (JOAA) special issue on “Science with the SKA: an Indian perspective”

Period estimation for sparsely-sampled quasi-periodic light curves applied to Miras [SSA]

http://arxiv.org/abs/1609.06680


We develop a non-linear semi-parametric Gaussian process model to estimate periods of Miras with sparsely-sampled light curves. The model uses a sinusoidal basis for the periodic variation and a Gaussian process for the stochastic changes. We use maximum likelihood to estimate the period and the parameters of the Gaussian process, while integrating out the effects of other nuisance parameters in the model with respect to a suitable prior distribution obtained from earlier studies. Since the likelihood is highly multimodal in period, we implement a hybrid method that applies a quasi-Newton algorithm to the Gaussian process parameters and searches for the period/frequency parameter over a dense grid.
A large-scale, high-fidelity simulation is conducted to mimic the sampling quality of Mira light curves obtained by the M33 Synoptic Stellar Survey. The simulated data set is publicly available and can serve as a testbed for future evaluation of different period estimation methods. The semi-parametric model outperforms an existing algorithm on this simulated test data set as measured by period recovery rate and quality of the resulting Period-Luminosity relations.

Read this paper on arXiv…

S. He, W. Yuan, J. Huang, et al.
Thu, 22 Sep 16
22/62

Comments: Accepted for publication in The Astronomical Journal. Software package and test data set available at this http URL

The Type Ia Supernova Color-Magnitude Relation and Host Galaxy Dust: A Simple Hierarchical Bayesian Model [CEA]

http://arxiv.org/abs/1609.04470


Conventional Type Ia supernova (SN Ia) cosmology analyses currently use a simplistic linear regression of magnitude versus color and light curve shape, which does not model intrinsic SN Ia variations and host galaxy dust as physically distinct effects, resulting in low color-magnitude slopes. We construct a probabilistic generative model for the distribution of dusty extinguished absolute magnitudes and apparent colors as a convolution of the intrinsic SN Ia color-magnitude distribution and the host galaxy dust reddening-extinction distribution. If the intrinsic color-magnitude (M_B vs. B-V) slope beta_int differs from the host galaxy dust law R_B, this convolution results in a specific curve of mean extinguished absolute magnitude vs. apparent color. The derivative of this curve smoothly transitions from beta_int in the blue tail to R_B in the red tail of the apparent color distribution. The conventional linear fit approximates this effective curve at this transition near the average apparent color, resulting in an apparent slope beta_app between beta_int and R_B. We incorporate these effects into a hierarchical Bayesian statistical model for SN Ia light curve measurements, and analyze a dataset of SALT2 optical light curve fits of a compilation of 277 nearby SN Ia at z < 0.10. The conventional linear fit obtains beta_app = 3. Our model finds a beta_int = 2.2 +/- 0.3 and a distinct dust law of R_B = 3.7 +/- 0.3, consistent with the average for Milky Way dust, while correcting a systematic distance bias of ~0.10 mag in the tails of the apparent color distribution.
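
The paper's central geometric point — regress magnitudes generated with distinct intrinsic slope beta_int and dust law R_B on apparent colour and the fitted slope lands in between — is easy to reproduce in simulation (all distributions below are toy choices, not the paper's fitted model):

import numpy as np

rng = np.random.default_rng(13)
n = 5000
beta_int, R_B = 2.2, 3.7                 # intrinsic slope vs host dust law

c_int = rng.normal(0.0, 0.06, n)         # intrinsic B-V colour
M_int = beta_int * c_int + 0.10 * rng.standard_normal(n)
E_BV = rng.exponential(0.07, n)          # host galaxy dust reddening
c_app = c_int + E_BV                     # apparent colour
M_app = M_int + R_B * E_BV               # extinguished absolute magnitude

beta_app = np.polyfit(c_app, M_app, 1)[0]
print(beta_app)                          # between 2.2 and 3.7, near the usual ~3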

Read this paper on arXiv…

K. Mandel, D. Scolnic, H. Shariff, et al.
Fri, 16 Sep 16
17/63

Comments: 22 pages, 16 figures, submitted to ApJ

astroABC: An Approximate Bayesian Computation Sequential Monte Carlo sampler for cosmological parameter estimation [IMA]

http://arxiv.org/abs/1608.07606


Given the complexity of modern cosmological parameter inference where we are faced with non-Gaussian data and noise, correlated systematics and multi-probe correlated data sets, the Approximate Bayesian Computation (ABC) method is a promising alternative to traditional Markov Chain Monte Carlo approaches in the case where the Likelihood is intractable or unknown. The ABC method is called “Likelihood free” as it avoids explicit evaluation of the Likelihood by using a forward model simulation of the data which can include systematics. We introduce astroABC, an open source ABC Sequential Monte Carlo (SMC) sampler for parameter estimation. A key challenge in astrophysics is the efficient use of large multi-probe datasets to constrain high dimensional, possibly correlated parameter spaces. With this in mind astroABC allows for massive parallelization using MPI, a framework that handles spawning of jobs across multiple nodes. A key new feature of astroABC is the ability to create MPI groups with different communicators, one for the sampler and several others for the forward model simulation, which speeds up sampling time considerably. For smaller jobs the Python multiprocessing option is also available. Other key features include: a Sequential Monte Carlo sampler, a method for iteratively adapting tolerance levels, local covariance estimate using scikit-learn’s KDTree, modules for specifying optimal covariance matrix for a component-wise or multivariate normal perturbation kernel, output and restart files are backed up every iteration, user defined metric and simulation methods, a module for specifying heterogeneous parameter priors including non-standard prior PDFs, a module for specifying a constant, linear, log or exponential tolerance level, well-documented examples and sample scripts. This code is hosted online at https://github.com/EliseJ/astroABC
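
The "likelihood free" idea underneath astroABC, reduced to plain ABC rejection; astroABC's SMC sampler improves on this by iteratively shrinking the tolerance and reweighting particles, which the sketch omits.

import numpy as np

rng = np.random.default_rng(14)
observed = rng.normal(0.3, 1.0, 500)     # pretend these are the real data
obs_stat = observed.mean()               # summary statistic

def simulate(theta):
    # Forward model: anything we can simulate (noise, systematics, selection)
    # even when the likelihood cannot be written down.
    return rng.normal(theta, 1.0, 500).mean()

# Keep prior draws whose simulated summaries land near the observed one.
prior_draws = rng.uniform(-2, 2, 20000)
kept = [th for th in prior_draws if abs(simulate(th) - obs_stat) < 0.02]
print(len(kept), np.mean(kept), np.std(kept))   # approximate posterior of theta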

Read this paper on arXiv…

E. Jennings and M. Madigan
Tue, 30 Aug 16
37/78

Comments: 19 pages, 3 figures. Comments welcome

Bayesian isochrone fitting and stellar ages [GA]

http://arxiv.org/abs/1607.03000


Stellar evolution theory has been extraordinarily successful at explaining the different phases under which stars form, evolve and die. While the strongest constraints have traditionally come from binary stars, the advent of asteroseismology is bringing unique measures in well-characterised stars. For stellar populations in general, however, only photometric measures are usually available, and the comparison with the predictions of stellar evolution theory has mostly been qualitative. For instance, the geometrical shapes of isochrones have been used to infer ages of coeval populations, but without any proper statistical basis. In this chapter we provide a pedagogical review of a Bayesian formalism to make quantitative inferences on the properties of single, binary and small ensembles of stars, including unresolved populations. As an example, we show how stellar evolution theory can be used in a rigorous way as prior information to measure the ages of stars between the ZAMS and the helium flash, and their uncertainties, using photometric data only.

Read this paper on arXiv…

D. Valls-Gabaud
Tue, 12 Jul 16
11/71

Comments: 43 pages, Proceedings of the Evry Schatzman School of Stellar Astrophysics “The ages of stars”, EAS Publications Series, Volume 65

Comparing cosmic web classifiers using information theory [CEA]

http://arxiv.org/abs/1606.06758


We introduce a decision scheme for optimally choosing a classifier, which segments the cosmic web into different structure types (voids, sheets, filaments, and clusters). Our framework, based on information theory, accounts for the design aims of different classes of possible applications: (i) parameter inference, (ii) model selection, and (iii) prediction of new observations. As an illustration, we use cosmographic maps of web-types in the Sloan Digital Sky Survey to assess the relative performance of the classifiers T-web, DIVA and ORIGAMI for: (i) analyzing the morphology of the cosmic web, (ii) discriminating dark energy models, and (iii) predicting galaxy colors. Our study substantiates a data-supported connection between cosmic web analysis and information theory, and paves the path towards principled design of analysis procedures for the next generation of galaxy surveys. We have made the cosmic web maps, galaxy catalog, and analysis scripts used in this work publicly available.

Read this paper on arXiv…

F. Leclercq, G. Lavaux, J. Jasche, et al.
Thu, 23 Jun 16
23/49

Comments: 20 pages, 8 figures, 6 tables. Public data available from the first author’s website (currently this http URL)

Tests for Comparing Weighted Histograms. Review and Improvements [CL]

http://arxiv.org/abs/1606.06591


Histograms with weighted entries are used to estimate probability density functions. Computer simulation is the main application of this type of histogram. A review of chi-square tests for comparing weighted histograms is presented in this paper. Improvements to these tests, giving sizes closer to their nominal values, are proposed. Numerical examples are presented for the evaluation and demonstration of various applications of the tests.
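
The basic chi-square comparison of two weighted histograms uses per-bin sums of weights and of squared weights (the latter estimating the bin variances); this textbook-style statistic is the starting point whose size distortions motivate the improved tests in the paper.

import numpy as np

def weighted_hist_chi2(x1, w1, x2, w2, edges):
    W1, _ = np.histogram(x1, edges, weights=w1)        # sums of weights
    W2, _ = np.histogram(x2, edges, weights=w2)
    S1, _ = np.histogram(x1, edges, weights=w1 ** 2)   # sums of squared weights
    S2, _ = np.histogram(x2, edges, weights=w2 ** 2)
    return np.sum((W1 - W2) ** 2 / (S1 + S2))          # ~ chi2 under H0

rng = np.random.default_rng(15)
x1, x2 = rng.exponential(1.0, 5000), rng.exponential(1.05, 5000)
w1, w2 = rng.uniform(0.5, 1.5, 5000), rng.uniform(0.5, 1.5, 5000)
print(weighted_hist_chi2(x1, w1, x2, w2, np.linspace(0, 5, 21)))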

Read this paper on arXiv…

N. Gagunashvili
Wed, 22 Jun 16
21/50

Comments: 23 pages, 2 figures. arXiv admin note: text overlap with arXiv:0905.4221

A Goldilocks principle for modeling radial velocity noise [EPA]

http://arxiv.org/abs/1606.05196


Doppler measurements of stars are diluted and distorted by stellar activity noise. Different choices of noise models and statistical methods have led to much controversy in the confirmation of exoplanet candidates obtained through analysing radial velocity data. To quantify the limitations of various models and methods, we compare different noise models and signal detection criteria for various simulated and real data sets in the Bayesian framework. According to our analyses, the white noise model tends to interpret noise as signal, leading to false positives. On the other hand, the red noise models are likely to interpret signal as noise, resulting in false negatives. We find that the Bayesian information criterion combined with a Bayes factor threshold of 150 can efficiently rule out false positives and confirm true detections. We further propose a Goldilocks principle aimed at modeling radial velocity noise to avoid both false positives and false negatives. We propose that the noise model with $R'_{HK}$-dependent jitter be used in combination with the moving average model to detect planetary signals for M dwarfs. Our work may also shed light on noise modeling for hotter stars, and provide a valid approach for finding similar principles in other disciplines.

Read this paper on arXiv…

F. Feng, M. Tuomi, H. Jones, et al.
Fri, 17 Jun 16
36/65

Comments: 14 pages, 6 figures, accepted for publication in MNRAS

Using Extreme Value Theory for Determining the Probability of Carrington-Like Solar Flares [CL]

http://arxiv.org/abs/1604.03325


Space weather events can negatively affect satellites, the electricity grid, satellite navigation systems and human health. As a consequence, extreme space weather has been added to the UK and other national risk registers. However, by their very nature, extreme events occur rarely and statistical methods are required to determine the probability of occurrence of solar storms. Space weather events can be characterised by a number of natural phenomena such as X-ray (solar) flares, solar energetic particle (SEP) fluxes, coronal mass ejections and various geophysical indices (Dst, Kp, F10.7). Here we use extreme value theory (EVT) to investigate the probability of extreme solar flares. Previous work has suggested that the distribution of solar flares follows a power law. However, such an approach can lead to overly “fat” tails in the probability distribution function and thus to an underestimation of the return time of such events. Using EVT and GOES X-ray flux data we find that the expected 150-year return level is an X60 flare ($6\times10^{-3}$ W m$^{-2}$, 1-8 Å X-ray flux). We also show that the EVT results are consistent with flare data from the Kepler space telescope mission.

Read this paper on arXiv…

S. Elvidge and M. Angling
Wed, 13 Apr 16
43/60

Comments: 10 pages, 3 figures, submitted to Nature

Photo-z Estimation: An Example of Nonparametric Conditional Density Estimation under Selection Bias [CL]

http://arxiv.org/abs/1604.01339


Redshift is a key quantity for inferring cosmological model parameters. In photometric redshift estimation, cosmologists use the coarse data collected from the vast majority of galaxies to predict the redshift of individual galaxies. To properly quantify the uncertainty in the predictions, however, one needs to go beyond standard regression and instead estimate the full conditional density f(z|x) of a galaxy’s redshift z given its photometric covariates x. The problem is further complicated by selection bias: usually only the rarest and brightest galaxies have known redshifts, and these galaxies have characteristics and measured covariates that do not necessarily match those of more numerous and dimmer galaxies of unknown redshift. Unfortunately, there is not much research on how to best estimate complex multivariate densities in such settings. Here we describe a general framework for properly constructing and assessing nonparametric conditional density estimators under selection bias, and for combining two or more estimators for optimal performance. We propose new improved photo-z estimators and illustrate our methods on data from the Sloan Digital Sky Survey and an application to galaxy-galaxy lensing. Although our main application is photo-z estimation, our methods are relevant to any high-dimensional regression setting with complicated asymmetric and multimodal distributions in the response variable.
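
One common way to handle this kind of selection bias (not necessarily the authors' estimator) is to reweight the labeled sample by the density ratio f_target/f_labeled, which can itself be estimated with a probabilistic classifier:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(16)
# The labeled ("spectroscopic") subsample is biased toward bright objects.
x_target = rng.normal(22.0, 1.5, (4000, 1))            # photometric magnitudes
keep = rng.uniform(size=4000) < 1 / (1 + np.exp(x_target[:, 0] - 21.0))
x_labeled = x_target[keep]

# Classifier-based density ratio estimate: f_target(x) / f_labeled(x).
X = np.vstack([x_labeled, x_target])
z = np.concatenate([np.zeros(len(x_labeled)), np.ones(len(x_target))])
p = LogisticRegression().fit(X, z).predict_proba(x_labeled)[:, 1]
weights = (p / (1 - p)) * (len(x_labeled) / len(x_target))
print(weights.min(), weights.max())    # faint, under-represented objects upweighted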

Read this paper on arXiv…

R. Izbicki, A. Lee and P. Freeman
Thu, 7 Apr 16
42/51

Comments: N/A

Unfolding problem clarification and solution validation [CL]

http://arxiv.org/abs/1602.05834


The formulation of the unfolding problem for correcting experimental data distortions due to finite resolution and limited detector acceptance is discussed. A novel validation of the problem solution is proposed. Attention is drawn to the fact that different unfolded distributions may satisfy the validation criteria, in which case a conservative approach using entropy is suggested. The importance of the analysis of residuals is demonstrated.
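
For concreteness, the flavour of the unfolding problem and of one regularized solution, in a toy: smear a spectrum with a known response matrix, invert with a Tikhonov curvature penalty, and inspect the residuals the paper emphasises. The response width and penalty strength are arbitrary choices.

import numpy as np

rng = np.random.default_rng(17)
n = 40
truth = 100.0 * np.exp(-0.5 * ((np.arange(n) - 15) / 4.0) ** 2)

# Response matrix: finite resolution smears neighbouring bins.
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
R = np.exp(-0.5 * ((i - j) / 2.0) ** 2)
R /= R.sum(axis=1, keepdims=True)

measured = rng.poisson(R @ truth)

# Naive inversion amplifies noise; the curvature penalty tames it.
D = np.diff(np.eye(n), 2, axis=0)      # second-difference operator
tau = 1.0                              # regularization strength (bias/variance)
unfolded = np.linalg.solve(R.T @ R + tau * D.T @ D, R.T @ measured)
residuals = measured - R @ unfolded    # validation: these should look like noise
print(np.round(unfolded[10:21]))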

Read this paper on arXiv…

N. Gagunashvili
Fri, 19 Feb 16
22/50

Comments: 9 pages,4 figures

Bayesian Estimates of Astronomical Time Delays between Gravitationally Lensed Stochastic Light Curves [IMA]

http://arxiv.org/abs/1602.01462


The gravitational field of a galaxy can act as a lens and deflect the light emitted by a more distant object such as a quasar. If the galaxy is a strong gravitational lens, it can produce multiple images of the same quasar in the sky. Since the light in each gravitationally lensed image traverses a different path length from the quasar to the Earth, fluctuations in the source brightness are observed in the several images at different times. The time delay between these fluctuations can be used to constrain cosmological parameters and can be inferred from the time series of brightness data or light curves of each image. To estimate the time delay, we construct a model based on a state-space representation for irregularly observed time series generated by a latent continuous-time Ornstein-Uhlenbeck process. We account for microlensing, an additional source of independent long-term extrinsic variability, via a polynomial regression. Our Bayesian strategy adopts a Metropolis-Hastings within Gibbs sampler. We improve the sampler by using an ancillarity-sufficiency interweaving strategy and adaptive Markov chain Monte Carlo. We introduce a profile likelihood of the time delay as an approximation of its marginal posterior distribution. The Bayesian and profile likelihood approaches complement each other, producing almost identical results; the Bayesian method is more principled but the profile likelihood is simpler to implement. We demonstrate our estimation strategy using simulated data of doubly- and quadruply-lensed quasars, and observed data from quasars Q0957+561 and J1029+2623.
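
A stripped-down version of the problem: simulate an Ornstein-Uhlenbeck (damped random walk) light curve, observe two noisy, scaled, shifted copies, and estimate the delay, here with a simple cross-correlation scan rather than the paper's Bayesian state-space machinery.

import numpy as np

rng = np.random.default_rng(18)
n, delay = 2000, 35                          # delay in grid units

lc = np.zeros(n + delay)                     # O-U process on a fine grid
for i in range(1, len(lc)):
    lc[i] = lc[i - 1] * np.exp(-1 / 100) + 0.1 * rng.standard_normal()

img_a = lc[delay:] + 0.02 * rng.standard_normal(n)          # leading image
img_b = 0.8 * lc[:n] + 0.02 * rng.standard_normal(n)        # delayed, dimmer copy

lags = np.arange(-100, 101)
cc = [np.corrcoef(img_b[100 + l: n - 100 + l], img_a[100: n - 100])[0, 1]
      for l in lags]
print("image B lags image A by:", lags[np.argmax(cc)])      # ~ +35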

Read this paper on arXiv…

H. Tak, K. Mandel, D. van Dyk, et al.
Fri, 5 Feb 16
31/47

Comments: N/A

Using hydrodynamical simulations of stellar atmospheres for periodogram standardization : application to exoplanet detection [CL]

http://arxiv.org/abs/1601.07375


Our aim is to devise a detection method for exoplanet signatures (multiple sinusoids) that is both powerful and robust to partially unknown statistics under the null hypothesis. In the considered application, the noise is mostly created by the stellar atmosphere, with statistics depending on the complicated interplay of several parameters. Recent progress in hydrodynamic (HD) simulations shows, however, that realistic stellar noise realizations can be numerically produced off-line by astrophysicists. We propose a detection method that is calibrated by HD simulations and analyze its performance. A comparison of the theoretical results with simulations on synthetic and real data shows that the proposed method is powerful and robust.

Read this paper on arXiv…

S. Sulis, D. Mary and L. Bigot
Fri, 29 Jan 16
10/52

Comments: 5 pages, 3 figures. This manuscript was submitted and accepted to the 41st IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2016

Testing the anisotropy in the angular distribution of $Fermi$/GBM gamma-ray bursts [HEAP]

http://arxiv.org/abs/1512.02865


Gamma-ray bursts (GRBs) were confirmed to be of extragalactic origin due to their isotropic angular distribution, combined with the fact that they exhibited an intensity distribution that deviated strongly from the $-3/2$ power law. This finding was later confirmed with the first redshift, equal to at least $z=0.835$, measured for GRB970508. Despite this result, the data from $CGRO$/BATSE and $Swift$/BAT indicate that long GRBs are indeed distributed isotropically, but the distribution of short GRBs is anisotropic. $Fermi$/GBM has detected 1669 GRBs to date, and their sky distribution is examined in this paper. A number of statistical tests are applied: nearest neighbour analysis, fractal dimension, dipole and quadrupole moments of the distribution function decomposed into spherical harmonics, binomial test, and the two-point angular correlation function. Monte Carlo benchmark testing of each test is performed in order to evaluate its reliability. It is found that short GRBs are distributed anisotropically on the sky, while long ones have an isotropic distribution. The probability that these results are not a chance occurrence is at least 99.98\% and 30.68\% for short and long GRBs, respectively. The cosmological context of this finding and its relation to large-scale structures is briefly discussed.
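
As one example of the Monte Carlo benchmarking described, a sketch of a dipole-moment test against isotropic realizations (coordinates here are random placeholders for the GBM catalogue positions):

```python
import numpy as np

rng = np.random.default_rng(2)

def unit_vectors(ra, dec):
    """Cartesian unit vectors from RA/Dec in radians."""
    return np.column_stack([np.cos(dec) * np.cos(ra),
                            np.cos(dec) * np.sin(ra),
                            np.sin(dec)])

def dipole(ra, dec):
    """Magnitude of the mean unit vector; approaches 0 for perfect isotropy."""
    return np.linalg.norm(unit_vectors(ra, dec).mean(axis=0))

def isotropic_sample(n):
    ra = rng.uniform(0, 2 * np.pi, n)
    dec = np.arcsin(rng.uniform(-1, 1, n))    # uniform on the sphere
    return ra, dec

n_grb = 1669
observed = dipole(*isotropic_sample(n_grb))   # replace with the GBM RA/Dec sample
null = [dipole(*isotropic_sample(n_grb)) for _ in range(2000)]
p_value = np.mean(np.array(null) >= observed)          # Monte Carlo p-value
```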

Read this paper on arXiv…

M. Tarnopolski
Thu, 10 Dec 15
3/63

Comments: 13 pages, 5 figures; submitted; comments very welcome

An integrative approach based on probabilistic modelling and statistical inference for morpho-statistical characterization of astronomical data [CL]

http://arxiv.org/abs/1510.05553


This paper describes several applications in astronomy and cosmology that are addressed using probabilistic modelling and statistical inference.

Read this paper on arXiv…

R. Stoica, S. Liu, L. Liivamagi, et. al.
Tue, 20 Oct 15
45/92

Comments: N/A

Detecting Effects of Filaments on Galaxy Properties in the Sloan Digital Sky Survey III [GA]

http://arxiv.org/abs/1509.06376


We study the effects of filaments on galaxy properties in the Sloan Digital Sky Survey (SDSS) Data Release 12 using filaments from the `Cosmic Web Reconstruction’ catalogue (Chen et al. 2015a), a publicly available filament catalogue for SDSS. Since filaments are tracers of medium-to-high density regions, we expect that galaxy properties associated with the environment depend on the distance to the nearest filament. Our analysis demonstrates that red or high-mass galaxies tend to reside closer to filaments than blue or low-mass galaxies. After adjusting for the effect of stellar mass, on average, late-forming or large galaxies have a shorter distance to filaments than early-forming or small galaxies. For the Main galaxy sample, all signals are very significant ($> 5\sigma$). For the LOWZ and CMASS samples, most of the signals are significant (with $> 3\sigma$). The filament effects we observe persist until z = 0.7 (the edge of the CMASS sample). Comparing our results to those using the galaxy distances from redMaPPer galaxy clusters as a reference, we find a similar result between filaments and clusters. Our findings illustrate the strong correlation of galaxy properties with proximity to density ridges, strongly supporting the claim that density ridges are good tracers of filaments.

Read this paper on arXiv…

Y. Chen, S. Ho, R. Mandelbaum, et. al.
Wed, 23 Sep 15
41/63

Comments: 12 pages, 6 figures, 2 tables

Cosmic Web Reconstruction through Density Ridges: Catalogue [CEA]

http://arxiv.org/abs/1509.06443


We construct a catalogue for filaments using a novel approach called SCMS (subspace constrained mean shift; Ozertem & Erdogmus 2011; Chen et al. 2015). SCMS is a gradient-based method that detects filaments through density ridges (smooth curves tracing high-density regions). A great advantage of SCMS is its uncertainty measure, which allows an evaluation of the errors for the detected filaments. To detect filaments, we use data from the Sloan Digital Sky Survey, consisting of three galaxy samples: the NYU main galaxy sample (MGS), the LOWZ sample and the CMASS sample. Each of the three datasets covers a different redshift region, so that the combined sample allows detection of filaments up to z = 0.7. Our filament catalogue consists of a sequence of two-dimensional filament maps at different redshifts that provide several useful statistics on the evolution of the cosmic web. To construct the maps, we select spectroscopically confirmed galaxies within 0.050 < z < 0.700 and partition them into 130 bins. For each bin, we ignore the redshift, treating the galaxy observations as 2-D data, and detect filaments using SCMS. The filament catalogue consists of 130 individual 2-D filament maps, and each map comprises points on the detected filaments that describe the filamentary structures at a particular redshift. We also apply our filament catalogue to investigate galaxy luminosity and its relation with distance to filament. Using a volume-limited sample, we find strong evidence (6.1$\sigma$ – 12.3$\sigma$) that galaxies close to filaments are generally brighter than those at significant distance from filaments.

Read this paper on arXiv…

Y. Chen, S. Ho, J. Brinkmann, et. al.
Wed, 23 Sep 15
45/63

Comments: 14 pages, 12 figures, 4 tables

Polarized CMB recovery with sparse component separation [CEA]

http://arxiv.org/abs/1508.07131


The polarization modes of the cosmic microwave background are an invaluable source of information for cosmology, and a unique window to probe the energy scale of inflation. Extracting such information from microwave surveys requires disentangling foreground emissions from the cosmological signal, which boils down to solving a component separation problem. Component separation techniques have been widely studied for the recovery of CMB temperature anisotropies, but quite rarely for the polarization modes. In this case, most component separation techniques make use of second-order statistics to discriminate between the various components. More recent methods, which instead emphasize the sparsity of the components in the wavelet domain, have been shown to provide low-foreground, full-sky estimates of the CMB temperature anisotropies. Building on sparsity, the present paper introduces a new component separation technique dubbed PolGMCA (Polarized Generalized Morphological Component Analysis), which refines previous work to specifically tackle the estimation of the polarized CMB maps: i) it benefits from a recently introduced sparsity-based mechanism to cope with partially correlated components; ii) it builds upon estimator aggregation techniques to further yield a better trade-off between noise contamination and non-Gaussian foreground residuals. The PolGMCA algorithm is evaluated on full-sky simulations of the polarized microwave sky built with the Planck Sky Model (PSM), which show that the proposed method achieves a precise recovery of the CMB map in polarization with low noise and foreground contamination residuals. It provides improvements with respect to standard methods, especially toward the Galactic center, where estimating the CMB is challenging.

Read this paper on arXiv…

J. Bobin, F. Sureau and J. Starck
Mon, 31 Aug 15
3/63

Comments: Accepted to A&A, August 2015

Detecting Abrupt Changes in the Spectra of High-Energy Astrophysical Sources [CL]

http://arxiv.org/abs/1508.07083


Variable-intensity astronomical sources are the result of complex and often extreme physical processes. Abrupt changes in source intensity are typically accompanied by equally sudden spectral shifts, i.e., sudden changes in the wavelength distribution of the emission. This article develops a method for modeling photon counts collected from observations of such sources. We embed change points into a marked Poisson process, where photon wavelengths are regarded as marks and both the Poisson intensity parameter and the distribution of the marks are allowed to change. We believe this is the first effort to embed change points into a marked Poisson process. Between the change points, the spectrum is modeled non-parametrically using a mixture of a smooth radial basis expansion and a number of local deviations from the smooth term representing spectral emission lines. Because the model is over-parameterized, we employ an $\ell_1$ penalty. The tuning parameter in the penalty and the number of change points are determined via the minimum description length principle. Our method is validated via a series of simulation studies and its practical utility is illustrated in the analysis of the ultra-fast rotating yellow giant star known as FK Com.

Read this paper on arXiv…

R. Wong, V. Kashyap, T. Lee, et. al.
Mon, 31 Aug 15
21/63

Comments: 27 pages, 5 figures

Multivariate Approaches to Classification in Extragalactic Astronomy [GA]

http://arxiv.org/abs/1508.06756


Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is no exception and is now facing a deluge of data. For galaxies, the century-old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications, most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We emphasize the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies.

Read this paper on arXiv…

D. Fraix-Burnet, M. Thuillard and A. Chattopadhyay
Fri, 28 Aug 15
15/49

Comments: Open Access paper. DOI: 10.3389/fspas.2015.00003

Investigating Galaxy-Filament Alignments in Hydrodynamic Simulations using Density Ridges [CEA]

http://arxiv.org/abs/1508.04149


In this paper, we study the filamentary structures and the galaxy alignment along filaments at redshift $z=0.06$ in the MassiveBlack-II simulation, a state-of-the-art, high-resolution hydrodynamical cosmological simulation which includes stellar and AGN feedback in a volume of (100 Mpc$/h$)$^3$. The filaments are constructed using the subspace constrained mean shift (SCMS; Ozertem & Erdogmus (2011) and Chen et al. (2015a)). First, we show that filaments reconstructed from galaxies and filaments reconstructed from dark matter particles are similar to each other; over $50\%$ of the points on the galaxy filaments have a corresponding point on the dark matter filaments within distance $0.13$ Mpc$/h$ (and vice versa), and this distance is even smaller in high-density regions. Second, we observe the alignment of the major principal axis of a galaxy with respect to the orientation of its nearest filament and detect a $2.5$ Mpc$/h$ critical radius for the filament’s influence on the alignment when the subhalo mass of the galaxy is between $10^9M_\odot/h$ and $10^{12}M_\odot/h$. Moreover, we find the alignment signal to increase significantly with the subhalo mass. Third, when a galaxy is close to filaments (less than $0.25$ Mpc$/h$), the galaxy alignment toward the nearest galaxy group depends on the galaxy subhalo mass. Finally, we find that galaxies close to filaments or groups tend to be rounder than those away from filaments or groups.

Read this paper on arXiv…

Y. Chen, S. Ho, A. Tenneti, et. al.
Wed, 19 Aug 15
5/32

Comments: 11 pages, 10 figures

Improving the precision matrix for precision cosmology [CEA]

http://arxiv.org/abs/1508.03162


The estimation of cosmological constraints from observations of the large scale structure of the Universe, such as the power spectrum or the correlation function, requires the knowledge of the inverse of the associated covariance matrix, namely the precision matrix, $\mathbf{\Psi}$. In most analyses, $\mathbf{\Psi}$ is estimated from a limited set of mock catalogues. Depending on how many mocks are used, this estimation has an associated error which must be propagated into the final cosmological constraints. For future surveys such as Euclid and DESI, the control of this additional uncertainty requires a prohibitively large number of mock catalogues. In this work we test a novel technique for the estimation of the precision matrix, the covariance tapering method, in the context of baryon acoustic oscillation measurements. Even though this technique was originally devised as a way to speed up maximum likelihood estimations, our results show that it also reduces the impact of noisy precision matrix estimates on the derived confidence intervals, without introducing biases on the target parameters. The application of this technique can help future surveys to reach their true constraining power using a significantly smaller number of mock catalogues.
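
A minimal sketch, under simplifying assumptions (a generic bin-separation metric, random stand-in mocks, and a Wendland taper), of how covariance tapering modifies the precision matrix estimate:

```python
import numpy as np

def wendland_taper(sep, support):
    """Wendland-type taper: smooth, positive definite, zero beyond `support`."""
    x = np.clip(sep / support, 0.0, 1.0)
    return (1 - x)**4 * (4 * x + 1)

nbins = 40
sep = np.abs(np.subtract.outer(np.arange(nbins), np.arange(nbins)))  # bin separation
mocks = np.random.default_rng(3).normal(size=(200, nbins))           # stand-in mocks
cov = np.cov(mocks, rowvar=False)                                    # sample covariance

# Element-wise tapering damps noisy long-range entries before inversion.
tapered_cov = cov * wendland_taper(sep, support=15)
precision = np.linalg.inv(tapered_cov)        # tapered precision matrix estimate
```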

Read this paper on arXiv…

D. Paz and A. Sanchez
Fri, 14 Aug 15
43/49

Comments: 9 pages, 7 figures, submitted to MNRAS

Trans-Dimensional Bayesian Inference for Gravitational Lens Substructures [IMA]

http://arxiv.org/abs/1508.00662


We introduce a Bayesian solution to the problem of inferring the density profile of strong gravitational lenses when the lens galaxy may contain multiple dark or faint substructures. The source and lens models are based on a superposition of an unknown number of non-negative basis functions (or “blobs”) whose form was chosen with speed as a primary criterion. The prior distribution for the blobs’ properties is specified hierarchically, so the mass function of substructures is a natural output of the method. We use reversible jump Markov Chain Monte Carlo (MCMC) within Diffusive Nested Sampling (DNS) to sample the posterior distribution and evaluate the marginal likelihood of the model, including the summation over the unknown number of blobs in the source and the lens. We demonstrate the method on a simulated data set with a single substructure, which is recovered well with moderate uncertainties. We also apply the method to the g-band image of the “Cosmic Horseshoe” system, and find some hints of potential substructures. However, we caution that such results could also be caused by misspecifications in the model (such as the shape of the smooth lens component or the point spread function), which are difficult to guard against in full generality.

Read this paper on arXiv…

B. Brewer, D. Huijser and G. Lewis
Wed, 5 Aug 15
5/46

Comments: Submitted. 10 pages, 10 figures

Weighted ABC: a new strategy for cluster strong lensing cosmology with simulations [CEA]

http://arxiv.org/abs/1507.05617


Comparisons between observed and predicted strong lensing properties of galaxy clusters have been routinely used to claim either tension or consistency with $\Lambda$CDM cosmology. However, standard approaches to such cosmological tests are unable to quantify the preference for one cosmology over another. We advocate using a `weighted’ variant of approximate Bayesian computation (ABC), whereby the parameters of the scaling relation between Einstein radii and cluster mass, $\alpha$ and $\beta$, are treated as summary statistics. We demonstrate, for the first time, a method of estimating the likelihood of the data under the $\Lambda$CDM framework, using the X-ray selected $z>0.5$ MACS clusters as a case in point and employing both N-body and hydrodynamic simulations of clusters. We investigate the uncertainty in the calculated likelihood, and the consequential ability to compare competing cosmologies, that arises from incomplete descriptions of baryonic processes, discrepancies in cluster selection criteria, redshift distribution, and dynamical state. The relation between triaxial cluster masses at various overdensities provides a promising alternative to the strong lensing test.
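
In spirit, the weighting step can be sketched as follows (all numbers and the simulator are hypothetical placeholders; in the paper the summaries are the scaling-relation parameters fit to cluster simulations):

```python
import numpy as np

rng = np.random.default_rng(4)
s_obs = np.array([1.0, 0.35])       # observed (alpha, beta) summaries; hypothetical

def simulate_summaries(theta):
    """Stand-in for fitting the Einstein-radius--mass relation to a mock sample."""
    return theta + rng.normal(scale=0.05, size=2)

def abc_weights(prior_draws, eps=0.1):
    """Weight each prior draw by a Gaussian kernel in summary-statistic space."""
    weights = []
    for theta in prior_draws:
        d = np.linalg.norm(simulate_summaries(theta) - s_obs)
        weights.append(np.exp(-0.5 * (d / eps)**2))
    return np.array(weights)

draws = rng.uniform([0.5, 0.0], [1.5, 1.0], size=(5000, 2))   # flat priors
w = abc_weights(draws)
posterior_mean = (draws * w[:, None]).sum(axis=0) / w.sum()
```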

Read this paper on arXiv…

M. Killedar, S. Borgani, D. Fabjan, et. al.
Wed, 22 Jul 15
34/59

Comments: 15 pages, 6 figures, 1 table, submitted to MNRAS, comments welcome

The Overlooked Potential of Generalized Linear Models in Astronomy-III: Bayesian Negative Binomial Regression and Globular Cluster Populations [IMA]

http://arxiv.org/abs/1506.04792


In this paper, the third in a series illustrating the power of generalized linear models (GLMs) for the astronomical community, we elucidate the potential of the class of GLMs which handles count data. The size of a galaxy’s globular cluster population $N_{\rm GC}$ is a prolonged puzzle in the astronomical literature. It falls in the category of count data analysis, yet it is usually modelled as if it were a continuous response variable. We have developed a Bayesian negative binomial regression model to study the connection between $N_{\rm GC}$ and the following galaxy properties: central black hole mass, dynamical bulge mass, bulge velocity dispersion, and absolute visual magnitude. The methodology introduced herein naturally accounts for heteroscedasticity, intrinsic scatter, errors in measurements in both axes (either discrete or continuous), and allows modelling the population of globular clusters on their natural scale as a non-negative integer variable. Prediction intervals of 99% around the trend for the expected $N_{\rm GC}$ comfortably envelop the data, notably including the Milky Way, which has hitherto been considered a problematic outlier. Finally, we demonstrate how random intercept models can incorporate information on each particular galaxy morphological type. Bayesian variable selection methodology allows for automatically identifying galaxy types with different productions of GCs, suggesting that, on average, S0 galaxies have a GC population 35% smaller than other types with similar brightness.
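
For readers wanting a starting point, a frequentist stand-in for the count-data idea (the paper's model is Bayesian, with errors-in-variables and random intercepts that this sketch omits; the predictor values are simulated placeholders):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 200
log_mbulge = rng.normal(10.5, 0.5, n)                 # hypothetical predictor
mu = np.exp(-20.0 + 2.0 * log_mbulge)                 # mean GC count (log link)
n_gc = rng.negative_binomial(n=5, p=5 / (5 + mu))     # overdispersed counts

# Negative binomial GLM: counts modelled on their natural integer scale.
X = sm.add_constant(log_mbulge)
model = sm.GLM(n_gc, X, family=sm.families.NegativeBinomial(alpha=0.2))
result = model.fit()
print(result.summary())
```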

Read this paper on arXiv…

R. Souza, J. Hilbe, B. Buelens, et. al.
Wed, 17 Jun 15
42/47

Comments: 14 pages, 12 figures. Comments are welcome

Cosmic web-type classification using decision theory [CEA]

http://arxiv.org/abs/1503.00730


We propose a decision criterion for segmenting the cosmic web into different structure types (voids, sheets, filaments and clusters) on the basis of their respective probabilities and the strength of data constraints. Our approach is inspired by an analysis of games of chance where the gambler only plays if a positive expected net gain can be achieved based on some degree of privileged information. The result is a general solution for classification problems in the face of uncertainty, including the option of not committing to a class for a candidate object. As an illustration, we produce high-resolution maps of web-type constituents in the nearby Universe as probed by the Sloan Digital Sky Survey main galaxy sample. Other possible applications include the selection and labeling of objects in catalogs derived from astronomical survey data.
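
The decision criterion can be sketched compactly; the gain and loss values below are illustrative, not those derived in the paper:

```python
import numpy as np

def classify(probs, gain_correct=1.0, loss_wrong=1.0):
    """probs: (n_objects, 4) posterior probabilities for void/sheet/filament/cluster."""
    # Expected net gain of committing to class k: p_k * gain - (1 - p_k) * loss.
    expected_gain = probs * gain_correct - (1 - probs) * loss_wrong
    best = expected_gain.argmax(axis=1)
    commit = expected_gain.max(axis=1) > 0.0   # only "play" if the gain is positive
    return np.where(commit, best, -1)          # -1 encodes "undecided"

probs = np.array([[0.7, 0.1, 0.1, 0.1],
                  [0.3, 0.3, 0.2, 0.2]])
print(classify(probs))   # first object -> class 0, second -> undecided
```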

Read this paper on arXiv…

F. Leclercq, J. Jasche and B. Wandelt
Wed, 4 Mar 15
45/45

Comments: 5 pages, 2 figures, submitted to A&A Letters

Constrained correlation functions from the Millennium Simulation [CEA]

http://arxiv.org/abs/1502.04491


Context. In previous work, we developed a quasi-Gaussian approximation for the likelihood of correlation functions, which, in contrast to the usual Gaussian approach, incorporates fundamental mathematical constraints on correlation functions. The analytical computation of these constraints is only feasible in the case of correlation functions of one-dimensional random fields.
Aims. In this work, we aim to obtain corresponding constraints in the case of higher-dimensional random fields and test them in a more realistic context.
Methods. We develop numerical methods to compute the constraints on correlation functions which are also applicable for two- and three-dimensional fields. In order to test the accuracy of the numerically obtained constraints, we compare them to the analytical results for the one-dimensional case. Finally, we compute correlation functions from the halo catalog of the Millennium Simulation, check whether they obey the constraints, and examine the performance of the transformation used in the construction of the quasi-Gaussian likelihood.
Results. We find that our numerical methods of computing the constraints are robust and that the correlation functions measured from the Millennium Simulation obey them. Despite the fact that the measured correlation functions lie well inside the allowed region of parameter space, i.e. far away from the boundaries of the allowed volume defined by the constraints, we find strong indications that the quasi-Gaussian likelihood yields a substantially more accurate description than the Gaussian one.

Read this paper on arXiv…

P. Wilking, R. Roseler and P. Schneider
Tue, 17 Feb 15
58/60

Comments: 11 pages, 13 figures, submitted to A&A

Fast Bayesian Inference for Exoplanet Discovery in Radial Velocity Data [IMA]

http://arxiv.org/abs/1501.06952


Inferring the number of planets $N$ in an exoplanetary system from radial velocity (RV) data is a challenging task. Recently, it has become clear that RV data can contain periodic signals due to stellar activity, which can be difficult to distinguish from planetary signals. However, even doing the inference under a given set of simplifying assumptions (e.g. no stellar activity) can be difficult. It is common for the posterior distribution for the planet parameters, such as orbital periods, to be multimodal and to have other awkward features. In addition, when $N$ is unknown, the marginal likelihood (or evidence) as a function of $N$ is required. Rather than doing separate runs with different trial values of $N$, we propose an alternative approach using a trans-dimensional Markov Chain Monte Carlo method within Nested Sampling. The posterior distribution for $N$ can be obtained with a single run. We apply the method to $\nu$ Oph and Gliese 581, finding moderate evidence for additional signals in $\nu$ Oph with periods of 36.11 $\pm$ 0.034 days, 75.58 $\pm$ 0.80 days, and 1709 $\pm$ 183 days; the posterior probability that at least one of these exists is 85%. The results also suggest Gliese 581 hosts many (7-15) “planets” (or other causes of periodic signals), but only 4-6 have well determined periods. The analysis of both of these datasets shows that phase transitions exist which are difficult to negotiate without Nested Sampling.

Read this paper on arXiv…

B. Brewer and C. Donovan
Thu, 29 Jan 15
36/49

Comments: Accepted for publication in MNRAS. 9 pages, 12 figures. Code at this http URL

Cosmic Web Reconstruction through Density Ridges: Method and Algorithm [CEA]

http://arxiv.org/abs/1501.05303


The detection and characterization of filamentary structures in the cosmic web allows cosmologists to constrain parameters that dictate the evolution of the Universe. While many filament estimators have been proposed, they generally lack estimates of uncertainty, reducing their inferential power. In this paper, we demonstrate how one may apply the Subspace Constrained Mean Shift (SCMS) algorithm (Ozertem and Erdogmus (2011); Genovese et al. (2012)) to uncover filamentary structure in galaxy data. The SCMS algorithm is a gradient ascent method that models filaments as density ridges, one-dimensional smooth curves that trace high-density regions within the point cloud. We also demonstrate how augmenting the SCMS algorithm with bootstrap-based methods of uncertainty estimation allows one to place uncertainty bands around putative filaments. We apply the SCMS method to datasets sampled from the P3M N-body simulation, with galaxy number densities consistent with SDSS and WFIRST-AFTA, and to LOWZ and CMASS data from the Baryon Oscillation Spectroscopic Survey (BOSS). To further assess the efficacy of SCMS, we compare the relative locations of BOSS filaments with galaxy clusters in the redMaPPer catalog, and find that redMaPPer clusters are significantly closer (with p-values $< 10^{-9}$) to SCMS-detected filaments than to randomly selected galaxies.
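
A minimal sketch of one SCMS iteration for 2-D data (a simplified reading of the algorithm: the mean-shift step is projected onto the local Hessian eigendirection orthogonal to a one-dimensional ridge; bandwidth selection and convergence checks are omitted):

```python
import numpy as np

def scms_step(point, data, h):
    """One subspace-constrained mean-shift update for a single 2-D point."""
    d = (data - point) / h
    w = np.exp(-0.5 * np.sum(d**2, axis=1))            # Gaussian kernel weights
    # Local Hessian of the kernel density estimate, up to positive constants.
    hess = (w[:, None, None] *
            (d[:, :, None] * d[:, None, :] - np.eye(2))).sum(axis=0)
    eigval, eigvec = np.linalg.eigh(hess)
    v = eigvec[:, :1]               # eigendirection orthogonal to a 1-D ridge
    shift = (w[:, None] * data).sum(axis=0) / w.sum() - point   # mean-shift step
    return point + (v @ (v.T @ shift))                 # projected update

# Iterating scms_step from each data point until convergence moves the points
# onto the density ridges, i.e. the detected filaments.
```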

Read this paper on arXiv…

Y. Chen, S. Ho, P. Freeman, et. al.
Fri, 23 Jan 15
65/65

Comments: 15 pages, 15 figures, 1 table

Relative distribution of dark matter and stellar mass in three massive galaxy clusters [CEA]

http://arxiv.org/abs/1501.01814


This work observationally addresses the relative distribution of total and optically luminous matter in galaxy clusters by computing the radial profile of the stellar-to-total mass ratio. We adopt state-of-the-art accurate lensing masses free from assumptions about the mass radial profile, and we use extremely deep, multicolor, wide-field optical images to distinguish star formation from stellar mass, to properly compute the mass in low-mass galaxies and in galaxies outside the red sequence, and to allow a clustercentric-dependent contribution from low-mass galaxies. We pay special attention to issues and contributions that are usually underrated, yet are major sources of uncertainty, and we present an approach that allows us to account for all of them. Here we present the results for three very massive clusters at $z\sim0.45$, MACSJ1206.2-0847, MACSJ0329.6-0211, and RXJ1347.5-1145. We find that stellar mass and total matter are closely distributed on scales from about 150 kpc to 2.5 Mpc: the stellar-to-total mass ratio is radially constant. We find that the characteristic mass stays constant across clustercentric radii and clusters, but that the less-massive end of the galaxy mass function depends on the environment.

Read this paper on arXiv…

S. Andreon
Fri, 9 Jan 15
15/49

Comments: A&A, in press

A Unifying Theory for Scaling Laws of Human Populations [CL]

http://arxiv.org/abs/1501.00738


The spatial distribution of people exhibits clustering across a wide range of scales, from household (~$10^{-2}$ km) to continental (~$10^4$ km) scales. Empirical data indicates simple power-law scalings for the size distribution of cities (known as Zipf’s law), the geographic distribution of friends, and the population density fluctuations as a function of scale. We derive a simple statistical model that explains all of these scaling laws based on a single unifying principle involving the random spatial growth of clusters of people on all scales. The model makes important new predictions for the spread of diseases and other social phenomena.

Read this paper on arXiv…

H. Lin and A. Loeb
Wed, 7 Jan 15
32/67

Comments: 13 pages, 2 figures, press embargo until published

Bayesian inference of CMB gravitational lensing [IMA]

http://arxiv.org/abs/1412.4079


The Planck satellite, along with ground-based telescopes such as the Atacama Cosmology Telescope (ACT) and the South Pole Telescope (SPT), has mapped the cosmic microwave background (CMB) at such an unprecedented resolution as to allow a detection of the subtle distortions due to the gravitational influence of intervening dark matter. This distortion is called gravitational lensing and has become a powerful probe of cosmology and dark matter. Estimating gravitational lensing is important for two reasons. First, the weak lensing estimates can be used to construct a map of dark matter which would be invisible otherwise. Second, weak lensing estimates can, in principle, un-lense the observed CMB to construct the original CMB radiation fluctuations. Both of these maps, the unlensed CMB radiation field and the dark matter field, are deep probes of cosmology and cosmic structure. Bayesian techniques seem a perfect fit for the statistical analysis of lensing and the CMB. One reason is that the priors for the unlensed CMB and the lensing potential are—very nearly—Gaussian random fields, with the Gaussianity coming from physically predicted quantum randomness in the early universe. However, challenges associated with a full Bayesian analysis have prevented previous attempts at developing a working Bayesian prototype. In this paper we solve many of these obstacles with a re-parameterization of CMB lensing. This allows us to obtain draws from the Bayesian lensing posterior of both the lensing potential and the unlensed CMB which converges remarkably fast.

Read this paper on arXiv…

E. Anderes, B. Wandelt and G. Lavaux
Mon, 15 Dec 14
46/53

Comments: N/A

Bayesian Evidence and Model Selection [CL]

http://arxiv.org/abs/1411.3013


In this paper we review the concept of the Bayesian evidence and its application to model selection. The theory is presented along with a discussion of analytic, approximate and numerical techniques. Applications to several practical examples within the context of signal processing are discussed.
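
A small worked example of the evidence in closed form, comparing a point-null model against one with a Gaussian prior on the mean (chosen for analytic transparency; it is not an example from the paper):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
sigma, tau = 1.0, 2.0                  # known noise scale, prior width
y = rng.normal(0.3, sigma, 50)
n, ybar = len(y), y.mean()

# Evidence of M0 (mu = 0): the likelihood at the fixed parameter value.
log_z0 = stats.norm.logpdf(y, 0.0, sigma).sum()

# Evidence of M1 (mu ~ N(0, tau^2)): marginalizing mu gives
# ybar ~ N(0, tau^2 + sigma^2/n), times the mu-independent likelihood factor.
log_z1 = (stats.norm.logpdf(y, ybar, sigma).sum()
          - stats.norm.logpdf(ybar, ybar, sigma / np.sqrt(n))
          + stats.norm.logpdf(ybar, 0.0, np.sqrt(tau**2 + sigma**2 / n)))

bayes_factor = np.exp(log_z1 - log_z0)   # evidence ratio for model selection
```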

Read this paper on arXiv…

K. Knuth, M. Habeck, N. Malakar, et. al.
Thu, 13 Nov 14
38/49

Comments: 39 pages, 8 figures. Submitted to DSP. Features theory, numerical methods and four applications

Densities mixture unfolding for data obtained from detectors with finite resolution and limited acceptance [CL]

http://arxiv.org/abs/1410.1586


A procedure based on a Mixture Density Model for correcting experimental data for distortions due to finite resolution and limited detector acceptance is presented. For the case in which the solution is known to be non-negative, the approach presented here estimates the true distribution by a weighted sum of probability density functions with positive weights, with the width of the densities acting as a regularisation parameter responsible for the smoothness of the result. To obtain better smoothing in less populated regions, the width parameter scales inversely with the square root of the estimated density. Furthermore, the non-negative garrotte method is used to find the most economical representation of the solution. Cross-validation is employed to determine the optimal values of the resolution and garrotte parameters. The proposed approach is directly applicable to multidimensional problems. Numerical examples in one and two dimensions are presented to illustrate the procedure.
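
A minimal sketch of the mixture construction (an identity response and a fixed component width are assumed for brevity; the paper tunes the width and garrotte parameters by cross-validation):

```python
import numpy as np
from scipy.optimize import nnls
from scipy.stats import norm

rng = np.random.default_rng(6)
edges = np.linspace(0.0, 10.0, 41)               # true-space binning
centers = 0.5 * (edges[:-1] + edges[1:])
means = np.linspace(0.0, 10.0, 15)               # mixture component locations
width = 0.8                                      # regularization parameter

# Basis: each column is one Gaussian component evaluated on the true bins.
basis = np.array([norm.pdf(centers, m, width) for m in means]).T
response = np.eye(len(centers))                  # stand-in for the detector response

true = norm.pdf(centers, 5.0, 1.5)               # toy true distribution
data = response @ true + rng.normal(scale=0.005, size=len(centers))

weights, _ = nnls(response @ basis, data)        # non-negative component weights
unfolded = basis @ weights                       # estimated true distribution
```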

Read this paper on arXiv…

N. Gagunashvili
Wed, 8 Oct 14
68/68

Comments: 25 pages, 14 figures. arXiv admin note: text overlap with arXiv:1209.3766

Proceedings of the First Astrostatistics School: Bayesian Methods in Cosmology [GA]

http://arxiv.org/abs/1409.4294


These are the proceedings of the First Astrostatistics School: Bayesian Methods in Cosmology, held in Bogotá D.C., Colombia, June 9-13, 2014. The school was the first event in Colombia where statisticians and cosmologists from several universities in Bogotá met to discuss the statistical methods applied to cosmology, especially the use of Bayesian statistics in the study of the Cosmic Microwave Background (CMB), Baryonic Acoustic Oscillations (BAO), Large Scale Structure (LSS) and weak lensing.

Read this paper on arXiv…

H. Hortua
Tue, 16 Sep 14
21/63

Comments: Proceedings (in Spanish), supplemental electronic material accessible via links within the PDF, Cuadernos en Estadistica aplicada, Edicion especial, (2014)

The impact of super-survey modes on cosmological constraints from cosmic shear fields [CEA]

http://arxiv.org/abs/1408.1744


Owing to the mass-sheet degeneracy, cosmic shear maps do not directly probe the Fourier modes of the underlying mass distribution on scales comparable to the survey size and larger. To assess the corresponding effect on attainable cosmological parameter constraints, we quantify the information on super-survey modes in a lognormal model and, when these modes are interpreted as nuisance parameters, their degeneracies with cosmological parameters. Our analytical and numerical calculations clarify the central role of super-sample covariance (SSC) in shaping the statistical power of cosmological observables. Reconstructing the background modes from their non-Gaussian statistical dependence on small-scale modes yields the renormalized convergence. This diagonalizes the spectrum covariance matrix, and the information content of the corresponding power spectrum is increased by a factor of two over standard methods. Unfortunately, careful calculation of the Cramer-Rao bound shows that the information recovery can never be made complete: any observable built from shear fields, including optimal sufficient statistics, is subject to severe information loss, typically $80\%$ to $90\%$ below $\ell \sim 3000$ for generic cosmological parameters. The lost information can only be recovered from additional, non-shear based data. Our predictions hold just as well for a tomographic analysis and/or full-sky surveys.

Read this paper on arXiv…

J. Carron and I. Szapudi
Mon, 11 Aug 14
43/55

Comments: 11 pages, 4 figures, submitted

Estimating the distribution of Galaxy Morphologies on a continuous space [GA]

http://arxiv.org/abs/1406.7536


The incredible variety of galaxy shapes cannot be summarized by human-defined discrete classes of shapes without causing a possibly large loss of information. Dictionary learning and sparse coding allow us to reduce the high-dimensional space of shapes into a manageable low-dimensional continuous vector space. Statistical inference can then be done in the reduced space via probability distribution estimation and manifold estimation.
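
A sketch of the reduction under stated assumptions (random arrays stand in for galaxy postage stamps; the component count and sparsity penalty are arbitrary):

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(7)
images = rng.random((500, 32 * 32))        # stand-in for flattened galaxy images

# Learn a dictionary and represent each galaxy by a sparse coefficient vector.
learner = MiniBatchDictionaryLearning(n_components=20, alpha=1.0,
                                      transform_algorithm='lasso_lars',
                                      random_state=0)
codes = learner.fit_transform(images)      # (500, 20) sparse coefficients

# Density and manifold estimation then proceed in this 20-D coefficient space.
```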

Read this paper on arXiv…

G. Vinci, P. Freeman, J. Newman, et. al.
Tue, 1 Jul 14
57/70

Comments: 4 pages, 3 figures, Statistical Challenges in 21st Century Cosmology, Proceedings IAU Symposium No. 306, 2014

Fitting FFT-derived Spectra: Theory, Tool, and Application to Solar Radio Spike Decomposition [SSA]

http://arxiv.org/abs/1406.2280


Spectra derived from fast Fourier transform (FFT) analysis of time-domain data intrinsically contain statistical fluctuations whose distribution depends on the number of accumulated spectra contributing to a measurement. The tail of this distribution, which is essential for separation of the true signal from the statistical fluctuations, deviates noticeably from the normal distribution for a finite number of the accumulations. In this paper we develop a theory to properly account for the statistical fluctuations when fitting a model to a given accumulated spectrum. The method is implemented in software for the purpose of automatically fitting a large body of such FFT-derived spectra. We apply this tool to analyze a portion of a dense cluster of spikes recorded by our FST instrument during a record-breaking event that occurred on 06 Dec 2006. The outcome of this analysis is briefly discussed.
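
The central statistical fact can be checked numerically in a few lines: averaging $N$ FFT power spectra of Gaussian noise yields gamma-distributed bins with shape parameter $N$, whose tail is noticeably heavier than a Gaussian of matching variance for small $N$ (a toy check, not the paper's fitting tool):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
n_acc, nt = 16, 1024                 # accumulated spectra, samples per frame

powers = []
for _ in range(n_acc):
    x = rng.normal(size=nt)
    p = np.abs(np.fft.rfft(x))**2 / nt
    powers.append(p[1:-1])           # drop the DC and Nyquist bins
acc = np.mean(powers, axis=0)        # the accumulated spectrum

# Bins of the accumulated spectrum should follow Gamma(shape = n_acc):
shape, _, _ = stats.gamma.fit(acc, floc=0)
print(shape)                         # close to n_acc = 16
```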

Read this paper on arXiv…

G. Nita, G. Fleishman, D. Gary, et. al.
Tue, 10 Jun 14
32/60

Comments: Accepted to ApJ, 57 pages, 16 figures

The insignificant evolution of the richness-mass relation of galaxy clusters [GA]

http://arxiv.org/abs/1406.1651


We analysed the richness-mass scaling of 23 very massive clusters at $0.15<z<0.55$ with homogeneously measured weak-lensing masses and richnesses within a fixed aperture of $0.5$ Mpc radius. We found that the richness-mass scaling is very tight (the scatter is $<0.09$ dex with 90\% probability) and independent of cluster evolutionary status and morphology. This implies a close association between the infall/evolution of dark matter and galaxies in the central region of clusters. We also found that the evolution of the richness-mass intercept is minor at most and, given the small mass evolution across the studied redshift range, the richness evolution of individual massive clusters also turns out to be very small. Finally, we found that it is of paramount importance to account for the cluster mass function and the selection function. Ignoring them would lead to biases larger than the (otherwise quoted) errors. Our study benefits from: a) weak-lensing masses instead of proxy-based masses, removing the ambiguity between a real trend and one induced by an unaccounted-for evolution of the adopted mass proxy; b) the use of projected masses, which simplifies the statistical analysis by not requiring consideration of the unknown covariance induced by cluster orientation/triaxiality; c) the use of aperture masses, free from the pseudo-evolution of mass definitions anchored to the evolving density of the Universe; d) accounting for the sample selection function and for the Malmquist-like effect induced by the cluster mass function; e) cosmological numerical simulations for the computation of the cluster mass function, its evolution, and the mass growth of each individual cluster.

Read this paper on arXiv…

S. Andreon and P. Congdon
Mon, 9 Jun 14
16/40

Comments: A&A, in press

Astrophysical data analysis with information field theory [IMA]

http://arxiv.org/abs/1405.7701


Non-parametric imaging and data analysis in astrophysics and cosmology can be addressed by information field theory (IFT), a means of Bayesian, data-based inference on spatially distributed signal fields. IFT is a statistical field theory, which permits the construction of optimal signal recovery algorithms. It exploits spatial correlations of the signal fields even for nonlinear and non-Gaussian signal inference problems. The alleviation of a perception threshold for recovering signals of unknown correlation structure by using IFT will be discussed in particular, as well as a novel improvement on instrumental self-calibration schemes. IFT can be applied to many areas. Here, applications in cosmology (cosmic microwave background, large-scale structure) and astrophysics (galactic magnetism, radio interferometry) are presented.

Read this paper on arXiv…

T. Ensslin
Mon, 2 Jun 14
43/56

Comments: 4 pages, 2 figures, accepted chapter to the conference proceedings for MaxEnt 2013, to be published by AIP

Statistical characterization of polychromatic absolute and differential squared visibilities obtained from AMBER/VLTI instrument [IMA]

http://arxiv.org/abs/1405.3429


In optical interferometry, the squared moduli of the visibilities are generally assumed to follow a Gaussian distribution and to be independent of each other. A quantitative analysis of the relevance of such assumptions is important to help improve the exploitation of existing and upcoming multi-wavelength interferometric instruments. We analyze the statistical behaviour of both the absolute and the colour-differential squared visibilities: distribution laws, correlations, and cross-correlations between different baselines. We use observations of stellar calibrators obtained with the AMBER instrument on the VLTI in different instrumental and observing configurations, from which we extract the frame-by-frame transfer function. Statistical hypothesis tests and diagnostics are then systematically applied. For both absolute and differential squared visibilities, and under all instrumental and observing conditions, we find a better fit for the Student distribution than for the Gaussian, log-normal and Cauchy distributions. We find and analyze clear correlation effects caused by atmospheric perturbations. The differential squared visibilities allow a larger fraction of the data to be kept than selected absolute squared visibilities, and thus benefit from reduced temporal dispersion, while their distribution is more clearly characterized. Frame selection based on a fixed SNR threshold may result either in a biased sample of frames or in an overly severe selection.

Read this paper on arXiv…

A. Schutz, M. Vannier, D. Mary, et. al.
Thu, 15 May 14
12/55

Comments: A&A, 13 pages and 9 figures

Functional Regression for Quasar Spectra [CL]

http://arxiv.org/abs/1404.3168


The Lyman-alpha forest is a portion of the observed light spectrum of distant galactic nuclei which allows us to probe remote regions of the Universe that are otherwise inaccessible. The observed Lyman-alpha forest of a quasar light spectrum can be modeled as a noisy realization of a smooth curve that is affected by a `damping effect’ which occurs whenever the light emitted by the quasar travels through regions of the Universe with higher matter concentration. To decode the information conveyed by the Lyman-alpha forest about the matter distribution, we must be able to separate the smooth `continuum’ from the noise and the contribution of the damping effect in the quasar light spectra. To predict the continuum in the Lyman-alpha forest, we use a nonparametric functional regression model in which both the response and the predictor variable (the smooth part of the damping-free portion of the spectrum) are function-valued random variables. We demonstrate that the proposed method accurately predicts the unobservable continuum in the Lyman-alpha forest both on simulated spectra and real spectra. Also, we introduce distribution-free prediction bands for the nonparametric functional regression model that have finite sample guarantees. These prediction bands, together with bootstrap-based confidence bands for the projection of the mean continuum on a fixed number of principal components, allow us to assess the degree of uncertainty in the model predictions.

Read this paper on arXiv…

M. Ciollaro, J. Cisewski, P. Freeman, et. al.
Mon, 14 Apr 14
4/41

Use of spatial cross correlation function to study formation mechanism of massive elliptical galaxies [CL]

http://arxiv.org/abs/1403.1057


The spatial clustering of galaxies has previously been studied through the auto-correlation function. Cross-correlation functions of the same type have been used to investigate the parametric clustering of galaxies, e.g. with respect to their masses and sizes.
Here, the formation and evolution of several components of nearby massive early-type galaxies (hereafter ETGs) are investigated through cross-correlations, in the mass-size parametric plane, with high-redshift ETGs. It is found that the innermost components of nearby ETGs have a significant correlation with ETGs in the highest redshift range, the so-called red nuggets, whereas the intermediate components are highly correlated with ETGs in the redshift range $0.5 < z < 0.75$. The outermost part shows no correlation in any range, suggesting a formation scenario through in situ accretion. The above formation scenario is consistent with the previous results obtained for NGC5128 (Chattopadhyay et al. (2009), Chattopadhyay et al. (2013)) and to some extent for nearby elliptical galaxies (Huang et al. (2013)), after considering a sample of high-redshift ETGs with stellar masses greater than or equal to $10^{8.73}\,M_\odot$. The present work therefore indicates a three-phase formation of massive nearby elliptical galaxies, instead of the two phases discussed in previous works.

Read this paper on arXiv…

T. De, T. Chattopadhyay and A. Chattopadhyay
Thu, 6 Mar 14
20/53

Type Ia Supernova Colors and Ejecta Velocities: Hierarchical Bayesian Regression with Non-Gaussian Distributions [CEA]

http://arxiv.org/abs/1402.7079


We investigate the correlations between the peak intrinsic colors of Type Ia supernovae (SN Ia) and their expansion velocities at maximum light, measured from the Si II 6355 A spectral feature. We construct a new hierarchical Bayesian regression model and Gibbs sampler to estimate the dependence of the intrinsic colors of a SN Ia on its ejecta velocity, while accounting for the random effects of intrinsic scatter, measurement error, and reddening by host galaxy dust. The method is applied to the apparent color data from BVRI light curves and Si II velocity data for 79 nearby SN Ia. Comparison of the apparent color distributions of high velocity (HV) and normal velocity (NV) supernovae reveals significant discrepancies in B-V and B-R, but not other colors. Hence, they are likely due to intrinsic color differences originating in the B-band, rather than dust reddening. The mean intrinsic B-V and B-R color differences between HV and NV groups are 0.06 +/- 0.02 and 0.09 +/- 0.02 mag, respectively. Under a linear model for intrinsic B-V and B-R colors versus velocity, we find significant slopes of -0.021 +/- 0.006 and -0.030 +/- 0.009 mag/(1000 km/s), respectively. Since the ejecta velocity distribution is skewed towards high velocities, these effects imply non-Gaussian intrinsic color population distributions with skewness up to +0.3. Accounting for the intrinsic color-velocity correlation results in corrections in A_V extinction estimates as large as -0.12 mag for HV SN Ia and +0.06 mag for NV events. Deviance information criteria strongly favor simple functions for intrinsic colors versus velocity over no trend, while higher-order polynomials are disfavored. Velocity measurements from SN Ia spectra have potential to diminish systematic errors from the confounding of intrinsic colors and dust reddening affecting supernova distances.

Read this paper on arXiv…

K. Mandel, R. Foley and R. Kirshner
Mon, 3 Mar 14
19/55

The Spatial Sensitivity Function of a Light Sensor [IMA]

http://arxiv.org/abs/1402.2169


The Spatial Sensitivity Function (SSF) is used to quantify a detector’s sensitivity to a spatially-distributed input signal. By weighting the incoming signal with the SSF and integrating, the overall scalar response of the detector can be estimated. This project focuses on estimating the SSF of a light intensity sensor consisting of a photodiode. This light sensor has been used previously in the Knuth Cyberphysics Laboratory on a robotic arm that performs its own experiments to locate a white circle in a dark field (Knuth et al., 2007). To use the light sensor to learn about its surroundings, the robot’s inference software must be able to model and predict the light sensor’s response to a hypothesized stimulus. Previous models of the light sensor treated it as a point sensor and ignored its spatial characteristics. Here we propose a parametric approach where the SSF is described by a mixture of Gaussians (MOG). By performing controlled calibration experiments with known stimulus inputs, we used nested sampling to estimate the SSF of the light sensor using an MOG model with the number of Gaussians ranging from one to five. By comparing the evidence computed for each MOG model, we found that one Gaussian is sufficient to describe the SSF to the accuracy we require. Future work will involve incorporating this more accurate SSF into the Bayesian machine learning software for the robotic system and studying how this detailed information about the properties of the light sensor will improve the robot’s ability to learn.
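
A minimal sketch of the forward model (the calibration experiments and nested-sampling machinery are omitted; the stimulus and parameter values are illustrative):

```python
import numpy as np

def ssf_mog(x, y, params):
    """SSF as a mixture of Gaussians; params: (weight, x0, y0, sigma) per component."""
    out = np.zeros_like(x, dtype=float)
    for w, x0, y0, s in params:
        out += w * np.exp(-((x - x0)**2 + (y - y0)**2) / (2 * s**2))
    return out

def predict_response(stimulus, x, y, params):
    """Weight the incoming intensity map by the SSF and integrate."""
    return np.sum(stimulus * ssf_mog(x, y, params))

x, y = np.meshgrid(np.linspace(-1, 1, 64), np.linspace(-1, 1, 64))
stimulus = (x**2 + y**2 < 0.25).astype(float)    # white circle on a dark field
response = predict_response(stimulus, x, y, [(1.0, 0.0, 0.0, 0.3)])  # one Gaussian
```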

Read this paper on arXiv…

N. Malakar, A. Mesiti and K. Knuth
Tue, 11 Feb 14
54/55

Modeling Light Curves for Improved Classification [CL]

http://arxiv.org/abs/1401.3211


Many synoptic surveys are observing large parts of the sky multiple times. The resulting light curves provide a wonderful window on the dynamic nature of the universe. However, there are many significant challenges in analyzing these light curves. These include heterogeneity of the data, irregularly sampled data, missing data, censored data, known but variable measurement errors, and, most importantly, the need to classify astronomical objects in real time using these imperfect light curves. We describe a modeling-based approach using Gaussian process regression for generating critical measures representing features for the classification of such light curves. We demonstrate that our approach performs better by comparing it with past methods. Finally, we provide future directions for use in sky surveys that are getting even bigger by the day.
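
A minimal sketch of the feature-extraction idea using a generic GP library (the kernel choice and the extracted features are illustrative, not the paper's exact measures):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(8)
t = np.sort(rng.uniform(0, 100, 60))[:, None]       # irregular observation epochs
err = rng.uniform(0.05, 0.2, 60)                    # per-point measurement errors
mag = np.sin(t[:, 0] / 8.0) + rng.normal(0, err)    # toy variable source

# Fit a GP with heteroscedastic noise (known errors enter via alpha).
kernel = ConstantKernel(1.0) * RBF(length_scale=10.0)
gp = GaussianProcessRegressor(kernel=kernel, alpha=err**2,
                              normalize_y=True).fit(t, mag)

# Smooth-model features for downstream classification:
amplitude = np.sqrt(gp.kernel_.k1.constant_value)   # fitted signal amplitude
timescale = gp.kernel_.k2.length_scale              # fitted correlation timescale
```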

Read this paper on arXiv…

Wed, 15 Jan 14
41/67

Nonparametric 3D map of the IGM using the Lyman-alpha forest [IMA]

http://arxiv.org/abs/1401.1867


Visualizing the high-redshift Universe is difficult due to the dearth of available data; however, the Lyman-alpha forest provides a means to map the intergalactic medium at redshifts not accessible to large galaxy surveys. Large-scale structure surveys, such as the Baryon Oscillation Spectroscopic Survey (BOSS), have collected quasar (QSO) spectra that enable the reconstruction of HI density fluctuations. The data fall on a collection of lines defined by the lines-of-sight (LOS) of the QSO, and a major issue with producing a 3D reconstruction is determining how to model the regions between the LOS. We present a method that produces a 3D map of this relatively uncharted portion of the Universe by employing local polynomial smoothing, a nonparametric methodology. The performance of the method is analyzed on simulated data that mimics the varying number of LOS expected in real data, and then is applied to a sample region selected from BOSS. Evaluation of the reconstruction is assessed by considering various features of the predicted 3D maps including visual comparison of slices, PDFs, counts of local minima and maxima, and standardized correlation functions. This 3D reconstruction allows for an initial investigation of the topology of this portion of the Universe using persistent homology.

Read this paper on arXiv…

Fri, 10 Jan 14
21/69

Inverse Bayesian Estimation of Gravitational Mass Density in Galaxies from Missing Kinematic Data [CL]

http://arxiv.org/abs/1401.1052


In this paper we focus on a type of inverse problem in which the data are expressed as an unknown function of the sought and unknown model function (or its discretised representation as a model parameter vector). In particular, we deal with situations in which training data are not available, so that we cannot model the unknown functional relationship between data and the unknown model function (or parameter vector) with a Gaussian Process of appropriate dimensionality. A Bayesian method based on state space modelling is advanced instead. Within this framework, the likelihood is expressed in terms of the probability density function ($pdf$) of the state space variable, and the sought model parameter vector is embedded within the domain of this $pdf$. As the measurable vector lives only inside an identified sub-volume of the system state space, the $pdf$ of the state space variable is projected onto the space of the measurables, and it is in terms of the projected state space density that the likelihood is written; the final form of the likelihood is achieved after convolution with the distribution of measurement errors. Application-motivated vague priors are invoked, and the posterior probability density of the model parameter vectors, given the data, is computed. Inference is performed by taking posterior samples with adaptive MCMC. The method is illustrated on synthetic as well as real galactic data.

Read this paper on arXiv…

Wed, 8 Jan 14
52/62

Hierarchical Reverberation Mapping [IMA]

http://arxiv.org/abs/1312.0919


Reverberation mapping (RM) is an important technique in studies of active galactic nuclei (AGN). The key idea of RM is to measure the time lag $\tau$ between variations in the continuum emission from the accretion disc and the subsequent response of the broad line region (BLR). The measurement of $\tau$ is typically used to estimate the physical size of the BLR and is combined with other measurements to estimate the black hole mass $M_{\rm BH}$. A major difficulty with RM campaigns is the large amount of data needed to measure $\tau$. Recently, Fine et al. (2012) introduced a new approach to RM where the BLR light curve is sparsely sampled, but this is counteracted by observing a large sample of AGN, rather than a single system. The results are combined to infer properties of the sample of AGN. In this letter we implement this method using a hierarchical Bayesian model and contrast this with the results from the previous stacked cross-correlation technique. We find that our inferences are more precise and allow for more straightforward interpretation than the stacked cross-correlation results.

Read this paper on arXiv…

Wed, 4 Dec 13
14/55

The SWELLS Survey. VI. hierarchical inference of the initial mass functions of bulges and discs [IMA]

http://arxiv.org/abs/1310.5177


The long-standing assumption that the stellar initial mass function (IMF) is universal has recently been challenged by a number of observations. Several studies have shown that a “heavy” IMF (e.g., with a Salpeter-like abundance of low mass stars and thus normalisation) is preferred for massive early-type galaxies, while this IMF is inconsistent with the properties of less massive, later-type galaxies. These discoveries motivate the hypothesis that the IMF may vary (possibly very slightly) across galaxies and across components of individual galaxies (e.g. bulges vs discs). In this paper we use a sample of 19 late-type strong gravitational lenses from the SWELLS survey to investigate the IMFs of the bulges and discs in late-type galaxies. We perform a joint analysis of the galaxies’ total masses (constrained by strong gravitational lensing) and stellar masses (constrained by optical and near-infrared colours in the context of a stellar population synthesis [SPS] model, up to an IMF normalisation parameter). Using minimal assumptions apart from the physical constraint that the total stellar mass within any aperture must be less than the total mass within the aperture, we find that the bulges of the galaxies cannot have IMFs heavier (i.e. implying high mass per unit luminosity) than Salpeter, while the disc IMFs are not well constrained by this data set. We also discuss the necessity for hierarchical modelling when combining incomplete information about multiple astronomical objects. This modelling approach allows us to place upper limits on the size of any departures from universality. More data, including spatially resolved kinematics (as in paper V) and stellar population diagnostics over a range of bulge and disc masses, are needed to robustly quantify how the IMF varies within galaxies.

Read this paper on arXiv…

Tue, 22 Oct 13