Science Platforms for Heliophysics Data Analysis [IMA]

http://arxiv.org/abs/2301.00878


We recommend that NASA maintain and fund science platforms that enable interactive and scalable data analysis in order to maximize the scientific return of data collected from space-based instruments.

Read this paper on arXiv…

M. Bobra, W. Barnes, T. Chen, et al.
Wed, 4 Jan 23
17/43

Comments: Heliophysics 2050 White Paper

Figure and Figure Caption Extraction for Mixed Raster and Vector PDFs: Digitization of Astronomical Literature with OCR Features [IMA]

http://arxiv.org/abs/2209.04460


Scientific articles published prior to the “age of digitization” in the late 1990s contain figures which are “trapped” within their scanned pages. While progress has been made in extracting figures and their captions, there is currently no robust method for this process. We present a YOLO-based method for use on scanned pages, post-Optical Character Recognition (OCR), which uses both grayscale and OCR features. When applied to the astrophysics literature holdings of the Astrophysics Data System (ADS), we find F1 scores of 90.9% (92.2%) for figures (figure captions) at an intersection-over-union (IOU) cut-off of 0.9, which is a significant improvement over other state-of-the-art methods.
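
As a reading aid for the F1 and IOU figures quoted above, here is a minimal sketch of how an F1 score at a fixed IOU cut-off is commonly computed. It is not the authors' evaluation code: the box format `(x_min, y_min, x_max, y_max)`, the greedy one-to-one matching, and the function names `iou` and `f1_at_iou` are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap rectangle (zero if the boxes do not overlap).
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def f1_at_iou(predicted, truth, threshold=0.9):
    """F1 score where a prediction counts as correct if its IOU with an
    as-yet-unmatched ground-truth box reaches the threshold (assumed matching rule)."""
    unmatched = list(truth)
    true_pos = 0
    for box in predicted:
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= threshold:
            true_pos += 1
            unmatched.remove(best)  # each ground-truth box is matched at most once
    false_pos = len(predicted) - true_pos
    false_neg = len(truth) - true_pos
    denom = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denom if denom else 0.0
```

A cut-off of 0.9 is strict: a predicted box has to overlap the true figure almost exactly before it counts as a detection.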

Read this paper on arXiv…

J. Naiman, P. Williams and A. Goodman
Tue, 13 Sep 22
13/85

Comments: 16 pages, 3 figures, accepted to TPDL 2022

SciCodes: Astronomy Research Software and Beyond [IMA]

http://arxiv.org/abs/2111.14278


The Astrophysics Source Code Library (ASCL, ascl.net), started in 1999, is a free, open registry of software used in refereed astronomy research. Over the past few years, it has spearheaded an effort to form a consortium of scientific software registries and repositories. In 2019 and 2020, ASCL contacted editors and maintainers of discipline and institutional software registries and repositories in math, biology, neuroscience, geophysics, remote sensing, and other fields to develop a list of best practices for these research software resources. At the completion of that project, performed as a Task Force for a FORCE11 working group, members decided to form SciCodes as an ongoing consortium. This presentation covered the consortium’s work so far, what it is currently working on, what it hopes to achieve for making scientific research software more discoverable across disciplines, and how the consortium can benefit astronomers.

Read this paper on arXiv…

A. Allen
Tue, 30 Nov 21
81/105

Comments: 1 table

Citation method, please? A case study in astrophysics [IMA]

http://arxiv.org/abs/2111.12574


Software citation has accelerated in astrophysics in the past decade, resulting in the field now having multiple trackable ways to cite computational methods. Yet most software authors do not specify how they would like their code to be cited, while others specify a citation method that is not easily tracked (or tracked at all) by most indexers. Two metadata file formats, codemeta.json and CITATION.cff, developed in 2016 and 2017 respectively, are useful for specifying how software should be cited. In 2020, the Astrophysics Source Code Library (ASCL, ascl.net) undertook a year-long effort to generate and send these software metadata files, specific to each computational method, to code authors for editing and inclusion on their code sites. We wanted to answer the question, “Would sending these files to software authors increase adoption of one, the other, or both of these metadata files?” The answer in this case was no. Furthermore, only 41% of the 135 code sites examined for use of these files had citation information in any form available. The lack of such information creates an obstacle for article authors to provide credit to software creators, thus hindering citation of and recognition for computational contributions to research and the scientists who develop and maintain software.
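
For readers who have not seen the two metadata formats named above, the sketch below builds a minimal codemeta.json record and a minimal CITATION.cff file. It is illustrative only and is not what ASCL sent to authors: the helper names, the software name and the author in the usage note are hypothetical, and the CodeMeta context URL and the `cff-version` value are assumptions that should be checked against the current specifications.

```python
import json


def codemeta_record(name, author_family, author_given, version):
    """Return a minimal CodeMeta-style record as a plain dict (assumed field set)."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",  # assumed context URL
        "@type": "SoftwareSourceCode",
        "name": name,
        "version": version,
        "author": [
            {"@type": "Person", "givenName": author_given, "familyName": author_family}
        ],
    }


def citation_cff(name, author_family, author_given, version):
    """Return the text of a minimal CITATION.cff file (YAML written by hand)."""
    lines = [
        "cff-version: 1.2.0",  # assumed schema version
        'message: "If you use this software, please cite it as below."',
        f'title: "{name}"',
        f'version: "{version}"',
        "authors:",
        f'  - family-names: "{author_family}"',
        f'    given-names: "{author_given}"',
    ]
    return "\n".join(lines) + "\n"
```

Calling `json.dumps(codemeta_record("ExampleCode", "Doe", "Jane", "1.0"), indent=2)` gives the text of a codemeta.json file, and `citation_cff("ExampleCode", "Doe", "Jane", "1.0")` gives the matching CITATION.cff text. Both files would sit at the top level of the code repository.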

Read this paper on arXiv…

A. Allen
Thu, 25 Nov 21
58/60

Comments: 11 pages, 6 figures, 1 table

Metrics of research impact in astronomy: Predicting later impact from metrics measured 10-15 years after the PhD [IMA]

http://arxiv.org/abs/2110.14115


This paper calibrates how metrics derivable from the SAO/NASA Astrophysics Data System can be used to estimate the future impact of astronomy research careers and thereby to inform decisions on resource allocation such as job hires and tenure decisions. Three metrics are used: citations of refereed papers, citations of all publications normalized by the numbers of co-authors, and citations of all first-author papers. Each is individually calibrated as an impact predictor in the book Kormendy (2020), “Metrics of Research Impact in Astronomy” (Publ Astron Soc Pac, San Francisco). How this is done is reviewed in the first half of this paper. Then, I show that averaging results from three metrics produces more accurate predictions. Average prediction machines are constructed for different cohorts of 1990-2007 PhDs and used to postdict 2017 impact from metrics measured 10, 12, and 15 years after the PhD. The time span over which prediction is made ranges from 0 years for 2007 PhDs to 17 years for 1990 PhDs using metrics measured 10 years after the PhD. Calibration is based on perceived 2017 impact as voted by 22 experienced astronomers for 510 faculty members at 17 highly-ranked university astronomy departments world-wide. Prediction machinery reproduces voted impact estimates with an RMS uncertainty of 1/8 of the dynamic range for people in the study sample. The aim of this work is to lend some of the rigor that is normally used in scientific research to the difficult and subjective job of judging people’s careers.

Read this paper on arXiv…

J. Kormendy
Thu, 28 Oct 21
36/76

Comments: 11 pages, 8 postscript figures, 5 tables; accepted for publication in Proceedings of the National Academy of Sciences

From Data Processes to Data Products: Knowledge Infrastructures in Astronomy [IMA]

http://arxiv.org/abs/2109.01707


We explore how astronomers take observational data from telescopes, process them into usable scientific data products, curate them for later use, and reuse data for further inquiry. Astronomers have invested heavily in knowledge infrastructures – robust networks of people, artifacts, and institutions that generate, share, and maintain specific knowledge about the human and natural worlds. Drawing upon a decade of interviews and ethnography, this article compares how three astronomy groups capture, process, and archive data, and for whom. The Sloan Digital Sky Survey is a mission with a dedicated telescope and instruments, while the Black Hole Group and Integrative Astronomy Group (both pseudonyms) are university-based, investigator-led collaborations. Findings are organized into four themes: how these projects develop and maintain their workflows; how they capture and archive their data; how they maintain and repair knowledge infrastructures; and how they use and reuse data products over time. We found that astronomers encode their research methods in software known as pipelines. Algorithms help to point telescopes at targets, remove artifacts, calibrate instruments, and accomplish myriad validation tasks. Observations may be reprocessed many times to become new data products that serve new scientific purposes. Knowledge production in the form of scientific publications is the primary goal of these projects. They vary in incentives and resources to sustain access to their data products. We conclude that software pipelines are essential components of astronomical knowledge infrastructures, but are fragile, difficult to maintain and repair, and often invisible. Reusing data products is fundamental to the science of astronomy, whether or not those resources are made publicly available. We make recommendations for sustaining access to data products in scientific fields such as astronomy.

Read this paper on arXiv…

C. Borgman and M. Wofford
Tue, 7 Sep 21
56/89

Comments: 37 pages, including 5 figures

Furthering a Comprehensive SETI Bibliography [CL]

http://arxiv.org/abs/2107.02887


In 2019, Reyes & Wright used the NASA Astrophysics Data System (ADS) to initiate a comprehensive, publicly accessible bibliography for SETI. Since then, updates to the library have been incomplete, partly due to the difficulty in managing the large number of false positive publications generated by searching ADS using simple search terms. In preparation for a recent update, the scope of the library was revised and reexamined. The scope now includes social sciences and commensal SETI. Results were curated based on five SETI keyword searches: “SETI”, “technosignature”, “Fermi Paradox”, “Drake Equation”, and “extraterrestrial intelligence”. These keywords returned 553 publications that merited inclusion in the bibliography and were not previously present. A curated library of false positive results is now concurrently maintained to facilitate their exclusion from future searches. A search query and workflow were developed to capture nearly all SETI-related papers indexed by ADS while minimizing false positives. These tools will enable efficient, consistent updates of the SETI library by future curators, and could be adopted for other bibliography projects as well.

Read this paper on arXiv…

J. LaFond, J. Wright and M. Huston
Thu, 8 Jul 21
49/52

Comments: 7 pages, 3 figures, accepted to JBIS

The uniqueness of observatory publications [IMA]

http://arxiv.org/abs/2104.12838


Observatory publications comprise the work of local astronomers from observatories around the world and are traditionally exchanged between observatories through libraries. However, large collections of observatory publications seem to be rare, or at least rarely described digitally or accessible on the Internet. Notable exceptions are the Woodman Astronomical Library at Wisconsin-Madison and the Dudley Observatory in Loudonville, New York, both in the US. Because material has been received irregularly, the collections are often incomplete with respect to both the observatories included and the volumes held. In order to assess the unique properties of the collections, we summarize and compare the observatories present in our own collection and in those of the Woodman Library and the Dudley Observatory.

Read this paper on arXiv…

O. Ellegaard and S. Dorch
Wed, 28 Apr 21
14/60

Comments: 4 pages, 2 figures, 2 tables, to appear in proceedings of IAU Symposium 367, Education and Heritage in the Era of Big Data in Astronomy

Making organizational software easier to find in ASCL and ADS [IMA]

http://arxiv.org/abs/2012.12526


Software is the most used instrument in astronomy, and organizations such as NASA and the Heidelberg Institute for Theoretical Studies (HITS) fund, develop, and release research software. NASA, for example, has created sites such as code.nasa.gov to share its software with the world, but how easy is it to see what NASA has? Until recently, searching NASA’s Astrophysics Data System (ADS) for NASA astronomy research software has not been fruitful. Through its ADAP program, NASA funded the Astrophysics Source Code Library to improve the discoverability of these codes. Adding institutional tags to ASCL entries makes it easy to find this software not only in the ASCL but also in ADS and other services that index the ASCL. This presentation covered the changes the ASCL made as a result of this funding and how you can use the results of this work to better find organizational software in ASCL and ADS.

Read this paper on arXiv…

A. Allen, S. Mavuram, R. Nemiroff, et al.
Thu, 24 Dec 20
1/73

Comments: 4 pages; to be published in the proceedings of the ADASS XXX meeting

Second Order Operators in the NASA Astrophysics Data System [CL]

http://arxiv.org/abs/2010.01418


Second Order Operators (SOOs) are database functions which form secondary queries based on attributes of the objects returned in an initial query; they can provide powerful methods to investigate complex, multipartite information graphs. The NASA Astrophysics Data System (ADS) has implemented four SOOs (reviews, useful, trending, and similar), which use the citations, references, downloads, and abstract text.
This tutorial describes these operators in detail, both alone and in conjunction with other functions. It is intended for scientists and others who wish to make fuller use of the ADS database. Basic knowledge of the ADS is assumed.
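
To make the idea concrete, the sketch below wraps an ordinary ADS query in one of the four operators and builds a search URL for it. This is an illustration and is not taken from the tutorial: the helper names are hypothetical, and the endpoint address and the `q`, `fl` and `rows` parameters reflect my understanding of the public ADS API, so they should be checked against the ADS documentation. No request is sent, and a real call would also need an API token.

```python
from urllib.parse import urlencode

# The four second order operators named in the abstract.
SECOND_ORDER_OPERATORS = ("reviews", "useful", "trending", "similar")


def second_order_query(operator, inner_query):
    """Wrap a first-order query string in a second order operator."""
    if operator not in SECOND_ORDER_OPERATORS:
        raise ValueError(f"unknown second order operator: {operator!r}")
    return f"{operator}({inner_query})"


def ads_search_url(query, fields=("bibcode", "title"), rows=10):
    """Build a search URL for the query (assumed endpoint and parameter names)."""
    params = {"q": query, "fl": ",".join(fields), "rows": rows}
    return "https://api.adsabs.harvard.edu/v1/search/query?" + urlencode(params)
```

For example, `second_order_query("trending", "exoplanets")` yields the string `trending(exoplanets)`. The inner query selects a set of papers, and the operator then forms a secondary query from attributes of that set.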

Read this paper on arXiv…

M. Kurtz and R. Chyla
Tue, 6 Oct 20
19/85

Comments: ADS Bibcode: 2020BAAS...52b0207K, author’s version

The decline of astronomical research in Venezuela [CL]

http://arxiv.org/abs/2008.04595


During the last 15 years the number of astronomy-related papers published by scientists in Venezuela has been continuously decreasing, mainly due to emigration. If rapid corrective actions are not implemented, Venezuelan astronomy could disappear.

Read this paper on arXiv…

N. Sanchez
Wed, 12 Aug 20
-864/63

Comments: 7 pages including 1 table and 2 figures. Comment published in Nature Astronomy. This is the author version; the published version is available via the SharedIt link

Towards a more realistic citation model: The key role of research team sizes [CL]

http://arxiv.org/abs/2008.04711


We propose a new citation model which builds on the existing models that explicitly or implicitly include “direct” and “indirect” (learning about a cited paper’s existence from references in another paper) citation mechanisms. Our model departs from the usual, unrealistic assumption of uniform probability of direct citation, in which initial differences in citation arise purely randomly. Instead, we demonstrate that a two-mechanism model in which the probability of direct citation is proportional to the number of authors on a paper (team size) is able to reproduce the empirical citation distributions of articles published in the field of astronomy remarkably well, and at different points in time. Our interpretation of the model is that the intrinsic citation capacity, and hence the initial visibility of a paper, will be enhanced when more people are intimately familiar with some work, favoring papers from larger teams. While the intrinsic citation capacity cannot depend only on the team size, our model demonstrates that it must be to some degree correlated with it, and distributed in a similar way, i.e., having a power-law tail. Consequently, our team-size model qualitatively explains the existence of a correlation between the number of citations and the number of authors on a paper.

Read this paper on arXiv…

S. Milojević
Wed, 12 Aug 20
-837/63

Comments: Published in the journal Entropy. Open access article

A Catalogue of Locus Algorithm Pointings for Optimal Differential Photometry for 23,779 Quasars [GA]

http://arxiv.org/abs/2003.04590


This paper presents a catalogue of optimised pointings for differential photometry of 23,779 quasars extracted from the Sloan Digital Sky Survey (SDSS) Catalogue and a score for each indicating the quality of the Field of View (FoV) associated with that pointing. Observation of millimagnitude variability on a timescale of minutes typically requires differential observations with reference to an ensemble of reference stars. For optimal performance, these reference stars should have similar colour and magnitude to the target quasar. In addition, the greatest quantity and quality of suitable reference stars may be found by using a telescope pointing which offsets the target object from the centre of the field of view. By comparing each quasar with the stars which appear close to it on the sky in the SDSS Catalogue, an optimum pointing can be calculated, and a figure of merit, referred to as the “score”, calculated for that pointing. Highly flexible software has been developed to enable this process to be automated and implemented in a distributed computing paradigm, which enables the creation of catalogues of pointings given a set of input targets. Applying this technique to a sample of 40,000 targets from the 4th SDSS quasar catalogue resulted in the production of pointings and scores for 23,779 quasars. This catalogue is a useful resource for observers planning differential photometry studies and surveys of quasars to select those which have many suitable celestial neighbours for differential photometry.

Read this paper on arXiv…

O. Creaner, K. Nolan, D. Grennan, et al.
Wed, 11 Mar 20
9/65

Comments: 7 pages, 5 figures

Economic Power, Population, and the Size of Astronomical Community [CL]

http://arxiv.org/abs/1908.02584


The number of astronomers a country has registered with the IAU is known to correlate with its GDP. However, the robustness of this relationship can be doubted, because the fraction of astronomers joining the IAU differs from country to country. Here we revisit this correlation using data updated as of 2017, and we find a similar correlation when we use the total count of astronomers and astrophysicists with PhD degrees working in each country instead of the number of IAU members. We confirm the existence of two subgroups in the correlation. One group consists of advanced European countries with a long history of modern astronomy, while the other consists of countries that have experienced recent rapid economic development. To look for causation in the correlation, we obtain the long-term variations of the number of astronomers, population, and GDP for a number of countries, and find that the number of astronomers per citizen in recently developing countries has increased more rapidly with GDP per capita than in fully developed countries. We collect demographic data on the Korean astronomical community. From these findings we estimate the proper size of the Korean astronomical community by considering the society’s economic power and population. The current number of PhD astronomers working in Korea is approximately 310, but it should be 550, which is large enough to be comparable to and competitive with the Spanish, Canadian, and Japanese astronomical communities. We discuss how to overcome the vulnerability of the Korean astronomical community, based on statistics of the national R&D expenditure structure compared with that of other major advanced countries.

Read this paper on arXiv…

S. Ahn
Thu, 8 Aug 19
61/78

Comments: 18 pages, 6 figures, 4 tables, accepted for publication in Journal of Korean Astronomical Society 2019 August Issue

Towards a Comprehensive Bibliography for SETI [CL]

http://arxiv.org/abs/1908.02587


In this work, we motivate, describe, and announce a living bibliography for academic papers and other works published in the Search for Extraterrestrial Intelligence (SETI). The bibliography makes use of bibliographic groups (bibgroups) in the NASA Astrophysics Data System (ADS), allowing it to be accessed and searched by any interested party, and is composed only of works which have a presence on the ADS. We establish criteria that describe the scope of our bibliography, which we define as any academic work which broadly: 1) advances knowledge within SETI, 2) deals with topics that are fundamentally related to or about SETI, or 3) is useful for the better understanding of SETI, and which has a presence on ADS. We discuss the future work needed to continue the development of the bibliography. The bibliography can be found by using the bibgroup field (bibgroup: SETI) in the ADS search engine.

Read this paper on arXiv…

A. Reyes and J. Wright
Thu, 8 Aug 19
75/78

Comments: 7 pages, accepted to the Journal of the British Interplanetary Society. The SETI bibliography can be accessed on ADS

Robust Archives Maximize Scientific Accessibility [IMA]

http://arxiv.org/abs/1907.06234


We present a bibliographic analysis of Chandra, Hubble, and Spitzer publications. We find (a) archival data are used in >60% of the publication output and (b) archives for these missions enable a much broader set of institutions and countries to scientifically use data from these missions. Specifically, we find that authors from institutions that have published few papers from a given mission publish 2/3 archival publications, while those with many publications typically have 1/3 archival publications. We also show that countries with lower GDP per capita overwhelmingly produce archival publications, while countries with higher GDP per capita produce guest observer and archival publications in equal amounts. We argue that robust archives are thus critical not only for the scientific productivity of mission data, but also for its scientific accessibility. We argue that the astronomical community should support archives to maximize the overall scientific and societal impact of astronomy, and that archives represent an excellent investment in astronomy’s future.

Read this paper on arXiv…

J. Peek, V. Desai, R. White, et al.
Tue, 16 Jul 19
83/89

Comments: White Paper submitted to the NAS call for Astro2020 Decadal Survey APC papers

What Does a Successful Postdoctoral Fellowship Publication Record Look Like? [IMA]

http://arxiv.org/abs/1810.09505


Obtaining a prize postdoctoral fellowship in astronomy and astrophysics involves a number of factors, many of which cannot be quantified. One criterion that can be measured is the publication record of an applicant. The publication records of past fellowship recipients may, therefore, provide some quantitative guidance for future prospective applicants. We investigated the publication patterns of recipients of the NASA prize postdoctoral fellowships in the Hubble, Einstein, and Sagan programs from 2014 through 2017, using the NASA ADS reference system. We tabulated their publications at the point where fellowship applications were submitted, and we find that the 133 fellowship recipients in that time frame had a median of 6 +/- 2 first-author publications, and 14 +/- 6 co-authored publications. The full range of first author papers is 1 to 15, and for all papers ranges from 2 to 76, indicating very diverse publication patterns. Thus, while fellowship recipients generally have strong publication records, the distribution of both first-author and co-authored papers is quite broad; there is no apparent threshold of publications necessary to obtain these fellowships. We also examined the post-PhD publication rates for each of the three fellowship programs, between male and female recipients, across the four years of the analysis and find no consistent trends. We hope that these findings will prove a useful reference to future junior scientists.

Read this paper on arXiv…

J. Pepper, O. Krupinska, K. Stassun, et al.
Wed, 24 Oct 18
9/75

Comments: Accepted to PASP, 11 pages, 6 figures

ESO telbib: learning from experience, preparing for the future [IMA]

http://arxiv.org/abs/1806.08746


The ESO telescope bibliography (telbib) dates back to 1996. During the 20+ years of its existence, it has undergone many changes. Most importantly, the telbib system has been enhanced to cater to new use cases and demands from its stakeholders. Based on achievements of the past, we will show how a system like telbib can not only stay relevant through the decades, but gain importance, and provide an essential tool for the observatory’s management and the wider user community alike.

Read this paper on arXiv…

U. Grothkopf, S. Meakins and D. Bordelon
Mon, 25 Jun 18
53/54

Comments: 6 pages, 2 figures. To be published in SPIE conference proceedings 10704 (10704-29), Observatory Operations: Strategies, Processes, and Systems VII (June 2018)

Italian center for Astronomical Archives publishing solution: modular and distributed [IMA]

http://arxiv.org/abs/1805.08040


The Italian center for Astronomical Archives tries to provide astronomical data resources as interoperable services based on IVOA standards. Its VO expertise and knowledge come from active participation within IVOA and VO at European and international level, with a twofold goal: to learn from the collaboration and to provide input to the community. The first attempt to build an easy-to-configure and easy-to-maintain resource publisher conformant to VO standards proved too optimistic. For this reason it was necessary to rethink the architecture as a modular system built around the messaging concept, where each modular component speaks to the other interested parties through a system of broker-managed queues. The first implemented protocol, the Simple Cone Search, shows the messaging task architecture connecting the parametric HTTP interface to the database backend access module and the logging module, and allows multiple cone search resources to be managed together through a configuration manager module. Although relatively young, it has already shown the flexibility required by the overall system when the database backend changed from MySQL to PostgreSQL+PgSphere. Another implementation test has been made to leverage task distribution over multiple servers to serve simultaneously: FITS cubes direct linking, cubes cutout and cubes positional merging. Currently the implementation of the SIA-2.0 standard protocol is ongoing, while for TAP we will be adapting the TAPlib library. Alongside these tools, a first administration tool (TASMAN) has been developed to ease the build-up and maintenance of TAP_SCHEMA-ta, including ObsCore maintenance capability. Future work will be devoted to widening the range of VO protocols covered by the set of available modules, improving the configuration management, and developing specific-purpose modules common to all the service components.
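
The “parametric HTTP interface” of a Simple Cone Search service amounts to a GET request carrying a sky position and a search radius. The sketch below builds such a request URL. It is not code from the project described above: the function name and the example base URL are hypothetical, and the parameter names `RA`, `DEC` and `SR` (decimal degrees) follow the IVOA Simple Cone Search standard as I understand it.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a cone search request URL for a service at base_url (hypothetical service)."""
    # Basic range checks on the sky position and radius, all in decimal degrees.
    if not 0.0 <= ra_deg <= 360.0:
        raise ValueError("RA must lie between 0 and 360 degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("DEC must lie between -90 and 90 degrees")
    if radius_deg < 0.0:
        raise ValueError("search radius must be non-negative")
    separator = "&" if "?" in base_url else "?"
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    return base_url + separator + query
```

For instance, `cone_search_url("http://example.org/scs", 10.5, -20.0, 0.1)` gives `http://example.org/scs?RA=10.5&DEC=-20.0&SR=0.1`. In the architecture described above, a request of this kind would be passed on to the database backend module through the broker-managed queues.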

Read this paper on arXiv…

M. Molinaro, N. Calabria, R. Butora, et al.
Tue, 22 May 18
47/69

Comments: SPIE Astronomical Telescopes + Instrumentation 2018, Software and Cyberinfrastructure for Astronomy V, pre-publishing draft proceeding (reduced abstract)

Peer-review under review – A statistical study on proposal ranking at ESO. Part I: the pre-meeting phase [CL]

http://arxiv.org/abs/1805.06981


Peer review is the most common mechanism in place for assessing requests for resources in a large variety of scientific disciplines. One of the strongest criticisms of this paradigm is the limited reproducibility of the process, especially at largely oversubscribed facilities. In this and in a subsequent paper we address this specific aspect in a quantitative way, through a statistical study on proposal ranking at the European Southern Observatory. For this purpose we analysed a sample of about 15000 proposals, submitted by more than 3000 Principal Investigators over 8 years. The proposals were reviewed by more than 500 referees, who assigned over 140000 grades in about 200 panel sessions. After providing a detailed analysis of the statistical properties of the sample, the paper presents a heuristic model based on these findings, which is then used to provide quantitative estimates of the reproducibility of the pre-meeting process. On average, about one third of the proposals ranked in the top quartile by one referee are ranked in the same quartile by any other referee of the panel. A similar value is observed for the bottom quartile. In the central quartiles, the agreement fractions are very marginally above the value expected for a fully aleatory process (25%). The agreement fraction between two panels composed of 6 referees is 55+/-5% (50% confidence level) for the top and bottom quartiles. The corresponding fraction for the central quartiles is 33+/-5%. The model predictions are confirmed by the results obtained from bootstrapping the data for sub-panels composed of 3 referees, and are fully consistent with the NIPS experiment. The post-meeting phase will be presented and discussed in a forthcoming paper.
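
The 25% figure quoted for a fully random process can be checked with a short simulation. The sketch below is not the paper's model: it assumes that higher scores are better, that the number of proposals is a multiple of four, and that both referees grade completely at random. The function names are made up for illustration.

```python
import random


def quartile_sets(scores):
    """Split proposal indices into four equal groups, best-scored first
    (assumes higher score = better and len(scores) divisible by 4)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    size = len(scores) // 4
    return [set(order[k * size:(k + 1) * size]) for k in range(4)]


def agreement_fraction(scores_a, scores_b, quartile=0):
    """Fraction of referee A's proposals in a quartile that referee B places in the same quartile."""
    in_a = quartile_sets(scores_a)[quartile]
    in_b = quartile_sets(scores_b)[quartile]
    return len(in_a & in_b) / len(in_a)


def random_baseline(n_proposals=40, n_trials=2000, seed=0):
    """Mean top-quartile agreement between two referees who grade at random."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        a = [rng.random() for _ in range(n_proposals)]
        b = [rng.random() for _ in range(n_proposals)]
        total += agreement_fraction(a, b)
    return total / n_trials
```

With random grades the mean agreement settles close to 0.25, which is the baseline against which the observed fraction of about one third for the top and bottom quartiles should be read.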

Read this paper on arXiv…

F. Patat
Mon, 21 May 18
58/71

Comments: 22 pages, 18 figures. Accepted for publication in the Publications of the Astronomical Society of the Pacific

Evaluation of research publications and publication channels in astronomy and astrophysics [CL]

http://arxiv.org/abs/1804.08435


The astronomy community usually turns to the Astrophysics Data System for bibliometrics. When the context is cross-disciplinary, commercial products like Web of Science and Scopus are used along with related analytics tools instead. The results are often tainted by inherent problems in the chosen classification system. A review of the most common challenges and pitfalls is given.
Commercial altmetrics products could be added to the evaluation toolbox in the near future despite the fact that they are best suited for promotion instead of evaluation.
Norway, Denmark, and Finland have created journal and publisher ranking systems that are used in national funding models. Differences in how astronomy journals are weighed in these systems might be related to the volume of papers published on a national level.

Read this paper on arXiv…

E. Isaksson and H. Vesterinen
Tue, 24 Apr 18
35/87

Comments: 9 pages, 9 figures. Library and Information Services in Astronomy (LISA) 8 conference, Strasbourg June 6-9, 2017. To appear in EPJ Web of Conferences

Merging the Astrophysics and Planetary Science Information Systems [IMA]

http://arxiv.org/abs/1803.03598


Conceptually, exoplanet research has one foot in the discipline of Astrophysics and the other foot in Planetary Science. Research strategies for exoplanets will require efficient access to data and information from both realms. Astrophysics has a sophisticated, well-integrated, distributed information system with archives and data centers which are interlinked with the technical literature via the Astrophysics Data System (ADS). The information system for Planetary Science does not have a central component linking the literature with the observational and theoretical data. Here we propose that the Committee on an Exoplanet Science Strategy recommend that this linkage be built, with the ADS playing the role in Planetary Science which it already plays in Astrophysics. This will require additional resources for the ADS, and the Planetary Data System (PDS), as well as other international collaborators.

Read this paper on arXiv…

M. Kurtz, A. Accomazzi and E. Henneken
Mon, 12 Mar 18
24/45

Comments: Whitepaper submitted to the Committee on an Exoplanet Science Strategy

Astrolabe: Curating, Linking and Computing Astronomy's Dark Data [IMA]

http://arxiv.org/abs/1802.03629


Where appropriate repositories are not available to support all relevant astronomical data products, data can fall into darkness: unseen and unavailable for future reference and re-use. Some data in this category are legacy or old data, but newer datasets are also often uncurated and could remain “dark”. This paper provides a description of the design motivation and development of Astrolabe, a cyberinfrastructure project that addresses a set of community recommendations for locating and ensuring the long-term curation of dark or otherwise at-risk data and integrated computing. This paper also describes the outcomes of the series of community workshops that informed creation of Astrolabe. According to participants in these workshops, much astronomical dark data currently exist that are not curated elsewhere, as well as software that can only be executed by a few individuals and therefore becomes unusable because of changes in computing platforms. Additional astronomical research questions and challenges would be better addressed with integrated data and computational resources that fall outside the scope of existing observatory and space mission projects. As a solution, the design of the Astrolabe system is aimed at developing new resources for management of astronomical data. The project is based in CyVerse cyberinfrastructure technology and is a collaboration between the University of Arizona and the American Astronomical Society. Overall the project aims to support open access to research data by leveraging existing cyberinfrastructure resources and promoting scientific discovery by making potentially-useful data in a computable format broadly available to the astronomical community.

Read this paper on arXiv…

P. Heidorn, G. Stahlman and J. Steffen
Tue, 13 Feb 18
2/76

Comments: Submitted to the Astrophysical Journal Supplement Series, 22 pages, 2 figures

The ESO Survey of Non-Publishing Programmes [IMA]

http://arxiv.org/abs/1802.03272


One of the classic ways to measure the success of a scientific facility is the publication return, which is defined as the refereed papers produced per unit of allocated resources (for example, telescope time or proposals). The recent studies by Sterzik et al. (2015, 2016) have shown that 30-50% of the programmes allocated time at ESO do not produce a refereed publication. While this may be inherent to the scientific process, this finding prompted further investigation. For this purpose, ESO conducted a Survey of Non-Publishing Programmes (SNPP) within the activities of the Time Allocation Working Group, similar to the monitoring campaign that was recently implemented at ALMA (Stoehr et al. 2016). The SNPP targeted 1278 programmes scheduled between ESO Periods 78 and 90 (October 2006 to March 2013) that had not published a refereed paper as of April 2016. The poll was launched on 6 May 2016, remained open for four weeks, and returned 965 valid responses. This article summarises and discusses the results of this survey, the first of its kind at ESO.

Read this paper on arXiv…

F. Patat, H. Boffin, D. Bordelon, et. al.
Mon, 12 Feb 18
4/53

Comments: 10 pages, 4 figures. Appeared in The Messenger, 170, 51

Astrophysicists and physicists as creators of ArXiv-based commenting resources for their research communities. An initial survey [CL]

http://arxiv.org/abs/1802.02149


This paper conveys the outcomes of what appears to be the first, though initial, overview of commenting platforms and related 2.0 resources born within and for the astrophysical community (from 2004 to 2016). Experiences were added, mainly in the physics domain, for a total of 22 major items, including four epijournals, and four supplementary resources, thus casting some light onto an unexpected richness and consonance of endeavours. These experiences rest almost entirely on the contents of the database ArXiv, which adds to its merits that of potentially setting the grounds for web 2.0 resources, and research behaviours, to be explored.
Most of the experiences retrieved are UK and US based, but the resulting picture is international, as various European countries, China and Australia have been actively involved.
Final remarks about creation patterns and outcome of these resources are outlined. The results integrate the previous studies according to which the web 2.0 is presently of limited use for communication in astrophysics and vouch for a role of researchers in the shaping of their own professional communication tools that is greater than expected. Collaterally, some aspects of ArXiv's recent pathway towards partial inclusion of web 2.0 features are touched upon. Further investigation is hoped for.

Read this paper on arXiv…

M. Marra
Thu, 8 Feb 18
31/43

Comments: Journal article 16 pages

Best Practices for a Future Open Code Policy: Experiences and Vision of the Astrophysics Source Code Library [IMA]

http://arxiv.org/abs/1802.00552


We are members of the Astrophysics Source Code Library’s Advisory Committee and its editor-in-chief. The Astrophysics Source Code Library (ASCL, ascl.net) is a successful initiative that advocates for open research software and provides an infrastructure for registering, discovering, sharing, and citing this software. Started in 1999, the ASCL has been expanding in recent years, with an average of over 200 codes added each year, and now houses over 1,600 code entries.

Read this paper on arXiv…

L. Shamir, B. Berriman, P. Teuben, et. al.
Mon, 5 Feb 18
3/52

Comments: White paper submitted to the National Academies of Sciences, Engineering, and Medicine’s Best Practices for a Future Open Code Policy for NASA Space Science Project Committee

On the Availability of ESO Data Papers on arXiv/astro-ph [IMA]

http://arxiv.org/abs/1801.03366


Using the ESO Telescope Bibliography database telbib, we have investigated the percentage of ESO data papers that were submitted to the arXiv/astro-ph e-print server and that are therefore free to read. Our study revealed an availability of up to 96% of telbib papers on arXiv over the years 2010 to 2017. We also compared the citation counts of arXiv vs. non-arXiv papers and found that on average, papers submitted to arXiv are cited 2.8 times more often than those not on arXiv. While simulations suggest that these findings are statistically significant, we cannot yet draw firm conclusions as to the main cause of these differences.

Read this paper on arXiv…

U. Grothkopf, D. Bordelon, S. Meakins, et. al.
Thu, 11 Jan 18
38/56

Comments: 4 pages, 3 figures, 2 tables

The Unified Astronomy Thesaurus: Semantic Metadata for Astronomy and Astrophysics [IMA]

http://arxiv.org/abs/1801.01021


Several different controlled vocabularies have been developed and used by the astronomical community, each designed to serve a specific need and a specific group. The Unified Astronomy Thesaurus (UAT) attempts to provide a highly structured controlled vocabulary that will be relevant and useful across the entire discipline, regardless of content or platform. As two major use cases for the UAT include classifying articles and data, we examine the UAT in comparison with the Astronomical Subject Keywords used by major publications and the JWST Science Keywords used by STScI’s Astronomer’s Proposal Tool.

Read this paper on arXiv…

K. Frey and A. Accomazzi
Thu, 4 Jan 18
5/44

Comments: Submitted to the Astrophysical Journal Supplements, 10 pages, 3 tables

A Model for Data Citation in Astronomical Research using Digital Object Identifiers (DOIs) [CL]

http://arxiv.org/abs/1801.00004


Standardizing and incentivizing the use of digital object identifiers (DOIs) to aggregate and identify both data analyzed and data generated by a research project will advance the field of astronomy to match best practices in other research fields like geosciences and medicine. Increase in the use of DOIs will prepare the discipline for changing expectations among funding agencies and publishers, who increasingly expect accurate and thorough data citation to accompany scientific outputs. The use of DOIs ensures a robust, sustainable, and interoperable approach to data citation in which due credit is given to researchers and institutions who produce and maintain the primary data. We describe in this work the advantages of DOIs for data citation and best practices for integrating a DOI service in an astronomical archive. We report on a pilot project carried out in collaboration with AAS Journals. During the course of the 1.5 year pilot, over 75% of submitting authors opted to use the integrated DOI service to clearly identify data analyzed during their research project when prompted at the time of paper submission.

Read this paper on arXiv…

J. Novacescu, J. Peek, S. Weissman, et. al.
Wed, 3 Jan 18
54/59

Comments: 13 pages, 3 figures. Accepted on Dec 19, 2017 for publication in Astrophysical Journal Supplement Series

New ADS Functionality for the Curator [IMA]

http://arxiv.org/abs/1710.08505


In this paper we provide an update concerning the operations of the NASA Astrophysics Data System (ADS), its services and user interface, and the content currently indexed in its database. As the primary information system used by researchers in Astronomy, the ADS aims to provide a comprehensive index of all scholarly resources appearing in the literature. With the current effort in our community to support data and software citations, we discuss what steps the ADS is taking to provide the needed infrastructure in collaboration with publishers and data providers. A new API provides access to the ADS search interface, metrics, and libraries allowing users to programmatically automate discovery and curation tasks. The new ADS interface supports a greater integration of content and services with a variety of partners, including ORCID claiming, indexing of SIMBAD objects, and article graphics from a variety of publishers. Finally, we highlight how librarians can facilitate the ingest of gray literature that they curate into our system.

Read this paper on arXiv…

A. Accomazzi, M. Kurtz, E. Henneken, et. al.
Wed, 25 Oct 17
66/68

Comments: Submitted to the Proceedings of Library and Information Services in Astronomy VIII, Strasbourg, France

The arXiv of the future will not look like the arXiv [CL]

http://arxiv.org/abs/1709.07020


The arXiv is the most popular preprint repository in the world. Since its inception in 1991, the arXiv has allowed researchers to freely share publication-ready articles prior to formal peer review. The growth and the popularity of the arXiv emerged as a result of new technologies that made document creation and dissemination easy, and cultural practices where collaboration and data sharing were dominant. The arXiv represents a unique place in the history of research communication and the Web itself, however it has arguably changed very little since its creation. Here we look at the strengths and weaknesses of arXiv in an effort to identify what possible improvements can be made based on new technologies not previously available. Based on this, we argue that a modern arXiv might in fact not look at all like the arXiv of today.

Read this paper on arXiv…

A. Pepe, M. Cantiello and J. Nicholson
Fri, 22 Sep 17
71/75

Comments: The authors of this document welcome public comments and ideas from its readers, at the online version of this article (this https URL)

Comparing People with Bibliometrics [CL]

http://arxiv.org/abs/1707.09955


Bibliometric indicators, citation counts and/or download counts are increasingly being used to inform personnel decisions such as hiring or promotions. These statistics are very often misused. Here we provide a guide to the factors which should be considered when using these so-called quantitative measures to evaluate people. Rules of thumb are given for when to begin using bibliometric measures when comparing otherwise similar candidates.

Read this paper on arXiv…

M. Kurtz
Tue, 1 Aug 17
34/55

Comments: to appear in Proceedings of Library and Information Science in Astronomy VIII (LISA-8)

Usage Bibliometrics as a Tool to Measure Research Activity [CL]

http://arxiv.org/abs/1706.02153


Measures for research activity and impact have become an integral ingredient in the assessment of a wide range of entities (individual researchers, organizations, instruments, regions, disciplines). Traditional bibliometric indicators, like publication and citation based indicators, provide an essential part of this picture, but cannot describe the complete picture. Since reading scholarly publications is an essential part of the research life cycle, it is only natural to introduce measures for this activity in attempts to quantify the efficiency, productivity and impact of an entity. Citations and reads are significantly different signals, so taken together, they provide a more complete picture of research activity. Most scholarly publications are now accessed online, making the study of reads and their patterns possible. Click-stream logs allow us to follow information access by the entire research community, real-time. Publication and citation datasets just reflect activity by authors. In addition, download statistics will help us identify publications with significant impact, but which do not attract many citations. Click-stream signals are arguably more complex than, say, citation signals. For one, they are a superposition of different classes of readers. Systematic downloads by crawlers also contaminate the signal, as does browsing behavior. We discuss the complexities associated with clickstream data and how, with proper filtering, statistically significant relations and conclusions can be inferred from download statistics. We describe how download statistics can be used to describe research activity at different levels of aggregation, ranging from organizations to countries. These statistics show a correlation with socio-economic indicators. A comparison will be made with traditional bibliometric indicators. We will argue that astronomy is representative of more general trends.

Read this paper on arXiv…

E. Henneken and M. Kurtz
Thu, 8 Jun 17
7/69

Comments: 25 pages, 11 figures, accepted for publication in Handbook of Quantitative Science and Technology Research, Springer

Knowledge discovery through text-based similarity searches for astronomy literature [CL]

http://arxiv.org/abs/1705.05840


The increase in the number of researchers coupled with the ease of publishing and distribution of scientific papers (due to technological advancements) has resulted in a dramatic increase in astronomy literature. This has likely led to the predicament that the body of the literature is too large for traditional human consumption and that related and crucial knowledge is not discovered by researchers. In addition to the increased production of astronomical literature, recent decades have also brought several advancements in computer linguistics. In particular, the machine-aided processing of literature dissemination might make it possible to convert this stream of papers into a coherent knowledge set. In this paper, we present the application of computer linguistics techniques to astronomy literature. Specifically, we developed a tool that will find similar articles purely based on text content given an input paper. We find that our technique performs robustly in comparison with other tools recommending articles given a reference paper (known as recommender systems). Our novel tool shows the great power in combining computer linguistics with astronomy literature and suggests that additional research in this endeavor will likely produce even better tools that will help researchers cope with the vast amounts of knowledge being produced.

Read this paper on arXiv…

W. Kerzendorf
Thu, 18 May 17
22/60

Comments: 6 pages, 5 figures, subm A&A, comments welcome

Implementing Ideas for Improving Software Citation and Credit [IMA]

http://arxiv.org/abs/1611.06232


Improving software citation and credit continues to be a topic of interest across and within many disciplines, with numerous efforts underway. In this Birds of a Feather (BoF) session, we started with a list of actionable ideas from last year’s BoF and other similar efforts and worked alone or in small groups to begin implementing them. Work was captured in a common Google document; the session organizers will disseminate or otherwise put this information to use in or for the community in collaboration with those who contributed.

Read this paper on arXiv…

P. Teuben, A. Allen, G. Berriman, et. al.
Tue, 22 Nov 16
69/79

Comments: 4 pages; to be published in ADASS XXVI (held Oct 16-20, 2016) proceedings

The Durability and Fragility of Knowledge Infrastructures: Lessons Learned from Astronomy [CL]

http://arxiv.org/abs/1611.00055


Infrastructures are not inherently durable or fragile, yet all are fragile over the long term. Durability requires care and maintenance of individual components and the links between them. Astronomy is an ideal domain in which to study knowledge infrastructures, due to its long history, transparency, and accumulation of observational data over a period of centuries. Research reported here draws upon a long-term study of scientific data practices to ask questions about the durability and fragility of infrastructures for data in astronomy. Methods include interviews, ethnography, and document analysis. As astronomy has become a digital science, the community has invested in shared instruments, data standards, digital archives, metadata and discovery services, and other relatively durable infrastructure components. Several features of data practices in astronomy contribute to the fragility of that infrastructure. These include different archiving practices between ground- and space-based missions, between sky surveys and investigator-led projects, and between observational and simulated data. Infrastructure components are tightly coupled, based on international agreements. However, the durability of these infrastructures relies on much invisible work – cataloging, metadata, and other labor conducted by information professionals. Continual investments in care and maintenance of the human and technical components of these infrastructures are necessary for sustainability.

Read this paper on arXiv…

C. Borgman, P. Darch, A. Sands, et. al.
Wed, 2 Nov 16
27/55

Comments: Paper presented at the 2016 Annual Meeting of the Association for Information Science and Technology, October 14-18, 2016, Copenhagen, Denmark. 10 pages; this https URL

Quantitative Evaluation of Gender Bias in Astronomical Publications from Citation Counts [IMA]

http://arxiv.org/abs/1610.08984


We analyze the role of first (leading) author gender on the number of citations that a paper receives, on the publishing frequency and on the self-citing tendency. We consider a complete sample of over 200,000 publications from 1950 to 2015 from five major astronomy journals. We determine the gender of the first author for over 70% of all publications. The fraction of papers which have a female first author has increased from less than 5% in the 1960s to about 25% today. We find that the increase of the fraction of papers authored by females is slowest in the most prestigious journals such as Science and Nature. Furthermore, female authors write 19$\pm$7% fewer papers in seven years following their first paper than their male colleagues. At all times papers with male first authors receive more citations than papers with female first authors. This difference has been decreasing with time and amounts to $\sim$6% measured over the last 30 years. To account for the fact that the properties of female and male first author papers differ intrinsically, we use a random forest algorithm to control for the non-gender specific properties of these papers which include seniority of the first author, number of references, total number of authors, year of publication, publication journal, field of study and region of the first author’s institution. We show that papers authored by females receive 10.4$\pm$0.9% fewer citations than what would be expected if the papers with the same non-gender specific properties were written by the male authors. Finally, we also find that female authors in our sample tend to self-cite more, but that this effect disappears when controlled for non-gender specific variables.

Read this paper on arXiv…

N. Caplar, S. Tacchella and S. Birrer
Mon, 31 Oct 16
10/49

Comments: Abridged version to be submitted to Nature Astronomy. Comments welcome. For readers with very little time, the central result of the paper is covered by Figure 6 (Section 5)

Instruments on large optical telescopes — A case study [IMA]

http://arxiv.org/abs/1606.06674


In the distant past, telescopes were known, first and foremost, for the sizes of their apertures. Advances in technology (not merely those related to astronomical detectors) are now enabling astronomers to build extremely powerful instruments to the extent that instruments have now achieved importance comparable to or even exceeding the usual importance accorded to the apertures of the telescopes. However, the cost of successive generations of instruments has risen at a rate far above the rate of inflation. Here, given the vast sums of money now being expended on optical telescopes and their instrumentation, I argue that astronomers must undertake “cost-benefit” analysis for future planning. I use the scientific output of the first two decades of the W. M. Keck Observatory as a laboratory for this purpose. I find, in the absence of upgrades, that the time to reach peak paper production for an instrument is about six years. The prime lifetime of instruments (sans upgrades), as measured by citation returns, is about a decade. I investigate how well instrument builders are rewarded (via citations by users of their instruments) and find acknowledgements ranging from 60% to 100%. Next, given the increasing cost of operating optical telescopes, the management of existing observatories continues to seek new partnerships. This naturally raises the question “What is the cost of a single night of telescope time?” I provide a rational basis to compute this quantity. I then end the paper with some thoughts on the future of large ground-based optical telescopes, bearing in mind the explosion of synoptic precision photometric, astrometric and imaging surveys across the electromagnetic spectrum, the increasing cost of instrumentation and the rise of mega instruments.

Read this paper on arXiv…

S. Kulkarni
Wed, 22 Jun 16
41/50

Comments: 29 pages, 16 figures, destination: PASP

Aggregation and Linking of Observational Metadata in the ADS [IMA]

http://arxiv.org/abs/1601.07858


We discuss current efforts behind the curation of observing proposals, archive bibliographies, and data links in the NASA Astrophysics Data System (ADS). The primary data in the ADS is the bibliographic content from scholarly articles in Astronomy and Physics, which ADS aggregates from publishers, arXiv and conference proceeding sites. This core bibliographic information is then further enriched by ADS via the generation of citations and usage data, and through the aggregation of external resources from astronomy data archives and libraries. Important sources of such additional information are the metadata describing observing proposals and high level data products, which, once ingested in ADS, become easily discoverable and citeable by the science community. Bibliographic studies have shown that the integration of links between data archives and the ADS provides greater visibility to data products and increased citations to the literature associated with them.

Read this paper on arXiv…

A. Accomazzi, M. Kurtz, E. Henneken, et. al.
Fri, 29 Jan 16
22/52

Comments: 4 pages, Proceedings of the ADASS XXV conference

Improving Software Citation and Credit [CL]

http://arxiv.org/abs/1512.07919


The past year has seen movement on several fronts for improving software citation, including the Center for Open Science’s Transparency and Openness Promotion (TOP) Guidelines, the Software Publishing Special Interest Group that was started at January’s AAS meeting in Seattle at the request of that organization’s Working Group on Astronomical Software, a Sloan-sponsored meeting at GitHub in San Francisco to begin work on a cohesive research software citation-enabling platform, the work of Force11 to “transform and improve” research communication, and WSSSPE’s ongoing efforts that include software publication, citation, credit, and sustainability.
Brief reports on these efforts were shared at the BoF, after which participants discussed ideas for improving software citation, generating a list of recommendations to the community of software authors, journal publishers, ADS, and research authors. The discussion, recommendations, and feedback will help form recommendations for software citation to those publishers represented in the Software Publishing Special Interest Group and the broader community.

Read this paper on arXiv…

A. Allen, G. Berriman, K. DuPrie, et. al.
Tue, 29 Dec 15
44/54

Comments: Birds of a Feather session organized by the Astrophysics Source Code Library (ASCL, this http URL ); to be published in Proceedings of ADASS XXV (Sydney, Australia; October, 2015). 4 pages

The data sharing advantage in astrophysics [IMA]

http://arxiv.org/abs/1511.02512


We present here evidence for the existence of a citation advantage within astrophysics for papers that link to data. Using simple measures based on publication data from the NASA Astrophysics Data System, we find that papers with links to data receive, on average, significantly more citations per paper than papers without links to data. Furthermore, using the INSPEC and Web of Science databases, we investigate whether papers of an experimental or theoretical nature display different citation behavior.

Read this paper on arXiv…

S. Dorch, T. Drachen and O. Ellegaard
Tue, 10 Nov 15
24/62

Comments: 4 pages, 2 figures, Conference proceedings of Focus Meeting 3 on Scholarly Publication in Astronomy, IAU GA 2015, Honolulu

Quantifying the Cognitive Extent of Science [CL]

http://arxiv.org/abs/1511.00040


While modern science is characterized by an exponential growth in scientific literature, the increase in publication volume clearly does not reflect the expansion of the cognitive boundaries of science. Nevertheless, most of the metrics for assessing the vitality of science or for making funding and policy decisions are based on productivity. Similarly, the increasing level of knowledge production by large science teams, whose results often enjoy greater visibility, does not necessarily mean that “big science” leads to cognitive expansion. Here we present a novel, big-data method to quantify the extents of cognitive domains of different bodies of scientific literature independently from publication volume, and apply it to 20 million articles published over 60-130 years in physics, astronomy, and biomedicine. The method is based on the lexical diversity of titles of fixed quotas of research articles. Owing to the large size of the quotas, the method overcomes the inherent stochasticity of article titles to achieve <1% precision. We show that the periods of cognitive growth do not necessarily coincide with the trends in publication volume. Furthermore, we show that the articles produced by larger teams cover significantly smaller cognitive territory than (the same quota of) articles from smaller teams. Our findings provide a new perspective on the role of small teams and individual researchers in expanding the cognitive boundaries of science. The proposed method of quantifying the extent of the cognitive territory can also be applied to study many other aspects of “science of science.”
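The idea of measuring lexical diversity over a fixed quota of titles can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `cognitive_extent`, the use of single lowercase words as the lexical unit, and the toy titles are all hypothetical, and the paper's actual tokenisation and quota sizes are not given in the abstract.

```python
import random
import re


def cognitive_extent(titles, quota, seed=0):
    """Count distinct words in a fixed-size random quota of titles.

    Hypothetical stand-in for the abstract's lexical-diversity measure:
    because the quota is fixed, the count does not grow simply because
    more articles were published.
    """
    rng = random.Random(seed)
    sample = rng.sample(titles, quota)  # fixed quota, drawn without replacement
    vocabulary = set()
    for title in sample:
        vocabulary.update(re.findall(r"[a-z0-9]+", title.lower()))
    return len(vocabulary)


# Toy corpora (invented for illustration): same quota, different breadth.
narrow_titles = ["dark matter halos", "dark matter profiles",
                 "halos of dark matter", "dark halos"]
broad_titles = ["dark matter halos", "stellar winds",
                "exoplanet atmospheres", "gamma ray bursts"]

narrow_extent = cognitive_extent(narrow_titles, quota=4)
broad_extent = cognitive_extent(broad_titles, quota=4)
print(narrow_extent, broad_extent)
```

With equal quotas, the corpus whose titles reuse the same few terms yields the smaller count, which is the sense in which the measure is independent of publication volume.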

Read this paper on arXiv…

S. Milojevic
Tue, 3 Nov 15
56/90

Comments: Accepted for publication in Journal of Informetrics

Measuring Metrics – A forty year longitudinal cross-validation of citations, downloads, and peer review in Astrophysics [CL]

http://arxiv.org/abs/1510.09099


Citation measures, and newer altmetric measures such as downloads, are now commonly used to inform personnel decisions. How well do or can these measures measure or predict the past, current or future scholarly performance of an individual? Using data from the Smithsonian/NASA Astrophysics Data System we analyze the publication, citation, download, and distinction histories of a cohort of 922 individuals who received a U.S. PhD in astronomy in the period 1972-1976. By examining the same and different measures at the same and different times for the same individuals we are able to show the capabilities and limitations of each measure. Because the distributions are lognormal, measurement uncertainties are multiplicative; we show that in order to state with 95% confidence that one person’s citations and/or downloads are significantly higher than another person’s, the log difference in the ratio of counts must be at least 0.3 dex, which corresponds to a multiplicative factor of two.
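The correspondence between 0.3 dex and a factor of two can be checked directly. In this worked step the symbols N_1 and N_2 are assumed labels for the two individuals' counts; they do not appear in the abstract.

```latex
\log_{10}\!\left(\frac{N_1}{N_2}\right) \ge 0.3
\quad\Longrightarrow\quad
\frac{N_1}{N_2} \ge 10^{0.3} \approx 2.0
```

Read this way, one person's counts need to be roughly double another's before the difference is significant at the stated 95% level.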

Read this paper on arXiv…

M. Kurtz and E. Henneken
Mon, 2 Nov 15
8/51

Comments: Author’s version of manuscript accepted for publication in the Journal of the Association for Information Science and Technology (JASIST); 35 pages 16 figures
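
The 0.3 dex threshold quoted in the abstract above can be checked with a few lines of arithmetic. This is a small sketch; the function names are illustrative, and only the 0.3 dex figure comes from the abstract.

```python
import math


def dex_to_factor(dex: float) -> float:
    """Convert a base-10 logarithmic difference (dex) to a multiplicative factor."""
    return 10.0 ** dex


def log_difference(count_a: float, count_b: float) -> float:
    """Base-10 logarithmic difference between two positive counts, in dex."""
    return math.log10(count_a / count_b)


def significantly_higher(count_a: float, count_b: float,
                         threshold_dex: float = 0.3) -> bool:
    """True if count_a exceeds count_b by at least the threshold in dex."""
    return log_difference(count_a, count_b) >= threshold_dex


print(round(dex_to_factor(0.3), 3))      # close to 2
print(significantly_higher(2000, 1000))  # a factor of two meets the threshold
print(significantly_higher(1500, 1000))  # a factor of 1.5 does not
```

Since the criterion depends only on the ratio, 200 versus 100 citations is judged the same way as 2000 versus 1000.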

A New Ranking Scheme for the Institutional Scientific Performance [IMA]

http://arxiv.org/abs/1508.03713


We propose a new performance indicator to evaluate the productivity of research institutions by their disseminated scientific papers. The new quality measure includes two principal components: the normalized impact factor of the journal in which the paper was published, and the number of citations received per year since it was published. In both components, the scientific impacts are weighted by the contribution of authors from the evaluated institution. As a whole, our new metric, the institutional performance score, takes into account both journal-based impact and article-specific impact. We apply this new scheme to evaluate the research output performance of Turkish institutions specialized in astronomy and astrophysics in the period 1998-2012. We discuss the implications of the new metric, and emphasize its benefits, along with a comparison to other proposed institutional performance indicators.

Read this paper on arXiv…

S. Bilir, E. Gogus, O. Tas, et. al.
Tue, 18 Aug 15
11/43

Comments: 12 pages, 3 figures and 2 tables, accepted for publication in Journal of Scientometric Research
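
The abstract above names the ingredients of the institutional performance score but not the formula. The sketch below shows one way the two components could be combined. It assumes the author contribution is the fraction of a paper's authors affiliated with the institution, and that the two components are simply added. Both choices, and all names, are assumptions for illustration rather than the paper's definition.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Paper:
    normalized_impact_factor: float  # journal component, already normalized
    citations: int                   # citations received so far
    years_since_publication: float
    institution_authors: int         # authors from the evaluated institution
    total_authors: int


def institutional_score(papers: List[Paper]) -> float:
    """Sum author-share-weighted journal and citation components (assumed form)."""
    score = 0.0
    for paper in papers:
        share = paper.institution_authors / paper.total_authors
        # Guard against division by a near-zero age for very recent papers.
        age = max(paper.years_since_publication, 1.0)
        citations_per_year = paper.citations / age
        score += share * (paper.normalized_impact_factor + citations_per_year)
    return score


papers = [
    Paper(1.0, 10, 5.0, institution_authors=1, total_authors=2),
    Paper(0.5, 0, 2.0, institution_authors=3, total_authors=3),
]
print(institutional_score(papers))
```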

Greek Astronomy PhDs: The last 200 years [CL]

http://arxiv.org/abs/1507.02585


We have recently compiled a database with all doctoral dissertations (PhDs) completed in modern Greece (1837-2014), in the general area of astronomy and astrophysics, as well as in space and ionospheric physics. A preliminary statistical analysis of the data is presented, along with a discussion of the general trends observed.

Read this paper on arXiv…

V. Charmandaris
Fri, 10 Jul 15
3/53

Comments: 8 pages, 7 figures (original file also available at this http URL)

Astrophysics Source Code Library Enhancements [IMA]

http://arxiv.org/abs/1411.2031


The Astrophysics Source Code Library (ASCL; ascl.net) is a free online registry of codes used in astronomy research; it currently contains over 900 codes and is indexed by ADS. The ASCL has recently moved a new infrastructure into production. The new site provides a true database for the code entries and integrates the WordPress news and information pages and the discussion forum into one site. Previous capabilities are retained and permalinks to ascl.net continue to work. This improvement offers more functionality and flexibility than the previous site, is easier to maintain, and offers new possibilities for collaboration. This presentation covers these recent changes to the ASCL.

Read this paper on arXiv…

R. Hanisch, A. Allen, G. Berriman, et. al.
Tue, 11 Nov 14
49/61

Comments: 4 pages; to be published in ADASS XXIV Proceedings. ASCL can be accessed at this http URL

Data engineering for archive evolution [IMA]

http://arxiv.org/abs/1410.3481


From the moment astronomical observations are made, the resulting data products begin to grow stale. Even if perfect binary copies are preserved through repeated timely migration to more robust storage media, data standards evolve and new tools are created that require different kinds of data or metadata. The expectations of the astronomical community change even if the data do not. We discuss data engineering to mitigate the ensuing risks with examples from a recent project to refactor seven million archival images to new standards of nomenclature, metadata, format, and compression.

Read this paper on arXiv…

R. Seaman
Wed, 15 Oct 14
37/58

Comments: 11 pages, this is a longer version of a poster paper submitted to the proceedings of ADASS XXIV

Two years of ALMA bibliography – lessons learned [IMA]

http://arxiv.org/abs/1407.6930


Telescope bibliographies are integral parts of observing facilities. They are used to associate the published literature with archived observational data, to measure an observatory’s scientific output through publication and citation statistics, and to define guidelines for future observing strategies.
The ESO and NRAO librarians as well as NAOJ jointly maintain the ALMA (Atacama Large Millimeter/submillimeter Array) bibliography, a database of refereed papers that use ALMA data.
In this paper, we illustrate how relevant articles are identified, which procedures are used to tag entries in the database and link them to the correct observations, and how results are communicated to ALMA stakeholders and the wider community. Efforts made to streamline the process will be explained and evaluated, and a first analysis of ALMA papers published after two years of observations will be given.

Read this paper on arXiv…

S. Meakins, U. Grothkopf, M. Bishop, et. al.
Mon, 28 Jul 14
36/56

Comments: 7 pages; to be published in the Proceedings of SPIE, vol. 9149, 9149-81 (2014)

The recent Italian regulations about the open-access availability of publicly-funded research publications, and the documentation landscape in astrophysics [CL]

http://arxiv.org/abs/1407.6296


In October 2013 Italy enacted a law containing the first national regulations about the open-access availability of publicly-funded research results (publications). This contribution examines how these new regulations fit the specific situation of astrophysics, a pioneering open-access discipline.

Read this paper on arXiv…

M. Marra
Thu, 24 Jul 14
17/61

Comments: To be published in the proceedings of LISA VII Conference, Naples, Italy, 18-20.6.2014

Looking before leaping: Creating a software registry [IMA]

http://arxiv.org/abs/1407.5378


What lessons can be learned from examining numerous efforts to create a repository or directory of scientist-written software for a discipline? Astronomy has seen a number of efforts to build a repository or directory of scientist-written software, one of which is the Astrophysics Source Code Library (ASCL). The ASCL (ascl.net) was founded in 1999, had a period of dormancy, and was restarted in 2010. When taking over responsibility for the ASCL in 2010, Allen sought to answer the opening question, hoping this would better inform her work. We also provide specific steps the ASCL is taking to try to improve code sharing and discovery in astronomy and share recent improvements to the resource.

Read this paper on arXiv…

A. Allen and J. Schmidt
Tue, 22 Jul 14
20/45

Comments: 3 pages; submission for WSSSPE2

The Virtual Observatory Registry [IMA]

http://arxiv.org/abs/1407.3083


In the Virtual Observatory (VO), the Registry provides the mechanism with which users and applications discover and select resources — typically, data and services — that are relevant for a particular scientific problem. Even though the VO adopted technologies in particular from the bibliographic community where available, building the Registry system involved a major standardisation effort, involving about a dozen interdependent standard texts. This paper discusses the server-side aspects of the standards and their application, as regards the functional components (registries), the resource records in both format and content, the exchange of resource records between registries (harvesting), as well as the creation and management of the identifiers used in the system based on the notion of authorities. Registry record authors, registry operators or even advanced users thus receive a big picture serving as a guideline through the body of relevant standard texts. To complete this picture, we also mention common usage patterns and open issues as appropriate.

Read this paper on arXiv…

M. Demleitner, G. Greene, P. Sidaner, et. al.
Mon, 14 Jul 14
43/64

Comments: N/A
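
The abstract above mentions identifiers built on the notion of authorities. IVOA identifiers take the general shape `ivo://<authority>/<resource key>`, so a rough sketch of splitting one into its parts might look like the following. The helper name and the example identifier are made up for illustration, and the sketch does not apply the full validation rules of the IVOA Identifiers standard.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class IvoId(NamedTuple):
    authority: str
    resource_key: str


def parse_ivoid(identifier: str) -> IvoId:
    """Split an ivo:// identifier into authority and resource key.

    Only the scheme and the presence of an authority are checked here.
    """
    parts = urlsplit(identifier)
    if parts.scheme != "ivo" or not parts.netloc:
        raise ValueError(f"not an ivo:// identifier: {identifier!r}")
    return IvoId(authority=parts.netloc, resource_key=parts.path.lstrip("/"))


# Hypothetical identifier, not a real registry record.
print(parse_ivoid("ivo://example.authority/some/resource"))
```

Keeping the authority separate from the resource key mirrors the division of labour the abstract describes: the authority is the part the identifier system is built around, and the key names a resource under it.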

Computing and Using Metrics in the ADS [CL]

http://arxiv.org/abs/1406.4542


Finding measures for research impact, be it for individuals, institutions, instruments or projects, has gained a lot of popularity. More papers than ever are being written on new impact measures, and problems with existing measures are being pointed out on a regular basis. Funding agencies require impact statistics in their reports, job candidates incorporate them in their resumes, and publication metrics have even been used in at least one recent court case. To support this need for research impact indicators, the SAO/NASA Astrophysics Data System (ADS) has developed a service which provides a broad overview of various impact measures. In this presentation we discuss how the ADS can be used to quench the thirst for impact measures. We will also discuss a couple of the lesser known indicators in the metrics overview and the main issues to be aware of when compiling publication-based metrics in the ADS, namely author name ambiguity and citation incompleteness.

Read this paper on arXiv…

E. Henneken, A. Accomazzi, M. Kurtz, et. al.
Thu, 19 Jun 14
1/62

Comments: to appear in proceedings of LISA VII conference, Naples, Italy

Bibliometric Indicators of Young Authors in Astrophysics: Can Later Stars be Predicted? [CL]

http://arxiv.org/abs/1404.3084


We test 16 bibliometric indicators with respect to their validity at the level of the individual researcher by estimating their power to predict later successful researchers. We compare the indicators of a sample of astrophysics researchers who later co-authored highly cited papers, computed for the period before their first landmark paper, with the distributions of these indicators over a random control group of young authors in astronomy and astrophysics. We find that field and citation-window normalisation substantially improves the predictive power of citation indicators. The two indicators of total influence based on citation numbers normalised with expected citation numbers are the only indicators which show differences between later stars and random authors that are significant at the 1% level. Indicators of paper output are not very useful for predicting later stars. The famous $h$-index does not distinguish at all between later stars and the random control group.

Read this paper on arXiv…

F. Havemann and B. Larsen
Mon, 14 Apr 14
38/41
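
For readers unfamiliar with the $h$-index mentioned in the abstract above, its standard definition is the largest $h$ such that the author has $h$ papers with at least $h$ citations each. A short sketch with made-up citation counts:

```python
from typing import Iterable


def h_index(citations: Iterable[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Two hypothetical authors with very different citation profiles.
print(h_index([10, 8, 5, 4, 3]))
print(h_index([25, 8, 5, 3, 3]))
```

The index ignores how far the top papers exceed the threshold, so a single very highly cited paper barely moves it.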

The Unified Astronomy Thesaurus [IMA]

http://arxiv.org/abs/1403.6656


The Unified Astronomy Thesaurus (UAT) is an open, interoperable and community-supported thesaurus which unifies the existing divergent and isolated Astronomy & Astrophysics vocabularies into a single high-quality, freely-available open thesaurus formalizing astronomical concepts and their inter-relationships. The UAT builds upon the existing IAU Thesaurus with major contributions from the astronomy portions of the thesauri developed by the Institute of Physics Publishing, the American Institute of Physics, and SPIE. We describe the effort behind the creation of the UAT and the process through which we plan to keep the document up to date through broad community participation.

Read this paper on arXiv…

A. Accomazzi, N. Gray, C. Erdmann, et. al.
Thu, 27 Mar 14
59/62

Principles of scientific research team formation and evolution [CL]

http://arxiv.org/abs/1403.2787


Research teams are the fundamental social unit of science, and yet there is currently no model that describes their basic property: size. In most fields teams have grown significantly in recent decades. We show that this is partly due to the change in the character of the team-size distribution. We explain these changes with a comprehensive yet straightforward model of how teams of different sizes emerge and grow. This model accurately reproduces the evolution of the empirical team-size distribution over a period of 50 years. The modeling reveals that there are two modes of knowledge production. The first and more fundamental mode employs relatively small, core teams. Core teams form by a Poisson process and produce a Poisson distribution of team sizes in which larger teams are exceedingly rare. The second mode employs extended teams, which started as core teams, but subsequently accumulated new members in proportion to the past productivity of their members. Given time, this mode gives rise to a power-law tail of large teams (10-1000 members), which features in many fields today. Based on this model we construct an analytical functional form that allows the contribution of different modes of authorship to be determined directly from the data and is applicable to any field. The model also offers a solid foundation for studying other social aspects of science, such as productivity and collaboration.

Read this paper on arXiv…

S. Milojevic
Thu, 13 Mar 14
8/58
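
The abstract above states that a Poisson distribution of core-team sizes makes large teams exceedingly rare. The sketch below makes that concrete by computing the Poisson tail probability directly. It treats team size itself as the Poisson variable and uses a mean of 2.5 members; both are assumptions chosen for illustration, not values taken from the paper.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a Poisson variable with the given mean equals k."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def poisson_tail(k_min: int, mean: float) -> float:
    """Probability that a Poisson variable is at least k_min."""
    return 1.0 - sum(poisson_pmf(k, mean) for k in range(k_min))


ASSUMED_MEAN_TEAM_SIZE = 2.5  # illustrative value, not from the paper

for size in (5, 10, 20):
    print(size, poisson_tail(size, ASSUMED_MEAN_TEAM_SIZE))
```

The tail falls off faster than exponentially, which is why a pure Poisson mode cannot account for teams of tens to thousands of members and the abstract invokes a second, power-law mode.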

10 Simple Rules for the Care and Feeding of Scientific Data [CL]

http://arxiv.org/abs/1401.2134


This article offers a short guide to the steps scientists can take to ensure that their data and associated analyses continue to be of value and to be recognized. In just the past few years, hundreds of scholarly papers and reports have been written on questions of data sharing, data provenance, research reproducibility, licensing, attribution, privacy, and more, but our goal here is not to review that literature. Instead, we present a short guide intended for researchers who want to know why it is important to “care for and feed” data, with some practical advice on how to do that.

Read this paper on arXiv…

Fri, 10 Jan 14
58/69

Ideas for Advancing Code Sharing (A Different Kind of Hack Day) [IMA]

http://arxiv.org/abs/1312.7352


How do we as a community encourage the reuse of software for telescope operations, data processing, and calibration? How can we support making codes used in research available for others to examine? Continuing the discussion from last year’s “Bring out your codes!” BoF session, participants separated into groups to brainstorm ideas to mitigate factors which inhibit code sharing and nurture those which encourage code sharing. The BoF concluded with the sharing of ideas that arose from the brainstorming sessions and a brief summary by the moderator.

Read this paper on arXiv…

Tue, 31 Dec 13
36/49

Astrophysics Source Code Library: Incite to Cite! [IMA]

http://arxiv.org/abs/1312.6693


The Astrophysics Source Code Library (ASCL, this http URL) is an online registry of over 700 source codes that are of interest to astrophysicists, with more being added regularly. The ASCL actively seeks out codes as well as accepting submissions from the code authors, and all entries are citable and indexed by ADS. All codes have been used to generate results published in or submitted to a refereed journal and are available either via a download site or from an identified source. In addition to being the largest directory of scientist-written astrophysics programs available, the ASCL is also an active participant in the reproducible research movement with presentations at various conferences, numerous blog posts and a journal article. This poster provides a description of the ASCL and the changes that we are starting to see in the astrophysics community as a result of the work we are doing.

Read this paper on arXiv…

Wed, 25 Dec 13
16/23