The NASA-ADS Abstract Service provides a sophisticated search capability for the literature in Astronomy, Planetary Sciences, Physics/Geophysics, and Space Instrumentation. The ADS is funded by NASA and access to the ADS services is free to anybody worldwide without restrictions. It allows the user to search the literature by author, title, and abstract text. The ADS database contains over 3.6 million references, with 965,000 in the Astronomy/Planetary Sciences database, and 1.6 million in the Physics/Geophysics database. 2/3 of the records have full abstracts, the rest are table of contents entries (titles and author lists only). The coverage for the Astronomy literature is better than 95% from 1975. Before that we cover all major journals and many smaller ones. Most of the journal literature is covered back to volume 1. We now get abstracts on a regular basis from most journals. Over the last year we have entered basically all conference proceedings tables of contents that are available at the Harvard Smithsonian Center for Astrophysics library. This has greatly increased the coverage of conference proceedings in the ADS. The ADS also covers the ArXiv Preprints. We download these preprints every night and index all the preprints. They can be searched either together with the other abstracts or separately. There are currently about 260,000 preprints in that database. In January 2004 we have introduced two new services, full text searching and a personal notification service called "myADS". As all other ADS services, these are free to use for anybody.
In the recent years, there have been significant advancements in the areas of scientific data management and retrieval techniques, particularly in terms of standards and protocols for archiving data and metadata. Scientific data is rich, and spread across different places. In order to integrate these pieces together, a data archive and associated metadata should be generated. Data should be stored in a format that can be retrievable and more importantly it should be in a format that will continue to be accessible as technology changes, such as XML. While general-purpose search engines (such as Google or Bing) are useful formore finding many things on the Internet, they are often of limited usefulness for locating Earth Science data relevant (for example) to a specific spatiotemporal extent. By contrast, tools that search repositories of structured metadata can locate relevant datasets with fairly high precision, but the search is limited to that particular repository. Federated searches (such as Z39.50) have been used, but can be slow and the comprehensiveness can be limited by downtime in any search partner. An alternative approach to improve comprehensiveness is for a repository to harvest metadata from other repositories, possibly with limits based on subject matter or access permissions. Searches through harvested metadata can be extremely responsive, and the search tool can be customized with semantic augmentation appropriate to the community of practice being served. One such system, Mercury, a metadata harvesting, data discovery, and access system, built for researchers to search to, share and obtain spatiotemporal data used across a range of climate and ecological sciences. Mercury is open-source toolset, backend built on Java and search capability is supported by the some popular open source search libraries such as SOLR and LUCENE. Mercury harvests the structured metadata and key data from several data providing servers around the world and builds a
harvest moon back to nature iso download german
Data discovery, particularly the discovery of key variables and their inter-relationships, is key to secondary data analysis, and in-turn, the evolving field of data science. Interface designers have presumed that their users are domain experts, and so they have provided complex interfaces to support these "experts." Such interfaces hark back to a time when searches needed to be accurate first time as there was a high computational cost associated with each search. Our work is part of a governmental research initiative between the medical and social research funding bodies to improve the use of social data in medical research. The cross-disciplinary nature of data science can make no assumptions regarding the domain expertise of a particular scientist, whose interests may intersect multiple domains. Here we consider the common requirement for scientists to seek archived data for secondary analysis. This has more in common with search needs of the "Google generation" than with their single-domain, single-tool forebears. Our study compares a Google-like interface with traditional ways of searching for noncomplex health data in a data archive. Two user interfaces are evaluated for the same set of tasks in extracting data from surveys stored in the UK Data Archive (UKDA). One interface, Web search, is "Google-like," enabling users to browse, search for, and view metadata about study variables, whereas the other, traditional search, has standard multioption user interface. Using a comprehensive set of tasks with 20 volunteers, we found that the Web search interface met data discovery needs and expectations better than the traditional search. A task interface repeated measures analysis showed a main effect indicating that answers found through the Web search interface were more likely to be correct (F1,19=37.3, P
Since the turn of the millennium a constant concern of astronomical archives have begun providing data to the public through standardized protocols unifying data from disparate physical sources and wavebands across the electromagnetic spectrum into an astronomical virtual observatory (VO). In October 2014, NASA began support for the NASA Astronomical Virtual Observatories (NAVO) program to coordinate the efforts of NASA astronomy archives in providing data to users through implementation of protocols agreed within the International Virtual Observatory Alliance (IVOA). A major goal of the NAVO collaboration has been to step back from a piecemeal implementation of IVOA standards and define what the appropriate presence for the US and NASA astronomy archives in the VO should be. This includes evaluating what optional capabilities in the standards need to be supported, the specific versions of standards that should be used, and returning feedback to the IVOA, to support modifications as needed. We discuss a standard archive model developed by the NAVO for data archive presence in the virtual observatory built upon a consistent framework of standards defined by the IVOA. Our standard model provides for discovery of resources through the VO registries, access to observation and object data, downloads of image and spectral data and general access to archival datasets. It defines specific protocol versions, minimum capabilities, and all dependencies. The model will evolve as the capabilities of the virtual observatory and needs of the community change.
Historically, astronomers have utilized a piecemeal set of archives such as SIMBAD, the Washington Double Star Catalog, various exoplanet encyclopedias and electronic tables from the literature to cobble together stellar and exo-planetary parameters in the absence of corresponding images and spectra. As the search for planets around young stars through direct imaging, transits and infrared/optical radial velocity surveys blossoms, there is a void in the available set of to create comprehensive lists of the stellar parameters of nearby stars especially for important parameters such as metallicity and stellar activity indicators. For direct imaging surveys, we need better resources for downloading existing high contrast images to help confirm new discoveries and find ideal target stars. Once we have discovered new planets, we need a uniform database of stellar and planetary parameters from which to look for correlations to better understand the formation and evolution of these systems. As a solution to these issues, we are developing the Starchive - an open access stellar archive in the spirit of the open exoplanet catalog, the Kepler Community Follow-up Program and many others. The archive will allow users to download various datasets, upload new images, spectra and metadata and will contain multiple plotting tools to use in presentations and data interpretations. While we will highly regulate and constantly validate the data being placed into our archive the open nature of its design is intended to allow the database to be expanded efficiently and have a level of versatility which is necessary in today's fast moving, big data community. Finally, the front-end scripts will be placed on github and users will be encouraged to contribute new plotting tools. Here, I will introduce the community to the content and expected capabilities of the archive and query the audience for community feedback. 2ff7e9595c
Comments