Astrophysics Data System - Data in The System

Papers are indexed in the database by their bibliographic record, which contains details of the journal they were published in and associated metadata such as author lists, references and citations. The data was originally stored in ASCII format, but the limitations of that format eventually led the database maintainers to migrate all records to XML (Extensible Markup Language) in 2000. Each bibliographic record is now stored as an XML element, with sub-elements for the various metadata.
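
As a rough illustration of a record stored as an XML element with metadata sub-elements, the following Python sketch parses one such record with the standard library. The element names (`record`, `identifier`, `authors`, and so on) and all values are hypothetical and are not taken from the actual ADS schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element names and values are illustrative only,
# not the real ADS schema.
RECORD_XML = """
<record>
  <identifier>EXAMPLE-0001</identifier>
  <title>An Example Paper</title>
  <journal>Example Journal of Astronomy</journal>
  <authors>
    <author>Doe, J.</author>
    <author>Roe, R.</author>
  </authors>
  <references>
    <reference>EXAMPLE-0000</reference>
  </references>
</record>
"""

def parse_record(xml_text):
    """Turn one bibliographic XML element into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {
        "identifier": root.findtext("identifier"),
        "title": root.findtext("title"),
        "journal": root.findtext("journal"),
        # Repeated sub-elements become lists.
        "authors": [a.text for a in root.findall("authors/author")],
        "references": [r.text for r in root.findall("references/reference")],
    }

record = parse_record(RECORD_XML)
```

Keeping each kind of metadata in its own sub-element means a field such as the author list can be read or extended without touching the rest of the record.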

Since the advent of online editions of journals, abstracts have been loaded into the ADS on or before the publication date of articles, with the full journal text available to subscribers. Older articles have been scanned, and their abstracts are created from the scans using optical character recognition software. Scanned articles from before about 1995 are usually available free, by agreement with the journal publishers.

Scanned articles are stored in TIFF format, at both medium and high resolution. The TIFF files are converted on demand into GIF files for on-screen viewing, and into PDF or PostScript files for printing. The generated files are then cached so that popular articles do not have to be regenerated for every request. As of 2000, ADS contained 250 GB of scans, consisting of 1,128,955 article pages from 138,789 articles. By 2005 this had grown to 650 GB, and it was expected to grow further, to about 900 GB by 2007. No later figures have been published.
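
The convert-on-demand-then-cache pattern described above can be sketched in a few lines of Python. This is a minimal sketch, not the ADS implementation: `RenderCache`, `fake_convert` and the page identifiers are hypothetical names, and the converter is a stand-in that does no real image processing.

```python
from typing import Callable, Dict, Tuple

class RenderCache:
    """Convert a stored scan to a requested format on demand and cache it."""

    def __init__(self, convert: Callable[[str, str], bytes]):
        # Hypothetical converter signature: (page_id, fmt) -> rendered bytes.
        self._convert = convert
        self._cache: Dict[Tuple[str, str], bytes] = {}
        self.conversions = 0  # counts how often real conversion work ran

    def get(self, page_id: str, fmt: str) -> bytes:
        key = (page_id, fmt)
        if key not in self._cache:
            # Cache miss: do the expensive conversion once and remember it.
            self._cache[key] = self._convert(page_id, fmt)
            self.conversions += 1
        return self._cache[key]

def fake_convert(page_id: str, fmt: str) -> bytes:
    # Stand-in for a real TIFF-to-GIF/PDF/PostScript conversion.
    return f"{page_id} rendered as {fmt}".encode()

cache = RenderCache(fake_convert)
first = cache.get("page-0001", "gif")   # converted
second = cache.get("page-0001", "gif")  # served from the cache
pdf = cache.get("page-0001", "pdf")     # different format, converted once
```

A production cache would also need an eviction policy to bound disk use, which this sketch leaves out.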

The database initially contained only astronomical references, but has since grown to incorporate three databases: astronomy references (including planetary sciences and solar physics), physics references (including instrumentation and geosciences), and preprints of scientific papers from arXiv. The astronomy database is by far the most advanced, and its use accounts for about 85% of total ADS usage. Articles are assigned to the databases according to their subject rather than the journal they are published in, so articles from any one journal might appear in all three subject databases. Separating the databases allows searching in each discipline to be tailored: words can automatically be given different weight functions in different database searches, depending on how common they are in the relevant field.
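
To make the idea of field-dependent word weights concrete, here is a small Python sketch. The text does not say which weight function ADS uses, so inverse document frequency is an assumption chosen purely for illustration, and the two tiny title collections are invented.

```python
import math
from collections import Counter

def word_weights(documents):
    """Inverse-document-frequency weights: rarer words get larger weights.

    Assumption: IDF stands in for the unspecified ADS weight function.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for text in documents:
        # Count each word once per document.
        doc_freq.update(set(text.lower().split()))
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}

# Invented example titles, one list per subject database.
astronomy_docs = [
    "star formation in spiral galaxies",
    "star clusters and stellar evolution",
    "laser guide star adaptive optics",
    "galaxy rotation curves",
]
physics_docs = [
    "laser cooling of atoms",
    "laser interferometry for gravitational waves",
    "laser plasma interactions",
    "neutron star equation of state",
]

astro = word_weights(astronomy_docs)
phys = word_weights(physics_docs)
```

In this toy data "star" is common in the astronomy titles and so carries little weight there, while in the physics titles it is rare and carries more; "laser" behaves the opposite way.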

Data in the preprint archive is updated daily from the arXiv, the main repository of physics and astronomy preprints. The advent of preprint servers has, like ADS, had a significant impact on the rate of astronomical research, as papers are often made available from preprint servers weeks or months before they are published in the journals. Incorporating preprints from the arXiv into ADS means that the search engine can return the most current research available, with the caveat that preprints may not have been peer reviewed or proofread to the standard required for publication in the main journals. The ADS database links preprints with subsequently published articles wherever possible, so that citation and reference searches return links to the journal article even where it was the preprint that was cited.
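
One way to picture the preprint-to-article linking is as a lookup table applied to a reference list. The Python sketch below assumes such a table exists; the identifiers and function names are hypothetical and do not reflect how ADS stores its links.

```python
# Hypothetical link table: preprint identifier -> published article identifier.
preprint_to_published = {"preprint-A": "journal-A"}

def resolve(identifier, links):
    """Return the published article for a preprint, or the identifier itself."""
    return links.get(identifier, identifier)

def resolve_references(reference_list, links):
    """Map every reference to its published form, dropping duplicates."""
    seen = set()
    resolved = []
    for ref in reference_list:
        target = resolve(ref, links)
        if target not in seen:
            seen.add(target)
            resolved.append(target)
    return resolved

# A reference list that cites the same work both as preprint and as article.
refs = ["preprint-A", "journal-B", "journal-A", "preprint-C"]
result = resolve_references(refs, preprint_to_published)
```

Here `preprint-A` and `journal-A` collapse into a single entry, while `preprint-C`, which has no known published version, is kept as a preprint.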
