The concept behind citation indexing is fundamentally simple. If the value of
information is determined by those who use it, what better way to measure the
quality of a work than by the impact it makes on the community at large? The
widest possible population within the scholarly community
(i.e. anyone who uses or cites the source material) determines the influence or
impact of the idea and its originator on our body of knowledge. Because of its
simplicity, one tends to forget that citation indexing is actually a fairly recent
form of information management and retrieval.
Three factors led to the development of citation indexing
in the 1950s. With the huge influx of government dollars into
research and development following World War II, the research community
naturally began to publicly document its findings through the accepted
channel of published scientific journal literature. The subsequent burgeoning
of the literature created a need for a method of indexing and retrieval
that would be more cost-effective and efficient than the then-current
model of human indexing of materials for subject-specific indexes. While
the subtle judgements made by subject specialists were valuable in giving
depth to a subject index, manual indexing was both time-consuming and
labor-intensive, and its costs increased in proportion to the
growth of material to be indexed. The need for a better way of managing
information was thus the first factor.
The second factor was the growing dissatisfaction with the capacity of subject
indexing to meet the needs of the active researcher. At that time, subject indexes
could lag considerably in adding new material; months
could pass before researchers in one field would learn of published findings in some
other field that had relevance to their own study. Furthermore, subject indexing had
limitations as a retrieval tool. Terminology appropriate to one specific
discipline would not necessarily have meaning to researchers in another, perhaps
overlapping, discipline. At the same time, scientists were recognizing that they had to
be aware of, if not completely familiar with, work in a number of different subject
disciplines in order to be confident that they had properly grounded the research through
an appropriate review of the literature.
Along with these needs came the hope that automation might hold the answers; this was the
third and final factor in the development of citation indexing. Computerization in the
1950s was far removed from the desktop environment of today, but there was tremendous
excitement over potential benefits to be derived from the application of machines to the
generation and compilation of data. The U.S. government hoped that automation could
mitigate or even eliminate the difficulties of manual indexing, and it launched a number of
projects to investigate these possibilities.
Dr. Eugene Garfield, founder and now Chairman Emeritus of ISI® (now Thomson Scientific),
was deeply involved in the research relating to machine-generated indexes in
the mid-1950s and early 1960s. One of his earliest points of involvement was
a project sponsored by the Armed Forces Medical Library (predecessor to our
current National Library of Medicine). The Welch Medical Library Indexing project,
as it was called, was to investigate the role of automation in the organization
and retrieval of medical literature. The hope was that the problems associated
with subjective human judgement in selection of descriptors and indexing terms
could be eliminated. By removing the human element, one might thereby increase
the speed with which information was incorporated into the indexes. It might
also increase the cost-effectiveness of the indexes. Garfield grasped early
on that review articles in the journal literature were heavily reliant on the
bibliographic citations that referred the reader to the original published source
for the notable idea or concept. By capturing those citations, Garfield believed,
the researcher could immediately get a view of the approach taken by another
scientist to support an idea or methodology based on the sources that the published
writer had consulted and cited as pertinent in the bibliography. As retrieval
terms, citations could function as well as keywords and descriptors that were
thoughtfully assigned by a professional indexer.
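The idea of using citations as retrieval terms can be illustrated with a minimal sketch. It is not a reconstruction of Garfield's system; the paper identifiers, the sample bibliographies, and the function name build_citation_index are all hypothetical and chosen only for illustration. The sketch assumes that each source paper comes with a list of the works it cites, and it inverts that list so that a known work leads to every paper citing it.

```python
from collections import defaultdict


def build_citation_index(references):
    """Invert {citing paper: [cited works]} into {cited work: [citing papers]}."""
    index = defaultdict(list)
    for citing, cited_works in references.items():
        for cited in cited_works:
            index[cited].append(citing)
    # Sort the citing papers so lookups return a stable order.
    return {cited: sorted(citing) for cited, citing in index.items()}


# Hypothetical bibliographies: each key is a source paper, each value lists
# the earlier works that paper cites.
references = {
    "review_1961": ["paper_A", "paper_B"],
    "genetics_note_1962": ["paper_A"],
    "chemistry_paper_1962": ["paper_B", "paper_C"],
}

citation_index = build_citation_index(references)

# Starting from a known work, retrieve every paper that cites it,
# regardless of the subject heading under which those papers would be filed.
print(citation_index["paper_A"])
```

A lookup on "paper_A" returns both the review and the genetics note, even though no subject term was assigned to either; the cited work itself acts as the retrieval key.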
In the early 1960s, Eugene Garfield and Associates developed two pilot projects
that would test the viability and efficiency of citation indexing. The first project involved
the creation of a database that would index the citations of 5,000 chemical patents held by
two private pharmaceutical companies. The referenced citations in this instance were to
prior patents, the documentation sources that the government patent examiners were
using to support a decision to grant or deny a patent. The connections that the patent
citation index made were then compared against two classification and indexing
systems that the participants were using at the time. Based on this investigation
and analysis, the project sponsors determined that citation indexing permitted the
retrieval of relevant literature across arbitrary classifications in a way that
subject-oriented indexing could not.
A second pilot project in 1962 brought together Garfield's recently incorporated
enterprise, the Institute for Scientific Information (now Thomson Scientific), and the United
States National Institutes of Health to build an index to the published
literature on genetics. This project was far more complex in nature
than the patents index. Three databases were built to cover the literature
over 1 year, 5 years and 14 years with a varying number of source publications
indexed in each. While this project was to test the feasibility and
utility of a narrow, discipline-oriented citation index, at completion,
it was concluded that the database with the most broadly based set
of source publications formed the most comprehensive and useful
guide to the published literature in the field of genetics. The database
for the single-year term had drawn not only on journals that were primarily
devoted to the field of genetics research but also on
a large pool of journals that published genetics papers on a more peripheral
or occasional basis. Additionally, while the automated system required
a certain level of effort in standardizing the entries from a wide variety
of published materials, the project demonstrated the cost-effectiveness
of citation indexing as opposed to the expense of traditional subject
indexing processes.
While, at the time of the project's completion, the government sponsors chose
not to subsidize the development of a national citation database, Eugene Garfield
was encouraged to move ahead with the private publication of his multidisciplinary
citation index as the first edition of the Science Citation Index® (SCI®).
Available for purchase since 1963, the SCI was then, and remains, the most comprehensive
citation index to the scientific journal literature. Today, the Web-based version
of that index covers 5,600 journals across more than 150 scientific disciplines.
Garfield's achievement lay in establishing the utility and objectivity of
a citation index in pulling up related papers in published literature that at
first glance might not have seemed pertinent to the researcher's inquiry. Today,
the SCI is considered one of the most reliable resources for tracing the
development of an idea across the multitude of disciplines that are part of
our body of scientific knowledge.