This essay was originally published in the Current Contents print editions
January 3, 1994
Introduction
This is my first Current Contents® (CC®)
essay under the rubric of Citation Comments.
As discussed in last week's CC, this
new monthly feature will focus on the applications
of the Institute for Scientific Information®'s
(now Thomson Scientific's) databases. 1
An appropriate
topic to launch this new series is perhaps the most
rudimentary — the basic concept of
citation indexing.
To start, it is important to clarify the terminological distinction between
"citation" and "reference". In his classic book Little Science, Big Science,
Derek Price gave a clear definition of both terms. He said: "It seems to me
a great pity to waste a good technical term by using the words citation
and reference interchangeably. I therefore propose and adopt the
convention that if Paper R contains a bibliographic footnote using and describing
Paper C, then R contains a reference to C, and C has a citation from
R. The number of references a paper has is measured by the number of items in
its bibliography as endnotes, footnotes, etc., while the number of citations
a paper has is found by looking it up [in a] citation index and seeing how many
other papers mention it." 2 (p. 284)
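Price's convention can be made concrete with a small sketch. The Python below is illustrative only: the paper labels (R1, C1, and so on) and the function name build_citation_index are hypothetical and do not describe any actual database. It counts references directly from each bibliography and obtains citation counts by inverting those bibliographies, which is the lookup a citation index performs.

```python
from collections import defaultdict

# Hypothetical data: each citing paper maps to the items in its bibliography.
references = {
    "R1": ["C1", "C2"],
    "R2": ["C1"],
    "R3": ["C1", "C2", "R1"],
}


def build_citation_index(references):
    """Invert the reference lists: cited paper -> sorted list of citing papers."""
    index = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            index[cited].add(citing)
    return {cited: sorted(citers) for cited, citers in index.items()}


citation_index = build_citation_index(references)

# References are counted from a paper's own bibliography ...
reference_counts = {paper: len(refs) for paper, refs in references.items()}
# ... while citations are counted by looking the paper up in the index.
citation_counts = {paper: len(citers) for paper, citers in citation_index.items()}
```

In this toy data, R3 has three references, while C1 has three citations (from R1, R2, and R3). The same paper can appear on both sides: R1 contains references and also receives a citation from R3.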
In a nutshell, citations symbolize the conceptual association of scientific
ideas as recognized by publishing research authors. 3
By the references they cite in their papers, authors make explicit linkages
between their current research and prior work in the archive of scientific literature.
These conceptual associations have been described by Robert Merton, Manfred
Kochen, and other scholars as intellectual transactions, formal acknowledgments
of "intellectual debt" to an earlier source of information. 4, 5
That is, explicit references imply that an author
has found useful a particular published theory, method, or other finding.
Thomson Scientific's databases index these intellectual
transactions by listing both the cited and
citing works.
(That is, the cited work
is a paper or book that has been mentioned in the
references of other works, while the citing work
is the one that contains the references.) The citation
indexes were originally designed primarily for information
retrieval. Mainly but not exclusively through citation
connections, the databases enable you to navigate
the literature in unique ways. As a result, you are
able to locate relevant papers independent of language,
title words, or author keywords. A variety of citation-based
search strategies are available, including bibliographic
coupling, or the linking of papers through
shared references (Related Records®), 6
KeyWords Plus®, 7, 8
and others.
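Bibliographic coupling, as described above, links papers through the references they share. The following Python is a minimal sketch under stated assumptions: the sample data, the function names coupling_strength and related_records, and the ranking by count of shared references are all hypothetical choices for illustration, not a description of how the Related Records feature is actually implemented.

```python
# Hypothetical data: paper -> the references in its bibliography.
sample_references = {
    "P1": ["A", "B", "C"],
    "P2": ["B", "C", "D"],
    "P3": ["C"],
    "P4": ["E"],
}


def coupling_strength(paper_a, paper_b, references):
    """Number of references that the two papers have in common."""
    return len(set(references[paper_a]) & set(references[paper_b]))


def related_records(paper, references):
    """Rank the other papers by how many references they share with `paper`.

    Papers with no shared references are omitted. Ties are broken
    alphabetically so that the output is deterministic.
    """
    ranked = []
    for other in references:
        if other == paper:
            continue
        shared = coupling_strength(paper, other, references)
        if shared:
            ranked.append((other, shared))
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


related_to_p1 = related_records("P1", sample_references)
```

Note that the coupled papers need not share any title words or keywords, and neither needs to cite the other; the link comes entirely from their common references.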
Unique Advantages of Citation Indexes
The Thomson Scientific databases differ from traditional indexing and abstracting
services in several ways. From the outset, the Science Citation Index®
(SCI®), Social Sciences Citation Index®
(SSCI®), and Arts & Humanities
Citation Index® (A&HCI®) have
been multidisciplinary. They cover virtually all disciplines whereas
traditional services are limited to a single field.
The advantages of a multidisciplinary index can be exemplified by the work
of Nobelist Harold C. Urey. Published in Science in 1962, his paper "Lifelike
forms in meteorites" described chemical compounds found in meteorites that,
under the right conditions, were essential to the formation of life on earth. 9
This paper deserved to be indexed in a variety
of single-discipline databases. But more importantly, citations to this paper
have appeared in a large variety of journals in astrophysics, biology, cosmology,
chemistry, earth sciences, geochemistry, and so on.
Thomson Scientific's indexes are also comprehensive, providing
complete coverage of all types of published source
items, not just original research papers, review
articles, and technical notes but also letters, corrections
and retractions, editorials, and other items. Thomson
Scientific studies have shown that these latter items
are important, have substantial impact, and provide
useful links to scientific issues and controversies.
As stated at the outset and perhaps most importantly,
Thomson Scientific uniquely indexes the references
cited by these source items. This gives you the ability
to perform
prospective as well as retrospective searches
of the literature. Like other indexes, Thomson Scientific's
databases allow you to move back in time
to locate previously published papers. But Thomson
Scientific's databases uniquely allow you to move forward in time—to determine who has subsequently cited
an earlier work. Thus, by starting with a single paper or book, you can identify
whatever additional papers have referred to it. And each retrieved paper, in
turn, may provide a new list of references with which to continue the citation
search.
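The backward and forward searches described above, and the way each retrieved paper can seed a further search, can be sketched in a few lines of Python. This is an illustration under assumptions: the paper identifiers, the function names (backward, forward, cycle), and the rounds parameter are hypothetical, and a real citation database would answer these lookups from a prebuilt index rather than by scanning every bibliography.

```python
# Hypothetical data: paper -> the references in its bibliography.
bibliographies = {
    "1962a": ["1950a"],
    "1965a": [],
    "1970a": ["1962a", "1965a"],
    "1980a": ["1970a"],
}


def backward(paper, references):
    """Retrospective search: the earlier works that `paper` refers to."""
    return sorted(references.get(paper, []))


def forward(paper, references):
    """Prospective search: the later works that cite `paper`."""
    return sorted(citing for citing, cited in references.items() if paper in cited)


def cycle(seed, references, rounds=1):
    """Repeat forward and backward steps, using each new paper as a fresh seed."""
    found = {seed}
    frontier = {seed}
    for _ in range(rounds):
        next_frontier = set()
        for paper in frontier:
            next_frontier.update(forward(paper, references))
            next_frontier.update(backward(paper, references))
        frontier = next_frontier - found  # keep only papers not seen before
        found |= frontier
    return sorted(found - {seed})


one_round = cycle("1962a", bibliographies, rounds=1)
two_rounds = cycle("1962a", bibliographies, rounds=2)
```

Starting from the single paper "1962a", one round retrieves the work it cites and the work citing it; a second round reaches papers that are two citation links away in either direction.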
Authoritative, Timely, In-Depth Access to the Literature
It is important to stress that the citation-based associations and connections
within the literature are made by authors themselves. Traditional indexes typically
rely on human subject specialists to categorize and describe papers, usually
using controlled vocabularies or thesauri.
A potential drawback of the latter method is illustrated in my early experience
in compiling a list of references on "general adaptation syndrome." Out of a
sample of papers published in a five-year period, 23 had cited Hans Selye's
primordial paper. 10 But even though all
23 were indexed in Index Medicus, not one was listed under the MeSH
heading, "Adaptation."
Another shortcoming of human indexing is that there is an inevitable delay
due to the time required to read or scan the papers and make subjective judgments
about relevant descriptors. In short, timeliness is reduced. In contrast, citation
indexing does not involve this type of analysis, which enables the SCI,
SSCI, and A&HCI to be virtually concurrent with the
literature.
In addition, due to the expense of human indexing,
traditional indexes limit the number of terms. But
in Thomson Scientific's citation indexes, all cited
references are indexed. Since the typical research
paper today contains from 25 to 35 references, the
resulting number of index entries is correspondingly
high. Indeed, citing papers provide useful indexing "statements" or descriptors through the papers
they cite.
Citations as Indexing Statements
Thanks to a suggestion by Chauncey Leake in the 1950s, I conducted a thorough
analysis of review articles and their cited references. By doing what today
would be called context analysis, I soon discovered that the sentences in the
review articles were actually detailed, descriptive indexing statements about
papers or books they cited.
Several years before ISI® (now
Thomson Scientific) was founded, this basic notion
was further developed with Robert L. Hayne when we
both were at Smith, Kline and French Labs in the
1950s. Through large test samples, we concluded that
the titles of papers cited in reviews and other articles
were sufficient to add useful descriptive words and
phrases to the citing paper. This was later confirmed
in studies by A. J. Harley, as Irv Sher and I recently
reported. 11, 12
In 1990, ISI (now Thomson Scientific) was able
to introduce this citation-based method of derivative
(algorithmic) subject indexing, called KeyWords
Plus®. 7, 8 In addition
to title words, author-supplied keywords, and/or
abstract words, KeyWords Plus supplies words
and phrases that enhance these other descriptors and
thereby improve retrievability. These KeyWords
Plus terms are derived from the titles of cited
papers, which have been algorithmically processed
to identify the most commonly recurring words and
phrases.
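The general idea of deriving descriptors from cited titles can be sketched as follows. This is not the actual KeyWords Plus algorithm, whose details are not given here; it is a simplified illustration that assumes a small stopword list, two-word phrases formed after stopwords are removed, and a threshold of at least two cited titles. The titles, the function names, and the parameters min_titles and top_n are all hypothetical.

```python
import re
from collections import Counter

# Assumed stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "with", "by"}


def tokenize(title):
    """Lowercase a title and keep its alphabetic words, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", title.lower()) if w not in STOPWORDS]


def derived_keywords(cited_titles, min_titles=2, top_n=5):
    """Return words and two-word phrases that recur across the cited titles."""
    counts = Counter()
    for title in cited_titles:
        words = tokenize(title)
        # Phrases are adjacent word pairs after stopword removal.
        phrases = [" ".join(pair) for pair in zip(words, words[1:])]
        # Count each term at most once per cited title.
        counts.update(set(words) | set(phrases))
    recurring = [(term, n) for term, n in counts.items() if n >= min_titles]
    # Most frequent first; among ties, phrases before single words, then A-Z.
    recurring.sort(key=lambda item: (-item[1], -len(item[0].split()), item[0]))
    return [term for term, _ in recurring[:top_n]]


# Made-up cited titles for illustration.
cited_titles = [
    "General adaptation syndrome",
    "Stress and the general adaptation syndrome",
    "Adaptation to stress in rats",
]

keywords = derived_keywords(cited_titles)
```

In this sketch the citing paper would gain terms such as "general adaptation" even if those words never appear in its own title, which is the kind of added retrievability the text describes.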
Conclusion
In the space available, it is not possible to stress all the innovative advantages
of citation indexing for information retrieval or to illustrate in detail the
variety of search strategies it makes possible. While future Citation
Comments will address these topics, it is perhaps more important to stress
here why scientists should get into the habit of literature searching.
One of the more obvious reasons is to avoid the unwitting duplication of research
and the wasted time, effort, and funds this involves.
For example, in 1964, John Martyn, Aslib Research Department, London, showed
how unintentional duplication is related to ignored or missed sources in the
literature. 13 He surveyed about 650 British scientists and asked
if they had later discovered information in the literature they wished they
had at the beginning of their projects. Twenty-two percent said yes and cited
245 specific instances. Of these, 18 percent involved unintentional research
duplication. And in 43 percent of these instances, the researchers felt that
time, money, or work was wasted.
I've always believed that authors should be held by journal editors to the
same "due diligence" standards required of inventors by patent offices. That
is, authors should formally assert and verify that their ideas are original
and do not replicate discoveries already reported in the archives. Consequently,
they should be required to acknowledge the "prior art" that directly or indirectly
influenced their research.
In my opinion, the problem begins with teaching. Too few colleges require
undergraduates to learn how to search the literature. But with proper mentoring,
students should come to graduate schools already conditioned to do "prior art"
searching—and practice these techniques throughout their careers, whether in
academia or industry.
Dr. Eugene Garfield
Founder and Chairman Emeritus, ISI
References
1. Garfield E. From Current Comments®
to Citation Comments: continuing a 31-year series of Current
Contents® essays with a new focus. Current Contents
(51/52):3-5, 20-27 December 1993.
2. Price D. J. D. Little science, big
science...and beyond. New York: Columbia University Press, 1986. 301 p.
3. Small H. G. Cited documents as concept symbols.
Soc. Stud. Sci. 8:327-40, 1978.
4. Merton R. K. Foreword. (Garfield E) Citation
indexing—its theory and application in science, technology, and the humanities.
Philadelphia: ISI Press®, 1983. p. vi.
5. Kochen M. How do we acknowledge intellectual
debts? J. Doc. 43:54-64, 1987.
6. Garfield E. Announcing the SCI®
Compact Disc Edition: CD-ROM gigabyte storage technology, novel software,
and bibliographic coupling make desktop research and discovery a reality.
Current Contents® (22):3-13, 30 May 1988. (Reprinted in:
Essays of an information scientist: science literacy, policy, evaluation, and
other essays. Philadelphia: ISI Press®, 1990. Vol. 11.
p. 160-70.)
7. ---------- KeyWords Plus®: ISI®'s
breakthrough retrieval method. Part 1. Expanding your searching power on
Current Contents on Diskette®. Current Contents
(32):5-9, 6 August 1990. (Reprinted in: Ibid., 1991. Vol. 13. p.
295-9.)
8. ---------- KeyWords Plus takes you beyond
title words. Part 2. Expanded journal coverage for Current Contents on
Diskette includes social and behavioral sciences. Current Contents
(33):5-9, 13 August 1990. (Reprinted in: Ibid., 1991. Vol. 13.
p. 300-4.)
9. Urey H. C. Lifelike forms in meteorites. Science
137:623-8, 1962.
10. Selye H. General adaptation syndrome.
J. Clin. Endocrinol. 6:117-230, 1946.
11. Gray W. A. & Harley A. J. Computer assisted
indexing. Inform. Storage Retrieval 7:167-74, 1971.
12. Garfield E. & Sher I. H. KeyWords
Plus—algorithmic derivative indexing. J. Amer. Soc. Inform. Sci.
44:298-9, 1993.
13. Martyn J. Unwitting duplication of research.
New Sci. 21:338, 1964.