Tuesday, June 14, 2005

Web: Broken links in online journals

I have just read a worrying article in the latest edition of the Journal of Computer-Mediated Communication, "Hyperlink Obsolescence in Scholarly Online Journals", by James Ho. In this article, the author examines three online scientific journals: the Journal of Computer-Mediated Communication, First Monday, and the Journal of Interactive Media in Education.

One of the first interesting observations to come out of the study is the high proportion of articles that contain hyperlinks. The author provides a somewhat complicated table, broken down by journal and by date, but I did the calculation myself by adding up all the figures: 76% of online articles contain hyperlinks. This is perfectly logical, since it is one of the main reasons for publishing online in the first place.

The author also calculated the proportion of broken links, and the result is extremely worrying. It comes to 49%, just under half, when all publications and dates are combined. Even more frightening is the fact that this proportion grows with the age of the article. Here too, I did the calculations myself by adding up the different journals. For articles published after 2000, the proportion of broken links is 33%, or one third, which is annoying enough in itself. However, it rises to 60% for articles published before 2000! What will the situation be like in 20 years’ time?

The article mentions several best practices that we would all do well to follow, and reminds us of the efforts made on persistent URLs, etc. But I fear these may be no more than pious hopes. Authors on the web have no control over what lies at the other end of a link, and I can’t see this situation changing any time soon. It seems to me that the only really valid solution is to make a local copy of the page linked to, while obviously keeping complete information about the source. With certain limitations, this would seem to fall under the definition of "fair use" within the framework of scientific publications, which is what should apply where copyright and intellectual property are concerned.
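
To make the idea of a local copy a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not something taken from James Ho's article: the function name `archive_snapshot`, the file layout and the metadata fields are all hypothetical, and the page content is assumed to have been downloaded already.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def archive_snapshot(url: str, content: bytes, archive_dir: Path) -> Path:
    """Store a local copy of a linked page together with its source details.

    Hypothetical helper: the layout (one .html file plus one .json record,
    both named after the SHA-256 of the content) is an assumption made for
    this sketch, not an established convention.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)

    # Name the copy after a hash of its content, so the same page saved
    # twice ends up in the same file.
    digest = hashlib.sha256(content).hexdigest()
    page_path = archive_dir / f"{digest}.html"
    page_path.write_bytes(content)

    # Keep complete information about the source next to the copy.
    record = {
        "source_url": url,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "sha256": digest,
        "local_copy": page_path.name,
    }
    meta_path = archive_dir / f"{digest}.json"
    meta_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return meta_path
```

The content could come from something like `urllib.request.urlopen(url).read()` in the standard library. The point of the sketch is simply that the copy and the record of where and when it was obtained are stored side by side, so the reference stays verifiable even after the original link has died.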

This may just be wishful thinking on my part. In any case, the scientific community will certainly need to face up to the question soon with the move from paper publication to electronic publication, which will probably increase over the years. Bibliographical references are the cornerstone of modern scientific endeavour. They are necessary in order to give credit to those who have gone before, as well as allowing readers to verify or contradict the authors’ claims by going back to the sources …

It is true that these healthy scientific practices are being applied less and less often. I am frequently enraged to see publications limit themselves to a horizon of five years -- as if nothing ever existed before the beginning of the author’s thesis (or the defense of their advisor's thesis). I am also enraged to see the number of approximate or inappropriate references -- ones that are patently second-hand, with the author not even taking the trouble to check what the sources really said. Publish or perish, indeed.

Frankly, I can’t help but think that if the catastrophic proportion of broken links revealed to us by James Ho hasn’t created more of an outcry, it can only be because the inclusion of references is becoming a kind of simple social ritual that must be followed in order to be published, but which in the end no-one really cares about -- not the authors themselves, nor their readers, and least of all referees, who never have enough time to go and check them out.

French novelist, poet, playwright and essayist Georges Perec (who once wrote an entire novel without using the letter e) provides a wonderful caricature of this state of affairs in his Cantatrix Sopranica. I would like to thank my friend Benoît Habert for letting me know that this work is now available on the Internet [en] [fr]: it hasn’t aged a bit and is still absolutely hilarious (although some puns in the citations may be lost in translation). Let’s not forget that Perec was a librarian at the CNRS (the French National Centre for Scientific Research), which must have helped a bit!

