Towards building a collection of web archiving research articles

Bibliographic details
Journal title:	Proceedings of the American Society for Information Science and Technology
Persons and corporate bodies:	Ayala, Brenda Reyes; Caragea, Cornelia
In:	Proceedings of the American Society for Information Science and Technology, 51, 2014, 1, pp. 1-5
Format:	E-Article
Language:	English
Published:	Wiley
Subjects:	Library and Information Sciences; Information Systems

Details
Abstract:	The field of Web Archiving exists in a fluid, fragmented, and heterogeneous state. Part of the problem is that this field is relatively new and its literature is scattered across a wide range of journal and conference venues. This makes the state of Web Archiving as a discipline particularly difficult to ascertain. This paper presents an approach to building a collection of articles about the subject. We begin with a small dataset of articles taken from a Web Archiving Bibliography and then proceed to expand it by crawling the Web and collecting additional documents. The crawled documents are then classified using machine learning classification techniques. We show that by extracting the documents’ titles and abstracts and representing them using the “bag of words” approach, we are able to accurately identify documents from the Web crawler as documents that are about Web Archiving. We also discuss our results in the context of Web Archiving as an emerging field.
Extent:	1-5
ISSN:	0044-7870 1550-8390
DOI:	10.1002/meet.2014.14505101150
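
Illustration
The abstract describes representing titles and abstracts as a "bag of words" and classifying crawled documents as being about Web Archiving or not. The sketch below shows one way such a pipeline could look. It is not the authors' implementation: the abstract does not name the classifier, so multinomial Naive Bayes is assumed here, and the training texts, label names, and the NaiveBayes class are invented for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count word occurrences; word order is discarded."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed classifier)."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)   # label -> number of training documents
        self.word_counts = {}               # label -> Counter of word frequencies
        for text, label in zip(texts, labels):
            self.word_counts.setdefault(label, Counter()).update(bag_of_words(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        bag = bag_of_words(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word, n in bag.items():
                if word in self.vocab:      # words never seen in training are ignored
                    score += n * math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical title/abstract snippets; not data from the paper.
train_texts = [
    "Preserving web pages in a national web archive",
    "Crawling strategies for web archiving and preservation",
    "Protein folding prediction with neural networks",
    "Soil moisture measurement in wheat fields",
]
train_labels = ["web_archiving", "web_archiving", "other", "other"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("Quality of archived web pages in a web archive")
print(prediction)
```

In practice the training set would be the seed bibliography plus negative examples, and each crawled document's title and abstract would be passed to predict.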