Classification of web resources using user generated terms

Margaret E.I. Kipp, Inkyung Choi, Soohyung Joo

Research output: Contribution to conference, Paper, peer-review

Abstract

In this study, we propose a method for classifying web resources based on social tags generated by users. We examined whether social tags can serve as a tool for classifying websites within a specific domain. We applied two statistical methods, principal component analysis (PCA) and hierarchical clustering, to classify websites in the domain of consumer health information. First, PCA was applied to identify the dimensions underlying the selected websites. Six dimensions were extracted: women, seniors, kids/parenting, drugs, men, and research. Second, we conducted a hierarchical clustering analysis to group similar websites at different hierarchical levels. Together, the two methods show that social tags represent the characteristics of individual health information websites well. The methodological implication is that social tags can be used to classify resources on the Web automatically.
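The two-step analysis described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the site names, tags, and counts are invented toy data, the PCA step extracts only the first principal component via power iteration (the study reports six dimensions), and the clustering step uses cosine distance with average linkage, which the abstract does not specify.

```python
import math

# Hypothetical toy data (not from the study): tag counts per website.
TAGS = ["pregnancy", "women", "drugs", "pharmacy"]
SITES = {
    "site_a": [5, 4, 0, 0],
    "site_b": [4, 5, 1, 0],
    "site_c": [0, 0, 5, 4],
    "site_d": [0, 1, 4, 5],
}

def center(rows):
    """Subtract the column mean from every value."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iters=100):
    """Approximate the leading eigenvector by power iteration."""
    v = [1.0] + [0.0] * (len(cov) - 1)
    for _ in range(iters):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def agglomerate(vectors, k):
    """Average-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[name] for name in vectors]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [cosine_distance(vectors[a], vectors[b])
                         for a in clusters[i] for b in clusters[j]]
                avg = sum(dists) / len(dists)
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Step 1: project each site onto the first principal component.
centered = center(list(SITES.values()))
component = first_component(covariance(centered))
scores = {name: sum(x * c for x, c in zip(row, component))
          for name, row in zip(SITES, centered)}

# Step 2: group sites with similar tag profiles.
clusters = agglomerate(SITES, k=2)

print(scores)
print(clusters)
```

In this toy setting, sites tagged mostly with women's health terms fall on one side of the first component and sites tagged with drug terms on the other, and the clustering step recovers the same two groups. A real analysis would more likely rely on an established statistics package and retain several components.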
Original language: English (US)
Number of pages: 8
State: Published - 2013
Externally published: Yes
Event: IFLA WLIC 2013: Future Libraries: Infinite Possibilities - Singapore
Duration: Aug 17 2013 to Aug 23 2013

Conference

Conference: IFLA WLIC 2013
Country: Singapore
Period: 8/17/13 to 8/23/13
