Mining access patterns efficiently from web logs

Jian Pei, Jiawei Han, Behzad Mortazavi-Asl, Hua Zhu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With the explosive growth of data available on the World Wide Web, discovery and analysis of useful information from the World Wide Web have become a practical necessity. A Web access pattern, which is a sequence of accesses frequently pursued by users, is a kind of interesting and useful knowledge in practice. In this paper, we study the problem of mining access patterns from Web logs efficiently. A novel data structure, called the Web access pattern tree, or WAP-tree for short, is developed for efficient mining of access patterns from pieces of logs. The Web access pattern tree stores highly compressed, critical information for access pattern mining and facilitates the development of novel algorithms for mining access patterns in large sets of log pieces. Our algorithm can find access patterns from Web logs quite efficiently. The experimental and performance studies show that our method is in general an order of magnitude faster than conventional methods.
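
The abstract does not spell out how the WAP-tree is built, so the following is only a rough illustration of the general idea of compressing log pieces by sharing common prefixes. It is a plain counted prefix tree, not the paper's WAP-tree or its mining algorithm; the names `Node`, `build_prefix_tree`, `count_nodes`, and the sample `log_pieces` are all hypothetical and chosen for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a counted prefix tree over access events (illustrative only)."""
    label: str
    count: int = 0
    children: dict = field(default_factory=dict)


def build_prefix_tree(sequences):
    """Insert each access sequence; sequences with a shared prefix reuse nodes."""
    root = Node(label="<root>")
    for seq in sequences:
        node = root
        for event in seq:
            child = node.children.get(event)
            if child is None:
                child = Node(label=event)
                node.children[event] = child
            child.count += 1  # how many sequences pass through this prefix
            node = child
    return root


def count_nodes(node):
    """Number of nodes strictly below the given node."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical log pieces: each list is one user's ordered page accesses.
log_pieces = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "b", "c"],
    ["a", "e"],
]

tree = build_prefix_tree(log_pieces)
total_events = sum(len(seq) for seq in log_pieces)
stored_nodes = count_nodes(tree)
print(f"{total_events} logged events stored in {stored_nodes} tree nodes")
```

In this toy input the four sequences share the prefix "a", "b", so the tree needs fewer nodes than there are logged events, while the per-node counts still record how often each prefix occurred. How the actual WAP-tree selects events and links nodes for mining is described in the paper itself, not here.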

Original language: English (US)
Title of host publication: Knowledge Discovery and Data Mining
Subtitle of host publication: Current Issues and New Applications - 4th Pacific-Asia Conference, PAKDD 2000, Proceedings
Editors: Takao Terano, Huan Liu, Arbee L.P. Chen
Publisher: Springer-Verlag Berlin Heidelberg
Pages: 396-407
Number of pages: 12
ISBN (Print): 3540673822, 9783540673828
DOIs
State: Published - 2000
Externally published: Yes
Event: 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2000 - Kyoto, Japan
Duration: Apr 18 2000 - Apr 20 2000

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1805
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2000
Country: Japan
City: Kyoto
Period: 4/18/00 - 4/20/00

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)
