Open Set Authorship Attribution Toward Demystifying Victorian Periodicals

Sarkhan Badirli, Mary Borgo Ton, Abdulmecit Gungor, Murat Dundar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Existing research in computational authorship attribution (AA) has primarily focused on attribution tasks with a limited number of authors in a closed-set configuration. This restricted setup is far from realistic for highly entangled real-world AA tasks that involve a large number of candidate authors at test time. In this paper, we study AA in historical texts using a new data set compiled from Victorian literature. We investigate the predictive capacity of the most common English words in distinguishing the writings of the most prominent Victorian novelists. We challenge the closed-set classification assumption and discuss the limitations of standard machine learning techniques in dealing with the open-set AA task. Our experiments suggest that a linear classifier can achieve near-perfect attribution accuracy under the closed-set assumption, yet the need for more robust approaches becomes evident once a large candidate pool has to be considered in the open-set classification setting.
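
The contrast the abstract draws between closed-set and open-set attribution can be illustrated with a minimal sketch. It is not the authors' method: the word list `COMMON_WORDS`, the cosine nearest-centroid scorer (standing in for the linear classifier mentioned above), and the rejection `threshold` are all assumptions made for illustration. A closed-set attributor must always name one of the known authors, while the open-set variant may answer "unknown" when no known author scores well enough.

```python
import math
from collections import Counter

# Hypothetical, tiny list of common English words; the vocabulary used in
# the paper is not given in this record.
COMMON_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "was", "i"]


def features(text):
    """Relative frequency of each common word in a whitespace-tokenized text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    return [counts[w] / total for w in COMMON_WORDS]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fit_centroids(labelled_texts):
    """Average the feature vectors of each author's training texts."""
    sums, n_texts = {}, Counter()
    for author, text in labelled_texts:
        acc = sums.setdefault(author, [0.0] * len(COMMON_WORDS))
        for i, x in enumerate(features(text)):
            acc[i] += x
        n_texts[author] += 1
    return {a: [x / n_texts[a] for x in acc] for a, acc in sums.items()}


def attribute(text, centroids, threshold=None):
    """Return the best-matching author.

    With threshold=None this is closed-set: some known author is always
    returned. With a threshold (an assumed, naive open-set baseline), texts
    whose best similarity falls below it are labelled "unknown".
    """
    f = features(text)
    best, score = max(
        ((a, cosine(f, c)) for a, c in centroids.items()), key=lambda p: p[1]
    )
    if threshold is not None and score < threshold:
        return "unknown"
    return best


# Toy training data with made-up author labels.
centroids = fit_centroids([
    ("author_a", "the the the of of and"),
    ("author_b", "i i i was was it"),
])
```

With these toy centroids, `attribute("that that that in in a", centroids)` is forced to pick one of the two known authors, whereas passing `threshold=0.5` lets the same text be rejected as "unknown". A fixed similarity threshold is only the simplest possible rejection rule and is used here purely to make the closed-set versus open-set distinction concrete.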

Original language: English (US)
Title of host publication: Document Analysis and Recognition – ICDAR 2021
Subtitle of host publication: 16th International Conference, Lausanne, Switzerland, September 5–10, 2021, Proceedings, Part IV
Editors: Josep Lladós, Daniel Lopresti, Seiichi Uchida
Publisher: Springer
Pages: 221-235
Number of pages: 15
ISBN (Print): 9783030863364
DOIs
State: Published - 2021
Event: 16th International Conference on Document Analysis and Recognition, ICDAR 2021 - Lausanne, Switzerland
Duration: Sep 5 2021 → Sep 10 2021

Publication series

Name: Lecture Notes in Computer Science
Volume: 12824
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Conference on Document Analysis and Recognition, ICDAR 2021
Country/Territory: Switzerland
City: Lausanne
Period: 9/5/21 → 9/10/21

Keywords

  • Author attribution
  • Open-set classification
  • Victorian literature

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
