Abstract
Corporate adverse events are an important resource for investors who analyze business stability and predict future performance. However, reports of adverse events are often scattered across different news media and are difficult to recognize. In this paper, we introduce a classification framework to recognize adverse events. We also propose a novel under-sampling method, based on clustering majority-class instances, to deal with the imbalanced data issue. The framework and the under-sampling method are tested on a sample of manually labelled news articles collected for S&P 500 companies. Our experimental results show that both the framework and the under-sampling method are effective in classifying the imbalanced data and outperform three baseline methods. The proposed framework can also be applied to other text classification areas.
Original language | English (US) |
---|---|
State | Published - 2014 |
Externally published | Yes |
Event | 24th Annual Workshop on Information Technologies and Systems: Value Creation from Innovative Technologies, WITS 2014 - Auckland, New Zealand; Duration: Dec 17 2014 → Dec 19 2014 |
Conference
Conference | 24th Annual Workshop on Information Technologies and Systems: Value Creation from Innovative Technologies, WITS 2014 |
---|---|
Country/Territory | New Zealand |
City | Auckland |
Period | 12/17/14 → 12/19/14 |
Keywords
- Adverse events
- Classification
- Imbalanced data
- Text mining
- Under-sampling
ASJC Scopus subject areas
- Computer Networks and Communications
- Information Systems
- Computer Science Applications