Dealing with High Dimensional Sentiment Data Using Gradient Boosting Machines

Vasileios Athanasiou; Manolis Maragoudakis

doi:10.1007/978-3-319-44944-9_42

Conference paper, 2016


Vasileios Athanasiou, University of the Aegean
Manolis Maragoudakis, University of the Aegean

Abstract

Sentiment analysis, i.e. predicting the sentiment of a given document, is one of the most common classification tasks applied to textual information. With the widespread use of social media and internet applications such as e-commerce, e-tourism and e-government, numerous comments and opinions are posted every day, so an automatic way of analyzing them is of great importance. The present paper focuses on sentiment analysis for Greek texts obtained from Web 2.0 platforms. Greek lacks readily available natural language processing tools, in the sense that most existing ones are not publicly available. The novelty of the article is that, instead of relying on preprocessing tools such as Part-of-Speech taggers, text stemmers and polar-word lexica, it incorporates the translation of each Greek token as provided by the Google Translator® API. Although automatic translation of whole Greek sentences often yields poor results in which the meaning of the original sentence is severely degraded, the translation of each token individually is almost 100% correct. However, including the translation of every Greek token poses a significant problem for practically any classifier, because it yields a very large number of feature dimensions. We therefore introduce the use of a powerful ensemble algorithm that is highly customizable to the particular needs of the application, for example by being trained with respect to different loss functions, and that can cope with a large number of dimensions. This algorithm is Gradient Boosting Machines, and experimental results support our claim that it outperforms other well-known machine learning techniques by a significant margin on our task.
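As a rough illustration of the idea in the abstract, the following sketch trains a tiny gradient boosting model on sparse bag-of-words features. It is not the authors' implementation: the toy documents are hypothetical English tokens standing in for per-token translations, the weak learners are single-token decision stumps, the leaf values are plain residual means rather than a line-search step, and all function names are invented for this example. The `neg_gradient` parameter shows where a different loss function could be plugged in.

```python
import math

# Hypothetical English tokens standing in for token-by-token translations.
train_docs = [
    "very good hotel excellent service",
    "excellent food good price",
    "good room friendly staff",
    "bad hotel terrible service",
    "terrible food bad price",
    "dirty room rude staff",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = positive sentiment, 0 = negative

def build_vocab(docs):
    """Map every distinct token to a feature index."""
    tokens = sorted({tok for doc in docs for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}

def vectorize(doc, vocab):
    """Sparse representation: the set of feature indices present in the document."""
    return {vocab[tok] for tok in doc.lower().split() if tok in vocab}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_neg_gradient(y, score):
    """Negative gradient of the logistic loss with respect to the raw score."""
    return y - sigmoid(score)

def fit_stump(rows, residuals, n_features):
    """Pick the token whose presence/absence split best fits the residuals."""
    n, total = len(rows), sum(residuals)
    sums, counts = [0.0] * n_features, [0] * n_features
    for row, r in zip(rows, residuals):
        for j in row:
            sums[j] += r
            counts[j] += 1
    best = None
    for j in range(n_features):
        c1, c0 = counts[j], n - counts[j]
        if c1 == 0 or c0 == 0:
            continue  # token present in all or no documents: no split possible
        s1, s0 = sums[j], total - sums[j]
        gain = s1 * s1 / c1 + s0 * s0 / c0  # squared-error reduction, up to a constant
        if best is None or gain > best[0]:
            best = (gain, j, s1 / c1, s0 / c0)
    return None if best is None else best[1:]

def fit_gbm(rows, ys, n_features, n_rounds=50, learning_rate=0.3,
            neg_gradient=logistic_neg_gradient):
    """Fit an additive model of stumps, each fitted to the current negative gradient."""
    pos_rate = sum(ys) / len(ys)
    base = math.log(pos_rate / (1.0 - pos_rate))  # initial log-odds
    scores = [base] * len(rows)
    stumps = []
    for _ in range(n_rounds):
        residuals = [neg_gradient(y, s) for y, s in zip(ys, scores)]
        stump = fit_stump(rows, residuals, n_features)
        if stump is None:
            break
        j, v_present, v_absent = stump
        stumps.append(stump)
        scores = [s + learning_rate * (v_present if j in row else v_absent)
                  for s, row in zip(scores, rows)]
    return base, learning_rate, stumps

def predict_proba(model, row):
    """Probability of positive sentiment for one sparse document."""
    base, learning_rate, stumps = model
    score = base + learning_rate * sum(
        v_present if j in row else v_absent for j, v_present, v_absent in stumps)
    return sigmoid(score)

vocab = build_vocab(train_docs)
rows = [vectorize(doc, vocab) for doc in train_docs]
model = fit_gbm(rows, labels, len(vocab))

for doc in ["good service", "rude staff"]:
    print(doc, round(predict_proba(model, vectorize(doc, vocab)), 3))
```

Keeping each document as a set of present-token indices means the cost of a boosting round grows with the number of tokens actually observed, not with vocabulary size times document count, which is one simple way to keep a high-dimensional token space manageable.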

Keywords

Gradient Boosting Machines; Sentiment analysis; High-dimensional data; Modern Greek

Domains

Computer Science [cs]

Main file

430537_1_En_42_Chapter.pdf (563.41 KB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-01557615

Submitted on: Thursday, July 6, 2017, 1:55:17 PM

Last modification on: Tuesday, September 22, 2020, 1:38:06 PM

Long-term archiving on: Wednesday, January 24, 2018, 2:11:29 AM

Dates and versions

hal-01557615 , version 1 (06-07-2017)

Licence

Attribution

Identifiers

HAL Id : hal-01557615 , version 1
DOI : 10.1007/978-3-319-44944-9_42

Cite

Vasileios Athanasiou, Manolis Maragoudakis. Dealing with High Dimensional Sentiment Data Using Gradient Boosting Machines. 12th IFIP International Conference on Artificial Intelligence Applications and Innovations (AIAI), Sep 2016, Thessaloniki, Greece. pp.481-489, ⟨10.1007/978-3-319-44944-9_42⟩. ⟨hal-01557615⟩


Collections

IFIP IFIP-AICT IFIP-TC IFIP-WG IFIP-TC12 IFIP-AIAI IFIP-WG12-5 IFIP-AICT-475
