Please use this identifier to cite or link to this item: http://hdl.handle.net/10773/39596
Title: Predicting how much a consumer is willing to pay for a bottle of wine: dealing with data imbalance
Authors: Alonso, Hugo; Candeias, Teresa
Keywords: Wine; Classification; Data imbalance; Re-sampling; Learning methods; Predictive models
Issue Date: 2023
Publisher: SciTePress
Abstract: The wine industry has become increasingly important worldwide and is one of the most significant industries in Portugal. In a previous paper, the problem of predicting how much a Portuguese consumer is willing to pay for a bottle of wine was considered for the first time. The problem was treated as a multi-class ordinal classification task. Although we achieved good prediction results overall, it was difficult to identify the rare cases of consumers who are willing to pay for more expensive wines. We found that this was a direct consequence of data imbalance. Here, we therefore present a first attempt to deal with this issue, based on re-sampling strategies that balance the training data, namely random under-sampling, random over-sampling with replacement, and the synthetic minority over-sampling technique. We consider several learning methods and develop various predictive models. A comparative study is carried out, and its results highlight the importance of carefully choosing the re-sampling strategy and the learning method in order to obtain the best possible prediction results.
Peer review: yes
URI: http://hdl.handle.net/10773/39596
DOI: 10.5220/0012068800003541
ISSN: 2184-285X
Appears in Collections: CIDMA - Comunicações
SCG - Comunicações
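
The three re-sampling strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, the three price classes, and the simplified SMOTE-style interpolation (one random neighbour among the k nearest minority rows) are all assumptions made for illustration, using NumPy only.

```python
import numpy as np

def random_under_sample(X, y, rng):
    """Randomly drop rows until every class is as small as the rarest class."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=target, replace=False)
        for c in classes
    ])
    return X[keep], y[keep]

def random_over_sample(X, y, rng):
    """Randomly duplicate rows (with replacement) until every class
    is as large as the most frequent class."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for c, n in zip(classes, counts):
        idx = np.flatnonzero(y == c)
        extra = rng.choice(idx, size=target - n, replace=True)
        parts.append(np.concatenate([idx, extra]))
    keep = np.concatenate(parts)
    return X[keep], y[keep]

def smote_like(X, y, minority, n_new, k, rng):
    """Simplified SMOTE-style sampling: each synthetic row lies on the segment
    between a minority row and one of its k nearest minority neighbours."""
    Xm = X[y == minority]
    # Pairwise Euclidean distances among minority rows; a row is not its own neighbour.
    d = np.linalg.norm(Xm[:, None, :] - Xm[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, len(Xm), size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation weight in [0, 1)
    return Xm[base] + gap * (Xm[chosen] - Xm[base])

# Hypothetical toy training set: 3 price classes with 8, 3 and 1 examples.
rng = np.random.default_rng(0)
X = np.arange(24, dtype=float).reshape(12, 2)
y = np.array([0] * 8 + [1] * 3 + [2] * 1)

X_under, y_under = random_under_sample(X, y, rng)
X_over, y_over = random_over_sample(X, y, rng)
X_synth = smote_like(X, y, minority=1, n_new=5, k=2, rng=rng)
```

In line with the abstract, re-sampling of this kind would be applied to the training data only, so that the test data keep their original class distribution.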

Files in This Item:
File: DATA23.pdf
Size: 207.78 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.