Brands.Br – a Portuguese Reviews Corpus


The Brands.Br corpus was built from a fraction of the B2W-Reviews01 corpus: a set of 252 samples selected by B2W to be enriched. With Brands.Br we address two main challenges in product review corpora. First, a single customer review very often refers to several distinct things. To deal with this cross-topic problem, we add a new annotation layer that classifies the subject of the review. This field can be multi-label and covers 9 classes: Elogio, Reclamação, Dúvida, Solicitação, Indicação, Sugestão, Atendimento, Produto and Entrega (Compliment, Complaint, Doubt, Request, Indication, Suggestion, Service, Product and Delivery). The second challenge refers to unclassified brands. The annotations were produced with a semi-automatic method: they were performed using our proprietary software, and the samples were then manually revised by linguists to produce the gold standard. The Brands.Br corpus is freely available.
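As an illustration of the multi-label subject layer, the following minimal Python sketch encodes a review's subject annotation as a 9-position binary vector. The class names come from the list above; the function names `encode_subjects` and `decode_subjects` and the vector representation are assumptions made for illustration and do not describe the released file format or the annotation software.

```python
# The nine subject classes, in the order they are listed in the text.
LABELS = [
    "Elogio", "Reclamação", "Dúvida", "Solicitação", "Indicação",
    "Sugestão", "Atendimento", "Produto", "Entrega",
]


def encode_subjects(subjects):
    """Return a 0/1 vector with one position per class (hypothetical helper)."""
    unknown = set(subjects) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown subject labels: {sorted(unknown)}")
    return [1 if label in subjects else 0 for label in LABELS]


def decode_subjects(vector):
    """Inverse of encode_subjects: list the classes whose position is set."""
    if len(vector) != len(LABELS):
        raise ValueError("vector length must match the number of classes")
    return [label for label, flag in zip(LABELS, vector) if flag]


# A review that complains about delivery carries two labels at once.
example = encode_subjects(["Reclamação", "Entrega"])
print(example)
print(decode_subjects(example))
```

A binary vector is only one possible representation; it is chosen here because it maps directly onto common multi-label classifiers, where each position is predicted independently.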


