Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11851/8378
Title: TOBB ETU at CheckThat! 2021: Data engineering for detecting check-worthy claims
Authors: Zengin, M.S.
Kartal, Y.S.
Kutlu, M.
Keywords: Check worthiness
Data engineering
Fact checking
Computer aided language translation
Cross-lingual
Data augmentation
Machine translations
Transformer models
Turkish
Under-sampling
Learning to rank
Publisher: CEUR-WS
Abstract: In this paper, we present our participation in CLEF 2021 CheckThat! Lab's Task 1 on check-worthiness estimation in tweets. We explore how to fine-tune transformer models effectively by changing the training set. The methods we explore include language-specific training, weak supervision, data augmentation by machine translation, undersampling, and cross-lingual training. As our primary model submitted for official results, we fine-tune language-specific BERT-based models using cleaned tweets for each language. Our models ranked 1st on the Spanish and Turkish datasets. However, they ranked 6th, 4th, and 10th on the Arabic, Bulgarian, and English datasets, respectively. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Description: 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021 -- 21 September 2021 through 24 September 2021 -- 171327
URI: https://hdl.handle.net/20.500.11851/8378
ISSN: 1613-0073
Appears in Collections: Bilgisayar Mühendisliği Bölümü / Department of Computer Engineering
Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.