2024 | OriginalPaper | Book chapter

UCoD: Ensemble BERT for Hierarchical Classification of the Urdu Disinformation Corpus

Authors: Umar Farooq, Omer Beg, Faisal Riaz, Saeid Jamali, William Holderbaum, Umar Raza

Published in: The Second International Adaptive and Sustainable Science, Engineering and Technology Conference

Publisher: Springer Nature Switzerland

Abstract

Online disinformation poses a growing threat, requiring fact-checking and detection/prevention measures. To address this, we propose a hierarchical classification approach using the DistilBERT and XLM-RoBERTa ensemble architectures on the Urdu Corpus of Disinformation (UCoD). Our ensemble outperforms other models like RNNs, LSTMs, k-nearest neighbors, random forests, and quadratic discriminant analysis, achieving a weighted F1 of 68.7 on UCoD. These results confirm the advantage of ensembles for imbalanced corpora, supporting the use of deep learning techniques in combating disinformation.
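
The abstract reports a weighted F1 score for an ensemble of two transformer models but does not say how the models' outputs are combined. As a minimal sketch, assuming soft voting (averaging the class probabilities of the two models) and the usual support-weighted definition of F1, the computation could look like the following. The function names `soft_vote` and `weighted_f1` are hypothetical, and the inputs are plain lists of per-class probabilities rather than real model outputs.

```python
from collections import Counter


def soft_vote(probs_a, probs_b):
    """Average two models' class-probability rows and return argmax labels.

    Assumption: soft voting. The source does not state the combination rule.
    """
    preds = []
    for row_a, row_b in zip(probs_a, probs_b):
        avg = [(a + b) / 2 for a, b in zip(row_a, row_b)]
        # Index of the highest averaged probability is the predicted class.
        preds.append(max(range(len(avg)), key=avg.__getitem__))
    return preds


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)


# Illustrative toy data only; these are not values from the paper.
y_true = [0, 0, 0, 1, 1, 2]
probs_model_a = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.4, 0.5, 0.1],
                 [0.2, 0.7, 0.1], [0.5, 0.4, 0.1], [0.1, 0.2, 0.7]]
probs_model_b = [[0.8, 0.1, 0.1], [0.5, 0.4, 0.1], [0.5, 0.3, 0.2],
                 [0.3, 0.6, 0.1], [0.3, 0.6, 0.1], [0.2, 0.3, 0.5]]

ensemble_pred = soft_vote(probs_model_a, probs_model_b)
print(weighted_f1(y_true, ensemble_pred))
```

Weighting each class's F1 by its support means frequent classes dominate the score, which is worth keeping in mind when reading a single weighted F1 figure for an imbalanced corpus.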

Title: UCoD: Ensemble BERT for Hierarchical Classification of the Urdu Disinformation Corpus
Authors: Umar Farooq, Omer Beg, Faisal Riaz, Saeid Jamali, William Holderbaum, Umar Raza
Publisher: Springer Nature Switzerland
Book: The Second International Adaptive and Sustainable Science, Engineering and Technology Conference
Print ISBN: 978-3-031-53934-3
Electronic ISBN: 978-3-031-53935-0
Copyright year: 2024
DOI: https://doi.org/10.1007/978-3-031-53935-0_25
