Muğurtay, Nihat and Şirin, Kaan Güray and Heshmatnajafabad, Mehrdad and Kahya, Ahmet Taha and Yılmaz, Fazlı Göktuğ and Zouzou, Yasser and Bahçeci, Batuhan and Demir, Ayça and Tosun, Doğukan and Müftüler-Baç, Meltem and Varol, Onur (2025) Quantifying global foreign affairs with a multimodal dataset of diplomatic websites. Scientific Data, 12 (1). ISSN 2052-4463
Official URL: https://dx.doi.org/10.1038/s41597-025-06334-5
Abstract
This research introduces a global dataset of diplomatic news and images compiled from the official webpages of ministries of foreign affairs and chief executive offices across 156 countries spanning over 20 years. The collection provides over 1.16 million news articles and 1.18 million associated images. Our research initially shows how web scraping and Natural Language Processing (NLP) tools enhance labor-saving, novel data acquisition and processing methods. First, we extracted named entities for people, countries, and organizations mentioned in diplomatic texts. Second, GlobalDiplomacyNET processes and analyzes images published on diplomatic webpages, capturing governments’ image-sharing practices. This textual and visual information together provides substantial information on countries’ news-sharing habits, geographical and multilateral attention, visual assertiveness, and gender representation. GlobalDiplomacyNET is the first of its kind, offering a global corpus of textual and visual data that support novel research directions particularly in international relations and political science.
| Item Type: | Article |
|---|---|
| Divisions: | Center of Excellence in Data Analytics; Faculty of Arts and Social Sciences; Faculty of Engineering and Natural Sciences |
| Depositing User: | Nihat Muğurtay |
| Date Deposited: | 23 Mar 2026 10:55 |
| Last Modified: | 23 Mar 2026 10:55 |
| URI: | https://research.sabanciuniv.edu/id/eprint/53602 |

