Data on the Russian invasion of Ukraine available in near-real time
Written by Morgan Sherburne
To track and share data on violent events unfolding in Ukraine, a University of Michigan researcher has developed a platform that collects this information in near-real time.
Political scientist Yuri Zhukov has launched VIINA: Violent Incident Information from News Articles on the 2022 Russian Invasion of Ukraine. VIINA is a near-real-time, multisource event data system for the invasion.
“I wanted to make these data available immediately because media sites in both countries are already being shut down, due to either censorship (in Russia) or military operations (in Ukraine),” said Zhukov, associate professor of political science and research associate professor at the Institute for Social Research’s Center for Political Studies. “It is thus essential that researchers have access to information about the war, as reported across media organizations and other actors in the information space.”
Because different media cover different types of events, VIINA's multisource approach aims to capture a more accurate picture of events as they unfold. The platform gives researchers access to data drawn from Ukrainian and Russian news reports, which are geocoded and classified into standard conflict event categories through machine learning.
VIINA is freely available for use by students, journalists, policymakers and researchers. Using an automated web scraping routine that runs every six hours, VIINA extracts the text of news reports published by each source, along with their associated metadata, including publication date, time and URL. GIS-ready data can be downloaded from VIINA, with temporal precision down to the minute.
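For readers who want a concrete picture of the workflow described above, the following is a minimal sketch in Python, not VIINA's actual code. It assumes a hypothetical record layout (`NewsReport`) and hypothetical helper names (`to_metadata`, `new_since_last_run`); only the six-hour interval, the metadata fields (date, time, URL) and the minute-level precision come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# From the article: the scraping routine runs every six hours.
SCRAPE_INTERVAL = timedelta(hours=6)


@dataclass(frozen=True)
class NewsReport:
    """Hypothetical record layout; VIINA's real field names may differ."""
    source: str
    url: str
    published: datetime  # timezone-aware publication timestamp
    text: str


def to_metadata(report: NewsReport) -> dict:
    """Split the timestamp into a date field and a minute-precision time field (UTC)."""
    stamp = report.published.astimezone(timezone.utc)
    return {
        "source": report.source,
        "url": report.url,
        "date": stamp.strftime("%Y-%m-%d"),
        "time": stamp.strftime("%H:%M"),  # seconds are dropped
    }


def new_since_last_run(reports: list, now: datetime) -> list:
    """Keep only the reports published in the six hours before `now`."""
    cutoff = now - SCRAPE_INTERVAL
    return [r for r in reports if cutoff < r.published <= now]
```

In this sketch, a scheduler would call `new_since_last_run` once per interval and pass each surviving report to `to_metadata`. Fetching the pages, geocoding and event classification are deliberately left out, since the article does not describe how they are implemented.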
VIINA draws on news reports from a variety of Ukrainian and Russian news providers. Data sources currently include news wires, TV stations, newspapers and online publications in both countries.
Zhukov plans to expand these sources as the conflict unfolds to include OSINT (open-source intelligence) social media feeds and other key sources. The set of sources may also change over the course of the war because of interruptions to journalistic activity from military operations, cyberattacks and state censorship, as well as the availability of new data from other information providers.