STVD “large-Scale TV Dataset”
The LIFAT laboratory (Tours, France) has published a dataset named STVD “large-Scale TV Dataset” for the computer vision research community. STVD is designed for evaluating the performance of partial video copy detection (PVCD) methods. The goal of PVCD is to find one or more segments of a reference video that have transformed copies. The dataset is now publicly available (website link below) under an intellectual property agreement / terms of use.
STVD is the largest public dataset for the PVCD task. It comprises about 83,000 videos with a total duration of more than 10,000 hours, and includes more than 420,000 video copy pairs. It offers different test sets for fine-grained performance characterization (frame degradation, global transformation, video speeding, etc.), with frame-level annotation for real-time detection and video alignment. Baseline comparisons have been reported and show room for improvement. More information about the STVD dataset can be found in the publications [1, 2].
[1] V.H. Le, M. Delalandre and D. Conte. A large-Scale TV Dataset
for partial video copy detection. International Conference on Image
Analysis and Processing (ICIAP), Lecture Notes in Computer Science (LNCS),
vol. 13233, pp. 388–399, 2022.
[2] V.H. Le, M. Delalandre and D. Conte. Une large base de données
pour la détection de segments de vidéos TV. Journées Francophones des Jeunes Chercheurs en Vision par Ordinateur (ORASIS), 2021.

STVD-FC “large-Scale TV Dataset – Fact Checking”
The LIFAT laboratory (Tours, France) has published a dataset named STVD-FC “large-Scale TV Dataset – Fact Checking”. This dataset is dedicated to French political content analysis and fact-checking, with a particular focus on the 2022 French presidential election. It is publicly available (link below) under an intellectual property agreement / terms of use.
STVD-FC is the largest public dataset for political content analysis and fact-checking tasks. It consists of more than 1,200 fact-checked claims, scraped from a fact-checking service, with associated metadata. For the video counterpart, the dataset contains nearly 6,730 TV programs with a total duration of 6,540 hours, also with metadata. These programs were collected during the 2022 French presidential election using a dedicated workstation and protocol. The dataset is delivered in several parts, with proper indexes, to make its 2 TB of data accessible. More information about the STVD-FC dataset can be found in the publication [1].
[1] F. Rayar, M. Delalandre and V.H. Le. A large-scale TV video and
metadata database for French political content analysis and fact-checking.
Conference on Content-Based Multimedia Indexing (CBMI), pp. 181–185, 2022.
NB: For all benchmarks and software, please see here: