Grand Challenges:

VCIP 2025 Grand Challenge on Live Broadcasting Video Quality Assessment

Challenge Introduction:

With the rise of live broadcasting services, users have increasingly high expectations of video quality. The great variation in videographic skill, shooting environments, photographic apparatus, and compression and processing protocols gives rise to very complicated impairments in live broadcasting videos, which can adversely impact the quality of experience (QoE) of end users. The complexity of distortions in live broadcasting videos, together with the fact that user experience is subjective and hard to quantify and measure, poses challenges to QoE-based live video quality assessment (VQA). To address this challenge, we have built a large VQA database for live broadcasting with associated QoE scores. It consists of 1013 videos with a carefully selected range of distortion types and intensities, focusing on the complicated impairments induced by live broadcasting services. The paper can be found here.

Our dataset fills the gap in publicly available datasets for studying the comprehensive effects of distortions in live video. We also conducted a subjective experiment; analysis of the collected subjective data shows that existing methods perform unsatisfactorily on the proposed database, indicating that further work is needed to properly measure the characteristics of the specific distortions in live broadcasting videos. This competition invites participants to benchmark their models on our publicly available database.

Challenge Significance:

The competition aims to foster innovation in both subjective and objective VQA techniques tailored to live broadcasting videos, addressing the unique challenges posed by live streaming impairments while emphasizing the evaluation of QoE. It also encourages the exploration of novel perceptual metrics that integrate QoE evaluation with learning-based approaches, extending beyond traditional distortion-focused methods. The competition is highly relevant to visual communication and video processing, as it supports the development of practical algorithms for real-world scenarios such as live video streaming, content recommendation, mobile video capturing, and video enhancement. By advancing the state of the art in live video quality assessment, this challenge will contribute to optimizing visual quality and QoE for live streaming platform users.

Submission Requirements:

Participants are required to follow the steps below to submit their results:

A. Results

  • (1) Process all video files and store the predicted scores in a structured JSON output file named result.json. The output should preserve the original video processing order through sequential array positioning, with each entry containing:
  • video_name: Original filename (string)
  • scores: Corresponding predicted score (float)
  • (2) To ensure order consistency:
  • Maintain the native file sequence from the processing pipeline
  • Implement explicit sorting with Python's sorted() before JSON serialization, as in the example below:
[
  {"video_name": "video_001.mp4", "scores": 0.92},
  {"video_name": "video_002.mp4", "scores": 0.88},
  {"video_name": "video_003.mp4", "scores": 0.95}
]
  • Technical guidance:
    • Use json.dump() with the indent=4 parameter for human-readable formatting
    • Validate the JSON structure with jsonlint before submission
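
  A minimal export sketch in Python, assuming the released test videos sit in a local test_videos/ directory and using a hypothetical predict_score() placeholder for your own model (neither name is prescribed by the challenge):

    import json
    import os

    def predict_score(video_path):
        """Hypothetical stand-in for your VQA model's inference."""
        return 0.0  # replace with an actual predicted quality score

    test_dir = "test_videos"  # assumed location of the released test videos
    results = []
    for video_name in sorted(os.listdir(test_dir)):  # explicit sorting, as required
        score = predict_score(os.path.join(test_dir, video_name))
        results.append({"video_name": video_name, "scores": float(score)})

    with open("result.json", "w") as f:
        json.dump(results, f, indent=4)  # indent=4 for human-readable formatting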

    B. Computational Complexity Analysis

  • (1) Execution Time Measurement:
    • Per-video processing time (ms)
    • Total pipeline duration (seconds)
    • Time complexity trend analysis (linear/polynomial/exponential)
  • (2) Temporal Statistics Example:
  • Time Metrics:
    - Average processing time per frame: 34.2 ± 2.8 ms
    - Total execution duration: 12m 45s
    - Complexity scaling factor: O(n^1.1) observed
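
  A minimal sketch of how such timing statistics could be gathered with time.perf_counter(); process_video() and the file list are hypothetical placeholders for your own pipeline:

    import statistics
    import time

    def process_video(video_path):
        """Hypothetical stand-in for the full per-video pipeline."""
        return 0.0

    video_paths = ["video_001.mp4", "video_002.mp4", "video_003.mp4"]  # dummy list

    per_video_ms = []
    total_start = time.perf_counter()
    for path in video_paths:
        start = time.perf_counter()
        process_video(path)
        per_video_ms.append((time.perf_counter() - start) * 1000.0)
    total_seconds = time.perf_counter() - total_start

    print(f"Average time per video: {statistics.mean(per_video_ms):.1f} "
          f"± {statistics.stdev(per_video_ms):.1f} ms")
    print(f"Total pipeline duration: {total_seconds:.2f} s")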
    

    C. Environment

    Hardware Configuration:
    - CPU: Intel Xeon Platinum 8480CL @ 2.4GHz (64 cores)
    - GPU: NVIDIA A100 80GB PCIe
    - Memory: 512GB DDR5 ECC
    - Storage: NVMe SSD RAID-0 Array
    
    Software Environment:
    - OS: Ubuntu 22.04 LTS
    - Python: 3.10.12
    - CUDA: 12.2
    - Dependencies:
      • OpenCV 4.8.0
      • PyTorch 2.1.1+cu121
      • NumPy 1.26.2
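
    The listing above is an example configuration; your report should describe the machine you actually used. One possible way to capture these details automatically (assuming nvidia-smi and lscpu are on the PATH, as on the Ubuntu setup above):

    import platform
    import subprocess

    lines = [
        f"OS: {platform.platform()}",
        f"Python: {platform.python_version()}",
    ]
    try:
        import torch  # optional: only if PyTorch is part of your pipeline
        lines.append(f"PyTorch: {torch.__version__} (CUDA {torch.version.cuda})")
    except ImportError:
        pass

    # Hardware fingerprints via the tools suggested in section D
    for cmd in (["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
                ["lscpu"]):
        try:
            lines.append(subprocess.run(cmd, capture_output=True, text=True).stdout)
        except FileNotFoundError:
            lines.append(f"{cmd[0]}: not available on this machine")

    with open("benchmark_report.txt", "a") as f:
        f.write("\n".join(lines) + "\n")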
    

    D. Benchmarking Information

    [Benchmark Parameters]
    Batch Size: 16  
    Precision: FP16  
    Warmup Iterations: 100  
    Measurement Iterations: 1000  
    
  • Implementation Guidance:
    • Use Python's time.perf_counter() for microsecond-level measurements
    • Record environment details via pip freeze > requirements.txt
    • Include hardware fingerprints using nvidia-smi and lscpu outputs
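
    A hedged benchmarking sketch that follows the parameters above (FP16, batch size 16, 100 warmup and 1000 measurement iterations); the tiny Conv2d is a hypothetical stand-in for your own network, and FP16 is applied on GPU only:

    import time
    import torch

    WARMUP, ITERS = 100, 1000
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32  # FP16 on GPU
    model = torch.nn.Conv2d(3, 8, 3).to(device=device, dtype=dtype).eval()
    batch = torch.randn(16, 3, 224, 224, device=device, dtype=dtype)  # batch size 16

    with torch.no_grad():
        for _ in range(WARMUP):       # warmup iterations
            model(batch)
        if device == "cuda":
            torch.cuda.synchronize()  # flush queued GPU work before timing
        start = time.perf_counter()
        for _ in range(ITERS):        # measurement iterations
            model(batch)
        if device == "cuda":
            torch.cuda.synchronize()
        elapsed = time.perf_counter() - start

    print(f"Mean latency: {elapsed / ITERS * 1000:.3f} ms per batch")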

    E. Summary

    • (1) Store the predicted scores in a structured JSON file named result.json according to requirement A
    • (2) Store computational metrics and system configurations in a standardized text file benchmark_report.txt using human-readable key-value pairs according to requirements B, C, and D
    • (3) (Optional) Write a paper in the VCIP paper template describing the VQA method designed for this challenge. It should give a detailed description of the proposed evaluation scheme and the computational complexity analysis.
    • (4) Create and send a ZIP package named submission.zip containing result.json, benchmark_report.txt, and the research paper (if any); a packaging sketch follows this list
    • (5) The dataset can be downloaded from Baidu Netdisk (code: p63p) (all data is stored as videos and identified by "id" in train/test.xlsx)
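
    A packaging sketch for item (4), assuming the optional paper is exported as paper.pdf (a hypothetical filename):

    import os
    import zipfile

    files = ["result.json", "benchmark_report.txt", "paper.pdf"]  # paper is optional

    with zipfile.ZipFile("submission.zip", "w", zipfile.ZIP_DEFLATED) as zf:
        for name in files:
            if os.path.exists(name):  # skip the optional paper if it is absent
                zf.write(name)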

    Important Dates:

    • 2025.04.20: Release of train and validation data
    • 2025.05.25: Release of final test data
    • 2025.06.04: Test output results submission deadline
    • 2025.06.05: Code submission deadline
    • 2025.06.15: Winner announcement
    • 2025.06.30: Challenge paper submission deadline

    Evaluation:

    The evaluation consists of comparing the predictions with the reference ground truth, i.e., the Mean Opinion Scores (MOS). We will assess the solutions based on the Pearson Linear Correlation Coefficient (PLCC) and Spearman’s Rank-order Correlation Coefficient (SRCC).

    1. Pearson Linear Correlation Coefficient (PLCC)

    Pearson’s linear correlation coefficient (PLCC) is a measure of the linear correlation between the subjective scores and the mapped objective scores, reflecting the prediction accuracy:

    \( PLCC = \frac{\sum_{i=1}^{N} (s_i - \bar{s})(f_i - \bar{f})}{\sqrt{\sum_{i=1}^{N} (s_i - \bar{s})^2 \sum_{i=1}^{N} (f_i - \bar{f})^2}} \)

    Where:

    • \( s_i \) and \( \bar{s} \) are the \(i\)-th subjective score and the mean of all \(s_i\).
    • \( f_i \) and \( \bar{f} \) are the \(i\)-th objective score after the non-linear mapping and the mean of all \(f_i\).

    2. Spearman’s Rank-order Correlation Coefficient (SRCC)

    Spearman’s Rank-order Correlation Coefficient (SRCC) computes the prediction monotonicity and indicates how well the relationship between subjective and objective quality can be depicted by a monotonic function:

    \( SRCC = 1 - \frac{6 \sum_{i=1}^{N} d_i^2}{N(N^2 - 1)} \)

    Where:

    • \( N \) represents the size of the testing dataset.
    • \( d_i \) is the rank difference of the \(i\)-th video’s subjective and objective scores.
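
    Both metrics are available off the shelf; a small sketch using scipy.stats on dummy arrays (the official evaluation may additionally apply the non-linear mapping mentioned above before computing PLCC):

    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    mos = np.array([55.2, 63.1, 47.8, 70.4, 58.9])   # subjective scores s_i (dummy)
    pred = np.array([0.52, 0.61, 0.45, 0.72, 0.55])  # objective scores f_i (dummy)

    plcc, _ = pearsonr(mos, pred)    # prediction accuracy (linear correlation)
    srcc, _ = spearmanr(mos, pred)   # prediction monotonicity (rank correlation)
    print(f"PLCC = {plcc:.4f}, SRCC = {srcc:.4f}")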

    Organizer 1

    • Name: Pengfei Chen
    • Unit/Institution: Xidian University
    • Email: chenpengfei@xidian.edu.cn
    • Homepage: Google Scholar Profile
    • Introduction: Pengfei Chen is a tenure-track Associate Professor at the School of Artificial Intelligence, Xidian University. He received his Bachelor's degree from Xidian University in 2014 and his Ph.D. from China University of Mining and Technology in 2022. His main research areas include image/video quality assessment, domain adaptation/domain generalization, and self-supervised learning. In recent years, leveraging the Ministry of Education Key Laboratory of Intelligent Perception and Image Understanding at Xidian University, he has conducted fruitful research in video quality assessment (VQA) and video Quality of Experience (QoE) evaluation. He has published over 20 papers in internationally renowned journals and conferences, such as IEEE TIP, ICCV, ACM MM, and Pattern Recognition, which have been cited more than 500 times according to Google Scholar. His innovative contributions have been successfully applied in various commercial products and in the defense and security sectors. Pengfei currently serves as a reviewer for several academic journals and conferences, including IEEE TIP, TMM, TCSVT, CVPR, ICCV, ACM MM, and AAAI.

    Organizer 2

    • Name: Leida Li
    • Unit/Institution: Xidian University
    • Homepage: Leida Li's Homepage
    • Email: ldli@xidian.edu.cn
    • Introduction: Leida Li received the B.E. and Ph.D. degrees from Xidian University in 2004 and 2009, respectively. From Feb. to Jun. 2008, he was a research associate at Kaohsiung University of Science and Technology, Taiwan. From Jan. 2014 to Jan. 2015, he was a Research Fellow with the Rapid-rich Object SEarch (ROSE) Lab, School of Electrical and Electronic Engineering, Nanyang Technological University (NTU), Singapore, working with Prof. Alex C. Kot and Prof. Weisi Lin. From Jul. 2016 to Jul. 2017, he was a Senior Research Fellow with the ROSE Lab, NTU, Singapore. From July 2009 to June 2019, he worked as a Lecturer, Associate Professor, and Professor in the School of Information and Control Engineering, China University of Mining and Technology, China. Currently, he is a Full Professor with the School of Artificial Intelligence, Xidian University, China. He is an Associate Editor of IEEE Transactions on Image Processing (TIP) and the Journal of Visual Communication and Image Representation (JVCI Best Associate Editor Award 2021/2023), and a Young Associate Editor of the Journal of Image and Graphics (Excellent Editor Award 2022). His research interests include image processing and recognition, multimedia quality assessment, information hiding, and image forensics. He has published more than 100 papers in these areas with 8000+ citations. He is a senior member of IEEE/CCF/CSIG.

    Other Organizers