Publications

Adaptive Feature Learning for Unbiased Scene Graph Generation [PDF]

Published in IEEE TIP, 2024

Scene Graph Generation (SGG) aims to detect all objects and identify their pairwise relationships in the scene. Recently, tremendous progress has been made in exploring better context relationship representations. Previous work mainly focuses on contextual information aggregation and uses de-biasing strategies on samples to eliminate the preference for head predicates. However, there remain challenges caused by indeterminate feature training. Overlooking the label confusion problem in feature training easily results in a messy feature distribution among the confused categories, thereby affecting the prediction of predicates. To alleviate the aforementioned problem, in this paper, we focus on enhancing predicate representation learning. In this project, we provide an effective and general solution, termed AMP-BiC, from the view of feature learning for complex scene graph understanding. AMP-BiC simultaneously achieves both the discriminated information propagation and aggregation during message passing and the de-confusion and de-bias during training.</p>

Focusing on Flexible Masks: A Novel Framework for Panoptic Scene Graph Generation with Relation Constraints[PDF]

Published in ACM MM, 2023

Panoptic Scene Graph Generation (PSG) presents pixel-wise instance detection and localization, leading to comprehensive and precise scene graphs. To alleviate insufficient training caused by incomplete segmentation and proposal extraction, we construct a novel twostage framework for the PSG problem. In the training phase, we design a proposal matching strategy, which replaces deterministic segmentation results with proposals extracted from the off-theshelf panoptic head for label alignment, thereby ensuring the allmatching of training samples. In the inference phase, we present an innovative concept of employing relation predictions to constrain segmentation and design a relation-constrained segmentation algorithm. The algorithm recovers more valid instances and predicts more complete scene graphs by reconstructing the process of generating segmentation results from proposals using predicted relation results.

Download here

A DATABASE FOR MULTI-MODAL SHORT VIDEO QUALITY ASSESSMENT [PDF]

Published in ICASSP, 2023

The short video has gained increasing attention in information sharing and commercial promotions due to the fast development of social platforms. Accompanying, it introduces great requirements for assessing the quality of short videos for efficient information acquirement and propagation. However, existing video quality assessment researches focus on assessing video content with five rating scores, limiting the assessment to a one-dimension and simplified criterion. In this paper, we establish a novel database dubbed MMSVD-Douyin for assessing multi-modal short video quality under consideration of three evaluation criteria. It includes 4,684 short videos, three kinds of modalities, six kinds of data formats, and three assessment criteria. To conduct the short video quality assessment, we set up an all-around multi-modal short video quality assessment benchmark (MulSVQA) that dynamically fuses representations from three modalities and produces numbers of ”likes”, ”shares” and ”comments” of short videos.

Download here

Why Do Attributes Propagate in Graph Convolutional Neural Networks?

Published in AAAI, 2021

In this paper, we aim at providing a novel explanation to the question of “Why do attributes propagate in GCNNs?”, which not only gives the essence of the oversmoothing, but also illustrates why the GCN extensions, including multiscale GCN and GCN with initial residual, can improve the performance.

Text Co-Detection in Multi-View Scene

Published in IEEE Transactions on Image Processing, 2020

We address the corresponding text detection task and propose a novel text co-detection method to identify the co-occurring texts among multi-view scene images with compositions of detection and correspondence under large environmental variations.