Text Co-Detection in Multi-View Scene
Published in IEEE Transactions on Image Processing, 2020
Multi-view scene analysis has been widely explored in computer vision and has numerous practical applications. Texts in multi-view scenes are usually detected by applying an existing single-image text detection method to each image, which ignores the multi-view correspondence constraint. Multi-view correspondences carry structure and location information that is unavailable in a single image, and they can help overcome difficulties caused by factors such as occlusion and perspective distortion. Our text co-detection method uses visual and geometrical correspondences simultaneously: the visual correspondences explore texts with high pairwise representation similarity, and the geometrical correspondences guide the discovery of texts that have geometrical correspondences. To guarantee pairwise consistency among multiple images, we additionally incorporate a cycle consistency constraint, which ensures that text correspondences are aligned across the image set. Finally, the text correspondence is represented by a permutation matrix and solved under positive semidefinite and low-rank constraints. We also collect a new text co-detection dataset consisting of multi-view image groups captured from the same scene under different photographing conditions.
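The link between cycle consistency and the positive semidefinite, low-rank constraints can be illustrated with a small numerical sketch. It is not the solver from the paper. It assumes, for simplicity, that every image contains the same n texts, and all names (`random_permutation_matrix`, `build_joint_matrix`, `is_cycle_consistent`) are hypothetical. Each image i gets a permutation matrix A_i mapping its texts to a shared ordering, and the pairwise correspondence is P_ij = A_i A_j^T. Stacking all P_ij into one block matrix X then gives a matrix that is positive semidefinite with rank n.

```python
import numpy as np

def random_permutation_matrix(n, rng):
    """Return an n x n permutation matrix (identity with shuffled rows)."""
    return np.eye(n, dtype=int)[rng.permutation(n)]

def build_joint_matrix(maps):
    """Stack the per-image maps A_i and return X = A A^T.

    The (i, j) block of X is the pairwise correspondence P_ij = A_i A_j^T.
    """
    stacked = np.vstack(maps)
    return stacked @ stacked.T

def is_cycle_consistent(X, n, m):
    """Check P_ik == P_ij @ P_jk for every triple of images (i, j, k)."""
    def block(i, j):
        return X[i * n:(i + 1) * n, j * n:(j + 1) * n]
    return all(
        np.array_equal(block(i, k), block(i, j) @ block(j, k))
        for i in range(m) for j in range(m) for k in range(m)
    )

rng = np.random.default_rng(0)
n, m = 4, 3  # assumed: 4 texts per image, 3 images
maps = [random_permutation_matrix(n, rng) for _ in range(m)]
X = build_joint_matrix(maps)

rank = np.linalg.matrix_rank(X)        # equals n for consistent correspondences
min_eig = np.linalg.eigvalsh(X).min()  # non-negative up to rounding error
print(is_cycle_consistent(X, n, m), rank, min_eig)
```

In this toy setting the implication runs from consistent correspondences to a positive semidefinite, rank-n joint matrix, which is the structure that a relaxation of the permutation constraints can exploit. Real image sets involve missed or occluded texts, which this sketch does not model.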
Recommended citation: Chuan Wang, Huazhu Fu, Liang Yang, Xiaochun Cao: Text Co-Detection in Multi-View Scene. IEEE Trans. Image Process. 29: 4627-4642 (2020).