InteBOMB: Integrating generic object tracking and segmentation with pose estimation for animal behavior analysis

Hao Zhai; Hai-Yang Yan; Jing-Yuan Zhou; Jing Liu; Qi-Wei Xie; Li-Jun Shen; Xi Chen; Hua Han

doi:10.24272/j.issn.2095-8137.2024.268

Citation:

Hao Zhai, Hai-Yang Yan, Jing-Yuan Zhou, Jing Liu, Qi-Wei Xie, Li-Jun Shen, Xi Chen, Hua Han. 2025. InteBOMB: Integrating generic object tracking and segmentation with pose estimation for animal behavior analysis. Zoological Research, 46(2): 355-369. DOI: 10.24272/j.issn.2095-8137.2024.268

Abstract

Advancements in animal behavior quantification methods have driven the development of computational ethology, enabling fully automated behavior analysis. Existing multi-animal pose estimation workflows rely on tracking-by-detection frameworks for either bottom-up or top-down approaches, requiring retraining to accommodate diverse animal appearances. This study introduces InteBOMB, an integrated workflow that enhances top-down approaches by incorporating generic object tracking, eliminating the need for prior knowledge of target animals while maintaining broad generalizability. InteBOMB includes two key strategies for tracking and segmentation in laboratory environments and two techniques for pose estimation in natural settings. The “background enhancement” strategy optimizes foreground-background contrastive loss, generating more discriminative correlation maps. The “online proofreading” strategy stores human-in-the-loop long-term memory and dynamic short-term memory, enabling adaptive updates to object visual features. The “automated labeling suggestion” technique reuses the visual features saved during tracking to identify representative frames for training set labeling. Additionally, the “joint behavior analysis” technique integrates these features with multimodal data, expanding the latent space for behavior classification and clustering. To evaluate the framework, six datasets of mice and six datasets of non-human primates were compiled, covering laboratory and natural scenes. Benchmarking results demonstrated a 24% improvement in zero-shot generic tracking and a 21% enhancement in joint latent space performance across datasets, highlighting the effectiveness of this approach in robust, generalizable behavior analysis.
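The idea behind the "automated labeling suggestion" technique, picking representative frames from visual features saved during tracking, can be illustrated with a minimal sketch. The abstract does not state which selection rule InteBOMB uses, so the greedy farthest-point (k-center) rule below is an assumption chosen for simplicity, and the function name `suggest_frames` and its parameters are hypothetical. The input is assumed to be one feature vector per frame.

```python
import numpy as np


def suggest_frames(features, k):
    """Pick up to k frame indices whose features cover the feature space.

    Assumption: greedy farthest-point (k-center) selection; this is an
    illustrative stand-in, not the selection rule reported for InteBOMB.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    if n == 0 or k <= 0:
        return []
    k = min(k, n)

    # Seed with the frame closest to the mean feature, i.e. a "typical" frame.
    center = features.mean(axis=0)
    first = int(np.argmin(np.linalg.norm(features - center, axis=1)))
    chosen = [first]

    # dist[i] = distance from frame i to its nearest already-chosen frame.
    dist = np.linalg.norm(features - features[first], axis=1)

    while len(chosen) < k:
        nxt = int(np.argmax(dist))
        if dist[nxt] == 0.0:
            # Every remaining frame duplicates a chosen one; stop early.
            break
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(features - features[nxt], axis=1))
    return chosen


# Toy usage: six frames forming three visually distinct groups of two.
toy_features = np.array([
    [0.0, 0.0], [0.1, 0.0],    # group 0
    [5.0, 5.0], [5.1, 5.0],    # group 1
    [10.0, 0.0], [10.1, 0.0],  # group 2
])
picked = suggest_frames(toy_features, 3)
```

With three frames requested, the sketch returns one frame from each group, which is the behavior one would want when choosing frames to label: near-duplicate frames are skipped in favor of frames that look different from those already selected.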