Citation: Chuxi Li, Zifan Xiao, Yerong Li, Zhinan Chen, Xun Ji, Yiqun Liu, Shufei Feng, Zhen Zhang, Kaiming Zhang, Jianfeng Feng, Trevor W. Robbins, Shisheng Xiong, Yongchang Chen, Xiao Xiao. Deep learning-based activity recognition and fine motor identification using 2D skeletons of cynomolgus monkeys. Zoological Research, 2023, 44(5): 967-980. doi: 10.24272/j.issn.2095-8137.2022.449