During the breast-feeding period, infants undergo remarkable changes, including rapid physiological and developmental growth. However, little is known about gene expression features and sex-specific gene expression in breast-feeding infants. In this study, we sequenced 32 blood transcriptomes from 16 breast-feeding rhesus macaque (Macaca mulatta) infants and their lactating mothers. We identified 218 differentially expressed genes (DEGs) between infants and mothers, including 91 up-regulated and 127 down-regulated DEGs in the infant group. Functional enrichment analysis of the up-regulated DEGs and unique hub genes in infants showed primary enrichment in immunity, growth, and development. Protein-protein interaction analysis also revealed that genes at key positions in infants were mainly related to development and immunity. However, we only detected 23 DEGs between female and male infants, including three DEGs located on chromosome X and 14 DEGs located on chromosome Y. Of these DEGs, TMF1 regulated nuclear protein 1 (Trnp1), which was highly expressed in female infants, is crucial for controlling the tangential and radial expansion of the cerebral cortex in mammals. Thus, our study provides novel insight into the gene expression features of breast-feeding infants in non-human primates (NHPs) and reveals sex-specific gene expression between these infants.
During the breast-feeding period, breast milk provides the first source of immune protection for infants and also activates the development of an infant’s own immune system (Rogier et al., 2014). Moreover, breast-feeding infants also experience significant changes in growth, metabolism, and neurobehavioral development during this period (Feldman & Eidelman, 2003; Shoji & Shimizu, 2019). Therefore, the expression levels of genes related to such functions or pathways in infants may show significant differences to that found in adults.
Although male and female genomes share much of the same genetic information, previous studies have found sex-specific gene expression in various species and tissues (Gershoni & Pietrokovski, 2017; Ronen & Benvenisty, 2014; Tower, 2017; Villa et al., 2018). For example, males and females exhibit differences in brain anatomy and development (Collaer & Hines, 1995; Giedd et al., 1997), women consistently exhibit greater longevity than men (Tower, 2017), and sex differences are pervasive in metabolic and cardiovascular traits (Chella Krishnan et al., 2018).
Rhesus macaques exhibit a close evolutionary relationship to humans and are thus widely used in biomedical studies (Rhesus Macaque Genome Sequencing and Analysis Consortium et al., 2007). They are also considered an ideal animal model for human infant nutritional research (Lönnerdal, 2012). To date, however, little is known about the gene expression features in breast-feeding rhesus macaques, or the differences between infants and their mothers. Furthermore, whether breast-feeding female and male infants exhibit differences in gene expression remains unclear. In this study, we sequenced the blood transcriptomes of 16 breast-feeding rhesus macaque infants (eight males and eight females) and their lactating mothers to determine the: (1) transcriptional profiles of these infants; (2) gene expression differences between infants and their mothers; and (3) sex-specific gene expression in breast-feeding infants.
Peripheral whole blood samples were collected during a routine examination of 32 captive rhesus macaques in September 2019. All libraries were sequenced using the Illumina NovaSeq 6000 platform with a paired-end sequencing length of 150 bp (PE150) at Novogene (China). The raw data were submitted to the NCBI Gene Expression Omnibus (GEO) under accession PRJNA599261. The NGS QC Toolkit v2.3.3 (Patel & Jain, 2012) and HISAT2 v2.1.0 (Kim et al., 2015) were used for quality control and read mapping, respectively, with StringTie v1.3.6 (Pertea et al., 2015) then used to assemble transcriptomes and obtain raw read counts for each gene and transcript. Based on the expression values, we used the ComplexHeatmap R package (Gu et al., 2016) to perform cluster analysis and the DESeq2 R package (Love et al., 2014) to perform differential expression analysis. Weighted gene co-expression network analysis (WGCNA) (Langfelder & Horvath, 2008) was used to construct gene co-expression modules and identify hub genes. We performed gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses using g:Profiler (Raudvere et al., 2019) to functionally classify DEGs. Finally, protein-protein interaction networks of hub genes and DEGs were constructed based on the STRING protein interaction database (Szklarczyk et al., 2017). Cytoscape (Shannon et al., 2003) was used to visualize the protein-protein interaction network for hub genes and DEGs and determine key regulator (driver) genes.
Deep RNA-seq was performed using the whole peripheral blood samples. From these samples, approximately 955 million 150 bp paired-end raw reads were acquired. After removing adaptor sequences and discarding low-quality reads, we obtained 921 million clean reads. We aligned the clean reads to the reference genome (rhesus macaque, MMUL_10) separately with an average mapping rate of 96.47%, indicating that most paired reads were aligned correctly (Supplementary Table S1). After removing low expression genes and transcripts, the reads were assembled into 18 053 known genes and 33 324 annotated transcripts.
The RNA-seq data were firstly normalized to reduce the influence of technical noise (Supplementary Figure S1). We then performed principal component analysis (PCA) of gene expression in the 32 samples. Results showed that infants and mothers were clearly clustered into two separate groups based on the first principal component (PC1), which explained 42% of the variance. However, there was no significant grouping by sex in the infants (Supplementary Figure S2). Hierarchical cluster analysis based on inter-array correlation also showed distinct clustering of these two groups (Supplementary Figure S3).
In total, 218 DEGs were identified between the infant and mother groups, after adjusting for confounding factors such as sex. Among these DEGs, 91 were up-regulated and 127 were down-regulated in the infant group compared with the mother group. In addition, we conducted hierarchical cluster analysis of DEGs across the 32 samples. The up-regulated genes clustered into one group and the down-regulated genes clustered into another group. Similarly, infants clustered into one group and mothers clustered into another group (Figure 1A), thus highlighting gene expression differences between infants and their mothers.
Figure 1. Heat map and enrichment analysis of DEGs and weighted gene co-expression network analysis (WGCNA)
To gain insight into the biological roles of DEGs, we performed GO category and KEGG pathway enrichment analyses of up-regulated and down-regulated DEGs separately. In GO enrichment analysis, the up-regulated DEGs in infants were mainly enriched in categories associated with the immune system, such as immune response (GO: 0006955) and defense response (GO: 0006952). Several growth and development-related categories were also enriched in the up-regulated DEGs in infants, such as replacement ossification (GO: 0036075), endochondral ossification (GO: 0001958), and skeletal system morphogenesis (GO: 0048705) (Figure 1B, Supplementary Table S2). KEGG enrichment analysis showed that the up-regulated genes in infants were significantly enriched in eight pathways (Supplementary Figure S4). The down-regulated DEGs in infants did not show significant enrichment in any GO term and were only significantly enriched in the complement and coagulation cascades pathway in KEGG analysis.
We applied WGCNA to gain further insight into gene expression distribution in infants and their mothers (Langfelder & Horvath, 2008). After merging modules with high similarity, we ranked all genes according to the median absolute deviation (MAD) value and selected the first 5 000 as representative genes, which were then divided into 11 modules (Figure 1C), three of which showed significant module-trait relationships (Supplementary Figure S5). The modules with the highest correlations in the infant and mother groups were selected as the key module of each group. As a result, 586 hub genes were identified in the blue module (mothers) and 193 hub genes were identified in the green module (infants), with 105 of them shared between the blue and green modules. Therefore, 88 unique hub genes were identified in infants. We performed GO and KEGG enrichment analyses with these unique hub genes. Results showed that they were mainly enriched in categories related to differentiation of immune cells and immune system development, such as lymphocyte activation (GO: 0046649), B cell differentiation (GO: 0030183), T cell tolerance induction (GO: 0002517), and immune system development (GO: 0002520). Several gene expression and translation-related categories were also enriched, including gene expression (GO: 0010467), rRNA processing (GO: 0006364), ribosome biogenesis (GO: 0042254), peptide biosynthetic process (GO: 0043043), and translation (GO: 0006412). KEGG analysis only highlighted two enriched pathways: i.e., ribosome (KEGG: 03010) and ribosome biogenesis in eukaryotes (KEGG: 03008) (Supplementary Table S3).
We further performed protein-protein interaction analysis of the 193 hub genes and 91 up-regulated DEGs in infants and then reconstructed their network structure. A total of 452 interactions among 203 genes were extracted from the STRING database. Several important genes were located at key positions of the interaction network (Figure 1D). For example, the DEGs within the network with the most connecting lines and highest combined score were related to development and immunity, including elastase, neutrophil expressed gene (Elane), dihydrofolate reductase gene (Dhfr), and matrix metallopeptidase 13 gene (Mmp13). The most important hub genes within the network, such as fibrillarin (Fbl) and BOP1 ribosomal biogenesis factor (Bop1), are related to the ribosome biogenesis (Strezoska et al., 2000; El Hassouni, 2019), and tumor protein P53 (Tp53) gene is the most frequent mutations in human cancers (Olivier et al., 2010). The protein coded by Elane plays a role in degenerative and inflammatory diseases through proteolysis of collagen-IV and elastin (Horwitz et al., 2013; Korkmaz et al., 2010). The protein coded by Dhfr plays a key role in cell growth and proliferation (Jensen et al., 1997; Xu et al., 2007). The gene Mmp13 encodes a member of the peptidase M10 family of matrix metalloproteinases (MMPs), which plays a role in wound healing, tissue remodeling, cartilage degradation, bone development, bone mineralization, and ossification (Ståhle-Bäckdahl et al., 1997; Toriseva et al., 2012; Wang et al., 2013).
In conclusion, the examined infants are still undergoing remarkable developmental changes. Many identified up-regulated DEGs and hub genes in infants were related to the immune system, growth, and development, with higher gene expression levels than that found in adults. These results highlight the gene expression features of infants during the breast-feeding period.
We directly compared gene expression differences between male and female infants to explore the influence of sex (Table 1). As a result, 23 DEGs were identified, including 17 genes located on the sex chromosomes and six genes located on autosomes. The limited number of DEGs suggests that gene expression differences between male and female infants during the breast-feeding period are small.
Interestingly, one DEG between the sexes was the brain development-related gene, TMF1 regulated nuclear protein 1 (Trnp1). This gene was also identified as a DEG between infants and mothers, with down-regulation in infants. Stahl et al. (2013) identified the DNA-associated protein TRNP1 as a regulator of cerebral cortex expansion in both tangential and radial expansion, two ways in which to increase the size of the cerebral cortex and enhance the evolution of the mammalian brain. They also found that high TRNP1 levels in the brain could promote neural stem cell self-renewal and tangential expansion, while lower levels could promote radial expansion. Thus, Trnp1, which encodes the protein TRNP1 and is expressed in both brain and blood (Genecards, 2020; NCBI, 2020), is crucial for control of tangential and radial expansion of the cerebral cortex in mammals (Stahl et al., 2013). To sum up, Trnp1 exhibited differential expression between female and male infants, which may indicate differences in brain development between the sexes in infant rhesus macaques.
Gene ID Gene name Chromosome Log2 fold change Padj ENSMMUG00000000192 MAP7D2 chrX 2.840 292 6.79E–25 ENSMMUG00000041274 ENSMMUG00000041274 chrY –22.469 8 4.23E–05 ENSMMUG00000045017 KDM5D chrY –13.276 2.57E–70 ENSMMUG00000046378 ZFY chrY –12.224 4 6.01E–56 ENSMMUG00000054467 ENSMMUG00000054467 chrY –7.683 34 6.10E–10 ENSMMUG00000042046 C14H11orf87 chr14 7.377 826 0.002 683 ENSMMUG00000045991 USP9Y chrY –13.284 9 1.72E–69 ENSMMUG00000038182 UTY chrY –12.551 6 9.93E–62 ENSMMUG00000059200 ENSMMUG00000059200 chr13 6.556 07 0.004 637 ENSMMUG00000043966 DDX3Y chrY –14.122 4 1.06E–78 ENSMMUG00000049458 ENSMMUG00000049458 chrY –9.074 96 4.46E–26 ENSMMUG00000060008 LY6D chr8 4.700 807 0.028 138 7 ENSMMUG00000049951 ENSMMUG00000049951 chrX 9.939 943 2.84E–34 ENSMMUG00000043715 EIF1AY chrY –11.388 6 6.20E–15 ENSMMUG00000056331 PRKX chrX –4.430 09 2.69E–20 ENSMMUG00000021031 TRNP1 chr1 4.054 105 0.022 402 2 ENSMMUG00000061378 ENSMMUG00000061378 chrY –7.757 77 2.89E–12 ENSMMUG00000038824 RPS4Y2 chrY –13.9147 1.84E–77 ENSMMUG00000046495 TBL1Y chrY –10.186 4 2.25E–05 ENSMMUG00000055339 ENSMMUG00000055339 chr16 26.352 57 1.62E–08 ENSMMUG00000038331 RPS4Y1 chrY –17.108 1 1.73E–117 ENSMMUG00000055357 ENSMMUG00000055357 chr10 –21.304 4 0.000 173 9 ENSMMUG00000049104 CYorf15A chrY –10.877 8 2.01E–44
Table 1. The DEGs between female and male infants
|ZR-2020-044 Supplementary Materials and Methods.doc|