Distribution Characteristics and Biological Function of Tandem Repeat Sequences in the Genomes of Different Organisms
Abstract
Tandem repeat sequences, also known as direct repeats, are repetitive sequences whose repeat units, mostly 1 to 200 bp long, are arranged head to tail. They are widely distributed in the genomes of eukaryotes and of some prokaryotes. At the whole-genome level, the abundance and distribution of repeat types (such as dinucleotide and trinucleotide repeats) differ among organisms. Similar variation is seen among repeat classes (such as the AT and AC classes), among chromosomes, and between coding and noncoding regions. These differences indicate that the origin and evolution of tandem repeat sequences are complex and may involve several mechanisms and factors. In addition, several problems hinder further study of tandem repeat sequences: the software used to analyze repeats, and the criteria used to decide what counts as a repeat sequence (length, copy number, and whether imperfect as well as perfect repeats are accepted), vary among researchers. To address these problems, six directions for future research are proposed: the study of tandem repeat sequences, the self-evolution relations of tandem repeat sequences, their evolutionary status at the whole-genome level, their biological function, the establishment of tandem repeat sequence databases, and their applications.
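To make the detection criteria mentioned above concrete, the following is a minimal Python sketch of a perfect tandem repeat scan. It is an illustration only, not the method of any particular study or tool: the function name `find_perfect_tandem_repeats` and the default thresholds (unit length 1 to 6 bp, at least 3 copies) are assumptions chosen for the example, and imperfect repeats are not detected. Changing these thresholds changes which sequences are reported, which is one reason results differ among researchers.

```python
import re


def find_perfect_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return a list of (start, unit, copies) for perfect head-to-tail repeats.

    All thresholds are illustrative assumptions, not published criteria.
    Only the letters A, C, G and T are matched; ambiguous bases such as N
    interrupt a repeat.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_copies - 1)
        # exact copies of itself, matched through the backreference \1.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves built from a shorter motif
            # (for example "ATAT" is "AT" twice), so that each repeat is
            # reported once under its shortest unit.
            reducible = any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if reducible:
                continue
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    example = "GGCATATATATCCAGCAGCAGTT"
    for start, unit, copies in find_perfect_tandem_repeats(example):
        print(start, unit, copies)
```

In the example sequence, the scan reports a dinucleotide repeat of the AT class and a trinucleotide CAG repeat. Raising `max_unit` toward 200 would cover the full unit-length range described above, at the cost of more regular-expression passes over the sequence.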