Christian J. Michel and Ahmed Ahmed
A trinucleotide circular code is a set of trinucleotides that allows the reading frame of a gene to be retrieved locally, i.e. anywhere in the gene and in particular without a start codon, and automatically, with a window of a few nucleotides. In 1996, a common circular code X was identified simultaneously in two large populations of eukaryotic and prokaryotic genes. The method proposed here identifies periodic signals of this code X in the two frameshift types (+1 and -1) of both eukaryotic and prokaryotic frameshift genes. As predicted by circular code theory, the modulo 3 signals of the circular code shift in the same direction as the translational frameshift. Finally, in 68% of the frameshift genes in the RECODE 2 database, the frameshift type (+1 or -1) is identified automatically using only this circular code periodic signal. This circular code information constitutes a new structural property of frameshift genes. It may be used directly or in association with existing methods to identify frameshift genes in genomes and their encoded proteins.
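To make the idea of local frame retrieval concrete, the following minimal Python sketch counts how many trinucleotides of X occur in each of the three frames of a sequence and reports the frame with the highest count. It is an illustration only, not the method of the paper: the names `X`, `frame_counts` and `best_frame` are hypothetical, the simple majority rule is an assumption, and the 20 trinucleotides are given as X is usually listed in the circular code literature and should be checked against the original 1996 publication.

```python
# Assumption: the 20 trinucleotides of the circular code X as usually listed
# in the literature; verify against the original 1996 publication before use.
X = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)


def frame_counts(seq):
    """Count the trinucleotides of X found in each frame (0, 1, 2) of seq."""
    seq = seq.upper()
    counts = [0, 0, 0]
    for i in range(len(seq) - 2):
        if seq[i:i + 3] in X:
            counts[i % 3] += 1  # frame = start position modulo 3
    return counts


def best_frame(seq):
    """Return the frame with the most X trinucleotides (illustrative rule)."""
    counts = frame_counts(seq)
    return max(range(3), key=lambda f: counts[f])


# Toy sequence built only from words of X, read in frame 0.
toy = "GAAGATGTCAACCTG"
print(frame_counts(toy), best_frame(toy))
# Prepending one nucleotide moves the X signal to frame 1.
print(frame_counts("C" + toy), best_frame("C" + toy))
```

In this toy case the frame with the highest count moves from 0 to 1 when a single nucleotide is prepended, which is the kind of modulo 3 shift that a frameshift would be expected to produce in the signal.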