Double-stranded DNA exists in equilibrium between right-handed B-DNA and left-handed Z-DNA. The B-DNA form is dominant, and the Z-DNA form makes only a small contribution to the equilibrium. Z-DNA is reported to be stabilized by cations and anions, dehydrating solvents, numerous covalent modifications of DNA, negative supercoiling, and Z-DNA binding proteins. Segments of DNA with alternating d(CG) sequences are the most favorable for forming Z-DNA. Z-DNA binding proteins have been identified on the basis of their preferential binding to segments of Z-DNA, and they have been isolated from Drosophila melanogaster, rat brain neurons, wheat germ, bull testis, Escherichia coli [6, 7], Deinococcus radiodurans, chicken, and human. The three-dimensional (3D) structures of the Z-DNA binding domain from human [11-14] revealed that it differs from that of E. coli. This result indicated that there are at least two types of Z-DNA binding domains. The recA protein from E. coli has a Z-DNA binding domain; this protein promotes homologous recombination and has Z-DNA-stimulated ATPase activity. RecA has multiple activities, all related to DNA repair. The RNA editing enzyme adenosine deaminase acting on RNA (ADAR) from human includes a Z-DNA binding domain, and this domain acts as an effector of gene expression.
We assumed that the presence or absence of the Z-DNA binding domain would give clues to the function and evolution of Z-DNA binding proteins. We expected that a survey of genome sequences would reveal the presence or absence of the domain, because a genome sequence encodes all the proteins of the organism. We therefore conducted a genome sequence analysis to examine the presence/absence of the two types of Z-DNA binding domains in various organisms from archaea, bacteria, and eukaryotes using the database of genomes to protein structures and functions (GTOP) [17, 18]. GTOP provides protein annotation of 3D structures and functions based on homology searches against protein sequences of known structure from the Protein Data Bank [19, 20] and the Structural Classification of Proteins (SCOP) database. We used GTOP because its keyword search makes it easy to survey fold predictions for a query protein across genome sequences.
2. Materials and Methods
2.1. Estimation of Presence/Absence of Z-DNA Binding Domain
The presence/absence of the Z-DNA binding domain in each organism was determined using the GTOP database. GTOP contains protein fold predictions based on homology searches against protein sequences of known structure. If there was a homologous hit for the Z-DNA binding domain with an e-value less than 10⁻¹⁰, the organism was estimated to have the Z-DNA binding domain. If there was no such hit, the Z-DNA binding domain was considered absent in the organism.
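The decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the organism labels, and the e-values are hypothetical and are not taken from GTOP, and hits at or above the cutoff are treated here as "no hit".

```python
from typing import Iterable

# Cutoff stated in the text: a hit counts only if its e-value is below 10^-10.
E_VALUE_CUTOFF = 1e-10


def has_domain(e_values: Iterable[float], cutoff: float = E_VALUE_CUTOFF) -> bool:
    """Return True if at least one homologous hit has an e-value below the cutoff."""
    return any(e < cutoff for e in e_values)


# Hypothetical e-values for illustration only; these are not GTOP results.
example_hits = {
    "organism_A": [3e-25, 1e-4],  # one strong hit -> domain estimated present
    "organism_B": [2e-3],         # only a weak hit -> treated as absent (assumption)
    "organism_C": [],             # no hit at all -> absent
}

presence = {name: has_domain(hits) for name, hits in example_hits.items()}
```

Using a strict "less than" comparison mirrors the wording of the criterion, so a hit with an e-value of exactly 10⁻¹⁰ does not count as presence in this sketch.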
2.2. Protein Domain Structure
The amino acid sequences of the Z-DNA binding domain of E. coli recA protein and of human ADAR protein, taken from GTOP, are shown in Figure 1(a) and Figure 1(b), respectively. GTOP uses the Swiss-Prot protein sequence database, so the Swiss-Prot codes are given for the sequences.
Figure 1. (a) Amino acid sequence of the E. coli recA protein Z-DNA binding domain. Swiss-Prot: RECA_ECOLI [residues 4 to 269]; (b) Amino acid sequence of the human adenosine deaminase acting on RNA Z-DNA binding domain. Swiss-Prot: DSRAD_HUMAN [residues 134 to 198].
GTOP adopts the SCOP classification of protein structures, in which the unit of classification is usually the protein domain. SCOP organizes protein structures according to evolutionary origin and structural similarity. Protein domains are classified on four hierarchical levels: class, fold, superfamily, and family. The 3D structure of recA protein from E. coli has a domain described as class: alpha and beta protein, fold: P-loop containing nucleoside triphosphate hydrolases, superfamily: P-loop containing nucleoside triphosphate hydrolases, family: recA protein-like (ATPase-domain). This domain is denoted c.37.1.11 in SCOP code, and this code was used as a keyword in the GTOP search. The other Z-DNA binding domain, in ADAR from human, is described as class: all alpha protein, fold: DNA/RNA-binding 3-helical bundle, superfamily: winged helix DNA-binding protein, and family: Z-DNA binding domain. This domain is denoted a.4.5.19 in SCOP code, which was likewise used as a keyword in the GTOP search. As more genomic sequences become available, surveying proteins becomes difficult without useful tools. GTOP offers a keyword search tool on the web. For example, when we searched GTOP for the Z-DNA binding domain using c.37.1.11 as the keyword, the homologous proteins in each organism were displayed with their e-values. The presence or absence of the Z-DNA binding domain can therefore be estimated directly from the search results.
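To make the structure of the two SCOP codes concrete, the following minimal sketch splits a code into its four hierarchical levels and counts how many leading levels two codes share. The class and function names are hypothetical helpers written for this illustration; they are not part of GTOP or SCOP software.

```python
from typing import NamedTuple


class ScopCode(NamedTuple):
    """The four SCOP levels encoded in a code such as 'c.37.1.11'."""
    scop_class: str
    fold: int
    superfamily: int
    family: int


def parse_scop_code(code: str) -> ScopCode:
    """Split a SCOP code of the form class.fold.superfamily.family."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {code!r}")
    scop_class, fold, superfamily, family = parts
    return ScopCode(scop_class, int(fold), int(superfamily), int(family))


def shared_levels(a: ScopCode, b: ScopCode) -> int:
    """Count how many leading levels (class, fold, ...) two codes have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


reca_domain = parse_scop_code("c.37.1.11")  # recA protein-like (ATPase-domain)
adar_domain = parse_scop_code("a.4.5.19")   # Z-DNA binding domain family
```

Here `shared_levels(reca_domain, adar_domain)` evaluates to 0: the two domains already differ at the class level, which is consistent with treating them as two distinct types of Z-DNA binding domain.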
3.1. Classification of Organisms
We employed GTOP to search for the two types of Z-DNA binding domains. In GTOP, organisms are classified, based on the annotation in the genome sequence, according to a hierarchy of three kingdoms (archaea, bacteria, and eukaryotes), phylum, and section. The 68 organisms in archaea were divided into 5 phyla and 13 sections (Table 1(a)), the 914 organisms in bacteria into 21 phyla and 45 sections (Table 1(b)), and the 199 organisms in eukaryotes into 13 phyla and 21 sections (Table 1(c)).
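As a quick check on the scope of the survey, the counts quoted above can be tallied as follows. The dictionary layout and variable names are illustrative assumptions; only the numbers come from the text.

```python
# Counts per kingdom as quoted in the text (organisms, phyla, sections).
survey = {
    "archaea": {"organisms": 68, "phyla": 5, "sections": 13},
    "bacteria": {"organisms": 914, "phyla": 21, "sections": 45},
    "eukaryotes": {"organisms": 199, "phyla": 13, "sections": 21},
}

# Sum each quantity over the three kingdoms.
totals = {
    key: sum(kingdom[key] for kingdom in survey.values())
    for key in ("organisms", "phyla", "sections")
}
```

With these figures the survey covers 1181 organisms in total, spread over 39 phyla and 79 sections.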
3.2. Two Types of Z-DNA Binding Domains
The Z-DNA binding domain of the recA protein from E. coli was observed in all the sections of archaea, bacteria, and eukaryotes in GTOP. Presence/absence of this recA-type domain therefore does not discriminate among organisms. This result indicated that the domain is essential for all the organisms.
Table 1. (a) Organisms that have Z-DNA binding domain in archaea; (b) Organisms that have Z-DNA binding domain in bacteria; (c) Organisms that have Z-DNA binding domain in eukaryotes.
The other Z-DNA binding domain, that of ADAR from human, was observed in only some organisms of archaea, bacteria, and eukaryotes. A representative organism listed in the organism column of Tables 1(a)-(c) indicates that this domain is present in that section. A blank in the organism column means that the domain is absent.
3.3. Z-DNA Binding Domain from Archaea, Bacteria, and Eukaryotes
Comparisons of ribosomal RNA sequences from various organisms are commonly used to deduce phylogenetic trees. The trees show a clustered classification into three kingdoms (archaea, bacteria, and eukaryotes), with sub-clusters of phyla and sections according to sequence similarity. The phylogenetic tree based on archaeal 16S small subunit ribosomal RNA sequences revealed that the phylum Thaumarchaeota may have emerged before the divergence between Crenarchaeota and Euryarchaeota. The presence of the Z-DNA binding domain in organisms belonging to the phyla Thaumarchaeota, Crenarchaeota, and Euryarchaeota suggested that the domain emerged before the divergence between Crenarchaeota and Euryarchaeota (Table 1(a)). Four organisms, H. butylicus, A. fulgidus, T. onnurineus, and Candidatus K. cryptofilum, are thermophiles, whereas M. burtonii and N. maritimus are mesophiles (Table 1(a)). This result suggested that the Z-DNA binding domain is favored in thermophiles.
The evolutionary history of bacteria can be obtained by comparing conserved protein sequences of elongation factor-1 alpha/Tu or the 70-kDa heat shock protein. This comparison gives a clear separation of Gram-positive and Gram-negative bacteria. Among the Gram-positive bacteria, only the phylum Firmicutes showed the presence of the Z-DNA binding domain (Table 1(b)), and these organisms have low G + C content. Among the Gram-negative bacteria, the phyla Aquificae, Chlorobi, Proteobacteria, and Thermotogae showed the presence of the Z-DNA binding domain (Table 1(b)), and their G + C content ranged from 35% to 67% among the organisms. The organisms P. marina, T. tengcongensis, Nitratiruptor sp., and F. nodosum are thermophiles.
In eukaryotes, the Z-DNA binding domain was observed only in organisms belonging to the phylum Metazoa, section Eumetazoa (Table 1(c)). Interestingly, only vertebrates showed the presence of this domain, whereas invertebrates showed its absence. Mammalian genomes encode three ADAR genes: ADAR1, ADAR2, and ADAR3 [26, 27]. ADAR1 contains two Z-DNA binding domains, whereas ADAR2 and ADAR3 contain none. ADAR genes are present in the Caenorhabditis elegans genome, the Drosophila genome, and the squid nervous system, but their ADAR gene products have no Z-DNA binding domain. The ADAR family catalyzes the conversion of adenosine to inosine in pre-mRNA, and the substrates require duplex RNA secondary structure. Adenosine-to-inosine editing modulates the calcium permeability of neural glutamate receptors and reduces the G-protein coupling efficiency of serotonin 2C receptors.
As mentioned above, Z-DNA binding proteins have been isolated from various organisms on the basis of measurements of the interactions between Z-DNA and its binding proteins. Their presence in various organisms is consistent with our result that the Z-DNA binding domain of recA protein was observed in all the organisms examined. However, the experiments on Z-DNA binding proteins in E. coli were reportedly performed in a recA protein-free strain. This suggests that E. coli has another type of Z-DNA binding protein besides recA protein. We found a Z-DNA binding domain only in recA protein in E. coli, so this point remains unclear.
Ideally, taxonomic classification should reflect the evolutionary history of the organisms with respect to the presence/absence of the Z-DNA binding domain. If organisms A and B are phylogenetically close enough, it is expected that either both have the Z-DNA binding domain or both lack it. This expectation was not always met, as the following cases show. There were 71 organisms in the section Betaproteobacteria, and 24 organisms belonged to Burkholderia species. Only Burkholderia xenovorans showed the presence of the Z-DNA binding domain, and the other organisms showed its absence. There were 49 vertebrates in the section Eumetazoa in eukaryotes. Of these, 43 vertebrates showed the presence of the Z-DNA binding domain and 6 vertebrates, including chicken, showed its absence. Chicken is reported to have a Z-DNA binding protein, but our results indicated the absence of this domain in chicken. This point remains unclear. In eukaryotes, only vertebrates had the Z-DNA binding domain. The acquisition of this domain in vertebrates may have been caused by horizontal gene transfer (HGT) [34-36]. The isolated occurrence of the human ADAR-type Z-DNA binding domain in distantly related species of archaea and bacteria suggested that the gene for this domain might have arrived via an HGT event.
It is generally considered that if the function of a protein is essential, the protein is conserved. The Z-DNA binding domain of recA protein from E. coli was conserved in all the organisms examined. This result indicated that the function of this domain is essential. The other type of Z-DNA binding domain, that of ADAR from human, was observed in only some organisms in archaea, bacteria, and eukaryotes. This result suggested that the function of this domain is non-essential, although its biological function is not clearly understood.
Unfortunately, the GTOP database has not been updated since October 6, 2010. Nevertheless, GTOP offers valuable information on Z-DNA binding domains. As far as we examined, no other database offers a keyword search as useful as that of GTOP. There are two types of Z-DNA binding domains, and some organisms have both. However, it seems that most researchers do not distinguish whether the Z-DNA binding domain they are analyzing is the E. coli type or the human type. To study the function of this domain, it is necessary to determine which type is being studied.