
Illumina reads were also used to correct potential base errors and increase consensus quality using the software Polisher developed at JGI [49]. The error rate of the completed genome sequence is less than 1 in 100,000. Together, the combination of the Illumina and 454 sequencing platforms provided 308.6× coverage of the genome. The final assembly contained 291,505 pyrosequence and 75,503,620 Illumina reads.

Genome annotation

Genes were identified using Prodigal [50] as part of the Oak Ridge National Laboratory genome-annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [51]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database and the UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases.

These data sources were combined to assert a product description for each predicted protein. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes – Expert Review (IMG-ER) platform [52].

Genome properties

The genome consists of a 4,000,057 bp long circular chromosome with a G+C content of 40.2% (Figure 3 and Table 3). Of the 3,563 genes predicted, 3,518 were protein-coding genes and 45 were RNA genes; 33 pseudogenes were also identified. The majority of the protein-coding genes (67.9%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4.

Figure 3 Graphical map of the chromosome. From outside to center: genes on forward strand (colored by COG categories), genes on reverse strand (colored by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content (black), GC skew (purple/olive).

Table 3 Genome statistics

Table 4 Number of genes associated with the general COG functional categories

Insights into the genome sequence

Genome analysis of strain UST20020801T revealed the presence of genes encoding an arylsulfatase A family protein (Oweho_0043), a bacteriophytochrome (a light-regulated signal transduction histidine kinase; Oweho_0350), a cytochrome c2, and a cbb3-type cytochrome c oxidase (Oweho_2085). Additional gene sequences of interest encode a homogentisate 1,2-dioxygenase (Oweho_2010), a haloacid dehalogenase superfamily protein (Oweho_2094), and a 2-haloalkanoic acid dehalogenase type II (Oweho_2503).

The presence of these genes could indicate that strain UST20020801T plays a role in the respiratory degradation of recalcitrant compounds in its ecological niche. Further, a light-dependent regulation of metabolic activities using bacteriophytochrome as a sensor seems possible.

Acknowledgements

The authors would like to gratefully acknowledge the help of Helga Pomrenke for growing O. hongkongensis cultures and Evelyne-Marie Brambilla for DNA extraction and quality control (both at DSMZ).
