Abstract

During the last four years, the pyrosequencing-based 454 platform has rapidly displaced the traditional Sanger sequencing method due to its high throughput and cost effectiveness. Meanwhile, the Sanger sequencing methodology still provides the longest reads, and paired-end sequencing based on that chemistry offers an opportunity to ensure accurate assembly results. In this report, we describe an optimized approach for hybrid de novo genome assembly using pyrosequencing data and varying amounts of Sanger-type reads. 454 platform-derived contigs can be used as single non-breakable virtual reads or converted to simpler contigs that consist of editable, overlapping pseudoreads. These modified contigs maintain their integrity at the first jumpstarting assembly stage and are edited by fragmenting and rejoining. Pre-existing assembly software can then be applied for mixed assembly with 454-derived data and Sanger reads. An effective method for identifying genomic differences between reference and sample sequences in whole-genome resequencing procedures is also suggested.

Abbreviations: CelAsm (Celera Assembler)

Keywords: hybrid assembly, pyrosequencing, resequencing

The 454 sequencing platform (Roche Applied Science GS 20 or GS FLX), which is based on massively parallel sequence determination by pyrosequencing on clonally amplified genome fragments that are captured on microscopic beads, is becoming increasingly popular in genome sequencing applications (Margulies et al., 2005). Its advantages over the traditional Sanger method, such as a high production rate at an affordable cost, absence of cloning bias, and the ability to read through strong secondary structure, broaden its field of application in genome technology.
Although several commercial next-generation sequencing technologies have become available in recent years (Shendure et al., 2004), 454 pyrosequencing is the only one of the high-throughput, short-read technologies that can be used for de novo genome sequencing, owing to its long read length (∼250 bp in GS FLX; announced to be extended to 400 bp by the end of 2008). Many sequencing centers, however, may want to mix in a limited amount of traditional Sanger-type sequences, usually generated from fosmid libraries, for scaffolding purposes. A few may also want to mix a considerable amount of Sanger read data with 454 pyrosequencing data to produce more accurate results. Among the SFF tools that Roche Applied Science provides for handling raw data files, SFFINFO can generate FASTA and quality score files from an SFF file. Although the converted files can be assembled using PHRAP (http://www.phrap.org/), this does not ensure correct assembly because the quality scores generated from 454 data are not compatible with those from Sanger reads. Further, PHRAP has problems handling massive numbers of reads (usually hundreds of thousands from an SFF file). A recent report has demonstrated that the GS assembler programs supplied by Roche Applied Science (gsAssembler for de novo assembly and gsMapping for reference-guided assembly; http://www.454.com/enabling-technology/the-software.asp) are ideal for correct assembly of 454 data, which are short and inherently error-rich (Chaisson and Pevzner, 2008). Recent versions (1.1.02.15 and later) of the GS assembler programs support mixed assembly with Sanger-type reads, but their performance is not well known at present.
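The conversion of a 454-derived contig into overlapping pseudoreads, mentioned in the abstract, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `shred_contig` and its parameters are hypothetical, the 600 bp window follows the approximate Sanger read size, and the 300 bp step is an arbitrary assumption chosen only to give a 50% overlap.

```python
from typing import Iterator, Tuple


def shred_contig(contig: str, read_len: int = 600,
                 step: int = 300) -> Iterator[Tuple[int, str]]:
    """Yield (start, pseudoread) pairs that tile `contig` with overlapping windows.

    read_len and step are illustrative assumptions, not values from the paper.
    """
    if read_len <= 0 or not 0 < step <= read_len:
        raise ValueError("need read_len > 0 and 0 < step <= read_len")
    if not contig:
        return
    # A contig no longer than one window is kept as a single pseudoread.
    if len(contig) <= read_len:
        yield 0, contig
        return
    last_start = len(contig) - read_len
    start = 0
    while start < last_start:
        yield start, contig[start:start + read_len]
        start += step
    # Anchor the final window at the contig end so that no bases are dropped.
    yield last_start, contig[last_start:]
```

Because `step` never exceeds `read_len`, consecutive pseudoreads always share at least one boundary, so a downstream assembler could in principle rejoin them into the original contig.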
Moreover, because pre-existing assembly software such as PHRAP and CelAsm (Huson et al., 2001) does not directly support data produced by 454 machines, 454-derived contigs (GS contigs) should be used as if they were individual reads or be shredded to generate many overlapping 'pseudoreads' (Goldberg et al., 2006). Pseudoreads, made from GS contigs to emulate the read size of standard Sanger data (ca. 600 bp), are virtual reads whose stepping between consecutive