Helicos BioSciences Corporation and HeliScope Sequencing

Helicos BioSciences Corporation traces its roots to a paper published in April 2003 in the Proceedings of the National Academy of Sciences (PNAS) by Caltech professor and primary author Dr. Steve Quake. The paper described the preliminary development of a technique for single-molecule DNA sequencing derived from the Sanger method for sequencing-by-synthesis. In the new technique, fluorescent signals were used to detect labeled nucleotide triphosphates as they were incorporated on DNA templates bound to a quartz slide.

Despite limitations in sensitivity, speed and the size of obtainable sequence, the new sequencing method described in PNAS was novel and showed sufficient promise to catch the eye of venture capitalists, who approached the professor about investing in his technology. Something about the technique must have been just what venture investors look for because, according to Dr. Timothy Harris, a long-time staff member and Senior Director of Research, this was a first: venture investors don't usually approach the scientists; it's the other way around!

The PNAS publication was released on April 1, 2003; the first round of financing for a new company was initiated on Dec. 19, 2003; and on Jan. 2, 2004, Helicos opened its doors with 5 employees, including Dr. Harris, a specialist in measurement science and single-molecule technology. Helicos is currently situated in Cambridge, MA, USA and, after 2 rounds of investment financing and an IPO on May 27, 2007, it is now publicly traded on NASDAQ under the symbol HLCS.

Helicos specializes in genetic analysis technologies, in particular its True Single-Molecule Sequencing (tSMS™) technology, which was validated by sequencing the M13 virus genome, as described in Science in April 2008. The tSMS™ platform uses the HeliScope™ Single Molecule Sequencer. According to Dr. Harris, the project began in January 2004, and by June 2005 the team had successfully sequenced the M13 virus, a medically relevant sequence, as described in the Science paper.

How Does tSMS™ Work?

A strand of DNA about 100-200 base pairs long is cut into smaller fragments using restriction enzymes, and polyA tails are added. The shortened strands are then hybridized to the Helicos flow cell plate, which has billions of polyT chains bound to its surface. All of the hybridized templates are sequenced at the same time, so billions can be read per run. Labeling is performed in "quads" of 4 cycles each, one cycle for each of the 4 nucleotide bases. In each cycle, a single fluorescently labeled base is added, and a laser in the instrument illuminates the label so that the instrument can record which strands have taken up that particular base. The label is then cleaved, and the next cycle begins with a different base. After the flow cell has been treated with each of the 4 bases (4 cycles), the quad is complete, and a new one begins again with the initial nucleotide base.
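The cycle-and-quad scheme described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the Helicos software or chemistry: the base order "CTAG", the function name sequence_by_synthesis, and the rule that a strand takes up at most one base per cycle are all hypothetical choices made for illustration. Each template is represented by the sequence of the complementary strand that the polymerase would build.

```python
from typing import List

# Assumed order in which the four labeled bases are added within a quad.
# The article does not state the real order; this one is hypothetical.
BASE_ORDER = "CTAG"


def sequence_by_synthesis(targets: List[str], quads: int,
                          base_order: str = BASE_ORDER) -> List[str]:
    """Simulate quads of single-base cycles across many templates at once.

    `targets` holds, for each immobilized template, the sequence of the
    strand being synthesized on it (a simplification for illustration).
    Returns the bases recorded for each template after `quads` quads.
    """
    positions = [0] * len(targets)  # index of the next base to add, per template
    reads = [""] * len(targets)     # bases recorded so far, per template

    for _ in range(quads):
        for base in base_order:     # one cycle per base; 4 cycles make a quad
            for i, target in enumerate(targets):
                pos = positions[i]
                # "Imaging" step: a template lights up only if its next
                # base matches the labeled base added in this cycle.
                if pos < len(target) and target[pos] == base:
                    reads[i] += base
                    # Assumption: at most one incorporation per cycle.
                    positions[i] = pos + 1
            # In the real chemistry the label is cleaved here before the
            # next cycle; the simulation has nothing to do at this step.
    return reads


if __name__ == "__main__":
    print(sequence_by_synthesis(["CAT", "GGA", "TTTT"], quads=2))
```

With the assumed base order, two quads read the first template completely ("CAT") but only two bases of the others ("GG" and "TT"), which shows why the number of bases read per strand depends on the sequence as well as on the number of cycles.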

Currently, the HeliScope™ can read DNA fragments of about 55 base pairs in length. The longer the read, the lower the percentage of strands in a sample that can be used, because some strands cease to elongate during the process. For reads of 20 or so bases, about 86% of the strands can be used. For longer reads (55+ base pairs) this percentage drops to about 50%.

The Single-Molecule Advantage

While several other companies offer sequencing-by-synthesis technologies on high-throughput platforms, with various reagents, at comparable costs, and for short reads of 25-40 base pairs, only Helicos reads the DNA sequence one nucleotide at a time, using a patented labeling technique that is sensitive enough to allow reads from a single molecule. Other methods require that the DNA be amplified (using PCR) to make millions of copies prior to sequencing. Amplification introduces the potential for a significant degree of inaccuracy due to copying errors made by polymerase enzymes.

As of April 2008, the HeliScope™ was reportedly able to sequence billions of nucleotide bases per day. Helicos is a member of the Personalized Medicine Coalition and has received "$1000 genome" grant funding. The $1000 genome in one day is a projected goal that would require the sequencer to process billions of bases per hour. Currently, the prototype sequencer would take years to sequence an entire genome, which would cost much more than $1000.
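The "billions of bases per hour" figure can be checked with a rough back-of-the-envelope calculation. The sketch below assumes a human genome of roughly 3 billion base pairs and a hypothetical 30-fold coverage (each position read about 30 times on average); the coverage value and the function name are illustrative assumptions, not Helicos specifications.

```python
def required_bases_per_hour(genome_size_bp: float, coverage: float,
                            hours: float) -> float:
    """Bases that must be read per hour to finish within `hours`."""
    return genome_size_bp * coverage / hours


if __name__ == "__main__":
    # Roughly 3 billion bp genome, assumed 30x coverage, 24-hour target.
    rate = required_bases_per_hour(3e9, 30, 24)
    print(f"{rate:.2e} bases per hour")  # 3.75e+09 bases per hour
```

Under these assumptions a one-day genome calls for several billion bases per hour, in line with the goal described above. A different coverage target scales the result proportionally.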

The applications for tSMS™ technology are many, including detection of genetic variants in humans and other species for determining causes of disease, antibiotic resistance in bacteria, virulence in viruses and more. The ability to detect a single gene without amplification has many potential uses in environmental microbiology, where genetic techniques are often used to detect viable but non-culturable microorganisms, or those found in soil and other matrices that prohibit isolation by current methods. Furthermore, the nature of environmental samples often presents difficulties for gene amplification using PCR, due to contamination issues. However, these difficulties would also have to be overcome in order for the polymerase enzymes used in tSMS™ to function without interference.

The theory behind single-molecule sequencing is fairly basic, and you might wonder why no one thought of it before. Although it sounds simple enough, developing such a platform involves many technical components, and there are plenty of challenges to keep Helicos busy, including the development of new chemical reactions and reagents, plates and high-throughput readers.

The ability to detect fluorescence of a single label on a single base requires highly sensitive instrumentation, and the chemistry for labeling and detecting signals needs to be just right to minimize interference and optimize fidelity of the DNA polymerase as it is applied to immobilized templates and labeled nucleotides. These are some of the challenges faced by Helicos as it continues to develop this technology in hopes of someday delivering the $1000, 1-day human genome.