Bioinformatics: DNA in Bits and Bytes


I can still remember exactly when my love of computers was born. It was 20 years ago, when I was at high school in Biggin Hill, U.K. The math department had just acquired a Commodore Pet computer. Like a magpie I was bedazzled by the colorful graphics all over the keyboard. And just a few minutes after turning the thing on, the ebulliently typed "Hello World!" proclamation of a newbie appeared as if by magic on the screen.

About a year later the Sinclair ZX81 hit the market, and I convinced my parents to get this 1K RAM, no hard drive, 8K ROM bundle of joy. And at the local computer club my friends and I contrived to bend these little machines to our will. The skills I began to develop back then have served me well ever since, first as a biologist, and later as Director of Bioinformatics at CuraGen Corp.

How did I manage to elevate myself to the lofty title of "Director"? My undergrad was in cell and molecular sciences, which turned out to be a fancy name for biochemistry, as much as anything else. Unusually, statistics was mandatory for 2 of the 3 years, and so I had the joy of pitting my wits against a VAX computer running the Minitab statistics package.

Fortunately, I got the better of the VAX, graduated, and went to work as a technician for the British Government, where I was hot on the trail of the human equivalent of mad cow disease. My understanding of computers allowed me to perform the lab's analyses of prion gene sequences, which I did by connecting via telnet to the server at the Massachusetts Institute of Technology that was running the Genetics Computer Group (GCG) sequence analysis package. These skills traveled with me when I moved to Boston in 1990 to start my Ph.D. in molecular pharmacology.

I was amazed at how wired American universities were, and I couldn't resist the temptations of the online world. Soon I was one of the more prolific posters on the BioNet newsgroups, where I took up residence on the Methods and Reagents thread. Then, in 1994, I saw a posting in one of the software newsgroups for a new virtual environment called BioMOO. A young student at the Weizmann Institute had created a virtual world and was inviting participants to telnet over. I soon logged in and built my own little empire, including a port for the Mailfasta program used for sequence analysis via e-mail. This was the first programming project in which I was able to meld my knowledge of biology, sequence analysis, and computers. Shortly after I joined BioMOO, the Web hit the fan and provided a whole new avenue of exploration.

Armed with the tools to build a Web site, I set myself the goal of attracting people to a Web site of my own. With that in mind, I created the Biotech Company Registry (BCR), a site at which life science vendors and biotech and pharmaceutical companies registered. I ran the Web site out of my lab at Boston University, using a Mac Classic II to power the server.

BCR became so successful that--still a grad student--I sold it in 1995. But now I was stuck. How was I going to get eyeballs to look at my resume with no Web site? The obvious answer: start a new Web site! So, through my numerous contacts in the life sciences, diagnostics, and biotech industries, I got the "Biotech Rumor Mill" going. This was a one-of-a-kind site that allowed anonymous posting of gossip and rumors, and traffic soon grew.

In a matter of months I upgraded to an Internet Service Provider. And with my Web server running on a Sun--a "real" computer--I decided to pick up a "Learn Perl in 30-days" book. It took me 6 months to master the first 10 days of the book, and I was able to do that only with the help of additional reading in the widely acclaimed Llama and Camel books on Perl (nicknamed for the llama and camel on the covers of the books--see the references below). But learning Perl was well worth the effort. In fact, it was the straw that broke the Camel's back (pun intended), because it led directly to my proverbial big break and my switch from biologist to bioinformatician.

Words of advice for budding bioinformaticians?

  • At a minimum learn Perl, HTML, and relational databases (I still have to master the last of these).

  • If you have just taken a Master's course in bioinformatics or computer science, do some of your own programming projects. These are looked upon favorably when you apply for bioinformatics positions.

  • Download databases and software from the National Center for Biotechnology Information and become familiar with sequence analysis, expression analysis, and other forms of genomic data manipulation.

  • Look to join a small company if your programming skills are limited. Your limitations will be overlooked out of necessity.

One night, Jonathan Rothberg, fellow entrepreneur and president and CEO of CuraGen Corp., surfed over and made contact. He was building the bioinformatics group at CuraGen and needed biologists. My chance had come! While finishing my Ph.D., I consulted for CuraGen. Twelve months later, in 1997, I graduated and joined CuraGen as a research scientist, one of the few biologists hired for the bioinformatics group; my first project was to complete the CuraTools sequence analysis package that I had started as a consultant.

From the start, my success has come from forming strong bonds with discovery scientists and working with them to solve problems using simple programs that help to automate their discovery tasks. I've been following this strategy for the past 3 years, and I now head up CuraGen's bioinformatics group for gene expression, sequence analysis, drug development, and bioinformatics support.

