Pittsburgh, PA
Tuesday
November 24, 2009
Pitt, CMU get $9 million grant to translate the language of proteins

Tuesday, October 01, 2002

By Byron Spice, Post-Gazette Science Editor

The secrets of proteins, scientists believe, are written in an unknown foreign language. So, as new molecular biology research techniques spew an ever-growing tower of untranslated babble, it is only logical that biologists are turning to linguists and their computer tools for help.

This approach, called computational biolinguistics, received a $9 million boost last week in the form of a five-year grant from the National Science Foundation.

Computer software developed at Carnegie Mellon University's Language Technologies Institute to decipher human languages will be used to statistically analyze the sequence of amino acids that make up each protein.

There are only 20 different amino acids. But each protein is a chain that strings together hundreds or even thousands of amino acids. Presumably, those chains are the equivalent of sentences, said Judith Klein-Seetharaman, a pharmacologist at the University of Pittsburgh School of Medicine who, along with CMU computer scientist Raj Reddy, is a principal investigator of the new project.

"At the moment, we don't know what the equivalent of a word is," said Klein-Seetharaman, who also holds an appointment to the Language Technologies Institute. Biologists now treat each amino acid as if it were a word, she said, but each is probably just the equivalent of a letter or a syllable.

Proteins are the very stuff of life, providing the structural components of each cell, as well as the enzymes and hormones necessary for cell functions. Understanding their structure and how they work could lead to the development of new types of pharmaceuticals.

Nothing is more mysterious about proteins than how they assume their three-dimensional shape. The function of proteins depends mightily on how each chain of amino acids is scrunched, twisted and bent, yet scientists have been unable to figure out the rules by which this folding takes place.

Klein-Seetharaman said that the folding instructions likely are hidden somewhere in those amino acid sequences and that she and her colleagues will devote much of their energy to ferreting out those key words or phrases.

Designing software that allows computers to understand human language has been a major challenge to computer scientists for decades. Ronald Rosenfeld, an associate professor at the Language Technologies Institute, said one technique is to statistically analyze large amounts of text. By noting how often certain words or combinations of words appear, the computer program can sometimes deduce the meaning or function of those words.

Molecular biologists are producing plenty of raw data on amino acid sequences, Rosenfeld noted, so this approach might help decipher the protein language.
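The statistical approach Rosenfeld describes can be illustrated with a minimal sketch that treats short runs of amino acids as candidate "words" and counts how often each appears. This is only an illustration, not the researchers' software: the function name `count_ngrams`, the choice of three-letter runs, and the toy sequences (written in the standard one-letter amino acid code) are all assumptions made for the example.

```python
from collections import Counter


def count_ngrams(sequences, n):
    """Count every run of n consecutive amino acids across all sequences."""
    counts = Counter()
    for seq in sequences:
        # Slide a window of width n along the chain, one residue at a time.
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


# Made-up toy sequences for illustration only; not real protein data.
toy_sequences = ["MKTAYIAKQR", "MKTLLVAKQR", "GAKQRMKT"]

trigram_counts = count_ngrams(toy_sequences, 3)

# Print the runs that recur most often; these are the candidate "words".
for ngram, count in trigram_counts.most_common(3):
    print(ngram, count)
```

Runs that recur far more often than chance would predict are the kind of pattern such an analysis would flag for closer study. Real work would need far larger data sets and a statistical model of what counts as "more often than chance."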

CMU and Pitt researchers began work on this analysis last year, he said, and already have found that each organism appears to have short sequences within its proteins that are found in no other organism. That might aid in developing antibiotics that target just a single pathogen, he added.
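One way to picture the search for organism-specific sequences is the sketch below, which collects every short run of amino acids in each organism's proteins and keeps only the runs that appear in no other organism. The article does not describe the researchers' actual method; the function names, the run length of three, and the two toy "organisms" are hypothetical and chosen only to keep the example small.

```python
def kmers(seq, k):
    """Return the set of all runs of k consecutive amino acids in one chain."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_kmers(proteomes, k):
    """Map each organism to the runs found in its proteins and in no other's.

    `proteomes` maps an organism name to a list of its protein sequences.
    """
    # Pool the runs from all of an organism's proteins into one set.
    per_organism = {
        org: set().union(*(kmers(seq, k) for seq in seqs))
        for org, seqs in proteomes.items()
    }
    unique = {}
    for org, own in per_organism.items():
        # Everything seen in any other organism.
        others = set().union(
            *(runs for other, runs in per_organism.items() if other != org)
        )
        unique[org] = own - others
    return unique


# Made-up toy data for illustration only; not real protein sequences.
toy_proteomes = {
    "organism_a": ["MKTAY", "GAKQR"],
    "organism_b": ["MKTLL", "GAKQW"],
}

signatures = unique_kmers(toy_proteomes, 3)
for org, runs in signatures.items():
    print(org, sorted(runs))
```

With real data, the comparison would have to cover many organisms and longer runs before a sequence could be called truly unique to one of them.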

Different organisms may also speak different protein languages, Klein-Seetharaman said. For instance, if the gene for a human protein is spliced into a bacterium, the protein produced by that bacterium will have the same amino acid sequence as that produced in a human, but may well be folded differently. Apparently, the folding instructions encoded by the amino acid sequence are read differently from one organism to another, she explained.

In addition to Pitt and CMU, the project's collaborators include researchers at the Massachusetts Institute of Technology, Boston University and the National Research Council of Canada.


Byron Spice can be reached at bspice@post-gazette.com or 412-263-1578.
