The Notes: What is bioinformatics?

Introduction

The answer to the question “What is bioinformatics?” is not straightforward, yet in addressing this question the richness and extent of the field become clear. Part of the reason that it is difficult to give a concise definition of bioinformatics is that, as researchers publishing in the field realize, the definition is somewhat artificial and its boundaries are still expanding. This is not surprising, as bioinformatics might also be called mathematical/computational molecular biology, which points to large parts of biology taking on the aspects of a “hard” science such as physics or chemistry.

The creation of bioinformatics was triggered by a combination of factors in the 1990s. Key elements were progress in computing power, the existence of much larger data sets, and increasingly quantitative approaches to molecular biology, including molecular evolutionary studies. The large data sets came from a number of sources, including long individual DNA sequences (for example, genomes), large between-species comparative or evolutionary alignments, microarray-generated gene expression data, proteomics data from two-dimensional gel electrophoresis and mass spectrometry techniques, and structural information. Broadly speaking, these sources correspond to the fields of comparative, functional, and structural genomics. It was also increasingly recognized that quantitative molecular biology required vast amounts of computer power not only to assemble genomes but also to complete fundamental analyses, such as aligning related DNA sequences or building a tree from such an alignment. For example, with only a little over twenty sequences there are more possible bifurcating trees relating those sequences than Avogadro’s number (approximately 6 × 10^23), and an exhaustive search must check every tree to guarantee that the optimal solution has been found.
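
The tree-count comparison above can be checked with a few lines of arithmetic. The sketch below assumes strictly bifurcating trees with labeled tips, for which the number of unrooted trees on n sequences is (2n-5)!! and the number of rooted trees is (2n-3)!!. Under those assumptions the count passes Avogadro’s number at 23 sequences (unrooted) or 22 sequences (rooted); other counting conventions, such as trees that also record the time order of branching events, give larger counts and a lower threshold. The function names are illustrative only.

```python
from math import prod

AVOGADRO = 6.02214076e23  # SI-defined value, per mole


def unrooted_tree_count(n_sequences: int) -> int:
    """Unrooted, strictly bifurcating trees on n labeled tips: (2n-5)!!"""
    if n_sequences < 3:
        return 1
    # Product of the odd numbers 3, 5, ..., 2n-5 (empty product is 1).
    return prod(range(3, 2 * n_sequences - 4, 2))


def rooted_tree_count(n_sequences: int) -> int:
    """Rooted, strictly bifurcating trees on n labeled tips: (2n-3)!!"""
    if n_sequences < 2:
        return 1
    # Product of the odd numbers 3, 5, ..., 2n-3.
    return prod(range(3, 2 * n_sequences - 2, 2))


def smallest_n_exceeding(limit: float, count_fn) -> int:
    """Smallest number of sequences whose tree count is greater than limit."""
    n = 3
    while count_fn(n) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    for n in (5, 10, 20, 23):
        print(n, f"{unrooted_tree_count(n):.3e}", f"{rooted_tree_count(n):.3e}")
    print("unrooted threshold:", smallest_n_exceeding(AVOGADRO, unrooted_tree_count))
    print("rooted threshold:", smallest_n_exceeding(AVOGADRO, rooted_tree_count))
```

Because the count grows faster than exponentially, practical tree-building programs generally rely on heuristic searches rather than checking every tree.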

The Scope of Research

Bioinformatics itself touches on other areas of science such as biomedical informatics, computer science, statistical analysis, molecular biology, and mathematical modeling. In turn, each of these fields contributes uniquely to the progress of bioinformatics toward a mature science. Bioinformatics is defined equally well by the areas that are wholly or partly subsumed by its approach, which combines computing power with mathematical and statistical modeling to answer biological questions from molecular data. These areas include genomics, evolutionary biology, population genetics, structural biology, microarray gene expression analysis, proteomics, and the modeling of cellular processes together with systems biology (for example, modeling a neurological pathway in which individual neurons respond to molecular events).

In bioinformatics, as in chemistry and physics, there is a fundamental split between empirical/experimental and theoretical science. At one extreme may be a laboratory focused on generating large amounts of microarray data with relatively little analysis, and at the other extreme may be a mathematician working alone to prove a theorem that can be applied to analyze those microarray data better. Both approaches are clearly needed for science to develop. However, it is not uncommon to find researchers actively tackling both kinds of problem (for example, gathering large data sets and seeking better methods to analyze them). Increasingly, the scale and cost of major bioinformatics projects call for a new model of interdisciplinary biological research in which biologists, statisticians, computer scientists, chemists, mathematicians, physicians, and physicists work closely together.

The nature of bioinformatics research highlights the need for interdisciplinary skills in modern biology. Some universities issue bioinformatics degrees based on their own formulas. A more direct approach is to require a quadruple major in statistics, computer science, mathematics, and biology. The importance of such a background is that, for example, someone who is not the best mathematician still needs to know how to ask the best mathematicians for help with the problems that inevitably crop up in research in this area. A good example of this interdependence arose in the Celera Genomics effort to complete the human genome, in which mathematicians with a specialty in tiling algorithms were essential to reassembling the millions of sequenced fragments.
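
The fragment reassembly problem mentioned above can be illustrated with a toy example. The sketch below is not the method used by Celera Genomics; it is a textbook simplification, assumed here only for illustration, that repeatedly merges the two fragments with the longest suffix-to-prefix overlap. It assumes error-free reads from a single strand, and the function names and the min_overlap parameter are hypothetical.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no overlap of at least `min_overlap` characters exists.
    """
    max_len = min(len(left), len(right))
    for length in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:length]):
            return length
    return 0


def greedy_assemble(fragments, min_overlap: int = 3):
    """Merge fragments greedily by longest overlap; return the remaining contigs."""
    reads = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        # Compare every ordered pair of reads and remember the best overlap.
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    olen = overlap_length(a, b, min_overlap)
                    if olen > best_len:
                        best_len, best_i, best_j = olen, i, j
        if best_len == 0:
            break  # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTGA"]))
```

Comparing every pair of reads in each round is far too slow for millions of fragments, and a greedy choice can be misled by repeated sequence, which is part of why genome-scale assembly called for specialist algorithmic expertise.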

In the future, bioinformatics will be increasingly involved with projects whose magnitude makes them as technically and intellectually challenging as anything previously faced in science. For example, a key problem might be a complete computer model of a single cell. Perfection would be achieved only when a biologist could not tell the difference between experimental data from a real cell and simulated data from the model when the cell experienced a change internally or externally.

Perspective and Prospects

The implications of bioinformatics for medicine are enormous. The strictly informatics side is already central to medical genetics. Databases of human characteristics, including detailed medical histories and biochemical profiles, are matched up with millions of genetic markers within each individual. Only through such enormous databases can statistical sleuths uncover the basis of most diseases that are caused by multiple genes. This is the population genetics of humans on a vast scale. Elsewhere, medical research such as cancer modeling is rapidly becoming a branch of bioinformatics, driven by the fact that cancer is caused by many interacting genes. Bioinformatics is key to the advancement of clinical genomic medicine and genomic technologies affecting complex diseases and disorders, drug dosing, and vaccine design.
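
As a rough illustration of the statistical matching described above, the sketch below tests a single genetic marker for association with a disease by comparing allele counts in affected and unaffected people. It is a minimal example under stated assumptions, not a description of any particular study: it uses a Pearson chi-square test on a 2 × 2 table with one degree of freedom, the counts are invented for the example, and the function name is hypothetical.

```python
from math import erfc, sqrt


def allele_association(case_alt: int, case_ref: int,
                       control_alt: int, control_ref: int):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi_square_statistic, p_value).
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    chi2 = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / total
            chi2 += (table[r][c] - expected) ** 2 / expected

    # For one degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Invented counts: the alternative allele is more common among cases.
    chi2, p = allele_association(60, 40, 40, 60)
    print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

When millions of markers are tested, some will look significant by chance alone, so the threshold for calling a result significant has to be far stricter than for a single test. Detecting the small effects typical of multigene diseases under such a threshold is one reason very large databases are needed.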

The overall prospect is that bioinformatics will make possible a different sort of medicine in the twenty-first century, one in which fundamental research leads to pharmaceutical interventions that treat a disease at its root cause and avoid the need for surgical intervention. Treatments of tomorrow, from diagnosis to cure, will involve processing large amounts of data via computers, with doctors remaining the key to ensuring an appropriate treatment regimen, one that is personalized and precise, with the consent and comfort of the patient foremost.

In short, one answer to the question “What is bioinformatics?” is the development of virtual molecular biology. As time passes, this scientific endeavor will propagate upward and outward to meet other major areas of biology, such as physiology and ecology. Eventually, much of biology and medicine may come to rest solidly on the same principles as chemistry and physics, yet require major computational resources because of the complexity of the models needed to approximate reality reliably.

Thursday, February 13, 2014
