When was bioinformatics first used
Later on, DNA analysis also emerged due to parallel advances in (i) molecular biology methods, which allowed easier manipulation of DNA as well as its sequencing, and (ii) computer science, which saw the rise of increasingly miniaturized and more powerful computers, as well as novel software better suited to bioinformatics tasks. In the s through the s, major improvements in sequencing technology, along with reduced costs, gave rise to an exponential increase of data.
The arrival of 'Big Data' has laid out new challenges in data mining and management, calling for more computer science expertise in the field.
Coupled with an ever-increasing number of bioinformatics tools, biological Big Data has had, and continues to have, profound implications for the predictive power and reproducibility of bioinformatics results. The development of cost-effective next-generation sequencing (NGS) platforms [79, 80] has made it possible to decode, within a short period, nearly the entire genome of many organisms, including human, other model and specialty organisms, and crops with complex polyploid genomes.
For example, according to the listings in the Genomes OnLine Database (GOLD) as of March 8, , there were 79, genome sequencing projects of which were completed projects, 33, were permanent drafts, 35, were incomplete projects, and were targeted projects [81]. There are 73, organisms, including archaea , bacteria 55, , eukaryotes 11, , and viruses , listed for sequencing.
These numbers would increase further if the sequencing of the , whole-human genomes [82] were added. Bioinformatics tools are needed to annotate and predict genes in sequenced genomes, because genomes are too large to be annotated manually, as mentioned above.
Bioinformatics-based gene finding and annotation, including the search for protein-coding genes, RNA transcripts, and other functional sequences within a genome, is possible because recognizable patterns mark start and stop regions, introns, exons, motifs, repeats, and other regulatory, sensory, and signaling regions, with some variation between genes and among organisms. With the availability and need for analysis of H.
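As a rough illustration of pattern-based gene finding, the sketch below scans the forward strand of a DNA string for open reading frames, that is, stretches that begin with a start codon and run in frame to a stop codon. It is a deliberately simplified assumption of how such a search might look: the function name find_orfs and the min_codons parameter are made up for this example, it assumes the standard ATG start and TAA/TAG/TGA stop codons, and it ignores introns, the reverse strand, and the statistical models that real gene finders rely on.

```python
# Minimal ORF scan (illustrative only; real gene finders are far more elaborate).
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    start and end are 0-based and end-exclusive; the span includes the stop
    codon. min_codons counts codons from the start codon up to, but not
    including, the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTGGGTAACC"))
```

Scanning each frame separately keeps the logic easy to follow; a production tool would also score candidates rather than accept every ATG-to-stop span.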
Bioinformatics tools are essential for analyzing gene and protein expression profiles. Large-scale sequencing of cDNA libraries has generated large volumes of data from serial analysis of gene expression (SAGE), expressed sequence tags (ESTs), massively parallel signature sequencing (MPSS), transcriptome profiling, or RNA-Seq, and various applications of multiplexed in-situ hybridization microarray profiling [83-95].
In this context, chapter by Zhao et al. Moreover, Sripathy et al. In this book, readers can find an interesting chapter on bioinformatics challenges and tools for Hepatitis B genome analysis, written by Bell and Kramvis, which highlights the features of this small-genome virus that are relevant to bioinformatics analysis.
Similarly, protein microarrays and high-throughput mass spectrometry require bioinformatics analysis to identify proteins through complex sequence similarity searches against protein sequence databases [ 96 — ]. Bioinformatics also supports the analysis of gene regulation by searching for and comparing sequence motifs associated with promoters and other regulatory elements.
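A motif search of this kind can be sketched with a regular expression built from an IUPAC consensus string. The example below is only an assumption about how one might do it: the helper name motif_positions is hypothetical, and the TATA-box consensus TATAWAW is used purely as a familiar illustration. Real promoter analysis usually relies on position weight matrices rather than exact consensus matching.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_positions(sequence, consensus):
    """Return 0-based start positions where the consensus matches.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    body = "".join(IUPAC[base] for base in consensus.upper())
    pattern = re.compile("(?=(" + body + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical promoter fragment containing one TATA-like element.
    print(motif_positions("GGCTATAAAAGGC", "TATAWAW"))
```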
Examples of such bioinformatics tools include k-means clustering, hierarchical clustering, consensus clustering methods such as Bi-CoPaM, and self-organizing maps (SOMs), which can identify functionally active sequences from very complex microarray datasets [ — ]. Beyond these, bioinformatics plays a major role in collecting data on the functional elements of sequenced genomes using next-generation DNA-sequencing technologies and genomic tiling arrays.
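To make the idea of clustering expression profiles more concrete, here is a minimal k-means sketch in plain Python. It is not the implementation used by any of the tools named above; the function names, the toy profiles, and the fixed random seed are all assumptions chosen for illustration, and a real analysis would normally use an established library and normalized data.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, max_iterations=100, seed=0):
    """Cluster expression profiles into k groups.

    Returns (labels, centroids), where labels[i] is the cluster index
    assigned to profiles[i].
    """
    rng = random.Random(seed)
    # Start from k distinct profiles picked at random as initial centroids.
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each profile to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical expression values for four genes across three conditions.
    toy_profiles = [
        [0.1, 0.2, 0.1],
        [0.0, 0.1, 0.2],
        [5.0, 5.1, 4.9],
        [5.2, 4.8, 5.0],
    ]
    labels, _ = kmeans(toy_profiles, k=2)
    print(labels)
```

The cluster indices themselves are arbitrary; what matters is which genes end up sharing a label.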
Thanks to bioinformatics tools, the genomes, genes, and protein sequences of different organisms can be rapidly compared, searched, and interpreted. In addition, mutations can be identified that help to diagnose many complex human and plant diseases, characterize crop traits, and interpret complex evolutionary processes such as genome duplication, polyploidization, adaptation, and speciation.
One of the most widely used applications of bioinformatics is the identification of three-dimensional protein structures, together with molecular modeling and folding. These methods are used to predict the possible function of proteins or other molecular structures, model the behavior of molecules, fold a molecule into its native, biologically functional three-dimensional structure, and design drugs for many complex human diseases.
From coding DNA sequences, the primary structure of proteins, which is vital to understanding protein function, can be readily determined. Further, based on homology patterns in the primary structure and using homology modeling, important structural features and sites of interaction with other proteins can be determined. This makes it possible to predict the structure of a protein reliably from the known structure of one or more homologous proteins.
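Deriving a primary structure from a coding sequence amounts to reading the DNA three bases at a time and looking each codon up in the genetic code. The sketch below assumes the standard genetic code and a sequence that is already in frame; the function name translate and the choice of "X" for unrecognized codons are conventions adopted here for illustration only.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna):
    """Translate an in-frame coding DNA sequence into one-letter amino acids.

    Translation stops at the first stop codon. Codons containing
    unrecognized characters are reported as 'X'. Trailing bases that do
    not form a full codon are ignored.
    """
    coding_dna = coding_dna.upper()
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE.get(coding_dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGAAATTTGGGTAA"))
```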
Moreover, the identification of secondary, tertiary, and quaternary structures of proteins is very important to understand the function of proteins. The exact three-dimensional structure is essential for correct function, and a failure to fold into native structure generally produces inactive proteins or misfolded proteins that can be toxic [ ].
Bioinformatics of protein folding includes (1) the energy landscape of protein folding and (2) approaches to modeling protein folding [ 12 , 13 , ]. Using I-TASSER, all of the above-mentioned functional and structural characteristics of proteins, including ligand-binding sites, enzyme commission numbers, and gene ontology terms, can be explored on a comparative scale [ , ].
These are routinely used to investigate the structure, dynamics, surface properties, and thermodynamics of inorganic, biological and polymeric systems.
It helps to explore conformational changes associated with biomolecular function, as well as molecular recognition of proteins and membrane complexes. Protein folding, the catalytic sites of enzymes, and protein stability can all be studied using molecular modeling. A wide variety of bioinformatics tools for modeling and designing biomolecules is available [ — ]. In this book, the chapter by Leong et al. In addition, in this book, by Filntisi et al.
The properties of nodes and edges form the network topology. The approaches highlighted above (molecular sequence analysis, prediction and annotation, and molecular modeling) are also central to building, organizing, and systematizing biological networks of molecules e.
These include reception, signal transduction, gene regulation, and gene co-expression. Such molecular networks integrate many different data types, including DNA sequences, regulatory RNA, proteins, secondary metabolites, gene expression data, and other small molecules, all of which may be connected physically and functionally.
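One simple way to picture such a network is a gene co-expression graph, in which genes are nodes and an edge links two genes whose expression profiles are strongly correlated. The sketch below is an assumption-laden toy: the gene names, the expression values, the 0.9 threshold, and the function names are all invented for illustration, and real network inference uses more careful statistics.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def coexpression_network(expression, threshold=0.9):
    """Build an undirected graph as an adjacency dict of gene -> set of neighbors.

    Two genes are linked when the absolute correlation of their profiles
    reaches the threshold, so strong negative correlation also counts.
    """
    adjacency = {gene: set() for gene in expression}
    for gene_a, gene_b in combinations(expression, 2):
        if abs(pearson(expression[gene_a], expression[gene_b])) >= threshold:
            adjacency[gene_a].add(gene_b)
            adjacency[gene_b].add(gene_a)
    return adjacency


def degrees(adjacency):
    """Node degree, one of the simplest descriptors of network topology."""
    return {gene: len(neighbors) for gene, neighbors in adjacency.items()}


if __name__ == "__main__":
    # Hypothetical expression values across four conditions.
    toy_expression = {
        "geneA": [1.0, 2.0, 3.0, 4.0],
        "geneB": [2.0, 4.0, 6.0, 8.5],
        "geneC": [4.0, 3.0, 2.0, 1.0],
        "geneD": [1.0, 3.0, 1.0, 3.0],
    }
    print(degrees(coexpression_network(toy_expression)))
```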
Perhaps even more important on a technological level, all the other more recent data explosions in the life sciences, such as genotyping, transcriptomics, and proteomics, were only possible because genomes were available. These new techniques can help us unravel what we cannot see in the cellular system.
We thus can assemble new data, and work on even more innovative techniques. The basic principle stays the same, but data are tightly linked to techniques that follow up each other and will rapidly be replaced by new ones. I do sometimes already feel old. As Berend rightfully emphasized, this is only possible because of the sequencing revolution and the wealth of data that came along. The first bioinformatic breakthrough came from the vision of Margaret Dayhoff, back in the fifties, at a time when data sharing was a hassle.
This was the first example of a systematic, smart and well documented way of storing, sharing and querying data! That is the most intriguing aspect of the field of bioinformatics: there is so much we do not know yet!