The summary for a busy person: I am interested in Tufte's sparklines and in teaching old dogs new tricks. My interests are eclectic and not at all well focused. However, this particular page is just about biology-related informatics. For this discussion, biomedical informatics is the macro division of the subject matter and bioinformatics is the micro division. All kinds of informatics benefit from the database-backed Web site and from statistical models. Many of these terms are new, but they derive from problems that have attracted me since the late sixties.
Bioinformatics problems can be subdivided into three groups. The first is the algorithms used to piece together the genome. The second is the prediction of proteins, and hence of cellular behavior, from that genome. The third is translational research, or accelerating the transformation of genomic data into therapeutic products.
The global recession has shifted the emphasis of all health-care-related research considerably toward those areas which are, or are perceived to be, likely to lower costs and improve quality. Vaccines, insulin and ACE inhibitors are examples of the very few innovations that have done both. Performance-based pay has not been demonstrated to do either. The U.S. governmental incentives to adopt IT in health care reward use of the current technology. Because this existing technology is fragmented into EHR, diagnostic and e-prescribing software, it has little chance of lowering costs or increasing quality.
The Human Genome Project enabled us to specify the state of an individual human organism in far greater detail, but at the cost of entering, saving and retrieving a far greater number of data points. Edward Tufte's invention, sparklines, is a very important attempt to re-compress this explosion of data so that it can be communicated to decision makers in a timely manner, thus saving time and money and reducing errors and waste.
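To make the idea concrete, here is a minimal sketch of a text-only sparkline in Python. It is not Tufte's own method, just one common approximation that maps each value onto one of eight Unicode block characters. The function name `sparkline` and the sample readings are hypothetical, chosen only for illustration.

```python
# A rough text sparkline: each number becomes one of eight block characters,
# scaled between the minimum and maximum of the series.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Return a one-line string that sketches the shape of `values`."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no shape to show; use a mid-height bar throughout.
        return BARS[len(BARS) // 2] * len(values)
    top = len(BARS) - 1
    # Scale each value into 0..top and round to the nearest bar.
    return "".join(BARS[int((v - lo) * top / span + 0.5)] for v in values)


if __name__ == "__main__":
    # Made-up illustrative readings, not real patient data.
    readings = [92, 110, 145, 180, 160, 130, 105, 98]
    print("glucose", sparkline(readings), readings[-1])
```

The point of the design is that a word-sized graphic sits inline with the latest value, so a reader sees the trend and the current number in one glance instead of scanning a column of figures.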
Where will the manpower come from to manage all this data? For such a big problem, it would be efficient to provide some more training to those who have already mastered some of the requisite skills. We can teach old clinician dogs new tricks like sparklines and Ruby on Rails. We can teach old computer dogs the advances in molecular biology. The Y2K problem employed a lot of old dogs, and young dogs who learned some old COBOL tricks. The problems where biology and informatics intersect dwarf the Y2K problem.
There are persistent problems. More than half a century ago Turing found that a computer, even if it knew another computer's whole software program or genome, could not always predict what the other computer would do. Another old problem lies with the relational database that underlies most database-backed Web sites. No algorithm exists to get to a fully normalized database where each of the rules is both necessary and sufficient for modeling an enterprise. So our models in biomedical informatics will be approximate and rely on statistics. As George Box put it, "All models are wrong, some are useful."
Revised March 14, 2009