Archive for the ‘Publications’ Category

Evaluation of tiling arrays from major chip shops

Posted on February 20th, 2008 by Roland Krause in Publications, Technology

Everyone is using tiling arrays these days, but the quality of the results is still hard to judge, even for the most experienced people in the field. Designs differ in oligo density, oligo length, selection algorithms and experimental procedures. On top of that, several analysis procedures each claim to be superior to the others. Eight groups performed independent evaluations of tiling arrays for the human genome from the major vendors (Affymetrix, NimbleGen and Agilent) and report their findings in Genome Research, on February 7th, 2008. As the scientists had no knowledge of the PCR products that were spiked into the sample, this is a blind evaluation and thus much more powerful than the typical validations using real-time PCR.

The overall results are sobering: only 50% of the sequences selected were consistently detected at 10% FDR. This should not be taken too literally, as the majority of the missed samples were present in low numbers only (1.25- to 4-fold) and we do not have good data for the true fold changes in these experiments.
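For readers who have not met the term, here is a minimal sketch of what calling detections "at 10% FDR" can look like. It uses the Benjamini-Hochberg procedure, which is one common way to control the false discovery rate; that the study used this particular procedure is an assumption on my part, and the function name and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return booleans marking which hypotheses are called at the given FDR.

    Sketch of the Benjamini-Hochberg step-up procedure; not necessarily
    the method used in the Genome Research study.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Everything up to and including that rank is called.
    called = [False] * m
    for idx in order[:cutoff_rank]:
        called[idx] = True
    return called


# Hypothetical p-values for six spiked-in regions (illustrative only).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
calls = benjamini_hochberg(toy_p, fdr=0.10)
```

With these toy numbers the first four regions are called and the last two are missed, which is the kind of tally behind a statement like "50% detected at 10% FDR".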

All three vendors supported this study and employ a good number of the authors. In that light, it is no surprise that the study does not report major quality differences between the chips. They do report that Affymetrix arrays require more repetitions to reach the same quality of results but are cheaper (prices are conveniently tabulated in the paper). No method performs well in the low concentration regime, although Agilent and NimbleGen arrays look a little better at it. I certainly don’t want to imply that the affiliations biased the results of this study (this is the internet after all). On the contrary, this is a very useful collaboration between chip vendors and technology leaders in academia. But read it before the next salesperson in the chip business knocks on your door or before you decide how much to trust a particular data set.

Biological interpretation of protein-protein interaction networks

Posted on January 11th, 2008 by Roland Krause in Publications

The current issue of Nature Biotechnology contains a commentary on protein-protein interaction networks that nicely reflects the view of the informed end of the community. The only thing that I would criticize in their treatment is the lack of differentiation between the methods. Only by checking the references will you notice that the lack of overlap they cite is between the now very dated data sets by Uetz and Ito. If you compare the high-confidence interactions (not the complete sets) of the 2006 studies by Krogan et al. and Gavin et al., you’ll notice that they are in good agreement, even if we are nowhere close to what we are used to from studying genomic information.
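If you want to check such overlap claims yourself, a minimal sketch could look like the following. It measures overlap as the Jaccard index of unordered protein pairs; this choice of measure is my assumption, not the one used in the commentary, and the protein identifiers are placeholders rather than real data from any of the studies.

```python
def normalize(pairs):
    """Treat interactions as unordered pairs, so (A, B) equals (B, A)."""
    return {frozenset(pair) for pair in pairs}


def jaccard_overlap(pairs_a, pairs_b):
    """Shared interactions divided by all interactions seen in either set."""
    a, b = normalize(pairs_a), normalize(pairs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Placeholder identifiers, not taken from any published data set.
study_one = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
study_two = [("P2", "P1"), ("P3", "P2"), ("P5", "P6")]
overlap = jaccard_overlap(study_one, study_two)
```

Restricting both inputs to their high-confidence subsets before calling `jaccard_overlap` is the comparison suggested above; the complete sets will usually look much worse.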

Retractions on the rise

Posted on January 4th, 2008 by Roland Krause in Publications, Publishing

The current issue of EMBO Reports contains a short analysis showing that the number of retractions of scientific publications is increasing dramatically. The authors give two possible explanations: competition amongst scientists lowers the quality of published research, or scientists are more aware of other people’s mistakes and “the self-correction of science is improving”.

While both alternatives are plausible, my favourite suggestion is that online publications have made it feasible to retract papers and there is an incentive for the journals to show that they take care of possible misconduct. A large number of the retractions might be of heavily flawed works rather than fraud (a blogger’s assumption, I have not checked enough retractions myself). Earlier, the community would know that a particular work is not reproducible, but the retraction process was cumbersome and consumed too much time, so it was only pursued in the grossest instances.

Unlikely outcomes

Posted on June 19th, 2007 by Roland Krause in Evolution, Publications

“The cosmological model of eternal inflation and the transition from chance to biological evolution in the history of life” is the kind of paper that you would want for breakfast. Eugene Koonin combines a few back-of-the-envelope calculations on the absurdly low probabilities of the emergence of ribonucleotides/proteins capable of natural selection with a multiverse view of cosmology to explain why the existence of life as we know it is not unlikely.
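The core of this kind of argument fits in one line. The symbols here are mine, not Koonin’s: p is the tiny but nonzero probability of the event in any one universe, N the number of universes, and I assume the trials are independent.

```latex
P(\text{at least one occurrence}) = 1 - (1 - p)^{N} \approx 1 - e^{-pN},
\qquad
\lim_{N \to \infty} \bigl[ 1 - (1 - p)^{N} \bigr] = 1 \quad \text{for any } p > 0 .
```

However small p is, a large enough N makes the event all but certain somewhere, and the observers necessarily find themselves where it happened.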

The essay touches on the anthropic principle and I won’t be surprised if the anti-creationist science blogs have a word with him on fueling the intelligent design debate, although he states that there is no room for such quackery in this view of the world. Much of it sounds like something I have heard before; the novel elements are the numbers and the break point in evolution after the RNA world-protein world transition.
The article appears in Biology Direct, still my favorite journal for its open peer review. Again, the reviews are quite critical, including questioning Koonin’s background in philosophy. Then again, how many philosophers have his expertise in the RNA world?

What’s a gene in 2007?

Posted on June 14th, 2007 by Roland Krause in Publications

If you haven’t had the time to follow up on ENCODE, you’re probably not alone. From what I have perceived so far, it looks a lot more interesting than the publication of the human genome back in 2000, which was much hyped but had hardly any novel findings, so that everyone had to elaborate on the “lower than expected” number of genes. I am still yawning but that’s because it’s 7am.

My day had started with one of the Genome Research publications accompanying the major ENCODE publication in Nature, the working definition of a gene by Mark Gerstein et al. It contains a review of the many concepts that we’ve had for the unit of heredity (remember “one gene, one enzyme”?). Their notion of gene has become very operational and reads:

The gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products.

In particular, the aspect of regulation has been removed, there is no single hard structure like a start codon and the focus is on products, irrespective of intermediate transcripts. It does not differ from the view of most bioinformaticians, who always focused on a representative gene product for analyses, albeit typically ignoring ncRNAs. One should watch this space: even if I do not see much disagreement with my previous concepts on the matter, different opinions exist, and the elimination of transcripts in particular might be difficult to accept.
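To make the “union of genomic sequences” part concrete, here is a minimal sketch that merges the segments encoding several overlapping products into one set of intervals. The product names and coordinates are invented for illustration, and the sketch deliberately ignores the hard part of the definition, namely deciding which products form a “coherent set”.

```python
def union_of_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals into their union."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical products with made-up genomic coordinates.
products = {
    "product_a": [(100, 200), (300, 400)],
    "product_b": [(150, 250), (300, 350)],
    "product_c": [(500, 600)],
}
gene = union_of_intervals(seg for segs in products.values() for seg in segs)
```

Note that nothing in `gene` refers to a transcript, a promoter or a start codon; only the sequence segments that end up in products are kept, which is the operational flavor of the definition.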

Given that the paper arrives at a single-sentence definition of the gene, its conclusion seems fairly weak. The next big thing, the notion of function of a gene product, can hardly ever be summarized as elegantly, in my opinion, as the complexity of the organism is reflected in the functional definition. Besides, Gene Ontology already has a fairly good grasp on the different aspects of it.

Anyway, could someone please update the Wikipedia article for Gene? I’ve got to work.

On metagenomics

Posted on March 13th, 2007 by Roland Krause in Databases, Evolution, Publications, Technology

Konrad was the first this morning to hint at the release of Venter’s effort to provide environmental sequencing samples from the world’s oceans. The data are backed by several papers in PLoS Biology and the new CAMERA database. Other bloggers have followed and the mainstream media will pick it up soon.
What to add on a busy day like today? The results might not be breathtaking, but that was just as true for the release of the human genome back in 2000. Sequencing the human genome was a necessity - but the environmental samples provide a completely new picture of our planet, even if our initial view is warped and noisy and our ways of understanding the data are limited.

Hyperstructures probably underhyped

Posted on March 9th, 2007 by Roland Krause in Publications

Friday’s my literature review day; the most noteworthy paper from today’s lunch break reading is Functional Taxonomy of Bacterial Hyperstructures in the current issue of Microbiology and Molecular Biology Reviews. The work by Vic Norris and many others discusses the notion of hyperstructures, the principal, large-scale organization of cellular processes in bacteria.

The review is not at all a patchwork of the latest research on particular subject areas. Instead, it revisits modern concepts of well-characterized structures and processes such as transertion, the physical interplay of transcription, translation and secretion, which happen in physical proximity in bacteria. Other hyperstructures discussed are related to the heavily debated bacterial cytoskeleton and processes like glycolysis and cell division.

Despite a philosophical inclination, the review does not wander off into quasi-esoteric musings and I would recommend it to anyone working with prokaryotes as an update of our view of the bacterial cell. Besides, the authors show that they don’t have to ride the buzzword wave: systems biology is never mentioned.

Dangling on String

Posted on January 29th, 2007 by Roland Krause in Databases, Publications

Singling out my favorite amongst the 174 biological information resources in the current database issue of Nucleic Acids Research is easily achieved: String, a protein-protein interaction database primarily developed in the group of Peer Bork at the EMBL, was updated to version 7, introducing many small and a few major improvements, and should finally be covered here.

What’s in a genome sequence

Posted on November 12th, 2006 by Roland Krause in Publications, Technology

Sequencing a eukaryotic organism apparently needs to pay back the effort with the initial publication, just like a Hollywood blockbuster needs to make its money back on the opening weekend.
The problem is just that there are rarely exploitable surprises found in the sequence. At best, we bump into harsh reminders of our ignorance. The genome of Mycobacterium tuberculosis told us that the PE/PPE genes, which encode a large group of species-specific proteins, were missed in 100 years of research on the bug. And remember that the main message of the human genome was that we had fewer genes than expected. Breaking news: estimates were wrong.
So, do the 222 Toll-like receptors found in the sea urchin’s genome (special Science issue) tell us much about the development of innate immunity? It rather shows that it will be nearly impossible to make simple inferences for these systems of the sea urchin and metazoans – which is certainly highly interesting yet somewhat unpleasant.

The conclusions of the review of the TLRs in Science include the following.
First, this genome sequence significantly refines our understanding of deuterostome immunity.
Which is a nice way of saying that many previous assumptions need to be jettisoned. Sequencing a genome is a good way of supplementing our lack of understanding with another myriad of unanswered questions - and a necessary framework to answer them. But didn’t yesterday’s ignorance feel more knowledgeable when the question/answers ratio suddenly increases dramatically?

N.B. Check Jonathan Eisen’s comment on justifying the efforts of sequencing.

Implausible interactions

Posted on November 6th, 2006 by Roland Krause in Miscellaneous, Publications

When browsing any protein-protein interaction screen, you can spot a substantial number of obvious false positives simply by assessing the function and cellular role of the two proteins. Hands up: if your screen came up with a gyrase interacting with a glutamate racemase, would you put it in the abstract as a cool example?
After all, if you run two-hybrid, the gyrase obviously interacts with the DNA. If your screen is based on co-immunoprecipitation or another biochemical technique, the high abundance of an enzyme like glutamate racemase drives an obviously unspecific interaction. Obviously, you’d be discarding something interesting.