I have just returned from another trip to Singapore, where I was lucky enough to attend the Merlion Metabolomics Workshop at the National University of Singapore. The meeting was a deliberate linkage between metabolomics researchers in France, including MetaboHub, and those in Singapore such as Professor Ong Choon Nam. This international partnership was highlighted by the presence of the French Ambassador, who graced us with a 5-minute morale-booster before he legged it to his next event.
Day one saw me speak in the ‘young researchers’ group, which tickled me somewhat. Technically I do qualify as a young researcher given that I’m only a couple of years out of my PhD but it doesn’t often feel like it.
My presentation was focused on my conversion of the Viant group metabolomics analysis pipeline into the Galaxy workflow system. I laid it on quite thick about the need for reproducibility in science in general and included the allegation, “I’m not saying that the 20 people you’ve heard today were lying, I’m just saying that if you wanted to use what they’ve told you, you’d be lucky to reproduce 4 of their experiments.”
DOIs for conference presentations
In an attempt to be more open myself, I thought I’d try to get a permanent link to my talk and put that up on the first and last slides – so that the audience could access the slides at any time and could hold me to account! I opted to put it on FigShare … which also means they can cite it and I can track my impact via my ORCID account. That seemed to go down well and I know of at least one researcher who has followed suit (Dave Broadhurst’s slides on FigShare).
Reproducibility and standardisation
It was great to see these themes of reproducibility and standardisation throughout the rest of the conference. MetaboHub has also been putting metabolomics tools into Galaxy (we’ve both submitted papers at the same time) and Etienne Thévenot mentioned this and the benefits of Galaxy in his talk about biostatistics for biomarker discovery. What I especially appreciated was that he raised the need for common standards and open reference datasets.
The growing data repository, MetaboLights, was promoted by PhD student Umashankar Shivshankar while he talked about his attempts to remove systematic variation from data long after it has been collected. I could empathise with his troubles because I had the same problem as a PhD student: presented with data and asked to perform some sort of statistical miracle in order to remove artefacts. This is a continuous problem for informatics researchers, particularly at PhD level. Mick Watson has written a brilliant blog about this problem called “You’re not allowed bioinformatics anymore”.
Eminent Professor Roy Goodacre brought all these concepts together during his part of the closing panel discussion by highlighting the data standards development of COSMOS and the MetabolomeXchange. It was really heartening to hear him prioritise the need for open, transparent data and research.
But there was also a good talk by Dr Ho Ying Swan of the Bioprocessing Technology Institute at A*STAR wherein the references to their metabolite identification software included a patent application number. Of course, the patent doesn’t mean they would charge academic researchers for use of that software – I’m waiting to hear further details on that – but I much prefer the model whereby researchers are rewarded via publications and future research funding rather than by patents and license fees. That said, BTI may be more private sector than public… not sure that makes me feel better.
Highlights for the future
For me, the two most interesting concepts raised (other than reproducible, open science) came from Associate Prof. Markus Wenk of NUS and Estelle Pujos-Guillot. Markus showed that the high level of variation in important compounds such as cholesterol means that longitudinal studies are necessary if we are to diagnose or give a prognosis based on them (so we can’t even say ‘you need to worry about your high cholesterol’, maybe it’s always been high!). Estelle talked briefly about her use of Formal Concept Analysis for extracting knowledge from data.
I had recently talked at length with Barend Mons about the benefits of converting data into semantic ‘assertions’ (triplets of [concept: relationship: concept]), linked to formal ontologies so that computers can interpret knowledge from the raw data. This is something that Barend is promoting as part of his Data FAIRport project and that the ISA-Tab team are aiming towards specifically for metabolomics. I really feel that this type of standardisation is necessary for the future development of biological research.
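To make the idea of an ‘assertion’ a bit more concrete, here is a minimal sketch in Python of what [concept: relationship: concept] triplets might look like and how a computer could query them. Everything in it is my own illustration: the `Assertion` type, the `match` helper and the identifiers such as `sample:S1` are made-up placeholders, not real ontology terms and not how Data FAIRport or ISA-Tab actually represent things.

```python
from typing import List, NamedTuple, Optional


class Assertion(NamedTuple):
    """One [concept: relationship: concept] triplet (hypothetical structure)."""
    subject: str
    relationship: str
    obj: str


# Illustrative assertions only; these identifiers are invented placeholders.
ASSERTIONS = [
    Assertion("sample:S1", "has_measured_compound", "compound:cholesterol"),
    Assertion("sample:S1", "derived_from", "subject:P7"),
    Assertion("sample:S2", "has_measured_compound", "compound:glucose"),
]


def match(
    assertions: List[Assertion],
    subject: Optional[str] = None,
    relationship: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Assertion]:
    """Return the assertions that agree with every field that is not None."""
    return [
        a
        for a in assertions
        if (subject is None or a.subject == subject)
        and (relationship is None or a.relationship == relationship)
        and (obj is None or a.obj == obj)
    ]


# Example query: everything asserted about sample S1.
about_s1 = match(ASSERTIONS, subject="sample:S1")
```

In a real system the strings would be identifiers from a formal ontology, so that two labs writing the same assertion would use the same terms. That shared vocabulary is what lets a query like the one above work across datasets.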
As reflects my own interests, this article focuses on reproducibility and standards. There were many other presentations regarding specific applications from health, environment, nutrition and bio-engineering (Sastia Prama Putri describing her attempts to engineer cells to improve alcohol production was pretty futuristic and cool). But if metabolomics is to join with the other ‘omics to produce a holistic understanding of disease, climate adaptation or anything else, then it will need 2 things:
First, it needs to improve its standards and reproducibility, which are sadly lagging behind other technologies. Finding robust correlations between metabolic profile and e.g. transcriptomic profile will require a low level of systematic variation – something that metabolomics can have problems with, especially for large studies.
Second, I expect that application-specific presentations will need to start gathering at multi-omics conferences focused on that disease/organ/animal model, and single ‘omics conferences will remain for the purposes of advancing technology and best practice within that field. It can be interesting to see someone talk through their metabolomics study but if they are applying a fairly standard pipeline to an organism that is unrelated to your own niche, it can be quite tedious to see yet-another-scores-plot and yet-another-list-of-putative-metabolites. It only really becomes interesting when their methods change.
For these two reasons, I suspect that a methods and reproducibility focus is fitting for reviewing this metabolomics meeting. Hopefully the momentum from having the annual Metabolomics Society meeting in Japan and then having this meeting in Singapore will continue with more Asia-based meetings in the new year.