The CompOmics group, headed by Prof. Dr. Lennart Martens, is part of the Department of Biomolecular Medicine of the Faculty of Medicine and Health Sciences of Ghent University, and the VIB-UGent Center for Medical Biotechnology of VIB, both in Ghent, Belgium.
The group has its roots in Ghent, but has active members all over Europe, and specializes in the management, analysis and integration of high-throughput Omics data with an aim towards establishing solid data stores, processing methods and tools to enable downstream systems biology research.
The CompOmics team is always looking for talented people. Go to the jobs section on the VIB website to look for open positions.
The following web applications are developed and hosted by the group.
Here is a selection of our free and open-source tools. A full list can be found on GitHub.
Do you want to learn about Proteomics and Proteomics data analysis? Have a look at our CompOmics tutorials:
BIOINFORMATICS, 2021
MOLECULAR & CELLULAR PROTEOMICS, 2021
Proteogenomics approaches often struggle with the distinction between true and false peptide-to-spectrum matches as the database size grows. However, features extracted from tandem mass spectrometry intensity predictors can enhance the peptide identification rate and can provide extra confidence for peptide-to-spectrum matching in a proteogenomics context. To that end, features from the spectral intensity pattern predictors MS2PIP and Prosit were combined with the canonical scores from MaxQuant in the Percolator postprocessing tool for protein sequence databases constructed out of ribosome profiling and nanopore RNA-Seq analyses. The presented results provide evidence that this approach enhances both the identification rate and the validation stringency in a proteogenomic setting.
JACS AU, 2021
Rising population density and global mobility are among the reasons why pathogens such as SARS-CoV-2, the virus that causes COVID-19, spread so rapidly across the globe. The policy response to such pandemics will always have to include accurate monitoring of the spread, as this provides one of the few alternatives to total lockdown. However, COVID-19 diagnosis is currently performed almost exclusively by reverse transcription polymerase chain reaction (RT-PCR). Although this is efficient, automatable, and acceptably cheap, reliance on one type of technology comes with serious caveats, as illustrated by recurring reagent and test shortages. We therefore developed an alternative diagnostic test that detects proteolytically digested SARS-CoV-2 proteins using mass spectrometry (MS). We established the Cov-MS consortium, consisting of 15 academic laboratories and several industrial partners to increase applicability, accessibility, sensitivity, and robustness of this kind of SARS-CoV-2 detection. This, in turn, gave rise to the Cov-MS Digital Incubator that allows other laboratories to join the effort, navigate, and share their optimizations and translate the assay into their clinic. As this test relies on viral proteins instead of RNA, it provides an orthogonal and complementary approach to RT-PCR using other reagents that are relatively inexpensive and widely available, as well as orthogonally skilled personnel and different instruments. Data are available via ProteomeXchange with identifier PXD022550.
NATURE COMMUNICATIONS, 2021
The placenta is the interface between mother and fetus and inadequate function contributes to short and long-term ill-health. The placenta is absent from most large-scale RNA-Seq datasets. We therefore analyze long and small RNAs (101 and 20 million reads per sample respectively) from 302 human placentas, including 94 cases of preeclampsia (PE) and 56 cases of fetal growth restriction (FGR). The placental transcriptome has the seventh lowest complexity of 50 human tissues: 271 genes account for 50% of all reads. We identify multiple circular RNAs and validate 6 of these by Sanger sequencing across the back-splice junction. Using large-scale mass spectrometry datasets, we find strong evidence of peptides produced by translation of two circular RNAs. We also identify novel piRNAs which are clustered on Chr1 and Chr14. PE and FGR are associated with multiple and overlapping differences in mRNA, lincRNA and circRNA but fewer consistent differences in small RNAs. Of the three protein coding genes differentially expressed in both PE and FGR, one encodes a secreted protein FSTL3 (follistatin-like 3). Elevated serum levels of FSTL3 in pregnant women are predictive of subsequent PE and FGR. To aid visualization of our placenta transcriptome data, we develop a web application (https://www.obgyn.cam.ac.uk/placentome/).
JOURNAL OF PROTEOME RESEARCH, 2021
Discovery of variant peptides such as a single amino acid variant (SAAV) in shotgun proteomics data is essential for personalized proteomics. Both the resolution of shotgun proteomics methods and the search engines have improved dramatically, allowing for confident identification of SAAV peptides. However, it is not yet known if these methods are truly successful in accurately identifying SAAV peptides without prior genomic information in the search database. We studied this in unprecedented detail by exploiting publicly available long-read RNA sequences and shotgun proteomics data from the gold standard reference cell line NA12878. Searching spectra from this cell line with the state-of-the-art open modification search engine ionbot against carefully curated search databases resulted in 96.7% false-positive SAAVs and an 85% lower true positive rate than searching with peptide search databases that incorporate prior genetic information. While adding genetic variants to the search database remains indispensable for correct peptide identification, inclusion of long-read RNA sequences in the search database contributes only 0.3% new peptide identifications. These findings reveal the differences in SAAV detection that result from various approaches, providing guidance to researchers studying SAAV peptides and developers of peptide spectrum identification tools.
JOURNAL OF PROTEOME RESEARCH, 2021
The study of microbiomes has gained in importance over the past few years and has led to the emergence of the fields of metagenomics, metatranscriptomics, and metaproteomics. While initially focused on the study of biodiversity within these communities, the emphasis has increasingly shifted to the study of (changes in) the complete set of functions available in these communities. A key tool to study this functional complement of a microbiome is Gene Ontology (GO) term analysis. However, comparing large sets of GO terms is not an easy task due to the deeply branched nature of GO, which limits the utility of exact term matching. To solve this problem, we here present MegaGO, a user-friendly tool that relies on semantic similarity between GO terms to compute the functional similarity between multiple data sets. MegaGO is high performing: Each set can contain thousands of GO terms, and results are calculated in a matter of seconds. MegaGO is available as a web application at https://megago.ugent.be and is installable via pip as a standalone command line tool and reusable software library. All code is open source under the MIT license and is available at https://github.com/MEGA-GO/.
JOURNAL OF PROTEOME RESEARCH, 2021
Metaproteomics has become an important research tool to study microbial systems, which has resulted in increased metaproteomics data generation. However, efficient tools for processing the acquired data have lagged behind. One widely used tool for metaproteomics data interpretation is Unipept, a web-based tool that provides, amongst others, interactive and insightful visualizations. Due to its web-based implementation, however, the Unipept web application is limited in the amount of data that can be analyzed. In this manuscript we therefore present Unipept Desktop, a desktop application version of Unipept that is designed to drastically increase the throughput and capacity of metaproteomics data analysis. Moreover, it provides a novel comparative analysis pipeline and improves the organization of experimental data into projects, thus addressing the growing need for more performant and versatile analysis tools for metaproteomics data.
For mass spectrometry-based peptide and protein quantification, label-free quantification (LFQ) based on precursor mass peak (MS1) intensities is considered reliable due to its dynamic range, reproducibility, and accuracy. LFQ enables peptide-level quantitation, which is useful in proteomics (analyzing peptides carrying post-translational modifications) and multi-omics studies such as metaproteomics (analyzing taxon-specific microbial peptides) and proteogenomics (analyzing non-canonical sequences). Bioinformatics workflows accessible via the Galaxy platform have proven useful for analysis of such complex multi-omic studies. However, workflows within the Galaxy platform have lacked well-tested LFQ tools. In this study, we have evaluated moFF and FlashLFQ, two open-source LFQ tools, and implemented them within the Galaxy platform to offer access and use via established workflows. Through rigorous testing and communication with the tool developers, we have optimized the performance of each tool. Software features evaluated include: (a) match-between-runs (MBR); (b) using multiple file-formats as input for improved quantification; (c) use of containers and/or conda packages; (d) parameters needed for analyzing large datasets; and (e) optimization and validation of software performance. This work establishes a process for software implementation, optimization, and validation, and offers access to two robust software tools for LFQ-based analysis within the Galaxy platform.
ANALYTICAL CHEMISTRY, 2020
Missing values are a major issue in quantitative data-dependent mass spectrometry-based proteomics. We therefore present an innovative solution to this key issue by introducing a hurdle model, which is a mixture between a binomial peptide count and a peptide intensity-based model component. It enables dramatically enhanced quantification of proteins with many missing values without having to resort to harmful assumptions for missingness. We demonstrate the superior performance of our method by comparing it with state-of-the-art methods in the field.
ACS OMEGA, 2020
Despite its growing popularity and use, bottom-up proteomics remains a complex analytical methodology. Its general workflow consists of three main steps: sample preparation, liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS), and computational data analysis. Quality assessment of the different steps and components of this workflow is instrumental to identify technical flaws and avoid loss of precious measurement time and sample material. However, assessment of the extent of sample losses along the sample preparation protocol, in particular after proteolytic digestion, is not yet routinely implemented because of the lack of an accurate and straightforward method to quantify peptides. Here, we report on the use of a microfluidic UV/visible spectrophotometer to quantify MS-ready peptides directly in the MS-loading solvent, consuming only 2 μL of sample. We compared the performance of the microfluidic spectrophotometer with a standard device and determined the optimal sample amount for LC-MS/MS analysis on a Q Exactive HF mass spectrometer using a dilution series of a commercial K562 cell digest. A careful evaluation of selected LC and MS parameters allowed us to define 3 μg as an optimal peptide amount to be injected into this particular LC-MS/MS system. Finally, using tryptic digests from human HEK293T cells, we show that injecting equal peptide amounts, rather than approximate ones, results in less variable LC-MS/MS and protein quantification data. The obtained quality improvement together with easy implementation of the approach makes it possible to routinely quantify MS-ready peptides as a next step in daily proteomics quality control.
lennart [dot] martens [AT] UGent.be