Computational Systems Biology

Peter Schaap
Associate professor
Maria Suarez-Diez
Associate professor
Edoardo Saccenti
Assistant professor
Jasper Koehorst
Cristina Furlan
Rob Smith

Systems Biology takes an integrative approach, applying computational strategies to integrate heterogeneous data in order to model biological systems and discover their properties. A main goal of the Laboratory of Systems and Synthetic Biology is to gain a systems-level understanding of societally relevant microorganisms and microbial ecosystems, and to translate this knowledge into applications of biotechnological, medical and environmental interest.

Our research focuses on deriving a deeper understanding of microbial systems by uncovering biological meaning from genome-scale data and through multiscale data integration. Specifically, we are interested in (i) how genome information leads to function, (ii) how microbial metabolic processes are regulated and adapt in extant species, (iii) how microbial organisms and ecosystems respond to (a)biotic environmental cues and (iv) how they can be manipulated to enhance the yield of desired products or to diminish their pathogenicity.

To address these questions, we build Systems Biology frameworks encompassing different levels of granularity within the fields of Semantics, Metabolic Engineering and Synthetic Biology. We apply them in the fields of Host-Pathogen interactions, Systems Medicine and Biotechnology to describe biological phenomena, with the specific aim of extracting emergent system properties, biological concepts and knowledge.

Current research projects

Unlock: a large-scale infrastructure for research on microbial communities

Microbial communities are of key importance at different scales in our society, ranging from individual health issues related to the microbial communities inhabiting the human body to global emissions of greenhouse gases (e.g. CH4 and N2O) related to microbial activity.

Even though microbial ecosystems are characterized by an enormous diversity, and mixed microbial communities inherently have interesting emergent properties, fundamental and applied microbial research has historically focused on a very limited number of isolated single strains. This is because many of the key microorganisms in microbial communities depend on symbiotic interactions for growth and are therefore difficult to study and explore in isolation. Because of these challenges, we have been able to isolate, study and use no more than 1% of the natural microbial diversity, meaning that we have overlooked a major fraction of the microbial potential available in nature.

In light of these challenges, three major limitations exist in our current experimental procedures for research on microbial communities: (i) the lack of high-throughput cultivation facilities for parallel and comparative analysis of microbial ecosystems, (ii) the lack of effective integration of these cultivation studies with molecular systems characterizations, and (iii) the lack of tight integration and of transparent, uniform storage and processing of the generated data.

Unlock is a unique facility for research on mixed microbial communities and will address exactly these three limitations by enabling research on mixed microbial communities at an unprecedented scale and efficiency. The Unlock research infrastructure (Figure 1) is composed of three complementary experimental platforms for high-throughput discovery and characterization of microbial communities, and a FAIR-data platform for large-scale data storage, data extraction and analysis of high-throughput data in a cloud-based infrastructure. The FAIR-data platform will be developed and maintained by the Computational Systems Biology group at the Laboratory of Systems and Synthetic Biology. As such, the platform will be equipped with an up-to-date ecosystem of robust, state-of-the-art open source tools for data handling, information retrieval, statistical analysis and visualization of Omics data. The continuous development of this platform provides ample opportunity for MSc students who wish to learn more about FAIR large-scale data handling, the Common Workflow Language, bioinformatics analysis pipelines, Docker containers and the use of Semantic Web techniques in the Life Sciences.

Figure 1. The Unlock research infrastructure.

Semantic Systems Biology

High-throughput biological data generating technologies deliver ever-growing amounts of heterogeneous (meta)data at different scales, which are produced, stored and analysed in different structured and semi-structured formats. Integrating and analysing these heterogeneous biological data and knowledge requires efficient information retrieval and management systems. Semantic Systems Biology is a Systems Biology approach that uses Semantic Web technologies to capture this knowledge about biological systems.

To increase the degree of interoperability of genome annotations, the Genome Biology Ontology Language (GBOL) and its associated stack (GBOL stack) were developed. GBOL is provenance-centred and provides a consistent representation of genome-derived automated predictions, linked to the dataset-wise and element-wise provenance of the predicted elements. GBOL is modular in design, extensible and integrated with existing ontologies [2].

An ontology is a specification of a conceptualization of a domain of knowledge, and it must be adaptable to meet new requirements. Future adaptations and expansions must respect the coherence of the ontology, because semantic inconsistencies occur when the meaning of entities in the ontology is changed. To build and manage the large variety of properties and classes in an easy-to-use format, Empusa was developed as part of the GBOL stack. Empusa is a Java application that converts OWL/ShEx-like ontologies into an API for Java and R.

Follow this link for more information.

Translating bacterial genotypes, traits and phenotype extraction from microbial genome data using semantics and machine learning

The unprecedented increase in sequenced microbial (meta)genomes is in stark contrast with the amount of phenotypic information available for these sequenced strains and species. Understanding and unravelling (emergent) microbial phenotypic properties is of key importance for society and requires sophisticated, FAIR bioinformatic approaches, which are essential for further exploitation of the latent functional repertoire for medical, environmental and biotechnological purposes. However, current sequence-based methods are limited to analysing at most a few hundred genomes, because the all-against-all comparisons required to build genome-specific profiles of evolutionary information scale with the square of the number of input genomes. In addition, sequence-based methods have great difficulty dealing with multi-domain proteins that are subject to fusion and fission events. To bypass both problems, we compare genome-specific profiles of evolutionary information using protein domain architecture strings instead of protein sequences. Comparison of domain strings allows functional information to be transferred through guilt-by-association on a very large scale. To obtain such FAIR domain-based functional genome annotations, we have developed SAPP, an ontology-driven platform that provides genome-derived functional domain annotations in a linked data structure, enabling functional domain analysis and comparison across thousands of microbial genomes.
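The scaling argument and the idea of comparing genomes through domain architectures can be illustrated with a minimal Python sketch. This is not SAPP and does not reproduce its data model: the genome names, the Pfam-style identifiers, the "~" separator and the choice of Jaccard similarity are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_comparisons(n_genomes: int) -> int:
    """Number of unordered genome pairs in an all-against-all comparison."""
    return n_genomes * (n_genomes - 1) // 2


def jaccard(a: frozenset, b: frozenset) -> float:
    """Shared architectures divided by all architectures seen in either genome."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical genomes. Each protein is reduced to one domain architecture
# string: its domains in N- to C-terminal order, joined by "~" (assumed format).
genomes = {
    "genome_A": frozenset({"PF00001~PF00002", "PF00003", "PF00004~PF00005"}),
    "genome_B": frozenset({"PF00001~PF00002", "PF00003", "PF00006"}),
    "genome_C": frozenset({"PF00006", "PF00007"}),
}

# Architectures are compared as plain strings, so no sequence alignment is needed.
similarities = {
    (g1, g2): jaccard(genomes[g1], genomes[g2])
    for g1, g2 in combinations(sorted(genomes), 2)
}

# Inverted index: architecture -> genomes that contain it. Building it takes
# one pass over all proteins, and it answers "which genomes share this
# architecture?" without comparing every genome pair.
architecture_index = defaultdict(set)
for name, architectures in genomes.items():
    for architecture in architectures:
        architecture_index[architecture].add(name)

if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, "genomes ->", pairwise_comparisons(n), "pairs")
    for pair, score in similarities.items():
        print(pair, score)
```

The pair count grows quadratically with the number of genomes, whereas the index grows only with the number of annotated proteins, which is one way to see why string-based domain profiles are attractive at the scale of thousands of genomes.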

Conserved domain architecture patterns can reveal beneficial biotechnological and medical properties and provide functional characterizations of conserved domains of unknown function (DUFs), speeding up their exploitation. The aim of this project is to prospect the large corpus of publicly available microbial genome sequences for novel (operonic) traits at a very large scale, using context-based pattern mining together with tools and language models taken from natural language processing.

Ecological modelling approaches to microbiomes

In nature, microbes usually interact with other microorganisms and can thus form very complex and dynamic microbial communities (MCs), which possess higher-order emergent properties with crucial roles in the environment and in human health. In response to environmental changes, the structure of an MC can fluctuate dynamically. To better understand an MC, it is necessary to understand:

– What species play an essential role in the MC.

– What kind of interactions are taking place between the microbial species in the MC.

– What are the transient and steady-state phases of the MC. 

The main objectives of this research are: (i) to develop robust models for MC analysis and to study how to use them efficiently, (ii) to develop a software platform that addresses the problem of selecting the right mathematical model to analyse interactions in an MC, and (iii) to analyse and report on a selection of (semi-)complex communities using these tools.
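As an illustration of the kind of model such a platform might have to choose between, the sketch below simulates a two-species generalized Lotka-Volterra (gLV) system in plain Python. The gLV form is only one commonly used model class and is an assumption here, not a model named by the project; the growth rates, the interaction matrix and the forward Euler integration scheme are likewise hypothetical choices made for illustration.

```python
def glv_step(x, r, A, dt):
    """One forward Euler step of dx_i/dt = x_i * (r_i + sum_j A[i][j] * x_j)."""
    n = len(x)
    updated = []
    for i in range(n):
        interaction = sum(A[i][j] * x[j] for j in range(n))
        dx = x[i] * (r[i] + interaction)
        # Abundances cannot become negative.
        updated.append(max(x[i] + dt * dx, 0.0))
    return updated


def simulate(x0, r, A, dt=0.01, steps=5000):
    """Return the full trajectory, including the initial state."""
    trajectory = [list(x0)]
    x = list(x0)
    for _ in range(steps):
        x = glv_step(x, r, A, dt)
        trajectory.append(x)
    return trajectory


# Hypothetical parameters for two competing species.
growth_rates = [1.0, 0.8]
interactions = [
    [-1.0, -0.5],  # effect of species 0 and species 1 on species 0
    [-0.4, -1.0],  # effect of species 0 and species 1 on species 1
]

trajectory = simulate([0.1, 0.1], growth_rates, interactions)
final_state = trajectory[-1]

if __name__ == "__main__":
    # Early states show the transient phase, late states the steady state.
    for step in (0, 100, 500, 5000):
        print(step, [round(v, 4) for v in trajectory[step]])
```

In this toy setting the signs of the off-diagonal entries of the interaction matrix encode the type of interaction (here mutual competition), and the trajectory separates into a transient phase followed by a steady state, which maps onto the questions listed above.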

Rationally Designed Microbial Communities to Improve Plant Performance and Plant Resilience

Rhizosphere-inhabiting microorganisms have profound effects on seed germination, seedling vigour, plant growth and development, disease, nutrition and productivity, and thus the rhizobiome plays an important role in the growth and fitness of the host plant. Under controlled (greenhouse) conditions, crop yield will for a large part depend on the interactions between the plant and its associated microbiome. While rhizosphere members have to compete for (plant-derived) nutrients and space, the composition of the rhizosphere will also respond to organic amendments and the use of control chemicals, and will depend on external factors such as the composition of the input plant seed microbiome.

To decrease the yield gap and naturally increase the disease resistance of plant production systems, thereby reducing the use of control chemicals, one strategy would be to (continuously) reshape the rhizosphere microbiome towards higher abundances of plant-growth promoting rhizobacteria (PGPR) and to promote synergy between them, using organic amendments that promote PGPR growth and the controlled introduction of designed rhizobiomes.

Challenges remain for the study of the metabolic interactions between community members, as it is currently unclear what the metabolic objective of the PGPR community is. In this project we aim to elucidate such interactions using constraint-based modelling of bacterial communities from the rhizosphere, semantic genome annotation and PGPR metabolic phenotypes derived from Biolog data.
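The core constraint behind constraint-based modelling, that internal metabolites are neither accumulated nor depleted at steady state, can be sketched in a few lines of Python. The network below is a hypothetical two-member cross-feeding chain invented for illustration; it is not a model of any PGPR community, and the reaction and metabolite names are assumptions. A full analysis would add flux bounds and an objective function and solve a linear program, which is not shown here; the open question raised above is precisely which objective to use for a community.

```python
def mass_balance(stoichiometry, fluxes):
    """Net production rate of each metabolite for a given flux distribution.

    stoichiometry maps reaction -> {metabolite: coefficient}, with negative
    coefficients for consumed and positive coefficients for produced metabolites.
    """
    net = {}
    for reaction, coefficients in stoichiometry.items():
        for metabolite, coefficient in coefficients.items():
            net[metabolite] = net.get(metabolite, 0.0) + coefficient * fluxes[reaction]
    return net


def is_steady_state(stoichiometry, fluxes, tolerance=1e-9):
    """True if every metabolite is produced exactly as fast as it is consumed."""
    balance = mass_balance(stoichiometry, fluxes)
    return all(abs(rate) <= tolerance for rate in balance.values())


# Hypothetical toy community: member 1 converts a substrate into an
# intermediate, member 2 converts the intermediate into a product.
toy_network = {
    "uptake": {"substrate": 1},
    "member_1_conversion": {"substrate": -1, "intermediate": 1},
    "member_2_conversion": {"intermediate": -1, "product": 1},
    "export": {"product": -1},
}

balanced = {
    "uptake": 10.0,
    "member_1_conversion": 10.0,
    "member_2_conversion": 10.0,
    "export": 10.0,
}
# Member 2 cannot keep up, so the intermediate would accumulate.
unbalanced = dict(balanced, member_2_conversion=8.0)

if __name__ == "__main__":
    print(is_steady_state(toy_network, balanced))
    print(mass_balance(toy_network, unbalanced))
```

Only flux distributions that pass this check are feasible in a steady-state model, which is why the production of one member constrains the consumption of another.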

Metabolic modeling of syngas assimilation by microbial communities to high-value chemicals

This research project intends to model novel microbial consortia for syngas assimilation to produce valuable chemicals. Syngas is a mixture of CO, H2 and CO2 and can be produced from many sources, including natural gas, coal, biomass or other hydrocarbon feedstocks. Nowadays, syngas conversion is mostly based on catalytic processes. However, the use of microorganisms as biocatalysts to convert syngas is emerging as an attractive technology. Microorganisms are less sensitive to variations in CO/H2 ratios, are more resistant to certain impurities and do not require costly pre-treatment of feedstocks.

An initial synthetic co-culture of C. autoethanogenum and C. kluyveri, which was previously found at the Laboratory of Microbiology to produce C4 and C6 fatty acids and their alcohols, will be modeled and optimized to produce a wider range of compounds (C4-C8). Next, a third species able to produce propionic acid (e.g. A. neopropionicum) will be added in a modular approach and modeled to produce a wider set of products (C5, C7 and C9 fatty acids and alcohols). This research project will be developed in continuous interaction with the Laboratory of Microbiology, where dry-lab work (this PhD) will guide wet-lab work and vice versa. Finally, chemical and biochemical methods will be explored for the conversion of alcohols into aldehydes and/or esters and ethers, in collaboration with the Swammerdam Institute of Life Sciences at the University of Amsterdam, and with Wageningen University and Wageningen Food & Biobased Research (WFBR).

Recently closed research projects

MycoSynVac – A Collaborative Project for Engineering Mycoplasma pneumoniae as a broad-spectrum animal vaccine

The MycoSynVac project aims at using cutting-edge synthetic biology methodologies to engineer Mycoplasma pneumoniae as a universal chassis for vaccination. See website for more information.