Computer scientist with a focus on algorithm engineering and data analysis in bioinformatics.
Position Group
Since August 2017 Group leader Algorithms for reproducible bioinformatics, Institute of Human Genetics, University of Duisburg-Essen
August 2016 - July 2017 Researcher Life Sciences, CWI Amsterdam
Since May 2016 Consultant Myles Brown, Division of Molecular and Cellular Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School
May 2016 - July 2016 Postdoc Alexander Schönhuth, Life Sciences, CWI Amsterdam
April 2015 - April 2016 Postdoc Shirley Liu, Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard School of Public Health
Postdoc Myles Brown, Division of Molecular and Cellular Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School
January 2011 - March 2015 PhD student Sven Rahmann, Genome Informatics, University of Duisburg-Essen
Guest member Eli Zamir, Systems Biology of Cell Matrix Adhesion, Max-Planck-Institute of Molecular Physiology Dortmund
See here for a full CV.




A Bayesian model for single-cell transcript (differential) expression analysis on MERFISH data. The model makes it possible to overcome systematic biases that occur with MERFISH, and it provides measures of uncertainty and control of the false discovery rate in a strictly Bayesian way. MERFISHtools is the corresponding command-line client and analysis library, written in Rust and Python. MERFISHtools is also available via Bioconda.

visit homepage


A distribution of bioinformatics software realized as a channel for the versatile package manager Conda.

visit homepage


A bioinformatics library written in the Rust language. The implementation provides state-of-the-art solutions for common tasks in bioinformatics, focusing on stability through comprehensive unit tests and continuous integration.

visit homepage


ALPACA is a variant caller for next-generation sequencing data that incorporates sample-based filtering into the calling. This allows intuitive control of the false discovery rate with generic sample filtering scenarios. Further, it uses preprocessing and merging of BCF files to solve the N+1 problem: an existing study can be extended with new samples without redundant computations. After the preprocessing, the actual calling takes only seconds.

visit homepage


PEANUT is a read mapper for DNA or RNA sequence reads. By exploiting the massive parallelism of modern graphics processors and a novel index data structure, PEANUT achieves superior speed compared to current state-of-the-art read mappers like BWA MEM, Bowtie2 and RazerS3, while maintaining their accuracy. It can report either only the best hits or all hits of a read. When reporting all hits, PEANUT is four to ten times faster than competitors.

visit homepage


Snakemake is a workflow engine and language. It aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain-specific language (DSL) for workflow specification in Python style.

visit at bitbucket


LibModalLogic is a Java implementation of Modal Logic K and Propositional Logic. Logic formulas can be built in memory, saved to and read from MathML, and formatted in human-readable form. Reasoning is implemented by the (modal) logic tableau algorithm, including dynamic backtracking for maximum performance.

visit at google code


TRMiner is a Python tool aimed at scientific data curators. It rapidly prunes large collections of scientific publications down to the sentences relevant for a given mining goal, using a linear-time matching algorithm.

visit at google code

Protein Hypernetworks

Protein Hypernetworks are an approach for endowing protein networks with interaction dependencies using propositional logic. This allows refined network-based predictions of protein complexes, functional importance and functional similarity.

visit at google code




Awards and Grants

  • NWO Veni grant (€ 250,000) for the project "Fully reproducible workflows scaling from workstations to the cloud".
  • Uhde-Award 2011 for my diploma thesis "Propagating Interaction Logic towards Predictive Protein Hypernetworks".
  • Honorable Mention at the Doktorandenkolleg Ruhr 2011 for my poster "Protein Hypernetworks".
  • Poster Award of the University Hospital Essen at the Forschungstag 2011 for my poster "Protein Hypernetworks".
  • Travel Award of the 9th International Conference on Pathways, Networks and Systems Medicine 2011 for my talk on protein hypernetworks.


2013 Guest lecture "Detecting SNVs with Next-generation-Sequencing" in the course "Statistik in der Genetik", Faculty of Statistics, TU Dortmund.
2012 Co-supervised bachelor thesis "Rekonstruktion von Protein-Interaktionsabhängigkeiten mit dem Quine-McCluskey-Algorithmus", Bianca Patro, TU Dortmund.
2011 Teaching assistant for "Datenstrukturen Algorithmen und Programmierung" (DAP1), Faculty of Computer Science, TU Dortmund.
Co-supervised bachelor thesis "Konstruktion von Protein-Hypernetzwerken durch Text-Mining in der PubMed Datenbank", Michael Nimbs, TU Dortmund.
Co-supervised diploma thesis "Entwurf einer Datenstruktur für Pangenome", Christiane Küch, TU Dortmund.

phone +49 (0)201 723 1908
office Room 1.13, University Hospital Essen, Virchowstr. 183, 45147 Essen
postal Dr. rer. nat. Johannes Köster
Institute of Human Genetics
University of Duisburg-Essen
Hufelandstr. 55
45147 Essen