About me
I am a PhD student in cancer bioinformatics.
I specialize in computational epigenomics: the analysis of methylation changes across the genomes of cancer patients, using bioinformatic and biostatistical methods on data obtained from patient tumor samples.
My main expertise is in analysis strategies for methylation array data (Infinium® Human Methylation 450K BeadChip and Infinium® MethylationEPIC) and bisulfite sequencing data. More specifically, I am interested in the methylation changes that occur along repetitive elements (also known as "repeats") of the human genome in cancer patients, with the aim of improving cancer diagnosis and prognosis.
To improve existing methylation analysis strategies and to study the methylation of repeats in cancer, I also work as a developer: I design bioinformatic tools to analyze methylation data and produce neat visualisations of the data I work on.
I do so in two programming languages: R and Python 3.
All the tools I design are open source, so anyone can find them on my GitHub profile and use them at their own convenience.
Selected publications
Pageaud Y. et al. (2018) Enrichment Analysis with EpiAnnotator. Bioinformatics.
Selected tools I developed
BiocompR is an R package built upon ggplot2 and data.table. It improves some visualisations commonly used in biology and genomics for data comparison and dataset exploration, introduces new kinds of plots, provides a toolbox of functions for working with ggplot2 and grid objects, and ultimately lets users turn the plots it produces into publication-ready figures.
Methview.qc allows you to run quality control analysis on your methylation array dataset, and to collect all results in neat ready-to-publish plots.
EpiAnnotator is an R package accompanied by a web interface. It contains regularly updated annotations from 4 public databases: Blueprint, RoadMap, GENCODE and the UCSC Genome Browser. Annotations are hosted locally or in a server environment and automatically updated by scripts of our own design. Thousands of tracks are available, reflecting data on a variety of tissues, cell types and cell lines from the human and mouse genomes. Users upload a set of selected regions and a set of background regions, and the results are displayed in customizable and easily interpretable figures.
EpiAnnotator DKFZ web app available here!
NCBI.BLAST2DT is an R package allowing you to submit DNA sequences to NCBI BLAST servers directly from the console, to retrieve potential hits on a genome or sequence database, and to collect all results within an R data.table.
It uses the R package hoardeR to submit sequences to the NCBI BLAST API, then parses the returned BLAST XML results and loads them into an R data.table, which makes the resulting hits easier to query, sort, order and subset.
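As a rough illustration of the parsing step described above, the sketch below flattens BLAST XML into one row per alignment so the hits can be sorted and filtered. It is not the package's code (NCBI.BLAST2DT is written in R): it uses Python's standard library, the function name `parse_blast_hits` is hypothetical, and the small XML document is hand-written with invented values, reusing only element names from NCBI's BLAST XML output.

```python
import xml.etree.ElementTree as ET

# Hand-written toy document; element names follow NCBI's BLAST XML output,
# but the accessions, definitions and scores are invented for illustration.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_accession>ACC_B</Hit_accession>
          <Hit_def>toy hit B</Hit_def>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>50.0</Hsp_bit-score>
              <Hsp_evalue>1e-5</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
        <Hit>
          <Hit_accession>ACC_A</Hit_accession>
          <Hit_def>toy hit A</Hit_def>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>120.0</Hsp_bit-score>
              <Hsp_evalue>2e-30</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def parse_blast_hits(xml_text):
    """Return one dict per HSP (alignment) found in a BLAST XML string."""
    root = ET.fromstring(xml_text)
    rows = []
    for hit in root.iter("Hit"):
        for hsp in hit.iter("Hsp"):
            rows.append({
                "accession": hit.findtext("Hit_accession"),
                "definition": hit.findtext("Hit_def"),
                "bit_score": float(hsp.findtext("Hsp_bit-score")),
                "evalue": float(hsp.findtext("Hsp_evalue")),
            })
    return rows


hits = parse_blast_hits(SAMPLE_XML)

# Once the hits are tabular, ordering and subsetting are one-liners.
best_first = sorted(hits, key=lambda row: row["evalue"])
significant = [row for row in best_first if row["evalue"] < 1e-10]
```

Keeping one row per alignment, with the hit-level fields repeated on each row, is what makes the later sort and filter steps trivial; the same flat layout is what a data.table provides on the R side.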
Biotab.manager is an R package allowing you to download, manage, subset, and aggregate TCGA patients' clinical data (biotabs) from the GDC portal. The package is built upon TCGAbiolinks to query TCGA databases, and uses R data.table to handle query results.
Education
2016 - 2017: 2nd year of MSc in Structural Bioinformatics at Université Paris Diderot (Paris 7), France.
2015 - 2016: 1st year of MSc in Genomics at Université Paris Saclay, France.
2014 - 2015: 3rd year of BSc: Genetics and Bioinformatics at Genopole, France.
2012 - 2014: 1st & 2nd years of BSc: Fundamental & Biomedical Sciences at Université Paris Descartes (Paris 5), France.
Follow me on