Programming

Here you can find descriptions of, and links to, some of my code. Some of it is gradware that I wrote while learning how to code, so it might be a bit rough. I plan to clean it up in the future.

ADvantage

During my time as a fellow at Insight Data Science, I developed an app to help with SEO campaigns, specifically keyword search. It uses web scraping, natural language processing (NLP), and combinatorial optimization to find the most informative and cheapest set of keywords. The code and some more details can be found at https://github.com/jshleap/ADvantage. All of this code was developed in Python.
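To give a flavour of the combinatorial optimization step, here is a minimal sketch that treats keyword selection as a 0/1 knapsack problem: pick the subset of keywords with the highest total informativeness score whose total cost fits a budget. This is an assumption made for illustration, not necessarily the formulation used in ADvantage; the function name `select_keywords`, the integer costs, and the scores are all hypothetical.

```python
def select_keywords(keywords, budget):
    """Pick the highest-scoring keyword subset within an integer budget.

    keywords: list of (name, cost, score) tuples, with non-negative integer cost.
    budget: non-negative integer spending limit.
    Returns (total_score, chosen_names).
    NOTE: hypothetical knapsack sketch, not the ADvantage implementation.
    """
    # best[b] holds the best (score, names) found with total cost <= b.
    best = [(0.0, [])] * (budget + 1)
    for name, cost, score in keywords:
        # Walk budgets downward so each keyword is used at most once.
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + score
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + [name])
    return best[budget]


if __name__ == "__main__":
    # Made-up keywords: (name, cost, informativeness score).
    toy = [("a", 3, 4.0), ("b", 4, 5.0), ("c", 2, 3.0)]
    print(select_keywords(toy, budget=5))
```

Dynamic programming over the budget keeps the search exact for small integer costs; with real bid prices one would probably need to discretize costs or switch to an integer programming solver.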

YAAP

Yet Another Amplicon denoising Pipeline (YAAP) is a pipeline to analyse metabarcoding amplicon data. It performs QC, adapter removal, denoising, and ZOTU table construction. It combines cutadapt (to remove adapters) with vsearch and usearch for processing, chimera removal, ZOTU table construction, and denoising. This is a Bash-based pipeline and can be found at https://github.com/CristescuLab/YAAP

PYRS

My own implementation of polygenic risk scores using p-value and LD thresholding. In this particular implementation I aim to compute the LD for full chromosomes. So far it works well on chromosomes with fewer than 400K markers. You can check it out at https://github.com/jshleap/pyrs. It is written in Python and tries to solve some of the bigger-than-memory issues using the dask package.
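For readers unfamiliar with p-value and LD thresholding, the general idea can be sketched as follows. This is a small in-memory illustration using NumPy, not the code in the repository (which relies on dask to handle data that does not fit in memory). The function names, the default thresholds, and the use of squared Pearson correlation between genotype columns as the LD measure are assumptions made for the example; it also assumes every SNP is polymorphic, since a constant column has no defined correlation.

```python
import numpy as np


def clump_and_threshold(genotypes, pvalues, p_max=0.05, r2_max=0.2):
    """Greedy p-value + LD thresholding; returns indices of retained SNPs.

    genotypes: (n_individuals, n_snps) array of allele dosages (0, 1, 2).
    pvalues: per-SNP association p-values.
    NOTE: illustrative sketch; names and defaults are hypothetical.
    """
    # Squared correlation between SNP columns, used here as the LD measure.
    r2 = np.corrcoef(genotypes, rowvar=False) ** 2
    # Visit SNPs that pass the p-value threshold, most significant first.
    order = [int(i) for i in np.argsort(pvalues) if pvalues[i] <= p_max]
    kept = []
    for i in order:
        # Retain a SNP only if it is not in high LD with any retained SNP.
        if all(r2[i, j] < r2_max for j in kept):
            kept.append(i)
    return kept


def polygenic_score(genotypes, betas, kept):
    """Sum of effect sizes times dosages over the retained SNPs."""
    return genotypes[:, kept] @ betas[kept]


if __name__ == "__main__":
    # Toy data: SNP 0 and SNP 1 are identical (perfect LD), SNP 2 is independent.
    geno = np.array([[0, 0, 0], [1, 1, 0], [2, 2, 1],
                     [0, 0, 2], [1, 1, 2], [2, 2, 1]], dtype=float)
    pvals = np.array([0.001, 0.0001, 0.01])
    betas = np.array([0.5, 0.4, -0.2])
    kept = clump_and_threshold(geno, pvals)
    print(kept, polygenic_score(geno, betas, kept))
```

The full correlation matrix grows quadratically with the number of markers, which is why whole-chromosome LD quickly becomes a bigger-than-memory problem.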

ABeRMuSA

I collaborated intensively with Alex Safatli to build a multiple structure aligner for big sets of structures. It is written in Python and can be found at https://github.com/AlexSafatli/ABeRMuSA

MODULER

A Python package to compute evolutionary modules in protein structures using graph theory. It uses the correlation between amino acid positions in evolutionary space and derives semi-contiguous domains. It can be found at https://github.com/jshleap/Moduler

StructBio

This repository holds a set of scripts to process protein structure data and abstract them as shapes using a geometric morphometric framework. It is a mixture of Python and R code and can be found at https://github.com/jshleap/StructBio

Biogeography

A set of scripts for biogeography analyses, from fetching results from GBIF for macroecological and biogeographical studies to extracting geographical and taxonomic information. They can be found at https://github.com/jshleap/Biogeography

Phylogenetics

A set of scripts to work with phylogenetic data. These scripts deal primarily with the construction of supertrees. They are written in Python and can be found at https://github.com/jshleap/Phylogenetics

You can check out my other repos, as well as some forks, in my general repo!