New Atlantis is an open ocean regeneration project that seeks to address biodiversity loss in our oceans by providing a viable business model for Marine Protected Areas (MPAs). We do this by building an open marine biodiversity analytics platform that monitors and forecasts the health of MPAs, and from which marine biocredits and blue carbon credits can be generated.
This is the metagenomics pipeline section of the New Atlantis GitHub repository: an easy-to-use pipeline for generating metagenomic data from different ocean samples.
It is currently known as the Living Oceans Metagenome Assembly Pipeline, or LOMAP for short.
Photo used with permission from Paul Nicklen, co-founder of SeaLegacy.org, New Atlantis Founding Advisor, and NatGeo contributor.
You can set up and use LOMAP in the cloud by following along with the Google Colab notebook.
Please note that Google Colab does not provide the computational resources necessary to run LOMAP in full on a real dataset. The notebook demonstrates how to set up and use LOMAP by performing the first steps of the workflow on a toy dataset.
You can set up LOMAP on your own computer in one line:
git clone https://github.com/new-atlantis-dao/Oceanomics.git && rm -rf Oceanomics/.git && cd Oceanomics/Metagenomics
Congratulations, you can now start using LOMAP.
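If only the pipeline folder is needed, a sparse checkout is one way to avoid checking out the rest of the repository. The sketch below is not part of LOMAP: the helper name `fetch_subdir` is hypothetical, the repository root URL in the usage comment is assumed from the link above, and it requires Git 2.25 or newer.

```shell
# Hypothetical helper: check out a single subdirectory of a Git repository.
# Usage: fetch_subdir <repo-url> <subdir> <dest>
fetch_subdir() {
  local repo_url="$1" subdir="$2" dest="$3"
  # --sparse starts with only the top-level files checked out;
  # --filter=blob:none asks the server to defer file contents until needed
  # (Git falls back to a full download if the server does not support it).
  git clone --quiet --filter=blob:none --sparse "$repo_url" "$dest" || return 1
  # Restrict the working tree to the requested subdirectory.
  git -C "$dest" sparse-checkout set "$subdir" || return 1
  # Succeed only if the subdirectory exists after the checkout.
  [ -d "$dest/$subdir" ]
}

# Example call (assumed URL, not run here):
# fetch_subdir https://github.com/new-atlantis-dao/Oceanomics.git Metagenomics Oceanomics
```

After a successful call, the pipeline files should be under `Oceanomics/Metagenomics`. Unlike the one-liner above, this keeps the `.git` directory, so later updates can be pulled with `git pull`.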
LOMAP can be used to explore the planktonic network of a local section of ocean. A written tutorial on how to use the LOMAP pipeline will be released at a later date.
├── LICENSE
├── Makefile <- Makefile with commands like `make data` or `make train`
├── README.md <- The top-level README for developers using this project.
├── data
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
│
├── docs <- A default Sphinx project; see sphinx-doc.org for details
│
├── models <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering),
│ the creator's initials, and a short `-` delimited description, e.g.
│ `1.0-jqp-initial-data-exploration`.
│
├── references <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated graphics and figures to be used in reporting
│
├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
│ generated with `pip freeze > requirements.txt`
│
├── setup.py <- makes project pip installable (pip install -e .) so src can be imported
├── src <- Source code for use in this project.
│ ├── __init__.py <- Makes src a Python module
│ │
│ ├── data <- Scripts to download or generate data
│ │ └── make_dataset.py
│ │
│ ├── features <- Scripts to turn raw data into features for modeling
│ │ └── build_features.py
│ │
│ ├── models <- Scripts to train models and then use trained models to make
│ │ │ predictions
│ │ ├── predict_model.py
│ │ └── train_model.py
│ │
│ └── visualization <- Scripts to create exploratory and results oriented visualizations
│ └── visualize.py
│
└── tox.ini <- tox file with settings for running tox; see tox.readthedocs.io
The software and marker gene sequences used to build a plankton-specific database for taxonomic profiling derive from the following publications and tools:
Microbial abundance, activity and population genomic profiling with mOTUs2 (2019)
read_counter: a tool to count the number of reads (from a FASTQ file) that map to a set of nucleotide sequences (in FASTA format).
A robust approach to estimate relative phytoplankton cell abundances from metagenomes (2022)
Toward a global reference database of COI barcodes for marine zooplankton (2021)
A simple Taxonomic Plankton Profiler Tool (unpublished work).
Please reach out with any comments, concerns, or discussion regarding LOMAP.