animesh3008/Latent-DNA-Diffusion

News

Please check our new paper:

Absorb & Escape: Overcoming Single Model Limitations in Generating Genomic Sequences (Neural Information Processing Systems, 2024) Paper Link

Dataset Access

The dataset needed to replicate the results presented in the paper is available on Hugging Face: Latent DNA Diffusion Dataset

Installation

  1. Clone the repo
    git clone https://github.com/Zehui127/Latent-DNA-Diffusion.git
  2. Install the required packages, including diffusers and accelerate
    pip install torch torchvision diffusers accelerate einops easydict pytorch_lightning
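
To confirm that the installation succeeded, a quick check along these lines may help. This is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and it assumes each import name matches the pip package name listed above.

```python
import importlib.util

# Assumption: import names match the pip package names from the install step.
REQUIRED = [
    "torch", "torchvision", "diffusers", "accelerate",
    "einops", "easydict", "pytorch_lightning",
]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print(f"Missing packages: {missing}")
    else:
        print("All required packages were found.")
```

If anything is reported as missing, re-run the pip command in the same environment you intend to launch training from.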

Code Usage

Download pre-trained models

Pre-trained VAE and UNet models can be downloaded from the following link:

https://zenodo.org/records/11061611

Generate DNA sequences using the trained diffusion model

Set the paths to the pretrained VAE and UNet models with the --gen_vae_path and --gen_unet_path arguments in the following command. The number of sequences to generate is controlled by the sequence_num argument in src/configs/generate.yaml.

CUDA_VISIBLE_DEVICES="0" accelerate launch --main_process_port 12903 --multi_gpu main.py --model generate --gen_vae_path="" --gen_unet_path=""
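
Because the command fails unhelpfully when either path is left empty, it can be convenient to assemble it programmatically. The sketch below only builds and prints the command shown above; the function name `build_generate_command` and the checkpoint paths in the usage example are hypothetical placeholders, not files shipped with the repository.

```python
import shlex


def build_generate_command(vae_path, unet_path, gpus="0", port=12903):
    """Build the environment and argument list for the generation command.

    Raises ValueError if either checkpoint path is empty.
    """
    if not vae_path or not unet_path:
        raise ValueError("Both vae_path and unet_path must be non-empty.")
    env = {"CUDA_VISIBLE_DEVICES": gpus}
    cmd = [
        "accelerate", "launch",
        "--main_process_port", str(port),
        "--multi_gpu",
        "main.py",
        "--model", "generate",
        f"--gen_vae_path={vae_path}",
        f"--gen_unet_path={unet_path}",
    ]
    return env, cmd


if __name__ == "__main__":
    # Placeholder paths: replace with the checkpoints you downloaded.
    env, cmd = build_generate_command("path/to/vae.ckpt", "path/to/unet.ckpt")
    prefix = " ".join(f'{key}="{value}"' for key, value in env.items())
    print(prefix, shlex.join(cmd))
```

The returned pair can also be passed to `subprocess.run(cmd, env={**os.environ, **env}, check=True)` if you prefer to launch from Python rather than copy the printed line into a shell.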

Train the diffusion model

In the config file src/configs/un_unet.yaml, set the data_path field to the path of the dataset and the vae_path field to the path of the pretrained VAE model.

CUDA_VISIBLE_DEVICES="0,1" accelerate launch --main_process_port 12903 --multi_gpu main.py --model un_unet
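
The two config fields can also be filled in with a small script instead of by hand. The following is a rough sketch under stated assumptions: it treats data_path and vae_path as simple `key: value` lines in the YAML file, uses only the standard library rather than a YAML parser, and assumes POSIX-style paths. The helper name `set_yaml_field` and the sample config text are illustrative, not taken from the repository.

```python
import re


def set_yaml_field(text, key, value):
    """Replace the value on every `key: ...` line of a YAML document.

    Assumes the key appears as a plain `key: value` line. Raises KeyError
    if the key is not found, so a typo does not pass silently.
    """
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*:).*$", re.MULTILINE)
    new_text, count = pattern.subn(lambda m: f'{m.group(1)} "{value}"', text)
    if count == 0:
        raise KeyError(key)
    return new_text


if __name__ == "__main__":
    # Illustrative config text; the real file layout may differ.
    sample = 'data_path: ""\nvae_path: ""\n'
    sample = set_yaml_field(sample, "data_path", "path/to/dataset")
    sample = set_yaml_field(sample, "vae_path", "path/to/vae.ckpt")
    print(sample)
```

To apply it to the real file, read src/configs/un_unet.yaml into a string, call the helper for each field, and write the result back. Any trailing comment on a replaced line is dropped by this approach.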

Train the vae model

Set the path to the dataset in the config file, then launch training:

CUDA_VISIBLE_DEVICES="0,1" accelerate launch --main_process_port 12903 --multi_gpu main.py --model vanilla_vae

How to Cite This Work

If you use this project or dataset in your research, please cite it as follows:

@article{li2023latent,
  title={Latent Diffusion Model for DNA Sequence Generation},
  author={Li, Zehui* and Ni, Yuhao* and Huygelen, Tim August B and Das, Akashaditya and Xia, Guoxuan and Stan, Guy-Bart and Zhao, Yiren},
  journal={arXiv preprint arXiv:2310.06150},
  year={2023}
}

@article{li2024discdiff,
  title={DiscDiff: Latent Diffusion Model for DNA Sequence Generation},
  author={Li, Zehui and Ni, Yuhao and Beardall, William AV and Xia, Guoxuan and Das, Akashaditya and Huygelen, Tim August B and Stan, Guy-Bart and Zhao, Yiren},
  journal={arXiv preprint arXiv:2402.06079},
  year={2024}
}
