From f55b102148d7b92fd0e237f1b9c008806e9f0349 Mon Sep 17 00:00:00 2001
From: PN269 <166166508+PN269@users.noreply.github.com>
Date: Thu, 6 Jun 2024 07:42:59 -0400
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index bc4a51e..d0f412b 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 # Leveraging a large language model to predict protein phase transition: a physical, multiscale and interpretable approach
 We apply a unified modeling framework to predict protein phase transition (PPT). In classification task A proteins exhibiting experimental evidence of undergoing a phase transition, forming either droplets or amyloids, are consolidated into a single dataset (+Droplet drivers and +Amyloids). Phase transition propensity is predicted versus the preference to maintain the native soluble state (-PT). In classification task B the unified dataset is utilized to predict the propensity to form droplets versus amyloid aggregates. To accomplish this, we fine-tune the ESM-2 model to predict PPT and compare its performance to biophysical knowledge-based models (e.g., random forest).
 
-![Alt text](./Files/Schematic.png?raw=true "Title")
+![Alt text](./Files/schematic.png?raw=true "Title")
 
 This repository provides code for predicting protein phase transition (PPT) propensity, including two examples of AD-related proteins, their associated genes, and transcription factors.
 ## Folder and related notebooks: