Incorporating cost #2
Comments
One important consideration is whether all compounds have a uniform cost. Scott is exploring this by obtaining quotes from different vendors.
Additional feedback from the group: we now have multiple costs to consider when comparing an iterative screening effort against a single large screen.
For simulation purposes, we record various evaluation metrics at each iteration of the active learning pipeline. We should record these cost metrics as well for later analysis.
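One way to log costs next to the evaluation metrics is sketched below. The `IterationRecord` class and its field names (`compound_cost`, `labor_cost`) are hypothetical and not taken from the existing code; they only illustrate keeping cost and evaluation data in the same per-iteration record.

```python
from dataclasses import dataclass, field


@dataclass
class IterationRecord:
    """Metrics logged for one active learning iteration (hypothetical schema)."""
    iteration: int
    eval_metrics: dict = field(default_factory=dict)  # existing evaluation metrics
    compound_cost: float = 0.0  # total price of the compounds in this batch
    labor_cost: float = 0.0     # cherry-picking / handling cost for this batch

    @property
    def total_cost(self) -> float:
        return self.compound_cost + self.labor_cost


def cumulative_cost(records):
    """Return the running total cost after each iteration, in iteration order."""
    totals, running = [], 0.0
    for rec in sorted(records, key=lambda r: r.iteration):
        running += rec.total_cost
        totals.append(running)
    return totals
```

Keeping compound and labor costs as separate fields lets later analysis compare them independently.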
There are at least three modes for iterative screening and cherry picking.
Mode 1: cherry pick from compounds at SMSF (LC and MLPCN libraries). The compound cost is low, the labor cost is high because it may involve selecting a different plate for each compound in the batch.
Mode 2: purchasing compounds from a vendor like ChemDiv. The vendor would have a library of > 1 million compounds. They would likely prepare fixed-sized plates for us so the labor of cherry picking would be incorporated in the compound cost. At least for some vendors, we can get a quote for a constant cost per compound.
Mode 3: a virtual library from multiple vendors like ZINC. There would be high labor cost if it takes a lot of time to assess which prioritized compounds can even be purchased. There would be variable compound cost.
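The three modes could be captured in a simple cost function like the sketch below. The function name `batch_cost` and the parameters `n_source_plates`, `labor_per_plate`, and `labor_per_lookup` are assumptions made for illustration; the actual cost structure would depend on the vendor quotes.

```python
def batch_cost(mode, compound_prices, n_source_plates=0,
               labor_per_plate=0.0, labor_per_lookup=0.0):
    """Estimate (compound_cost, labor_cost) for one batch.

    mode 1: in-house cherry picking; labor is assumed to scale with the
            number of distinct source plates touched.
    mode 2: single vendor; plating labor is assumed to be folded into the
            per-compound price, so labor is zero here.
    mode 3: multi-vendor virtual library; labor is assumed to scale with
            the number of availability lookups (one per compound).
    """
    compound_cost = sum(compound_prices)
    if mode == 1:
        labor = n_source_plates * labor_per_plate
    elif mode == 2:
        labor = 0.0
    elif mode == 3:
        labor = len(compound_prices) * labor_per_lookup
    else:
        raise ValueError(f"unknown mode: {mode}")
    return compound_cost, labor
```

Passing a list of per-compound prices covers both the constant-price case (mode 2) and the variable-price case (mode 3).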
See strategy at #1.
Currently, the code implements budget constraints via `batch_size`, with current parameters of `[96, 384, 1536]` relating to microplate sizes in practice. The problem is that we don't consider molecule costs when selecting clusters/instances in our strategy. We purely exhaust the `batch_size`.
An alternative would be to use a combination of `budget` and `batch_size`, where we want to exhaust the `batch_size` but not go over the `budget`.
I see two methods of doing this:
[...] `budget`, then that instance is dropped.
I am leaning towards method 2.
Please discuss or propose any other solutions.
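To make the discussion concrete, here is a minimal sketch of one possible way to combine `budget` and `batch_size`. It is not necessarily either of the two methods referred to above. It assumes each candidate carries a priority score and a known cost; the function name `select_batch` and the tuple layout are hypothetical.

```python
def select_batch(candidates, batch_size, budget):
    """Greedy budget-aware selection.

    candidates: iterable of (instance_id, score, cost); higher score is better.
    Returns (selected_ids, total_cost).
    """
    selected, spent = [], 0.0
    # Visit candidates from highest to lowest priority.
    for inst_id, _score, cost in sorted(candidates, key=lambda c: c[1], reverse=True):
        if len(selected) == batch_size:
            break  # batch is full
        if spent + cost > budget:
            continue  # drop this instance; a cheaper one may still fit
        selected.append(inst_id)
        spent += cost
    return selected, spent
```

Under this scheme the returned batch can be smaller than `batch_size` when the budget runs out first, so partially filled plates would need to be handled downstream.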