Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor pipeline parsing #74

Open
tiagofilipe12 opened this issue Aug 23, 2017 · 2 comments
Open

Refactor pipeline parsing #74

tiagofilipe12 opened this issue Aug 23, 2017 · 2 comments
Labels

Comments

@tiagofilipe12
Copy link
Member

Right now bionode-watermill resolves a set of promises,

const pipeline = join(task1, task2)

but it would be nice to have a object-like structure in pipeline definition

const pipeline = { join: { task1, task2 } }

And then bionode-watermill would parse this object to first get the pipeline before actually execute it and then execute it.
This way it would be possible to predict (before running) the pipeline shape, inputs and outputs and then after running the pipeline confirm that everything was properly set and executed as expected. Also, this can greatly increase pipeline visualization in the sense that we can improve visualization to render different colors for what was run, is running and ended.

@thejmazz
Copy link
Member

Two points:

  1. We will need to be very precise about use of objects and arrays, your example could easily be:
const pipeline = { join: [ task1, task2 ] }
  1. Last time I thought about this, I went and tried to convert an existing, somewhat complex pipeline into an object. It ended up being kinda verbose. So I'm partial to keeping the function style (since it's nice and terse, almost like a DSL, user needs to import specific functions which can each have params validated etc, as opposed to manual keys in objects) but have it build an object representation internally - the "verbose" representation. However maybe this verbose object would be more appealing in YAML (but converting strings into JS functions is sketchy - scripts is fine).

PS: I think also we need not necessarily have an object representation to have a pre computed DAG - it does look like those two features go hand-in-hand, but it is also probably possible to use the orchestrator functions and construct a DAG. If it is possible, I would prefer to implement new features (pre-computed DAG) without overhauls of other things if possible.

@tiagofilipe12
Copy link
Member Author

tiagofilipe12 commented Aug 24, 2017

Just one more thing: after this refactor where DAG can be obtained before running the pipeline, fork also can be refactored to work without those bunch of rules in lib/orchestrators/join.js, since we will have tasks available before running the pipeline. The issue right now is that we can only fetch tasks when we run a given orchestrator, but everything else outside the scope of that orchestrator is unavailable to handle and fork needs to handle downstream tasks (tasks that run after fork), by multiplying them as much as the number of branches that fork has.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants