Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boxed roots #22

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Boxed roots #22

wants to merge 1 commit into from

Conversation

gadmm
Copy link

@gadmm gadmm commented Mar 17, 2021

@gadmm
Copy link
Author

gadmm commented Mar 17, 2021

I believe this RFC can interest you: @stedolan, @kayceesrk, @sadiqj, @Sudha247, @damiendoligez (otherwise, please apologize me for the spam).

@gasche
Copy link
Member

gasche commented Mar 18, 2021

(also @let-def obviously, as the author of the related work on CAMLroots)

@let-def
Copy link

let-def commented Mar 19, 2021

I find this proposal very nice, thanks @gadmm (and @gasche).

First: performance aside, could the boxroot API be implemented using just a doubly linked list for bookkeeping?

The description makes it sound as if it’s more complex than the current implementation of roots.
This API gets rid of the indexation necessary to let the user free to pick the root address. A doubly linked list seems enough for the same effect, while being of the same complexity as the skip list.
So the extra complexity is there for performance reason (which is nice) but not inherent to the proposed change.

Second: do you think a weak root could be added in the future?

Managing complex data structures with global roots in C can cause subtle memory leaks.
For instance, consider a closure that captures a custom object. If this custom object stores the closure in a root (for calling it back later from C code), it creates a cycle between the object and the closure that is opaque to the GC. Neither the closure nor the abstract object will be collectable unless the root is manually mutated.
A variant of this problem is a hierarchy of objects where each level keeps the next one alive through a global root: only one level of this hierarchy can be collected per collection. The root of the next level will be released when finalising the abstract object, making the next level unreachable. But the GC can’t observe that unless the whole generation is re-scanned. Both cases can occur naturally when binding object-oriented frameworks.

Using weak roots and ensuring that all C objects are dominated by OCaml ones solves this problem (the GC can now see the precise dependencies between objects, without going through roots).

In other words, the root API fulfils two purposes: referencing an OCaml object from C and keeping alive an OCaml object from C. Weak roots would allow doing only the former.

I’m not asking for any change, but it would be nice to keep this use case in mind during future work on the proposal, to ease a possible integration in the future. Thanks.

@gasche
Copy link
Member

gasche commented Mar 19, 2021

First: performance aside, could the boxroot API be implemented using just a doubly linked list for bookkeeping?

Yes -- we would use a generational design two linked lists, one that may contain young objects and one that may only contain old objects. I could try to implement this to get performance numbers.

If you don't care about performance, another basic implementation, that is easy to have as a user-side library, is to simply malloc each root, and register the malloced address as a global generational root. In our benchmarks this performs exactly as well (or badly) as generational global roots, when both can be used, the cost due to the extra malloc is minimal.
But we do care about performance (our question was whether this is usable as a central building block for, essentially, any Rust program manipulating the OCaml runtime), so we did try to get a faster prototype. The main difference (besides the generational design) is that instead of a doubly-linked list, we allocate "pools" of roots of size 2^N, so a new root only consumes 1 word in memory instead of 3 with the doubly-linked list -- and scanning has good locality. This does add some complexity related to ensuring that pools are not under-utilized. We are of course open to other implementation approaches of the same API, and an "upstream" implementation could possibly reuse bits from the runtime.

do you think a weak root could be added in the future?

One naive way to implement that is to allocate a boxroot that contains a weak pointer. The question is whether a more merged solution can avoid doubling the indirections / book-keeping costs. I think yes: in addition to our young/old distinction, we could do a strong/weak distinction, where strong roots are scanned by the GC, but weak roots are only used to update moved values, just like the existing weak-pointer implementation. I don't know if there are enough hooks to let us implement this as a user library (as our current prototype), this sounds like something that would be much easier to do inside the runtime, but I think it would not be too hard to add to the implementation, and it would not add a noticeable overhead. (If I understand correctly, currently weak pointers are only allocated in the major heap, and we would follow that strategy as well.)

Note: another question I had in the background was whether some of the quality-of-life improvements proposed by CAMLroots could be integrated into boxroots as well. Currently our main use-case is Rust FFI, where we can already enforce safety by judiciously using the lifetime discipline, we have not explored how much of the CAMLroots-style bug-avoidance we get with the boxroot API, and whether we can do better with dynamic checking.

@gadmm
Copy link
Author

gadmm commented Mar 19, 2021

First: performance aside, could the boxroot API be implemented using just a doubly linked list for bookkeeping?

The description makes it sound as if it’s more complex than the current implementation of roots.
This API gets rid of the indexation necessary to let the user free to pick the root address. A doubly linked list seems enough for the same effect, while being of the same complexity as the skip list.
So the extra complexity is there for performance reason (which is nice) but not inherent to the proposed change.

To add to Gabriel's answer (one can already root a malloced value globally), this is not correct since boxroot will also lift some expressiveness constraints regarding cross-thread deallocation, which might be harder to do efficiently with current global roots or doubly-linked lists. (As explained in the RFC, cross-thread deallocation covers two use-cases: 1) being able to drop a root from a thread that does not hold the lock with systhreads, which would be hard to ensure with a Rust-style type system and might be impractical from an ownership perspective, and 2) being able to pass ownership of roots between domains in multicore.)

To play along with your train of thoughts, if efficiency was completely out of the picture, this could be done simply by protecting global roots (or a doubly linked list) with a mutex. So efficiency is indeed a crucial aspect to the proposal, but for an apples-to-apples comparison in terms of complexity one must ask how cross-thread deallocation would be implemented with a would-be simpler proposal. (The prototype is not currently thread-safe, but it is unlikely to be much more complicated once it is.)

Second: do you think a weak root could be added in the future?

Is there something bad with rooting a weak pointer?

@stedolan
Copy link
Contributor

The proposed API looks good to me! It's quite similar to the caml_root API in multicore, which had similar motivations. I have two minor questions / suggestions:

  • What's the rationale for boxroot_get_ref? It encodes a constraint that boxroot access will never involve a read barrier. This is true now and unlikely to change soon, but I don't currently see what is gained by making this constraint.
  • In boxroot_modify, would you consider changing the signature to something like:
    boxroot boxroot_replace(boxroot, value);
    The current proposed semantics look good, but I found the signature of boxroot_modify misleading: at first glance, it looked like it was modifying an existing boxroot rather than deleting it and creating new one. An API that explicit takes and returns a boxroot might make this clearer.

Regarding the implementation, have you considered reusing the remembered set? (Add_to_ref_table and related macros, used by the minor GC to track young pointers). I suspect an implementation along these lines could be both simpler and faster than the generational-pools approach.

@gasche
Copy link
Member

gasche commented Mar 22, 2021

If I understand correctly, the proposal is to use a non-generational allocator (create new boxroots, either old or new, in a single container (pool, doubly-linked list, etc.), and in addition insert young boxroots to the remembered set? I remember we considered it when a modify operation replaces an old value with a young value, but not to track boxroots that are created as young values.

For the latter usage, one downside I see is that roots in the remembered set / reftable are promoted unconditionally, which would be bad for short-lived boxroots (typically if used instead of local roots). And I don't see how to track the position of a young-boxroot in the reftable to "remove" it on boxroot deletion. So I have the impression that our generational approach should be faster for short-lived young boxroots, which are important. Did you have something else in mind?

@let-def
Copy link

let-def commented Mar 22, 2021

Second: do you think a weak root could be added in the future?

Is there something bad with rooting a weak pointer?

Performance and convenience, which are nice to have and in line with this PR. I did what you suggest and would still prefer some assistance from the API. (Actually with a dynamically resized weak array to pool the allocations, quite like this PR). Let me try to emphasise why weak roots are worth supporting.

When used in lieu of local roots, the current proposal offers a sensible alternative that enables a safe and efficient Rust API.
The lifetime of these local roots is tied to a stack frame and as such is fairly predictable. It lets one implement external code whose semantics with respect to memory retention does not differ from OCaml one. The OCaml native compiler does a finer-grained lifetime analysis that allows releasing local roots earlier, but even that could be implemented in an external with enough work).

However the story is not the same when working with data structures. As soon as a finalizer (either from a custom object or from a call to Gc.finalize) is used to release a root, collection behaves differently. The indirection of going through a root hides the dependencies from the GC.
A weak root alone does not solve the problem, but provides a primitive that allows implementing a solution with slightly more work.

Adding a scanning function to custom objects would be even better. Stable pointers can address the issue too: as long as the caller ensures that the stable pointer outlive the external data structure, both can interoperate without affecting the performance profile. Any of these solutions would simplify binding of external libraries that manipulate structured data, and weak roots seems to be the simplest way to achieve that.

(Personally, I encountered these issues while working with C libraries that construct graph structures: chipmunk for 2d physic simulation and a few UI toolkits). I hope I am not going too off-topic, I am open to more discussions on other channels.

@stedolan
Copy link
Contributor

If I understand correctly, the proposal is to use a non-generational allocator (create new boxroots, either old or new, in a single container (pool, doubly-linked list, etc.), and in addition insert young boxroots to the remembered set? I remember we considered it when a modify operation replaces an old value with a young value, but not to track boxroots that are created as young values.

For the latter usage, one downside I see is that roots in the remembered set / reftable are promoted unconditionally, which would be bad for short-lived boxroots (typically if used instead of local roots). And I don't see how to track the position of a young-boxroot in the reftable to "remove" it on boxroot deletion. So I have the impression that our generational approach should be faster for short-lived young boxroots, which are important. Did you have something else in mind?

I did have something else in mind! Here's a slightly more fleshed-out proposal for the sort of implementation I was imagining:

Boxroots are word-sized, containing either a user value (if allocated) or a pointer to the next free boxroot (if deallocated). Free boxroots then form a singly-linked list, used as a stack. The operations on major values are:

  • To create a boxroot for a major value, pop a boxroot from the freelist, fill it in, and return it. If the free list is empty, allocate a new chunk and thread the new boxroots onto the list. (Allocate new boxroots in large chunks to amortise overhead)
  • To delete a boxroot containing a major value, push it onto the freelist (i.e. overwrite it with the current free list pointer and make it the new head)
  • The free pointers are encoded as integers (1 in the low bit). At the start of major GC, scan each boxroot by sequentially traversing each chunk. Since deallocated boxroots contain integer-encoded values, nothing happens to them.

As well as the freelist described above, use a second freelist for boxroots that are registered in the remembered set:

  • To create a boxroot for a minor value, pop a boxroot from the minor freelist and fill it in. If the minor free list is empty, allocate a major boxroot as above and put it in the remembered set.
  • To delete a boxroot containing a minor value, do as above but put it in the minor freelist rather than major.
  • At each minor GC, push the entire minor freelist onto the major freelist (this is O(1) by swapping pointers)

This way, short-lived boxroots are just pushed and popped onto either freelist, which is fast. Minor GC will traverse each boxroot that held a minor pointer since the last minor GC. (This is a no-op if the boxroot no longer contains a minor pointer).

@gasche
Copy link
Member

gasche commented Mar 22, 2021

@stedolan this sounds similar to our implementation, but a bit different. (I think our "pools" are your "chunks", we have "minor" pools that contain a mix of young and old values, and "major" pools that contain only old values. When all minor pools are full, we "demote" a major pool with low occupancy. Instead you propose to allocate the minor root in a major pool, using the remembered set. My intuition is that it would be slower, opposite to yours it seems.) Could we discuss these implementation details on the implementation repository directly? Feel free to open an issue at https://gitlab.com/ocaml-rust/ocaml-boxroot/-/issues , or if you give me your Gitlab handle I can open an issue and cc: you.

I hope I am not going too off-topic, I am open to more discussions on other channels.

This is related to the design issues underlying the RFC, so I'm happy to discuss it here. Interesting discussions are a better situation than standard RFCs with no discussion.

Adding a scanning function to custom objects would be even better.

If I understand correctly, your idea would be that, instead of using global roots, custom values capturing OCaml values could provide a callback to scan their inner references to OCaml values on the heap. Sounds interesting, and possibly simpler than a workaround using weak roots. Would you like to submit an RFC? :p

(I would be happy to propose to experiment with weak roots, but I don't see how to do it outside the runtime, and also the runtime ephemeron runtime support is noticeably more complex than the old weak-pointers support, so it is a bit less tempting.)

@gadmm
Copy link
Author

gadmm commented Mar 22, 2021

  • What's the rationale for boxroot_get_ref? It encodes a constraint that boxroot access will never involve a read barrier. This is true now and unlikely to change soon, but I don't currently see what is gained by making this constraint.

boxroot_get_ref gets in Rust the type ValueRef<'a> boxroot_get_ref(&'a Boxroot) where ValueRef is the type used as arguments to functions in our caller-root calling discipline. It is used to implement the Deref trait for roots. ValueRef denotes a pointer to a location in memory which may or may not be updated by the GC (which is reflected in its lifetime). The point of ValueRef in our caller-root calling discipline (instead of passing a reference to a Boxroot, say) is (to avoid an indirection, and) support the current idioms where one avoids rooting when not necessary: indeed one can get a ValueRef from many sources: immediates, non-rooted values (references of which can be passed to functions that do not allocate), direct pointers to the OCaml heap, other kinds of roots (if there is still some use for global roots, e.g. for compatibility or if the user cannot chose the memory location, or perhaps in the future direct pointers to the OCaml stack as mentioned in the section "Future possibilities").

ValueRef is defined as &ValueCell where ValueCell is an abstraction of UnsafeCell so Rust knows that the GC is susceptible to update the value of the pointer anytime and avoids assuming it does not change. If you are curious you can find more detail in the following experiment: https://gitlab.com/gadmm/ocaml-rust/-/blob/master/src/main.rs#L607-636.

A read barrier as in the concurrent multicore GC would not change the signature of boxroot_get_ref since mutable cells are never stored directly as a boxroot. But maybe you had some other hypothetical scenario in mind?

  • In boxroot_modify, would you consider changing the signature to something like:
    c boxroot boxroot_replace(boxroot, value);

The current proposed semantics look good, but I found the signature of boxroot_modify misleading: at first glance, it looked like it was modifying an existing boxroot rather than deleting it and creating new one. An API that explicit takes and returns a boxroot might make this clearer.

While it is functionally equivalent to deleting and creating, it does offer the guarantee that it always modifies the boxroot in place except at most once between two minor collections, which is why it is close to modifying the boxroot in place (also you do not need to test whether it is NULL after modification unlike a creation function). I assume in your suggestion, boxroot_replace takes ownership of its first argument? If so, I think the two APIs are equivalent. Is the problem about taking a boxroot * as argument instead of a boxroot, which you call misleading?

@gadmm
Copy link
Author

gadmm commented Mar 22, 2021

Performance and convenience [...]
However the story is not the same when working with data structures. As soon as a finalizer (either from a custom object or from a call to Gc.finalize) is used to release a root, collection behaves differently. The indirection of going through a root hides the dependencies from the GC.

I will try to rephrase the issue you are describing. Roots naturally tend to have a reference-counting flavour to them in the sense that they lead to leaks if you let them create cycles of dependencies, which can happen naturally by exchanging values back and forth across the FFI. From this point of view, it makes sense to offer "weak" variants to break the cycles by hand, that would have similar performance characteristics.

Adding a scanning function to custom objects would be even better. Stable pointers can address the issue too: as long as the caller ensures that the stable pointer outlive the external data structure, both can interoperate without affecting the performance profile. Any of these solutions would simplify binding of external libraries that manipulate structured data, and weak roots seems to be the simplest way to achieve that.

Again to be sure I understand:
All three techniques achieve different things, and none entirely replaces the need for roots and weak pointers. There are already libraries that hack around in order to have stable pointers indeed. About the idea of custom scanning functions, while this sounds appealing, it is unclear how you would write such a scanning function when you manipulate foreign abstract data types. Experiments with GCs in Rust involved the automatic derivation of scanning functions using traits, which simplifies the writing, but this still requires cooperation from the owner of the foreign abstract types involved. Or if you assume solved the problem of tracking the values contained in a foreign data structure without knowledge of the latter, I wonder what is missing for implementing something equivalent already (by storing pointers to a collection of values allocated on the major heap).

I hope I am not going too off-topic, I am open to more discussions on other channels.

I think this is on-topic, especially as it gives an example of something that would be easier to do if boxroot was integrated with the runtime.

@gadmm
Copy link
Author

gadmm commented Mar 22, 2021

Here's a slightly more fleshed-out proposal for the sort of implementation I was imagining:

Thanks, this is a very interesting idea. A couple points:

  • This removes much of the complexity of promoting/demoting pools, which I like very much.
  • I expect that going through the remembered set is more costly for non-light workloads (costs of allocation, using an indirection, and scanning randomly), even if your design solves the problem of deallocating roots to young values.
  • Various details would lead to different performance characteristics (different boxroot_modify optimisations, no early-exit optimisation).

For the performance in light workloads (only a couple of roots allocated between two minor collections), we were already thinking about using the remembered set for the first few young roots after a minor collection before switching to young pool allocation, but not in a way that had a chance to scale to heavier workloads.

Due to the simpler implementation it is definitely something I would like to try, and only the performance figures can decide.

@let-def
Copy link

let-def commented Mar 24, 2021

Performance and convenience [...]
However the story is not the same when working with data structures. As soon as a finalizer (either from a custom object or from a call to Gc.finalize) is used to release a root, collection behaves differently. The indirection of going through a root hides the dependencies from the GC.

I will try to rephrase the issue you are describing. Roots naturally tend to have a reference-counting flavour to them in the sense that they lead to leaks if you let them create cycles of dependencies, which can happen naturally by exchanging values back and forth across the FFI. From this point of view, it makes sense to offer "weak" variants to break the cycles by hand, that would have similar performance characteristics.

Yes, that's precisely that.

Adding a scanning function to custom objects would be even better. Stable pointers can address the issue too: as long as the caller ensures that the stable pointer outlive the external data structure, both can interoperate without affecting the performance profile. Any of these solutions would simplify binding of external libraries that manipulate structured data, and weak roots seems to be the simplest way to achieve that.

Again to be sure I understand:
All three techniques achieve different things, and none entirely replaces the need for roots and weak pointers. There are already libraries that hack around in order to have stable pointers indeed. About the idea of custom scanning functions, while this sounds appealing, it is unclear how you would write such a scanning function when you manipulate foreign abstract data types. Experiments with GCs in Rust involved the automatic derivation of scanning functions using traits, which simplifies the writing, but this still requires cooperation from the owner of the foreign abstract types involved. Or if you assume solved the problem of tracking the values contained in a foreign data structure without knowledge of the latter, I wonder what is missing for implementing something equivalent already (by storing pointers to a collection of values allocated on the major heap).

You are right that the three techniques achieve different things, sorry for conflating them in my previous message and thanks for the clarification. Of all three, i think weak roots & custom scanning would help the most.

I indeed assume "solved" the problem of tracking values contained in a foreign data structure (there is no generic answer to that question, it is very library specific; and, worse, non-rust libraries rarely specify which value is retained in which scope).

A large part of my "Cuite" (Qt bindings) experiment is doing that: (pooled) weak roots to refer to OCaml values from C-side, and OCaml collections that mirror the foreign dependency graph to enable efficient Garbage Collection. With auto-generated code and hooks in Qt runtime to automate that. The goal was to make foreign data structures behave like OCaml ones. This was quite difficult to achieve, so I think it would be worth providing appropriate API to help bindings implementations.
For instance, one can't directly use a custom object: they cannot refer at all to objects in OCaml heap. Two levels of wrapping are needed: one ocaml object to store the dependency graph that is also passed to the stubs code, and one to attach the finalizer.

@ctk21
Copy link

ctk21 commented Apr 8, 2021

The motivation for the proposal and the performance looks good to me.

From a multicore perspective:

  • multicore is not planning a special root mechanism now it uses a stop-the-world minor collection. We aim to be compatible with the existing FFI and root functions.
  • multicore is intending to use and provide the same scanning hooks as the existing runtime; improving the scanning hooks along the lines suggested in this RFC (if that is the route people want to go) doesn't look a problem and might prove useful.

The one bit of the proposal I worry about in a multi-threaded environment is what will happen if one thread is updating a boxroot with boxroot_modify while another thread is reading it with boxroot_get. Care will be needed to devise a scheme to allow the delete/create inside boxroot_modify to be done safely with good performance given that another thread may be reading that boxroot.

It looks like the authors have thought about the multi-threading needed, have they tried things out against the multicore runtime?

Multicore is now up at 4.12 and we have a variant 4.12+domains that removes the effects syntax changes, so it might not be a massive leap to try code against that.

@gadmm
Copy link
Author

gadmm commented Apr 6, 2022

An update about ocaml-boxroot as I worked on it these past few days. Let me also answer to a few remarks (I had this on my to-do list since last year).

@ctk21 and others:

I have improved the support for hooks in multicore at ocaml/ocaml#10965. Following this, I have implemented rudimentary ocaml-boxroot support for OCaml multicore (it still uses a global lock). We would like to implement an efficient version inspired by concurrent allocators; for this we need to efficiently know if we have the domain lock and so I have opened the PR ocaml/ocaml#11138. I would like to try to do it before multicore is released, to ensure that we have all that we need in terms of hooks. (I am eyeing at caml_domain_stop_hook which we might need to make atomic, and which might be useful in general for C libraries that have domain-local state.)

I think boxroot would be easier, simpler and more feature-full to implement it as part of OCaml multicore. In a first time I am fine with having it as a separate library if the addition/adaptation of hooks keeps going well.

The thread-safety of boxroot_modify is a good question. Boxroots are expected to follow an ownership discipline. To share a boxroot between threads, you would use a Rust type like Arc<Boxroot> which only gives read-only references to its contents.

@let-def:

Gabriel has tried with a doubly-linked list implementation (see the dll_boxroot benchmark at https://gitlab.com/ocaml-rust/ocaml-boxroot). The performance was better than we expected. We think that the cache locality of the underlying allocator played a role in our benchmarks. (In particular we suspect that our benchmarks might not be representative of situations where the allocations on this size class compete with allocations in the rest of the program.)

The efficiency of pools is only one part, the other is making it efficiently thread-safe, for which we think pool can help too.

Regarding weak roots, in terms of Rust API it might be quite different. But let us consider it as something doable if boxroot was integrated with OCaml (or implemented by someone motivated enough by proposing an implementation in the library, which includes creating the appropriate hooks in the OCaml runtime).

@stedolan:

We (Gabriel and I) have now tried fairly extensively to implement and optimise your proposed generational strategy. It works well and gave a simpler implementation than we had at the time, but we now see that a better (faster, simpler) strategy is to have an inlined fast path for allocation and deallocation that does not even tests for Is_young inside them. Everything is rooted as-if young, and then we essentially just keep track of which pools have not yet been scanned during minor collection.

This amounts to moving the Is_young test from the mutator to the collector (which is guaranteed to result in fewer tests overall, and could be order of magnitudes fewer in scenarios that create a lot of short-lived roots, e.g. local roots). Benchmarks show that in case of many old root creations and few young roots creations, the overhead during scanning is negligible (even more so by optimizing pool scanning for the minor GC; here you can optimize the Is_young test much better than inside the mutator).

We have observed that inlining has a much greater impact than the choice of generational strategy. The (not so) naive strategy has creation and deletion functions that take 10-ish x86-64 instructions, compared to 30-40ish with your strategy (implemented the best I could), and proportionally fewer branches (and only statically predictable ones). In the context inlined code, I imagine that the gains in terms of code size and branch prediction can be important. (Our benchmarks are fairly small in terms of code size, so not well-suited to test this hypothesis; nevertheless we get slightly better performance with inlining alone than with inlining+remembered set.)

However, the simplicity of your approach (first implemented by Gabriel) pushed me to reconsider the complicated scheme I had before.

Please see https://gitlab.com/ocaml-rust/ocaml-boxroot/-/merge_requests/43 and linked discussions if you want to know or discuss more.

I have a question regarding:

The free pointers are encoded as integers (1 in the low bit). At the start of major GC, scan each boxroot by sequentially traversing each chunk. Since deallocated boxroots contain integer-encoded values, nothing happens to them.

The Is_young test excludes pointers to boxroot pools (free pointers) just like pointers to the major heap, if inserted into the remembered set. As far as I can see, only a sanity check added in DEBUG mode by the multicore PR in one place introduces a stronger constraint (and it would be easy to fake it on our side). Do you think there is a reasonable way to avoid the pointer tagging while using OCaml's ref table? We will probably end up using the remembered set to simplify the implementation and the interface of boxroot_modify (and reimplement it in boxroot, without this debug check, sooner than tagging pointers).

@gadmm
Copy link
Author

gadmm commented Apr 6, 2022

For now, we will try to have a working ocaml-boxroot release with multicore as a library, and add necessary support for it (hooks, etc.). Once we have a better picture towards what integration into OCaml looks like, I will update the RFC.

@gasche
Copy link
Member

gasche commented Jul 27, 2023

Note: we discussed this RFC again at our last development meeting, and the consensus was in favor of us submitting a PR to consider upstreaming a Boxroot implementation. Note that this still requires an important amount of work, as we want to revisit our implementation to reuse existing runtime code where possible.

@gadmm gadmm mentioned this pull request Sep 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants