Nodes are released when they are no longer being observed. That happens in two situations:
- When `release` is called on a root. The graph is traversed and all nodes are marked as no longer being observed.
- In a `join` or `bind` node, when the inner graph changes, the previous one is released.
Releasing a sub-graph is a significant operation, as it traverses the whole sub-graph and can invoke arbitrary user functions (the `release` callback of primitives).
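To make this concrete, here is a sketch of a primitive wrapping an external resource, assuming a `Lwd.prim` constructor with `acquire`/`release` callbacks (the exact signature may differ from the shipped API; treat the names as illustrative):

```ocaml
(* Sketch: a primitive that reads a file when first observed.
   [acquire] runs when the node gains its first observer,
   [release] when the last observer goes away; both can run
   arbitrary user code, which is why releasing a sub-graph is
   a significant operation. *)
let file_contents path : string Lwd.t =
  let prim =
    Lwd.prim
      ~acquire:(fun _self ->
        let ic = open_in path in
        let len = in_channel_length ic in
        let s = really_input_string ic len in
        close_in ic;
        s)
      ~release:(fun _self _contents ->
        (* nothing to free here, but e.g. a file watcher or a
           GPU texture would be torn down in this callback *)
        ())
  in
  Lwd.get_prim prim
```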
# Release behavior should commute
Implemented naively, releasing leads to code whose behavior depends on evaluation order:
```ocaml
let computation_1 = Lwd.var (Lwd.pure 0)
let computation_2 = Lwd.var (Lwd.pure 0)
let expensive_int_computation : int Lwd.t =
...
let diff : int Lwd.t =
Lwd.map2 (-)
(Lwd.join (Lwd.get computation_1))
(Lwd.join (Lwd.get computation_2))
let root = Lwd.observe diff
```
Now assume we are observing the `diff` node and that we are switching back-and-forth between these two configurations:
```ocaml
let setup_1 () =
Lwd.set computation_1 (Lwd.pure 0);
Lwd.set computation_2 expensive_int_computation
let setup_2 () =
Lwd.set computation_1 expensive_int_computation;
Lwd.set computation_2 (Lwd.pure 0)
```
Because of the left-to-right evaluation order, the call to `map2` evaluates `computation_1` first and then `computation_2`.
Let's trace the following sequence of calls:
```ocaml
let r0 = Lwd.sample root
let () = setup_1 ()
let r1 = Lwd.sample root
let () = setup_2 ()
let r2 = Lwd.sample root
let () = setup_1 ()
let r3 = Lwd.sample root
```
Evaluating after the first call to `setup_1`:
- `computation_1` does not do anything special
- `computation_2` causes `expensive_int_computation` to be observed for the first time; the graph is traversed and each node is updated.
Evaluating after the first call to `setup_2`:
- `computation_1` also makes use of `expensive_int_computation`. Because it is already in use (since `computation_2` has not been updated yet), this requires no work; evaluation happens in $O(1)$.
- `computation_2` releases `expensive_int_computation`, but it is still used by `computation_1`; evaluation happens in $O(1)$.
Evaluating after the second call to `setup_1`:
- `computation_1` releases `expensive_int_computation`. Because it was the last observer, the sub-graph is released and previous evaluation results are dropped, evaluation happens in $O(n)$.
- `computation_2` acquires `expensive_int_computation`, which was just released. The sub-graph is traversed for acquisition, $O(n)$, and the evaluation that follows costs at least $O(n)$.
An apparently innocuous change made a computation go from $O(1)$ to $O(n)$. This lack of commutativity is worrying not just for performance reasons, but because programs typically perform side effects during the `acquire`/`release` phases, which can lead to different behaviors.
The solution to this problem is to delay the `release` operation so that it does not happen during the evaluation cycle. Sub-graphs to release are put in a queue that is flushed after evaluation.
## The release_queue
To this end, we introduce a new `release_queue` object. It accumulates nodes to release and, when deemed appropriate, actually releases them. The *when* is deliberately left open, as it depends a lot on the application.
If a single document is observed, flushing can happen at the end of the current evaluation cycle. But without stretching the imagination too much, we can find other scenarios with a different *when*:
- Nested document evaluation. For instance, for a visual application, a frame can be made of different phases: layout, render, event propagation. Each phase can be done by evaluating a document, but the right granularity for a cycle is a full frame rather than a phase.
This is beyond the scope of Lwd, but Lwd should be flexible enough to integrate this use case well.
- If a document is damaged during evaluation (as in the previous section), should we extend the evaluation cycle to the next fixed point? (Certain iterative layout algorithms might rely on that behavior.) Release would thus be delayed a bit, but if the layout does not converge, this turns into a memory leak.
- If an exception is raised during evaluation of the graph, should the release queue be flushed before returning the exception to the caller?
If no, then we might introduce memory leaks: as long as the computation keeps failing, the release queue will keep growing.
If yes, another can of worms opens:
- Arbitrary code is executed during release. This can clobber the exception backtrace, and this code can itself raise. We now have multiple exceptions to report to the user! (This is the same problem as a `try`-`finally` construct whose `finally` clause raises.)
- Commutativity is lost again: we might release a sub-tree that would not have been released if the computation had finished. Maybe the caller was expecting the exception and would have fixed the problem and resumed the computation.
Forbidding exceptions in Lwd is not acceptable either: there are valid use cases for them; sometimes code legitimately fails, and interrupting the computation is the right thing to do. Lwd should do its best to handle these situations gracefully.
For all these reasons, Lwd comes with a default behavior that does not require fiddling with `release_queue`s and is well-behaved and commutative as long as no exception is raised.
When exceptions are raised, Lwd defaults to releasing before returning control to the caller (the "If yes" clause above). The exception is wrapped in a decoration that:
- captures the backtrace of the first exception;
- collects the other exceptions that might have happened while flushing the queue.
All of this can be overridden by providing a custom `release_queue` and catching exceptions in the caller.
# New implementation
## release_queue object
```ocaml
type release_failure = exn * Printexc.raw_backtrace
type release_queue
val make_release_queue : unit -> release_queue
val flush_release_queue : release_queue -> release_failure list
```
A `release_queue` accumulates nodes to release and releases all of them when `flush_release_queue` is called.
If releasing raises an exception (which can only come from the user-provided `release` function of a primitive), the exception and its backtrace are captured and returned.
These failures are collected in a list. Normal execution will return the empty list.
## Sampling and releasing with custom `queue`
```ocaml
val sample : release_queue -> 'a root -> 'a
val release : release_queue -> 'a root -> unit
```
Nodes observability can change only during calls to `sample` or `release`.
The `release_queue` is filled with the nodes to release and nothing is released during evaluation.
If an exception is raised while sampling, it is not intercepted, and evaluation will resume on the next call to `sample`.
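Putting the pieces together, a caller managing its own queue might look like this (a sketch using the interface above; `doc` stands for any `'a Lwd.t`):

```ocaml
let run (doc : int Lwd.t) : int =
  let queue = Lwd.make_release_queue () in
  let root = Lwd.observe doc in
  (* Evaluation: nodes that fall out of use are only queued,
     never released mid-evaluation. *)
  let value = Lwd.sample queue root in
  (* Releasing happens at a point of the caller's choosing,
     e.g. at the end of a frame. Failures are returned as a
     list rather than raised. *)
  (match Lwd.flush_release_queue queue with
   | [] -> ()
   | failures ->
     List.iter
       (fun (exn, _backtrace) ->
          prerr_endline (Printexc.to_string exn))
       failures);
  value
```

This is the custom-queue path; the `quick_` functions below package the same sequence with a default queue.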
```ocaml
exception Release_failure of exn option * release_failure list
val quick_sample : 'a root -> 'a
val quick_release : 'a root -> unit
```
The easier `quick_` functions are provided if you don't want to be bothered with release management. However, their behavior is subtler in the presence of exceptions.
`quick_sample` releases nodes immediately after evaluation.
Exceptions raised during release are caught and the `Release_failure` exception is thrown at the end with all the exceptions.
If an exception is raised during evaluation, it is intercepted and the queue is still flushed. It is then re-raised, unless another exception happens during release, in which case `Release_failure` is thrown with the original exception stored in the first parameter.
In `quick_release`, if exceptions happen during release they are caught and wrapped in the `Release_failure` exception.
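For callers on the `quick_` path, handling the wrapped failures amounts to catching `Release_failure` (a sketch based on the signature above; error reporting is up to the application):

```ocaml
(* [original] is the evaluation exception, if any; [failures]
   are the exceptions raised by release callbacks, each paired
   with its backtrace. *)
let safe_quick_sample (root : int Lwd.root) : (int, exn option) result =
  match Lwd.quick_sample root with
  | value -> Ok value
  | exception Lwd.Release_failure (original, failures) ->
    List.iter
      (fun (exn, _backtrace) ->
         prerr_endline (Printexc.to_string exn))
      failures;
    Error original
```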