[Bug]: Overlapping contexts #1887

mladedav · 2024-06-17T21:38:09Z

What happened?

If ContextGuards are dropped out of order, the state may be corrupted. Each ContextGuard restores the context that was active when the guard was created, so a context other than the root one can remain active after all ContextGuards are dropped.

The current code assumes that context guards will be dropped in reverse order of their creation. I didn't see anything in the spec saying that this should or shouldn't be assumed. I think this whole report mostly boils down to what the behavior should be: whether out-of-order drops should be allowed, or whether the API should be built so that they are impossible to trigger.

The following is a slightly modified version of opentelemetry::context::tests::nested_contexts.

    #[test]
    fn overlapping_contexts() {
        #[derive(Debug, PartialEq)]
        struct ValueA(&'static str);
        #[derive(Debug, PartialEq)]
        struct ValueB(u64);
        let outer_guard = Context::new().with_value(ValueA("a")).attach();

        // Only value `a` is set
        let current = Context::current();
        assert_eq!(current.get(), Some(&ValueA("a")));
        assert_eq!(current.get::<ValueB>(), None);

        let inner_guard = Context::current_with_value(ValueB(42)).attach();
        // Both values are set in inner context
        let current = Context::current();
        assert_eq!(current.get(), Some(&ValueA("a")));
        assert_eq!(current.get(), Some(&ValueB(42)));

        assert!(Context::map_current(|cx| {
            assert_eq!(cx.get(), Some(&ValueA("a")));
            assert_eq!(cx.get(), Some(&ValueB(42)));
            true
        }));

        drop(outer_guard);

        // `inner_guard` is still alive, so arguably both `ValueA` and `ValueB`
        // should still be accessible (not sure about this one). Currently neither is.
        let current = Context::current();
        assert_eq!(current.get::<ValueA>(), None);
        assert_eq!(current.get::<ValueB>(), None);

        drop(inner_guard);

        // Both guards are dropped, so neither value should be accessible,
        // yet `ValueA` is active again.
        let current = Context::current();
        assert_eq!(current.get(), Some(&ValueA("a")));
        assert_eq!(current.get::<ValueB>(), None);
    }

API Version

0.23.0, git master

SDK Version

N/A

What Exporter(s) are you seeing the problem on?

N/A

Relevant log output

No response

lalitb · 2024-06-18T16:37:04Z

We still don't know the right behavior in this scenario. We will look into this. @mladedav, you are welcome to help here too.

mladedav · 2024-06-18T17:55:42Z

> We still don't know the right behavior in this scenario

When you say still, does that mean this has been discussed before?

I'll ask in the spec repo what the behavior should be, then.

lalitb · 2024-06-18T21:00:53Z

@mladedav Sorry for the confusion. I meant that even though the spec is clear about not allowing incorrect ordering of context detach, let us know if you have a suggestion on how to fix the code.

mladedav · 2024-06-18T21:17:35Z

As I read it, the spec kind of allows it: we should "identify wrong call order", but it doesn't really say which context should be active after the operation. Since the implementation uses a drop guard, there is no way to return an error.

Other implementations should have the same issue with try/finally or with blocks. Do we know how they handle it?

Since the API is optional, I wonder whether not providing it at all would be an option, and instead providing methods such as Context::set_active<T>(self, fun: impl FnOnce() -> T) -> T { /* runs fun with self as active context */ }, which prevent this misuse. But I guess that's steering too far away from the spec?

Otherwise, I guess the contexts could work as a linked list: if we drop the guard for one in the middle, we just reconnect its child with its parent but leave the active span as is. Does that sound good?
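To make the closure-based idea concrete, here is a minimal sketch under stated assumptions. `ToyContext`, its `label` field, and `demo_nested` are hypothetical stand-ins and not part of the opentelemetry crate; only the `set_active` signature comes from the comment above. Because the previous context is restored when the closure returns, scopes can only nest and out-of-order release cannot be expressed.

```rust
use std::cell::RefCell;

// Hypothetical toy type for illustration; NOT the opentelemetry `Context`.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ToyContext {
    pub label: Option<&'static str>,
}

thread_local! {
    // The context that is currently active on this thread.
    static CURRENT: RefCell<ToyContext> = RefCell::new(ToyContext::default());
}

impl ToyContext {
    /// Returns a clone of the context active on this thread.
    pub fn current() -> ToyContext {
        CURRENT.with(|c| c.borrow().clone())
    }

    /// Runs `fun` with `self` as the active context, then restores the
    /// previously active context, even if `fun` panics.
    pub fn set_active<T>(self, fun: impl FnOnce() -> T) -> T {
        // Private guard: callers never see it, so they cannot drop it early.
        struct Restore(Option<ToyContext>);
        impl Drop for Restore {
            fn drop(&mut self) {
                if let Some(prev) = self.0.take() {
                    CURRENT.with(|c| *c.borrow_mut() = prev);
                }
            }
        }
        let previous = CURRENT.with(|c| c.replace(self));
        let _restore = Restore(Some(previous));
        fun()
    }
}

/// Records the active label at four points: inside the outer scope, inside
/// the inner scope, after the inner scope ends, and after the outer scope ends.
pub fn demo_nested() -> Vec<Option<&'static str>> {
    let mut seen = Vec::new();
    ToyContext { label: Some("a") }.set_active(|| {
        seen.push(ToyContext::current().label);
        ToyContext { label: Some("b") }.set_active(|| {
            seen.push(ToyContext::current().label);
        });
        seen.push(ToyContext::current().label);
    });
    seen.push(ToyContext::current().label);
    seen
}
```

The trade-off is that a closure-scoped API cannot keep a context active across a return from the enclosing function, which is exactly what a guard allows.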
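One possible reading of this idea, sketched with a per-thread stack instead of a literal linked list. Everything here (`STACK`, `StackGuard`, `attach`, `current`, `demo_out_of_order`) is a hypothetical toy and not the opentelemetry API, and a context is reduced to a single label. A guard dropped while it is not on top only vacates its slot, so the active context stays as is; vacated slots are discarded once they reach the top.

```rust
use std::cell::RefCell;
use std::marker::PhantomData;

thread_local! {
    // One slot per attached guard. `None` marks a guard that was dropped
    // while it was not the topmost entry.
    static STACK: RefCell<Vec<Option<&'static str>>> = RefCell::new(Vec::new());
}

/// Hypothetical guard that remembers which slot it owns.
pub struct StackGuard {
    slot: usize,
    // Raw-pointer marker makes the guard !Send, so it is dropped on the
    // thread whose stack it indexes.
    _not_send: PhantomData<*const ()>,
}

/// Makes `label` the active context and returns a guard for it.
pub fn attach(label: &'static str) -> StackGuard {
    STACK.with(|s| {
        let mut stack = s.borrow_mut();
        stack.push(Some(label));
        StackGuard { slot: stack.len() - 1, _not_send: PhantomData }
    })
}

/// Returns the active label, or `None` when nothing is attached.
pub fn current() -> Option<&'static str> {
    STACK.with(|s| s.borrow().last().copied().flatten())
}

impl Drop for StackGuard {
    fn drop(&mut self) {
        STACK.with(|s| {
            let mut stack = s.borrow_mut();
            // Vacate this guard's slot wherever it sits in the stack.
            stack[self.slot] = None;
            // Pop vacated slots from the top; the loop stops at the first
            // slot that still belongs to a live guard.
            while matches!(stack.last(), Some(None)) {
                stack.pop();
            }
        });
    }
}

/// Replays the drop order from the report: outer first, then inner.
pub fn demo_out_of_order() -> Vec<Option<&'static str>> {
    let outer = attach("a");
    let inner = attach("b");
    let mut seen = vec![current()];
    drop(outer); // not on top: slot vacated, active context unchanged
    seen.push(current());
    drop(inner); // on top: popped together with the vacated slot below it
    seen.push(current());
    seen
}
```

In this model the inner context stays active until its own guard is dropped, and the root context is active again once both guards are gone, regardless of drop order.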

bantonsson · 2024-11-29T09:49:08Z

Hey @lalitb @mladedav, I'm running into this issue as well while looking at tokio tracing and OpenTelemetry tracing interop. For reference, the Java implementation just drops the offending guard on the floor and logs at the FINE level, which I would equate to DEBUG.

Would this behavior be adequate (yes, the state would still be corrupted in some sense, but dropping out of order is bad behavior to begin with), or should we have a proper list/stack and reconnect things?

mladedav · 2024-11-29T11:28:37Z

If I understand correctly, it ignores the close call? I don't think that really works: if you have contexts A, B, C, then drop B and that close gets ignored, you can never really close the B context and, by extension, the A context.
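The A, B, C scenario can be replayed with a toy model of the "ignore an out-of-order close" behavior as described in this thread. This is a sketch under assumptions, not the Java code or the opentelemetry API: `ACTIVE`, `IgnoringGuard`, `attach_ignoring`, and `demo_ignored_close` are hypothetical names, and labels are assumed to be unique so they can stand in for context identity.

```rust
use std::cell::Cell;

thread_local! {
    // Label of the context that is currently active on this thread.
    static ACTIVE: Cell<Option<&'static str>> = Cell::new(None);
}

/// Hypothetical guard that only restores its predecessor when the context
/// it attached is still the active one.
pub struct IgnoringGuard {
    attached: &'static str,
    previous: Option<&'static str>,
}

/// Makes `label` active and remembers what was active before.
pub fn attach_ignoring(label: &'static str) -> IgnoringGuard {
    let previous = ACTIVE.with(|a| a.replace(Some(label)));
    IgnoringGuard { attached: label, previous }
}

impl Drop for IgnoringGuard {
    fn drop(&mut self) {
        ACTIVE.with(|a| {
            if a.get() == Some(self.attached) {
                a.set(self.previous);
            }
            // Otherwise the close is ignored. A drop guard runs only once,
            // so this guard never gets another chance to restore anything.
        });
    }
}

/// Attaches A, B, C, then drops B, C, A and returns what is left active.
pub fn demo_ignored_close() -> Option<&'static str> {
    let a = attach_ignoring("A");
    let b = attach_ignoring("B");
    let c = attach_ignoring("C");
    drop(b); // ignored: C is active
    drop(c); // restores B
    drop(a); // ignored: B is active, not A
    ACTIVE.with(|active| active.get())
}
```

After all three guards are gone, B is still active in this model, which is the stuck state the comment above describes.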

bantonsson · 2024-11-29T12:07:17Z

@mladedav Yes, that's exactly what the Java one does, and that leaves you with a corrupted state. I also think that the proper way would be better. I'll look into it.

mladedav added the bug and triage:todo labels Jun 17, 2024

lalitb removed the triage:todo label Jun 18, 2024

mladedav mentioned this issue Jun 18, 2024: Can active Context scopes be closed in arbitrary order? open-telemetry/opentelemetry-specification#4081 (Closed)

bantonsson linked a pull request Dec 3, 2024 that will close this issue: fix: Allow overlapping context scopes #2378 (Open)
