Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Overlapping contexts #1887

Open
mladedav opened this issue Jun 17, 2024 · 7 comments · May be fixed by #2378
Open

[Bug]: Overlapping contexts #1887

mladedav opened this issue Jun 17, 2024 · 7 comments · May be fixed by #2378
Labels
bug Something isn't working

Comments

@mladedav
Copy link

mladedav commented Jun 17, 2024

What happened?

If ContextGuards are dropped out of order, the state may be corrupted. Each ContextGuard restores the last context but that may cause a context different than the root one to be active after all ContextGuards are dropped.

The current code assumes that context guards will be dropped in reverse order of their creation. I didn't see anything in the spec saying that this should or shouldn't be assumed. I think this whole report mostly boils to what should be the behavior and whether this should be allowed or the API should be built so that these out-of-order operations were impossible to trigger.

The following is slightly changed opentelemetry::context::tests::nested_contexts.

    #[test]
    fn overlapping_contexts() {
        #[derive(Debug, PartialEq)]
        struct ValueA(&'static str);
        #[derive(Debug, PartialEq)]
        struct ValueB(u64);
        let outer_guard = Context::new().with_value(ValueA("a")).attach();

        // Only value `a` is set
        let current = Context::current();
        assert_eq!(current.get(), Some(&ValueA("a")));
        assert_eq!(current.get::<ValueB>(), None);

        let inner_guard = Context::current_with_value(ValueB(42)).attach();
        // Both values are set in inner context
        let current = Context::current();
        assert_eq!(current.get(), Some(&ValueA("a")));
        assert_eq!(current.get(), Some(&ValueB(42)));

        assert!(Context::map_current(|cx| {
            assert_eq!(cx.get(), Some(&ValueA("a")));
            assert_eq!(cx.get(), Some(&ValueB(42)));
            true
        }));

        drop(outer_guard);

        let current = Context::current();
        assert_eq!(current.get::<ValueA>(), None);
        assert_eq!(current.get::<ValueB>(), None);
        // `inner_guard` is still alive so both `ValueA` and `ValueB` should still be accessible? Not sure about this one.


        drop(inner_guard);

        let current = Context::current();
        assert_eq!(current.get(), Some(&ValueA("a")));
        assert_eq!(current.get::<ValueB>(), None);
        // Both guards are dropped and neither value should be accessible.
    }

API Version

0.23.0, git master

SDK Version

N/A

What Exporter(s) are you seeing the problem on?

N/A

Relevant log output

No response

@mladedav mladedav added bug Something isn't working triage:todo Needs to be traiged. labels Jun 17, 2024
@lalitb lalitb removed the triage:todo Needs to be traiged. label Jun 18, 2024
@lalitb
Copy link
Member

lalitb commented Jun 18, 2024

We still don't know the right behavior in this scenario. Would be looking in this. Also, @mladedav if you would like to help here.

@mladedav
Copy link
Author

We still don't know the right behavior in this scenario

When you say still, does that mean this has been discussed before?

I'll try to ask in the spec repo then what should be the bahavior.

@lalitb
Copy link
Member

lalitb commented Jun 18, 2024

@mladedav Sorry for the confusion. I wanted to say that even though the specs is clear about not allowing the incorrect ordering of context detach, if you have suggestion on how to fix the code.

@mladedav
Copy link
Author

As I read it kind of allows it but we should "identify wrong call order" but it doesn't really say what context should be active after the operation. Since the implementation uses a drop guard, there is no way to return an error.

But other implementations should have the same issue with try/finally or with blocks, do we know how do they handle it?

Since the API is optional, I wonder if not providing the API at all would be an option and maybe providing methods such as Context::set_active<T>(self, fun: impl FnOnce() -> T) -> T { /* runs fun with self as active context */ } instead which prevent this misuse. But I guess that that's steering too for away from the spec?

Otherwise, I guess that the contexts could work as a linked list and if we drop guard for one in the middle we just reconnect its child with its parent but leave the active span as is. Does that sound good?

@bantonsson
Copy link
Contributor

bantonsson commented Nov 29, 2024

Hey @lalitb @mladedav I'm running into this issue as well when looking at tokio tracing and OpenTelemetry Tracing interop. As a reference the Java implementation just drops the offending guard on the floor and logs at the FINE level which I would equate to DEBUG.

Would this behavior be adequate (yes the state would still be corrupted in some sense, but it is bad behavior), or should we have a proper list/stack and reconnect things?

@mladedav
Copy link
Author

If I understand it correctly, it ignores the close call? I don't think that really works since if you have contexts A, B, C, then drop B and that close gets ignored, you can never really close the B context and by extension the A context.

@bantonsson
Copy link
Contributor

@mladedav Yes, that's exactly what the Java one does, and that leaves you with a corrupted state. I also think that the proper way would be the better. I'll look into it.

@bantonsson bantonsson linked a pull request Dec 3, 2024 that will close this issue
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants