Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a way to get the current span #101

Closed
wants to merge 1 commit into from
Closed

Add a way to get the current span #101

wants to merge 1 commit into from

Conversation

rossabaker
Copy link
Member

I think this is something we want.

  • opentelemetry-java offers it in the form of Span.current()
  • Natchez hides the span and flatMaps, but does similar through Trace

I don't know whether it's performant or correct, but I see it in my toy example. If it's wanted, we can test and refine.

@@ -137,6 +142,7 @@ object Tracer {
private val builder = SpanBuilder.noop(noopBackend)
private val resourceUnit = Resource.unit[F]
val meta: Meta[F] = Meta.disabled
val currentSpan: F[Span[F]] = builder.startUnmanaged
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative, we can use Applicative[F].pure(Span.fromBackend(noopBackend))

Comment on lines +31 to +34
/** Returns the span from the scope, falling back to a noop if none is
* available.
*/
def currentSpan: F[Span[F]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ship may have already sailed, but this is the signature that I think prevents using Kleisli (instead of IOLocal)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that ship sailed with the Resource[F, Span[F]] that are already exposed via TracerMacro.

This is why Natchez's resource is a natural transformation: typelevel/natchez#526. I do agree we need to decide on #88 before moving forward.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like, is this a new API, or is it natchez-with-lazy-macros?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the signature that I think prevents using Kleisli (instead of IOLocal)

So actually I'm not sure if this is true now. After getting very lost in typelevel/natchez#713 (and dragging poor @bpholt into my delusions 😅 ) I realized, isn't that PR doing exactly this? i.e. making it possible to access the Span[F] within F, without compromising a Kleisli implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went down that same road for a while with Natchez and turned around. I had a hard time with the inner F being a Kleisli, however much I wanted the outer F to be a Kleisli, and I didn't want to introduce a G.

Note that in our case, the local environment is not a Span[F], but a (right now package-private) TraceScope.Scope.

@@ -33,6 +34,14 @@ private[java] class TracerImpl[F[_]: Sync](
val meta: Tracer.Meta[F] =
Tracer.Meta.enabled

def currentSpan: F[Span[F]] =
scope.current.map {
case TraceScope.Scope.Span(_, jSpan, ctx) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check that the context is also valid if ctx.isValid?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so.

@NthPortal
Copy link
Contributor

NthPortal commented Oct 23, 2023

this PR is a decent bit out of date at this point, but I think it's an important feature that needs to move forwards. I've already run into some awkwardness migrating other libraries from natchez to otel4s, and a solution would be very helpful. that said, I have a few concerns about the design.

while the Java otel library has decided that falling back to a noop span is an acceptable default, I'm not so sure we should have that as implicit behavior (does the otel spec require it?). if one doesn't realise it falls back to noop, they may accidentally lose all their tracing in a confusing and difficult to debug way. I'd like to propose the following alternative API:

def currentSpan: F[Option[Span[F]]]

// can probably implement this easily using `Tracer#meta`
def currentSpanOrNoop: F[Span[F]]

// raises an exception if no current span
def currentSpanUnsafe(implicit F: MonadCancelThrow[F]): F[Span[F]] =
  currentSpan.flatMap {
    _.fold(F.raiseError(new IllegalStateException("no current span")))(F.pure)
  }

thoughts?

@armanbilge
Copy link
Member

Bikeshed: currentSpanUnsafe should be currentSpanOrRaise. "Unsafe" gives non-pure/RT vibes.

Not sure about IllegalStateException vs e.g. NoSuchElementException, but it doesn't seem unreasonable.

@iRevive
Copy link
Contributor

iRevive commented Oct 24, 2023

I agree that this API should be available in the library.

Overall, I have a feeling that providing three nearly identical options is an overkill. But we still need to decide which one is more suitable F[Span[F]] or F[Option[Span[F]]].

F[Option[Span[F]]] feels more natural in the Scala world. On the other hand, it all depends on the use cases. I might be wrong, but it seems that the fraction of cases when we care about the validity of the span is relatively tiny.

For example, in which scenarios do we want to change the behavior of the program if the span is undefined?

Here are the operations we can do with the active span:

Read

Available operations:

  1. get span context

The possible use case that comes to my mind is adding span details to the logs to correlate logs with spans.

Write

Available operations:

  1. add attribute
  2. add event
  3. record exception
  4. set status
  5. update span name

For the write operations, in most cases, we shouldn't care whether the span is valid. All we want to do is:

for {
  span <- Tracer[F].currentSpan
  _    <- span.setStatus(Status.Error)
  _    <- span.addAttribute(Attribute("failure", "reason"))
} yield ()

As you mentioned:

if one doesn't realise it falls back to noop, they may accidentally lose all their tracing in a confusing and difficult to debug way

That's correct, but it also means the context wasn't propagated to the fiber in the very first place. So, even if we use the F[Option[Span[F]]] API instead, the outcome will be the same: the details won't be recorded:

for {
  spanOpt <- Tracer[F].currentSpan
  _       <- spanOpt.traverse(span => span.setStatus(Status.Error) *> span.addAttribute(Attribute("failure", "reason")))
} yield ()

The situation will be different if we prefer currentSpanOrRaise. However, how often do we want to interrupt the execution of the business logic because the tracing information is missing?

Terminate

  1. end span

That's a tricky one. We shouldn't do it explicitly in most cases. Even if we do, and the span is invalid, there are two options:

  1. the context wasn't propagated to the fiber - well, it's bad, we should at least log the error
  2. the implementation is noop - it's okay no problem here

From what I see, F[Span[F]] could be more convenient. But again, it's only my opinion. It would be nice to gather more feedback from active users.

Also, the program's behavior will change according to the selected implementation. For example, if we use a noop implementation, the Tracer[F].currentSpanOrRaise will always raise an error.

On a side note, If we want to ensure that the span is valid, we can add a utility method to the personal codebase:

for {
  span <- Tracer[F].currentSpan
  _    <- MonadThrow[F].raiseUnless(span.spanContext.isValid)(new RuntimeException("Where are my traces?"))
} yield ()

That way, we will have the exact error we need.

@NthPortal
Copy link
Contributor

it also means the context wasn't propagated to the fiber in the very first place. So, even if we use the F[Option[Span[F]]] API instead, the outcome will be the same: the details won't be recorded

from my perspective, currentSpan(OrRaise?) should exist purely so that you don't need to explicitly pass a span around; consequently, you MUST only call it when you know you're already within a span. as far as I'm concerned, you always need to be creating a child span inside a call to joinOrRoot, not just to guarantee that you have a span, but for clearer tracing so you know where the division of different locations/machines is. and in that case, you are guaranteed to then be in a span.

For example, if we use a noop implementation, the Tracer[F].currentSpanOrRaise will always raise an error.

this was my first instinct as well, but is incorrect (and should be clarified in docs). the current span may be noop, whether due to using the noop implementation or a call to Tracer#noopScope. so Tracer.noop.currentSpanOrRaise will always return a noop span, and never raise an exception. similarly, an Option-returning method would always return Some.


to me, currentSpanOrRaise (perhaps even OrThrow) as the sole option seems the most reasonable, because it makes it clear to the user that they MUST already be in a span, and that using it otherwise is a programming error. this is also the reason for my choice of IllegalStateException—you were only supposed to call this inside a span, and you made a programming mistake. and if you don't know if you're in a span, call Tracer#span or Tracer#spanBuilder instead

@iRevive
Copy link
Contributor

iRevive commented Oct 25, 2023

Valid points.

We have def currentSpanContext: F[Option[SpanContext]] in the Tracer. It returns Some when a) span exists; b) span is valid. And None otherwise.

Even though it's less handy in some cases, we can follow the same pattern with the current span:

trait Tracer[F[_]] {
  def currentSpan: F[Option[Span[F]]] // returns Some when span is valid and None otherwise
  def currentSpanOrRaise: F[Span[F]]  // always returns a valid span or throws an exception
}

Pros:

  • We clearly indicate the span may be missing from the signature standpoint
  • You can go yolo if you want

P.s. I would also make a dedicated exception, so the error handling would be easier:

abstract class SpanExtractionException(message: String) extends RuntimeException(message)
object SpanExtractException {
  // span exists but it's invalid
  final class InvalidSpan(...) extends SpanExtractionException(....)
  // span does not exist at all
  final class MissingSpan(...) extends SpanExtractionException(....)
}

WDYT?

@NthPortal
Copy link
Contributor

I'm not sure that we're on the same page, so let's take a step back for a moment.

What is the intended purpose of a currentSpan-type method? what are its intended usecases?

@iRevive
Copy link
Contributor

iRevive commented Oct 26, 2023

What is the intended purpose of a currentSpan-type method? what are its intended usecases?

At $work we have the following flow:

  • Application has several layers: HTTP handler, validation, business layer, db layer, etc
  • We start and end spans via Tracer[F].spanBuilder(...) inside of a http4s middleware
  • During the request processing we also create child spans (each layer creates at least a few)

An oversimplified example is:

HttpRoutes.of {
  case req @ POST ... ->  
    for {
      _ <- Tracer[F].span("validate").surround(validateRequest(req))
      _ <- Tracer[F].span("process").surround(process(req))
      _ <- Tracer[F].span("notify").surround(notify(req))
      ...
      response <- Tracer[F].span("buildResponse").surround(buildResponse(req))
    } yield response
}

What we do

We add tracing information to logs in some places (where we don't want or cannot pass Span[F] explicitly):

for {
  currentCtxOpt <- Tracer[F].currentSpanContext
  ctx = currentCtxOpt.map(ctx => Map("trace_id" -> ctx.traceId, "span_id" -> ctx.spanId)).getOrElse(Map.empty)
  _ <- logger.info("Doing some work", ctx)
} yield ()

What we would like to do

Add attributes and events to a span without passing it explicitly/implicitly to the methods somewhere in the deep of the call chain:

for {
  span <- Tracer[F].currentSpan
  _    <- span.foldMapM(_.addAttribute(Attribute("processing_status", status)))
} yield

@NthPortal
Copy link
Contributor

before I forget, I'd like to note another concern: currentSpan arguably should not expose end, as that seems to almost be a security issue (arbitrary nested callers can end your spans prematurely). should we have an EndableSpan type that only startUnmanaged returns where end is exposed? depending how firmly we wish to enforce things, we could just leave it castable, or we could make a box type for it

@NthPortal
Copy link
Contributor

my usecase is for when you know that you're in a span, but you don't want to pass a Span instance around (potentially into deeply nested calls) or add it as a parameter to functions passed to you.


@iRevive for your usecase, would you prefer def currentSpan: F[Option[Span[F]]] or def currentSpanOrNoop: F[Span[F]]? and if the former, would you prefer it return F[None] for noop spans?

@iRevive
Copy link
Contributor

iRevive commented Oct 26, 2023

before I forget, I'd like to note another concern: currentSpan arguably should not expose end, as that seems to almost be a security issue (arbitrary nested callers can end your spans prematurely). should we have an EndableSpan type that only startUnmanaged returns where end is exposed? depending how firmly we wish to enforce things, we could just leave it castable, or we could make a box type for it

Funny enough, it was in the initial sketch of the tracing module. https://github.com/typelevel/otel4s/pull/35/files#diff-681669863e1a9be2e01af9b3c6379d95d6d5a1d20ede638c702325a35b00fc93R116-R138

We gave up on this design eventually due to various complications.

@NthPortal
Copy link
Contributor

NthPortal commented Oct 26, 2023

We gave up on this design eventually due to various complications.

perhaps I shall revive it. I'll open a new ticket for this so I don't derail this discussion (that's already on a PR and not a ticket) any more with this (done: #347)

@iRevive
Copy link
Contributor

iRevive commented Oct 27, 2023

1. def currentSpan: F[Option[Span[F]]]

Pros: interface parity with def currentSpanContext: F[Option[SpanContext]]

Cons: you need to fold an Option to do any operation - Tracer[F].currentSpan.flatMap(spanOpt => spanOpt.foldMapM(_.addEvent("my-event")))


2. def currentSpanOrNoop: F[Span[F]]

Pros: easy to use - Tracer[F].currentSpanOrNoop.flatMap(span => span.addEvent("my-event"))


for your usecase, would you prefer def currentSpan: F[Option[Span[F]]] or def currentSpanOrNoop: F[Span[F]]?

I would prefer def currentSpanOrNoop: F[Span[F]].

would you prefer it return F[None] for noop spans?

There are three possible states, am I right?:

  1. Some(span) - span exists and is valid
  2. Some(span) - span exists and is invalid
  3. None - span does not exist in the context / noop spans

So it's ok to return None in this case.

@NthPortal
Copy link
Contributor

I think currentSpanOrNoop appears to serve your usecase perfectly and mine... hopefully well enough. we can start there and add additional methods later if we find they're needed. when in doubt, leave it out

@NthPortal NthPortal mentioned this pull request Oct 27, 2023
2 tasks
@NthPortal
Copy link
Contributor

closed due to staleness and in favour of #349

@NthPortal NthPortal closed this Nov 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants