Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make handler spans more accurate, re-add connection span #276

Merged
merged 6 commits into from
Oct 21, 2024

Conversation

jackyzha0
Copy link
Member

@jackyzha0 jackyzha0 commented Oct 18, 2024

Why

  • a lot of proc handler specific log messages didnt get picked up by the right tracer
  • move proc stream <> handler proxy messages into the handler span
  • move the invoked message into the client handler span
  • re-add connection telemetry
  • span linking between procs/handlers and their originating sessions
  • we use https://www.npmjs.com/package/@opentelemetry/context-async-hooks so context is tracked safely in an async context so we should leverage that

What changed

Versioning

  • Breaking protocol change
  • Breaking ts/js API change

@jackyzha0 jackyzha0 requested a review from a team as a code owner October 18, 2024 06:36
@jackyzha0 jackyzha0 requested review from Monkatraz, lhchavez and blast-hardcheese and removed request for a team October 18, 2024 06:36
@jackyzha0 jackyzha0 changed the title infer telemetry info from active span rather than threading make handler spans more accurate Oct 18, 2024
@jackyzha0 jackyzha0 changed the title make handler spans more accurate make handler spans more accurate, re-add connection span Oct 18, 2024
const createdSpan = handlerMock.mock.calls[0][0];
// @ts-expect-error: hacking to get parentSpanId
expect(createdSpan.parentSpanId).toBe(span.spanContext().spanId);
});
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this didn't actually test anything useful and we dont have an easy way to create a mock session to replicate this so i ended up just yoinking it, the integration test below tests more of the flow

Copy link

@blast-hardcheese blast-hardcheese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks sensible, one comment

if (metadata && !metadata.telemetry) {
const span = trace.getSpan(context.active());
if (span) {
metadata.telemetry = {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not expect an innocuous logging function to modify the metadata here. How are these metadata created? Can we just be more strict instead of trying to heal post-hoc? Removing optionality increases intelligibility

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this metadata are manually injected at logging callsites, theres a lot and we kept finding places where we didnt inject the right telemetry :(

i thought it would be easier to 'do the right thing' and follow the open telemetry suggestion of just always logging the active telemetry

this doesnt log anything if its not within an active span

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it also means that any logging inside a user-written handler also gets the right span associations

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still wouldn't expect it, is the concern that they're too expensive to preallocate or what?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no more so its hard to remember to inject the telemetry info at every place we care about it so i thought it made sense to do it implicitly because the implicit approach will always have the right telemetry

is the worry here that we are modifying the param or that its just doing patching? we do some cleaning here already below this (stripping the payload)

Copy link

@blast-hardcheese blast-hardcheese Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the underlying concern is the implicit complexity. In a logging method, we take an optional parameter with an optional member, then we might write that field before returning back.

A logging function definitely doesn't seem to be the right place to be doing any of that work.

Ultimately it's not the end of the world if we go this way, but I'd much rather solve this by making it easier to create and manage the MessageMetadata instead of offering important functionality for free invisibly just by calling log. If that's easy, then the span fields could even be marked required and defensive complexity can be delegated to where it belongs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of note, this is exactly how most third-party logging libraries operate: they add a global hook where they perform this same operation of massaging the metadata, only to avoid callers having to reason about adding metadata explicitly so that all logs get the tracing metadata.

@@ -263,6 +267,12 @@ class RiverServer<Services extends AnyServiceSchemaMap>
id: sessionId,
} = initialSession;

// dont use the session span here, we want to create a new span for the procedure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there still a way to link them (maybe starting an async span or using links)? or is it not desirable to begin with? (if it's the latter, maybe it's worth expanding the code comment explaining the rationale / consequences)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! i can add a comment but we add the link already in span itself, see createProcTelemetryInfo

if (metadata && !metadata.telemetry) {
const span = trace.getSpan(context.active());
if (span) {
metadata.telemetry = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of note, this is exactly how most third-party logging libraries operate: they add a global hook where they perform this same operation of massaging the metadata, only to avoid callers having to reason about adding metadata explicitly so that all logs get the tracing metadata.

@jackyzha0 jackyzha0 merged commit b32d9f7 into main Oct 21, 2024
4 checks passed
@jackyzha0 jackyzha0 deleted the jackyzha0/implicit-async-context-telem branch October 21, 2024 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants