“Fine grain” tracing with Thundra

Emrah Şamdan
Aug 26, 2020

Thundra provides rich auto-tracing out of the box to instrument AWS SDKs, infrastructure, and Lambda. Engineers can enhance the traces to provide additional business context. This enhancement is referred to as “logical” or custom spans, which help end users debug and understand performance in the context of business operations. Logical tracing isn’t difficult, but it does require custom instrumentation.

In this article, you’ll learn how logical spans are instrumented to provide additional business and operation context, when and how to perform the instrumentation, and how fine-grain trace spans are different from span tags. Finally, we’ll see a demonstration of instrumenting an application and observing it using fine-grain traces.

Logical Spans

Logical spans record notable business events, often performed by businesses for or on behalf of users. Common business events include:

  • User logged in
  • User changed settings
  • Purchase
  • Cart action

Business events are the touchpoints between a business and its customers. A logical span (or event) is the representation of one of these events in a distributed trace. Logical tracing is the act of instrumenting code to emit logical spans.

Figure 1. Traces without and with logical spans

It isn’t possible for tooling to automatically know which events are important to any given business. This means that logical spans require manual instrumentation (covered later). Logical spans comprise a group of concrete actions. These actions are “how” business value is accomplished, and logical spans represent the “what”: the actions the company is executing on behalf of or for the customer.

Why Logical Spans?

Logical spans help in both debugging and in understanding the client experience. They connect the business with the client and the “how” with the “what.” Infrastructure, serverless, programming languages, and database technologies are implementation details, and most clients don’t care which technologies are used as long as the product is up and the technologies don’t adversely affect pricing. Clients only care about getting value from your product, and logical spans help companies observe how the implementation details are functioning to deliver that value. Here are some ways that logical spans help businesses understand their clients’ experience and ensure that it is a positive one.

Understanding Common User Actions

Logical spans help to document current user actions. Engineers translate the original product specifications, the “what,” into the “how.” During this translation, the original “what” is often lost.

Consider a company that provides authentication for its users; the feature that “logs users in” may have an implementation like this:

  1. Validate input.
  2. Check the session store for session.
  3. Authenticate username/password.
  4. Save session.
  5. Set cookie.
  6. Return to the user.

Logical spans are explicit instrumentation named after the client event, in this case, login. Logical spans group these “how” events together, name them, document them for a cohesive action on behalf of the user, and measure them.
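As a rough illustration of that grouping, the sketch below wraps the login steps in a single span named after the client event. The `span` helper, the `recorded` list, and the placeholder credential check are all assumptions made for this example; they stand in for a tracing client and are not Thundra's API.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory recorder (not Thundra's API). Each entry is
# (span name, parent span name or None, duration in seconds).
recorded = []
_stack = []


@contextmanager
def span(name):
    parent = _stack[-1] if _stack else None
    _stack.append(name)
    start = time.perf_counter()
    try:
        yield
    finally:
        _stack.pop()
        recorded.append((name, parent, time.perf_counter() - start))


def login(username, password):
    # The logical span is named after the client event (the "what") ...
    with span("login"):
        # ... and the concrete steps (the "how") nest underneath it.
        with span("validate_input"):
            if not username or not password:
                raise ValueError("missing credentials")
        with span("check_session_store"):
            session = None  # placeholder: pretend no session exists yet
        with span("authenticate"):
            authenticated = password == "correct horse"  # placeholder check
        with span("save_session"):
            session = {"user": username} if authenticated else None
        with span("set_cookie"):
            cookie = "session=abc123" if session else None  # placeholder value
        return cookie  # return to the user


cookie = login("ada", "correct horse")
children = [name for name, parent, _ in recorded if parent == "login"]
```

Every concrete step reports "login" as its parent, and the "login" span itself carries the total duration of the user action.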

Aiding in Debugging

Capturing user events as logical spans aids in debugging. Logical spans can carry metadata in the form of trace tags and error results. Instead of showing only that a Redis GET failed, the trace displays the failed business action: “login error.” By tagging logical spans with errors, an engineer can first check whether the user action is working (does the span have an error, yes or no?) and, if it is not, begin debugging by drilling into the implementation details.
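A minimal sketch of this error tagging follows, assuming a hypothetical `span` context manager and tag names such as `error` and `error.message` (these are illustrative choices, not Thundra's API). A failure in the concrete Redis call is recorded on that span and also propagates up to the logical "login" span.

```python
from contextlib import contextmanager


class Span:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.tags = {}


finished = []  # spans in the order they finish
_active = []   # stack of currently open spans


@contextmanager
def span(name):
    current = Span(name, _active[-1] if _active else None)
    _active.append(current)
    try:
        yield current
    except Exception as exc:
        # Tag the span with the error, then let the exception keep propagating
        current.tags["error"] = True
        current.tags["error.message"] = str(exc)
        raise
    finally:
        _active.pop()
        finished.append(current)


def login(username):
    with span("login") as logical:
        logical.tags["user.name"] = username
        with span("redis GET"):
            # Simulated failure of the concrete operation
            raise ConnectionError("session store unreachable")


try:
    login("ada")
except ConnectionError:
    pass

by_name = {s.name: s for s in finished}
login_failed = by_name["login"].tags.get("error", False)
```

An engineer can check `login_failed` first to see whether the user action is affected, then follow the child span's `error.message` tag to the implementation detail that failed.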

Logical spans aid in debugging by adding a level of abstraction. They’re a lot like a car dashboard, which shows a handful of lights, including the “check engine” light and the oil light.

They don’t, however, display every metric that every subsystem generates: temperature, fuel-to-air ratios, etc. Similarly, logical spans operate as top-level “lights” to indicate high-level customer-impacting issues. Once an issue has been surfaced using logical spans, engineers can dive into low-level implementation details to debug it, much like a mechanic debugging low-level car issues.

Traces that contain logical spans make it easy for engineers to see which end-user actions are impacted.

Giving Engineers Context

Logical spans group specific implementation spans, helping to provide context and understanding of the specific actions required to support customers. When implementation details are grouped under a logical identifier, it’s easier to understand the purpose of the individual actions. Backend engineers may not be very familiar with the product or the end user, but this creates a bridge between them and the end user.

Logical traces also provide context to built-in Thundra features, such as unique traces. Logical spans make it easier to connect traces with the client action they are orchestrating. This context is especially useful for adding meaningful descriptions to EventBridge events. Thundra can ingest EventBridge events from any AWS-supported source.

How Do We Implement Logical Spans?

Logical spans are relatively simple to implement:

  • Identify — List the important user events.
  • Instrument — Modify the code to capture the logical span using a client library.
  • Verify — Ensure the instrumentation works as expected.

Identify

The purpose of the following Lambda, named “Scraper,” is to update a user’s profile with information from social media. Figure 2 lists the code for the Scraper Lambda.

Figure 2. Scraper Lambda code

The Lambda has five distinct operations:

  • Get the user’s profile.
  • Make a request to Twitter.
  • Save the results.
  • Make a request to Facebook.
  • Save the results.

The Lambda’s purpose is to “update the user profile,” and the trace for this Lambda using Thundra’s auto-instrumentation looks like this:

Figure 3. Scraper Lambda trace

The action the company is providing for the customer is to update the user profile, which we’ll call “update_user_profile.”

Instrument

The next step is to instrument the logical trace using Thundra’s instrumentation API. This requires you to create a new span, which is the logical span. To instrument a logical span, you must set a parent span that groups each of the concrete operations. This allows the operations to be nested under the logical span.

Figure 4. Instrumenting a logical span
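Since the figure is an image, here is a text sketch of the same idea: a parent span named "update_user_profile" that groups the five concrete operations. The `Tracer` class, the helper functions, and the span names are hypothetical stand-ins written for this example; consult Thundra's client library documentation for the actual instrumentation calls.

```python
from contextlib import contextmanager


class Span:
    def __init__(self, name):
        self.name = name
        self.children = []


class Tracer:
    """Hypothetical stand-in for a tracing client; not Thundra's API."""

    def __init__(self):
        self.root_spans = []
        self._stack = []

    @contextmanager
    def start_span(self, name):
        span = Span(name)
        # Attach to the currently open span, or record as a root span
        siblings = self._stack[-1].children if self._stack else self.root_spans
        siblings.append(span)
        self._stack.append(span)
        try:
            yield span
        finally:
            self._stack.pop()


tracer = Tracer()


# Placeholder operations standing in for the auto-instrumented calls
def get_profile(user_id):
    with tracer.start_span("get_profile"):
        return {"id": user_id}


def fetch(source):
    with tracer.start_span(f"request_{source}"):
        return {source: "placeholder data"}


def save(profile, data):
    with tracer.start_span("save_results"):
        profile.update(data)


def handler(event, context=None):
    # The logical span is the parent; each concrete operation nests under it
    with tracer.start_span("update_user_profile"):
        profile = get_profile(event["user_id"])
        for source in ("twitter", "facebook"):
            save(profile, fetch(source))
        return profile


result = handler({"user_id": 42})
logical = tracer.root_spans[0]
child_names = [child.name for child in logical.children]
```

Because the concrete operations open their spans while the logical span is active, they are nested under it without any change to their own code.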

Thundra also supports offline debugging, which can help ensure that instrumentation works and is error-free.

Verify

The final step is to deploy the newly instrumented code and to verify that the logical span shows up as expected.

Figure 5. Verify logical span
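Verification can also be scripted. The sketch below assumes spans have been exported as flat records with `id`, `parent_id`, and `name` fields; that record shape and the span names are assumptions for illustration, not a documented Thundra export format. It checks that every expected concrete operation sits under the logical span.

```python
def children_of(spans, logical_name):
    """Return the names of spans whose parent is the span named logical_name."""
    logical_ids = {s["id"] for s in spans if s["name"] == logical_name}
    return [s["name"] for s in spans if s["parent_id"] in logical_ids]


def missing_children(spans, logical_name, expected):
    """Return the expected operation names that are not nested under the logical span."""
    found = children_of(spans, logical_name)
    return [name for name in expected if name not in found]


# Hypothetical exported trace for the Scraper Lambda
exported = [
    {"id": 1, "parent_id": None, "name": "update_user_profile"},
    {"id": 2, "parent_id": 1, "name": "get_profile"},
    {"id": 3, "parent_id": 1, "name": "request_twitter"},
    {"id": 4, "parent_id": 1, "name": "save_results"},
    {"id": 5, "parent_id": 1, "name": "request_facebook"},
    {"id": 6, "parent_id": 1, "name": "save_results"},
]

expected = ["get_profile", "request_twitter", "request_facebook", "save_results"]
missing = missing_children(exported, "update_user_profile", expected)

# A trace where one operation escaped the logical span
broken = [dict(s, parent_id=None) if s["id"] == 5 else s for s in exported]
missing_in_broken = missing_children(broken, "update_user_profile", expected)
```

An empty `missing` list means the logical span groups everything it should; a non-empty list names the operations that were recorded outside of it.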

The code now surfaces the logical span, which represents the client action being executed.

Alternative: Decorating Function Calls

Another viable option is instrumenting function calls. Sometimes custom instrumentation isn’t possible because of time constraints. Thundra offers serverless integrations that auto-instrument Lambda function calls: it can instrument Lambdas through AWS Lambda layers without any code changes, so you can toggle on function instrumentation from the Lambda UI.
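For teams that can touch the code but want minimal changes, a decorator is one generic way to trace a function call. The `traced` decorator and the `calls` list below are hypothetical and written only for this sketch; they are not part of Thundra's libraries or its layer-based auto-instrumentation.

```python
import functools
import time

# Hypothetical in-memory sink: (span name, duration in seconds, failed flag)
calls = []


def traced(name=None):
    """Record one span per call of the decorated function."""
    def decorator(func):
        span_name = name or func.__name__

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            failed = False
            try:
                return func(*args, **kwargs)
            except Exception:
                failed = True
                raise
            finally:
                calls.append((span_name, time.perf_counter() - start, failed))

        return wrapper

    return decorator


# Name the span after the business action rather than the function
@traced("update_user_profile")
def scrape(user_id):
    return {"id": user_id, "twitter": "placeholder data"}


profile = scrape(7)
```

The function body stays untouched, and the span name passed to the decorator can still carry the business-level meaning.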

Alternative: Using Tags

Instead of creating a logical span, you can tag concrete spans with important user attributes such as the user action, user ID, and other user-focused metadata. This is a viable option, but an explicit logical span has one benefit that tags lack: the concrete spans are grouped underneath it, which makes it obvious which concrete spans are involved in orchestrating a customer action.
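To make the trade-off concrete, the sketch below represents concrete spans as plain dictionaries carrying user-focused tags. The tag keys (`user.action`, `user.id`) and span names are assumptions for illustration. With tags alone, the grouping has to be reconstructed by querying on the tag after the fact.

```python
from collections import defaultdict

# Hypothetical concrete spans, each tagged with user-focused metadata
spans = [
    {"name": "get_profile", "tags": {"user.action": "update_user_profile", "user.id": "42"}},
    {"name": "request_twitter", "tags": {"user.action": "update_user_profile", "user.id": "42"}},
    {"name": "save_results", "tags": {"user.action": "update_user_profile", "user.id": "42"}},
    {"name": "redis GET", "tags": {"user.action": "login", "user.id": "42"}},
    {"name": "health_check", "tags": {}},
]


def group_by_action(spans):
    """Rebuild the per-action grouping from the user.action tag."""
    groups = defaultdict(list)
    for s in spans:
        groups[s["tags"].get("user.action", "untagged")].append(s["name"])
    return dict(groups)


groups = group_by_action(spans)
```

The tagged spans remain siblings in the trace, so the relationship between them only becomes visible once something like `group_by_action` is applied.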

Alternative: Logically Scoped Lambda

A logically scoped Lambda has a name that represents the logical operation. In the example above, the Lambda was named “Scraper,” but its purpose was to update a user’s profile. Renaming the Lambda to “Update User Profile” describes it better, and the default trace generated by Thundra’s integration then carries the business-level name.

Whenever a Lambda performs a single user action, naming the Lambda to represent that action is an easy win and doesn’t require that you add another logical span. If a Lambda doesn’t map one-to-one with a user action or it performs multiple user actions, you should use manually instrumented logical spans.

Logical Spans: Monitoring Business Context

Logical spans are tools to help connect implementation details (the how) with the customer experience (the what). Logical spans help with debugging by indicating which client actions are impacted and the steps required for orchestrating those actions. Logical spans are easy to add and can provide engineers with important context while aiding in debugging. Thundra provides open source client libraries which make instrumenting logical spans and understanding your client experience easy!

Originally published at https://blog.thundra.io.
