Announcing OpenTelemetry mobile tracing (that actually works)

Today we are absolutely thrilled to announce OpenTelemetry (OTel) compatible distributed tracing for bitdrift Capture. We are quite confident that this is the first OTel mobile tracing solution to be released that actually works and will provide engineers with the traces they need to debug problems in the wild – without guessing, reproducing locally, or waiting on another release.

Why is mobile tracing very hard?

Let’s talk about why tracing from the mobile client is very difficult, especially at large scale. In any moderately large deployment, backend traces are sampled to control costs. Less sophisticated deployments use random head sampling (deciding at the beginning of the trace whether it is sampled or not). More sophisticated deployments use tail sampling, like the OpenTelemetry Tail Sampling Processor. In these deployments, trace data is held for a period of time while waiting for a signal on whether to flush the entire trace (for example, if one of the operations results in an error, the entire trace can be flushed). Even in the best case, the tail sampling buffer is generally tuned to hold trace data for a very short period of time (minutes at most) because the primary use cases involve backend network calls and their associated microservices. Compare this to the length of mobile sessions, which could be hours or even days!

So fundamentally, an engineer who wants to see a mobile session with full backend tracing has to make sure that all network calls in that session are force sampled. But how can this be done sanely? The naive approach is to simply head sample from the client on a strict % basis. This is what every other solution on the market does, which means you’re effectively rolling the dice on whether you’ll have the data you need when something breaks. However, bitdrift’s dynamic control allows us to fundamentally change how users think about this. bitdrift Capture can provide a tracing experience that is much more likely to capture the customer sessions, complete with traces, that engineers actually care about.

Mobile tracing in Capture

Today we are launching a few different things that together provide a novel, holistic mobile tracing experience:
  1. The workflow engine now supports a sampled matcher. This allows flows to only progress a certain % of the time. This feature is not specific to tracing, though it is especially useful in that context.
  2. The workflow engine now supports a “start tracing” action. This allows client side traces to be enabled based on workflow progression.
  3. The mobile SDK now supports an “Is tracing active?” API. The out of the box (OOTB) network integrations have been updated to respect that API: if tracing is active, they initiate a W3C (OTel) trace on every outbound network request with the sampled flag forced to true. The created trace ID is then stored in the network response log fields. If the session is ultimately flushed and captured, the session timeline UI can now deep link to the user’s trace provider of choice to show the full backend trace. For those who are not using our OOTB network integrations, the API can still be used to decide whether trace headers should be added or not.
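For readers wiring this up without the OOTB integrations, here is a minimal sketch of what "force sampled" means at the header level. It is written in Python purely for illustration and is not the Capture SDK: `is_tracing_active`, `make_traceparent`, and `trace_headers` are hypothetical names. The only part taken from a standard is the W3C `traceparent` layout (version, 32 hex character trace ID, 16 hex character parent ID, and flags, where `01` means sampled).

```python
import secrets
from typing import Callable, Dict


def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Build a W3C traceparent value: version-traceid-parentid-flags."""
    flags = "01" if sampled else "00"  # 01 = sampled flag set
    return f"00-{trace_id}-{span_id}-{flags}"


def trace_headers(is_tracing_active: Callable[[], bool]) -> Dict[str, str]:
    """Return headers to attach to an outbound request.

    `is_tracing_active` stands in for the SDK's "Is tracing active?" API
    (hypothetical name). When it returns False, no trace headers are added.
    """
    if not is_tracing_active():
        return {}
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex characters
    # Force the sampled flag so the backend keeps this trace.
    return {"traceparent": make_traceparent(trace_id, span_id, sampled=True)}


if __name__ == "__main__":
    print(trace_headers(lambda: True))
    print(trace_headers(lambda: False))
```

In a real client you would also record the generated trace ID alongside the response log so the session can later be linked to the backend trace.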
Together, these capabilities let you deterministically capture full, end-to-end traces for the sessions you care about, without permanently increasing cost or overhead. Now, let’s walk through some use cases to show why this is mobile tracing finally done right.

Mobile tracing use cases

Here are a few common aspirational tracing use cases that show how the new mobile tracing support provides actionable insights quickly.

On-demand tracing of a specific target population

Let’s say that you are rolling out a new feature and want to capture sessions with tracing specific to that feature. Using workflows in bitdrift Capture, you can match on feature exposure, then start tracing. After some period of time (via timeout) or when some action occurs (the feature is used), the session can be flushed. From the time tracing starts until the session is flushed, all network calls will be traced from the client. This guarantees that captured sessions have full backend traces to view. When enough sessions have been captured, the workflow can be dynamically stopped to avoid further tracing overhead. This gives you high-fidelity traces exactly when you need them, and zero waste when you don’t.
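To make the flow concrete, here is a small sketch of that workflow as a state machine. This is an illustration of the logic only, not how Capture workflows are defined or executed: the class, the event names (`feature_exposed`, `feature_used`), and the action strings are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List


class State(Enum):
    WAITING = auto()   # nothing matched yet
    TRACING = auto()   # feature exposure matched, tracing is on
    FLUSHED = auto()   # session has been flushed


@dataclass
class FeatureTraceWorkflow:
    timeout_s: float
    state: State = State.WAITING
    started_at: float = 0.0
    actions: List[str] = field(default_factory=list)

    def on_event(self, name: str, now: float) -> None:
        """Advance the workflow on a client event (hypothetical event names)."""
        if self.state is State.WAITING and name == "feature_exposed":
            self.state = State.TRACING
            self.started_at = now
            self.actions.append("start_tracing")
        elif self.state is State.TRACING and name == "feature_used":
            self._flush()

    def on_tick(self, now: float) -> None:
        """Flush on timeout if the feature was never used."""
        if self.state is State.TRACING and now - self.started_at >= self.timeout_s:
            self._flush()

    def _flush(self) -> None:
        self.state = State.FLUSHED
        self.actions.append("flush_session")


if __name__ == "__main__":
    wf = FeatureTraceWorkflow(timeout_s=60)
    wf.on_event("feature_exposed", now=10)
    wf.on_event("feature_used", now=25)
    print(wf.actions)
```

Every network call made while the workflow is in the `TRACING` state would carry trace headers, which is why a flushed session always has matching backend traces.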

Getting traces for a specific failure flow across a very large population

In another scenario, let’s say you have a massive mobile app with hundreds of millions of users that sells widgets. Given the massive deployment size, widget checkout is going to fail for one reason or another many times per day across the entire population. It would be extremely useful to get full end-to-end traces of a failed checkout experience. To do this, you can create a workflow that has the following steps:
  1. Match on app open with a sample rate of 0.001%, then start tracing.
  2. Wait until checkout fails.
  3. Flush the session.
Probabilistically, every day you are going to capture full sessions with complete end-to-end tracing that help you understand how the backend is involved in checkout failures.
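A quick back-of-the-envelope sketch shows why such a tiny sample rate still pays off at this scale. Everything here is an assumption made for illustration: the daily app-open count and checkout failure rate are made-up numbers, and hashing the session ID is just one plausible way to build a sampled matcher, not a description of how Capture implements it.

```python
import hashlib


def sampled(session_id: str, rate_percent: float) -> bool:
    """Hypothetical sampled matcher: hash the session ID into [0, 1)."""
    digest = hashlib.sha256(session_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate_percent / 100.0


def expected_captures(app_opens_per_day: float,
                      rate_percent: float,
                      checkout_failure_rate: float) -> float:
    """Expected number of traced sessions per day that hit a checkout failure."""
    traced_sessions = app_opens_per_day * (rate_percent / 100.0)
    return traced_sessions * checkout_failure_rate


if __name__ == "__main__":
    # Made-up inputs: 300M app opens per day, 0.001% sample rate,
    # and 0.5% of sessions ending in a failed checkout.
    print(expected_captures(300_000_000, 0.001, 0.005))
```

With those made-up inputs, roughly 3,000 sessions per day run with tracing enabled and about 15 of them end in a failed checkout and get flushed, while the rest of the population pays no tracing cost.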

Tracing done right

Gone are the days of performing head sampling on the mobile client and hoping to get lucky with matching backend traces. (Reality check: you never get lucky when you actually need it.) bitdrift Capture’s mobile tracing integration gives you full real-time control over which sessions get traced, allowing you to get the traces you need when you need them. You’ll move from hypothesis to an evidence-based understanding of what’s really happening in a single session. This is just the beginning of our tracing roadmap, and we plan on adding a number of other features in the future, including the ability to show server spans directly in our session trace view. We can’t wait to iterate based on your feedback!

Join us for the future of observability

Mobile tracing is available today for all customers. New and existing customers can contact us to learn more. Capture is changing the mobile observability game by adding a control plane and local storage on every mobile device, providing extremely detailed telemetry when you need it, and none when you don’t. If lack of mobile tracing was keeping you away, now is the time to give us a try! We can’t wait to see what you think.

Frequently asked questions

What is OpenTelemetry mobile tracing?

OpenTelemetry mobile tracing extends distributed tracing to the mobile client, allowing engineers to track requests from a user’s device through backend services. Unlike traditional backend-only tracing, it captures the full end-to-end flow of a user session, including client-side context and network interactions. If you’re new to OpenTelemetry, you can start with this overview: https://blog.bitdrift.io/post/what-is-opentelemetry

What problems does mobile tracing help solve?

Mobile tracing helps engineers debug issues that are difficult or impossible to reproduce locally, such as intermittent failures, long-tail bugs, and complex backend interactions triggered from the client. By providing full end-to-end visibility into real user sessions, it allows teams to move from guesswork to precise root cause analysis much faster.

Why is mobile tracing harder than backend tracing?

Mobile tracing is more complex because mobile sessions can last much longer (hours or days), data may arrive late or not at all, and cost constraints make continuous tracing impractical. Traditional sampling approaches used in backend systems (like head or tail sampling) don’t translate well to mobile environments and often miss the exact sessions engineers need to debug. For a deeper perspective on the limitations of OpenTelemetry in real-world systems, see: https://blog.bitdrift.dev/post/reality-check-otel

How does bitdrift’s OpenTelemetry tracing work differently?

bitdrift uses dynamic workflows to control when tracing is enabled on the client. Instead of randomly sampling sessions, engineers can deterministically start tracing based on specific conditions, such as feature usage or error states, and ensure all related network calls are fully traced. This approach provides complete, high-fidelity traces for the sessions that matter most, without requiring engineers to “get lucky” with sampling.

Does OpenTelemetry mobile tracing require constant sampling?

No. With bitdrift, tracing is activated only when needed using real-time workflows. This avoids the need for constant head-based sampling and reduces unnecessary overhead. Teams can capture full traces for targeted sessions and then stop tracing dynamically, balancing visibility with cost.
