
Iain Finlayson

The truth is out there (not in your backend)

If you’ve ever tried to debug a mobile issue with nothing but vague backend logs, you know the feeling—like Fox Mulder staring at redacted reports, convinced there’s more to the story. And there usually is. Backend telemetry might give you hints, but it’s often incomplete, sampled, or missing. The truth? It’s out there—on the devices themselves.

[Illustration: a woman with blue hair using a magnifying glass to examine floating puzzle-shaped data symbols above a keyboard.]
You can ingest all the metrics, logs, and traces you want into your backend, but when your app starts acting like it’s not of this world, backend telemetry often leaves you in the dark. Maybe you stripped out all your verbose development logs for fear of billing probes from above. Maybe your observability vendor quietly abducted the relevant session and sacrificed it on the altar of sampling and cost control. Whatever the cause, you’re stuck: unable to reproduce the issue, and weeks away from shipping an update. In the meantime, your customers are left wondering whether they’ve stumbled into a glitch or a Close Encounter.

That’s why we do things differently at bitdrift. We believe the truth lives on the device. So we encourage mobile devs to log everything, store it locally in the bitdrift ring buffer, and beam up the evidence only when needed using bitdrift’s powerful Workflows. No waiting for an update, no hoping the backend caught it, and no unexplained phenomena.

Today, I’d like to walk you through a real-world example of how this approach helped us solve a user issue that would otherwise have gone unidentified.

Here’s what happened. A customer set up a workflow in bitdrift to match any device that logged an event containing the word “Error” in the log body. Then they created an alert: trigger if more than 500 of those logs showed up within an hour. The result? The alert started firing like clockwork, because the real number of error logs wasn’t 500; it was over 5,000! Beyond the fact that they contained the word “Error”, the customer had no idea what the errors were or where to begin investigating.

But the issue wasn’t a lack of data. It was a matter of perspective. The customer was used to legacy observability tools, where curated dashboards and polished charts guide every decision, refined over years by dedicated observability teams. bitdrift works differently. You can still carefully curate dashboards in bitdrift, but when there’s no dashboard to drill into, no prebuilt charts, and no red strings connecting incidents to metrics, you don’t have to worry. The data’s already out there. You just have to get it.

Here’s how the investigation unfolded. The first step was to narrow the beam, which in bitdrift just meant tweaking the workflow. We asked bitdrift to chart the same error logs, grouped by application identifier, OS, and app version. The workflow was redeployed, and within minutes we had a fresh stream of live data populating our new charts (Figures 1 & 2). The pattern was instantly clear: this wasn’t a global phenomenon. The surge in errors was isolated to iOS devices and overwhelmingly concentrated in the latest release of one app. There was no digging through backend dashboards and no stale data, just a high-resolution picture of the problem, fresh from the field and unfolding in real time.

Next, we wanted to go deeper, to understand what these error logs were saying. When the updated workflow was redeployed, bitdrift immediately began capturing unsampled, live sessions from affected devices. We jumped into the Session Timeline and started reviewing a handful of them. At first, the default logs didn’t tell us much. They all looked the same: vague errors, no detail, no direction. But then we hit the custom log fields. That’s where the real story started to unfold.

Custom log fields like these often get stripped out before production to save space, or quietly dropped or sampled away by traditional observability platforms. But in bitdrift, we encourage you to log everything and store it locally, so when it’s time to investigate, you have a complete, unfiltered record of what happened. And in this case, those custom fields showed us exactly which services were being called when the errors occurred.
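To make that concrete, here is a minimal sketch of what this kind of instrumentation can look like on iOS. It is not the customer’s actual code: the module name, the Logger.logError(_:fields:) call shape, the _service and _status_code field names, and the trackServiceError helper are all assumptions for illustration, so check the Capture SDK docs for the exact API.

```swift
import Capture  // bitdrift Capture SDK (assumed module name)

/// Hypothetical helper: record a failed API call with enough structured
/// context that a workflow can later group and chart errors by service.
func trackServiceError(service: String, endpoint: String, statusCode: Int) {
    // Assumed call shape: an error-level log plus custom key/value fields.
    // The exact function name and signature may differ in your SDK version.
    Logger.logError(
        "Error: request to \(service) failed",
        fields: [
            "_service": service,                // e.g. "google_maps_routes"
            "_endpoint": endpoint,              // which call within that service
            "_status_code": String(statusCode)  // HTTP status, stored as a string field
        ]
    )
}

// Example: call this wherever the app handles API failures.
trackServiceError(service: "google_maps_routes",
                  endpoint: "computeRoutes",
                  statusCode: 503)
```

Because the log body still contains the word “Error”, the original workflow keeps matching it, while the extra fields give later workflows something far more precise to group and alert on.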
Now that we knew those 5,000+ errors an hour weren’t all coming from a single source, we made one more key modification to the workflow. This time, we asked bitdrift to chart the errors grouped by the custom field that captured the service being called. That’s when the real culprits finally stepped into the light.

There were three standouts (Figure 3). The first was a Google Maps API used to calculate routes between locations. The second was a call responsible for registering the device for push notifications, something that typically runs quietly in the background but can cause trouble if it fails repeatedly. The third was an internal service we’ll keep redacted to protect the customer’s identity (Mulder would understand).

Thanks to bitdrift’s ability to capture these logs on-device, in real time, and without sampling, the customer could finally isolate not just the existence of an issue, but also its origins. No backend dashboards. No waiting for a new build. With that clarity, the next steps were obvious: the iOS team could go straight to the codebase for the latest app version and zero in on the parts of the app responsible for calling those specific APIs. No wild goose chase, no waiting on repro steps, just a direct path from symptom to root cause.

With the root causes identified, we also revisited the customer’s alerting strategy. Instead of triggering an alarm whenever any log contained the word “Error”, which was like setting off sirens for every strange light in the sky, we helped them take a more focused approach. Alerts are now tied to specific API calls, with thresholds calibrated to how critical each one is to the user experience. This shift turned a flood of noise into actionable signals. More importantly, it gave the team the confidence to prioritize the right issues at the right time.
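For illustration only, here is one way to picture that calibration. In reality these thresholds live in bitdrift Workflows configured from the dashboard, not in app code, and every service name and number below is invented for the example.

```swift
// Conceptual sketch only: the real thresholds are configured in bitdrift
// Workflows, not hard-coded in the app.

// Hypothetical hourly error budget per service: how many errors per hour
// each API may produce before an alert should fire.
let hourlyErrorBudget: [String: Int] = [
    "google_maps_routes": 100,    // routing failures break a core user flow
    "push_registration": 1_000,   // background retries; alert only on sustained failure
    "internal_service_x": 250     // placeholder for the redacted internal service
]

/// Given an hour of error counts grouped by the service field, return the
/// services that blew their budget and deserve an alert.
func servicesOverBudget(_ errorCounts: [String: Int]) -> [String] {
    errorCounts
        .filter { service, count in count > hourlyErrorBudget[service, default: 500] }
        .map(\.key)
        .sorted()
}

// Example with made-up counts shaped like the grouped workflow charts.
print(servicesOverBudget([
    "google_maps_routes": 3_200,
    "push_registration": 900,
    "internal_service_x": 400
]))
// ["google_maps_routes", "internal_service_x"]
```

The point is simply that a noisy background call and a user-facing routing call should not share one blanket threshold.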
The truth was out there; it just wasn’t in the backend. It was on real devices, in real time, waiting to be uncovered. This experience was a reminder that modern mobile observability isn’t about collecting more data centrally. It’s about putting the right tools in place to retrieve exactly what you need, exactly when you need it, from the only place that knows what happened: the user’s device.

With bitdrift, the team stopped chasing ghosts and started closing cases. Mulder would be proud.