June 26, 2024

Our Journey Implementing Session Replay in Android for Jetpack Compose

Keeping pace with bleeding edge Android libraries

Session Replay is a Capture feature that allows users to see a representation of an app’s screen using a highly efficient, privacy-conscious mechanism. A proprietary binary format is used to encode a lightweight representation of the UI, which our frontend then renders as wireframes.

In this post we are going to recount our difficult but rewarding journey making this feature work well on Android when using Jetpack Compose. Buckle up!

For those unfamiliar, bitdrift is a different take on observability: on-device intelligence. Instead of sending loads of expensive telemetry data and hoping for insights later, we couple local storage with a real-time control plane to dynamically fetch only the data you need. One of our most popular features is Session Replay, where we rebuild wireframes of the UI screens that your users actually experienced at the time of a bug or any other interesting event. Here is a quick video of what Session Replay looks like.

Our users have been using Session Replay for over a year now, and we have iterated heavily since then to adapt to the evolving limitations of the underlying Android APIs and to meet our customers’ needs. Let’s walk through the history of our implementation and the hurdles we’ve overcome along the way!

The Challenge

Last year, we set out to create an implementation of Session Replay that worked with both traditional Android Views and screens built with Jetpack Compose. For Android Views, the process was straightforward: traverse the view tree using recursive calls to ViewGroup.getChildAt(index) and extract layout information via Drawable.getBounds() and similar methods. Supporting Jetpack Compose, which was still in its early days, proved far more complex.
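To make the View-side traversal concrete, here is a minimal sketch in plain Kotlin. UiNode, UiGroup, Bounds, FlatNode and flatten are hypothetical stand-ins (not Android or bitdrift APIs) that mimic ViewGroup's childCount and getChildAt(index), so the snippet runs without the Android SDK.

```kotlin
// Hypothetical stand-ins for android.view.View / ViewGroup so the sketch
// runs on a plain JVM; a real implementation would walk actual Views.
data class Bounds(val left: Int, val top: Int, val right: Int, val bottom: Int)

open class UiNode(val name: String, val bounds: Bounds)

class UiGroup(
    name: String,
    bounds: Bounds,
    private val children: List<UiNode>
) : UiNode(name, bounds) {
    val childCount: Int get() = children.size
    fun getChildAt(index: Int): UiNode = children[index]
}

// One flattened entry per visited node.
data class FlatNode(val depth: Int, val name: String, val bounds: Bounds)

// Depth-first, pre-order walk: record the node, then recurse into each child.
fun flatten(
    node: UiNode,
    depth: Int = 0,
    out: MutableList<FlatNode> = mutableListOf()
): List<FlatNode> {
    out += FlatNode(depth, node.name, node.bounds)
    if (node is UiGroup) {
        for (i in 0 until node.childCount) {
            flatten(node.getChildAt(i), depth + 1, out)
        }
    }
    return out
}
```

Calling flatten(root) on a small tree returns the nodes in the order a wireframe renderer would draw them, parents before children.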

Exploration and Initial Implementation

Jetpack Compose didn’t expose any clear APIs for traversing Compose View sub-trees, so the first thing we wondered was: how does Android Studio do it? After all, its Layout Inspector renders Compose Views. We started by diving into the code. The two main sources of inspiration were the Compose UI Tooling APIs used by Android Studio’s Layout Inspector and the Radiography library by Square, which had some experimental support for rendering Compose Views. The first clue was in the comments on the sourceInfo property of CompositionGroup:


kotlin
/**
 * [CompositionGroup] is a group of data slots tracked independently by composition. These groups
 * correspond to flow control branches (such as if statements and function calls) as well as
 * emitting of a node to the tree.
 *
 * This interface is not intended to be used directly and is provided to allow the tools API to have
 * access to data tracked during composition. The tools API should be used instead which provides a
 * more usable interpretation of the slot table.
 */
interface CompositionGroup : CompositionData {
    /**
     * Information recorded by the compiler to help tooling identify the source that generated the
     * group. The format of this string is internal and is interpreted by the tools API which
     * translates this information into source file name and offsets.
     */
    val sourceInfo: String?
}

The comments point to the tools API for a friendlier representation of the composition information. After more source code reading, we found the promised API in the form of compositionData.asTree(). This was indeed the secret sauce used not only by Android Studio’s Layout Inspector but also by the Radiography library:


kotlin
// Composer and its slot table are finally public API again.
// asTree is provided by the Compose Tooling library. It "reads" the slot table and parses it
// into a tree of Group objects. This means we're technically traversing the composable tree
// twice, so why not just read the slot table directly? As opaque as the Group API is, the actual
// slot table API is quite complicated, and the actual format of the slot table (effectively an
// array that stores a flattened version of a composition tree) is super low level. It's likely to
// change a lot between compose versions, and keeping up with that with every two-week dev release
// would be a lot of work. Additionally, a lot of the objects stored in the slot table are not
// public (eg LayoutNode), so we'd need to use even more (brittle) reflection to do that parsing.
// That said, once Compose is more stable, it might be worth it to read the slot table directly,
// since then we could drop the requirement for the Tooling library to be on the classpath.
@OptIn(UiToolingDataApi::class)
val rootGroup = composer.compositionData.asTree()

This API exposed enough information to understand how Compose View trees were laid out: the Group class gave us the name and box of each node, as well as a collection of all its children. Getting to this CompositionData was not easy, though. It required heavy use of reflection, which went like this: View.mKeyedTags → WrappedComposition.original → CompositionImpl.composer → Composer.compositionData.

This worked well during our preliminary tests. We tested rigorously with the Jetpack Compose sample app JetNews, which is built entirely with Compose Views, as well as with real-world apps such as those from one of our earliest partners, Lyft, who use a hybrid approach for their Rider and Driver apps. Our Session Replay screen representations were being rendered correctly.
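The shape of that reflection chain can be sketched in plain Kotlin. The Fake* classes below are hypothetical stand-ins for the private Compose internals, and readPrivateField and followChain are illustrative helper names rather than anything from our SDK; only the field names mirror the chain described above.

```kotlin
// Hypothetical stand-ins for the private classes in the chain.
class FakeComposer { private val compositionData: String = "slot-table" }
class FakeComposition { private val composer = FakeComposer() }
class FakeWrappedComposition { private val original = FakeComposition() }

// Reads a private field by name, walking up the superclasses. Returns null when
// the field is missing, e.g. after a library upgrade or a rename by R8.
fun readPrivateField(target: Any, fieldName: String): Any? {
    var type: Class<*>? = target.javaClass
    while (type != null) {
        val field = type.declaredFields.firstOrNull { it.name == fieldName }
        if (field != null) {
            field.isAccessible = true
            return field.get(target)
        }
        type = type.superclass
    }
    return null
}

// Follows a chain of private fields, stopping at the first missing link.
fun followChain(root: Any, vararg fieldNames: String): Any? {
    var current: Any? = root
    for (name in fieldNames) {
        current = readPrivateField(current ?: return null, name)
    }
    return current
}
```

Every link in such a chain is a place where a renamed or removed field silently yields null, which is why this style of access is brittle.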

But, is it production ready?

Unfortunately, this implementation had some issues. It relied on brittle reflection to access the data, and it used the compositionData.asTree() method, which was very expensive: in our initial tests it took ~800ms to compute all >65 nested views inside the JetNews app. Since this logic was intended to run repeatedly on the main thread, we marked the feature as experimental and disabled it by default while we conducted more tests and research. We had no performant way to traverse Jetpack Compose views until the Compose Tooling team introduced the CompositionData.mapTree() extension method in version 1.3.0-alpha02:


Version 1.3.0-alpha02
July 27, 2022

API Changes
Add mapTree to SlotTree.kt. This allows tools to inspect the SlotTree without making an in memory copy first like asTree does.
For the Layout Inspector this gives a performance improvement of about a factor 10. (I5e113)

This looked promising. However, it was worth investigating whether Radiography hadn't yet adopted it because of a known limitation or simply because the method was relatively new at the time.

Tapping into the Android Community

We decided to reach out to the community to look for confirmation. We went and asked the creators of the Radiography library directly and they confirmed our suspicion that the optimized mapTree() method just didn't exist at the time. After reworking our Session Replay implementation using mapTree(), we conducted extensive tests using the JetNews app. This validated performance gains of an order of magnitude! Even though it wasn't perfect, we deemed it good enough to start enabling this feature for our production customers and carefully roll it out via A/B tests.
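The difference the release note describes, copying the tree first versus mapping it in place, can be illustrated in plain Kotlin. Slot, SlotCopy, copyTree and mapSlots are hypothetical stand-ins and not the Compose tooling API; the sketch only shows why a factory-based mapping avoids allocating an intermediate copy of the whole tree.

```kotlin
// Hypothetical stand-in for a group in the slot table.
class Slot(val name: String, val children: List<Slot> = emptyList())

// asTree-style: first materialize a full intermediate copy of the tree...
class SlotCopy(val name: String, val children: List<SlotCopy>)

fun copyTree(slot: Slot): SlotCopy =
    SlotCopy(slot.name, slot.children.map { copyTree(it) })

// ...and only then convert that copy into the caller's own representation.
fun namesViaCopy(root: Slot): List<String> {
    fun walk(copy: SlotCopy): List<String> =
        listOf(copy.name) + copy.children.flatMap { walk(it) }
    return walk(copyTree(root))
}

// mapTree-style: the caller's factory runs once per group, bottom-up, and builds
// the caller's representation directly, with no intermediate tree.
fun <T> mapSlots(slot: Slot, factory: (Slot, List<T>) -> T): T =
    factory(slot, slot.children.map { mapSlots(it, factory) })

fun namesViaMap(root: Slot): List<String> =
    mapSlots<List<String>>(root) { slot, children -> listOf(slot.name) + children.flatten() }
```

Both functions return the same result; the second one skips allocating one SlotCopy object per group.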

Navigating Breaking Changes

Just when we thought we were done, we hit another snag. A customer reported that their Compose views weren’t being rendered at all in release builds of their app, even when targeting v1.0.0 of the Compose libraries.

Breaking Change 1.0.0: If code is missing, it’s always ProGuard

In Android app development, it's standard practice to prepare your app for release by shrinking, obfuscating, and optimizing its code. This is done through the Android Gradle Plugin, which uses the R8 or ProGuard compiler under the hood. These optimizations can introduce runtime exceptions because code can be removed or classes renamed during the process. This is especially true when working with reflection and native code (a core part of our library is written in Rust). Since optimization typically only occurs in release builds, this became our initial area of investigation. After getting a repro in our local fork of the JetNews app, we used ProGuard’s debugging flags to get more info:


# inside proguard-rules.pro configuration file:

# output a full report of all the rules that R8 applies when building your project
-printconfiguration proguard-release-config.txt
# output a report of all the code that R8 removed from your app
-printusage proguard-release-usage.txt
# output a report of the entry points that R8 determines from your project’s keep rules
-printseeds proguard-release-seeds.txt

As we examined the diagnostic output, a specific section of the proguard-release-config.txt file drew our attention:


# The proguard configuration file for the following section is
# /Users/xx/.gradle/caches/transforms-4/1470916d09cfef928d28226165655495/transformed/runtime-release/proguard.txt
-assumenosideeffects public class androidx.compose.runtime.ComposerKt {
    void sourceInformation(androidx.compose.runtime.Composer,java.lang.String);
    void sourceInformationMarkerStart(androidx.compose.runtime.Composer,int,java.lang.String);
    void sourceInformationMarkerEnd(androidx.compose.runtime.Composer);
}

After reading more about the -assumenosideeffects flag in the ProGuard documentation, it became clear that this was very likely the culprit:


Specifies methods that don't have any side effects, other than possibly returning a value. [..]
In the optimization step, ProGuard can then remove calls to such methods, if it can determine that the return values aren't used.
ProGuard will analyze your program code to find such methods automatically.
It will not analyze library code, for which this option can therefore be useful.

Initially, we noticed that this rule was not always honored even when it was present. We learned that it only applies when optimization is enabled. One way to confirm that this is what’s triggering the behavior is to look for the proguard-android-optimize.txt file in your Gradle configuration:


kotlin
android {
  buildTypes {
    release {
      isMinifyEnabled = true
      proguardFiles(getDefaultProguardFile("proguard-android-optimize.txt"), "proguard-rules.pro")
    }
  }
}

The proguard-android-optimize.txt file explicitly states:


# Optimizations: If you don't want to optimize, use the proguard-android.txt configuration file
# instead of this one, which turns off the optimization flags.

After searching far and wide for the breaking change, we discovered this old code change by Google in version 1.0.0-beta07. This change intentionally removed code that would allow access to sourceInfo in release builds when bringing in the androidx.compose.runtime:runtime library (which is usually a transitive dependency of the compose-ui library).


Version 1.0.0-beta07
May 18, 2021

API Changes
Added new compose compiler APIs that allow the source information
generated by the compiler to be removed during source minification. (Ia34e6)

To address this issue for apps that do optimize their builds, we found that starting with Android Gradle Plugin v7.3 there is an option to ignore ProGuard rules brought in by third-party libraries: ignoreExternalDependencies (later renamed to ignoreFrom in AGP v8.4):


kotlin
buildTypes {
//...
  release {
    isMinifyEnabled = true
    optimization {
      keepRules {
        // Make sure to include the '-android' suffix, which the Gradle plugin adds
        // automatically because of Kotlin Multiplatform support.
        ignoreFrom("androidx.compose.runtime:runtime-android")
      }
    }
    //...
  }
}

After applying this configuration, we confirmed via the -printconfiguration flag above that the -assumenosideeffects rule was indeed absent and that the feature was working again. We started recommending that our users add this option to their own Gradle configuration before releasing their apps.
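Checking the -printconfiguration output by eye gets tedious, so here is a minimal sketch of how that check could be scripted. findAssumeNoSideEffects is a hypothetical helper, and it assumes the rule layout shown in the excerpt above: one -assumenosideeffects line that opens a block, followed by one method signature per line.

```kotlin
// Returns the class names targeted by -assumenosideeffects rules whose body
// mentions the given method, given the text written by -printconfiguration.
// Assumes the block layout shown in the excerpt above.
fun findAssumeNoSideEffects(config: String, methodName: String): List<String> {
    val hits = mutableListOf<String>()
    var currentClass: String? = null // class of the rule block we are inside, if any
    for (raw in config.lines()) {
        val line = raw.trim()
        val cls = currentClass
        when {
            line.startsWith("-assumenosideeffects") ->
                currentClass = line.substringAfter("class ").substringBefore("{").trim()
            line.startsWith("}") -> currentClass = null
            cls != null && line.contains("$methodName(") ->
                if (cls !in hits) hits += cls
        }
    }
    return hits
}
```

An empty result for sourceInformation after adding the keep-rule override suggests the library's rule is no longer being applied.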

Breaking Change 1.5.4: It’s not always ProGuard

Even though this fix worked well in our local tests, it didn’t solve the customer’s issue. Comprehensive testing revealed that our feature was breaking for other combinations of library versions and build configurations, starting with v1.5.4. We then discovered that version 1.5.4 of the Compose compiler added a compiler flag that controls whether sourceInfo is included:


Version 1.5.4
November 7, 2023

New Features
Add flag to enable/disable source/trace information. (4d45f09)

Further documentation on how to use this flag was sparse, to say the least! We eventually discovered the flag’s name in a bug report by Jake Wharton. By examining the Layout Inspector source code, we figured out that the flag is passed to the compiler from Gradle using the freeCompilerArgs option. With this newfound knowledge, we were again unblocked.


kotlin
android {
  kotlinOptions {
    freeCompilerArgs += listOf("-P", "plugin:androidx.compose.compiler.plugins.kotlin:sourceInformation=true")
  }
}

To adapt to this new change, we now had to recommend that users of our library add this compiler flag as well, which wasn’t ideal.

Breaking Change 1.6.0: Time to throw in the towel?

Despite these configurations, our test matrix indicated that our feature was still broken for Compose versions 1.6.0 and above. It turned out there had been another code change in the Compose runtime that further limited the collection of source information:


Move source information to a side table

Prior to this change, source information collected for tooling
was as a slot in the group associated with a call. This had two
down-sides, 1) it grew the size of the slot table unnecessarily
and 2) it introduced groups just to record the source information
for functions that otherwise would not need a group.

With this change, the collection of source information is
1) now disabled by default and only enabled when the tooling
API requests it, and 2) moved to a side table instead of being
recorded in the main table.

As source information collection was on by default and is now
off, the size of the slot table for builds that do not use R8,
ProGuard or similar tools are smaller.

Test: ./gradlew :composer:r:r:tDUT
Fixes: 254480106

Change-Id: Ic8b75b1dab148d7cb0313131bd5df5b8a209cc6e

The change meant that source information would now only be populated when explicitly calling currentComposer.collectParameterInformation(). This method had to be called from within a composition, as the Layout Editor did. Adapting to this change would require a significant overhaul of our implementation and force our users to add extra code to their integration so that our library could hook into their compositions.

At this point, we started wondering whether pursuing this approach further was worth it. One of the core principles of our library is that it should stay out of the way and “just work”, but we were now asking users to add more and more setup boilerplate. Google had also been trying to tell us that these APIs were neither stable nor performant.

Finding Another Way

Before deciding to look for an alternative approach, we checked back with Radiography and found that they were also grappling with the constant breaking changes. They had to update their library after Compose releases 1.4.0 and 1.5.1, and were still dealing with the 1.6.0 changes at the time of this writing.

We went back to the community for guidance. This time, Google engineers on the Compose team reminded us of the Compose Semantics APIs, which we hadn’t looked into in a while. These APIs are currently used by Android’s accessibility tools (e.g. TalkBack) as well as its test tools, which made them very promising. To figure out how to get hold of this information in code, we checked with the Android community before diving back into the source, and a message from Jake Wharton gave us the clue that led us in the right direction:

There's the accessibility tree as well, depending on whether that meets your requirements or not [...] It's the tree through which things like UI testing interaction and accessibility interaction is exposed. You can get the tree from a semantics owner [...] from there it's a tree of nodes with key/value pairs and a bunch of typesafe properties (id, position, etc.), do with it what you will

Thanks to this lead, we learned that every AndroidComposeView exposes two versions of its entire subtree, unmergedRootSemanticsNode and rootSemanticsNode, via its SemanticsOwner. This meant we could recursively traverse and encode this subtree in a single pass.


kotlin
// This is needed in order to be able to access internal androidx.compose.ui properties and classes
@Suppress("INVISIBLE_MEMBER", "INVISIBLE_REFERENCE")
fun parse(view: View) {
  // or use rootSemanticsNode if you want the merged tree
  val composeTreeRoot = (view as? AndroidComposeView)?.semanticsOwner?.unmergedRootSemanticsNode
  // traverse tree using composeTreeRoot?.children
}

By examining how Android’s accessibility tools parse the SemanticsNode data, we were able to extract enough information to encode our representation for Session Replay.


kotlin
fun SemanticsNode.extractInfo(): ReplayType {
  val notAttachedOrPlaced = !this.layoutNode.isPlaced || !this.layoutNode.isAttached
  return if (notAttachedOrPlaced) {
    // Nodes that are not attached or placed are skipped
    ScannableView.IgnoreView
  } else if (this.isTransparent) {
    ReplayType.TransparentView
  } else {
    val role = this.unmergedConfig.getOrNull(SemanticsProperties.Role)
    if (this.unmergedConfig.contains(SemanticsProperties.Text)) {
        ReplayType.Label
    } else if (this.unmergedConfig.contains(SemanticsActions.SetText)) {
        ReplayType.TextInput
    } else if (role == Role.Button) {
        ReplayType.Button
    } else if (role == Role.Image) {
        ReplayType.Image
    } else if (role == Role.Checkbox) {
        if (this.unmergedConfig.getOrNull(SemanticsProperties.ToggleableState) == ToggleableState.On) {
            ReplayType.SwitchOn
        } else {
            ReplayType.SwitchOff
        }
    } else {
        // ...etc
    }
  }
}
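To show how the single-pass traversal and the classification above might fit together, here is a minimal sketch in plain Kotlin. SemNode, Kind, Frame and Encoded are hypothetical, heavily simplified stand-ins for SemanticsNode, ReplayType and our encoded output; skipping the whole subtree of a node that is not placed is an assumption of this sketch, not a statement about the shipped implementation.

```kotlin
// Hypothetical, simplified stand-ins; the real SemanticsNode exposes its data
// through its semantics configuration rather than plain fields.
enum class Kind { Ignore, Label, TextInput, Button, View }

data class Frame(val x: Int, val y: Int, val width: Int, val height: Int)

class SemNode(
    val frame: Frame,
    val isPlaced: Boolean = true,
    val text: String? = null,
    val editable: Boolean = false,
    val isButton: Boolean = false,
    val children: List<SemNode> = emptyList()
)

data class Encoded(val kind: Kind, val frame: Frame)

// Same precedence as the snippet above: text first, then editable, then role.
fun classify(node: SemNode): Kind = when {
    !node.isPlaced -> Kind.Ignore
    node.text != null -> Kind.Label
    node.editable -> Kind.TextInput
    node.isButton -> Kind.Button
    else -> Kind.View
}

// Single pass: classify each node once, append it, then descend.
// Assumption: a node that is not placed hides its whole subtree.
fun encode(root: SemNode): List<Encoded> {
    val out = mutableListOf<Encoded>()
    fun visit(node: SemNode) {
        val kind = classify(node)
        if (kind == Kind.Ignore) return
        out += Encoded(kind, node.frame)
        for (child in node.children) visit(child)
    }
    visit(root)
    return out
}
```

The flat list of kind and frame pairs is the kind of lightweight representation that a wireframe renderer can draw without knowing anything about Compose.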

Armed with this new approach, we rewrote our entire Session Replay logic for Jetpack Compose. After validating it in our tests, we shipped a release-candidate build for our customer to try, and they promptly confirmed that it solved all of their problems!

Perseverance is Key

Roughly 3 weeks after the customer’s initial report, we were able to release a new version of our SDK with the improved Session Replay support. Pivoting to the Semantics APIs not only made our code significantly simpler and less brittle but also improved performance by an order of magnitude (from double-digit to single-digit milliseconds!).

It was a wild ride, but this journey taught us a lot about perseverance, community support, and the importance of adaptability. It reinforced two things for us: the immense value of publicly available source code and the supportive nature of the Android developer community. Moral of the story: keep iterating and exploring alternatives to deliver the best solution for your users!

Are you excited about tackling challenging and innovative problems like these? Join our team at bitdrift: we’re hiring an Android Software Engineer! Want to learn more about bitdrift Capture? Explore our sandbox and experience Capture firsthand, or get in touch with us to get access to the real thing. Please join us in Slack as well to ask questions and give feedback!

Author

Miguel Juárez López
