by Wayne Chang, Co-founder
At Crashlytics, one of our founding principles has been an extreme (some would say absurd) attention to detail. Crash detection and reporting, particularly on iOS, is a complex and esoteric problem to solve, with arcane restrictions that throw modern programming practices out the window.
Need to allocate memory at crash-time? Revisit your approach. Thinking of calling an Objective-C method? Dream on.
This focus has not gone unnoticed: many of the world's best mobile engineering teams, which build many of the most well-known apps, now trust our award-winning Crashlytics for iOS solution to deliver accurate, detailed crash reports from hundreds of millions of devices around the globe.
But we're not satisfied.
Over the past 6 months, we've embarked on a ground-up rewrite of our iOS SDK to take things to a whole new level. After 2 months of intense testing, I'm extremely pleased to publicly announce the release of Crashlytics for iOS v2.
Highlights from hundreds of improvements
When we set out to design our new iOS SDK, it was the perfect opportunity to fundamentally rethink our approach. We've made hundreds of improvements to our iOS SDK that have led to significant performance and stability increases. Here are some of the major ones:
- Mach Exceptions. Better than signal handlers.
All widely used crash-reporting solutions for iOS and Mac OS are currently based on signals and uncaught exceptions. By registering handlers for both of these events, it's possible to detect and inspect the majority of crashes that occur. As our usage exploded, however, it became painfully obvious that crashes were sneaking through. For example, it's not possible to catch all stack-overflow crashes with a signal handler.
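The signal-based approach described above can be sketched in portable POSIX C. This is a minimal illustration, not Crashlytics code: the names `install_crash_handlers`, `crash_handler`, and `CRASH_STACK_SIZE` are hypothetical, and the choice of signals is an assumption. The handler runs on an alternate stack set up with `sigaltstack`, which helps when the crashing thread has exhausted its own stack, although it does not cover every stack-overflow case.

```c
#define _DEFAULT_SOURCE /* exposes sigaltstack with glibc under strict -std= modes */
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical size for the alternate stack the handler runs on. */
#define CRASH_STACK_SIZE (64 * 1024)

static void crash_handler(int sig, siginfo_t *info, void *context) {
    (void)info;
    (void)context;

    /* Only async-signal-safe calls belong here: no malloc, no printf,
       and certainly no Objective-C. */
    static const char msg[] = "fatal signal caught\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;

    /* Restore the default action and re-raise so the process still
       terminates with the original signal. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Registers one handler for a set of fatal signals.
   Returns 0 on success, -1 on failure. */
int install_crash_handlers(void) {
    static const int fatal_signals[] = { SIGSEGV, SIGBUS, SIGILL, SIGFPE, SIGABRT };

    /* Allocate the alternate stack up front, long before any crash. */
    stack_t alt;
    memset(&alt, 0, sizeof alt);
    alt.ss_sp = malloc(CRASH_STACK_SIZE);
    if (alt.ss_sp == NULL) return -1;
    alt.ss_size = CRASH_STACK_SIZE;
    alt.ss_flags = 0;
    if (sigaltstack(&alt, NULL) != 0) {
        free(alt.ss_sp);
        return -1;
    }

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = crash_handler;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK; /* run the handler on the alternate stack */
    sigemptyset(&sa.sa_mask);

    for (size_t i = 0; i < sizeof fatal_signals / sizeof fatal_signals[0]; i++) {
        if (sigaction(fatal_signals[i], &sa, NULL) != 0) return -1;
    }
    return 0;
}
```

Calling `install_crash_handlers()` once at startup is enough for this sketch. A production handler would also save and chain any previously installed handlers, which is omitted here.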
Fortunately, there's a better way. In Darwin, signals are actually implemented on top of lower-level events called Mach Exceptions. Handling these directly is the holy grail: all crashes can be captured immediately after they happen, with far more precision and accuracy. The Mach Exception API is radically more complex than signal handling, but capturing every crash more than justifies the hurdles.
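To give a flavor of the Mach Exception API, here is a rough sketch of its first step: creating a port and asking the kernel to deliver this task's exceptions to it. It is an illustration under assumptions, not our SDK's code. The function name `install_mach_exception_port` and the chosen exception mask are hypothetical, and the code only does real work on Darwin; elsewhere it simply returns -1.

```c
#if defined(__APPLE__)
#include <mach/mach.h>
#endif

/* Hypothetical helper: routes this task's fatal Mach exceptions to a new port.
   Returns 0 on success, -1 on failure or on non-Darwin platforms. */
int install_mach_exception_port(unsigned int *out_port) {
#if defined(__APPLE__)
    mach_port_t port = MACH_PORT_NULL;
    task_t self = mach_task_self();

    /* Create a receive right: the end of the port we will read from. */
    if (mach_port_allocate(self, MACH_PORT_RIGHT_RECEIVE, &port) != KERN_SUCCESS)
        return -1;

    /* Add a send right so the kernel can post exception messages to it. */
    if (mach_port_insert_right(self, port, port, MACH_MSG_TYPE_MAKE_SEND) != KERN_SUCCESS) {
        mach_port_mod_refs(self, port, MACH_PORT_RIGHT_RECEIVE, -1);
        return -1;
    }

    /* Assumed mask: the exception types behind SIGSEGV/SIGBUS, SIGILL, and SIGFPE. */
    exception_mask_t mask = EXC_MASK_BAD_ACCESS |
                            EXC_MASK_BAD_INSTRUCTION |
                            EXC_MASK_ARITHMETIC;

    if (task_set_exception_ports(self, mask, port,
                                 EXCEPTION_DEFAULT, THREAD_STATE_NONE) != KERN_SUCCESS) {
        mach_port_mod_refs(self, port, MACH_PORT_RIGHT_RECEIVE, -1);
        return -1;
    }

    *out_port = port;
    return 0;
#else
    (void)out_port;
    return -1; /* Mach exceptions exist only on Darwin. */
#endif
}
```

Registering the port is the easy part. A working handler also needs a dedicated thread blocked in `mach_msg()` on that port, code to decode each exception message and reply to it, and logic to preserve any previously registered exception ports. That is where most of the extra complexity lives, and none of it is shown here.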
- Advanced techniques to stop secondary crashes.
Processes that crash often end up sustaining considerable damage before the kernel takes action to terminate them. In many of the nastiest crashes we've seen, this can result in secondary crashes, where the crash-handling code itself is unable to operate correctly and fails, obscuring the source of the original crash. Secondary crashes have two primary causes, both due to corruption. A buffer overrun could mangle or destroy the in-memory data structures that Crashlytics uses to track state. Alternatively, hardware failures or disk errors could damage the temporary cache files used to record data before it is sent to our servers.
Our new SDK goes to great lengths to address these scenarios. By carefully controlling its memory usage, it is able to pre-allocate a contiguous block of RAM that it then surrounds with guard pages, protecting against buffer overruns. In the case of cache corruption, we've invested in making our file-handling code extremely defensive, so parsing cache files can't crash unexpectedly.
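The guard-page idea can be illustrated with plain `mmap` and `mprotect`. The sketch below is an assumption about how such a region might be built, not a description of our SDK's internals, and the `guarded_region` type and its functions are hypothetical names. It reserves one contiguous mapping with no access rights, then opens up only the middle, so a stray write that runs off either end of the usable area faults immediately instead of silently corrupting it.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS with glibc under strict -std= modes */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical descriptor for a block of memory fenced by guard pages. */
typedef struct {
    void  *base;       /* start of the whole mapping, including both guards */
    void  *usable;     /* first readable and writable byte */
    size_t usable_len; /* size of the writable area, a multiple of the page size */
    size_t total_len;  /* usable_len plus two guard pages */
} guarded_region;

/* Returns 0 on success, -1 on failure. */
int guarded_region_create(guarded_region *r, size_t min_bytes) {
    long page_l = sysconf(_SC_PAGESIZE);
    if (page_l <= 0 || min_bytes == 0) return -1;
    size_t page = (size_t)page_l;
    if (min_bytes > SIZE_MAX - 3 * page) return -1; /* avoid size overflow */

    /* Round the request up to whole pages and add one guard page per side. */
    size_t usable_len = (min_bytes + page - 1) / page * page;
    size_t total_len = usable_len + 2 * page;

    /* Reserve the entire range with no access at all. */
    void *base = mmap(NULL, total_len, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return -1;

    /* Open up only the middle; the first and last pages stay inaccessible. */
    void *usable = (char *)base + page;
    if (mprotect(usable, usable_len, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, total_len);
        return -1;
    }

    r->base = base;
    r->usable = usable;
    r->usable_len = usable_len;
    r->total_len = total_len;
    return 0;
}

void guarded_region_destroy(guarded_region *r) {
    munmap(r->base, r->total_len);
}
```

Because the region is created once at startup, nothing has to be allocated at crash time. Note that guard pages only catch overruns that cross the region's boundary; they do not protect against writes that land directly inside the usable area.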
- Stack unwinding. Finding the real path to the crash.
One of the most abstruse aspects of crash detection is stack unwinding, the seemingly omniscient ability to determine the history of code execution that directly led to the crash. In practice, it involves carefully walking up the stack in memory and searching for return addresses - the instruction pointers of the calling lines.
Writing an ARM stack unwinder that works in most cases is relatively straightforward - the stack layout for iOS is well-defined. However, things start to fall apart when custom assembly is thrown into the mix, as there are no hard rules on what can and can't be done. It just so happens that objc_msgSend is such a routine, performing countless tricks to make dynamic method invocation in Objective-C as fast as possible. All works perfectly during normal operation, but if objc_msgSend crashes, a naïve ARM unwinder could easily miss the stack frame of the calling function. Of course, that's the critical line you need to know!
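To make the failure mode concrete, here is a small sketch of a frame-pointer unwinder in C. It assumes the conventional layout in which each function prologue pushes a two-word frame record (the caller's frame pointer, then the return address), and it walks that chain with basic sanity checks. The function names are hypothetical, and seeding the walk with the link register is shown only to illustrate where the missing caller lives; it is not a description of the technique our SDK uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Walks a chain of frame records, assumed to be laid out as
   slot[0] = caller's frame pointer, slot[1] = return address.
   Collects return addresses into out and returns how many were found.
   stack_lo and stack_hi bound the memory that is safe to read. */
size_t walk_frames(uintptr_t fp, uintptr_t stack_lo, uintptr_t stack_hi,
                   uintptr_t *out, size_t max_frames) {
    const size_t record_size = 2 * sizeof(uintptr_t);
    size_t n = 0;

    if (stack_hi < stack_lo || stack_hi - stack_lo < record_size) return 0;

    while (n < max_frames) {
        /* Never dereference a frame pointer outside the known stack. */
        if (fp < stack_lo || fp > stack_hi - record_size) break;
        if (fp % sizeof(uintptr_t) != 0) break;

        const uintptr_t *slot = (const uintptr_t *)fp;
        uintptr_t prev_fp = slot[0];
        uintptr_t return_addr = slot[1];
        if (return_addr == 0) break;

        out[n++] = return_addr;

        /* The stack grows down, so a caller's record must sit at a higher
           address. This also stops at a null link and breaks cycles. */
        if (prev_fp <= fp) break;
        fp = prev_fp;
    }
    return n;
}

/* Illustrative variant: a routine that never pushed a frame record leaves
   its caller's address only in the link register, so report lr first.
   Deciding whether lr is still valid at the crash site is the hard part,
   and this sketch leaves that decision to the caller. */
size_t walk_frames_with_lr(uintptr_t lr, uintptr_t fp,
                           uintptr_t stack_lo, uintptr_t stack_hi,
                           uintptr_t *out, size_t max_frames) {
    size_t n = 0;
    if (max_frames == 0) return 0;
    if (lr != 0) out[n++] = lr;
    return n + walk_frames(fp, stack_lo, stack_hi, out + n, max_frames - n);
}
```

Picture a call chain main, A, B, followed by a hand-written routine that crashes before setting up a frame. At the crash, the frame pointer still points at B's record, whose return address leads back into A. `walk_frames` therefore reports A, main, and whatever called main, and never mentions B. The address inside B exists only in the link register, which is why `walk_frames_with_lr` recovers it when handed that value.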
Our new SDK uses a vastly better technique to determine the calling instruction, one that works in the case of objc_msgSend and many other "creative" routines that still conform in part to Apple's iOS ABI.
We could not be happier to get these improvements into the hands of thousands of app developers, and the feedback so far has been fantastic. Thanks to all those who have helped us test Crashlytics for iOS v2. Trust me, we're not stopping here: the most advanced crash reporting SDK for iOS will keep getting better!
Have questions about Crashlytics for iOS v2? Send us an email!