Android’s billion-dollar mistake(s) – Kt. Academy

Jean-Michel Fayard

An article on billion-dollar mistakes, the ones that are owned up to and the ones that stay unspoken, and on the importance of not misleading new developers with bad documentation.

Have you heard the billion-dollar quote? Probably:

I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object-oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.

Tony Hoare at QCon London in 2009 https://en.wikipedia.org/wiki/Tony_Hoare

If you are like me, your reaction when you first heard this quote was something like: “Wow. I also make lots of mistakes, but they usually don’t cost that much money!”

I have thought more deeply about this recently, and I now think that Tony Hoare is a great programming hero! And not only because of all the impressive work he did outside of the billion-dollar mistake.

No, my claim is that he is also great for having openly owned up to his “mistake”!

Do you think he is the only programmer that made a billion-dollar mistake? Think again. The IT sector is huge. Facebook, Google, Amazon, Apple, Microsoft are worth something between $500 billion and $1 trillion. Any programming mistake that makes their valuation lose 0.2% qualifies as a billion-dollar mistake.

No, the real reason Tony Hoare is known as the billion-dollar-mistake guy is that he clearly and openly described his decision as a mistake and by doing this sent a clear signal that things must change.

This, my friends, was hugely beneficial for the software industry. It’s the reason Kotlin and other programming languages have null safety built into their type systems. They still have null, which in itself isn’t a problem, but it’s integrated into the type system to ensure that all references are absolutely safe, with checking performed automatically by the compiler.
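As a minimal sketch of what that looks like in Kotlin (the function names are made up for this illustration):

```kotlin
// A non-null type: the compiler rejects any call that passes null here.
fun shout(message: String): String = message.uppercase()

// A nullable type: the "?" makes the possibility of null part of the signature.
fun shoutOrDefault(message: String?): String {
    // Calling message.uppercase() directly would not compile, because message may be null.
    // The safe call (?.) and the Elvis operator (?:) force us to handle that case.
    return message?.uppercase() ?: "NOTHING TO SAY"
}
```

Writing shout(null) is rejected at compile time instead of crashing at runtime, which is exactly the automatic checking Tony Hoare was originally aiming for.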

Tony Hoare is the real Mensch, the ego-less programmer who accepted responsibility for a mistake so that we can recognize it, and we should all be grateful.

In the Android world, things are a bit different. Let’s start with a simple example before we dive into something deeper.

The sAndroid Hungarian mNotation

Until May 2019, most of the sAndroid mCodebases in the world were plagued by a meaningless mVariant of the Hungarian mNotation. It had the drawback of bringing no benefits over a simple code highlighting rule present in Android Studio, and the obvious drawback of making everything less readable.
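For readers who never had to write it, here is a small sketch of the convention next to its plain equivalent. Both classes are invented for this illustration. The Android Open Source Project code style asked for an m prefix on non-public, non-static fields (and an s prefix on static ones).

```kotlin
// With the prefix convention: every member field carries an "m".
class HungarianCounter {
    private var mCount = 0

    fun increment(): Int {
        mCount += 1
        return mCount
    }
}

// Without it: identical behaviour, one less thing to read.
class PlainCounter {
    private var count = 0

    fun increment(): Int {
        count += 1
        return count
    }
}
```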

When you brought up the problem before 2019, you typically got one of two answers:

  • this is the status-quo, therefore it’s good.
  • we the Android team were just saying that this convention must be followed if you contribute code in the Android Open Source project.

But actually

  • The first answer was wrong. We know it because, now that the Hungarian notation is dead, there have been no loud complaints asking to have it back.
  • The second answer is worse: it belongs to the not-even-wrong category. The claim is essentially that everyone else was mistaken. The obvious question is then: why? Well, because everyone had been learning from the Android documentation and samples, and this convention was everywhere. This is exactly the kind of hard, consistent work that you should do to create a convention. It just happened to be a harmful convention.

What killed the mHungarian notation in May 2019? Not the admission of a mistake, but the introduction of Kotlin. Why did we have to wait so long?

Android is obviously a huge commercial success and I’m not claiming the contrary. Android and the iPhone have managed to impose a duopoly on the smartphone world, so what follows is probably not a tactical “mistake” either. We have to work with whatever tools the Android team delivers, no matter what.

I also think that Android is a good operating system from a user point of view. You can like iOS better, I’m fine with it, but that doesn’t make Android bad.

In the Context of this article, mistake means specifically leading developers down a path that will bring them pain and suffering.

I am also not claiming that this is the only big mistake in the Android SDK, or even the most important one.

If you are curious about learning about Android’s bad parts, the #androiddev Reddit community has put together a really useful list of what they find bad in Android. But here I will focus on an interesting, basic mistake.

A sad thing about Android is that the official Android samples follow what Israel Ferrer Camacho called the Android Burrito design pattern: just wrap everything into a burrito 🌯 a GodActivity and/or GodFragment that does everything.

Have a look for example at the wonderful source code of 😱 camera-samples/Camera2BasicFragment.kt 😭. I had to cut most of the file to fit it inside a gist, but it’s still readable:

See in its full glory at https://github.com/android/camera-samples/blob/f555592364979f4005db0bea40753e0de52c8d86/Camera2BasicKotlin/Application/src/main/java/com/example/android/camera2basic/Camera2BasicFragment.kt#L60-L817

God kills a kitten every time you suggest putting yet another thing inside the Activity. That’s exactly what the official Android documentation and samples do to this day.

What could go wrong if you follow the Android Burrito design pattern?

Activity is a special kind of Context, one filled with landmines ready to explode. The most obvious problem is that, because of its complex lifecycle, your Activity can be killed by the system at any point in time. It is much safer to use a Context with a simpler lifecycle, like Application.

Activity is an expensive object that is tied to the entire user interface. It is very easy to fall into the trap of holding on to the Activity object. A memory leak follows. In fact, it’s such a common trap that you will see this error even in classes of the Android SDK itself, either in some crappy Samsung fork or in the Android Open Source Project itself. It is such a common problem that the good guys at Square invested time and effort to detect those problems automatically.

Legacy code is often used as a vague term meaning “code that is so hard to understand that you are afraid to change it”. Michael Feathers’s classic book “Working Effectively with Legacy Code” has a more precise and operational definition: any code that is not automatically covered by unit tests qualifies as legacy code.

Any code that follows the Android Burrito design pattern qualifies instantly as legacy code.

I have always wondered why the official Android documentation puts so much emphasis on the instrumented tests.

In my experience, those are hard to write, fundamentally slow (they have to run on Android devices), and worst of all, they usually tell you very little when they fail.

I went in the exact opposite direction, wrote lots of easy, fast, focused JVM tests and had much better results. In fact, Google’s Testing team has a brilliant article explaining why a strategy built around end-to-end tests is an apparently good idea that fails in practice:

Just Say No to more end-to-end tests

Good ideas often fail in practice, and in the world of testing, one pervasive good idea that often fails in practice is a testing strategy built around end-to-end tests.

[…please read the whole article, it’s very good…]
Google Testing Blog: Just Say No to more end-to-end tests

So instrumented Android tests are not a great idea.

But honestly, if you put your logic inside the Android components, that’s pretty much all you can do:

The only way to test a Burrito is to taste it.
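By contrast, logic that lives in a plain class can be checked on the JVM in milliseconds, without a device. Here is a minimal sketch, assuming a hypothetical sign-up screen. SignUpValidator and its rules are invented for this illustration and do not come from any Android sample.

```kotlin
// Plain Kotlin: no Activity, no Context, no Android import.
// An Activity or Fragment would only call errorFor() and display the result.
class SignUpValidator(private val minPasswordLength: Int = 8) {

    // Returns a human-readable error, or null when the input is acceptable.
    fun errorFor(email: String, password: String): String? = when {
        !email.contains('@') -> "Invalid email"
        password.length < minPasswordLength -> "Password too short"
        else -> null
    }
}
```

Because the class never touches the Android framework, an ordinary unit test can construct it directly and assert on the returned value.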

The Android Burrito design pattern is in retrospect so clearly wrong that it made me curious: where does it come from and how did it survive until today?

To give you some Context, here are two of the most fundamental building blocks of the Android SDK 1.0:

  • android.content.Context provides access to all global information about an application environment. It allows access to application-specific resources and classes, as well as up-calls for application-level operations such as launching activities, broadcasting and receiving intents, etc.
  • android.app.Activity provides the equivalent of a main() function for an app, but supercharged with a lot of features you need in a mobile operating system, most importantly a complex Activity lifecycle.

How are those two concepts related?

Here is a fatal error that was made in Android 1.0

package android.app;

import android.content.Context;

// Simplified: the real class gets there via ContextThemeWrapper and ContextWrapper.
class Activity extends Context {
}

But first a bit of theory.

From your Object Oriented Programming 101 course, you may remember that there are two very different kinds of relationships between objects:

  • Inheritance: A House IS A Building
  • Composition: A House HAS A Room
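The two relationships can be sketched in a few lines of Kotlin. The classes below simply mirror the House example and are not taken from any library:

```kotlin
// Inheritance: a House IS A Building, so it can be used wherever a Building is expected.
open class Building(val address: String)

class Room(val name: String)

// Composition: a House HAS Rooms, held as a property rather than inherited.
class House(address: String, val rooms: List<Room>) : Building(address)

// Accepts any Building, including a House.
fun describe(building: Building): String = "Building at ${building.address}"
```

Inheritance exposes everything the parent can do to every caller, whereas composition lets the owner decide what to expose.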

“Prefer Composition over Inheritance” is a well-known design principle that has been stated in several influential books.

Android is just another SDK (Software Development Kit), but is there maybe a reason that the principle does not apply here? Well no, I know that if they had the chance to rewrite Activity today, the Android team wouldn’t do it this way because…

If you look at androidx.fragment.app.Fragment, another building block from the Android SDK that is very similar to Activity but was introduced later, you realize that it does not extend Context. Instead, a Fragment HAS A Context.

So why did the Android team change their mind, albeit silently?

You can and should avoid the Burrito Design Pattern. What you can’t escape is that in Android, you need a Context to do basically everything:

class SomeThirdPartyClass {
    fun doStuff(context: Context) = TODO()
}

But even this banal SomeThirdPartyClass class is a landmine waiting to explode.

Activity IS A Context, so it’s easy to pass this@Activity as a parameter to doStuff(). But that’s the wrong thing to do: you cannot be sure that SomeThirdPartyClass is doing the right thing or that you are doing the right thing. Crashes, memory leaks and untestability follow.
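To make the memory-leak risk concrete without pulling in the Android SDK, here is a rough sketch built on simplified stand-ins. The Context, Application and Activity classes below are NOT the real Android classes, and LeakyHelper and SaferHelper are hypothetical names.

```kotlin
// Simplified stand-ins for illustration only; these are NOT the real Android classes.
abstract class Context {
    abstract val applicationContext: Context
}

class Application : Context() {
    override val applicationContext: Context get() = this
}

class Activity(private val application: Application) : Context() {
    override val applicationContext: Context get() = application
}

// Hypothetical long-lived helper that keeps whatever Context it is given.
// If it outlives the Activity, the Activity cannot be garbage collected.
class LeakyHelper(val context: Context)

// Same helper, but it only retains the application-wide Context.
class SaferHelper(context: Context) {
    val context: Context = context.applicationContext
}
```

In the real SDK the equivalent call is Context.getApplicationContext(). It is not a universal fix, since some operations, such as anything that depends on the Activity’s theme, genuinely need the Activity, but it shows how easily the IS A relationship hides the difference between a short-lived and a long-lived Context.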

I want to point out that I am not talking only about a historical short-sighted decision.

In 2014, I was a young inexperienced Android developer in a team of young, inexperienced Android developers. We were trying to learn how these things worked and used the Android documentation and samples as a blueprint. In retrospect, this was a terrible mistake. We ended up with a painful difficult-to-understand, hard-to-test, even-harder-to-modify mess. Not because we didn’t follow the “Android best practices”, but precisely because we did!

Fast forward to today, and while there has been progress in several areas, a large part of the official Android documentation and samples is still poorly written. And it continues to mislead the new generation of inexperienced developers. And as Uncle Bob would tell you, most developers are new, since the IT industry doubles in size every five years.

I am aware that for a certain school of thought, all of this is fair game: “Those mistakes are stupid, I am a Real Programmer and wouldn’t fall for it. And you can’t prevent stupid people from being stupid, no?”.

But I’m from the design-for-humans school of thought, so in my opinion when one programmer makes an error, it’s the programmer’s fault, but when, over a decade, thousands of programmers make the same error again and again, then it’s the designers and the documentation writers who didn’t do a good job. Ideally, it should be easy to do the right thing, and harder to shoot yourself in the foot.

So it’s time to say clearly that Burrito Activities and Fragments are not acceptable. It’s past time to fix the documentation and samples.

I do understand that, although they are painful today, those mistakes were made in a given historical context. The Android project had to release something or become irrelevant. It was a different era, when smartphones were hugely less powerful than today.

It’s the same story with JavaScript. Its design was rushed in just ten days, then it was shipped in Netscape Navigator 2.0, and the rest is history.

It’s not that there are no solutions that make these kinds of historical mistakes manageable.

Smart people are usually fast at finding solutions once they are painfully aware of the problem.

And this is precisely what is so great about Tony Hoare’s ego-less honesty: it instantly raises the awareness that there is a problem to be fixed here. This is what the Android world is lacking today. The official Android documentation continues to this day to kill kittens with the wonderful Android Burrito Design Pattern.

Allow me to finish with Tony Hoare’s quote:

This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
