I was recently optimizing a small part of the Jetpack Compose runtime when I stumbled upon a seemingly harmless API, isBlank(). This API returns true if the string it’s called on is empty or consists solely of whitespace characters.

But is it truly harmless? Let’s look at the JVM implementation to get a better sense of what it does:

public actual fun CharSequence.isBlank(): Boolean =
    length == 0 || indices.all { this[it].isWhitespace() }

Simple and straight to the point. Unfortunately, the bytecode tells a very different story:

const-string v0, "s" // string@0080
invoke-static {v6, v0}, Lkotlin/text/CharsKt__CharKt;.checkNotNullParameter:(Ljava/lang/Object;Ljava/lang/String;)V // method@002a
invoke-interface {v6}, Ljava/lang/CharSequence;.length:()I // method@0006
move-result v0
const/4 v1, #int 1 // #1
if-eqz v0, 0069 // +005f
new-instance v0, Lkotlin/ranges/IntRange; // type@001c
invoke-interface {v6}, Ljava/lang/CharSequence;.length:()I // method@0006
move-result v2
add-int/lit8 v2, v2, #int -1 // #ff
const/4 v3, #int 0 // #0
invoke-direct {v0, v3, v2}, Lkotlin/ranges/IntRange;.<init>:(II)V // method@0026
instance-of v2, v0, Ljava/util/Collection; // type@0017
if-eqz v2, 0026 // +000c
move-object v2, v0
check-cast v2, Ljava/util/Collection; // type@0017
invoke-interface {v2}, Ljava/util/Collection;.isEmpty:()Z // method@001b
move-result v2
if-eqz v2, 0026 // +0003
goto 0064 // +003f
invoke-virtual {v0}, Lkotlin/ranges/IntProgression;.iterator:()Ljava/util/Iterator; // method@001e
move-result-object v0
move-object v2, v0
check-cast v2, Lkotlin/ranges/IntProgressionIterator; // type@001b
iget-boolean v2, v2, Lkotlin/ranges/IntProgressionIterator;.hasNext:Z // field@0007
if-eqz v2, 0064 // +0035
move-object v2, v0
check-cast v2, Lkotlin/ranges/IntProgressionIterator; // type@001b
iget v4, v2, Lkotlin/ranges/IntProgressionIterator;.next:I // field@0008
iget v5, v2, Lkotlin/ranges/IntProgressionIterator;.finalElement:I // field@0006
if-ne v4, v5, 0047 // +000f
iget-boolean v5, v2, Lkotlin/ranges/IntProgressionIterator;.hasNext:Z // field@0007
if-eqz v5, 0041 // +0005
iput-boolean v3, v2, Lkotlin/ranges/IntProgressionIterator;.hasNext:Z // field@0007
goto 004c // +000c
new-instance v6, Ljava/util/NoSuchElementException; // type@0019
invoke-direct {v6}, Ljava/util/NoSuchElementException;.<init>:()V // method@001c
throw v6
iget v5, v2, Lkotlin/ranges/IntProgressionIterator;.step:I // field@0009
add-int/2addr v5, v4
iput v5, v2, Lkotlin/ranges/IntProgressionIterator;.next:I // field@0008
invoke-interface {v6, v4}, Ljava/lang/CharSequence;.charAt:(I)C // method@0005
move-result v2
invoke-static {v2}, Ljava/lang/Character;.isWhitespace:(C)Z // method@0008
move-result v4
if-nez v4, 005f // +000b
invoke-static {v2}, Ljava/lang/Character;.isSpaceChar:(C)Z // method@0007
move-result v2
if-eqz v2, 005d // +0003
goto 005f // +0003
const/4 v2, #int 0 // #0
goto 0060 // +0002
const/4 v2, #int 1 // #1
if-nez v2, 002a // -0036
const/4 v6, #int 0 // #0
goto 0065 // +0002
const/4 v6, #int 1 // #1
if-eqz v6, 0068 // +0003
goto 0069 // +0002
const/4 v1, #int 0 // #0
return v1

The implementation suffers from several problems, including allocating two new objects: an IntRange and an Iterator. If we look again at the Kotlin implementation, we notice it calls all() on CharSequence.indices, an IntRange that would normally be optimized away when compiling a regular for loop. In this case, the call to all(), an API defined using generics, leads to the absurd code above that goes through many abstractions just to iterate over a sequence of characters.
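To make the two allocations easier to spot, here is a rough hand-written expansion of what indices.all { ... } turns into once the inline all() is copied into the caller. The name isBlankExpanded is made up for illustration, and the body paraphrases the shape of the bytecode above rather than reproducing the standard library source:

```kotlin
// Hypothetical expansion, written by hand to mirror the bytecode above.
fun CharSequence.isBlankExpanded(): Boolean {
    if (length == 0) return true

    // `indices` materializes an IntRange (first allocation). Typing it as
    // Iterable<Int> mirrors what the generic all() sees.
    val range: Iterable<Int> = IntRange(0, length - 1)

    // Generic fast path for empty collections (the instance-of check above).
    if (range is Collection<*> && range.isEmpty()) return true

    // Iterating an Iterable goes through an Iterator (second allocation).
    val iterator = range.iterator()
    while (iterator.hasNext()) {
        val index = iterator.next()
        if (!this[index].isWhitespace()) return false
    }
    return true
}
```

Written out this way, the range and the iterator are plain objects that have to be created on every call to handle a non-empty string.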

Instead of a beautiful one-liner, we can use a traditional for loop:

private inline fun CharSequence.fastIsBlank(): Boolean {
    for (i in 0..<length) {
        if (!this[i].isWhitespace()) {
            return false
        }
    }
    return true
}
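As a quick sanity check, a sketch like the following can confirm that a loop-based version agrees with the standard isBlank() on a handful of inputs. The names loopIsBlank and disagreements are hypothetical helpers introduced here, and until is used in place of ..< only so that the sketch also compiles with older Kotlin compilers:

```kotlin
// Hypothetical standalone copy of the loop version, non-private so it can be exercised.
fun CharSequence.loopIsBlank(): Boolean {
    for (i in 0 until length) {
        if (!this[i].isWhitespace()) return false
    }
    return true
}

// Returns every sample on which the loop version disagrees with isBlank().
fun disagreements(samples: List<String>): List<String> =
    samples.filter { it.isBlank() != it.loopIsBlank() }
```

Feeding it empty, whitespace-only, and mixed strings should produce an empty list if the two implementations behave the same.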

The code remains easy to understand but the generated bytecode is much more efficient:

invoke-interface {v6}, Ljava/lang/CharSequence;.length:()I // method@0007
move-result v0
const/4 v1, #int 1 // #1
sub-int/2addr v0, v1
const/4 v2, #int 0 // #0
const/4 v3, #int 0 // #0
if-ge v3, v0, 0024 // +001c
invoke-interface {v6, v3}, Ljava/lang/CharSequence;.charAt:(I)C // method@0006
move-result v4
invoke-static {v4}, Ljava/lang/Character;.isWhitespace:(C)Z // method@0009
move-result v5
if-nez v5, 0021 // +000f
const/16 v5, #int 160 // #a0
if-eq v4, v5, 0021 // +000b
const/16 v5, #int 8199 // #2007
if-eq v4, v5, 0021 // +0007
const/16 v5, #int 8239 // #202f
if-eq v4, v5, 0021 // +0003
return v2
add-int/lit8 v3, v3, #int 1 // #01
goto 0008 // -001b
return v1

If we benchmark both implementations using a list of 1,000 strings, we see that our for loop version takes about 60% less time (roughly 2.5x faster) and does zero allocations:

37,998 ns  2001 allocs  isBlank
14,943 ns     0 allocs  isBlankForLoop

We can even further improve on it. If you look closely at the bytecode, you can see that Kotlin’s isWhitespace() calls both Character.isSpaceChar() and Character.isWhitespace(), two functions that do similar work (the implementations are almost the same).

To further optimize our function, we can invoke only Character.isWhitespace() and do the extra checks done in Character.isSpaceChar() ourselves, namely the three non-breaking space characters that Character.isWhitespace() excludes (the opposite would be more complicated):

private inline fun CharSequence.fastIsBlank(): Boolean {
    for (i in 0..<length) {
        val c = this[i]
        if (!Character.isWhitespace(c) &&
            c != '\u00a0' && c != '\u2007' && c != '\u202f') {
            return false
        }
    }
    return true
}
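Replacing Character.isSpaceChar() with three comparisons is only safe if both predicates accept exactly the same characters. Assuming a JVM target, a minimal sketch like this can check that exhaustively over every Char value. The function names are hypothetical and not part of any library:

```kotlin
// The combined check described in the text: both Character calls.
fun twoCallCheck(c: Char): Boolean =
    Character.isWhitespace(c) || Character.isSpaceChar(c)

// The single-call variant: isWhitespace plus the three non-breaking spaces.
fun oneCallCheck(c: Char): Boolean =
    Character.isWhitespace(c) || c == '\u00a0' || c == '\u2007' || c == '\u202f'

// Returns the first Char on which the two predicates disagree, or null if none does.
fun firstMismatch(): Char? {
    for (c in Char.MIN_VALUE..Char.MAX_VALUE) {
        if (twoCallCheck(c) != oneCallCheck(c)) return c
    }
    return null
}
```

If firstMismatch() returns null on the runtime in use, the shortcut preserves behavior for every UTF-16 code unit there.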

The benchmarks are now:

37,998 ns  2001 allocs  isBlank
14,943 ns     0 allocs  isBlankForLoop
13,519 ns     0 allocs  isBlankForLoopOneCall

The latest version takes close to 65% less time than the original implementation, and we won’t trigger the garbage collector either!
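For readers who want to redo the arithmetic, the percentages can be derived from the timings above with a small sketch. The helper names are made up for illustration:

```kotlin
// Percentage of time saved when going from baselineNs to optimizedNs.
fun timeSavedPercent(baselineNs: Double, optimizedNs: Double): Double =
    (1.0 - optimizedNs / baselineNs) * 100.0

// How many times faster the optimized version runs compared to the baseline.
fun speedup(baselineNs: Double, optimizedNs: Double): Double =
    baselineNs / optimizedNs
```

With the numbers from the table, timeSavedPercent(37_998.0, 14_943.0) comes out at about 60.7 and timeSavedPercent(37_998.0, 13_519.0) at about 64.4, which corresponds to speedups of roughly 2.5x and 2.8x.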

I will spare you the printouts of the final ARM assembly after ahead-of-time compilation. Just know that we go from 161 instructions with the original isBlank() down to 53 instructions. And that’s without counting the instructions we save by not invoking Character.isSpaceChar().

So does this matter? As always, it depends. If you are validating a text field when the user submits a form, probably not. If you are trying to parse a ton of strings, the GC savings alone probably make this worth paying attention to. But more importantly, this shows that even simple-looking one-liners can have a disproportionate impact on performance in places you don’t expect.