-[ Introduction ]
This is a very quick post about a challenge I just solved and found interesting. It’s a Java “what’s this code doing?” type of challenge, and since I haven’t touched Java bytecode in a while I thought I’d give it a go!
The original challenge can be found here, and many many more can be found here!
-[ Challenge ]
We’re presented with this snippet of, as we are informed, optimized Java bytecode:
public boolean f(char);
  descriptor: (C)Z
  flags: ACC_PUBLIC
  Code:
    stack=2, locals=2, args_size=2
       0: iload_1
       1: bipush 97
       3: if_icmplt 14
       6: iload_1
       7: bipush 122
       9: if_icmpgt 14
      12: iconst_1
      13: ireturn
      14: iload_1
      15: bipush 65
      17: if_icmplt 28
      20: iload_1
      21: bipush 90
      23: if_icmpgt 28
      26: iconst_1
      27: ireturn
      28: iconst_0
      29: ireturn
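The author was kind enough to give us the method’s signature! It takes a character (the primitive char, as opposed to the boxed Character) and returns a boolean, and its operand stack holds at most two elements at any time (stack=2). Since the method is not static, local variable 0 holds this and local variable 1 holds the char argument, which is why every load in the listing is an iload_1. The Oracle documentation and this Wikipedia article are two handy references that I frequently use when doing Java bytecode-level reversing.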
Reading this from top to bottom, it reads like:
0: load local variable 1 on the stack
1: push integer 97 on the stack
3: if local variable 1 < 97 GOTO offset 14
6: load local variable 1 on the stack
7: push integer 122 on the stack
9: if local variable 1 > 122 GOTO offset 14
12: load constant 1 on the stack
13: return
14: load local variable 1 on the stack
15: push integer 65 on the stack
17: if local variable 1 < 65 GOTO offset 28
20: load local variable 1 on the stack
21: push integer 90 on the stack
23: if local variable 1 > 90 GOTO offset 28
26: load constant 1 on the stack
27: return
28: load constant 0 on the stack
29: return
There are a number of... numbers that should attract your attention on this one! Number one, the targets of the different GOTOs: offset 14 leads to further computation, while offset 28 leads straight to a return statement. Number two, the numbers involved in the comparisons are all in the ASCII printable range (I’m always on the lookout for those!).
We can tackle the above code in a number of ways. To make sense of it, it would not hurt to give the Oracle documentation on the JVM and its stack-based architecture a read. It explains which operations consume which values from the stack, and why the code has to keep loading the same local variable onto the stack!
-[ Approach ]
One can treat offsets 0-13 as one set of checks and offsets 14-29 as a second set. The first set loads the value on the stack and checks whether it’s less than 97 (‘a’) or greater than 122 (‘z’). If neither is the case, it returns 1. Otherwise execution jumps to the second set, which runs the same checks against 65 (‘A’) and 90 (‘Z’) and returns 0 if the value falls outside that range as well.
The JVM’s if_icmpXX instructions expect two int values on the stack, which they consume (i.e. both are removed from the stack when the instruction executes). If the comparison holds, execution continues from the offset specified with the instruction; if it does not, execution falls through to the next instruction. As an example, at offset 3, if the value pushed first (local variable 1) is less than the value pushed second (97), we continue execution from offset 14, otherwise from offset 6.
Also, truth values are represented with the integers 0 for false and 1 for true (similar to .NET)!
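To make the consume-and-reload behaviour concrete, here is a small sketch that replays offsets 0-13 of the listing with an explicit stack. The class and helper names (StackSketch, ifIcmplt, ifIcmpgt, firstCheckReturnsTrue) are my own and purely illustrative; this only mimics the comparison semantics described above, it is not how the JVM is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackSketch {
    // Hypothetical helper: mimics if_icmplt by popping both operands and
    // reporting whether the branch would be taken.
    static boolean ifIcmplt(Deque<Integer> stack) {
        int value2 = stack.pop(); // pushed last (the constant)
        int value1 = stack.pop(); // pushed first (the local variable)
        return value1 < value2;
    }

    // Hypothetical helper: same idea for if_icmpgt.
    static boolean ifIcmpgt(Deque<Integer> stack) {
        int value2 = stack.pop();
        int value1 = stack.pop();
        return value1 > value2;
    }

    // Replays offsets 0-13. Returns true if the ireturn at offset 13 is
    // reached, false if either branch jumps away to offset 14.
    static boolean firstCheckReturnsTrue(char c) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push((int) c);                 // 0: iload_1
        stack.push(97);                      // 1: bipush 97
        if (ifIcmplt(stack)) return false;   // 3: if_icmplt 14
        // Both operands were consumed, so the stack is empty again
        // and the local variable has to be reloaded.
        stack.push((int) c);                 // 6: iload_1
        stack.push(122);                     // 7: bipush 122
        if (ifIcmpgt(stack)) return false;   // 9: if_icmpgt 14
        stack.push(1);                       // 12: iconst_1
        return stack.pop() == 1;             // 13: ireturn
    }

    public static void main(String[] args) {
        System.out.println(firstCheckReturnsTrue('q')); // true
        System.out.println(firstCheckReturnsTrue('Q')); // false
    }
}
```

Note how the stack never holds more than two values at once, which matches the stack=2 in the listing.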
Taking that into account, I believe the following code is executed:
public boolean isAlphabetic(char c) {
    if (c < 'a' || c > 'z') {
        if (c < 'A' || c > 'Z') {
            return false;
        }
    }
    return true;
}
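Since there is no official answer to check against, one cheap sanity check is to brute-force every possible char value and compare the reconstruction above with a hand transcription of the bytecode’s branches. The sketch below does that. The names VerifySketch, fromBytecode, likelySource and countMismatches are mine, and likelySource is only a guess at what the original source might have looked like, not something the challenge author confirmed.

```java
class VerifySketch {
    // The reconstruction from this post.
    static boolean isAlphabetic(char c) {
        if (c < 'a' || c > 'z') {
            if (c < 'A' || c > 'Z') {
                return false;
            }
        }
        return true;
    }

    // Hand transcription of the listing: each "if" is the fall-through
    // (branch not taken) side of the corresponding if_icmpXX.
    static boolean fromBytecode(char c) {
        if (c >= 97) {          // 3: if_icmplt 14 not taken
            if (c <= 122) {     // 9: if_icmpgt 14 not taken
                return true;    // 12-13: iconst_1, ireturn
            }
        }
        if (c >= 65) {          // 17: if_icmplt 28 not taken
            if (c <= 90) {      // 23: if_icmpgt 28 not taken
                return true;    // 26-27: iconst_1, ireturn
            }
        }
        return false;           // 28-29: iconst_0, ireturn
    }

    // Assumption: a one-liner the original author may have written.
    static boolean likelySource(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Counts inputs on which the three versions disagree. The loop
    // variable is an int so it can run past Character.MAX_VALUE and stop.
    static int countMismatches() {
        int mismatches = 0;
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            char c = (char) i;
            boolean expected = fromBytecode(c);
            if (isAlphabetic(c) != expected || likelySource(c) != expected) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches()); // mismatches: 0
    }
}
```

All three agree on every input, so the nested-if version and the single boolean expression describe the same behaviour: true for ASCII letters, false for everything else.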
-[ Conclusion ]
There was no verification for this challenge, as the author has decided to keep his solutions private (you can email him to verify a solution, but that is too much hassle).
I decided to publish my solution in order to get some feedback and verify the correctness, should somebody stumble upon this! I do not believe that this would spoil the challenge for the motivated beginners or reversers, so here goes!
As a parting token, the person who published these has also written an introduction to reversing for beginners! It might, or might not, help you; I haven’t read it!