Resisting decompilation of Rust's wasm with obfuscation

WebAssembly (WASM) has been around for a while now, and this time I want to talk about decompiling it.
I’ve heard that Wasm is often used for malicious purposes such as running malware. Also, although Wasm can use machine power more efficiently than JavaScript for some tasks, it is said that as of 2022 most DOM and UI operations in the browser are still slower in Wasm than in JS, so there is a possibility that Wasm will eventually fade away.

Is it safe to include security information in Wasm?

A compiled wasm file is binary data, but of course it can be decompiled back into human-readable code. If you put something like an important secret key in the wasm, it can be extracted. So, in conclusion, important information (DB connection information, secret keys for other services, etc.) should never be included in a wasm file.
On the other hand, extracting a piece of information does not always cause damage by itself, and I wondered whether I could at least hide values that I would rather not reveal. One example is a connection key for a public API that can be called without authentication. Of course, anyone who opens the Network tab in the browser inspector can see such a key as it is sent. But what if the wasm issued a one-time token, encrypted it, and sent it in an HTTP header? The higher the cost of an attack, the more likely it is that attackers will give up.
So in this article, I look into decompiling WebAssembly (WASM) built from Rust and obfuscating the strings inside it.

Wasm code in Rust

In this case, we will try to compile and use WASM with the following code.

extern crate wasm_bindgen;
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub fn secret(
    key: &str
) -> String {
    if key != "this is pass!" {
        return String::from("error");
    }
    return String::from("success")
}

It’s a silly piece of code: a wasm function called “secret” that returns “success” if the key passed in from outside is the string “this is pass!” and “error” otherwise. The function is called from JavaScript and returns its value to JS. Because the JS side has to import it by name, the function name “secret” is exposed.

// Function names are exposed by calling WASM in JS.
import init, {secret} from "./pkg/secret_wasm.js";
init().then(() => {
  secret(input)
});

On the other hand, the string “this is pass!” is stored inside the wasm binary, so it does not show up in the JS source, and at first glance you will not find it with the browser tools either.

Decompile Rust’s Wasm and strip it bare

I used WABT (the WebAssembly Binary Toolkit) as the decompiler.
https://github.com/WebAssembly/wabt

First, convert the binary into the wat text format (WABT’s wasm2wat tool produces output like this).

 # Excerpts
  (func (;40;) (type 1) (param i32 i32)
    nop)
  (table (;0;) 2 2 funcref)
  (memory (;0;) 17)
  (global (;0;) (mut i32) (i32.const 1048576))
  (export "memory" (memory 0))
  (export "secret" (func 8))
  (export "__wbindgen_add_to_stack_pointer" (func 30))
  (export "__wbindgen_malloc" (func 12))
  (export "__wbindgen_realloc" (func 14))
  (export "__wbindgen_free" (func 22))
  (elem (;0;) (i32.const 1) func 40)
  (data (;0;) (i32.const 1048576) "this is pass!errorsuccess\00\00\00\04"))

Yes, you can clearly see “this is pass!”, “error”, and “success”. You can also decompile the wasm file into C (the w2c_ prefixes below are what WABT’s wasm2c generates).

/*Excerpts*/
static void w2c_secret(u32 w2c_p0, u32 w2c_p1, u32 w2c_p2) {
  u32 w2c_l3 = 0, w2c_l4 = 0;
  FUNC_PROLOGUE;
  u32 w2c_i0, w2c_i1;
  w2c_i0 = w2c_p2;
  w2c_i1 = 13u;
  w2c_i0 = w2c_i0 == w2c_i1;
  ....
}

static const u8 data_segment_data_0[] = {
  0x74, 0x68, 0x69, 0x73, 0x20, 0x69, 0x73, 0x20, 0x70, 0x61, 0x73, 0x73,
  0x21, 0x65, 0x72, 0x72, 0x6f, 0x72, 0x73, 0x75, 0x63, 0x63, 0x65, 0x73,
  0x73, 0x00, 0x00, 0x00, 0x04,
};

You can see the secret function, and if you follow it, you can find data_segment_data_0. It is easy to guess that the hexadecimal numbers listed here are ASCII codes. If we convert them back to characters, we get “this is pass!errorsuccess”. And if you look at the places where the data_segment_data_0 constant is used, you can see that the offset of each string is specified, which means the strings in the Rust code are fully recoverable. The 13u compared near the top of w2c_secret also matches the length of “this is pass!”, which suggests a length check before the bytes are compared.
However, even when reading the decompiled C, I found it quite difficult to follow the logic. If code this simple is hard to understand, decompiling a real-world wasm module and understanding everything it does seems to be quite costly.
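
As a rough illustration of that conversion, the sketch below copies the bytes from the data_segment_data_0 excerpt and renders them as text. The helper name render_ascii is made up for this example, and replacing non-printable bytes with “.” is just one convenient convention.

```rust
/// Bytes copied from the `data_segment_data_0` excerpt above.
const DATA_SEGMENT: [u8; 29] = [
    0x74, 0x68, 0x69, 0x73, 0x20, 0x69, 0x73, 0x20, 0x70, 0x61, 0x73, 0x73,
    0x21, 0x65, 0x72, 0x72, 0x6f, 0x72, 0x73, 0x75, 0x63, 0x63, 0x65, 0x73,
    0x73, 0x00, 0x00, 0x00, 0x04,
];

/// Hypothetical helper: printable ASCII is kept, every other byte becomes '.'.
fn render_ascii(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|&b| if b.is_ascii_graphic() || b == b' ' { b as char } else { '.' })
        .collect()
}

fn main() {
    // Prints: this is pass!errorsuccess....
    println!("{}", render_ascii(&DATA_SEGMENT));
}
```

The three strings come out back to back with no separators, which is why the decompiled code has to refer to each one by offset and length.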

Can obfuscation be used to hide it?

In this article, we will try to obfuscate the string using Rust’s obfstr crate. It’s quite simple and can be used as follows.

extern crate wasm_bindgen;
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub fn secret(
    key: &str
) -> String {
    if key != obfstr::obfstr!("this is pass!") {
        return String::from("error");
    }
    return String::from("success")
}

I applied obfuscation to the “this is pass!” part. If you build it and decompile it again, it looks like this.

 # Excerpts
  (export "secret" (func 8))
  (export "__wbindgen_add_to_stack_pointer" (func 30))
  (export "__wbindgen_malloc" (func 12))
  (export "__wbindgen_realloc" (func 14))
  (export "__wbindgen_free" (func 22))
  (elem (;0;) (i32.const 1) func 40)
  (data (;0;) (i32.const 1048576) "errorsuccess\ce6\bd\a1{\1c\16YL\c8r]n\00\00\00\04"))

In the wat format, the unobfuscated “error” and “success” are still readable, while the obfuscated pass string has become an unintelligible run of bytes.

/*Excerpts*/
static const u8 data_segment_data_0[] = {
  0x65, 0x72, 0x72, 0x6f, 0x72, 0x73, 0x75, 0x63, 0x63, 0x65, 0x73, 0x73,
  0xce, 0x36, 0xbd, 0xa1, 0x7b, 0x1c, 0x16, 0x59, 0x4c, 0xc8, 0x72, 0x5d,
  0x6e, 0x00, 0x00, 0x00, 0x04,
};

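When decompiled to C and read as text, these bytes become “errorsuccessÎ6½¡{YLÈr]n” (the bytes outside the ASCII range are shown as Latin-1 characters, and the control bytes 0x1c and 0x16 are invisible). It is indeed obfuscated.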
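
To see what a casual search for readable text would find in this segment, here is a minimal sketch in the spirit of the Unix strings utility. The function name printable_runs and the minimum run length of 4 are assumptions made for this example only.

```rust
/// Bytes copied from the obfuscated `data_segment_data_0` excerpt above.
const DATA_SEGMENT: [u8; 29] = [
    0x65, 0x72, 0x72, 0x6f, 0x72, 0x73, 0x75, 0x63, 0x63, 0x65, 0x73, 0x73,
    0xce, 0x36, 0xbd, 0xa1, 0x7b, 0x1c, 0x16, 0x59, 0x4c, 0xc8, 0x72, 0x5d,
    0x6e, 0x00, 0x00, 0x00, 0x04,
];

/// Hypothetical helper: collect runs of printable ASCII that are at least
/// `min_len` bytes long, skipping everything else.
fn printable_runs(bytes: &[u8], min_len: usize) -> Vec<String> {
    let mut runs = Vec::new();
    let mut current = String::new();
    for &b in bytes {
        if b.is_ascii_graphic() || b == b' ' {
            current.push(b as char);
        } else {
            // A non-printable byte ends the current run.
            if current.len() >= min_len {
                runs.push(current.clone());
            }
            current.clear();
        }
    }
    if current.len() >= min_len {
        runs.push(current);
    }
    runs
}

fn main() {
    // Prints: ["errorsuccess"]
    println!("{:?}", printable_runs(&DATA_SEGMENT, 4));
}
```

Only the unobfuscated strings survive this kind of scan; the 13 obfuscated bytes break up into fragments too short to be reported.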

Incidentally, it is possible to recover this obfuscated string (see $crate::bytes::deobfuscate). Since the obfuscation rules are written out in the crate source, a skilled attacker is likely to identify the obfuscation method and decode the string without much trouble.
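
The general reason this kind of obfuscation is reversible can be sketched with a toy example. The code below is NOT obfstr’s actual algorithm; it assumes a simple XOR against a keystream produced by an xorshift32 step, and the names next_key and xor_with_keystream as well as the seed value are invented for illustration. The point is only that whatever the binary needs in order to decode the string at run time is also available to anyone who reads the binary.

```rust
/// Hypothetical keystream generator (one xorshift32 step per byte).
/// This is an illustrative stand-in, not the scheme used by obfstr.
fn next_key(state: &mut u32) -> u8 {
    *state ^= *state << 13;
    *state ^= *state >> 17;
    *state ^= *state << 5;
    (*state & 0xff) as u8
}

/// XOR every input byte with the next keystream byte.
/// Applying the function twice with the same seed restores the input.
fn xor_with_keystream(input: &[u8], seed: u32) -> Vec<u8> {
    let mut state = seed;
    input.iter().map(|&b| b ^ next_key(&mut state)).collect()
}

fn main() {
    // The seed plays the role of the key embedded in the compiled module.
    let seed = 0x1234_5678;

    // "Compile time": the plain string is replaced by scrambled bytes.
    let hidden = xor_with_keystream(b"this is pass!", seed);
    assert_ne!(hidden, b"this is pass!");

    // "Run time" (or an attacker who found the seed): the same call undoes it.
    let recovered = xor_with_keystream(&hidden, seed);
    assert_eq!(recovered, b"this is pass!");
}
```

An attacker does not even need to reimplement the routine: running the decompiled decoding code against the data segment would yield the same result.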

Obfuscating even further

How about splitting the string so that part of it goes through the String type and only the rest is obfuscated, as shown below?

#[wasm_bindgen]
pub fn secret(
    key: &str
) -> String {
    let this: String = String::from("this");
    if key != this + obfstr::obfstr!(" is pass!") {
        return String::from("error");
    }
    return String::from("success")
}

And the following is a part of the decompilation of WASM to C.

static const u8 data_segment_data_0[] = {
  0x65, 0x72, 0x72, 0x6f, 0x72, 0x73, 0x75, 0x63, 0x63, 0x65, 0x73, 0x73,
  0x2f, 0x23, 0x99, 0xb1, 0x6a, 0x6c, 0xa0, 0xbe, 0x43, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x63, 0x61, 0x6c, 0x6c,
  0x65, 0x64, 0x20, 0x60, 0x4f, 0x70, 0x74, 0x69, 0x6f, 0x6e, 0x3a, 0x3a,
  0x75, 0x6e, 0x77, 0x72, 0x61, 0x70, 0x28, 0x29, 0x60, 0x20, 0x6f, 0x6e,
  0x20, 0x61, 0x20, 0x60, 0x4e, 0x6f, 0x6e, 0x65, 0x60, 0x20, 0x76, 0x61,
  0x6c, 0x75, 0x65, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x6c, 0x69, 0x62, 0x72, 0x61, 0x72, 0x79, 0x2f, 0x73, 0x74, 0x64, 0x2f,
  0x73, 0x72, 0x63, 0x2f, 0x70, 0x61, 0x6e, 0x69, 0x63, 0x6b, 0x69, 0x6e,
  0x67, 0x2e, 0x72, 0x73, 0x6c, 0x00, 0x10, 0x00, 0x1c, 0x00, 0x00, 0x00,
  0xeb, 0x01, 0x00, 0x00, 0x1f, 0x00, 0x00, 0x00, 0x6c, 0x00, 0x10, 0x00,
  0x1c, 0x00, 0x00, 0x00, 0xec, 0x01, 0x00, 0x00, 0x1e, 0x00, 0x00, 0x00,
  0x06, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00,
  0x07, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
  0x08, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00,
  0x0a, 0x00, 0x00, 0x00, 0x0b, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00,
  0x04, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
  0x08, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x0d, 0x00, 0x00, 0x00,
  0x6c, 0x69, 0x62, 0x72, 0x61, 0x72, 0x79, 0x2f, 0x61, 0x6c, 0x6c, 0x6f,
  0x63, 0x2f, 0x73, 0x72, 0x63, 0x2f, 0x72, 0x61, 0x77, 0x5f, 0x76, 0x65,
  0x63, 0x2e, 0x72, 0x73, 0x63, 0x61, 0x70, 0x61, 0x63, 0x69, 0x74, 0x79,
  0x20, 0x6f, 0x76, 0x65, 0x72, 0x66, 0x6c, 0x6f, 0x77, 0x00, 0x00, 0x00,
  0xf0, 0x00, 0x10, 0x00, 0x1c, 0x00, 0x00, 0x00, 0x18, 0x02, 0x00, 0x00,
  0x05, 0x00, 0x00, 0x00, 0x0f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x01, 0x00, 0x00, 0x00, 0x10,
};

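The nine bytes that follow “errorsuccess” (0x2f through 0x43) correspond to the nine-character obfuscated “ is pass!”, and the plain “this” does not appear in this excerpt at all. Most of the readable bytes in the rest of the segment are standard-library panic messages and source paths (“called `Option::unwrap()` on a `None` value”, “library/std/src/panicking.rs”, “library/alloc/src/raw_vec.rs”, “capacity overflow”), apparently pulled in by the extra String handling rather than by the secret itself.
I think it is quite difficult to work out from this what rules were used for the obfuscation. If the Rust code also mixed in raw byte arrays in addition to String, or obfuscated the string n characters at a time, I honestly think recovering the original string from the data alone would be close to impossible. I’m sure some attackers could still figure out the obfuscation rules by reading the decompiled code, but it would take an awful lot of time.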

Wasm obfuscation is effective enough

As shown above, obfuscating wasm seems to be effective enough against all but a few very skilled attackers. In addition, the JS that loads the wasm can be bundled with a tool such as webpack into a minified or obfuscated form, which adds another layer of cover.
However, it is impossible to completely prevent decompilation from exposing the logic, or to prevent the strings embedded in the module from being decoded. A determined attacker who takes the time could still pick the wasm apart. So don’t include anything in the wasm source that you cannot afford to leak.
On the other hand, attackers (crackers) don’t have unlimited time either, so cost effectiveness is definitely a consideration for them. If we obfuscate as much as we can, and an attacker has to spend a lot of time and money to get just a few hints with almost no return, the system will drop off the target list. Just as a house known to never lock its doors is the easiest target for burglars, a system that can be hacked easily is the one that gets targeted.