Suddenly Shellcode Part 1: a puzzle on Twitter led me to assembly

One of the better things about the internet is the unlimited scope for learning new things, including things you didn’t set out to discover. Even better, sometimes people who know a great deal about their chosen topic will help you along the way. This is the story of how my efforts to grok an arcane Bash one-liner led me out of my comfort zone to directly confront shellcode and assembly language for the first time.

So there I was, drinking coffee and aimlessly scrolling Twitter, which I’m pretty sure counts as a protected religious practice under Section 2a of the Canadian Charter of Rights and Freedoms. At some point, I saw a tweet from Google Project Zero researcher Tavis Ormandy. What if you Wile E. Coyote yourself and use chmod to remove all permissions from chmod?

Your computer will thank you for never running these commands

One response by Erik Bosman caught my eye. On a good day, I consider myself pretty handy with Bash, but what. is. this?

I think I know what this does, but I don’t yet have any idea what it is

I know this must somehow restore useful permissions to chmod. After all, no-one has corrected Erik, and this is the internet. I more or less understand the bits of it which look like Bash, but what’s encoded there? What’s that argument to yes? How does this one-liner work overall?

Here it is with some newlines to make it a bit easier to read:

cd /proc/$$;
exec 3>mem;
(base64 -d<<<6xNfanteweYC/8ZqWlgPBWo8WA8F6Oj///8vYmluL2NobW9kAOvZ; \
yes $'\xeb\xfc'|tr -d '\n')| \
dd bs=1 seek=$((0x$(grep vdso -m1 maps|cut -f1 -d-)))>&3

Heh, the syntax highlighter on that code block doesn’t get it either.

I’m going to work through this roughly in order, and I’ll try not to double back too much, but I make no promises. As will become abundantly clear, this is not a tutorial but more an exposition of me not knowing stuff. Enjoy.

How would you describe yourself?

The first command changes directory to /proc/$$, which is the process information pseudo-filesystem entry for the current process, i.e. this Bash shell. The Linux Documentation Project describes the files in /proc as a window into the kernel, and elaborates that files in proc don’t actually contain any data; they just act as pointers to where the actual process information resides.
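As a quick illustration of what lives in there, here is a minimal sketch, assuming a Linux system with /proc mounted. The variable name shell_exe is mine and not part of the one-liner:

```shell
# $$ expands to the PID of the current shell.
echo "shell PID: $$"

# exe is a symlink to the binary this process is running.
shell_exe=$(readlink "/proc/$$/exe")
echo "running: $shell_exe"

# mem and maps, the two files the one-liner uses, live alongside it.
ls -l "/proc/$$/mem" "/proc/$$/maps"
```

Running it in an interactive Bash session should report the path to bash itself.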

Redirection section

As with fractious children, so with computers: redirection is often a great shortcut to the desired result.

exec 3> mem opens mem for writing on a new file descriptor (handle) 3, and that descriptor is then used at the end of the one-liner to patch mem:

(bunch of stuff)>&3

A simple demo of this technique:

$ exec 4>foo; (echo "bar")>&4
$ cat foo
bar

The example in this one-liner is just using dd to be a little more surgical. We’ll come back to dd later.

Base in your face

Knowing what’s been encoded in base64 is always going to be worthwhile, so let’s start there. If this was a real-world problem, I’d try to find instances of that base64 string online to see if it’s popular/well-known, but I’m going to see what I can do by myself instead.

base64 -d<<<6xNfanteweYC/8ZqWlgPBWo8WA8F6Oj///8vYmluL2NobW9kAOvZ
�_j{^����jZXj<X�����/bin/chmod��

Can’t say I’m surprised that it’s mostly binary something-or-other. xxd will at least let us see what’s in there:

base64 -d<<<6xNfanteweYC/8ZqWlgPBWo8WA8F6Oj///8vYmluL2NobW9kAOvZ | xxd
00000000: eb13 5f6a 7b5e c1e6 02ff c66a 5a58 0f05  .._j{^.....jZX..
00000010: 6a3c 580f 05e8 e8ff ffff 2f62 696e 2f63  j<X......./bin/c
00000020: 686d 6f64 00eb d9                        hmod...
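For comparison, here is a rough sketch that puts the first four bytes of the payload next to the first four bytes of a normal executable. It assumes /bin/sh is an ELF binary, as it is on most Linux systems; the variable names are only for illustration:

```shell
payload_b64='6xNfanteweYC/8ZqWlgPBWo8WA8F6Oj///8vYmluL2NobW9kAOvZ'

# First four bytes of the decoded payload, as hex.
payload_magic=$(printf '%s' "$payload_b64" | base64 -d | head -c 4 | od -An -tx1 | tr -d ' \n')

# First four bytes of a regular executable, for comparison.
# Assumption: /bin/sh is an ELF binary, which starts with 7f 45 4c 46.
sh_magic=$(head -c 4 /bin/sh | od -An -tx1 | tr -d ' \n')

# Total size of the decoded payload in bytes.
payload_len=$(printf '%s' "$payload_b64" | base64 -d | wc -c)

echo "payload: $payload_magic ($payload_len bytes)"
echo "/bin/sh: $sh_magic"
```

At 39 bytes with no recognisable header, the payload is clearly not a standalone executable file.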

There might be clues here I’m missing, but apart from /bin/chmod I don’t know what to make of this. There’s no ELF header, and I couldn’t google up any significance to the opening bytes. Let’s park this, and see what we can figure out about the rest of the one-liner.

Say yes to the address

yes will spam whatever string you give it, one copy per line, forever. What’s that argument to yes? It’s a pair of hex-escaped bytes, written with Bash’s $'...' quoting. For example:

echo $'\x41\x44\x41\x4D'
ADAM
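To see the actual bytes involved, here is a small sketch that dumps what yes emits, with and without the tr -d '\n' stage from the one-liner. It assumes Bash (for the $'...' quoting) and the usual head and od tools; the variable names are mine:

```shell
# The two bytes exactly as Bash hands them to yes.
pair=$(printf '%s' $'\xeb\xfc' | od -An -tx1 | tr -d ' \n')
echo "$pair"            # ebfc

# yes appends a newline (0a) after every copy ...
with_newlines=$(yes $'\xeb\xfc' | head -c 6 | od -An -tx1 | tr -d ' \n')
echo "$with_newlines"   # ebfc0aebfc0a

# ... which tr -d '\n' strips, leaving an unbroken run of eb fc.
stream=$(yes $'\xeb\xfc' | tr -d '\n' | head -c 8 | od -An -tx1 | tr -d ' \n')
echo "$stream"          # ebfcebfcebfcebfc
```

So whatever those two bytes mean, the tr stage is there to make sure nothing else gets mixed in between them.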

So what’s the significance of '\xeb\xfc' repeated over and over?

I didn’t know, and I couldn’t figure it out. I did receive an answer later though, so keep reading. For now, let’s move back to the contents of /proc/$$ as referenced by the one-liner: mem and maps.

0x marks the spot

/proc/$$/mem points at memory held by this process. However, we can’t just look straight at this file with the usual everyday tools:

cat mem
cat: mem: Input/output error

Let’s look at man proc:

This file can be used to access the pages of a process’s memory through open(2), read(2), and lseek(2).

proc man page

So mem can be read but I’m still not sure how without writing some C or Python or something – and the man page doesn’t seem to explain why there’s an I/O error.

I found what appears to be a high quality StackExchange answer describing mem and maps. I’ve used maps before, but only to figure out what was hogging all the hugepages on a production system during a resource crunch; I’ve not yet tried to look directly at the memory contents, still less change them.

We get an explanation for the I/O error above:

Since the first page in a process is never mapped (so that dereferencing a NULL pointer fails cleanly rather than unintendedly accessing actual memory), reading the first byte of /proc/$pid/mem always yield an I/O error.

“How do I read from /proc/$pid/mem under Linux?” on StackExchange by Gilles

We also get a reminder that maps is essentially a directory listing for what lives where within mem:

The way to find out what parts of the process memory are mapped is to read /proc/$pid/maps. This file contains one line per mapped region.

“How do I read from /proc/$pid/mem under Linux?” on StackExchange by Gilles

In the case of my Bash shell, the whole of maps is kinda long at ~50 lines, but the last entry describes the location in memory of vdso, which was mentioned in the original Bash command.

We’re only interested in the [first] line within maps which describes the location of vdso in mem, and specifically the field before the first hyphen, which is the starting offset of vdso within mem:

% grep vdso -m1 maps|cut -f1 -d-
7ffe8fdf7000
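Here is a minimal sketch of how that field turns into a number dd can use. The sample line reuses the start address shown above, but the end address and the other fields are assumed purely for illustration, and the variable names are mine:

```shell
# Illustrative maps line: the start address is the one shown above,
# the end address and remaining fields are assumed for the example.
# On a live system: line=$(grep -m1 vdso /proc/$$/maps)
line='7ffe8fdf7000-7ffe8fdf9000 r-xp 00000000 00:00 0                  [vdso]'

start_hex=$(printf '%s\n' "$line" | cut -f1 -d-)   # text before the first hyphen
range=${line%% *}                                  # "start-end"
end_hex=${range#*-}                                # text after the hyphen

# $((0x...)) makes the shell read the digits as hexadecimal and
# yields a plain decimal number, suitable for dd's seek=.
start=$((0x$start_hex))
end=$((0x$end_hex))
size=$((end - start))

echo "vdso starts at byte offset $start and spans $size bytes"
```

The one-liner only needs the start address; the size is computed here just to show how small the region is.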

Vee Dee Ess Whatnow?

vdso stands for virtual dynamic shared object:

a small shared library that the kernel automatically maps into the address space of all user-space applications. Applications usually do not need to concern themselves with these details as the vDSO is most commonly called by the C library. This way you can code in the normal way using standard functions and the C library will take care of using any functionality that is available via the vDSO.

vdso man page

So the kernel is exposing something into userspace. Hmm. Why does the vDSO exist at all? The man page has you covered:

There are some system calls the kernel provides that user-space code ends up using frequently, to the point that such calls can dominate overall performance. This is due both to the frequency of the call as well as the context-switch overhead that results from exiting user space and entering the kernel.

The rest of this documentation is geared toward the curious and/or C library writers rather than general developers. If you’re trying to call the vDSO in your own application rather than using the C library, you’re most likely doing it wrong.

vdso man page

Solid advice in general, no doubt, but in this context it all makes sense: it looks like the one-liner sneakily patches something into vdso, an area of memory which handles frequently used system calls but which is accessible from within userspace. Part of me is wondering how security for vdso is handled, but at least I know at a high level what vdso is now, and where to find it in the context of the memory of this particular Bash shell instance.

Review

Let’s take another look at the one-liner to see what we know – and what we still don’t:

cd /proc/$$;
exec 3>mem;
(base64 -d<<<6xNfanteweYC/8ZqWlgPBWo8WA8F6Oj///8vYmluL2NobW9kAOvZ; \
yes $'\xeb\xfc'|tr -d '\n')| \
dd bs=1 seek=$((0x$(grep vdso -m1 maps|cut -f1 -d-)))>&3

So:

  1. Change to the proc directory for this process (the Bash shell we’re using)
  2. Open a new handle 3 to interact with mem, which is a pointer to the memory used by this process
  3. Write the mysterious payload one byte at a time into mem, starting at the address of vdso and overwriting the vdso entirely
  4. vdso is a small shared library containing frequently-called kernel functionality, mapped into userspace to avoid the performance cost of context-switching into the kernel
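Steps 2 and 3 can be tried safely on an ordinary file. The following is a sketch rather than a faithful copy of the one-liner: the scratch file, the offset of 4 and the xyz bytes are all made up for the demo, and it opens the file with <> instead of > because > would truncate a regular file.

```shell
# Scratch file standing in for mem (hypothetical demo target).
scratch=$(mktemp)
printf 'AAAAAAAAAA' > "$scratch"

# Open it on descriptor 3. <> is used instead of > because > would
# truncate an ordinary file before we get to patch it.
exec 3<>"$scratch"

# dd seeks 4 bytes into its standard output, then writes the bytes
# from the pipe there one at a time, leaving everything else alone.
printf 'xyz' | dd bs=1 seek=$((0x4)) 2>/dev/null >&3

exec 3>&-              # close the descriptor
patched=$(cat "$scratch")
rm -f "$scratch"
echo "$patched"        # AAAAxyzAAA
```

Because dd is given no of= argument, it writes to the descriptor it inherited from the shell, which is how the one-liner gets its bytes into mem at the chosen address.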

It’s the mysterious payload which has me stumped. Time to ask the internet and see if anyone is kind enough to help. Turns out someone is indeed kind, and not just someone, but OP Erik Bosman, who I subsequently realized has significant form in this area.

I asked, and the internet answered

Thank you Erik. A quick reminder for the rest of us:

A small piece of code, used as the payload of a virus or other malware, that launches a shell so that the attacker can control the compromised computer.

Wiktionary, “shellcode”

The term can encompass similar code which does something else useful to an attacker, but in general it makes sense: get shell, perform arbitrary operations on compromised host.

I still want to understand exactly what’s in that little string of binary though. Erik mentions that it’s [compiled for] x86-64, which makes sense because that’s going to cover most modern Linux systems.

After a few minutes of reading, I installed ndisasm, aka the Netwide Disassembler. I fed it the little bit of binary – the decoded Base64 message from the Bash one-liner – and it presented me with something which is at least human-legible:

$ ndisasm -b 64 -p intel <( base64 -d<<<6xNfanteweYC/8ZqWlgPBWo8WA8F6Oj///8vYmluL2NobW9kAOvZ )
00000000  EB13              jmp short 0x15
00000002  5F                pop rdi
00000003  6A7B              push byte +0x7b
00000005  5E                pop rsi
00000006  C1E602            shl esi,byte 0x2
00000009  FFC6              inc esi
0000000B  6A5A              push byte +0x5a
0000000D  58                pop rax
0000000E  0F05              syscall
00000010  6A3C              push byte +0x3c
00000012  58                pop rax
00000013  0F05              syscall
00000015  E8E8FFFFFF        call 0x2
0000001A  2F                db 0x2f
0000001B  62                db 0x62
0000001C  696E2F63686D6F    imul ebp,[rsi+0x2f],dword 0x6f6d6863
00000023  6400EB            fs add bl,ch
00000026  D9                db 0xd9

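Is this correctly disassembled though? For the instructions, yes! The lines from 0x1a onward are another matter: those bytes include the /bin/chmod string we saw in the xxd output, which ndisasm has gamely decoded as if it were code. Poking about on Twitter, I realized at this point that I’d missed a quote-tweet by Anisse, who provided not only analysis of the shellcode from Erik’s one-liner, but also a review of a significant number of other sneaky solutions to the original chmod puzzle posed by Tavis.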

Am I going to try to understand how this code actually does what it does? Yes, absolutely… but that’ll be Part 2.

This Post Has 2 Comments

  1. Nick

    V. interesting article – shall forward to part 2

    1. Adam Barnett

      Thanks Nick! Another thing I don’t know enough about is firewalls, so I’ll be checking out your site as well.
