Tangent is a bot I've been working on and testing for about a week now, and I think it's ready for public use. There are a lot of reasons why allowing anyone to execute arbitrary code is a terrible idea, so extra care must be given when planning.

Think you can break it? Try it out here: https://discord.gg/F2F2EdE
Github: https://github.com/PixelToast/tangent

My design constraints were the following:

  1. Arbitrary native code execution, not just a sandboxed interpreted language like the OpenComputers mod for Minecraft.
  2. Limited internet access: people should not be able to use my internet connection for nefarious purposes.
  3. Limited CPU, memory, and disk usage.
  4. Automatic recovery: nothing a user can do should put the bot into an unrecoverable state, whether it be a fork bomb, filling the filesystem, or killing all processes.
  5. The Discord bot should limit buffers on anything the VM sends: you should not be able to spam files / data / process events to the bot and cause it to run out of memory.
  6. Discord should not be trusted: if the Discord server or my account were compromised, attackers should not be able to gain access to my server.

The most common method for sandboxing is restricting methods at the language level, as in my old Lua sandbox. This is incredibly language-specific, and some languages like C simply cannot be sandboxed this way.

Another common method is chroot, a Unix system call (and a command of the same name) that changes the root directory of the running program, but it can easily be broken out of if you aren't careful.

Docker does everything I need in terms of sandboxing with little effort, including limiting resources, custom network routing, etc. However, like chroot, it isolates processes using kernel features (namespaces and cgroups) rather than virtualization, meaning sandboxed applications share a kernel with the host system.

Sharing a kernel with the host makes a container more vulnerable to side-channel attacks and nasty privilege escalation than a hypervisor-based virtual machine. That is not something I would feel safe leaving running publicly for a long period of time, especially with the number of hardware exploits being discovered on x86 CPUs lately.

Docker also doesn't support qcow2 snapshots, meaning resetting a container's state is much slower than with a qemu / libvirt based machine. That is very important for handling people continuously bricking the VM as a denial-of-service attack.

It seems pretty clear now that a full Linux VM in a hypervisor would be much better than a container for what I'm trying to do.

My solution

Right now, Tangent uses libvirt to manage a qemu-powered Debian 9 virtual machine and communicates with it through a custom JSON-RPC-like protocol.

tangent-server contains the server that runs as an unprivileged user on the VM, allowing the bot on the host machine to start processes, read stdin/stdout, and access files safely.
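
To make the idea of a JSON-RPC-like protocol more concrete, here is a minimal Python sketch of how a host might frame a request and pair the reply with it. This is not Tangent's actual wire format (the bot itself is written in Dart): the newline-delimited framing, the `id` / `method` / `params` / `result` field names, and the `proc.start` method name are all assumptions made for illustration.

```python
import itertools
import json

# Hypothetical message shape: one JSON object per line, with an id used to
# pair each response with the request that caused it.
_next_id = itertools.count(1)


def encode_request(method, params):
    """Return (request_id, wire_bytes) for one newline-terminated request."""
    request_id = next(_next_id)
    message = {"id": request_id, "method": method, "params": params}
    return request_id, json.dumps(message).encode("utf-8") + b"\n"


def decode_response(line, pending):
    """Parse one response line and return (method, result).

    `pending` maps request ids to the method name still awaiting a reply.
    Raises ValueError for anything that is not a well-formed reply to a
    request the host actually sent.
    """
    try:
        message = json.loads(line)
    except ValueError as exc:
        raise ValueError("malformed packet") from exc
    request_id = message.get("id") if isinstance(message, dict) else None
    if not isinstance(request_id, int) or request_id not in pending:
        raise ValueError("unexpected response")
    return pending.pop(request_id), message.get("result")
```

Because only the host creates ids, a reply with an unknown id can be rejected outright, which fits the rule that the VM never initiates anything on its own.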

In its current configuration, the VM is on a closed virtual network where only the host machine (192.168.69.1) is routable. iptables is set up so that connections from the VM to the host are blocked, preventing it from reaching SSH or other services it should not have access to.

The only way for information to go in and out of the VM is through connections initiated by the host.

System resources are also heavily limited: 1 thread, 256MB of memory, and a 16GB virtual disk.

If you manage to put the VM in an unusable state, for example by continuously killing the server process, Tangent will automatically use virsh to reboot the VM, which only takes around 4 seconds.

As a last resort, if someone obtains root and bricks the system, the qclean command restores the system to a brand-new state from a qcow2 snapshot. This is actually much faster than rebooting the VM.
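
The two recovery levels can be sketched roughly as follows. This is a Python illustration rather than the bot's real Dart code, and it makes several assumptions: the domain name, the snapshot name `clean`, and the `is_healthy` callback (which would need to wait for the in-VM server to answer, with a timeout) are all hypothetical. `virsh reboot` and `virsh snapshot-revert` are real virsh subcommands, but how qclean invokes them is not described here.

```python
import subprocess


def recover(domain, is_healthy, run=subprocess.run, snapshot="clean"):
    """Try progressively heavier recovery steps until the VM responds.

    `is_healthy` is a caller-supplied callable returning True once the
    in-VM server answers again. Returns the name of the step that worked,
    "none-needed" if the VM was already fine, or None if nothing helped.
    """
    steps = [
        ("reboot", ["virsh", "reboot", domain]),
        # Hypothetical snapshot name; reverting discards all guest changes.
        ("snapshot-revert", ["virsh", "snapshot-revert", domain, snapshot]),
    ]
    if is_healthy():
        return "none-needed"
    for name, argv in steps:
        run(argv, check=True)  # raises CalledProcessError if virsh fails
        if is_healthy():
            return name
    return None
```

Passing `run` as a parameter keeps the escalation logic testable without a real hypervisor.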

The bot itself is a Dart application designed to be as fault tolerant as possible: all buffers that the VM can send data to are capped, any malformed packet instantly terminates the connection, and all of the async wrappers for files and processes are destroyed properly when closed.
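
As a rough illustration of capped buffers and fail-fast parsing, here is a small Python sketch (again, not the bot's Dart code). The 64 KiB cap, the newline-delimited JSON framing, and the `BoundedReader` / `ProtocolError` names are assumptions; the real limits and framing are not stated in this post.

```python
import json

MAX_BUFFER = 64 * 1024  # assumed cap, chosen only for the example


class ProtocolError(Exception):
    """Raised when the peer sends oversized or malformed data."""


class BoundedReader:
    """Accumulates bytes from the VM and returns complete JSON messages."""

    def __init__(self, limit=MAX_BUFFER):
        self.limit = limit
        self.buffer = b""

    def feed(self, chunk):
        """Add a chunk and return the list of messages it completed.

        Raises ProtocolError if any line (finished or still pending) is
        longer than the cap, or if a finished line is not valid JSON. The
        caller is expected to drop the connection in either case.
        """
        self.buffer += chunk
        # Everything before the last newline is complete; the rest waits.
        *lines, self.buffer = self.buffer.split(b"\n")
        if len(self.buffer) > self.limit or any(len(l) > self.limit for l in lines):
            raise ProtocolError("buffer limit exceeded")
        messages = []
        for line in lines:
            try:
                messages.append(json.loads(line))
            except ValueError as exc:
                raise ProtocolError("malformed packet") from exc
        return messages
```

The point of raising instead of trying to resynchronize is that a misbehaving VM gets disconnected immediately, so it cannot keep feeding the host data it has to hold on to.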

Dart is especially good for this job because of its powerful and safe async library, which eliminates a lot of the corner cases and concurrency problems you usually get when writing asynchronous code.

Where the fun starts

So far I've installed the SDKs of over 50 languages to the VM, including:
sh, bash, ARM assembly, x86 assembly, C, C++, Lua 5.3/5.2/5.1, LuaJIT, Python 2/3, JavaScript, Perl, Java, Lisp, Brainfuck, C#, F#, Haskell, PHP, COBOL, Golang, Ruby, APL, Prolog, OCaml, SML, Crystal, Ada, D, Groovy, Dart, Erlang, FORTH, Pascal, Fortran, Hack, Julia, Kotlin, Scala, Swift, TypeScript, Verilog, WebAssembly, Scheme, AWK, Clojure, TI-BASIC, Batch, Racket, Rust. Over 12GB of packages!
Each has a bot command that compiles and runs a single file.
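
One simple way to organize per-language commands like these is a lookup table from language to file name and build steps. The Python sketch below shows the idea; the table contents, the file names, and the `plan_run` helper are hypothetical and not taken from Tangent's source.

```python
# Hypothetical table: language name -> (source file name, shell steps).
LANGUAGES = {
    "c": ("main.c", ["gcc main.c -o main", "./main"]),
    "python3": ("main.py", ["python3 main.py"]),
    "lua": ("main.lua", ["lua main.lua"]),
    "rust": ("main.rs", ["rustc main.rs -o main", "./main"]),
}


def plan_run(language, source):
    """Return (file_name, source, shell_command) for a single-file program.

    Steps are joined with && so a failed compile stops before the run step.
    Raises KeyError for a language the table does not know.
    """
    file_name, steps = LANGUAGES[language]
    return file_name, source, " && ".join(steps)
```

With a table like this, adding a language only means adding one entry. The host would then write the file into the VM and start the returned command there.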

Here are some examples:

But wait there's more

You can upload and download files to it!

In conclusion

It was a fun project to work on, and I hope somebody finds a good use for it.
Back to working on my new game.

Discord: https://discord.gg/F2F2EdE