Monday, 13 June 2016

Local File Inclusion - why everything you ever read about file uploads is probably wrong

Allowing people to load content onto your server potentially allows them to deploy code which will be executed by your server. Your server can easily be pwned, leading to your work being destroyed/stolen and the website used for spamming, phishing, DOS attacks.....its a bad thing.

Where I've implemented file uploads, or have been asked how someone else should do it, I usually give this advice:
  1. Ensure that uploaded content is held outwith the document root, failing that, in a directory configured with "php_flag engine off"
  2. Only use filenames with safe characters - [a-z], [A-Z], [0-9], .-_
    (OWASP, after detailling why you should not use blacklisting as a preventative measure, offer a black list of characters :) )
  3. Only allow content with a whitelisted extension to be uploaded
  4. Check the mimetype of uploads - using PHP's mime_content_type(), not the value supplied by the user
  5. Preferably convert the content to a different format (then back to the original format if required)
This looks like Security-in-depth, i.e. applying more protection than is actually required. I've always been somewhat skeptical of this as a justification for effort and expense (often in the context where in Security in breadth is sadly lacking). Certainly ticking each of the boxes in the list above is not always practical. But relying on ones own knowledge and understanding is a dangerous conceit.

I am indebted to wireghoul for pointing out to me that there is a rather subtle attack which can be exploited for LFI on Apache systems.

Hopefully you know that the PHP interpreter will run the PHP code embedded in any file presented to it. To give a simple example:
  1. amend your webserver config to send png files to the PHP handler, e.g. by creating a .htaccess file containing

     AddHandler application/x-httpd-php .png

  2. write your PHP code, and then append it to a valid png file....

        echo '<?php mail("root@localhost", "hello", "from the inside");' >>image.png

  3. Point your web browser at the URL for image.png, and the picture renders in your browser, just as you would expect....but the PHP code executes too. No surprises so far. After all, if someone can reconfigure your webserver, they don't need to bother with the LFI - they've already pwned the system.
But it is also possible to get the webserver to execute PHP code embedded in an image file without changing the handler!

This comes about due to way mod_mime infers mimetypes from extensions

mod_mime will interpret anything in the name beginning with a . as an extension. From the linked documentation, note the paragraph

   Care should be taken when a file with multiple extensions gets associated with both a media-type and a handler. This will usually result in the request being handled by the module associated with the handler. For example, if the .imap extension is mapped to the handler imap-file (from mod_imagemap) and the .html extension is mapped to the media-type text/html, then the file world.imap.html will be associated with both the imap-file handler and text/html media-type. When it is processed, the imap-file handler will be used, and so it will be treated as a mod_imagemap imagemap file.

The mod_mime documentation authors go on to explain that this is the default behaviour for handlers, but that it is possible to ensure that only the last extension is used for choosing the handler with a FilesMatch block around the SetHandler directive.
So if we name our uploaded file image.php.png then we should be able to get our PHP code executing on a server.

Indeed, guess what method PHP recommend for configuring your server? (as implemented by Redhat, Suse, Ubuntu and others).

Let's try it out!

In the earlier example, we appended the PHP code to the end of the file. When I tried this again with the double extension, I got a lot of unusual characters in my browser. The Content-type was PHP's default test/html not image/png. So instead, the PHP code needs to prefix the image and set the content-type header:

    ( echo -n  '<?php  header("Content-Type: image/png");
      mail("root@localhost", "hello", "from the inside");
      ?>' ; cat original_image.png ) >> image.php.png
Point your browser at the URL, and once again, PHP code executes, the image renders but this time using the default config on most webservers.

The double extension is known but not by many.

It's not just me. I had a look around the web at advice being given about PHP file uploads - and a lot of it is wrong.

All the articles I've looked at claim that it only works where the extensions *other* than '.php' are unknown to the browser. As demonstrated here. this is NOT true.

Validating the mimetype won't protect against this kind of attack; although it would detect the file I created in the example, it won't work if the PHP code is embedded somewhere other than the start of the file. If it is contained in an unreachable block or as EXIF, then the code will execute, the file will be a valid image (based on its extension) and a mime check (which reads the first few bytes of the file) will say its an image. It will only render as an image when served up by the webserver if output buffering is enabled by default - but this will not often be a problem for an attacker.

FAQ - How do I build a LAMP cluster

Once again, someone just asked this question on a forum - this time on LinkedIn, but it crops out often in other places like serverfault.

There are so many variables that you are unlikely to get the right answer without supplying a huge amount of detail. Facebook run a fault-tolerant, scalable architecture using Linux, Apache, PHP and MySQL (and lots of other things). Do you think their infrastructure and applications could be answered in a post? Do you think their architecture is appropriate for a backoffice supplies management database?

At the other end of the spectrum, even with just 2 machines there are a lot of different ways to connect them up.

The first step in the battle is knowing what questions you should be asking - at least to yourself.

How do you replicate content across multiple webservers?

HTTP is stateless - so as long as webservers have access to the same data, you have redundancy. For TLS it helps performance to have a shared store for session keys (although there are ways to store session keys on the client).
PHP code shouldn't change very frequently but for complex applications, managed deployments are a requirement (or significant investment and skills in how to avoid having to coordinate the deployment). Sometimes rsync is enough. Sometimes you need a multi-site SAN. Sometimes you need a custom content management system written from scratch.

Do you host the system yourself? 

If so, then you have a lot of flexibility over how you configure the cluster and communication between nodes - make sure you have more than one IP address though. But do you have the skills to design and manage the cluster? Moving the service into the cloud solves a lot of problems (or rather makes them someone else's problems) but creates new ones for you.

How frequently is data updated?

...and must each node be exactly in sync? If not, how much lag can you tolerate?

Do you use PHP sessions?

These are (frequent) data updates. They need to be shared across the cluster.

How much can you spend?

Employment a good consultant is going to cost you money. But it should be money well spent. If you need to have your appendix removed, would you look for the lowest bid? While it seems that it is possible to outsource some development work like this, I'm not convinced it's the right way to plan an IT architecture.

How much downtime can you afford?

Both scheduled and un-scheduled (i.e. planned and accidental). 
In my experience, once you move past a single machine hosting your site, there is surprisingly little correlation between investment and availability. But planning for how the service will degrade when components start failing is always good practice. Knowing that there is scope for at least scheduled maintenance windows does expand the horizon of possibilities.

Do you need to split the cluster across data centres for redundancy?

While it would be nice to design an architecture which can scale from a couple of machines to multiple data-centres, the likely outcome of this would be a very expensive, and hard to manage solution running on a pair of servers. Even if we were all running the same publish-only application, the right architecture changes according to the scale of the operation. 

How scalable do you want the system to be?

Continuing the previous point - maybe you should be planning further ahead than just the next rung on the ladder.

Where are your users/customers/datacentre?

Geography matters when it comes to performance.

What is the current volume of writes on the database?

There are 2 well defined solution for MySQL replication - synchronous and asynchronous. The latter is simple, available out of the box but only works for 2 nodes accepting writes from the outside world. Due the replication being implemented in a single execution thread there are potential issues with capacity where there are a large proportion of writes.

What network infrastructure is between the server(s) and clients?

It may aready be capable of supporting some sort of load balancing. But don't discount Round Robin DNS - there is a HUGE amount of FUD about RRDNS on the internet. It's mostly completely wrong. But there benefits to using other approaches - while RRDNS solves a problem for clients connecting via HTTP, its not a good way to manage redundancy between your PHP and MySQL.

What is your content caching policy?

Caching is essential for web usability. But the default ETags setup for Apache is far from optimal for a loose cluster of webservers. There are different approaches to managing caching effectively and (if immutable) these may impact your architecture.

What regulatory frameworks apply?

Even where there are no explicitly obligations (such as PCI-DSS) as soon as you start accepting and process data from users you have a duty of care to them. The type of data you are collecting and what you do with it has an important bearing on the security and therefore the architecture of the service.

The wrong questions

How do I build a LAMP cluster

Obviously - the point of this post is to explain why that is the wrong question to ask.

The number of users

This has very little to do with the infrastructure required. The number of concurrent sessions is only marginally better. When you start talking about the number of HTTP requests, the split between static and dynamic content, and the rate of data updates then your getting closer to the information required to inform a choice of architecture.

Friday, 24 October 2014

Nouvea? Retch.

I often wonder what heinous crimes I committed in a previous existence to deserve the punishments I get in this one.

It all started so simply.

My Desktop was running PCLinuxos11 - a bit long in the tooth and still 32 bit, but it was all working and I kept up with the patches. But then at the last round of patches, the Chromium browser stopped working - a broken dependency. I try to fix the dependancy - but get a 404 from the respository. I trid to revert but can no longer find the previous package. Oh well, I bite the bullet and try to do a dist upgrade - which completely trashes my machine.

First I try installing PCLinuxOS14 - but it uninstalls all of KDE (but I still have openbox which I added some time ago to play around with). Then I try OpenSuse (I used to run Suse on my servers up to about version 8) the current version looks nice and it all works but OMG is it SLOW! And it also trashes the PCLInuxOS installation completely! Then I try MINT 17 - which won't even boot. Then I find an old MINT 15 DVD which boots OK and I install that, mount my /home filesystem and recreate the accounts. I roll forward the patches, and I seem to have a working (and usable system). Only I can't install any more software as it seems this version of the distro is no longer supported.

Why is this stuff so hard? I know you guys don't want to maintain lots of different versions of your software, but is it so hard to just leave the old packages online and let us upgrade through them? 

After a lot more digging it seems that my graphics card (Nvidia GeForce 6150) does not play nice with recent versions of the nouveau driver. Hence my only option is to hope that the nVidia supplied versions will work with whatever flavour Linux I try next - which I need to boot up and install with the nouveau driver blacklisted. But for a short while I think I'll stick with having a working computer.

Nothing Nowhere - Thanks Orange

I've had the same mobile SIM (and number) for least 12 years from Orange - a basic PAYG account. In recent years my work has provided me with a contract phone, but I keep the Orange SIM in an charged phone and top up the credit every so often - formerly using scratch cards, more recently via ATMs.

This weekend I'm stuck in the house and needed to add some credit - surely I must be able to do that online? So I find the site, and yes, I can top up online but first I have to create an account - ho hum - a bit of a hassle, but I'll play along. I type in my details, they send a text to the phone with an activation code. I don't know why I can't make an anonymous online top up via the internet when I can do this at any ATM. But surely just a few clicks and keystrokes and I'll get there.


First the registration process process fails to redirect to the account. Then When I do eventually get logged in, I can't do anything unless I provide the 4 digit PIN code they say they sent me when I registered the phone. WTF?

So I click on the "I don't have my PIN" link - it says to call 450. I call 450. I then spend around 10 minutes pressing buttons trying to find my way through their IVR, then just when I think I'm getting near to my destination I get disconnected. I try again. 10 minutes later disconnected again.

FFS, Orange I'm trying to give you money!

So how do I contact them via the website? Is there an email address? No. Is there a form I can submit a request to? No. The only option appears to be to call a mobile number (which I can't do without any credit on my phone).

Tuesday, 22 July 2014

Browser Fingerprinting - digging further into the client

I previously wrote about some techniques for Browser Fingerprinting (or "Device Identification" as it's known in some circles). Today I came across an interesting technique already in widespread use which detects variations between devices by looking at how content is rendered by WebGL / HTML5 Canvas.

Typically as much of the processing for these as possible is pushed onto the underlying hardware, resulting in consistent results for a given device independently of the OS / software. There is a surprising amount of variation here. However there's not sufficient variation for it to be used in isolation from other methods.

Update: 28 Jan 2015

Another interesting article found here lists Google Gears detection and MSIE security Policy as enumerable interfaces for fingerprinting. (The TCP/IP parameters is presumably done serverside, while proxy detection uses Flash). But the really interesting bit is that 2 of the products tested tried to hookup with spyware on the client!

Thursday, 12 June 2014

Whatever happened to scripting?

Don't you just love "enterprise" tools. Most of the ones I've had the pleasure of working with seem to have been around a very long time, belonged to companies which have progressively been bought over by bigger and bigger corporations, have been developed by different teams with different methodologies and coding styles. It's a miracle they work at all.

But one common theme, and one that they all tend to be very good at is making sure that they are at the top of the food chain in their field. Most provide good downstream connectivity, collecting data from all sorts of different sources. But it it is exceedingly difficult to integrate them to upstream components - for reporting, user management, logging etc.

The latest problem was to get data out of Microstrategy. It has a SOAP interface for invoking reports remotely. But try finding any documentation for it. It also has a "simple" HTTP based interface (where report definitions are specified in the URL). Again with no available documentation. I asked the Data warehouse team whether they knew anything about these interfaces. Answers ranged from "What's SOAP?" to "no". It has a scheduler for running reports - can't we just dump these on a filesystem somewhere?....apparently not.  So how can you get information out in a machine readable form? We can send an email.

Great. Email I can do. So I fire up putty and start hacking together a script to get the file out of an email and hand it over to my app. Fetchmail -> procmail -> metamail. Simples.

....only metamail is not available in RHEL. I've previously blogged about mail processing in RH. I really don't want to write my own MIME handler. While there's lots of PHP implementations on the internet, you need to look hard to find the ones which are robust and well written. But even then the parsing is done by loading the entire message into memory. Not very handy if the message is 100Mb+ and using PHP.

I could download metamail and compile it....but looking around the internet, it doesn't seem to have been actively maintained. Indeed there hadn't been any significant changes since I'd sent in some bug reports about 15 years ago! Investigating further I found ripmime which does what I need. So a quick security scan and it seems ideal.

This might be a good point to describe what I looked for in checking its provenance.

  • It seems to be bundled in several Linux distros (i.e. other people like it and are using it).
  • Older versions have some CVEs logged against it - now fixed. This is good on several counts - again it shows that people are using it and finding security problems and the security problems are getting fixed.
  • the other products flagged in the same CVEs put it in respectable company
  • I went to the origin website for the tarball - found other interesting security stuff related to email handling.
  • scanned the source for anything that might indicate an alternate function (fork(), exec*(), system(), socket stuff)
  • looked to see if the code was aware of obvious attacks (such as Content-Disposition: attachment; filename="/etc/passwd";).
All good. It would have taken me a very long time to implement all this myself.

Really RedHat, ripMIME should be part of RHEL!

I know Linux is now mainstream - but that doesn't mean I want a complex black box which I can't diagnose or re-purpose. If I wanted that I would have bought MS Windows! (are you reading this KDE PIM developers, systemd developers). 

Fortunately it's trivial to build ripmime (no dependencies other than glibc and iconv).

Project back on track. Thank you Paul.

Monday, 24 March 2014

Warning: BBWC may be bad for your health

Over the past few days I've been looking again at I/O performance and reliability. One technology keeps cropping up: Battery Backed Write Caches. Since we are approaching World backup day I thought I'd publish what I've learnt so far. The most alarming thing in my investigation is the number of people who automatically assume that a BBWC assures data integrity. Unfortunately the internet is full of opinions and received wisdom, and short on facts and measurement. However if we accept that re-ordering of the operations which make up a filesystem transaction on a journalling filesystem undermines the integrity of the data, then the advice from sources who should know better is dangerous.

Write barriers are also unnecessary whenever the system uses hardware RAID controllers with battery-backed write cache. If the system is equipped with such controllers and if its component drives have write caches disabled, the controller will advertise itself as a write-through cache; this will inform the kernel that the write cache data will survive a power loss.”

There's quite a lot here to be confused about. The point about RAID controllers is something of a red herring. There's a lot discussion elsewhere about software vs hardware RAID – and the short version is that modern computers have plenty of CPU to handle the RAID without a measurable performance impact, indeed many of the cheaper devices offload the processing work to the main CPU. On the other hand hardware RAID poses 2 problems:

  1. The RAID controller must write its description of the disk layout to the storage – it does this in a propretary manner meaning that if (when?) the controller fails, you will need to source compatible hardware (most likely the same hardware) to access the data locked away in your disks
  2. While all hardware RAID controllers will present the configured RAID sets to the computer as simple disks, your OS need visibility of what's happenning beyond the controller in order to tell you about failing drives. Only a small proportion of the cards currently available are fully supported in Linux.

It should however be possible to exploit the advantages (if any) offerred by the non-volatile write cache without using the hardware RAID functionality. It's an important point that the on-disk caches must be disabled for any assurance of data intregrity. But there an omission in the statement above which will eat your data.

If you use any I/O scheduler other than 'noop' then the writes sent to the BBWC will be re-ordered. That's the point of I/O scheduler. Barriers (and more recently FUA) provide a mechanism for write operations to be grouped into logical transactions within which re-ordering has no impact on integrity. Without such boundaries, there is no guarantee that the 'commit' operation will only occur after the data and meta data changes are applied to the non-volatile storage.

Even worse than losing data is that your computer wont be able to tell you that it's all gone horribly wrong. Journalling filesystem are a mechanism for identifying and resolving corruption events arising due to incomplete writes. If the transaction mechanism is compromised from out-of-sequence writes, then the filesystem will most likely be oblivious to the corruption and report no errors on resumption.

For such an event to lead corruption, it must occur when a write operation is taking place - since writing to the non-volatile storage should be much faster than to disk, and that in most systems reads are more comon than writes, writes will only be occurring for a small proportion of the time. But with faster/more disks and write intensive applications this differential decreases.

When Boyd Stephen Smith Jr said on the Debian list that a BBWC does not provide the same protection as barriers he got well flamed. He did provide links that show that the barrier overhead is not really that big.

A problem with BBWC is that the batteries wear out. The better devices will switch into learning mode to measure the battery health either automatically or on demand. But when they do so, the cache ceases to operate as non-volatile storage and the device changes it's behaviour from write-back to write through. This has a huge impact on both performance and reliability. Low end devices won't know what state the battery is in until the power is cut. Hence it is essential to choose a device which is fully supported under Linux.

Increasingly BBWC is being phased out in favour of write caches using flash for non-volatile storage. Unlike the battery-backed RAM there is no learning cycle. But Flash wears out with writes. The failure modes for these devices are not well understood.

There's a further consideration for the behaviour of the system when its not failing: The larger the cache on the disk controller or on disk, the more likely that writes will be re-ordered later anyway – so maintaining a buffer in the host systems memory and re-ordering the data there just means adding more latency before data is in non-volatile storage. NOOP will be no slower and should be faster most of the time.

If we accept that a BBWC with a noop scheduler should ensure the integrity of our data, then is there any benefit from enabling barriers? According to RedHat, we should disable them because the BBWC makes them redundant and...
enabling write barriers causes a significant performance penalty.”
Wait a minute. The barrier should force the flush of all data held in non-volatile memory. But we don't have any non-volatile memory after the OS/filesystem buffer. So the barrier should not be causing any significant delay. Did Redhat get it wrong? Or are the BBWC designers getting it wrong and flushing the BBWC at barriers?

Could the delay be due to moving data from the VFS to the I/O queue? We should have configured our system to minimize the size of the write buffer (by default 2.6.32 only starts actively pushing out dirty pages when the get to 10% of the RAM – that means you could have 3Gb of data in volatile storage on a 32Gb box). However, many people also report performance issues with Innodb + BBWC + barriers, and the Innodb engine should be configured to use O_DIRECT, hence we can exclude a significant contribution to performance problems from the VFS cache.

I can understand that the people developing BBWC might want to provide a mechanism for flushing the write back cache – if the battery is no longer functional or has switched to “learning mode” then the device needs to switch to write through mode. But its worth noting that in this state of operation, the cache is not operating a non-volatile store!

Looking around the internet, it's not just Redhat who think that a BBWC should be used with no barriers and any IO scheduler:

The XFS FAQ states
“it is recommended to turn off the barrier support and mount the filesystem with "nobarrier",”
Percona say you should disable barriers and use a BBWC but don't mention the I/O scheduler in this context. The presentation does later include an incorrect description of the NOOP scheduler.

Does a BBWC add any value when your system is running off a UPS? Certainly I would consider a smart UPS to be the first line of defence against power problems. In addition to providing protection against over-voltages it should be configured to implement a managed shutdown of your system, meaning that transactions at all levels of abstraction will be handled cleanly and under the control of the software which creates them.

Yes, a BBWC does improve performance and reliability (in combination with the noop scheduler, a carefully managed policy for testing and monitoring battery health and RAID implemented in software). It is certainly cheaper than moving a DBMS or fileserver to a two-node cluster, but the latter provides a lot more reliability (some rough calculations suggest about 40 times more reliable). If time and money are no object then for the best performance, equip both nodes in the cluster with BBWC. But make sure they are all using the noop scheduler.

Further, I would recommend testing the hardware you've got – if you see negligible performance impact with barriers and a realistic workload then enable the barriers.