Saturday, 5 November 2016

SELinux Sucks!

If you find yourself here then you probably have at least some vague idea about how security is enforced on Unix systems. Or maybe you just wanted to read my continuing woes with computer systems.

I did spend some time thinking about a suitable title for this post. There were so many to choose from:
  • SELinux considered harmful 
  • The emperor's new clothes
  • I want to believe
...but SELinux sucks sums it up nicely.

TL;DR

SELinux is ridiculously expensive and is unlikely to improve the security of your system. It may make it worse.

Introduction

For those who know nothing about SELinux... don't be hard on yourself. As much of this post argues, there are no SELinux experts. But in case you really know nothing about SELinux, a bit of context may help.

Unix (and therefore Linux, BSD etc) has a very elegant permissions system. There are lots of descriptions of how it works on the internet. Its read/write/execute and owner/group/other primitives can be combined to implement complex authorization models, not to mention the setuid, setgid and sticky bits.
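For example, a shared project directory with group inheritance and /tmp-style delete protection takes three commands (the group name and path below are invented for illustration):

    # group-writable directory; setgid so new files inherit the group,
    # sticky bit so users can only delete their own files (as on /tmp)
    mkdir /srv/shared
    chgrp developers /srv/shared
    chmod 3775 /srv/shared      # rwxrwsr-t

    # and a setuid binary runs with its owner's privileges, whoever invokes it
    ls -l /usr/bin/passwd       # -rwsr-xr-x. 1 root root ... /usr/bin/passwd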

But it doesn't end there.

There's sudo, capabilities, filesystem ACLs, chroot and containers. Or privilege separation using process isolation, communicating over shared memory and local or network sockets.

Apparently this still leaves some gaps. Step forward SELinux.

What is SELinux?

It's a set of rules which are compiled into a policy that is loaded and enforced by the kernel at runtime.

Operations on entities are mediated by this abstract set of rules based on the labels attached to those entities and the user trying to effect the change.

So apart from the compilation step, not that different from permissions?

Well, actually, it is quite different – the configuration is a mystery black box. Most experienced Linux/Unix users can look at the permissions exposed by 'ls -l', make an accurate prediction about the outcome of an operation, and work out how to resolve the problem when the outcome is not what they want. The permissions are presented as 10 characters, sometimes a few more if we also need to consider the directory the file sits in or its ACLs. 'ls -Z' displays the SELinux labels on files, but it says nothing about the permissions those labels enable. For that you need to look at the policy.
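To illustrate, compare what the two commands actually tell you (output abridged and illustrative; the sesearch query assumes the setools package is installed):

    ls -l /var/www/html/index.html
    # -rw-r--r--. 1 root root 52 Jul 11 15:20 /var/www/html/index.html
    #   everything needed to predict the outcome of an open() is on that line

    ls -Z /var/www/html/index.html
    # unconfined_u:object_r:httpd_sys_content_t:s0 /var/www/html/index.html
    #   just a label; to find out what it actually permits, query the compiled policy:
    sesearch --allow -s httpd_t -t httpd_sys_content_t -c file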

The targeted SELinux policy from Fedora is currently distributed as 1271 files containing 118815 lines of configuration. The rpm contains no documentation. On the other hand, the standard installation of Apache 2.4 on the machine I'm sitting in front of has 143 configuration files (an unusually high number due to Ubuntu distributing stub files for every available module) and 2589 lines of configuration. So, SELinux has 10 times as many files and 45 times as much config as a very complex software package. Do I need to spell out the problem here?

Indeed, the recommended practice is not to change these files, but rather to add more configuration on top to change the behaviour of SELinux.
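In practice that means layering local rules on top with the policy management tools rather than touching the shipped files - something like this (the path, type and boolean are examples, not recommendations):

    # label a non-standard directory so Apache may write to it
    semanage fcontext -a -t httpd_sys_rw_content_t '/srv/uploads(/.*)?'
    restorecon -R -v /srv/uploads

    # or flip one of the hundreds of booleans baked into the policy
    setsebool -P httpd_can_network_connect on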

One consequence of this Gordian knot is that upgrades to the configuration (which at least won't trash the extra config you have added) often need to change the labels on the filesystem; a simple upgrade can unexpectedly morph from a brief outage into hours or days of disk thrashing while every file on your disks is relabelled. And hopefully that didn't also break your security model. But...

It breaks existing security models

The targeted policy not only overrules the filesystem permissions, but also the access control mechanisms built into programs: for example, 'at' is unable to read /etc/at.allow when running as a system_u user!

With the setuid bit set on an executable, you can run it as a different user, but the process retains the original SELinux context!


It is inconsistent by design

"By default, Linux users in the guest_t and xguest_t domains cannot execute applications in their home directories or the /tmp/ directory, preventing them from executing applications, which inherit users' permissions, in directories they have write access to. This helps prevent flawed or malicious applications from modifying users' files"
        - https://access.redhat.com/
       
In other words, Linux users can't run compiled C programs, but can run (malicious) Java, shell script, Python, PDF, Flash... anything where the logic is bootstrapped by an existing executable and so does not require the executable bit to be set.

What about networking?

Of course SELinux can solve every security problem; it has the capability to restrict network access. This is not available in AppArmor, and you can't apply restrictions on a per-user or per-application basis using iptables.

OTOH, TCP wrappers, network namespaces, iptables and network interfaces of type 'dummy' provide primitives which can be combined to implement complex security policies on multi- (or single-) tenant machines.
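Those primitives are all scriptable. For instance, pinning a single daemon into its own network namespace and restricting what can reach a backend port takes a handful of commands (interface names, addresses and ports below are invented for the sketch):

    # give the service its own network stack, joined to the host by a veth pair
    ip netns add webns
    ip link add veth-host type veth peer name veth-web
    ip link set veth-web netns webns
    ip addr add 10.10.0.1/24 dev veth-host
    ip link set veth-host up
    ip netns exec webns ip addr add 10.10.0.2/24 dev veth-web
    ip netns exec webns ip link set veth-web up
    ip netns exec webns ip link set lo up

    # only that namespace's address may reach the database port
    iptables -A INPUT -p tcp --dport 3306 -s 10.10.0.2 -j ACCEPT
    iptables -A INPUT -p tcp --dport 3306 -j DROP

    # run the daemon inside the namespace
    ip netns exec webns /usr/sbin/httpd -DFOREGROUND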

Debugging SELinux problems

SELinux has an option (permissive mode) to only report actions rather than prevent them. Great, that should simplify fixing things, right? However, in my experience it does not log every denial that it subsequently enforces.

Under very controlled conditions, I was investigating a problem with a system_u user running 'at'. Suspecting that SELinux was the culprit, I ran 'setenforce 0' and tried the application - it worked, and no log entries were found. Maybe SELinux was not the problem? So I ran 'setenforce 1' and tried again - I got "cannot set euid: Operation not permitted", and again no log entries.

WTF?

Again, I set enforcing to 0 and ran the app. Again it worked. Again, no log entries. Just to be sure, I ran some things which I knew would violate the policy – and those produced log entries. With no idea how to begin to fix the problem, I set enforcing back to 1 and ran the app, and this time it worked!

Yeah! Problem solved.

Then, 10 minutes later "cannot set euid: Operation not permitted", but now I was getting log entries.

Automated Baselining

You don't start reading through the kernel source every time something misbehaves on Linux, so maybe you should treat the default policy in the same way, as a black box. It sounds like a reasonable strategy. Just run your system in a learning mode, then apply those lessons to the policy. Indeed, several commentators advocate just such an approach.
(Trying to fix permissions in enforcing mode is a thankless task - each blocked operation is usually masking 3 or 4 further rules preventing your application from working).
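For reference, the workflow usually advocated looks something like this (a sketch; the module name is made up and the tools come from the policycoreutils and audit packages):

    setenforce 0                            # permissive: log denials, don't block
    # ...exercise the application as thoroughly as you can...
    ausearch -m avc -ts recent | audit2allow -M myapp_local
    semodule -i myapp_local.pp              # load the generated allow rules
    setenforce 1                            # and hope your test coverage was good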
So the first step to getting your application working is to switch off a critical security control? Really??!!!

Anyone who has worked on developing a complex system will tell you that getting full code coverage in test environments is a myth.

Yes, as Darwin proved, evolution works really well - but it takes a very long time.

And there are "times when it [audit2allow] doesn't get it right"

sealert is painfully slow; in a recent exercise I clocked it at around 20-25 log entries per second. Not great when you have a 100MB log file to read. Amongst the oddities it identified:

SELinux is preventing /usr/libexec/mysqld from write access on the file /tmp/vmware-root/vmware145. 

- you might think this means that mysqld was attempting to write to /tmp/vmware-root/vmware145, and you'd be wrong. This file is part of vmware's memory management, although vmware also uses the directory as a general dumping ground. The odd thing is that the directory is:
       
        drwxrwxrwt.   6 root root        4096 Jul 11 15:20 tmp
        drwx------.   2 root root        4096 Jul 11 15:10 /tmp/vmware-root
       
SELinux is preventing /sbin/telinit from using the setuid capability.
SELinux is preventing /sbin/telinit from read access on the file /var/run/utmp.
SELinux is preventing /sbin/telinit from write access on the file wtmp.

Clearly Redhat are not reading their audit logs, or maybe they just disable SELinux?

SELinux encourages dumb workarounds

One of the problems we ran into when SELinux was enabled on a server I recently migrated was that email stopped working. The guys with root access started working on this (I made sure they had a test script to replicate the problem) while I started looking at other ways of solving the problem - it was having a significant impact on the service. Guess who came up with a solution first?

In about 2 hours I had a working drop-in replacement for '/usr/sbin/sendmail -t -i', which PHP uses for sending emails.
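To be clear, the script below is not what I wrote - just a minimal sketch of the idea, to show how little is involved: read the message from stdin, pull the recipients out of the headers, and relay the lot to a smarthost (the smarthost name and envelope sender are placeholders, and folded headers, Bcc handling, queueing and error checking are all left out):

    #!/bin/sh
    # crude stand-in for '/usr/sbin/sendmail -t -i'
    MSG=$(mktemp)
    cat > "$MSG"

    # addresses from the To:/Cc: header lines (headers end at the first blank line)
    RCPTS=$(sed '/^$/q' "$MSG" | grep -iE '^(to|cc):' \
            | grep -oE '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+' | sort -u)

    ARGS=""
    for r in $RCPTS; do ARGS="$ARGS --mail-rcpt $r"; done

    curl --silent --url smtp://mailhost:25 \
         --mail-from www-data@example.com $ARGS \
         --upload-file "$MSG"
    rm -f "$MSG"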

I'm not criticizing the Unix guys. The people working on this are very capable and do have expertise in SELinux. The problem is SELinux.

But go back and re-read my previous sentence; in 2 hours I had written an MTA from scratch and bypassed the SELinux policy written by the experts at RedHat. WTF????? If I am some sort of uber-cracker then I really am not getting paid enough.

(Spookily enough, one of the reasons the server could not send email is shown in the screenshot at https://fedorahosted.org/setroubleshoot/ ! This might be why RHEL 7 now has an SELinux boolean, httpd_can_sendmail.)
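For the record, once you know that boolean exists the 'fix' is a one-liner (with root, of course):

    getsebool httpd_can_sendmail            # httpd_can_sendmail --> off
    setsebool -P httpd_can_sendmail on      # -P persists it across reboots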

Now, which do you think is more secure: the original Postfix installation using a standardized config which has been extensively tested in house, or the MTA I knocked up in between other jobs?

Maybe it's just me?

I've spent a very long time searching the internet for stories about how people have used SELinux to prevent and investigate attacks. While there are a huge number of articles proclaiming its benefits, I struggled to find many which demonstrated any real effectiveness.

Excluding the cases where a patch had been available for at least a week before the alleged incident, I was left with:

Mambo exploit blocked by SELinux – http://www.linuxjournal.com/article/9176?page=0,0

HPLIP Security flaw – https://rhn.redhat.com/errata/RHSA-2007-0960.html

OpenPegasus vulnerability blocked by SELinux – http://james-morris.livejournal.com/25421.html
       
Just 3 cases. The first one is a very good article and I recommend reading it (although there are some gaps in the story).

Should I care?

One of the problems with developing secure web-based applications is that everything in the site ends up running as the same user id. This doesn't mean you can't do privilege separation using sudo or daemons, but it does mean that you always have to implement security controls in your applications. A mandatory access control system does not solve these problems, but it should simplify some of them. Fortunately SELinux is not the only game in town; AppArmor, grsecurity and Smack are all available, well tested and widely implemented on Linux systems.

Of course, if you are Google or Facebook, then you can afford to spend hundreds of man-years working out how to get SELinux working properly (and of course there are no security bugs in Android).

What is wrong with SELinux?

The people developing SELinux (or insisting on its use) have missed out on something I have drummed into every junior programmer I have trained:

We don't write code for computers to understand; we write it for humans to understand.

SELinux/Targeted policy is:
  • really bad for productivity
  • bad for availability
  • bad for functionality

It is quicker to bypass SELinux/Targeted policy restrictions than to change them to allow a denied action.

What is the solution?

The time you would spend aligning the off-the-shelf SELinux policies with your application would be better spent addressing the security of your system in conventional ways. Switch off SELinux and fix your security.
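For completeness, on a Red Hat style system that looks like this - permissive mode still logs denials, so it is usually a better choice than disabling SELinux outright:

    setenforce 0                                       # immediate, lasts until reboot
    sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config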

Monday, 13 June 2016

Local File Inclusion - why everything you ever read about file uploads is probably wrong


Allowing people to upload content onto your server potentially allows them to deploy code which will be executed by your server. Your server can easily be pwned, leading to your work being destroyed or stolen and the website being used for spamming, phishing, DoS attacks... it's a bad thing.

Where I've implemented file uploads, or have been asked how someone else should do it, I usually give this advice:
  1. Ensure that uploaded content is held outwith the document root or, failing that, in a directory configured with "php_flag engine off"
  2. Only use filenames with safe characters - [a-z], [A-Z], [0-9], .-_
    (OWASP, after detailing why you should not use blacklisting as a preventative measure, offer a blacklist of characters :) )
  3. Only allow content with a whitelisted extension to be uploaded
  4. Check the mimetype of uploads - using PHP's mime_content_type(), not the value supplied by the user
  5. Preferably convert the content to a different format (then back to the original format if required)
This looks like security in depth, i.e. applying more protection than is strictly required. I've always been somewhat skeptical of this as a justification for effort and expense (often in a context where security in breadth is sadly lacking). Certainly ticking each of the boxes in the list above is not always practical. But relying on one's own knowledge and understanding is a dangerous conceit.
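Points 4 and 5 can also be done outside PHP if that is easier to audit - a sketch, assuming file(1) and ImageMagick are installed and using made-up filenames:

    # trust the bytes, not the user-supplied filename or Content-Type header
    file -b --mime-type upload.tmp          # expect e.g. image/png

    # re-encoding discards anything that isn't pixel data (EXIF, appended code)
    convert upload.tmp -strip png:clean.png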

I am indebted to wireghoul for pointing out to me that there is a rather subtle attack which can be exploited for LFI on Apache systems.

Hopefully you know that the PHP interpreter will run the PHP code embedded in any file presented to it. To give a simple example:
  1. amend your webserver config to send png files to the PHP handler, e.g. by creating a .htaccess file containing

     AddHandler application/x-httpd-php .png
    

  2. write your PHP code, and then append it to a valid png file....

        echo '<?php mail("root@localhost", "hello", "from the inside");' >>image.png

  3. Point your web browser at the URL for image.png, and the picture renders in your browser, just as you would expect....but the PHP code executes too. No surprises so far. After all, if someone can reconfigure your webserver, they don't need to bother with the LFI - they've already pwned the system.
But it is also possible to get the webserver to execute PHP code embedded in an image file without changing the handler!

This comes about due to the way mod_mime infers mimetypes from extensions.

mod_mime will interpret anything in the name beginning with a . as an extension. From the linked documentation, note the paragraph

   Care should be taken when a file with multiple extensions gets associated with both a media-type and a handler. This will usually result in the request being handled by the module associated with the handler. For example, if the .imap extension is mapped to the handler imap-file (from mod_imagemap) and the .html extension is mapped to the media-type text/html, then the file world.imap.html will be associated with both the imap-file handler and text/html media-type. When it is processed, the imap-file handler will be used, and so it will be treated as a mod_imagemap imagemap file.

The mod_mime documentation authors go on to explain that this is the default behaviour for handlers, but that it is possible to ensure that only the last extension is used for choosing the handler with a FilesMatch block around the SetHandler directive.
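On Apache that mitigation looks something like the following (a sketch - the conf.d path is distro-dependent, it assumes mod_php, and it replaces the AddHandler/AddType line rather than supplementing it):

    # only the *final* extension decides whether the PHP handler runs
    printf '%s\n' '<FilesMatch "\.php$">' \
                  '    SetHandler application/x-httpd-php' \
                  '</FilesMatch>' > /etc/httpd/conf.d/php-handler.conf
    apachectl graceful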
So if we name our uploaded file image.php.png then we should be able to get our PHP code executing on a server.

Indeed, guess which method PHP recommends for configuring your server? (As implemented by Redhat, Suse, Ubuntu and others.)

Let's try it out!

In the earlier example, we appended the PHP code to the end of the file. When I tried this again with the double extension, I got a lot of unusual characters in my browser. The Content-type was PHP's default text/html, not image/png. So instead, the PHP code needs to prefix the image and set the content-type header:

    ( echo -n  '<?php  header("Content-Type: image/png");
      mail("root@localhost", "hello", "from the inside");
      ?>' ; cat original_image.png ) >> image.php.png
              
Point your browser at the URL and, once again, the PHP code executes and the image renders - but this time using the default config found on most webservers.

The double extension trick is known, but not widely.

It's not just me. I had a look around the web at advice being given about PHP file uploads - and a lot of it is wrong.

All the articles I've looked at claim that it only works where the extensions *other* than '.php' are unknown to the webserver. As demonstrated here, this is NOT true.

Validating the mimetype won't protect against this kind of attack; although it would detect the file I created in the example, it won't work if the PHP code is embedded somewhere other than the start of the file. If it is contained in an unreachable block or in the EXIF data, then the code will execute, the file will be a valid image (based on its extension) and a mime check (which reads the first few bytes of the file) will say it's an image. It will only render as an image when served up by the webserver if output buffering is enabled by default - but this will not often be a problem for an attacker.
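To see why the mime check passes, put the payload in the metadata instead of prepending it - a sketch using exiftool and the same harmless payload as before (for PNG the comment ends up in a text chunk; a JPEG comment behaves the same way):

    exiftool -Comment='<?php mail("root@localhost", "hello", "from the metadata"); ?>' image.php.png
    file -b --mime-type image.php.png       # still reports image/png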

FAQ - How do I build a LAMP cluster

Once again, someone just asked this question on a forum - this time on LinkedIn, but it crops up often in other places like serverfault.

There are so many variables that you are unlikely to get the right answer without supplying a huge amount of detail. Facebook run a fault-tolerant, scalable architecture using Linux, Apache, PHP and MySQL (and lots of other things). Do you think their infrastructure and applications could be described in a forum post? Do you think their architecture is appropriate for a back-office supplies management database?

At the other end of the spectrum, even with just 2 machines there are a lot of different ways to connect them up.

The first step in the battle is knowing what questions you should be asking - at least to yourself.

How do you replicate content across multiple webservers?

HTTP is stateless - so as long as webservers have access to the same data, you have redundancy. For TLS it helps performance to have a shared store for session keys (although there are ways to store session keys on the client).
PHP code shouldn't change very frequently, but for complex applications managed deployments are a requirement (or a significant investment in skills and tooling to avoid having to coordinate deployments). Sometimes rsync is enough. Sometimes you need a multi-site SAN. Sometimes you need a custom content management system written from scratch.
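At the simple end, "rsync is enough" really is one line per node (hostname and paths invented; --delete makes the copy authoritative, so point it at the right directory):

    rsync -az --delete /var/www/html/ web2:/var/www/html/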

Do you host the system yourself? 

If so, then you have a lot of flexibility over how you configure the cluster and communication between nodes - make sure you have more than one IP address though. But do you have the skills to design and manage the cluster? Moving the service into the cloud solves a lot of problems (or rather makes them someone else's problems) but creates new ones for you.

How frequently is data updated?

...and must each node be exactly in sync? If not, how much lag can you tolerate?

Do you use PHP sessions?

These are (frequent) data updates. They need to be shared across the cluster.
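The usual fix is to point PHP's session handler at a shared store instead of local files, e.g. memcached - a sketch, assuming the php memcached extension is installed; the hostname is made up and the php.ini path varies by distro:

    printf '%s\n' 'session.save_handler = memcached' \
                  'session.save_path = "sessions.internal:11211"' >> /etc/php.ini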

How much can you spend?

Employing a good consultant is going to cost you money. But it should be money well spent. If you need to have your appendix removed, would you look for the lowest bid? While it seems that it is possible to outsource some development work like this, I'm not convinced it's the right way to plan an IT architecture.

How much downtime can you afford?

Both scheduled and un-scheduled (i.e. planned and accidental). 
In my experience, once you move past a single machine hosting your site, there is surprisingly little correlation between investment and availability. But planning for how the service will degrade when components start failing is always good practice. Knowing that there is scope for at least scheduled maintenance windows does expand the horizon of possibilities.

Do you need to split the cluster across data centres for redundancy?

While it would be nice to design an architecture which can scale from a couple of machines to multiple data-centres, the likely outcome would be a very expensive and hard-to-manage solution running on a pair of servers. Even if we were all running the same publish-only application, the right architecture changes according to the scale of the operation.

How scalable do you want the system to be?

Continuing the previous point - maybe you should be planning further ahead than just the next rung on the ladder.

Where are your users/customers/datacentre?

Geography matters when it comes to performance.

What is the current volume of writes on the database?

There are two well-defined solutions for MySQL replication - synchronous and asynchronous. The latter is simple and available out of the box, but only works for two nodes accepting writes from the outside world. Because replication is implemented in a single execution thread, there are potential capacity issues where a large proportion of the traffic is writes.
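Whichever you pick, watch the lag - with stock asynchronous replication it is one query away (field names as in MySQL 5.x):

    mysql -e 'SHOW SLAVE STATUS\G' | grep -E 'Seconds_Behind_Master|Slave_(IO|SQL)_Running'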

What network infrastructure is between the server(s) and clients?

It may already be capable of supporting some sort of load balancing. But don't discount Round Robin DNS - there is a HUGE amount of FUD about RRDNS on the internet, and it's mostly completely wrong. There are still benefits to using other approaches - while RRDNS solves the problem of clients connecting via HTTP, it's not a good way to manage redundancy between your PHP and MySQL.

What is your content caching policy?

Caching is essential for web usability. But the default ETags setup for Apache is far from optimal for a loose cluster of webservers. There are different approaches to managing caching effectively and (if immutable) these may impact your architecture.
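The specific Apache wrinkle is that older default ETags included the file's inode, which differs on every node, so each server hands out a different validator for the same file. The fix is a single directive (the conf.d path is distro-dependent):

    # make all nodes generate the same ETag for the same file
    echo 'FileETag MTime Size' > /etc/httpd/conf.d/etag.conf
    apachectl graceful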

What regulatory frameworks apply?

Even where there are no explicit obligations (such as PCI-DSS), as soon as you start accepting and processing data from users you have a duty of care to them. The type of data you are collecting and what you do with it has an important bearing on the security, and therefore the architecture, of the service.

The wrong questions

How do I build a LAMP cluster

Obviously - the point of this post is to explain why that is the wrong question to ask.

The number of users

This has very little to do with the infrastructure required. The number of concurrent sessions is only marginally better. When you start talking about the number of HTTP requests, the split between static and dynamic content, and the rate of data updates, then you're getting closer to the information required to inform a choice of architecture.

Friday, 24 October 2014

Nouveau? Retch.

I often wonder what heinous crimes I committed in a previous existence to deserve the punishments I get in this one.

It all started so simply.

My desktop was running PCLinuxOS 11 - a bit long in the tooth and still 32-bit, but it was all working and I kept up with the patches. But then, at the last round of patches, the Chromium browser stopped working - a broken dependency. I try to fix the dependency - but get a 404 from the repository. I try to revert, but can no longer find the previous package. Oh well, I bite the bullet and try to do a dist upgrade - which completely trashes my machine.

First I try installing PCLinuxOS 14 - but it uninstalls all of KDE (I still have Openbox, which I added some time ago to play around with). Then I try OpenSuse (I used to run Suse on my servers up to about version 8); the current version looks nice and it all works, but OMG is it SLOW! And it also trashes the PCLinuxOS installation completely! Then I try Mint 17 - which won't even boot. Then I find an old Mint 15 DVD which boots OK, so I install that, mount my /home filesystem and recreate the accounts. I roll forward the patches, and I seem to have a working (and usable) system. Only I can't install any more software, as it seems this version of the distro is no longer supported.

Why is this stuff so hard? I know you guys don't want to maintain lots of different versions of your software, but is it so hard to just leave the old packages online and let us upgrade through them? 

After a lot more digging it seems that my graphics card (Nvidia GeForce 6150) does not play nice with recent versions of the nouveau driver. Hence my only option is to hope that the nVidia-supplied driver will work with whatever flavour of Linux I try next - which I need to boot up and install with the nouveau driver blacklisted. But for a short while I think I'll stick with having a working computer.
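For anyone in the same boat, blacklisting nouveau so a vendor driver can load usually comes down to this (the initramfs rebuild command varies: update-initramfs on Debian/Ubuntu/Mint, dracut on Fedora and friends):

    printf 'blacklist nouveau\noptions nouveau modeset=0\n' > /etc/modprobe.d/blacklist-nouveau.conf
    update-initramfs -u        # or: dracut --force
    # then reboot and install the vendor driver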


Nothing Nowhere - Thanks Orange

I've had the same mobile SIM (and number) for ...erm... at least 12 years from Orange - a basic PAYG account. In recent years my work has provided me with a contract phone, but I keep the Orange SIM in a charged phone and top up the credit every so often - formerly using scratch cards, more recently via ATMs.

This weekend I'm stuck in the house and needed to add some credit - surely I must be able to do that online? So I find the site, and yes, I can top up online but first I have to create an account - ho hum - a bit of a hassle, but I'll play along. I type in my details, they send a text to the phone with an activation code. I don't know why I can't make an anonymous online top up via the internet when I can do this at any ATM. But surely just a few clicks and keystrokes and I'll get there.

No.

First the registration process fails to redirect to the account. Then, when I do eventually get logged in, I can't do anything unless I provide the 4-digit PIN code they say they sent me when I registered the phone. WTF?

So I click on the "I don't have my PIN" link - it says to call 450. I call 450. I then spend around 10 minutes pressing buttons trying to find my way through their IVR, then just when I think I'm getting near to my destination I get disconnected. I try again. 10 minutes later disconnected again.

FFS, Orange I'm trying to give you money!

So how do I contact them via the website? Is there an email address? No. Is there a form I can submit a request to? No. The only option appears to be to call a mobile number (which I can't do without any credit on my phone).

Tuesday, 22 July 2014

Browser Fingerprinting - digging further into the client

I previously wrote about some techniques for Browser Fingerprinting (or "Device Identification" as it's known in some circles). Today I came across an interesting technique already in widespread use which detects variations between devices by looking at how content is rendered by WebGL / HTML5 Canvas.

Typically, as much of the processing for these as possible is pushed onto the underlying hardware, so a given device produces consistent results independently of the OS and software. There is a surprising amount of variation between devices. However, there's not sufficient variation for it to be used in isolation from other methods.

Update: 28 Jan 2015

Another interesting article found here lists Google Gears detection and MSIE Security Policy as enumerable interfaces for fingerprinting. (The TCP/IP parameter analysis is presumably done server-side, while proxy detection uses Flash.) But the really interesting bit is that two of the products tested tried to hook up with spyware on the client!

Thursday, 12 June 2014

Whatever happened to scripting?

Don't you just love "enterprise" tools? Most of the ones I've had the pleasure of working with seem to have been around a very long time, belonged to companies which have progressively been bought up by bigger and bigger corporations, and been developed by different teams with different methodologies and coding styles. It's a miracle they work at all.

But one common theme, and one thing that they all tend to be very good at, is making sure that they are at the top of the food chain in their field. Most provide good downstream connectivity, collecting data from all sorts of different sources. But it is exceedingly difficult to integrate them with upstream components - for reporting, user management, logging etc.

The latest problem was to get data out of MicroStrategy. It has a SOAP interface for invoking reports remotely - but try finding any documentation for it. It also has a "simple" HTTP-based interface (where report definitions are specified in the URL), again with no available documentation. I asked the data warehouse team whether they knew anything about these interfaces. Answers ranged from "What's SOAP?" to "no". It has a scheduler for running reports - can't we just dump these on a filesystem somewhere? ...apparently not. So how can you get information out in a machine-readable form? We can send an email.

Great. Email I can do. So I fire up putty and start hacking together a script to get the file out of an email and hand it over to my app. Fetchmail -> procmail -> metamail. Simples.

...only metamail is not available in RHEL. I've previously blogged about mail processing in RHEL. I really don't want to write my own MIME handler. While there are lots of PHP implementations on the internet, you need to look hard to find the ones which are robust and well written. And even then, the parsing is done by loading the entire message into memory - not very handy if the message is 100MB+ and you're using PHP.

I could download metamail and compile it....but looking around the internet, it doesn't seem to have been actively maintained. Indeed there hadn't been any significant changes since I'd sent in some bug reports about 15 years ago! Investigating further I found ripmime which does what I need. So a quick security scan and it seems ideal.

This might be a good point to describe what I looked for in checking its provenance.


  • It seems to be bundled in several Linux distros (i.e. other people like it and are using it).
  • Older versions have some CVEs logged against it - now fixed. This is good on several counts - again it shows that people are using it and finding security problems and the security problems are getting fixed.
  • the other products flagged in the same CVEs put it in respectable company
  • I went to the origin website for the tarball - found other interesting security stuff related to email handling.
  • scanned the source for anything that might indicate an alternate function (fork(), exec*(), system(), socket stuff)
  • looked to see if the code was aware of obvious attacks (such as Content-Disposition: attachment; filename="/etc/passwd";).
All good. It would have taken me a very long time to implement all this myself.

Really RedHat, ripMIME should be part of RHEL!

I know Linux is now mainstream - but that doesn't mean I want a complex black box which I can't diagnose or re-purpose. If I wanted that I would have bought MS Windows! (are you reading this KDE PIM developers, systemd developers). 

Fortunately it's trivial to build ripmime (no dependencies other than glibc and iconv).
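The build and the procmail glue are only a few lines - a sketch, with an invented sender filter and spool path; the '-i -' form (read the message from stdin) is an assumption here, so check the man page:

    tar xzf ripmime-*.tar.gz && cd ripmime-*/ && make && sudo make install

    # ~/.procmailrc : unpack the attachments of matching mail into a spool directory
    :0
    * ^From:.*reports@example\.com
    | ripmime -i - -d /var/spool/reports/incoming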

Project back on track. Thank you Paul.