[sclug] Personalised web content filtering

goraxe goraxe at goraxe.me.uk
Wed Jun 22 08:39:07 UTC 2011


On Thu, 2011-06-16 at 11:46 +0100, Dickon Hood wrote:
> On Thu, Jun 16, 2011 at 11:29:42 +0100, Neil Haughton wrote:
> 
> [ bad idea ]
> 
> : (BTW I am already familar with the 'this is an HR issue not a technical one
> : and you shouldn't employ staff you don't trust' argument, but I need to
> : treat it as a technical issue and find a technical solution, if I can.   The
> : idea is to be safe rather than sorry, to shut the stable door before the
> : horse bolts, and quietly open the door when the company is confident that
> : the horse is not the bolting type. And then keep the horse happy enough not
> : to want to bolt in future.)
> 
> By locking things down so much, your horse is going to get remarkably
> pissed off as it is.

Nada, sufficient security carries kudos (e.g. the man who came up with
the first public-key encryption scheme was not allowed to write anything
'work related' down outside the complex he worked at, so he went home,
simply thought about the idea, and wrote the notes up from memory back
at work).  If the work is cool enough people will put up with heavy
security measures (and will be disappointed if there aren't any).

> And what's to stop them walking off with the source code on a USB stick?
> Better: a USB stick masquerading as a smartphone?  If your answer is 'oh,
> we just disable the USB ports', then make sure you've also nobbled the
> Firewire ports in the BIOS, as you can read the entire system memory over
> that just by plugging something in.  It requires no interaction with the
> host OS whatsoever.

You have FireWire on your $work computer?  Cool.

> In short: you're right: it's an HR problem, not a technical problem.  No
> technical solution you put in place will do anything other than severely
> piss your horses off, to the extent that they may well consider bolting.
> The trick here is to ensure your horses don't wish to leave *first*, not
> at some jam-tomorrow future point.

It's a security problem, so a bit HR and a bit technical.  There are
other leakage problems as well: a subverted browser could be a vector
for an external attacker to gain access to the network (say, triggered
by a targeted phish inviting employees to visit a certain website) and
make off with the source.  Filtering accessible websites reduces the
attack surface and increases security.  As with most security problems,
one of the weakest elements is the humans involved with the system.
Laptops get left on trains, administrative staff get bought, secretaries
receive convincing phone calls and reveal the name of the CEO's wife ...
which just happens to be his password, etc.

Dismissing the human element as unimportant undoes most technical
countermeasures.  Have some respect for your workplace and colleagues
and practise safe browsing: you cover your mouth when you sneeze, right?
Keep personal/unsolicited email, shortened-link clicking, and social
network sites on your home machine, and use your work machine for
activity that advances your company's business rather than possibly
compromising the entity which ultimately pays your wage.  Respect and
understand your employer's right to back that up with technical measures.

I believe the OP is suggesting a probationary period for new employees,
restricting them until trust is gained, with just enough access to
perform their job.  This is classic Role-Based Access Control.  It is
good security.

Let's play a substitution game: take the words 'source code' and
replace them with 'data containing your financial records, name,
address, next of kin, mother's maiden name, DoB, first car, name of
first school'.  Let's change 'developers' to 'financial analysts'.  Are
you happy that a junior employee with no previous record (because he's
too young to have committed his first major fraud) has complete access
to your data and an unfiltered internet connection?

DLP software is young; it exists because there is this chicken-and-egg
problem of trust between new employees and employers.  Lots of
industries cannot afford the risk of espionage or accidental leakage.
There are other risks of employees stealing sales contacts or similar
privileged information -- the same information they need to work with
day in and day out.  The company owns that information and has a right
to try and protect it, be it source code IP (urgh, but I stand by your
company's right to protect it), personal information, or other
business-critical data.  The problem is a security problem.


> In answer to your question: no, no idea.  You could probably do something
> unwise with squid ACLs and NTLM auth, but that'll mandate the use of IE,
> and isn't foolproof anyway; we attempted something (briefly) at the BBC
> for similar (ie., clueless HR) reasons, and abandoned it some three years
> after starting.

Squid can auth via digests and other methods, and all modern browsers
will support some of these options.  In order for squid to support SSL
you will need to recompile it yourself: OpenSSL's licence is not
GPL-compatible, which makes it difficult for distros to supply pre-built
binaries of GPL software linked against it (the downside of choosing an
incompatible licence for a library component).  Nobody has yet ported
the code to GnuTLS, which is not encumbered by this 'linking transfers
licence' problem .... well, GnuTLS's OpenSSL compatibility layer is GPL
rather than LGPL, to force migration to the LGPL GnuTLS APIs, which is
just a touch bitchy.
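
For what it's worth, digest auth in squid only takes a few lines of
squid.conf.  A minimal sketch -- the helper path, password file, and
realm are guesses, so check your distro's layout:

    # squid.conf: digest authentication (paths are illustrative)
    auth_param digest program /usr/lib/squid/digest_pw_auth /etc/squid/digest_passwd
    auth_param digest realm internal-proxy
    acl authed proxy_auth REQUIRED
    http_access allow authed
    http_access deny all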

Other options include pam_iptables (assuming your users are on Linux
workstations and are unprivileged).  NuFW would be an option along a
similar line and has a Windows agent option, although that is
proprietary.  I think NuFW allows you to set user-based policy on
machines beyond the local machine.  If you are going to use a proxy for
enforcement, make sure you block traffic at your firewall originating
from the workstations (or redirect it through the proxy in transparent
mode).
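
As a sketch of that firewall rule, with invented addresses (10.0.0.0/24
for the workstations, a proxy at 10.0.1.1 on port 3128):

    # allow workstations to reach the proxy, drop everything else
    iptables -A FORWARD -s 10.0.0.0/24 -d 10.0.1.1 -p tcp --dport 3128 -j ACCEPT
    iptables -A FORWARD -s 10.0.0.0/24 -j DROP
    # or, transparent mode: silently rewrite outbound web traffic
    iptables -t nat -A PREROUTING -s 10.0.0.0/24 -p tcp --dport 80 \
             -j DNAT --to-destination 10.0.1.1:3128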

Checkouts of the code could happen on a remote machine which the devs
ssh into, with scp, sftp, X forwarding etc. disabled; the only leak then
comes via screen scrapes, which could be fairly tedious for a
sufficiently large code base.  Further items of 'critical business IP'
could require senior/build/privileged access to check out.  Something
similar could be set up with RDP for Windows.
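
On the checkout host that would look roughly like the sshd_config below.
Note scp works by running an scp binary on the remote side, so killing
it means removing the binary or forcing a restricted shell, not an sshd
option:

    # /etc/ssh/sshd_config (sketch)
    X11Forwarding no
    AllowTcpForwarding no
    AllowAgentForwarding no
    PermitTunnel no
    # disable sftp by removing/commenting the subsystem line:
    #Subsystem sftp /usr/lib/openssh/sftp-server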

I'm not sure I buy the arguments I have seen in other parts of this
thread that a determined attack will always prevail.  I think this
assumes a passive defender.  I suspect determined attacker vs determined
(active) defender == stalemate.  Papers/research on the subject would
be welcome.

The game of how to make this as secure as possible interests me, so I
play with the assumption that the source is of infinite value and the
costs of securing it are minimal in comparison.

The following is very lengthy and does not do much to further the goal
of answering the OP's question.  It's mainly a thought experiment to
identify and mitigate possible vectors for intentional data loss.

In order to reduce possible egress of data the following measures could
be taken: host the source code on a separate network with workstations
directly attached to this physical network (you actually need very
little infrastructure to support this: no DHCP, no DNS; use static
configs and hosts files instead).  Electrically cripple external ports
such as USB, FireWire, parallel/serial ports, floppy disk drives,
writable optical media, wifi, and bluetooth.
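
The static setup really is tiny -- something like this per box, with
names and addresses invented for illustration:

    # /etc/hosts, identical on every machine on the closed network
    10.0.0.2    scm.internal      # source control
    10.0.0.3    build.internal    # build host

    # static addressing, no dhcp anywhere
    ifconfig eth0 10.0.0.10 netmask 255.255.255.252 up
    route add default gw 10.0.0.9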

Access to the internals of the computer should be prevented; most
machines have a loop for a locking device.  Attach machines to a
remotely switchable power distribution unit (e.g. an APC device) and
turn off power outside of business hours.  I assume physical/peer
security prevents an employee filming the monitor for source code.

No printers are allowed on the network; no devices other than
workstations, firewalls, and the servers required for builds and source
control management should be attached to it.  Shielded network cables
should be used, connected in a way that they cannot be removed and
connected to an external device.

All hosts should communicate via IPsec, ss{l,h} tunnels, OpenVPN, or
other VPN software.  A proprietary network protocol could be used
instead, but the benefits of peer review by the crypto community are
then lost.  No traffic should go plain text; hosts & users require
unique signed keys and must be validated before being allowed to
participate in network communication.
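
Even without full IPsec, wrapping individual services is cheap.  A
sketch, assuming an svnserve repository on an invented host, tunnelled
over ssh rather than crossing the wire in the clear:

    # forward the local svn port over ssh to the repository host
    ssh -N -L 3690:localhost:3690 dev@scm.internal
    # then check out through the tunnel
    svn co svn://localhost/trunk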

Workstations and server machines should be separated by a multi-homed
firewall, and each workstation should be in its own /30 subnet (four
addresses: network, gateway, host, and broadcast, with only the gateway
and host usable), which prevents cross-talk between workstations and
enhances isolation.  This prevents an attacker from attacking another
workstation.  Placing each workstation in its own VLAN is also an
option, but requires multiple interfaces to be set up on the firewall
side.  These measures limit the value of packet sniffing with the NIC
in promiscuous mode, as only traffic between the workstation and the
firewall (and onward servers) will be visible, and encrypted at that.
There are still attacks that could be performed to try and break the
encryption.
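
To make the /30 arithmetic concrete (addresses invented):

    10.0.0.8/30  -- workstation 1
        10.0.0.8    network
        10.0.0.9    gateway (firewall interface)
        10.0.0.10   workstation
        10.0.0.11   broadcast
    10.0.0.12/30 -- workstation 2, and so on every four addresses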

If internet access is to be permitted to the workstation, it must be
routed via a proxy behind the firewall.  The proxy should sit behind a
secondary firewall which allows only port 80 and 443 outbound traffic.
Only traffic from established streams should be allowed back in through
the firewall.  The proxy should terminate SSL and re-commence the secure
session, giving it a man-in-the-middle position on the requested SSL
stream; valid certs should be used and the proxy's root certificate
placed in the workstation browsers.  White-listing sites at the proxy
is the most viable option to implement and manage; an alternative would
be to allow GET requests (without query strings) more widely and only
allow POST (and GET with query strings) to, say, google & other
whitelisted sites.
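
That policy fits in a handful of squid ACLs.  A sketch (domains are
examples, and the 443 side assumes the SSL-terminating recompile
mentioned earlier):

    acl whitelist dstdomain .google.com .supplier.example.com
    acl has_query urlpath_regex \?
    acl plain_get method GET
    http_access allow whitelist
    http_access allow plain_get !has_query
    http_access deny all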

HTTP headers should be very tightly controlled: none from the
workstation should make it past the proxy.  HTTP methods should be
limited to GET and POST.  DNS resolution is performed at the proxy, and
ICMP traffic is not permitted out of the perimeter firewall.  Yet
another alternative is live approval of each site, with the proxy
contacting a moderator to authorise and waiting on their decision
before allowing or denying access.
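
Squid 3.x can express the header and method rules directly (squid 2.x
spelt request_header_access as header_access).  A sketch, which also
drops cookies as per the next paragraph:

    # pass only the bare minimum of request headers
    request_header_access Host allow all
    request_header_access Accept allow all
    request_header_access Content-Type allow all
    request_header_access Content-Length allow all
    request_header_access All deny all        # kills Cookie, User-Agent, ...
    reply_header_access Set-Cookie deny all   # see cookie point below
    acl safe_methods method GET POST
    http_access deny !safe_methods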

Inbound content should be filtered & limited to HTML, text, and common
image formats (nix these too if you can get away with it; GIF has been
exploited in the past, and some formats rely on zlib, where again
vulnerabilities have been found).  JavaScript should be stripped out of
the HTML streams unless whitelisted.  Cookies should be removed, period
(no sneaky googling the source and storing it in your search history).
Streams which appear to contain content different to their MIME type
should be aborted.  Outbound content should be filtered down to plain
text; other POSTs should be aborted.

Each employee will use an individual user account.  User accounts will
be removed when an employee leaves employment.  Users should be
unprivileged, as they are assumed to be developers and thus capable of
crafting the tools required to effect data theft.  Password rotation
policy will not be enforced (this reduces the risk of post-it note
theft).

I will still assume third-party software must go via an approval process
and must be installed by an administrator.  Security software such as
AppArmor, SELinux, grsecurity etc. should be enabled and active (or
suitably similar software for the OS).  Anti-exploitation technology
such as non-executable stack pages, address space randomisation, stack
canaries, & format string guards should be compiled in for all setuid
binaries (or the equivalent for other OSes) and core libraries, if not
the entire OS and application stack.  Hardware access & network
connectivity should be denied to applications not approved.  Logging
should go to centralised hosts with event correlation software
analysing the logs.
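
A sketch of what that looks like with a GNU toolchain plus rsyslog --
the exact flag set varies by distro and compiler version, and the log
host name is invented:

    # hardened build flags for setuid tools and core libraries
    gcc -O2 -fstack-protector-all -D_FORTIFY_SOURCE=2 -fPIE -pie \
        -Wl,-z,relro -Wl,-z,now -o tool tool.c
    # confirm full address space randomisation is on
    sysctl kernel.randomize_va_space    # want 2
    # forward everything to the central log host over tcp
    echo '*.* @@loghost.internal:514' >> /etc/rsyslog.conf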

Relevant anti-virus software should be installed.  Win, Mac, Linux
(RST-B, which is 10 years old & still does the rounds), Android, & iOS
all have viruses in the wild.  Win is the most prevalent as it has the
largest user base, and malware these days is financially motivated.  The
aim of the anti-virus software is to prevent users from creating code to
circumvent security, or exploitation of the browser and its supporting
libs; a product with runtime detection features is required for this.
This 'doubles up' on the anti-exploitation and security software: if
that layer is beaten, the AV will be the last line of defence for the
workstation.

At some point a build would need to be transferred.  This again would
need to be controlled with limited access: i.e. an sftp push, with host
authenticity ensured via ssh host keys to prevent MITM attacks, and the
build signed/encrypted with gpg/pgp so its authenticity can be asserted.
This wants to be the only permitted transfer out of the environment.
The build machine should collect source directly from source control
and should have access to only the source repo and the artifact
repository.  It's assumed the compiled code is then safe to distribute.
Direct access to the build machine from workstations should not be
permitted.
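
The push itself might look like this (host and file names invented; the
release host's key is pinned in its own known_hosts file):

    # sign, then push over sftp with the host key pinned
    gpg --armor --detach-sign release-1.0.tar.gz
    sftp -o StrictHostKeyChecking=yes \
         -o UserKnownHostsFile=/etc/ssh/known_hosts_release \
         release@drop.internal <<'EOF'
    put release-1.0.tar.gz
    put release-1.0.tar.gz.asc
    EOF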

Decommissioning of machines should ensure hard drives are thoroughly
scrubbed.  Degaussing and physical destruction are optional unless drive
failure prevents scrubbing.
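
Either of the usual tools will do for the scrub; both take hours on a
big drive, but that's rather the point:

    shred -v -n 3 /dev/sdX                  # three random passes
    dd if=/dev/urandom of=/dev/sdX bs=1M    # single random pass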

I look forward to seeing ideas for data ex-filtration :-)

> In other news, sort() in shell:
> <http://dis.4chan.org/read/prog/1295544154>.
> 
> 
> Dickon Hood





More information about the Sclug mailing list