Once Upon a Time There Was a Hosts File

When the Internet was little and called the ARPAnet, resolving computer names to their IP addresses wasn’t a big deal. In fact, because the network consisted of only a few hundred hosts, a single file called HOSTS.TXT was sufficient. This file contained the name-to-address mapping of every computer on the ARPAnet. Unix computers took HOSTS.TXT, built their own version from it and stored it in /etc/hosts – all was fine and dandy.

The HOSTS.TXT file was maintained by the Network Information Centre (NIC) and distributed from a single host. Clients would pick up a fresh copy every few days to see whether any new hosts had been added to the network. As the network got bigger, problems slowly appeared – here are some of the biggies:

  • Traffic – the toll on the SRI-NIC (the computer which held the master copy of HOSTS.TXT) became unbearable. Network traffic and CPU utilization were overloading the host.
  • Name Collisions – no two hosts on a network can have the same name. There was no system to enforce this uniqueness of host names, so duplicates started to appear in the host list as it got bigger.
  • Consistency – making sure that everyone had the correct version of HOSTS.TXT became extremely difficult. Machines on the far edges of the network would take so long to get an update that it was

It didn’t work. Name resolution started to cause havoc as the network grew, and mail servers fell over as duplicates appeared. Hundreds of versions of the HOSTS.TXT file caused loads of issues and the reliability of the network plummeted. A new system was needed and it was needed fast. That system was delivered by a chap called Paul Mockapetris, who released two RFCs, 882 and 883, which were the first definition of the Domain Name System – or DNS, as we mostly call it. These RFCs have since been superseded many times as security, administration and implementation problems have been identified and rectified.

The Internet as we know it relies not on some huge text file but on the name resolution delivered by the Domain Name System. DNS is simply a huge distributed database: each part of the data is controlled locally, yet all of it is accessible across the whole network through a client/server setup. Now this is where the history lesson finishes – I don’t want to start talking about name servers, resolvers or caching as you can find that stuff in other places.

Here on theninjaproxy.org we like our information a little more practical – so let’s have a look at a little legacy of the HOSTS.TXT file that is used as a first step of resolution by Windows TCP/IP.

There’s the little fellow – a text file called hosts, which is your computer’s first port of call in name resolution before it tries other methods such as DNS.

It can be used to block or filter websites, and hackers use it to infect clients with viruses and trojans by redirecting them to nasty sites. Plenty of places also still use it to make web-based applications work properly or to redirect clients to specific computers.

It’s quite simple to use – here’s a brief illustration.  We are going to redirect a web site to a different place using the hosts file -

Let’s redirect our web surfer to somewhere pleasing to the eye – playboy.com. First we find the IP address of the site by pinging it: 216.18.172.158. Next we need to make some simple modifications to our hosts file – you’ll usually need administrator access to alter this file.


You can see we have added a line telling the computer that the site www.google.com can be found at the address 216.18.172.158 (oh no it can’t!).

Of course you’ve guessed what will happen when anyone tries to visit Google on this computer!
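To see why that one line is enough, here’s a rough Python sketch of the lookup order described above: hosts file first, DNS only if the name isn’t listed. The helper names (parse_hosts, resolve) are made up for illustration, and the “first entry wins” rule is an assumption about typical resolver behaviour rather than anything taken from the Windows code.

```python
import socket


def parse_hosts(text):
    """Return a dict mapping lowercase host names to IP address strings."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address with no name is not a usable entry
        address, names = fields[0], fields[1:]
        for name in names:
            # Assumption: if a name appears twice, the first entry wins.
            table.setdefault(name.lower(), address)
    return table


def resolve(name, hosts_table, dns_lookup=socket.gethostbyname):
    """Check the hosts table first and only fall back to DNS on a miss."""
    address = hosts_table.get(name.lower())
    if address is not None:
        return address
    return dns_lookup(name)


# The same kind of entry as in the example above.
EXAMPLE_HOSTS = """\
# sample hosts file
127.0.0.1        localhost
216.18.172.158   www.google.com
"""

table = parse_hosts(EXAMPLE_HOSTS)
print(resolve("www.google.com", table))  # prints 216.18.172.158, DNS is never asked
```

Because the name is found in the table, the real DNS servers never get a say, which is exactly why the redirect works.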

This sometimes doesn’t work as well on the bigger sites that rotate their IPs over lots of servers, and you may have to clear your cache with CCleaner beforehand. But you get the idea. Another slight modification is that you can use the hosts file to block access to sites too. Instead of redirecting a site to a different IP address you can just redirect it to your local computer using 127.0.0.1.

For example perhaps you are getting pissed about all the adverts that are served on websites from ad.doubleclick.net, simply add this line to your hosts file.

127.0.0.1     ad.doubleclick.net

This will have the effect of blocking access to that website (and blocking its adverts). It’s a crude but reasonably effective way of blocking access to specific websites on a particular computer. Many companies or schools use this method on public facing or ‘kiosk’ machines.

Unfortunately hackers use this method too: viruses modify your hosts file so that popular sites like Facebook send your machine to malicious websites instead. So it’s always worth checking your hosts file occasionally to see that all is in order.
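If you fancy automating that check, here’s a minimal Python sketch. It flags any entry where a name you care about points somewhere other than your own machine. The watch list, the function name and the sample addresses are all invented for the example, and you would need to read in your real hosts file yourself.

```python
import ipaddress

# Hypothetical watch list: names that should normally not be in a hosts file.
WATCHED_NAMES = {"www.facebook.com", "facebook.com", "www.google.com"}


def suspicious_entries(hosts_text, watched=WATCHED_NAMES):
    """Return (name, address) pairs where a watched name is redirected elsewhere."""
    findings = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if len(fields) < 2:
            continue
        try:
            address = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # skip malformed lines
        # Loopback or 0.0.0.0 entries are treated as deliberate blocks, not redirects.
        if address.is_loopback or address.is_unspecified:
            continue
        for name in fields[1:]:
            if name.lower() in watched:
                findings.append((name.lower(), str(address)))
    return findings


SAMPLE = """\
127.0.0.1     localhost
127.0.0.1     ad.doubleclick.net
203.0.113.7   www.facebook.com
"""

print(suspicious_entries(SAMPLE))  # [('www.facebook.com', '203.0.113.7')]
```

The ad-blocking line is left alone because it points at 127.0.0.1, while the Facebook line stands out because it points at an outside address.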

Internet Filtering, Censorship, Surveillance and Stuff Like That

One of the many justifications used across the world by agencies, governments, regimes etc for spying on us and filtering internet feeds is that it actually protects us. By that they generally mean that by employing these tactics they are able to catch more terrorists, paedophiles and various nasty people using the internet for their naughtiness. In fact in many sectors of society, if you argue that the internet shouldn’t be monitored or filtered then you will often find yourself grouped with these unsavory characters. Now just to clarify, I’m not talking about carefully targeted surveillance and filtering of suspects (fair enough on that) but the general broad monitoring and filtering of an entire population on the off chance of picking up something interesting!

The problem is that it’s utter rubbish for one very good reason – it simply doesn’t work. It’s all very well a Government thinking that they can routinely pick up terrorists by swooping on a Facebook page – but in reality what sort of hardened operatives are they going to pick up? One thing’s for sure, they won’t be very clever – in fact you’ll probably pick up the likes of these two harmless muppets who tried to organise a riot on Facebook. Their riot attracted no rioters and they were picked up and sentenced to four years (which will probably be reduced to 2 weeks on appeal).

Jordan Blackshaw, left, and Perry Sutcliffe-Keenan

Now to be honest I don’t know about you, but I might be prepared to concede a large part of my liberty and privacy if I thought the world would become a genuinely safer and better place. However, picking up the likes of these two hardly meets that criterion.

The point I’m trying to make is that when internet filtering, censoring and surveillance techniques are utilised, the only people who are affected are those with nothing to hide, plus perhaps a few thick criminals/terrorists who are probably of limited danger. There are many ways to circumvent filters and lots of ways to communicate anonymously, and all those who need to are doing just that.

Do Al Qaeda communicate through Facebook, MySpace or Twitter? I suspect not. Do they send out their orders by emails in clear text with PDF attachments detailing their targets? Of course they don’t. They’ll be using TOR, encrypted emails, hidden web sites and communication networks on the Dark web. There will be codes, ciphers and carefully devised communication methods and strategies, plus loads of the other stuff covered here on The Ninja Proxy!

Of course they might be like this lot from the rather funny film Four Lions -

 

But I suspect not.

Why Can’t I Use a Proxy

We’ve all been there – you’re stuck in work or school, and frankly bored out of your brain.   Sure you have internet access but all the most interesting sites are blocked -

  • Facebook Blocked
  • Youtube Blocked
  • MySpace Blocked
  • World of Warcraft (games and forum) Blocked

So why’s it happening and what can you do about it?

Your company or school controls your access to the internet at several points and is blocking your access at several levels.

The first control is probably their own proxy server. If you go and look in Tools/Internet Options/Connections/LAN Settings (or something similar in other browsers) you’ll probably see a proxy server set. That address will be a server controlled by your company, through which they force all internet traffic. If they’ve done a decent job you won’t be able to change this.

The settings will normally be deployed by something called GPOs (Group Policy Objects), which are the way most organisations control how their computers look and behave. These apply settings like specific desktops, screensavers and Internet Explorer settings each time you boot up your computer.

Therefore absolutely everything you request goes through the company proxy server. You might think you’re being clever searching for ninja proxy sites on the internet but I’m afraid you’re not. All you are doing is creating a log of you searching for ‘ninja proxy sites online’, and letting administrators know you want to bypass their settings. The proxy server will be set to filter out all such requests by a variety of methods. The most common one will be a huge list of URLs containing all the dodgy one-page Glype proxy installations online.

So you need to bypass this proxy server – or do you?

If the organisation has its network set up properly then using an alternative browser or modifying the proxy settings in IE will not work anyway. The reason is that your company firewall, the hardware device which controls all the traffic in and out of your network, should only allow web traffic out from one specific address – the proxy server. So if you bypass the proxy, your request will come from your own IP address and get blocked.
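That firewall rule boils down to something like the following Python sketch. Real firewalls are configured in their own rule languages, so treat this purely as an illustration of the logic; the proxy address, the port list and the function name are all made up.

```python
import ipaddress

PROXY_ADDRESS = ipaddress.ip_address("10.0.0.10")  # hypothetical company proxy
WEB_PORTS = {80, 443}                              # plain and encrypted web traffic


def check_outbound(source_ip, dest_port, alerts):
    """Return True if this simplified firewall would let the connection out."""
    source = ipaddress.ip_address(source_ip)
    if dest_port not in WEB_PORTS:
        return False  # this sketch only models web traffic and drops the rest
    if source == PROXY_ADDRESS:
        return True   # only the proxy may talk to the outside web
    # Anything else is a client trying to go around the proxy.
    alerts.append(f"web request from incorrect internal client {source}")
    return False


alerts = []
print(check_outbound("10.0.0.10", 80, alerts))  # True: the proxy itself
print(check_outbound("10.0.0.57", 80, alerts))  # False: a desktop going direct
print(alerts)
```

The second call is the interesting one: the request is refused and an alert naming the offending PC is left behind for anyone who cares to look.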

Then a couple of things might happen -

  • The alert will be flagged on the firewall (Web requests from an incorrect internal client)
  • The administrator will track down the PC and find out it’s been modified.

But don’t worry: in reality probably nobody ever looks at the logs, and most firewalls generate so many alerts that nobody ever looks at those either.

The point is that searching for online web proxies is simply a waste of time. To bypass most corporate proxies you need to go through that proxy and not around it. Through it, because any other originating IP address will get blocked and may possibly wake up your IT Department. But you need to stop the proxy blocking access based on the content (what you are requesting) and the URL (the actual site you want to visit).

There are two things you can do to allow this – first you need encryption so that nothing can see inside your web request and secondly you need some low key server outside the network to relay your request.  These two requirements if implemented correctly will allow you to tunnel through any corporate network firewall or proxy and also keep your surfing private from the administrators and logs.

Types of Filtering and Ninja Bypassing

Internet filtering used to be relatively scarce but it’s extremely common now and takes a variety of forms. The two most basic forms are URL and content filtering.

URL Filtering

A typical example of URL filtering is where the requested URL of a web site is intercepted by the proxy or firewall and compared to a big list of ‘bad URLs’. If the URLs match then the request is denied and blocked. In this case the user is normally redirected to an error page, although in some cases the request will be logged and an administrator alerted. It’s not a great system, because an extensive list of URLs can have a big performance impact – and remember this impact applies to all requests, even those that don’t contain a blocked site.

In recent years some performance improvements have been made to alleviate the issues.  For instance some URL filtering systems use hash values of the URLs rather than the addresses themselves.  The hash values can be ordered so that the system can locate information faster (by jumping to specific points in the list rather than searching from start to finish).   Most systems you’ll find in corporate environments will use URL filtering to some extent.

There can be lots of other problems with filtering simply based on a list, especially if you use the hash value searching system. The URLs have to be complete, and only that exact, specific address is restricted. Many websites have multiple domain names and aliases, so any list has to have all these URLs listed too.
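Here’s a small Python sketch of the hashed list idea, including the exact-match weakness. It is only one plausible way of doing it (SHA-256 hashes kept in a sorted list and searched with a binary search); real products will differ, and the URLs and function names are invented.

```python
import bisect
import hashlib


def url_hash(url):
    """Hash the URL exactly as given; no normalisation is attempted."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def build_blocklist(urls):
    """Store only the hashes, sorted so they can be binary searched."""
    return sorted(url_hash(u) for u in urls)


def is_blocked(url, blocklist):
    """Jump straight to where the hash would sit instead of scanning the list."""
    h = url_hash(url)
    i = bisect.bisect_left(blocklist, h)
    return i < len(blocklist) and blocklist[i] == h


blocklist = build_blocklist([
    "http://badsite.example/",
    "http://proxy.example/index.php",
])

print(is_blocked("http://badsite.example/", blocklist))      # True
print(is_blocked("http://www.badsite.example/", blocklist))  # False, the alias slips through
```

The second lookup shows the problem: the www alias produces a completely different hash, so unless it has its own entry in the list it is not blocked.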

Content Filtering

Just like URL filtering has a noticeable impact on performance, the same can be said of content filtering.   Content filters look inside the data being transmitted – their goal is not only to block access to inappropriate sites but also to check for security risks.  A content filtering system will often be set to filter out specific objects like Java or ActiveX.   They also check for viruses and other security problems entering the network.

These filtering systems are very sophisticated, though analysing the actual packet data is bound to have an impact on any network’s performance. Content filters will usually defeat the use of anonymous proxies as the end URL is irrelevant – the data itself is being scanned, which will reveal both the proxy address and the destination URL. An example of one of the most widely used content filters is WebSense, which uses a variety of plug-ins and runs on dedicated hardware strategically placed with a tap into all network traffic.

Ninja Bypassing of Filtering Systems

Defeating a URL filtering system is normally fairly straightforward: most anonymous ninja proxy servers available on the internet will suffice. The only difficulty is that most URL lists contain a large selection of these sites – so if the one you use is on the list you’re going to get blocked. Not only that, but the administrator will likely be informed that someone is deliberately trying to bypass corporate restrictions. If you set up your own using a hosting account and a Glype installation then you’ll likely be able to surf under the radar.

Unfortunately the vast majority of filtering devices now use both URL and content filtering technology. The normal web proxy sites you’ll see on the internet promising you complete anonymity and the ability to bypass filters are completely useless against them. The content filter will look into the packet itself – the fact that you are using a proxy and a fake IP is irrelevant.

There is only one effective way to defeat a genuine content filter and that is to encrypt your surfing. In this case the URLs and sites you are visiting cannot be read by the content filters.
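A toy Python sketch makes the difference obvious. The “filter” here just searches the raw bytes for blocked names, and the “encryption” is a home-made XOR keystream that stands in for a proper encrypted tunnel. Both are assumptions made purely for illustration: real content filters are far more elaborate, and you should never use a cipher like this for anything real. The request format and the proxy name are invented too.

```python
import hashlib

BLOCKED_TERMS = [b"facebook.com", b"youtube.com"]  # hypothetical filter rules


def content_filter_blocks(payload):
    """Return True if any blocked term is visible in the raw bytes."""
    lowered = payload.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def toy_encrypt(data, key):
    """XOR the data with a SHA-256 based keystream. Illustration only, not real crypto."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# A request sent in the clear to an ordinary web proxy: the destination is
# still sitting in the data for anyone to read.
request = (
    b"GET /browse.php?u=http://www.facebook.com/ HTTP/1.1\r\n"
    b"Host: proxy.example\r\n\r\n"
)

print(content_filter_blocks(request))                            # True
print(content_filter_blocks(toy_encrypt(request, b"demo key")))  # False
```

The plain request gets caught even though it is addressed to the proxy, because the destination is readable inside it. Once the same bytes are scrambled, the filter has nothing to match against.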