Shared Responsibility 3 - Identify and Destroy the Bots




Please note: you should provide an alternative login path for blind or vision-impaired users. The techniques below will prevent them from using your application, so they could be accommodated using an alternative two-factor authentication process.

In my last post I detailed how to create dynamic CSS selectors to make it difficult to write reliable Bot scripts against your login page. The next part of this series is to identify the Bot and take retaliatory action. The code for this post is available as a Gist in case Blogger mangles it again. The code for creating dynamic CSS selectors, including some additional decoy elements, looks like this:

function dynamicCSS() {
  var username, password
  var x = ''
  // Randomly vary the surrounding markup on every request
  if ((Math.random() * 2) > 1)
    x += '<div class="login-box">'
  else
    x += '<div class="signin-box">'
  x += '<h2>Please Login</h2>'
  // Random ids for the real username and password inputs
  username = 'id' + Math.floor(Math.random() * 100000)
  password = 'id' + Math.floor(Math.random() * 100000)
  // A random number (2-6) of hidden decoy inputs before and after the real ones
  var y = Math.floor(Math.random() * 5) + 2
  for (var a = 0; a < y; a++) {
    x += '<input type="text" id="id' + Math.floor(Math.random() * 100000) + '" style="display:none">'
  }
  x += '<input type="text" id="' + username + '" placeholder="Username">'
  x += '<input type="password" id="' + password + '" placeholder="Password">'
  for (var a = 0; a < y; a++) {
    x += '<input type="text" id="id' + Math.floor(Math.random() * 100000) + '" style="display:none">'
  }
  // Pass the random ids through to the client-side submit handler
  x += '<button onclick="submitForm(\'' + username + '\',\'' + password + '\')">Login</button>'
  x += '</div>'
  return x
}

In a real app you would set this up using a Jade template, but for simplicity we will just send the raw HTML string.

Analysing Header Information

The first thing we can look at is the header information sent to our NodeJS EC2 instance. I conducted some tests using a number of different browsers and also using the very popular PhantomJS headless webkit to find any clear differences between a real browser and a headless browser. Below are the results.

Request headers from Chrome:

{
  "host": "54.197.212.141",
  "connection": "keep-alive",
  "content-length": "31",
  "accept": "*/*",
  "origin": "http://54.197.212.141",
  "x-requested-with": "XMLHttpRequest",
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36",
  "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
  "referer": "http://54.197.212.141/",
  "accept-encoding": "gzip, deflate",
  "accept-language": "en-GB,en-US;q=0.8,en;q=0.6"
}

Request headers from Firefox:

{
  "host": "54.197.212.141",
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0",
  "accept": "*/*",
  "accept-language": "en-US,en;q=0.5",
  "accept-encoding": "gzip, deflate",
  "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
  "x-requested-with": "XMLHttpRequest",
  "referer": "http://54.197.212.141/",
  "content-length": "10",
  "connection": "keep-alive"
}

Request headers from Safari:

{
  "host": "54.197.212.141",
  "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
  "origin": "http://54.197.212.141",
  "accept-encoding": "gzip, deflate",
  "content-length": "9",
  "connection": "keep-alive",
  "accept": "*/*",
  "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/601.7.8 (KHTML, like Gecko) Version/9.1.3 Safari/601.7.8",
  "referer": "http://54.197.212.141/",
  "accept-language": "en-us",
  "x-requested-with": "XMLHttpRequest"
}

Request headers from Internet Explorer:

{
  "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
  "accept": "*/*",
  "x-requested-with": "XMLHttpRequest",
  "referer": "http://54.197.212.141/",
  "accept-language": "en-US;q=0.5",
  "accept-encoding": "gzip, deflate",
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko",
  "host": "54.197.212.141",
  "content-length": "9",
  "dnt": "1",
  "connection": "Keep-Alive",
  "cache-control": "no-cache"
}

Request headers from PhantomJS:

{
  "accept": "*/*",
  "referer": "http://54.197.212.141/",
  "origin": "http://54.197.212.141",
  "x-requested-with": "XMLHttpRequest",
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0",
  "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
  "content-length": "9",
  "connection": "Keep-Alive",
  "accept-encoding": "gzip, deflate",
  "accept-language": "en-AU,*",
  "host": "54.197.212.141"
}

I won't show the results for every browser tested, but the PhantomJS header has a clear structure that differs from conventional browsers: in particular, it starts with "accept" and finishes with "host". Analysing the header structure can therefore help distinguish a Bot from a real person.
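As a sketch of that idea (the function name here is my own, not from the post), the first and last header keys can be compared against the PhantomJS pattern above:

```javascript
// Hypothetical helper: flag a request whose headers arrive in the
// PhantomJS order observed above (first key "accept", last key "host").
// Node preserves the insertion order of string keys, so Object.keys
// reflects the order in which the headers were received.
function looksLikePhantomHeaders(headers) {
  var keys = Object.keys(headers)
  return keys[0] === 'accept' && keys[keys.length - 1] === 'host'
}
```

A match on its own is weak evidence; it should only contribute to a bot score rather than trigger action by itself.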

Plugin Support

Support for plugins in headless browsers is minimal or non-existent. We can also use the JavaScript navigator object on the client to get information on supported plugins and other browser features. A simple test is to check the length of the navigator.plugins array. To do this we need a JavaScript file that runs on the client.

In the public directory of your NodeJS app, create a new file called login.js (use your own IP address, of course):

function submitForm(username, password) {
  var loginURL = 'http://54.197.212.141' + '/login'
  var user = $('#' + username).val()
  var pass = $('#' + password).val()
  $.post(loginURL, {
    username: user,
    password: pass,
    plugins: navigator.plugins.length
  })
    .done(function (result) {
      alert(result)
    })
}

This code posts information about the browser back to your NodeJS server. Although headless browsers can post forms, in my experiments they did not process jQuery post commands running on the client, whereas every normal browser handled them without issue.

If the post is submitted, then further identification can occur. In this example we just check the length of the plugins array and also send the values of the filled-out input fields. If it is a Bot, the wrong input fields will be submitted and the plugins array length will be zero.

You will also need to install and enable bodyParser in your index.js file to receive the information:

var bodyParser = require('body-parser')
app.use(bodyParser.urlencoded({ extended: true }));

You will also need to update the dynamicCSS function to include jQuery on the client:

x += '<script src="https://code.jquery.com/jquery-3.1.0.min.js"></script>'

Also add a link to your login.js file:

x += '<script src="login.js"></script>'

Additional security can be achieved by creating multiple versions of the login file with different names for the parameters (username/password/plugins). The backend code that handles the post request would expect the fields defined in the login file that was served; anything different in the posted body means the request is most probably from a Bot. Another option is to serve the login.js file from the server side with random parameter names and set up a route for it to be downloaded. For simplicity I won't add the extra code for this, but the implementation is quite straightforward.
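A minimal sketch of the random-parameter idea (the function and field names here are my own, not from the post): generate per-session field names, remember them on the server, and treat any POST that uses different names as a Bot signal.

```javascript
// Hypothetical helper: generate random field names for one session.
// The server would store these (e.g. in the session) and substitute
// them into the login.js template before serving it.
function randomFieldNames() {
  function id() {
    return 'f' + Math.random().toString(36).slice(2, 10)
  }
  return { username: id(), password: id(), plugins: id() }
}
```

When the login POST arrives, the handler looks up the names saved for that session; a request body containing any other field names adds to the bot score.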

Building a Profile of the Bot


It is important to use a number of techniques to identify bots so that you do not end up with a case of mistaken identity. It is a good idea to use a scoring system and define a threshold that triggers corrective action.

In the index.js file that runs on your server, add the following route:

app.post('/login', function(request, response) {
  var botScore = 0
  if (request.body.plugins == 0) ++botScore // Empty plugins array
  if (!request.body.username) ++botScore // Clicked on decoy inputs
  if (!request.body.password) ++botScore
  if (getObjectKeyIndex(request.headers, 'host') != 0) ++botScore // Bot type header info
  if (getObjectKeyIndex(request.headers, 'accept') == 0) ++botScore
  if (getObjectKeyIndex(request.headers, 'referer') == 1) ++botScore
  if (getObjectKeyIndex(request.headers, 'origin') == 2) ++botScore
  console.log('Bot score = ' + botScore)
  if (botScore > 4) {
    console.log('Destroy Bot')
    response.send('fail')
  }
  else {
    response.send('Logged in ' + request.body.username)
  }
})
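The getObjectKeyIndex helper used above is not shown in the post; a minimal implementation, assuming it returns the position of a key within an object, could look like this:

```javascript
// Returns the position of a key within an object's keys, or -1 if
// the key is absent. Node preserves the insertion order of string
// keys, so the positions mirror the order the headers arrived in.
function getObjectKeyIndex(obj, key) {
  return Object.keys(obj).indexOf(key)
}
```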

We are now building a bot score and deciding whether to allow access based on that score.

Attacking the Bot with Javascript

The following technique needs to be used with caution.

If you want to ensure the Bot is destroyed for the good of the internet community then you can look at launching a counterattack on the bot with your own malicious script to crash the browser.

This can be achieved quite simply with an endless loop that reloads the page. This blocks script execution in the browser and will eventually cause it to crash. Update your client-side code in login.js with:


function submitForm(username, password) {
  var loginURL = 'http://54.197.212.141' + '/login'
  var user = $('#' + username).val()
  var pass = $('#' + password).val()
  $.post(loginURL, {
    username: user,
    password: pass,
    plugins: navigator.plugins.length
  })
    .done(function (result) {
      if (result == 'fail')
        while (true) location.reload(true) // Crash the Bot
      else {
        alert(result)
      }
    })
}


In my next post I will look at the different types and benefits of CAPTCHAs and how to enable them on your site.

Be sure to subscribe to the blog so that you can get the latest updates.

For more AWS training and tutorials check out backspace.academy

BackSpace Academy CEO BackSpace Technology LLC

Providing the best value AWS certification courses and exam engines.
