YAML and CloudFormation. Yippee!!!

YAML: YAML Ain't Markup Language


I spend a heck of a lot of time coding and, like many devops guys, I love Coffeescript, Jade, Stylus and YAML. No chasing missing semicolons, commas and curly braces. I just write clean code the way it should be, and at least twice as fast.

JSON, like plain javascript, is a lot cleaner, quicker and easier to read when you remove all those curly braces, commas etc. YAML does just that!

AWS just announced support for YAML with CloudFormation templates. I would thoroughly recommend you check it out and start using YAML. It will make a big difference to your productivity, and your templates will be much easier to read and understand.

YAML, like Coffeescript, Jade and Stylus, makes use of indenting in code to eliminate the need for braces and commas. When you're learning YAML, you can use a JSON to YAML converter (eg http://www.json2yaml.com) to convert your existing JSON to YAML.

(Very) Basics of YAML

Collections using Indentation eliminate the need for braces and commas with Objects:

JSON
"WebsiteConfiguration": {
  "IndexDocument": "index.html",
  "ErrorDocument": "error.html"
}

YAML
WebsiteConfiguration:
  IndexDocument: index.html
  ErrorDocument: error.html

Sequences with Dashes eliminate the need for square brackets and commas with Arrays:

JSON
[
  "S3Bucket",
  "DomainName"
]

YAML
  - S3Bucket
  - DomainName

Full Example

Here is a full example I created for S3. I'll let you be the judge which one is better!

JSON:

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "AWS CloudFormation Sample Template",
  "Resources": {
    "S3Bucket": {
      "Type": "AWS::S3::Bucket",
      "Properties": {
        "AccessControl": "PublicRead",
        "WebsiteConfiguration": {
          "IndexDocument": "index.html",
          "ErrorDocument": "error.html"
        }
      },
      "DeletionPolicy": "Retain"
    }
  },
  "Outputs": {
    "WebsiteURL": {
      "Value": {
        "Fn::GetAtt": [
          "S3Bucket",
          "WebsiteURL"
        ]
      },
      "Description": "URL for website hosted on S3"
    },
    "S3BucketSecureURL": {
      "Value": {
        "Fn::Join": [
          "",
          [
            "https://",
            {
              "Fn::GetAtt": [
                "S3Bucket",
                "DomainName"
              ]
            }
          ]
        ]
      },
      "Description": "Name of S3 bucket to hold website content"
    }
  }
}

YAML:

---
AWSTemplateFormatVersion: '2010-09-09'
Description: AWS CloudFormation Sample Template
Resources:
  S3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: PublicRead
      WebsiteConfiguration:
        IndexDocument: index.html
        ErrorDocument: error.html
    DeletionPolicy: Retain
Outputs:
  WebsiteURL:
    Value:
      Fn::GetAtt:
      - S3Bucket
      - WebsiteURL
    Description: URL for website hosted on S3
  S3BucketSecureURL:
    Value:
      Fn::Join:
      - ''
      - - https://
        - Fn::GetAtt:
          - S3Bucket
          - DomainName
    Description: Name of S3 bucket to hold website content



Shared Responsibility 3 - Identify and Destroy the Bots



Please note: You should have a link in your login for blind or vision impaired people. These techniques will prevent them from using your application. They could be accommodated using an alternative dual factor authentication process.

In my last post I detailed how to create dynamic CSS selectors to make it difficult to write reliable Bot scripts. The next part of this series is to identify the Bot and take retaliatory action. The code for this post is available at Gist in case Blogger screws it up again. The code for creating dynamic CSS selectors, including some additional decoy elements, looks like this:

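// The HTML strings are a reconstruction from the description in the text
// (the original markup was lost in publishing); the styling is illustrative.
function dynamicCSS(){
  var username, password, a
  var x = ''
  // Pick one of two positions for the form
  if ((Math.random()*2) > 1)
    x += '<style>form { position: absolute; top: 50px; left: 50px; }</style>'
  else
    x += '<style>form { position: absolute; top: 150px; left: 300px; }</style>'
  x += '<form>'
  x += '<h1>Please Login</h1>'
  var y = Math.floor((Math.random()*5)) + 2
  // Stacked inputs: the last pair is the real one and covers the decoys
  for (a=0; a<y; a++){
    username = randomString()
    password = randomString()
    x += '<input id="' + username + '" name="' + username + '" type="text" placeholder="username" style="position:absolute;top:80px">'
    x += '<input id="' + password + '" name="' + password + '" type="password" placeholder="password" style="position:absolute;top:120px">'
  }
  // Hidden decoys after the real inputs
  for (a=0; a<y; a++){
    x += '<input id="' + randomString() + '" type="text" placeholder="username" style="visibility:hidden">'
    x += '<input id="' + randomString() + '" type="password" placeholder="password" style="visibility:hidden">'
  }
  x += '<button type="button" style="position:absolute;top:160px" onclick="submitForm(\'' + username + '\', \'' + password + '\')">Login</button>'
  x += '</form>'
  return x
}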

In a real app you would set this all up using a jade template but for simplicity we will just send raw html.

Analysing Header Information

The first thing we can look at is the header information sent to our NodeJS EC2 instance. I conducted some tests using a number of different browsers and also using the very popular PhantomJS headless webkit to find any clear differences between a real browser and a headless browser. Below are the results.

Request headers from Chrome:

{
  "host": "54.197.212.141",
  "connection": "keep-alive",
  "content-length": "31",
  "accept": "*/*",
  "origin": "http://54.197.212.141",
  "x-requested-with": "XMLHttpRequest",
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36",
  "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
  "referer": "http://54.197.212.141/",
  "accept-encoding": "gzip, deflate",
  "accept-language": "en-GB,en-US;q=0.8,en;q=0.6"
}

Request headers from Firefox:

{
  "host": "54.197.212.141",
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0",
  "accept": "*/*",
  "accept-language": "en-US,en;q=0.5",
  "accept-encoding": "gzip, deflate",
  "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
  "x-requested-with": "XMLHttpRequest",
  "referer": "http://54.197.212.141/",
  "content-length": "10",
  "connection": "keep-alive"
}

Request headers from Safari:

{
  "host": "54.197.212.141",
  "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
  "origin": "http://54.197.212.141",
  "accept-encoding": "gzip, deflate",
  "content-length": "9",
  "connection": "keep-alive",
  "accept": "*/*",
  "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/601.7.8 (KHTML, like Gecko) Version/9.1.3 Safari/601.7.8",
  "referer": "http://54.197.212.141/",
  "accept-language": "en-us",
  "x-requested-with": "XMLHttpRequest"
}

Request headers from Internet Explorer:

{
  "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
  "accept": "*/*",
  "x-requested-with": "XMLHttpRequest",
  "referer": "http://54.197.212.141/",
  "accept-language": "en-US;q=0.5",
  "accept-encoding": "gzip, deflate",
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko",
  "host": "54.197.212.141",
  "content-length": "9",
  "dnt": "1",
  "connection": "Keep-Alive",
  "cache-control": "no-cache"
}

Request headers from PhantomJS:

{
  "accept": "*/*",
  "referer": "http://54.197.212.141/",
  "origin": "http://54.197.212.141",
  "x-requested-with": "XMLHttpRequest",
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0",
  "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
  "content-length": "9",
  "connection": "Keep-Alive",
  "accept-encoding": "gzip, deflate",
  "accept-language": "en-AU,*",
  "host": "54.197.212.141"
}

I won't put all the results here for all the browsers tested, but the PhantomJS header has a clear structure that differs from conventional browsers, in particular starting with "accept" and finishing with "host".
Analysing the header structure can help distinguish a Bot from a real person.
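
The ordering check can be sketched as a small helper. This is only a minimal illustration: it assumes the keys of the NodeJS request.headers object come back in the order the headers arrived, the function names headerOrder and matchesPhantomOrder are made up for this example, and the pattern is based solely on the single PhantomJS sample above.

```javascript
// Hypothetical helpers, not part of the code for this post.
// Returns the header names in the order they were received.
function headerOrder(headers) {
  return Object.keys(headers)
}

// True when the headers start with "accept" and finish with "host",
// the pattern seen in the PhantomJS sample above.
function matchesPhantomOrder(headers) {
  var keys = headerOrder(headers)
  return keys.length > 1 &&
    keys[0] === 'accept' &&
    keys[keys.length - 1] === 'host'
}

// Trimmed-down versions of the Chrome and PhantomJS samples
var chromeHeaders = { host: '54.197.212.141', connection: 'keep-alive', accept: '*/*' }
var phantomHeaders = { accept: '*/*', referer: 'http://54.197.212.141/', host: '54.197.212.141' }

console.log(matchesPhantomOrder(chromeHeaders))  // false
console.log(matchesPhantomOrder(phantomHeaders)) // true
```

A match on its own proves very little, since a bot author can reorder headers, so it is better treated as one signal among several.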

Plugin Support

Support for plugins with headless browsers is minimal or non-existent. We can also use the javascript navigator object on the client to get information on supported plugins and other browser features. A simple test is to check the length of the plugins array. To do this we need to create a javascript file that runs on the client.

In the public directory of your NodeJS app create a new file called login.js (use your IP address of course):

function submitForm(username, password) {
  var loginURL = 'http://54.197.212.141' + '/login'
  // username and password are the ids of the real input elements
  var user = $('#' + username).val()
  var pass = $('#' + password).val()
  $.post( loginURL, {
    username: user,
    password: pass,
    plugins: navigator.plugins.length
  })
    .done(function( result ) {
      alert(result)
    })
}

This code will post information about the browser back to your NodeJS server. Although headless browsers can post forms, in my experiments they did not process jQuery post commands running on the client. In contrast, it has worked on all normal browsers without issue.

If the post is submitted then further identification can occur. In this example we will just check the length of the plugins array and also send the filled out input fields. If it is a Bot then the wrong input fields will be submitted and the plugins array length will be zero.

You will also need to install and enable bodyParser in your index.js file to receive the information:

var bodyParser = require('body-parser')
app.use(bodyParser.urlencoded({ extended: true }));

You will also need to update the dynamicCSS function to include jQuery on the client:

x += '<script src="https://code.jquery.com/jquery-3.1.0.min.js"></script>'

Also add a link to your login.js file:

x += '<script src="login.js"></script>'

Additional security can also be achieved by creating multiple versions of the login file with different names for the parameters (username/password/plugins). The backend code that handles the post request would expect the fields defined in the login file that was served. Anything different that is posted means the request is most probably from a Bot. Another option is to generate the login.js file on the server with random parameters and set up a route for it to be downloaded. For simplicity I won't add the extra code for this, but implementation is quite straightforward.
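
For anyone curious, one way the server-generated version could look is sketched below. Treat it as a rough illustration rather than the code behind this post: randomFieldName and buildLoginScript are hypothetical names, and it assumes the generated script is used with jQuery and the submitForm(username, password) signature from login.js.

```javascript
var crypto = require('crypto')

// Hypothetical helper: a random field name, "f" followed by 10 hex characters
function randomFieldName() {
  return 'f' + crypto.randomBytes(5).toString('hex')
}

// Builds the source of a login script whose POST field names are random.
// Returns the script plus the field names the server should expect back.
function buildLoginScript(loginURL) {
  var fields = {
    username: randomFieldName(),
    password: randomFieldName(),
    plugins: randomFieldName()
  }
  var source =
    'function submitForm(username, password) {\n' +
    '  var data = {}\n' +
    '  data["' + fields.username + '"] = $("#" + username).val()\n' +
    '  data["' + fields.password + '"] = $("#" + password).val()\n' +
    '  data["' + fields.plugins + '"] = navigator.plugins.length\n' +
    '  $.post(' + JSON.stringify(loginURL) + ', data).done(function(result) { alert(result) })\n' +
    '}\n'
  return { fields: fields, source: source }
}

var built = buildLoginScript('/login')
console.log(built.fields)
console.log(built.source)
```

The server would need to remember built.fields for the visitor it served the script to (how you store that is up to you) and count a post that uses any other field names towards the Bot score.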

Building a Profile of the Bot


It is quite important that you use a number of techniques to identify bots to make sure you do not have a case of mistaken identity. It is a good idea to use a score system and identify a level which will trigger corrective action.

Update the index.js file that runs on your server with the following code:

// Position of a header in the order it was received (-1 if absent).
// Assumed helper: it is called below but was not shown in this post.
function getObjectKeyIndex(obj, key) {
  return Object.keys(obj).indexOf(key)
}

app.post('/login', function(request, response) {
  var botScore = 0
  if (request.body.plugins == 0) ++botScore // Empty plugins array
  if (!request.body.username) ++botScore // Clicked on decoy inputs
  if (!request.body.password) ++botScore
  if (getObjectKeyIndex(request.headers, 'host') != 0) ++botScore // Bot type header info
  if (getObjectKeyIndex(request.headers, 'accept') == 0) ++botScore
  if (getObjectKeyIndex(request.headers, 'referer') == 1) ++botScore
  if (getObjectKeyIndex(request.headers, 'origin') == 2) ++botScore
  console.log('Bot score = ' + botScore)
  if (botScore > 4) {
    console.log('Destroy Bot')
    response.send('fail')
  }
  else {
    response.send('Logged in ' + request.body.username)
  }
})

We are now building a bot score and deciding whether to allow access based on that score.

Attacking the Bot with Javascript

The following technique needs to be used with caution.

If you want to ensure the Bot is destroyed for the good of the internet community then you can look at launching a counterattack on the bot with your own malicious script to crash the browser.

This can be achieved quite simply using an endless loop that reloads the page. This will stop execution of the browser and eventually cause it to crash. Update your client side code in login.js with:


function submitForm(username, password) {
  var loginURL = 'http://54.197.212.141' + '/login'
  var user = $('#' + username).val()
  var pass = $('#' + password).val()
  $.post( loginURL, {
    username: user,
    password: pass,
    plugins: navigator.plugins.length
  })
    .done(function( result ) {
      if (result == 'fail')
        while(true) location.reload(true) // Crash the Bot
      else {
        alert(result)
      }
    })
}


In my next post I will look at the different types and benefits of captcha and how to enable them on your site.

Be sure to subscribe to the blog so that you can get the latest updates.

For more AWS training and tutorials check out backspace.academy

New Classroom Website




We have just started updating our classroom website:

  • We now accept PayPal and use PayPal for all transactions including credit cards (no more Stripe).
  • Runs on all browsers.
  • One click login using Facebook (more options will be added soon).
Login is no longer with username and password, so please log in using the Facebook account that is linked to the email you used with BackSpace. If your Facebook account uses a different email then please send me an email at info@backspace.academy and I will update the database with your Facebook email.

Shared Responsibility 2 - Using Dynamic CSS Selectors to stop the bots.


In my last post I talked about techniques to stop malicious web automation services at the source before they reach AWS infrastructure. Now we will get our hands dirty with some code to put it into action. Don't worry if you are not an experienced coder; you should still be able to follow along.

How do Bot scripts work?

A rendered web page contains a Document Object Model (DOM). The DOM defines all the elements on the page such as forms and input fields. Bots mimic a real user that enters information in fields, clicks on buttons etc. To do this the bot needs to identify the relevant elements in the DOM. DOM elements are identified using CSS selectors. Bot scripts consist of a series of steps that detail CSS selectors and what action to perform on them.
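
To make that concrete, here is a toy sketch of what such a script boils down to. The step format, the runScript function and the stand-in "page" objects are all invented for this illustration; a real bot would drive the DOM of a headless browser instead.

```javascript
// A hypothetical bot script: each step names a CSS selector and an action.
var botScript = [
  { selector: '#username', action: 'type', value: 'someuser' },
  { selector: '#password', action: 'type', value: 'secret' },
  { selector: 'button[type=submit]', action: 'click' }
]

// Minimal stand-in for a page: an object that maps selectors to elements.
function runScript(script, page) {
  return script.map(function (step) {
    var element = page[step.selector]
    if (!element) return step.selector + ': not found'
    if (step.action === 'type') element.value = step.value
    if (step.action === 'click') element.clicked = true
    return step.selector + ': ok'
  })
}

// Fixed ids: every step finds its element
var staticPage = { '#username': {}, '#password': {}, 'button[type=submit]': {} }
console.log(runScript(botScript, staticPage))

// Ids that change on every render: the hard-coded selectors stop matching
var dynamicPage = { '#k3Xw9QpL': {}, '#Zt81mBcd': {}, 'button[type=submit]': {} }
console.log(runScript(botScript, dynamicPage))
```

The script only works while the selectors it was written against keep matching, which is the weakness the rest of this post exploits.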

The DOM structure and elements of a page can be quickly identified using a browser. Pressing F12 in your browser will launch developer tools with this information:


To see specific details of a DOM element simply right click on the element on the page and select 'inspect':


This will open up the developer tools with the element identified. You can get the CSS selector for the element easily by again right clicking on the element in the developer tools:



Note that this will be only one representation of the element as a CSS selector (generally the shortest one). There are a number of ways an element can be defined as a CSS selector including:

  • id name
  • input name
  • class names
  • DOM traversal e.g. defining its chain of parent elements in the DOM
  • Text inside the element using jQuery ':contains'.

Dynamic CSS Selectors

To make it difficult to develop bot scripts you can use dynamic CSS selectors. Instead of creating the same CSS selectors each time your page is rendered, you can change them randomly each time.

When using NodeJS and Express this is quite straightforward as you are already rendering pages on the server. Simply introduce some code to mix this up a bit.

Let's Start


First of all set up an EC2 instance with NodeJS and Express set up to render pages. If you are unsure you can view the video below:

https://vimeo.com/145017165

To save you typing, the code is available at Gist (also Blogger tends to screw up code when it is published).

Now let's change index.js to create a simple login form.

Point your browser to the public IP address of your instance to check everything is ok. e.g. xxx.xxx.xxx.xxx:8080

Now change the index.js file to include a dynamicCSS function:

app.get('/', function(request, response) {
  response.send(dynamicCSS())
})

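// The HTML strings are a reconstruction from the description in the text
// (the original markup was lost in publishing).
function dynamicCSS(){
  var x = ''
  x += '<form>'
  x += '<h1>Please Login</h1>'
  x += '<input id="username" name="username" type="text" placeholder="username">'
  x += '<input id="password" name="password" type="password" placeholder="password">'
  x += '<button type="submit">Login</button>'
  x += '</form>'
  return x
}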


Now do npm start at the command line of your EC2 instance and refresh the browser page. You will now see our very simple login form:


The problem with this form is that it is really easy to identify the DOM elements required to log in. The id, name and placeholder all refer to username or password.

Now let's change our code and introduce dynamically created CSS selectors.


var loginElements = {
 username: '',
 password: ''
 }

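// The HTML strings are a reconstruction from the description in the text
// (the original markup was lost in publishing).
function dynamicCSS(){
  var username = randomString()
  var password = randomString()
  loginElements.username = username
  loginElements.password = password
  var x = ''
  x += '<form>'
  x += '<h1>Please Login</h1>'
  x += '<input id="' + username + '" name="' + username + '" type="text" placeholder="username">'
  x += '<input id="' + password + '" name="' + password + '" type="password" placeholder="password">'
  x += '<button type="submit">Login</button>'
  x += '</form>'
  return x
}

// Returns 8 different characters picked at random
function randomString(){
  var chars = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'.split('')
  chars.sort(function() { return 0.5 - Math.random() })
  return chars.splice(0, 8).join('')
}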



This now generates a random string for the id and name attributes of the input elements, so they can no longer be used in a reliable bot script. If you do npm start again and view the element in developer tools you can see the random strings.
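
Sorting with a random comparator is quick to write but does not shuffle evenly. If that bothers you, a variation could look like the sketch below. It is my own alternative rather than the code used in this post: randomId is a hypothetical name and it assumes a NodeJS version that provides crypto.randomInt (12.19 or 14.10 and later).

```javascript
var crypto = require('crypto')

var LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
var ALPHANUMERIC = LETTERS + '0123456789'

// Hypothetical alternative to randomString(): picks each character
// independently (so characters may repeat) and always starts with a letter.
function randomId(length) {
  var id = LETTERS[crypto.randomInt(LETTERS.length)]
  while (id.length < length) {
    id += ALPHANUMERIC[crypto.randomInt(ALPHANUMERIC.length)]
  }
  return id
}

console.log(randomId(8))
```

Starting with a letter is a small convenience: an id that begins with a digit has to be escaped before it can be used in a plain CSS selector.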

We now need to look at the other ways our elements can be identified as CSS selectors. As you can see, the text "username" and "password" is still used in the placeholders and the input type attribute. Also the DOM structure itself doesn't change dynamically, making it possible to reference the element by traversing the DOM structure.

We will address both problems by creating random decoy input elements with the same parameters. The CSS position property will allow us to stack them on top of each other so that the decoy elements are not visible on the page:

app.get('/', function(request, response) {
  response.send(dynamicCSS())
})

var loginElements = {
  username: '',
  password: ''
}

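// The HTML strings are a reconstruction from the description in the text
// (the original markup was lost in publishing); the positions are illustrative.
function dynamicCSS(){
  var username, password
  var x = ''
  x += '<form>'
  x += '<h1>Please Login</h1>'
  var y = Math.floor((Math.random()*5)) + 2 // 1 real pair plus 1 to 5 decoy pairs
  for (var a=0; a<y; a++){
    username = randomString()
    password = randomString()
    // Every pair shares one position, so only the last (real) pair is visible
    x += '<input id="' + username + '" name="' + username + '" type="text" placeholder="username" style="position:absolute;top:100px">'
    x += '<input id="' + password + '" name="' + password + '" type="password" placeholder="password" style="position:absolute;top:140px">'
    loginElements.username = username
    loginElements.password = password
  }
  x += '<button type="submit" style="position:absolute;top:180px">Login</button>'
  x += '</form>'
  return x
}

function randomString(){
  var chars = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'.split('')
  chars.sort(function() { return 0.5 - Math.random() })
  return chars.splice(0, 8).join('')
}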


Now when you view the DOM in your browser developer tools, you can see the decoy input elements created underneath the real input element. If you refresh your browser you will see a different number of elements created each time (between 1 and 5 created).



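The bot creator can no longer use the username and password placeholders or input types to identify the elements. They also cannot traverse the DOM structure, as this is changing too. As pointed out by a reader of this post (thanks Vadim!), you should also put some random inputs after the real ones to handle jQuery ":last". A good place would be underneath your logo.

// The HTML strings are a reconstruction (the original markup was lost in publishing)
var y = Math.floor((Math.random()*5)) + 2
for (var a=0; a<y; a++){
  username = randomString()
  password = randomString()
  x += '<input id="' + username + '" name="' + username + '" type="text" placeholder="username" style="position:absolute;top:100px">'
  x += '<input id="' + password + '" name="' + password + '" type="password" placeholder="password" style="position:absolute;top:140px">'
  loginElements.username = username
  loginElements.password = password
}
// Trailing decoys. They are hidden with an inline style because
// document.getElementById() is not available in server-side code.
for (var b=0; b<y; b++){
  x += '<input id="' + randomString() + '" type="text" placeholder="username" style="visibility:hidden">'
  x += '<input id="' + randomString() + '" type="password" placeholder="password" style="visibility:hidden">'
}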

The next thing a bot script can do is click on an x-y position on the screen. We can handle this by randomly changing the position of the elements.

var loginElements = {
 username: '',
 password: ''
 }

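// The HTML strings are a reconstruction from the description in the text
// (the original markup was lost in publishing); the positions are illustrative.
function dynamicCSS(){
  var username, password
  var x = ''
  // Pick one of two positions for the form
  if ((Math.random()*2) > 1)
    x += '<style>form { position: absolute; top: 50px; left: 50px; }</style>'
  else
    x += '<style>form { position: absolute; top: 150px; left: 300px; }</style>'
  x += '<form>'
  x += '<h1>Please Login</h1>'
  var y = Math.floor((Math.random()*5)) + 2
  for (var a=0; a<y; a++){
    username = randomString()
    password = randomString()
    x += '<input id="' + username + '" name="' + username + '" type="text" placeholder="username" style="position:absolute;top:80px">'
    x += '<input id="' + password + '" name="' + password + '" type="password" placeholder="password" style="position:absolute;top:120px">'
    loginElements.username = username
    loginElements.password = password
  }
  x += '<button type="submit" style="position:absolute;top:160px">Login</button>'
  x += '</form>'
  return x
}

function randomString(){
  var chars = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'.split('')
  chars.sort(function() { return 0.5 - Math.random() })
  return chars.splice(0, 8).join('')
}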


The position of the input elements is now random. This currently only has two positions but you can elaborate on this to create many possible combinations of positions. You may also make your login form inside a modal window that changes position on the screen.
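
As a rough idea of how the two fixed positions could be extended, the sketch below picks a fresh offset on every render. It is only an illustration: randomPositionStyle is a hypothetical name, it assumes the login markup is wrapped in a form element, and the pixel ranges are arbitrary.

```javascript
// Hypothetical helper: returns a <style> block that places the form
// at a random offset. The pixel ranges are arbitrary.
function randomPositionStyle() {
  var top = Math.floor(Math.random() * 301)   // 0 to 300
  var left = Math.floor(Math.random() * 501)  // 0 to 500
  return '<style>form { position: absolute; top: ' + top +
    'px; left: ' + left + 'px; }</style>'
}

console.log(randomPositionStyle())
```

The returned string would be added to the page in place of the if/else that chooses between two style blocks.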

If you want to go further you can look at having two login forms, username followed by password. Or even better, randomly change between the two.

We have now addressed the possible techniques a bot creator can use to identify your input elements and login to your site.

Congratulations, you made it to the end!

What's next?

In my next post I will introduce techniques to identify bots and then look at launching a counter attack on the bot to crash it after it has been positively identified.

Be sure to subscribe to the blog so that you can get the latest updates.

For more AWS training and tutorials check out backspace.academy

Shared Responsibility - Stopping threats at the source


Over the past week Denial of Service (DoS) has been dinner table talk in Australia since the catastrophic failure of its online Census implementation. Everyone from the Prime Minister down has been quick to blame IBM and quick to accuse the Chinese government of the attack.

After the dust has settled and reality sets in, the true picture appears. No Chinese conspiracy but certainly poor planning. Expecting the entire population of Australia to log in and complete a multi page form on the same night was in reality... dumb.


One thing that surprised me in all the discussions by the experts was the focus on back-end strategy to mitigate attacks. Surely any strategy should start at the root cause, the user interface. A lot can be done at the source using client-side strategies that prevent malicious traffic ever reaching your resources. Presenting a simple login page without additional protection is surely a magnet for malicious attacks.

A magnet for attack

AWS talk a lot about "shared responsibility". This is a fundamental tenet of good application design. We need to take responsibility for what is within our control and out of the control of AWS. Allowing threats to reach AWS infrastructure is putting all of the responsibility on AWS and taking no responsibility yourself.

When designing a critical application a number of questions can be asked and the solutions designed into the application:

  1. Are you a real browser?
  2. Are you a real person?
  3. Can I identify you?
  4. Do you have permission to be here?

Headless Browsers

In general all threats will come from an automated service, not from a person at a desktop browser. Your first line of defence should be to detect headless browsers. Headless browsers run on a server without any GUI and operate using a simulated Document Object Model (DOM). There are a number of differences between headless browsers and normal browsers that can be used for detection and mitigation:

  • Plugins: The available plugins for headless browsers will be minimal and the plugins array will probably be empty. Testing for common plugin availability and plugin array length can identify threats instantly.
  • HTML5/CSS3: HTML5/CSS3 support is not a priority for a headless browser as it does not have a GUI. Testing for these features can help identify headless browsers.
  • Iframes: Embedding forms within iframes can make it more difficult for bots to identify DOM elements.
  • Ajax: Presenting a simple login form with input type password makes life easy for bots. Consider using Ajax to reveal the login process with animation step by step. Also consider varying the time randomly between steps.
  • CSS id and classes: Dynamically creating forms allows the opportunity to vary the class names and id of the elements on the form. It also allows the position in the DOM to be varied. This will make it difficult for reliable bot scripts to be created as they are based upon CSS selectors.
  • Web Security: Headless browsers do not have the same respect for security as conventional browsers due to the limited value of information contained in browser storage. Also web security features are mostly disabled by bot owners. This creates an opportunity to run client-side scripts on the headless browser that can consume its server resources. Extreme care must be taken with this approach due to the possibility of accidentally targeting innocent visitors.
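
The plugin and HTML5 checks from the list above could be combined along the lines of the sketch below. It is a loose illustration under my own assumptions: collectSignals and suspicionScore are hypothetical names, the choice of canvas and WebSocket as the HTML5 features to test is arbitrary, and the window object is passed in as a parameter so the functions can also be tried outside a browser.

```javascript
// Hypothetical client-side check. `win` is the browser's window object.
function collectSignals(win) {
  var nav = win.navigator || {}
  return {
    pluginCount: nav.plugins ? nav.plugins.length : 0,
    hasWebSocket: typeof win.WebSocket === 'function',
    hasCanvas: !!(win.document &&
      typeof win.document.createElement === 'function' &&
      typeof win.document.createElement('canvas').getContext === 'function')
  }
}

// Counts how many signals look unusual for a desktop browser
function suspicionScore(signals) {
  var score = 0
  if (signals.pluginCount === 0) score++
  if (!signals.hasWebSocket) score++
  if (!signals.hasCanvas) score++
  return score
}

// Stand-in objects for trying the functions outside a browser
var fakeBrowser = {
  navigator: { plugins: ['a', 'b', 'c'] },
  WebSocket: function () {},
  document: { createElement: function () { return { getContext: function () {} } } }
}
var fakeHeadless = {}

console.log(suspicionScore(collectSignals(fakeBrowser)))  // 0
console.log(suspicionScore(collectSignals(fakeHeadless))) // 3
```

In a page you would call collectSignals(window) and post the result to the server. No single signal is conclusive, so a count like this is better used as one input to a wider score than as a verdict.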

Captcha:


Captcha is the technology users hate but unfortunately we need. Image type Captcha are the most common and generally effective at preventing brute force attacks.

Image Captcha
Despite this, they can be easily overcome through the use of a Captcha solving service. The bot will save an image of the Captcha and forward it to the service for solving. These services use OCR where possible and actual human entry to solve the Captcha. The reason why this is still reasonably effective is due to the expense involved in using a Captcha service. Bots will generally move on to more cost effective targets.



The latest one-click technology by ReCaptcha is by far the best. Unlike image Captcha, this requires actual real clicks before it is accepted. This technology is extremely effective at distinguishing between real clicks and bot clicks. It is also not possible to transfer the Captcha to a solving service as it is not an image and will be disabled when the Captcha url is run on a different machine. This is certainly a powerful weapon to have in your arsenal.

Federated Identity

If your traffic has come this far it is most likely not a bot and has probably got a Facebook account. Federated users identified through Facebook, Google, Amazon etc are issued with temporary credentials and can be used in conjunction with Amazon Cognito to ensure a high degree of authentication has occurred before AWS resources are accessed. More info:

Using AWS Cognito with Node.JS

Using Cognito with PhoneGap/Cordova

AWS IAM

Not much needs to be said about AWS Identity and Access Management. AWS supply a fantastic service and it is up to us to make full use of it by using least privilege roles for our federated users and ensuring credentials are temporary and safe.


So, next time you are thinking about security, don't forget about security at the source.

In the next post I will discuss using dynamic CSS selectors to stop the bots.



Welcome aboard India!


AWS Announces New Asia Pacific (Mumbai) Region





At last India has its own region with two availability zones. Much overdue but sure to be a popular decision. The following services are available in the new region:

    AWS Certificate Manager (ACM)
    AWS CloudFormation
    Amazon CloudFront
    AWS CloudTrail
    Amazon CloudWatch
    AWS CodeDeploy
    AWS Config
    AWS Direct Connect
    Amazon DynamoDB
    AWS Elastic Beanstalk
    Amazon ElastiCache
    Amazon Elasticsearch Service
    Amazon EMR
    Amazon Glacier
    AWS Identity and Access Management (IAM)
    AWS Import/Export Snowball
    AWS Key Management Service (KMS)
    Amazon Kinesis
    AWS Marketplace
    AWS OpsWorks
    Amazon Redshift
    Amazon Relational Database Service (RDS) – all database engines including Amazon Aurora
    Amazon Route 53
    Amazon Simple Notification Service (SNS)
    Amazon Simple Queue Service (SQS)
    Amazon Simple Storage Service (S3)
    Amazon Simple Workflow Service (SWF)
    AWS Support
    AWS Trusted Advisor
    VM Import/Export

The available  services will no doubt be expanded so be sure to check for more details at:

New Course AWS Certified SysOps Administrator!



The much awaited AWS Certified SysOps Administrator Course has been released. It is available with the AWS Certified Associate course. All existing members will have access!

BackSpace Academy