Ben Godfrey

Quantified Self

In January, Pew Research found that 21% of US adults track some aspect of their health using technology. This is part of a phenomenon called “Quantified Self”: the measurement of personal behaviours and activities, often using technology such as smartphones and web apps. I’m a self-quantifier and have been for a while. Over the years I’ve tracked a number of dimensions.

Driving Improvements

I use this data to drive improvements in my life.

For example, if I’m training for a race, I check that I’m building up my distance at the right rate in preparation. To understand the data, I visualise it using histograms and aggregate statistics. My all-time distance on Runkeeper right now is 1,188 KM. I’ve only run a few KMs this month, but last month I ran almost 50 KM (I had a big race).

My self-quantification breaks down into two categories: “performance” and “habits.” The performance stuff (running and cycling) is a bit vain, but I feel that collecting and visualising performance data has helped me to achieve some big goals, like running a marathon.

Better Habits

Measuring some of my habits helps me to make long-term behavioural changes.

Meat-eating is a good example. I love meat, but I like to keep the total amount I consume down. I track whether I ate meat or not each day by ticking a box in chains.cc. Later I can look back and see easily whether I’m sticking to my goal or if I need to work harder. Having aggregate goals like “eat meat once per week” allows me to indulge when the meat on the table is really good (like when I went to Aux Lyonnais) but means that I get the health benefits of a mostly plant-based diet.

Self-Experiments

Some self-quantifiers conduct self-experiments using the data they collect, e.g. “does taking drug A make my condition better or worse?” I’m not quite there yet, but it’s an interesting direction. I’m not sure I have the discipline to collect sufficiently high-quality data.

For more about the Quantified Self movement I recommend checking out the Quantified Self blog.

Serve HTTPS From Elastic Beanstalk Application Instances

Elastic Beanstalk is Amazon’s platform-as-a-service built on top of EC2, S3 and other Amazon services.

An Elastic Beanstalk application consists of one or more EC2 instances running your application and a set of supporting resources, including an Elastic Load Balancer. By default, the load balancer listens on port 80 and forwards traffic to port 80 on your app servers. You can configure the load balancer to listen on 443 easily, but traffic from the LB to the app servers is not encrypted. To encrypt the traffic on this hop, you must configure your app servers to listen for HTTPS requests.

Options For Configuring App Servers To Serve HTTPS

This can be done in one of two ways (the examples are specific to Tomcat, but the methods should be applicable to any app server).

  1. Use Elastic Beanstalk configuration files to install packages and to put Apache configuration, certificate and key files in the relevant locations.
  2. Create your own app server AMI with HTTPS enabled. For Tomcat, “443 forwarding to 80 (HTTPS)… How can I set up REAL HTTPS on Beanstalk” outlines the required steps.

Method 1 seems preferable as it avoids the need to keep AMIs up-to-date, e.g. when security patches are released.

Using Elastic Beanstalk Configuration Files

  1. Create an .ebextensions dir at the root of your app dir or WAR file
  2. Copy in any files you want to create
  3. Create an Elastic Beanstalk configuration file describing the required instance configuration changes
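
As a rough sketch of the three steps above, the following shell commands lay out an .ebextensions directory that matches the cp commands used in the example configuration file in this post. The certificate and key file names (server.crt, server.key) and the configuration file name (https.config) are placeholders chosen for illustration; the only naming rule assumed here is that configuration files end in .config. The files are created empty in a temporary directory purely to show the structure.

```shell
# Sketch only: build the directory layout in a throwaway location.
app_dir="$(mktemp -d)"

# 1. Create the .ebextensions dir at the root of the app (or WAR contents).
mkdir -p "$app_dir/.ebextensions/ssl"

# 2. Copy in the files the instances need. Empty placeholders stand in for
#    the real certificate, key and Apache SSL config here.
: > "$app_dir/.ebextensions/ssl/server.crt"   # hypothetical file name
: > "$app_dir/.ebextensions/ssl/server.key"   # hypothetical file name
: > "$app_dir/.ebextensions/ssl.conf"

# 3. Add a configuration file describing the instance changes.
#    The name https.config is hypothetical; the .config suffix is assumed.
cat > "$app_dir/.ebextensions/https.config" <<'EOF'
packages:
  yum:
    mod_ssl: ""
EOF

# List what would be bundled with the application.
layout="$(cd "$app_dir" && find .ebextensions -type f | sort)"
printf '%s\n' "$layout"
```

With this layout, .ebextensions/ssl and .ebextensions/ssl.conf are the paths the container commands copy into /etc/httpd.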

Example Configuration File

# vim: ft=yaml
packages:
  yum:
    mod_ssl: "" # empty string means latest
container_commands:
  10-ssl-key-cert-install:
    command: "cp -r .ebextensions/ssl /etc/httpd/"
  20-apache-ssl-config:
    command: "cp .ebextensions/ssl.conf /etc/httpd/conf.d/ssl.conf"

Custom commands are run before services are started. There’s no need to restart services if you’re changing their configuration (e.g. Apache in this case).

Note for Scala developers: SBT ignores hidden directories when building projects (and I can’t work out how to override that). You can use jar uf /path/to/file.war .ebextensions to insert the .ebextensions dir into the WAR file after packaging.

Login With SSH To Verify Changes

SSH to one of your Elastic Beanstalk instances to ensure that the changes have been applied correctly.

Initialisation Logging

The Elastic Beanstalk startup process writes any errors raised by custom container commands to /var/log/cfn-init.log. If your instance doesn’t start properly, for example services don’t start, look there.

Backend Authentication

Backend authentication is a feature of Elastic Load Balancer. It uses the public key of a certificate to verify that the backend app server is encrypting traffic with a valid certificate. This needs to be enabled for LB to app server HTTPS to work (otherwise you’ll get timeouts when making requests to the LB on port 443).
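
Backend authentication needs the public key of the certificate installed on the app servers. As a hedged sketch, the openssl commands below generate a throwaway self-signed certificate (standing in for your real one; the file names and the CN are made up) and print its public key with the BEGIN/END marker lines removed and the remainder joined into one line. That this single-line form is what the load balancer policy expects is an assumption on my part, so check the Elastic Load Balancing documentation before relying on it.

```shell
# Sketch only: work in a throwaway directory with a self-signed test cert.
work_dir="$(mktemp -d)"

# Generate a short-lived self-signed certificate to stand in for the real one.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=example.test" \
  -keyout "$work_dir/server.key" -out "$work_dir/server.crt" 2>/dev/null

# Print the certificate's public key in PEM form, then drop the BEGIN/END
# marker lines and join what is left into a single line.
public_key="$(openssl x509 -in "$work_dir/server.crt" -pubkey -noout \
  | grep -v -e '-----' | tr -d '\n')"

echo "$public_key"
```

For a real deployment you would run only the second openssl command, pointing -in at the certificate your app servers present.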

Security Group And Load Balancer Config File

A new load balancer and security group are created each time you deploy your application. Their configuration reverts to the default, listening on port 80 only, whenever an environment is started. To avoid this, create a second Elastic Beanstalk configuration file describing the required resource properties. For example:

# vim: ft=yaml
# Elastic Load Balancer and Security Group configuration for the app
#
# - Allow anyone to connect to port 443 and office traffic to connect to
#   port 22
# - Ensure all traffic is encrypted by configuring load balancer to listen on
#   443 and direct traffic to port 443 on app servers
# - Enable cookie-based session stickiness
# - Use /status for health check
# - Enable backend authentication policy by providing public key for cert

Resources:
  AWSEBSecurityGroup:
    Type: "AWS::EC2::SecurityGroup"
    Properties:
      GroupDescription: "Security group to allow HTTPS for all, SSH for office"
      SecurityGroupIngress:
        - {CidrIp: "0.0.0.0/0", IpProtocol: "tcp", FromPort: "443", ToPort: "443"}
        - {CidrIp: "176.35.225.76/32", IpProtocol: "tcp", FromPort: "22", ToPort: "22"}
  AWSEBLoadBalancer:
    Type: "AWS::ElasticLoadBalancing::LoadBalancer"
    Properties:
      Listeners:
        - {LoadBalancerPort: 443, InstancePort: 443, Protocol: "HTTPS", SSLCertificateId: "arn:aws:iam::1234567890:server-certificate/server"}
      AppCookieStickinessPolicy:
        - {PolicyName: "lb-session", CookieName: "lb-session"}
      HealthCheck:
        HealthyThreshold: "3"
        Interval: "30"
        Target: "HTTPS:443/status"
        Timeout: "5"
        UnhealthyThreshold: "5"
      Policies:
        -
          PolicyName: "MyPubKey"
          PolicyType: "PublicKeyPolicyType"
          Attributes:
            -
              Name: "PublicKey"
              Value: "..."
        -
          PolicyName: "BackendAuth"
          PolicyType: "BackendServerAuthenticationPolicyType"
          Attributes:
            -
              Name: "PublicKeyPolicyName"
              Value: "MyPubKey"
          InstancePorts:
            - "443"

You’ll need to change this to point to your certificate in IAM, to restrict SSH access to the right IP range and to add your public key.

Customizing Environment Resources describes how to write configuration files for the other AWS resources in an Elastic Beanstalk environment.

How to write policies to manage backend authentication is described in the examples in ElasticLoadBalancing Policy Type.

Mock To Mobile In Minutes With Codiqa, Trigger.io And TestFlight

I’m a fan of HTML for building mobile apps. It’s incredibly quick and easy to build simple apps. I already know HTML, CSS and JavaScript. Lots of other developers do too, so it’s easier to find people to work with than for native platforms.

When I recently started working on a mobile project, I used Codiqa to build a mockup of the app interface and generate HTML, then Trigger.io and TestFlight to turn that mock into an app running on my iPhone.

Get Creative With Your Mock

Codiqa makes it easy to create app mocks.

  1. Sign up for a Codiqa basic account so you can build more than 3 pages and export your mock as HTML
  2. Build an awesome mock by dragging and dropping
  3. Optional: customise your app by creating a theme with jQuery Mobile Themeroller and uploading it to Codiqa
  4. Download your mock as HTML and unzip

From Mock To Mobile In Minutes

Next, I used Trigger.io to create an app from the mock HTML and TestFlight to release it to my phone.

  1. Install Trigger.io’s tools
  2. Create a directory for your new app
  3. Run forge create (See Trigger.io’s OS-specific instructions for more detail)
  4. Replace the contents of the src dir of your new app with the HTML downloaded from Codiqa
  5. Rename app.html to index.html so that Trigger knows it’s your start page
  6. Change your app’s config to include your package name and point to your provisioning profile (there’s an example local_config.json here)
  7. Run forge build ios to compile the app
  8. Run forge package ios to generate an IPA package file for iPhones

Distribute Your Mock App

Once you’ve followed these steps, you can upload your build to TestFlight and push it out to your testers.

To deploy apps to iPhones, you will need an Apple developer account and a provisioning profile. TestFlight have a good resource on preparing a build for distribution through TestFlight. Also, Handshake have a good overview of Testing iOS Apps with TestFlight.

Apple’s document About iOS Development Team Administration provides a wealth of info on the iOS development workflow.

If you had an API for your email, what would you do with it?

If you could analyse all your email, or all your organisation’s email, in any way, what would you do with it? Leave a comment with your answers.

Email Analytics For You

  • Build a stats dashboard with total number of emails sent, sent per day, top recipients, etc, and start a competition to see who sends the most (or least!)
  • Calculate what time you’re most likely to get a response from a particular person
  • Extract all the links you’ve sent and make a Twitter feed
  • Extract all the links you’ve sent to or received from a particular person and show them in a sidebar when composing (a bit like Rapportive)
  • Build a cool visualisation of your email social network, like InMaps

Email Analytics For Your Organisation

  • Trending topics
  • Integrate with a CRM, e.g. automatically populate contacts, history
  • Pull attachments and share them on Dropbox/GDocs/file server/etc (similar to attachments.me)
  • Extract stealth documentation (questions and answers) and add it to a wiki
  • Archive conversations applying custom rules (filter out mailing lists, large attachments)
  • Detect specific content, e.g. porn, law-breaking, bullying

Email Analytics For The World

By analysing all the email in the world (or a sufficiently large volume thereof), you could apply the kind of social analytics that Twitter, Facebook, Bitly, PeerIndex, etc, provide.

  • Trending topics
  • Influence (“Ben knows about startups”)
  • Contact segmentation

The list goes on. What would you do?

Lean Startup Resources

My own list of resources for people applying the Lean Startup Methodology.

Aphorisms

Print them on posters.

  • Life’s too short to build something nobody wants – Eric Ries
  • Remove any feature, process or effort that does not contribute directly to the learning you seek – Eric Ries
  • Innovation accounting: Baseline with an MVP; Tune; Pivot or Persevere – Eric Ries
  • A good design is one that changes customer behaviour for the better – Eric Ries
  • Real learning comes from facts and commitment, not opinions and promises – Eric Ries
  • 3 A’s of metrics: actionable, accessible, auditable – Eric Ries
  • If you launch it and see what happens, you’ll succeed – at seeing what happens – Eric Ries
  • Launch when you can deliver a quantum of value – Eric Ries
  • Given the right context, customers can clearly articulate their problems, but it’s your job to come up with the solution – Ash Maurya
  • It’s not the customer’s job to know what they want – Steve Jobs
  • Get out of the building – Steve Blank
  • Startups that succeed are those that manage to iterate enough times before running out of resources – Eric Ries
  • Practice trumps theory – Ash Maurya
  • Customers don’t care about your solution, they care about their problems – Dave McClure
  • While ideas are cheap, acting on them is quite expensive – Ash Maurya
  • Bind a solution to your problem as late as possible – Ash Maurya

Types Of MVP

In ascending order of complexity.

  • Observation
  • Re-enactment
  • Interview
  • Pitch a fake product (email, call, face-to-face)
  • Video like Dropbox
  • Mail campaign/PPC/landing page with AB/multivariate testing
  • Concierge — talk to the customer, simulate the service
  • Wizard of Oz — perform back-office tasks manually
  • Partly-implemented application, e.g. application form that declines everyone

Books

Essentials

Don’t skip these!

Very, Very Useful

Tools

  • Business Modelling: Lean Canvas, Validation Board
  • Source Control: GitHub
  • Hosting: Heroku, S3, GitHub
  • CI: Circle CI, CloudBees Hosted Jenkins
  • Analytics: KISSmetrics, Mixpanel, Google Analytics, StatsMix
  • Split Testing: KISSmetrics, Google Website Optimizer, Visual Website Optimizer
  • CRM: Intercom, Salesforce
  • Email Marketing: MailChimp
  • Surveys: SurveyGizmo, AskYourTargetMarket, SurveyMonkey
  • Offers: Hackers And Founders Rewardli

Resources

Lean Startup Methodology

Lean 101 from Lean Startup Machine

2010 01 27 The Lean Startup Twiistup from Eric Ries

Lean Startup Essentials — STARTup Live Hagenberg from Lukas Fittl

MVP/Experiments talk at SVA IxD program from Giff Constable

The Lean Startup Circle Wiki

Metrics

Startup Metrics for Pirates (SF, Jan 2010) from Dave McClure

Understanding Opportunities/Ideation

Customer Development

Small Data: Processing CSV In Unix

Unix provides a wealth of great tools for performing simple analysis of small datasets. awk in particular is useful for crunching CSV files.

Why use Unix tools and not Excel? The commands you create can be saved as a script and applied to updated files or to multiple files at once, or combined with curl to process a file from the web.

Here are a few examples using a fictional data file, emails.csv, which contains data about a number of email campaigns sent to customers.

Get Started

First, let’s inspect the file format by printing the first few rows.

$ head -5 emails.csv
ReportRunOn,MemberID,StartDate,EndDate
10/10/2012 13:13:10,1038277,10/09/2012 00:00:00,10/10/2012 23:59:59

UserName,UserID,JobID,EmailName,EmailSubject,PickupTime,BCC,CcEmail,NormalSends,NormalSendsBCC,NormalSendsCC,TriggeredSends,TriggeredSendsBCC,TriggeredSendsCC,NormalSends_1,NormalSendsBCC_1,NormalSendsCC_1,TriggeredSends_1,TriggeredSendsBCC_1,TriggeredSendsCC_1
Ben Godfrey,ben.godfrey,1839429,Example Email,Give us your feedback!,05/10/2012 15:22:55,,,1,0,0,0,0,0,470,0,0,0,0,0

This file has a metadata section at the top. Let’s chop that off.

$ tail -n +4 emails.csv > emails-no-meta.csv

Introducing awk

The file has a lot of columns. We can use awk to extract only the columns we’re interested in.

As an intermediate result, let’s just print out the first column.

$ awk -F, '{ print $1 }' emails-no-meta.csv | head
UserName
Ben Godfrey
Ben Godfrey
Ben Godfrey
Ben Godfrey
Ben Godfrey
Other User
Other User
Other User
Other User

awk’s -F argument sets the field separator. We’re using the comma character. Use -F\\t for tabs. The default separator is whitespace, so any run of spaces or tabs separates fields.
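
Here is a small sketch of why the separator matters, using a made-up one-line, tab-separated sample rather than emails.csv. With the default separator, a name containing a space is split across two fields.

```shell
# Made-up two-column sample, tab-separated.
tsv_line="$(printf 'Ben Godfrey\t470')"

# With a tab separator the name stays in one field.
with_tab="$(printf '%s\n' "$tsv_line" | awk -F'\t' '{ print $1 }')"

# With the default separator the name is split at the space.
with_default="$(printf '%s\n' "$tsv_line" | awk '{ print $1 }')"

echo "$with_tab"       # Ben Godfrey
echo "$with_default"   # Ben
```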

We provide a program to awk on the command line. An awk program is applied to each line of input in turn; optional BEGIN and END blocks run once before the first line and once after the last, for pre- and post-processing.

This simple program contains a block which is applied to all lines and prints $1, which is awk’s syntax for column 1.

We can go a step further and add a pattern match to our program so we print the first column of lines matching “Ben Godfrey.”

$ awk -F, '/Ben Godfrey/ { print $1 }' emails-no-meta.csv
Ben Godfrey
Ben Godfrey
Ben Godfrey
Ben Godfrey
Ben Godfrey

This is functionally equivalent to using grep and piping the result to awk.

$ grep "Ben Godfrey" emails-no-meta.csv | awk -F, '{ print $1 }'
Ben Godfrey
Ben Godfrey
Ben Godfrey
Ben Godfrey
Ben Godfrey

The awk-only solution involves less overhead (a single pass through the file using a single program), but grep is such a powerful tool, and familiar to so many, that it is often used for the filtering part of a pipeline. Combining simple tools to create powerful results is a key part of the Unix philosophy.

More Columns

We can also use awk to construct a new CSV file with a subset of columns.

$ awk -F, '/Ben Godfrey/ { print $1","$4","$6","$9 }' emails-no-meta.csv
Ben Godfrey,Example Email,05/10/2012 15:22:55,1
Ben Godfrey,Example Email,05/10/2012 17:06:59,1
Ben Godfrey,Example Email,08/10/2012 10:36:45,2
Ben Godfrey,Example Email,08/10/2012 11:32:56,469
Ben Godfrey,Example Email,10/10/2012 12:02:53,4232

This is a powerful technique, because we can easily rerun that command on an updated file. This beats performing the same manual steps in Excel for every new file. We can also run the awk program against many files (potentially sourced from different locations with find or curl).
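
To make the many-files case concrete, here is a minimal sketch. The two file names and their contents are made up for illustration and keep only the first nine columns of the layout shown earlier. FILENAME is a standard awk variable holding the name of the file currently being read.

```shell
# Sketch only: two tiny made-up files in a throwaway directory.
data_dir="$(mktemp -d)"

cat > "$data_dir/october.csv" <<'EOF'
UserName,UserID,JobID,EmailName,EmailSubject,PickupTime,BCC,CcEmail,NormalSends
Ben Godfrey,ben.godfrey,1,Example Email,Hello,05/10/2012 15:22:55,,,10
Other User,other.user,2,Example Email,Hello,06/10/2012 09:00:00,,,5
EOF

cat > "$data_dir/november.csv" <<'EOF'
UserName,UserID,JobID,EmailName,EmailSubject,PickupTime,BCC,CcEmail,NormalSends
Ben Godfrey,ben.godfrey,3,Example Email,Hello,02/11/2012 11:00:00,,,20
EOF

# Run one awk program over every CSV file, tagging each row with its source.
per_file="$(cd "$data_dir" && awk -F, '/Ben Godfrey/ { print FILENAME","$1","$9 }' *.csv)"

printf '%s\n' "$per_file"
# november.csv,Ben Godfrey,20
# october.csv,Ben Godfrey,10
```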

As a final trick, let’s total up how many emails I’ve sent.

$ awk -F, '/Ben Godfrey/ { total = total + $9 } END { print total }' emails-no-meta.csv
4705

We add the values from column 9 to the total variable as we proceed through the file (as long as the line matches the expression /Ben Godfrey/). Finally, we use an END block to print the value of total.
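
The same idea extends to a total per user. This is a sketch on made-up data (nine columns only, so column 9 is still NormalSends), using an awk array keyed on the user name. A BEGIN block sets the field separators before any input is read, and the END block prints the totals after the last line. awk does not guarantee the order in which array keys are visited, so the output is piped through sort.

```shell
# Sketch only: a tiny made-up file in the same column layout.
data_file="$(mktemp)"

cat > "$data_file" <<'EOF'
UserName,UserID,JobID,EmailName,EmailSubject,PickupTime,BCC,CcEmail,NormalSends
Ben Godfrey,ben.godfrey,1,Example Email,Hello,05/10/2012 15:22:55,,,1
Ben Godfrey,ben.godfrey,2,Example Email,Hello,05/10/2012 17:06:59,,,2
Other User,other.user,3,Example Email,Hello,06/10/2012 09:00:00,,,7
EOF

totals="$(awk '
  # Runs once, before any input is read: split and join fields on commas.
  BEGIN { FS = ","; OFS = "," }
  # Skip the header row, then add column 9 to the running total for this user.
  NR > 1 { total[$1] += $9 }
  # Runs once, after the last line: print one row per user.
  END { for (user in total) print user, total[user] }
' "$data_file" | sort)"

printf '%s\n' "$totals"
# Ben Godfrey,3
# Other User,7
```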

Vim

Most Unix tools, including awk, can operate on either a file or stdin and stdout. Within Vim, you can filter the contents using an external program.

If we open emails.csv in Vim, we can issue the following command to reduce the list of columns in the current buffer without modifying the original file.

:%! awk -F, '{ print $1","$4","$6","$9 }'

%! is a Vim command to filter the whole file through an external command. You can also filter a specified range. We could filter every line except the header.

:2,$! awk -F, '{ print $1","$4","$6","$9 }'

Also, you can select some lines in visual mode then filter those.

:'<,'>! awk -F, '{ print $1","$4","$6","$9 }'

Personal Finance Dashboards For UK Customers

The current crop of Personal Finance Management (PFM) tools collect data from your bank accounts and cards and try to make sense of it all. Generally they do a bad job of this. Getting the data in is time-consuming and error-prone. The analyses are unhelpful. You get pretty graphs that tell you nothing.

The serious options for UK customers right now are LoveMoney and Money Dashboard.

Both tools are very similar. They pull data from your bank account using Yodlee’s API. They display transactions and analysis of your spending.

Transactions

Transactions are categorised. This is done automatically where possible. That identification process varies in its accuracy. Correct categorisation is vital for the analysis to be accurate. LoveMoney showed my monthly spending on health insurance as unexpectedly high because it had categorised my mortgage payment as a health insurance payment.

Money Dashboard has a huge list of categories. LoveMoney has a small, more manageable list. The categories they present are too specific, e.g. gym instead of fitness. Money Dashboard’s list satisfies my desire to be accurate, but it took me a bit longer to find my way around.

None of the dashboards I’ve tried can understand money moving between accounts. They all show both sides of the transaction separately.

Confusing interfaces, poor performance

LoveMoney’s interface is generally friendlier. It’s easier to find your way around and understand the data. A tiny example: transactions are marked with the icon of the account they came from. It’s easier to understand than Money Dashboard’s more abstract colour coding.

Both tools are really slow. This is possibly because they’re pulling data from Yodlee’s API for every request. The slowness increases the friction of adding accounts and categorising events.

Yodlee’s system for capturing credentials doesn’t provide enough validation feedback. For example, I had to make a few guesses at what format the sort code should be in for one account (it’s 2012, validation should be solved).

Analysis

LoveMoney breaks down spending into categories and shows both this month’s spending and the monthly average for each category. This is quite a helpful tool.

Money Dashboard displays categories as a pie chart, money in and out and balance trends. I like the latter because it gives me an indication of how my savings are going.

Budgeting

Money Dashboard allows you to set goals, e.g. saving for a rainy day. The data driving these need to be updated manually. They also have a budgeting tool which tries to predict monthly spending cash. My numbers were incomprehensible.

LoveMoney suffers from the same problem. Their analysis of the month so far was that I’m in a huge deficit, when all I had done was make a large transfer to my savings account. I had added the savings account, so this was disappointing.

My User Stories

I’m frustrated with PFM tools. I like the idea, but the implementations don’t answer my financial questions.

How am I doing on my retirement saving?

Instead of tagging every transaction, I could nominate accounts or a subset of transactions as saving for retirement. These would be easy to spot heuristically.

Can I afford to buy this expensive item this month?

Answer: No. Better answer: Do I have enough of my genuinely disposable income left? Understanding which transactions are regular outgoings and which are genuinely disposable is key to this. Whether I spent the money on books or stuff for my bike is less important to me.

Play 2.0 And Scalate Step-By-Step

Jan Helwich has done a great job of describing how to use Scalate with Play 2.0. He’s even provided an example Play 2.0 project which uses Scalate. This post contains step-by-step instructions for a new project based on his work.

For convenience, I’ve also wrapped these changes up as a patch that can be applied to a new Play 2.0 project. Apply it with git am --signoff play20-with-scalate.patch.

  1. Create a new Play project.

    play new myproject
    # choose "1 - Create a simple Scala application"
    
  2. Add Scalate to your project dependencies in project/Build.scala

    val appDependencies = Seq(
      // Add your project dependencies here,
      "org.fusesource.scalate" % "scalate-core" % "1.5.3"
    )
    
  3. Place a copy of Jan Helwich’s ScalateIntegration.scala in app/lib

    mkdir -p app/lib   # curl -o does not create missing directories
    curl -o app/lib/ScalateIntegration.scala https://raw.github.com/janhelwich/Play-2-with-Scala-and-Scalate/master/app/controllers/ScalateIntegration.scala
    
  4. Set the default Scalate template type in conf/application.conf.

    # Default Scalate template format (mustache, scaml, jade, ssp)
    scalate.format=jade
    
  5. Create a layout and a template.

    # app/views/layouts/default.jade
    -@ var body: String
    -@ var title: String = "Page"
    
    !!!5
    html
      head
        title= title
      body
        != body
    
    # app/views/index.jade
    -@ var title: String = "Page"
    
    h1= title
    
  6. Use your templates in a controller.

    package controllers
    
    import play.api._
    import play.api.mvc._
    
    object Application extends Controller {
      def index = Action {
        Ok(Scalate("index.jade").render('title -> "Hello world!"))
      }
    }
    

Done! Try out your app with play run.

Authentication workflow for gitlab, gitolite and AD

gitlab and gitolite can be integrated with Active Directory (or another LDAP server), but how it works is a bit roundabout.

  • User logs in to gitlab web interface
  • gitlab checks user’s credentials against Active Directory (via omniauth plugin) and allows login
  • User uploads SSH key via gitlab web interface
  • gitlab writes key to gitolite keys dir?
  • User attempts to access repo via SSH (e.g. git clone git@host:repo.git)
  • SSH key is sent
  • gitolite checks keys dir and finds key
  • gitolite checks repository permissions and decides to allow the operation
  • repo is cloned

Simples!

The Specious Notion That Everybody Has To Earn A Living

We must do away with the absolutely specious notion that everybody has to earn a living. It is a fact today that one in ten thousand of us can make a technological breakthrough capable of supporting all the rest. The youth of today are absolutely right in recognizing this nonsense of earning a living. We keep inventing jobs because of this false idea that everybody has to be employed at some kind of drudgery because, according to Malthusian-Darwinian theory, he must justify his right to exist. So we have inspectors of inspectors and people making instruments for inspectors to inspect inspectors. The true business of people should be to go back to school and think about whatever it was they were thinking about before somebody came along and told them they had to earn a living.

— Buckminster Fuller