Why I love America

Wavinghi
Photo by Nulla

I recently ran across part of an essay by Henry Fairlie, an English-born journalist in America. On previous Independence Days I've tried to articulate why I love America, but he says it so much better than I ever could:

I had been in the country about eight years, and was living in Houston, when a Texas friend asked me one evening: "Why do you like living in America? I don't mean why you find it interesting--why you want to write about it--but why you like living here so much." After only a moment's reflection, I replied, "It's the first time I've felt free." One spring day, shortly after my arrival in America, I was walking down the long, broad street of a suburb, with its sweeping front lawns (all that space), its tall trees (all that sky), and its clumps of azaleas (all that color). The only other person on the street was a small boy on a tricycle. As I passed him, he said, "Hi!"--just like that. No four-year-old boy had ever addressed me without an introduction before. Yet here was this one, with his cheerful "Hi!" Recovering from the  culture shock, I tried to look down stonily at his flaxen head, but instead, involuntarily, I found myself saying in return: "Well--hi!" He pedaled off, apparently satisfied. He had begun my Americanization.

"Hi!" As I often say--for Americans do not realize it--the word is a democracy. (I come from a country where one can tell someone's class by how they say "Hallo!" or "Hello!" or "Hullo," or whether they say it at all.) But anyone can say "Hi!" Anyone does. Shortly after my encounter with the boy, I called on the then Suffragan Bishop of Washington. Did he greet me as the Archbishop of Canterbury would have done? No. He said, "Hi, Henry!" I put it down to an aberration, an excess of Episcopalian latitudinarianism. But what about my first meeting with Lyndon B. Johnson, the President of the United States, the Emperor of the Free World, before whom, like a Burgher of Calais, a halter round  my neck, I would have sunk to my knees, pleading for a loan for my country? He held out the largest hand in Christendom, and said, "Hi, Henry!"

Happy 4th of July!

Speed up your PHP with XHProf

Rocketlaunch2

I'll admit it, I'm doing performance-intensive code in PHP. I started my career writing demos in hand-coded assembler, but the need for development speed has pushed me towards using a scripting language. Part of making fast progress is reducing dependencies, which has meant sticking with PHP for the whole system, even for back-end analysis where another language might be more expected.

That's left me desperate to get more information on where the time is going when things slow down. My first weapon of choice is a simple pair of timing functions that I wrap around code to tell how long a whole block is taking. That's quick and easy (and I've included my code below), but it doesn't give you any information about which parts of the block are taking all the time.

For that you need a profiler, and if you do a search for PHP profiling, almost every result talks about XDebug. Unfortunately, as a profiler, it's a great debugger. You need to edit php.ini and restart the server, or pass in a URL input. You can only profile an entire script rather than just portions, and once you have generated the output file you need to transfer it to one of several clunky desktop applications to explore the results.

After abandoning XDebug as too unwieldy, I spent some time searching for other solutions. I finally came across XHProf, and I'm loving it. It was developed as an internal tool for Facebook and open-sourced in March, and you can tell it's been written by people who actually use it. It took a little bit of fiddling to install: I couldn't get it going through PECL and ended up downloading the source and compiling it manually. It was a dream to use after that. It let me trigger the profiling programmatically around the code I cared about, and then browse the results through a web interface.
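As a rough sketch of what triggering the profiler around one block can look like: xhprof_enable() and xhprof_disable() are the extension's own calls, but the profile_block() wrapper and the import_messages() workload are hypothetical names made up for illustration. The wrapper quietly skips profiling when the extension isn't loaded, so the script still runs on a machine without XHProf.

```php
<?php
// Hypothetical stand-in for the code you actually want to measure.
function import_messages($count)
{
    $total = 0;
    for ($i = 0; $i < $count; $i++) {
        $total += strlen(md5((string)$i));
    }
    return $total;
}

// Hypothetical helper: profile a single callable rather than the whole script.
// Returns array(result, profile data), with null profile data if XHProf is missing.
function profile_block($work)
{
    $have_xhprof = function_exists('xhprof_enable');
    if ($have_xhprof) {
        xhprof_enable();
    }
    $result = call_user_func($work);
    $profile = $have_xhprof ? xhprof_disable() : null;
    return array($result, $profile);
}

list($result, $profile) = profile_block(function () {
    return import_messages(1000);
});

if ($profile !== null) {
    // Keys look like "parent==>child"; 'ct' is the call count and
    // 'wt' is the wall time in microseconds.
    foreach ($profile as $call => $stats) {
        error_log(sprintf('%s: %d calls, %d us', $call, $stats['ct'], $stats['wt']));
    }
}
```

The array returned by xhprof_disable() is the same data the web interface displays once a run has been saved, so dumping it to the log is only a quick way to check that the profiler is capturing the block you meant.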

There are a couple of caveats. It's missing a few features I'm used to from advanced desktop profilers, like detailed information on the full call stack for functions rather than just the immediate parents, and timing for individual lines of code rather than whole functions. In practice, though, it's got all the features I need to diagnose performance problems. I was able to speed up my IMAP email importing dramatically, largely by removing the use of a global variable in an inner loop; it turned out to be far faster to pass the object as a function argument! That's the sort of problem that would have taken me far longer to find without XHProf.
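If you want to check the global-versus-argument difference on your own setup, a micro-benchmark along these lines is one way to do it. This is only a sketch: the lookup table, the function names and the iteration count are all invented for illustration, and the relative timings will depend on your PHP version, so it makes no promise about which version wins.

```php
<?php
// Hypothetical lookup table standing in for the shared object.
$GLOBALS['g_counts'] = array_fill(0, 1000, 1);

function lookup_global($i)
{
    global $g_counts;            // bound again on every call
    return $g_counts[$i % 1000];
}

function lookup_arg($counts, $i)
{
    return $counts[$i % 1000];   // handed in by the caller instead
}

// Run a callable and return array(result, elapsed seconds).
function time_it($work)
{
    $start = microtime(true);
    $result = call_user_func($work);
    return array($result, microtime(true) - $start);
}

$iterations = 200000;

list($sum_global, $secs_global) = time_it(function () use ($iterations) {
    $total = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $total += lookup_global($i);
    }
    return $total;
});

list($sum_arg, $secs_arg) = time_it(function () use ($iterations) {
    $counts = $GLOBALS['g_counts'];
    $total = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $total += lookup_arg($counts, $i);
    }
    return $total;
});

printf("global: %.4f sec, argument: %.4f sec\n", $secs_global, $secs_arg);
```

Both loops compute the same sum, so any difference in the two timings comes from how the table reaches the inner function.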

Here are the primitive timing functions I mentioned at the start:

// Wall-clock times, in seconds, of the most recent start and end calls.
$g_start_time = 0;
$g_end_time = 0;

// Call just before the block you want to measure.
function pete_start_timer()
{
    global $g_start_time;
    // microtime(true) returns the time as a float, so no string parsing is needed.
    $g_start_time = microtime(true);
}

// Call just after the block. Returns the elapsed seconds, and writes them
// to the error log unless $dolog is false.
function pete_end_timer($dolog=true)
{
    global $g_start_time;
    global $g_end_time;
    $g_end_time = microtime(true);
    $duration = ($g_end_time - $g_start_time);

    if ($dolog)
    {
        $durationstring = 'pete_timer: %01.4f sec';
        error_log(sprintf($durationstring, $duration));
    }

    return $duration;
}

Sewers and startups

Sewer
Photo by Elsie Esq.

I had a chance to chat with Matt Mullenweg yesterday, and he focused on something I've been struggling with; building to last. It also got me thinking about plumbing.

Joseph Bazalgette is one of my engineering heroes. He built London's first sewers in the 19th century, and started by estimating how large they'd need to be to cope with the current population. He then said "Well, we're only going to be doing this once and there's always the unforeseen" and doubled the diameter! Thanks to his foresight and the beautiful workmanship of the bricklayers, those same sewers are still serving Londoners today, despite a population many times larger.

The biggest enemy of early-stage startups is time. We can't afford premature scaling, because before we've finished building a system robust enough to handle millions of active users we'll have run out of money. That means we end up accumulating technical debt as we struggle to get customers and revenue with the least possible amount of code.

The danger is we end up successful, but so deeply mired in technical debt that we spend all our time paying interest rather than making meaningful progress with the product (see the last decade of Windows). As Vernor Vinge evokes so well, there's a good chance some of our code will be in the lower layers of the stack essentially forever. It's a deep engineering sin to inflict shoddy sewers on future generations.

Matt's key insight was "When you're in the red, time is working against you. Once you're profitable, time is on your side". Getting to even Ramen profitability changes everything, and gives you the ability to build for the long term.

When I joined Apple back in 2003, the central build farm for all projects had both PowerPC and x86 Darwin boxes, and our code had to compile on both. Steve was playing a long game; years before the Intel switch he was obviously planning for it (though I only caught the significance in retrospect).

Looking at WordPress, you can see the same combination of long-term planning sustained by profitability. A lot of focus in the startup world is on exits, but I'll be ecstatic if I'm still helping build Mailana in 20 years' time. Seeing Matt's dedication to building something to last gave me hope, especially as he gave practical steps to get there.

The SQL Trap

Venusflytrap
Photo by Beatrice Murch

Virtually every web developer starts off using a relational database like MySQL. It's so easy to use joins and sorts to implement the complex operations your service needs that pretty soon you end up with big chunks of application logic in your SQL queries. You don't know it, but you've just entered The SQL Trap.

I first heard the phrase from Jud Valeski, but it's something I've seen happen to every startup that deals with massive data sets, and I've struggled with it myself. You build a working service on a small scale, and that success brings in more users and data. Suddenly your database is the bottleneck.

At this point you have two choices. You can continue writing your queries in expressive high-level SQL and pour resources into speeding up the underlying system, or you can switch to the database equivalent of assembler with a key/value store and write application code to implement the complex operations on top of that.

In an ideal world a database is a black box: you send a query and the system figures out how to execute that operation speedily. Even in conventional databases, though, we end up deviating from that, e.g. by indexing certain columns we know we'll be querying on. After wrestling with speed problems I took a few steps beyond that by denormalizing my tables for common queries, avoiding joins at the cost of more fragile update logic. As my data grew, even that wasn't enough, and simple sorts on indexed columns were taking several minutes. I spent some time trying to second-guess MySQL's optimizer by tweaking various opaque limits, but it still insisted on sorting the few thousand rows by writing them out to disk and running something called FILESORT.

At this point I was in the trap. Getting further would require somebody with deeper knowledge of MySQL's internal workings, and would take a lot of tweaking of my queries and my system setup. Large companies end up throwing money at consultants and database vendors at this point, which is a great result for the providers!

Instead, as a starving startup I had to bite the bullet and throw out all my lovely queries. I switched to a key/value database for storage, and designed the keys and values to get a workable subset of the information I needed for any query. I then sucked all that data into PHP and did the sorting and refining there.
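To make that concrete, here's a minimal sketch of the pattern rather than my production code. A plain PHP array stands in for the key/value server, and the key layout ('user:7:msg:1'), the field names and the helper functions are all hypothetical.

```php
<?php
// A plain array stands in for the key/value server. Keys are designed so
// that everything needed for one query shares a common prefix.
$store = array(
    'user:7:msg:1' => json_encode(array('from' => 'ann@example.com', 'sent' => 1245000300)),
    'user:7:msg:2' => json_encode(array('from' => 'bob@example.com', 'sent' => 1245000100)),
    'user:7:msg:3' => json_encode(array('from' => 'cat@example.com', 'sent' => 1245000200)),
    'user:9:msg:1' => json_encode(array('from' => 'dan@example.com', 'sent' => 1245000000)),
);

// The "query": pull back every value whose key starts with the prefix.
function fetch_by_prefix($store, $prefix)
{
    $rows = array();
    foreach ($store as $key => $value) {
        if (strpos($key, $prefix) === 0) {
            $rows[] = json_decode($value, true);
        }
    }
    return $rows;
}

// The "ORDER BY sent DESC", done in application code.
function newest_first($rows)
{
    usort($rows, function ($a, $b) {
        if ($a['sent'] == $b['sent']) {
            return 0;
        }
        return ($a['sent'] > $b['sent']) ? -1 : 1;
    });
    return $rows;
}

$messages = newest_first(fetch_by_prefix($store, 'user:7:'));
foreach ($messages as $message) {
    printf("%s at %d\n", $message['from'], $message['sent']);
}
```

The trade-off is visible even at this size: the sort is an ordinary in-memory operation you fully control, but every filter or ordering you used to express in SQL now has to be written and maintained by hand.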

After the initial pain it was a massive relief. I was able to use my standard computer science skills to design the right data-structures and algorithms for my operations, rather than trying to second-guess the black box of SQL optimization. Sure I've now got a whole different set of problems with more complex application code to maintain, but it's taking a lot less time and resources than the SQL route.

Don't get me wrong, premature optimization is still the root of all evil, but if you're dealing with massive datasets and your database is becoming a bottleneck, consider dropping SQL and falling back to something more primitive, before you're trapped!

How to get Tokyo Tyrant working in PHP

Godzillarock
Photo by NNE

Regular readers know that I've been both entranced and frustrated by Tokyo Tyrant. It's an elegantly minimal key/value database server with great performance, but I've burnt days trying to get it running reliably with PHP.

I'm extremely happy to say I've now got it working, and it's everything I dreamed it could be. The major bug stopping me was truncation of values more than 16k in size, and that turns out to be a bug in the Net_TokyoTyrant PHP wrapper (and arguably a bug in PHP's libraries). The wrapper was using a single fread() call to get values, but a single call can return fewer bytes than requested, so it needs to be called repeatedly in a loop to get the full result. Jeremy Hinegardner got me attacking this again after he confirmed he was using Tokyo successfully through Ruby. Some debugging then made me suspicious of fread's reliability, and Blagovest Buyukliev's post confirmed it was the cause and gave me a drop-in fix.
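The general shape of that kind of fix looks something like the sketch below. This is not the actual patch to Net_TokyoTyrant; read_exact() is a hypothetical helper name, and a php://temp stream stands in for the socket so the example runs anywhere.

```php
<?php
// Read exactly $length bytes, calling fread() repeatedly until the stream
// has delivered them all. Returns false if the stream ends early.
function read_exact($stream, $length)
{
    $data = '';
    while (strlen($data) < $length) {
        // Only ask for what's still missing; fread() may return less.
        $chunk = fread($stream, $length - strlen($data));
        if ($chunk === false || $chunk === '') {
            return false;
        }
        $data .= $chunk;
    }
    return $data;
}

// php://temp stands in for the connection to the server.
$stream = fopen('php://temp', 'r+');
fwrite($stream, str_repeat('x', 40000));
rewind($stream);

$value = read_exact($stream, 40000);
fclose($stream);

printf("read %d bytes\n", strlen($value));
```

One assumption to flag: the loop treats an empty read as the end of the stream, which is reasonable for a blocking socket but would need a retry or timeout on a non-blocking one.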

I can't find a way to contact the original author of Net_TokyoTyrant to offer a patch, but the code is included in this updated unit test: tokyotest.php

Incidentally, I'd highly recommend running through a unix file socket rather than a network socket on localhost; that's been a massive speedup for my use cases.

Help the Iranian people

I don't know how things in Iran will turn out, but show your support for their struggle by following Mousavi on one of the few channels left for them, on Twitter:
http://twitter.com/mousavi1388

Seeing the photos and videos on there from Tehran is both terrifying and inspiring. My thoughts are with the Iranian people tonight.

Why I hate client-side code (and the cloud will win)

Carcrash
Photo by Saiki

Most of my career's been spent on desktop or embedded systems code and I'm a relative newcomer to web programming. Despite the horrors of server-side development (debugger? ha!) it's so much faster to develop web services than traditional apps. The main reason is that I have control over far more of the environment when the code is running on my own box and I'm only relying on a client to display the UI. The testing matrix for Apple Motion was insane because it ran on the GPU, every piece of hardware behaved differently, and so as new graphics cards and machines came out the combinations we had to check exploded.

So, I have a lot of sympathy with Microsoft, and the Xobni folks doing client-side processing, but this novel-length KB article on troubleshooting Outlook crashes sums up why users are so happy with web apps, despite their limitations.

An implicit data bill of rights

Wethepeople
Photo by Vkx462

I've been lucky enough to spend some time with Ken Zolot this week, who's heavily involved with startups both through MIT and the Kauffman Foundation. He threw some fantastic papers in my direction, and one of the most interesting finds was a proposal by Alex Pentland on data privacy, what he calls a New Deal on Data. I've been wrestling with how to use implicit data on people's behavior in an ethical and honest way, and Pentland's definition is really helpful.

He draws on English Common Law principles of possession, use and disposal, and applies them to data about ourselves, to match our intuitive feelings of ownership of information about ourselves.

1. You have a right to possess your data. Companies should adopt the role of a Swiss bank account for your data. You open an account (anonymously, if possible), and you can remove your data whenever you’d like.

2. You, the data owner, must have full control over the use of your data. If you’re not happy with the way a company uses your data, you can remove it. All of it. Everything must be opt-in, and not only clearly explained in plain language, but with regular reminders that you have the option to opt out.

3. You have a right to dispose or distribute your data. If you want to destroy it or remove it and redeploy it elsewhere, it is your call.

In practice these make some technical demands for the ability to export and delete information that few services provide. Try saving out your friend graph from Facebook without violating their terms-of-service!

This makes it a tough sell for corporations built around hoarding users' information as a proprietary asset. In the long term, though, users sharing information widely will benefit services that don't lock in their users. You can already see that with Twitter's API; their lack of restrictions has led to applications that weren't even imagined before the data became available.

How can you measure influence?

Persuasion

Influence is the measure of your ability to persuade others to take an action. Micah of Lijit gave a barnstorming talk at Boulder NewTech last night, describing how they are starting to measure bloggers' influence. It's not publicly released yet, but they're combining both raw audience figures and the user activity they measure through the Lijit widget, things like searches and clicks.

This is exciting to me because nobody's been able to use implicit data on people's behavior in a widespread way, because nobody's had access to a large enough set. I'm bullish on Lijit's prospects because they are in a unique position with hundreds of millions of user interactions across thousands of sites in their database (OneRiot are the only other company I can think of that's got access to more info through its browser add-on).

Lijit's measure is a big step forward, but did leave me with a couple of questions. Influence has to be defined around an action, but their measure seems to be positioned as a universal metric. Lolcats is a lot more likely to make me buy a t-shirt than the Sunlight Foundation, but lolcats has no influence on how I vote. If you pick a single influence number you can't capture that.

There's also the question of who you're influencing. A picture on lolcats will get a lot more pageviews than a post on Brad's blog, but a lot more influential people in the tech community will see the blog post. Google's PageRank tackles this by taking the influence of the people who link to a site into account to calculate its influence. That means a bunch of barely-read geocities (RIP) sites linking to you doesn't matter as much as a link from the New York Times. There's no equivalent way of compensating for the relevance of the users whose activity you're measuring. Having a single Steve Jobs viewing your pages is more influential than 1000 random teenagers.

I have some thoughts on fixing this, and actually started running PageRank on Twitter conversations to figure out who was most influential on the service, but had to put that on hold to focus on other work. I can't wait for Lijit to launch the rankings. Despite all my niggling, this should be a massive jump forward!
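For anyone curious what running PageRank over conversations involves, here's a minimal sketch of the standard power-iteration version. It isn't the code I was running: the pagerank() helper, the made-up mention graph, the 0.85 damping factor and the fixed 50 passes are all illustrative assumptions.

```php
<?php
// $links maps each person to the list of people they mention.
// Returns an array of person => score, with the scores summing to 1.
function pagerank($links, $damping = 0.85, $iterations = 50)
{
    // Collect every node, including people who are only ever mentioned.
    $nodes = array();
    foreach ($links as $from => $targets) {
        $nodes[$from] = true;
        foreach ($targets as $to) {
            $nodes[$to] = true;
        }
    }
    $count = count($nodes);
    if ($count === 0) {
        return array();
    }
    $names = array_keys($nodes);
    $rank = array_fill_keys($names, 1.0 / $count);

    for ($pass = 0; $pass < $iterations; $pass++) {
        // Everyone starts each pass with the small "random jump" share.
        $next = array_fill_keys($names, (1.0 - $damping) / $count);
        foreach ($rank as $node => $score) {
            $targets = isset($links[$node]) ? $links[$node] : array();
            if (count($targets) === 0) {
                // Someone who mentions nobody spreads their score evenly.
                foreach ($names as $other) {
                    $next[$other] += $damping * $score / $count;
                }
            } else {
                // Otherwise the score is split between the people mentioned.
                foreach ($targets as $to) {
                    $next[$to] += $damping * $score / count($targets);
                }
            }
        }
        $rank = $next;
    }
    return $rank;
}

// Hypothetical data: alice and carol both mention bob, bob mentions alice.
$mentions = array(
    'alice' => array('bob'),
    'carol' => array('bob'),
    'bob'   => array('alice'),
);

$rank = pagerank($mentions);
arsort($rank);
foreach ($rank as $name => $score) {
    printf("%s: %.3f\n", $name, $score);
}
```

The point of the exercise is that a mention from someone who is themselves widely mentioned counts for more than a mention from someone nobody talks to, which is the weighting a raw pageview count can't give you.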

Are you taking market risk or technology risk?

Brokencasio
Photo by MHuang

If you're working in the pharmaceutical industry, your main risk is your new treatment won't work. There's a massive number of medical problems people are certain to pay money to solve, if you can create a drug that works.

In the toy business, it's completely different. Building that new Pet Rock or Cabbage Patch Doll is easy, but for very hard-to-predict reasons people may not like it. You may not be able to distribute or market it even if there are some who do.

Most startups lie on a spectrum between these two extremes of technology and market risk, but I've learnt it's crucial to understand what your mix is. People from a business background prefer market risk, because that's something they know how to measure and mitigate. Techies like me have a bias towards hard engineering problems that they know how to solve.

I started off thinking that Mailana's main risk was technology: it's really hard to integrate with Exchange, build Outlook plugins and analyze millions of emails in real time. There were all sorts of end-user problems that could be solved with the information derived from that, so once the system was built, customers would come. You can chuckle at my naivety, but I never understood that there were two separate risks. I put a lot more effort into coding than understanding the market, and then discovered there were all sorts of unexpected cultural issues around privacy that scuppered my first attempt when it was in front of customers.

The beautiful thing about market risk is that you can take very simple steps to reduce it before you spend months coding. Build slideware and ask your potential customers if they'd buy a working version. Buy some relevant AdWords and point them at a dummy product page to see if anyone signs up for more information.

If you're reading this, you're not Pfizer and you do have a market risk. Take a long hard look at your business and see what you can do to reduce it.

How to make connections with people you don't know

Stalking

Matt Van Horn from Digg gave a talk I wasn't expecting last night: the practical side of networking. The whole mission of Mailana is "You guys should talk"; I love it when I can connect two people who can help each other. To make that happen, you have to be able to build bridges with strangers, and Matt revealed his personal toolkit for reaching the right people.

Matt started off with the LOLCAT picture because if you're not used to networking it can feel creepy and exploitative. What I've realized, and Matt emphasized, is that you need to approach it as a way of helping other people, not just be a taker. It's a long-term project, not something you desperately turn to at the last minute when you need a job.

Having said that, the story of how Matt got to be business development manager at Digg is an example of how chutzpah pays off. He targeted Digg as a company he really wanted to work for, and queued for 2 hours at a trade show to get a business card from Jay Adelson, their CEO. After that he emailed him repeatedly trying to set up a meeting, as well as sending on relevant newspaper articles to the Digg offices. Then he guessed a couple of email addresses for their recruiter and CRO, and eventually landed an interview. They asked him to write a detailed description of the position he wanted to create in the company and how it would help Digg. Finally that landed him the job! Wouldn't you hire somebody who showed that much determination and resourcefulness?

Here's a few of the tips Matt gave out for getting in touch with people you want to talk to, but can't get a 'warm' introduction for:

- Guess email addresses. Most companies have a fixed format, eg pete.warden@company.com, pwarden@company.com, pete@company.com. Figure it out from public examples or just guess and fire off a message.

- Call at odd hours. Receptionists are usually only there 9-5, but most of us work before and after, so there's a good chance somebody helpful will pick up if you ring 7:00am to 9:00am or 5:00pm to 7:00pm.

- Contact them through random social networks. Last.fm and other common sites with a social element have ways of sending their users messages. If you can find the account of the person you're looking for, send them a message and it will most likely show up in their regular inbox.

- Send an 'I've worked with you' connect request on LinkedIn. Even if you haven't been a colleague, you've got a chance to explain in the note why you want to talk to them. I have a 'pro' account on LinkedIn which lets me send a limited number of messages to people outside my connections, but a sparing use of this approach is much cheaper!

I have a few more ideas I've found effective:

- Blog about people or companies you like. I'll often spend time researching companies or entrepreneurs I think are really cool so I can learn something, and then share it as a blog post. An awesome side-effect is that I often hear back from the people I've written about; that's how some of my best collaborations have come about. As I wrote in Beetlejuice, Beetlejuice, Beetlejuice, just saying someone's name on the internet is often the best way to get in touch.

- Comment on their blogs or Twitter streams. I find myself doing this naturally with interesting people I'm following, but it's also a great way to build a relationship and demonstrate a sustained commitment.

The key to all of these is thoughtfulness and sincerity. If you really don't care about what they're doing it will come across and you'll just be wasting time. Be natural, be passionate. Follow up, and show you're listening by referencing previous conversations when you do. Spend more time figuring out how you can help them than how they can help you.

Skynet runs on Windows/MFC

Skynetscreenshot

Liz and I were re-watching season 2 of the Sarah Connor Chronicles when I spotted some familiar-looking code on John Henry's bootup screen. WM_ACTIVATETOPLEVEL sure looks like a Win32 constant, and googling led me to MSDN documentation revealing it's a private message associated with MFC. It looks like autosysconf is running some C++ code to boot up the AI. The other evidence in the series is ambiguous about what side John Henry is on, but AI code in MFC is clearly evil. Interestingly, the most common use of FEP is for the front end processor on the Symbolics Lisp Machine, and Lisp would be a much more sensible language.

Interestingly, though John Henry (and presumably his brother Skynet) appears to be Windows/x86-based, Terminators are known to use Apple II/6502 processors. The thought of dealing with porting between those two almost makes me feel sorry for our future robot overlords.

Scrape your call history with Selenium

Floorscrapers
Photo by WallyG

There's a lot of interesting data out on the web that's locked up in web pages, with no API access to make it machine-readable. I'm particularly interested in phone records; just like emails, IMs and tweets they form a detailed shadow of your social network. To tackle automatically grabbing my phone call history from the AT&T site I turned to Selenium, originally built as a testing tool but also well-suited to screen-scraping on sites with complex login procedures.

To get started you can install the Selenium IDE in Firefox and record the steps you'd manually take to log in and get to the screen you're interested in. Selenium turns those actions into a script you can manually edit and replay. In my case I needed to add some 'type' commands to enter the phone number and password since those weren't captured. Here's the resulting script. Once you've added your own details, you should be able to run it on your own account to download your call details as a CSV file:

Download Attdownload

What's really handy is that you can use Selenium Remote Control to then re-run that same script from your server, using PHP or other popular languages. It's a bit of a hack because it still requires windowing capabilities so it can run within Firefox and a proxy server process to insert the needed code into external pages, but once it's running it's an incredibly flexible way to deal with constantly changing websites.

Move fast and break stuff

Breakglass
Photo by mpires

I recently talked to someone at a very innovative large web company (under Frie-NDA) who described their official engineering motto as "Move fast and break stuff". I love that philosophy because it ties in to research showing that really successful people get there by trying a lot more approaches than average folks. They fail faster, cheaper and more often than ordinary people.

The key to making that work is that the cost of the total failures must be less than the value of the cumulative successes. This is a hard problem, because the default for most organizations is "managing to avoid blame". Their implicit motto is "Reward success and inaction, punish failure", which ends up making inaction the most appealing course. "Move fast and break stuff" encourages a different mentality, "Reward success and failure, punish inaction".

So how do you get that mindset in your organization? The most important step is to de-stigmatize failure. The web company I mentioned makes it clear to their engineers that they will not be punished if they break the site, even if it costs millions of dollars in lost revenue. I didn't get to dig deeper on that topic but I'd imagine there are some serious post-mortem procedures to understand why things go wrong and build tools to prevent a recurrence, like the Five Whys.

Can you help me shape Mailana?

Sculptor

I've got some important and tricky decisions to make about the future direction of Mailana. To make those choices I need to better understand the problems that people are facing, so I've designed a short 8 question survey. If you are interested in the work I'm doing, it will help me a lot if you're able to take a few minutes to fill it out. It also gives you the chance to sign up for early previews of new features before they're publicly released. Thanks!

10 ways to kill my startup

Poisonlabel

Planning is overwhelming; it's hard to know where to begin. One solution I've picked up is 'anti-planning': write out all the actions you'd take if you wanted to ensure failure. It's far easier to remember the background to past disasters than to understand why things succeeded. With those fresh in your mind you'll find drawing up an actual plan much simpler. The list is also great to keep pinned to your notice-board, to remind yourself when you do start wandering towards one of those seductive traps.

Here's how I'd sabotage my startup, in 10 easy steps:

  1. Get distracted by every shiny new idea and forget what my big goal is
  2. Leave my product to sell itself; build it and they will come, right?
  3. Have no idea who my customers are
  4. When I describe what I'm building, focus on the technology
  5. Worry about my grand strategy, not the logistics of executing
  6. Spend more time meeting with investors than customers
  7. Build features customers don't want
  8. Focus on minor bug fixes
  9. Ignore people who want to help
  10. Rely on my intuition to tell what's working, not dull metrics

An alternative Gmail API

Opendoor
Photo by Funky64

Gabor Cselle, formerly a Gmail engineer and now a founder of the YCombinator startup Remail, has been doing some really interesting work in the email field recently. Their main Remail product takes the normal approach of asking for your Gmail username and password and then fetching all your messages through IMAP. As far as I knew this was the only way of accessing your inbox, but it is horrible for security since it requires users to hand over their Google passwords to a third-party website.

That meant I was intrigued to see that one of their experimental projects uses OAuth to access users' inboxes. This is a massive improvement, since the third party never sees the original password, but I didn't know that any of the mail APIs supported this. Trying to figure out how he did it, I discovered it's possible to grab an Atom feed of your messages. Here are a few command-line examples you can try for yourself, replacing username and password with your Gmail credentials:

curl "https://username:password@mail.google.com/mail/feed/atom/unread#all"
Shows unread emails from all your folders

curl "https://username:password@mail.google.com/mail/feed/atom/inbox"
Shows unread emails in your inbox

curl "https://username:password@mail.google.com/mail/feed/atom/spam"
Shows all your unread spam emails

These all use basic HTTP authentication but web applications can call the same URLs after authenticating with OAuth, giving users a much more secure experience.

There are some pretty serious limitations though. These feeds only let you see unread emails, and are limited to 20 messages at most. That rules out applications that need a lot of email to analyze, but I'm sure there are some other interesting tools that could be built within the restrictions. I'd be curious to know if any other developers are using this and if there are any ways around the limitations. In the meantime I'll keep debugging my IMAP code!
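Once you have one of these feeds, pulling out the subjects and senders takes only a few lines. The sketch below parses a made-up sample rather than a live response: the feed contents are invented, list_unread() is a hypothetical helper, and I'm assuming the entries carry the usual Atom title and author elements.

```php
<?php
// Invented sample in the shape of an Atom feed; a real response would come
// from one of the feed URLs instead.
$sample_feed = '<feed xmlns="http://purl.org/atom/ns#" version="0.3">
  <title>Inbox</title>
  <entry>
    <title>Lunch on Friday?</title>
    <author><name>Ann</name><email>ann@example.com</email></author>
  </entry>
  <entry>
    <title>Build is broken</title>
    <author><name>Bob</name><email>bob@example.com</email></author>
  </entry>
</feed>';

// Hypothetical helper: return one array('subject' => ..., 'from' => ...)
// per entry, or an empty array if the XML can't be parsed.
function list_unread($feed_xml)
{
    $feed = simplexml_load_string($feed_xml);
    if ($feed === false) {
        return array();
    }
    $unread = array();
    foreach ($feed->entry as $entry) {
        $unread[] = array(
            'subject' => (string)$entry->title,
            'from'    => (string)$entry->author->email,
        );
    }
    return $unread;
}

foreach (list_unread($sample_feed) as $message) {
    printf("%s (from %s)\n", $message['subject'], $message['from']);
}
```

This relies on PHP's SimpleXML extension, which is enabled in most builds.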

How to use Yahoo's Placemaker API to extract places from documents

Oldmap

Today I was lucky enough to hear Greg Cohn walk us through all the goodies Yahoo offers developers. I'm a big fan and heavy user of their Geoplanet geocoding API, so I was stoked to hear they'd just launched a service to recognize placenames in arbitrary HTML and XML documents. Why is this so interesting? Look at what Just Landed have done by searching for the words "Just landed in" in Twitter messages and then geocoding and visualizing the placenames. Placemaker makes it a lot simpler to build tools like this with anything that can be expressed as XML or HTML. That covers web pages, REST APIs like Twitter's and even RSS feeds, so you can see why I'm excited!

I've put together a simple example that shows off how to use it as a bash script, tested on OS X. You can download it as geturlplaces.zip here, or I've included the source below. To use it, pass a web page address as the first argument, eg ./geturlplaces http://news.bbc.co.uk/

For production code you'll want a real XML parser rather than the regexes used below.

#!/bin/bash

# enter your Yahoo geo app id here - to obtain one go to http://developer.yahoo.com/wsregapp/index.php and register
# (interestingly as of May 20th 2009 it works with a bogus id!)
APPID=XXXXX

if [ $# -ne 1 ]
then
  echo "Extract a list of all the recognized place names from a web page using Yahoo's Placemaker API"
  echo "Usage: $(basename "$0") <web page url>"
  exit 65
fi

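# --data-urlencode keeps page addresses containing & or = intact as a parameter
curl --silent \
  --data-urlencode "documentURL=$1" \
  -d "documentType=text/html" -d "outputType=xml" -d "appid=$APPID" \
  "http://wherein.yahooapis.com/v1/document" \
  | grep '<text><!\[CDATA\[' \
  | sed -e 's/<text><!\[CDATA\[//' -e 's/\]\]><\/text>//' \
  | sort -u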


Jaw-dropping nature photos

Fallsrainbow

ModernHiker pointed me towards Ben H's Flickr portfolio, and I'm blown away. His photos really renew my sense of wonder about the amazing world we live in. Here are a few of my favorites, but there are so many more.

Alaskaaurora

Aurora in Alaska

Thewave
First light on the Wave

Iceberg
A beached iceberg

Privacy's vanishing; how screwed are we?

Veil
Photo by Matanya

The whole theory behind Mailana is that people's attitudes to privacy are changing; there's a younger generation willing to open up private information as long as they get something useful in return and retain control. I've written about this before, but a recent post by Marc Hedlund brought some of my thoughts into focus.

He's a self-confessed "privacy freak" but concedes that he's on the losing side of the battle. Selfishly speaking that's a great validation of the bet I'm making on my business, but what's interesting is his motivation. He says that people are blasé about privacy online because they've never been stalked or the victim of identity theft. Once you go through that hell, like he has, you realize how useful all those old-fashioned notions really are.

That makes a lot of sense to me. Those are black swan events: statistically speaking pretty unlikely to happen to you, but devastatingly bad when they do. What's worse is that easy-going attitudes towards privacy create an environment where criminals will thrive, actually making it more likely you'll be attacked in the future. By handing over personal information and even passwords we're all picking up pennies in front of a steam-roller right now.

I'm still a fan of people's new freedom to trade some privacy for something they want more, but I'm acutely aware that people are care-free about that bargain because they've never been stung. A lot of people are going to get hurt before we reach a new equilibrium, with widely understood ground rules for what's acceptable and safe.