PeteSearch

Ever tried. Ever failed. No matter. Try Again. Fail again. Fail better.


Five short links

Photo by Earl

Want to live somewhere nice? Be prepared to work longer - How an area's living costs affect poor and rich workers differently.

Moving towards an identity and patient records locator - As Ben Adida points out, a one-way hash of a cell phone number with fewer than a billion possible inputs is not very useful, because anyone can hash every possible number and look for a match. The flipside of de-anonymization being easy with enough dimensions is that it should be possible to perform entity disambiguation using the same data, so just store the messy redundant information you get as input, and do joins when you need to. The problem of matching entities is hard because defining an entity is hard; don't fight it.

Open Access Coalition - If we're not careful, we're going to look back on the '00s as a golden age when data was open on the web.

DIY McDonald's Recipes - This is a maker project I can get behind! Fast food is my guilty pleasure, thankfully only occasional these days, but I have an engineer's appreciation for the thought that has gone into designing their recipes.

Save yourself from Reddit, Hacker News, Slashdot - A neat little productivity hack from Steve Coast!
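
To make the point about hashed phone numbers concrete, here is a minimal Python sketch of the brute-force lookup. The choice of SHA-256, the sample number, and the helper names `hash_number` and `recover_number` are all assumptions for illustration, and the search is limited to the last four digits so it finishes instantly.

```python
import hashlib
from typing import Optional


def hash_number(number: str) -> str:
    """One-way hash of a phone number (SHA-256 is an assumed choice)."""
    return hashlib.sha256(number.encode("ascii")).hexdigest()


def recover_number(target_hash: str, prefix: str, suffix_digits: int) -> Optional[str]:
    """Hash every candidate suffix and return the number that matches, if any."""
    for n in range(10 ** suffix_digits):
        candidate = prefix + str(n).zfill(suffix_digits)
        if hash_number(candidate) == target_hash:
            return candidate
    return None


# Hypothetical number; a real attack would search the whole number space.
secret = "4155550123"
leaked = hash_number(secret)
recovered = recover_number(leaked, prefix="415555", suffix_digits=4)
print(recovered)  # 4155550123
```

Searching a full billion candidates is the same loop with more iterations, which is why a small input space undoes the protection the hash seems to offer.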

March 19, 2013

Quantity has a quality all its own

Photo by Kevin Collins

I used to be an image processing engineer. I'd be handed a picture, and I'd have to do something useful with it. To do that I had to take a big mental leap. Instead of seeing it as an image I had to picture it in my mind as a grid of measurements.

At first this was intensely frustrating, because they were deeply crappy measurements. A million factors introduced noise or errors, everything from lenses to sensor noise to encoding software. Gradually I began to make progress, despite all these problems. Decades of engineers before me had figured out inelegant but effective methods of getting value from an unpromising soup of pixels, and I was able to learn from their approaches.

Interesting algorithms in image processing are almost comically domain specific. Thousands of man years of work have gone into detecting and correcting the distinctive reflections that occur when people's eyes are caught in a camera flash. Compressing photos effectively requires an exhaustive knowledge of the human perception system, and very clear ideas of the likely subject matter for photos. The process behind facial recognition is like a game of Mouse Trap, with a whole series of steps that have been empirically proven to work, but which could never have been predicted from any theory.

The computer science I was taught at college grew out of mathematics, and assumed that you have a minimal set of clean inputs. Provability and understandability were prized values, and so messy ad-hoc algorithms were seen as dead ends, even if they worked for the problem at hand. Image processing taught me to value them instead, as long as they could be proven to work across the kind of inputs I was likely to encounter in practice.

Once I'd learned that, the world began to look very different. My image processing training gave me the mental tools to tackle problems that other people shied away from. If I have a large enough set of data, I know how to search for the signal, even if the noise is deafening. I'm happy to rely on correlations that aren't guaranteed to hold for all time, as long as I can test that they hold in the cases I care about now, and have instrumentation to spot if the prediction stops working. I know that getting 80% of the way there and having a human fill in the blanks is often good enough.

I wasn't the only one to discover how effective this mindset can be, and it has come to be known as Data Science. It's an approach to solving problems that's light on elegance and heavy on pragmatism. It doesn't care about proofs but relies on experiments. Entirely new things are possible once you have massive amounts of data, so even if you're a grizzled old engineer like me and instinctively shy away from trendy new labels, give Big Data a try. Amongst all the marketing hype, there are some powerful techniques for building algorithms that have no right to work, but do.

March 18, 2013

Five short links

Photo by Yersinia

The Deleted City - A spatial reinterpretation of the old Geocities sites. Having data in a single large dump instead of behind an API makes it possible to do things like this with it, things that the creators could never have foreseen.

Asteroid Discovery from 1980 to 2011 - See how our knowledge about the world around us has grown with this amazing animation. At the start new asteroids appear as discrete pinpricks days or months apart; by the end they're being discovered so fast they're a solid mass, like a lighthouse beam hitting fog. It's not only that we're finding out more, but that the rate of discovery is accelerating.

Open data on depression treatment in London - I love seeing mass adoption of data technologies, it's this sort of democratization of the tools that makes the real difference to the world. What's special about this approach is that it's so ordinary, what used to be elite techniques are now available to people in every walk of life.

BitDeli - I haven't used this yet, but I love the idea of being able to program custom analytics code, without the hassle of having to host it myself, and with the benefit of being able to reuse other people's approaches too.

Silicon Valley poverty - Even after twelve years here, I'm still shocked by how wide the gap is between the rich and the poor in the US. 

March 13, 2013

Why I'm a terrible privacy advocate

Photo by Michael Scott

People often think I'm a privacy researcher, thanks to the Facebook and iPhone stories. The truth is I'm just curious about undiscovered data. Because a lot of it is about people's behavior, and that's an inherently creepy area, I blog about what I'm doing to keep myself honest. It might look like I'm on a privacy crusade, but that's just a by-product of my attempts to figure out ethical ways to use these sources of information. I'm a data hacker, and I'm trying to keep my hat clean.

This has been on my mind a lot recently as I'm looking around at all the information that's publicly available about exactly where people have been. Facebook, Google+, Instagram, Flickr, and Twitter are all making rich streams of location data available, especially around photos. My vision is a world where I can make those digital footprints visible to ordinary users. Who comes to this bar? Any of my friends? What sort of people take photos at this hotel?

The raw data to do this is already out there in multiple places, and you can do some of it by going to individual sites like Foursquare, but there's something different about merging together scattered information, even if it's all theoretically public already. You have to make a choice before your activities are publicly visible from these services, but the implications of that choice aren't clear until somebody aggregates the data and demonstrates why the sum is greater than its parts.

I wish I could pretend I was only worried about the privacy implications, but the truth is I'm excited about how fun and useful the applications could be!

March 12, 2013

Five short links

Photo by Flood G

BetaShapes - Using geotagged Flickr photos to define San Francisco's neighborhoods as a crowd-sourced 'folksonomy'. I'm entranced by how many useful things emerge from the clouds of data exhaust we're all generating.

Bacteria farming and software design - Code is an incredibly useful tool for artists, I love this behind the scenes look at how an amazing visualization was built.

Ang Lee and the uncertainty of success - The acclaimed director spent six years of his career with no visible signs of making any progress, and this post does a fantastic job highlighting how hard that must have been.

Founders and dysfunctional families - Growing up in chaos is good preparation for working in a startup.

Common Crawl URL search - Thinking of crawling the web? Check out this web interface to see if Common Crawl already has what you're looking for sitting in a handy S3 bucket!

March 11, 2013

Why should you care that artists are underpaid?

Picture by Jamie

I've spent most of my career working closely with artists, and they were usually paid less than me. At first this was just awkward, but I began to realize it was part of a deeper problem. Most business owners didn't understand what artists were even adding to the product, and the pay was just a symptom of the lack of respect they had for their contributions. I remember at my last game industry job the owner began replacing experienced 3D artists with high-school graduates being paid a third of their salaries. 

I'm a capitalist, red in tooth and claw, so why did I have a problem with that? At that point, I'd spent six years at the coal face of a creative industry, and I knew how much those people contributed to a successful product. The trouble was it was often in subtle ways that were crucial but easy to miss. In the short term you could easily continue producing sequels, recycling assets and ideas, but it wasn't sustainable. It was a great way of cutting costs, but you ended up with a boring product that was indistinguishable from the competition, which meant a lot lower profits over time. If you don't value the craft of experienced creative people, you'll never get a hit, and that's where you really make money.

I learned to find places that valued artists, not just because they were better places to work, but because I thought they had a much higher chance of success. I was never an Apple user before I was asked to join the company back in 2003, and my biggest concern was that they were about to go bust(!), but I was attracted by their reputation. I wasn't disappointed, the designers were very definitely in charge! Dedication to the intangible details of products was the core of Apple's success, and that meant valuing artists. It wasn't always easy, it was a challenging pressure-cooker environment for a lot of my friends, but the importance of what they were doing was never in doubt. I always believed Steve would happily ship off the software engineering side to the other side of the world if he could, but the designers were the company.

That's why I'm sad when I see industries throw away the talent that made them great. The Visual Effects Oscar speech was cut off when the winner started to mention the bankruptcy of Rhythm and Hues, the team that was behind a lot of the shots that won the award (and where several of my friends work). It's the latest casualty of a wave of VFX closures, and a sign that film industry bosses think they can get away with cheaper, less-experienced artists, and audiences won't notice the difference. It's like Detroit in the '70s: they have enough momentum that it will take a while for the problem to be noticeable, but by the time it's obvious, the talent will have disappeared. Fit and finish matter, and when capital-intensive industries like cars, film or games forget that, disaster looms. A free market will eventually correct the problem, but only after a lot of money has been wasted, and a lot of people have gone through hell.

Learn from Apple's success; valuing artists makes you money!

February 25, 2013

Which iOS versions are Jetpac users running?

Photo by Visual Media

I just hit a nasty bug in the Jetpac iPad app that only seems to affect users on iOS 6.0.1. Unfortunately it seems to be deep in the OS's Facebook integration code, so I wasn't able to put in a very satisfactory fix. Since the problem didn't occur in earlier versions of the OS, and was fixed in later ones, I needed to figure out how much more time I should spend on this bug, versus all the other issues we're wrestling with.

To get a good idea of the user impact, I needed to know how many of our users are on that exact iOS version. Happily we've got quite a robust analytics setup now, so it just took me a couple of minutes to pull that out. It was pretty interesting to see how fast iOS 6 has been adopted, so I thought I'd share it. It's based on a few thousand users from the last couple of weeks, and the sample is for iPad users who like travel apps, but I can't imagine it's too skewed from the general app-installing population:

For the major versions, 86% were on iOS 6, 14% were on iOS 5, and we've dropped support for iOS 4 so we had practically no users there. Here are the details on the minor ones:

6.1.2   0%
6.1.0  39%
6.0.2   5%
6.0.1  38%
6.0.0   5%
5.1.1  12%
5.1.0   0%
5.0.1   1%
5.0.0   0%
4.3.5   0%
4.3.3   0%
4.3.2   0%

For my purposes, this means that 6.0.1 is still fairly strong, and I have more work to do! It's good news that in just five months, 86% of our users have moved over to a version of iOS 6, though; that will definitely make on-going support a lot easier! It's surprising that the 6.1.2 user numbers are so low, but the cut-off time was earlier yesterday, so it may be that few of our users had upgraded by that point.
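
As a rough cross-check, the minor-version shares listed above can be rolled up into major versions with a few lines of Python. This is only a sketch built on the rounded figures in this post, not the actual analytics query, and the names `minor_share` and `share_by_major` are made up for the example.

```python
from collections import defaultdict

# Rounded minor-version shares from the list above (percent of users).
minor_share = {
    "6.1.2": 0, "6.1.0": 39, "6.0.2": 5, "6.0.1": 38, "6.0.0": 5,
    "5.1.1": 12, "5.1.0": 0, "5.0.1": 1, "5.0.0": 0,
    "4.3.5": 0, "4.3.3": 0, "4.3.2": 0,
}


def share_by_major(shares):
    """Sum the percentage shares of all minor versions under each major version."""
    totals = defaultdict(int)
    for version, percent in shares.items():
        major = version.split(".")[0]
        totals[major] += percent
    return dict(totals)


major_share = share_by_major(minor_share)
print(major_share)  # {'6': 87, '5': 13, '4': 0}
```

The sums come out at 87% and 13% rather than the 86% and 14% quoted for the major versions, presumably because each minor-version figure was rounded separately.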

February 21, 2013

Five short links

Photo by Martin Fisch

Facial profiling for the detection of mal-intent using thermal imaging - I've been out of the loop on how far image processing has come in detecting emotions. If you think a computer that recognises your face is uncanny, how about one that can tell how you're feeling better than a human?

FakeBelieve - Juicy details on how the amazing photographic paintings of Ransom Mitchell are created. I love how they're hacking reality, with hardly a CPU in sight.

Beauty in the Breakdown - Beautiful writing about a climber's addiction to 'benzo' pills, and the interplay between anxiety, obsession, and all kinds of addiction.

Quandl - It's not a new idea, but this is a good implementation of a search engine for time-series data sets.

Bulk loading data into Redshift - I've been very impressed by what I've seen of Amazon's new database aimed at scalable analytics processing, especially since bulk loading seems to be a first class citizen, rather than something you have to hack on after the fact like the old SimpleDB.

February 19, 2013

A pub that's also a theater?!?


I love drinking beer, and I love watching plays. I usually have to elbow my way into a crowded bar in intermission to combine the two, which is far from ideal. Imagine my delight when I ran across the concept of the San Francisco Theater Pub! The beer's right there, and so is the play. It's genius!

Tonight was my first chance to sample the format, and I had a great time. The bar had a good selection, including some delicious 101 North stout, with a nice clean Radeberger lager to clean it off. The play itself was an odd duck, just ten lines, translated from the original German of Heiner Müller. They repeated it eight times with different troupes, so I think I have it down from memory:

#1 - Can I lay my heart at your feet?

#2 - Only if you don't dirty my floor.

#1 - My heart is clean.

#2 - That remains to be seen.

#1 - I can't get it out.

#2 - Would you like me to help?

#1 - Only if it's not too much trouble.

#2 - It will be my pleasure... I can't get it out either. I'll operate! What else is my pocket knife for? Persevere, don't despair. We have it. It's a brick. Your heart is a brick.

#1 - But it only throbs for you.

Everyone had a good time interpreting it with a five or ten minute performance each. Here's what I remember from each of them:

A cleaning lady and a client, with #1 silent and unwrapping the words from her body, and scrawling the final line on the window of the pub as she left. There was some strong physical comedy, it had obviously been well thought out.

A tango-dancing pair of mute ghosts, who grabbed two audience members and gave them cue cards with lines to read as they snapped their fingers. The mimed direction aimed at the hapless audience members was a highlight, especially when they forced one of them to re-read his line with 'more balls'.

One pair of actors who set out to direct two volunteers from the audience. They went through a whole dress-rehearsal, notes, and preview performance with the couple, and both the volunteers threw themselves into the roles.

A country singer versus an opera singer, up against a lady with some swish moves, all doing their lines in rounds, gradually moving to a climax on the final line. The sheer power of the voices together was remarkable.

A boy band version, played by two girls in skater pants doing synchronized dance moves and lip-syncing to a cheesy track in-between their lines. It was surprisingly moving, with some great acting.

A strange accordion player up against a burlesque lady with a power drill and a Latin lover. The instrument came in handy; it became part of the conversation.

Three girls who had a well-rounded dance routine, and some film-score music to run the lines against. The coordination and thoughtfulness of the numbers they performed was impressive.

A pre-recorded silent movie version with some impressive acrobatics, recorded in the bar during the day when it was empty. They did a good job with a screen that they set up to look like a thought bubble from one of the actors who was slumped on the bar.

I was never bored! Despite a mime density that would normally have me running screaming from a venue, the evening was very entertaining. Everyone in the audience was very engaged, though I forgot my first rule and ended up in the splash zone in my position at the bar. The players did a great job working in a very limited space, and did some very imaginative work with the strange shapes they had to play with. Nobody was taking themselves too seriously but the actors had obviously put some serious effort in, and it paid off in some touching performances of a very difficult work. I'm looking forward to making it there again for The Taming of the Shrew next month. It will be challenging since misogyny's the backbone of the plot, but I'm confident they can pull it off intelligently!

February 19, 2013

Things that happen to startup founders

Photo by USACE Europe District

You get into a lot of debt living off an anaemic salary and go bankrupt.

Your spouse breaks up with you.

You are fired from your own company by an outside CEO.

Your company is acquired, but at a level where only the investors get any money back.

Talking with an old startup colleague, I realized how many of our circle had been through rough times. All of these have happened to our close friends. In an abstract sense we always knew the odds, but it's different when you see it playing out in real time with people you care about. Startup failure is a lot easier to deal with as an abstract possibility than a looming likelihood. Until you've been through it, you can't imagine how messy and prolonged the death of a startup can be. The final end of my Mailana was mercifully quick, but the months leading up to it were one of the toughest times of my life.

The Valley doesn't like to talk about failure and pain, but you can't understand how precious the successes are without knowing how hard it can be. I feel extremely lucky and glad to be here at Jetpac now, but some of my friends genuinely regret the time they sank into their startups. Not all entrepreneurial careers have happy endings, and there's a difference between knowing that, and living it.

February 15, 2013
