PeteSearch

Tips and tools for better searching

I'm thankful for Ed Pulaski

Pulaski
Photo from the Santa Monica Mountains Trails Council

For years one of my favorite trail work tools has been the Pulaski ax, similar to a small pick-ax with a blade on one side and a grubbing hoe on the other. It works great for digging out stubborn stumps and then chopping the roots so they can be pulled out, as you can see in the shot above.

I vaguely knew it was named after a fire-fighter who popularized it, but I never knew his amazing life-story until I read The Big Burn. It's one of the best books I've read in a long time, weaving the political story of the early days of the Forest Service with the personal nightmares and heroics of the biggest wildfire in American history.

Ed Pulaski was a grizzled woodsman in his 40s when he joined the new-born Forest Service as a Ranger in Idaho. In those early days Rangers were mostly young college boys from the East, so as a respected local who'd spent years outdoors as a prospector, rancher and railroad worker, Ed brought more credibility to the job than most.

In the summer of 1910, Teddy Roosevelt was out of office and without his support the Forest Service was being starved of funds. Since the year was unusually dry, there were spot fires throughout the Rockies, and not enough man-power to control them. It was so desperate that Pulaski, like many Rangers, ended up paying fire-fighters' wages out of his own savings.

On August 20th, hurricane-force winds whipped up the mass of spot fires into an immense burn, covering over 3 million acres, about the size of Connecticut. Ed Pulaski was leading a crew of 50 men trying to save the town of Wallace, and when it became clear the winds made the blaze unstoppable he tried to lead them out. The fire moved too fast, and they became cut off. Remembering a small mine nearby, he found the tunnel and led his crew inside as the forest was burning around them.

Standing at the entrance, he desperately tried to keep breathable air from being sucked out of the mine by hanging wet clothes as a barrier, and trying to extinguish the supports as they kept igniting. Finally out of cloth, and badly burned and blinded by the flames, he ordered his crew to get low to the ground to make the most of the breathable air. He had to threaten the panicked men with his pistol to get them to obey, and shortly afterward he fell unconscious to the floor.

After the fire had passed, the men stumbled to their feet. Five were dead of asphyxiation, and Ed was thought to be gone too until he was woken by the fresher air. Forty-five men survived thanks to his leadership, and they stumbled the five miles back to town on feet burned raw, in shoes with the soles melted off.

They found a third of Wallace burned, but most lives saved by evacuation. Other fire-fighting crews throughout the Rockies were not as lucky, with 28 killed in just one spot, and over 125 dead in total.

Ed never fully recovered from his injuries, but went back to work as a Ranger, with medical treatment paid for by donations from his colleagues since the Service refused to help. He fought for years to get a memorial built for the fire-fighters who lost their lives, and created the Pulaski ax as a better tool for future crews.

Ironically, the great fire was a public relations coup for the Forest Service. They used the heroic story of Ed Pulaski to push through increased funding, with the promise of a zero-tolerance approach to wild fires. This use of fire as a justification for largely unrelated policies set a terrible precedent that's come back to haunt us. Now most debates around how we should use our National Forests are fought by invoking the specter of wild fires on both sides. The lack of both regular smaller fires and logging has left us with a tinder box, and from my time with forest and fire professionals, there are no simple solutions. The only approach that makes commercial sense for loggers is clear-cutting easily accessible areas, and simply letting fires burn when there's so much fuel results in far more devastation than when they were smaller but more frequent. I'm in favor of letting the professionals figure out good management plans without too much political pressure to lean towards a pre-judged outcome. I'd imagine that would involve more selective logging, which wouldn't go down well with many environmentalists, but it's also obvious that's only going to address a small part of the problem.

Despite this political knot, I'm grateful that people back in 1905 put so much of America into National Forests. After growing up in a country where every square inch has been used and reused for thousands of years, I fell in love with the immense wildernesses over here. Even just a few miles from LA you can wander for hours in beautiful mountains without seeing another soul. I'm thankful we had dedicated Rangers like Ed Pulaski to preserve that for us.

November 26, 2009 | Permalink | Comments (0) | TrackBack (0)

Boosting Redis performance with Unix sockets

Electricsocket
Photo by SomeDriftwood

I've been searching for a way to speed up my analysis runs, and Redis's philosophy of keeping everything in RAM looks very promising for my uses. As I mentioned previously, I hit a lot of speed-bumps trying to get it working with PHP, but I finally got up and running and ran some tests. The results were good, faster than using the purely disk-based Tokyo Tyrant setup I have been relying on.

The only niggling issue was that I knew from Tokyo that Unix file (aka domain) sockets have a lot less overhead and help performance compared to TCP sockets, even using localhost. Since the interface is almost identical, I decided to dive in and spend a couple of hours patching my copy of Redis 1.02 to support file sockets. The files I've changed are available at http://web.mailana.com/labs/redis_diff.zip.

My initial results using the redis-benchmark app that ships with the code show a noticeable performance boost across the board, sometimes up to 2x. Since this is an artificial benchmark it's unlikely to be quite this dramatic in real-world situations, but with Tokyo 30%-50% increases in speed were common.

I hope these changes will get merged into newer versions of Redis. It's a comparatively small change that gives a big performance boost when the client and server are on the same machine.
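To make the TCP-versus-file-socket choice concrete, here is a minimal sketch of how a PHP client might pick a transport. It uses PHP's built-in stream_socket_client(); the function names and the /tmp/redis.sock path are hypothetical, and the Unix socket branch only makes sense against a server that has been patched to listen on a file socket as described above.

```php
<?php
// Build a stream address for either a Unix domain socket or a TCP endpoint.
// A target beginning with "/" is treated as a socket file path (an assumption
// of this sketch); anything else is treated as a TCP host name.
function redis_stream_address($target, $port = 6379) {
    if (strlen($target) > 0 && $target[0] === '/') {
        return 'unix://' . $target;
    }
    return 'tcp://' . $target . ':' . $port;
}

// Open the connection. Returns a stream resource, or false on failure.
function redis_open($target, $port = 6379, $timeout = 2.0) {
    $address = redis_stream_address($target, $port);
    return @stream_socket_client($address, $errno, $errstr, $timeout);
}

// Example calls (they need a running server, so they are left commented out):
// $conn = redis_open('/tmp/redis.sock');   // hypothetical socket path
// $conn = redis_open('127.0.0.1', 6379);   // TCP over localhost
```

Keeping the address building separate from the connect call means the rest of the client code doesn't need to know which transport is in use.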

November 24, 2009 | Permalink | Comments (0) | TrackBack (0)

Using Redis on PHP

Kitcarson
Photo by Bowbrick

There's a saying that pioneers tend to come back stuck full of arrows. After a couple of days trying to get Redis working with PHP, I know that feeling!

I'm successfully using Tokyo Tyrant/Cabinet for my key/value store, but I do find for a lot of my uses disk access is a major bottleneck on performance. I do lots of application-level RAM caching to work around this, but Redis's philosophy of keeping everything in main memory looked like a much lower-maintenance solution.

Getting started is simple. On OS X I was able to download the stable 1.02 source, run make, and then run the default redis-server executable. Its interface is through a TCP socket, so I then grabbed the native PHP module from the project's front page and started running some tests.

The first problem I hit was that the PHP interface silently failed whenever a value longer than 1024 characters was set. Looking into the source of the module, it was using fixed-length C arrays (they were even local stack variables with pointers returned from leaf functions!) and failing to check if the passed-in arguments were longer. This took me an hour or two to figure out in unfamiliar code, so I was a bit annoyed that there weren't more 'danger! untested!' signs around the module, though the README did state it was experimental.

Happily a couple of other developers had already run into this problem, and once I brought up the issue on the mailing list, Nicolas Favre-Félix and Nasreddine Bouafif made their fork of PHPRedis available with bug fixes for this and a lot of other issues.

The next day I downloaded and ran the updated version. This time I was able to get a lot further, but on my real data runs I was seeing intermittent empty strings being returned for keys which should have had values. This was tough to track down, and even when I uncovered the underlying cause it didn't make any sense. It happened seemingly at random, and I wasn't able to reproduce it in a simple test case. An email to the list didn't get any response, so the following day I heavily instrumented the PHP interface module to understand what was going wrong.

Finally I spotted a pattern. The command before the one that returned an empty string was always

SET 100
<1000 character-long value>

It turned out that some digit-counting code thought the number 1000 only had three digits, and truncated it to 100. The other 900 characters in the value remained in the buffer and were misinterpreted as a second command. That meant the real next command received a -ERR result. I coded up a fix and submitted a patch, and now it seems to be working at last.
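As an illustration of why the length field matters, here is a rough sketch of framing a SET command in the old style, where the byte count of the value is written as decimal text ahead of the value itself. This is my understanding of the Redis 1.x format rather than code from the module, and build_set_command(), count_digits() and the key name are all hypothetical.

```php
<?php
// Frame a SET command in the old-style bulk format: the byte length of the
// value is written as decimal text on the command line, then the raw value.
function build_set_command($key, $value) {
    // strlen() counts bytes, which is what the length field needs.
    $length = strlen($value);
    return 'SET ' . $key . ' ' . $length . "\r\n" . $value . "\r\n";
}

// Count the decimal digits of a non-negative integer without slipping up at
// powers of ten (1000 has four digits, not three).
function count_digits($n) {
    $digits = 1;
    while ($n >= 10) {
        $n = intdiv($n, 10);
        $digits++;
    }
    return $digits;
}

$command = build_set_command('mykey', str_repeat('x', 1000));
// Print only the command line, not the 1000-byte payload.
echo strtok($command, "\r\n"), "\n";
```

If the length text is cut to three characters, "1000" becomes "100", so the server consumes only the first 100 bytes as the value and tries to parse the rest as a new command, which matches the symptom described above.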

Hitting this many problems so quickly has certainly made me hesitate to move forward using Redis in PHP. It's definitely not a well-trodden path, and while the list was able to bring me a solution to my first problem, I was left to debug the second one on my own, and a question about Unix domain sockets versus TCP was left unanswered as well. If you are looking at Redis yourself in PHP, make sure you're mentally prepared for something pretty experimental, and don't count on much hand-holding from the developer community.

Of course, the same goes for almost any key/value store right now, it's the wild west out there compared to the stability of the SQL world. My next stop will be MongoDB to see if having a well-supported company behind the product improves the experience.

November 22, 2009 | Permalink | Comments (0) | TrackBack (0)

Nonsensical Infographics

Nonsensicalinfographic1
Nonsensical Infographic 1 by Chad Hagan

20x200 is an awesome concept, an online gallery selling limited editions of works by new artists starting at $20 each. While I strive to follow Tufte and make all my visualizations tell a clear story, I'm aware they sometimes turn out more pretty than functional, so I'm in love with Chad Hagan's 'Nonsensical Infographic' series on there. Now I just need to convert them into flash animations to make them even more beautiful and confusing.

Nonsensicalinfographic2 
Nonsensical Infographic 2 by Chad Hagan

November 18, 2009 | Permalink | Comments (0) | TrackBack (0)

Amazing scams are earning $1000+ CPMs for priceline.com

Scam
Photo by Toasty Ken

I'm a cheerful pessimist about human nature, I generally expect the worst but don't let it get me down. This report from the US Senate is beyond the pale though, it really makes me mad. It details how companies like Affinion, Webloyalty and Vertrue pay immense amounts of money (CPM rates of up to $2650!) to well-known firms like priceline.com and 1-800-flowers to get links inserted into their checkout process. These links look like discount offers, but clicking on them passes your credit card information to the scammers, and lets them set up a recurring monthly payment on your account without asking for permission, hoping you won't notice, at least for a while.

How much money is there to be made here? The report estimates just those three firms have earned over $1.4 billion so far! And how much of a scam is it? Vertrue estimates 98% of their call volume is cancellation requests, and Webloyalty admit that 90% of their members have no idea they're enrolled.

As an entrepreneur I know how much over-regulation can hurt startups and economic growth, but scams like these drive people down that path. As an industry we need to have enough sense to avoid crazily short-sighted schemes like these if we want to have a long-term future. All three companies are owned by big-name private equity firms, and big-name websites are hosting their ads. Everyone involved should be ashamed of themselves, and nobody else should touch them with a barge pole. Sadly, $1.4 billion is very persuasive...

November 18, 2009 | Permalink | Comments (0) | TrackBack (0)

Get new users for $2 each with Facebook Ads

Facebook
Photo by Intermayer

This is one of those posts I hesitate about writing, because it's tempting to hoard an advantage like this, but sharing always seems to benefit me more in the long run. I'm able to get new users for my (in-testing, very unfinished) Facebook app for as little as $2 each using Facebook Ads. Here's my most successful ad so far:

Facebookad 

It's short and simple, and around 0.07% of the people who see it click on it. Naively I started off expecting click rates of around 1%, but since then I've talked to people with more experience in the ad world, and for anything outside of search ads mine is actually pretty respectable. It's also cheap - I set my cost-per-click bid to 50 cents, but actually ended up paying 37 cents each.

This is only the start of my funnel, the landing page is the install dialog for my Facebook app. The only thing I have control over there is the app description and logo, and currently only 30% of the visitors click accept to install it. That means my cost per installation is around $1 per user.

After they've made it through that screen, they're finally on a page I control. Here I ask them to give me their email address, accept extended permissions and authorize me to access their Twitter account. I lose between 50% and 70% of users there, bumping my final cost per true user to between $2 and $3 each.
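The funnel arithmetic can be sketched in a few lines of PHP. The function name is hypothetical, and the rates plugged in are just the figures quoted in this post: each step divides the running cost by the fraction of people who make it through.

```php
<?php
// Cost to acquire one user who survives every step of a funnel.
// $cost_per_click: what each ad click costs, in dollars.
// $step_rates: fraction of people who continue past each later step.
function cost_per_converted_user($cost_per_click, array $step_rates) {
    $cost = $cost_per_click;
    foreach ($step_rates as $rate) {
        if ($rate <= 0 || $rate > 1) {
            throw new InvalidArgumentException('Each rate must be in (0, 1]');
        }
        // If only a fraction continue, each survivor costs 1/rate as much.
        $cost = $cost / $rate;
    }
    return $cost;
}

$per_install = cost_per_converted_user(0.37, array(0.30));
$best_case   = cost_per_converted_user(0.37, array(0.30, 0.50));
$worst_case  = cost_per_converted_user(0.37, array(0.30, 0.30));
printf('install: $%.2f, user: $%.2f to $%.2f' . "\n", $per_install, $best_case, $worst_case);
```

Carrying the unrounded 37 cents through gives roughly $1.23 per install and about $2.50 to $4 per user. Starting from the rounded $1 per install instead gives the $2 to $3 range quoted above.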

So what are the secrets to achieving similar results?

Land in Facebook. I have a massive advantage in that I've moved my service over to run as a Facebook app. It's low friction for users when they're staying within the same site, I doubt you could achieve the same CPC for external pages. I'm now a hostage to Facebook's whims of course, but for me the gain in user trust far outweighs the risks.

Start small. I'm still paying only $15 a day spread over several campaigns, that gives me enough data to tell what's working and refine my ads and landing pages before I ramp up to collect larger numbers of users. It's also a great way of flushing out bugs and scaling issues while annoying a relatively small number of users.

Test, test, test! I'm terrible at writing ad copy, really, really bad, and my first versions had awful click-through rates around 0.01%. I was able to use Facebook's statistics panel to tell which ads were the least-worst, and spot the patterns. In my case the shorter ones worked much better, as did the ones that focused on a single feature, which is how I ended up with the one above. I'm also constantly trying new versions of the landing page and sign-up flow to measure how I can improve the rest of the funnel.

Foreigners are cheap. There must be a lot less competition for UK and commonwealth Facebook views, because I'm able to get CPCs of 37 cents if I specify the English-speaking non-US countries in my targeting, versus around 60 cents each in the US. If you're in the testing phase, I would expect you could get representative data for almost half the price if you use non-Americans as guinea pigs.

I still think there's a lot of room for improvement in my funnel, so I'm hopeful I can keep driving the cost down, even if the ad market overall becomes more expensive with competition. I'm also not doing much with the targeting possibilities beyond picking countries, I think localized ads could get a strong response, and I need to run a census of my users to understand what demographics the service appeals to most and target them.

November 13, 2009 | Permalink | Comments (0) | TrackBack (0)

Eat mistakes, not jobs

Pacmantable
Photo by Garretc

Andy Kessler's talk yesterday got me thinking hard about why I found his argument so unconvincing. He focused on how innovation will destroy jobs, the way container ships put all the stevedores out of work. I think he's missing a completely different outcome of innovation, and one that excites me a lot more.

Stevedores were performing a process that achieved the results we were after, they weren't dropping half the boxes into the ocean as they unloaded, so containerization just made the process more efficient.

Where Andy went off the rails is when he applied that model to worlds like education. We are really, really bad at teaching our kids, enormous numbers of them don't even make it through high school. It's as if we're losing half the cargo every time we unload a ship. Innovation in education gives us the chance to achieve better results with the existing resources, giving our teachers tools so they leave fewer kids behind. It's about effectiveness not efficiency, because we're falling so far short of our goals right now.

Would we expect a school district that increased its students' overall GPA to then fire some teachers to save money and return to the old GPA, since we lived with that before? Of course not, we'd celebrate the achievement and try to replicate it elsewhere.

What really excites me about technology innovation is that we can help people do important things that weren't possible before. MIT's OpenCourseWare is an awesome example: the lectures and materials work as an accelerator and multiplier to traditional learning methods, helping students all over the world get better results. There's no wave of professors being fired. If anything it's taking away the pressure to give mundane and routine introductory lectures, so they can focus on the value-added personal teaching instead.

I love increasing productivity because it lets people do tricky jobs much better, which I find a lot more satisfying than automating people out of a job. I'm much happier preventing screw-ups than eating people!

November 12, 2009 | Permalink | Comments (0) | TrackBack (0)

Andy Kessler's keynote at Defrag stunk

Andy Kessler

Andy Kessler just gave the opening keynote speech at Defrag '09, and I really hated it. The title was Be Soylent, Eat People, and since I'm fascinated by the topic of productivity and job replacement I was looking forward to a thoughtful analysis of a complex topic. Instead it felt like a rant by an undergraduate who'd just read Atlas Shrugged for the first time. He laid out a taxonomy of 'unproductive' jobs, which he generally classified as servers as opposed to creators, and then split those servers into 'sloppers', 'sponges', 'slimers' and 'thieves'.

What gobsmacked me was his seeming contention that basically anyone who wasn't a programmer was a parasite. He mentioned a lot of jobs that should be largely automated, from the uncontentious idea of stevedores being replaced by container ships, to the eyebrow-raising example of librarians and finally to the gob-smacking idea that teachers are on the way out!

He seemed to be taking an uncontroversial idea, that there are buggy-whip making jobs that will be replaced by new processes, and taking it to ridiculous and offensive extremes. He used doctors as an example of a 'sponge' profession where artificial barriers to entry kept the incumbents charging high fees and gouging their customers. I'm extremely sympathetic to Adam Smith's quote 'People of the same trade seldom meet together, even for merriment and diversion, but the conversation ends in a conspiracy against the public, or in some contrivance to raise prices', but we tried unregulated doctors for most of the nineteenth century here in the US, and it didn't work so well.

All of Andy's ideas are controversial extrapolations of accepted ideas, but he gave no evidence that any of his assertions actually hold. All it did was annoy me without offering any enlightenment, I'd love to engage with his ideas but there was nothing to hang a debate on, just pure opinion.

November 11, 2009 | Permalink | Comments (0) | TrackBack (0)

Never trust a hippy

Neilyoungones

This is a tricky post to write, because some of my best friends are hippies, I've been accused of being a hippy myself and I live in Boulder, but after reading this article about a first-time entrepreneur's messy breakup with a business partner I couldn't resist.

When it comes to business, pay no attention to what a potential partner says. Judge them on what they do. This is especially important if they're charismatic and overtly spiritual because what they say will be both flattering and very appealing, you'll be tempted to bend over backwards for them. I'm speaking from painful personal experience; my two worst business outcomes were situations where I really liked a partner and stopped thinking critically about what they were offering.

I followed a charismatic hippy manager into his new startup for no equity and worked like a dog for a year. He replaced my friends (he'd needed all our resumes to get the initial contract) with cheap college interns, compressed the schedule and played a lot of other nasty tricks until I finally snapped when a colleague was reprimanded for being late on a Sunday. I'd spent many evenings with the guy and his wife and kids before 'our' startup launched, I really liked him, and he'd painted a beautiful vision of a family-friendly workplace with a great culture. My mistake was that I'd failed to push for any tangible evidence he was serious about his promises. Trust but verify.

The sad thing is, I don't think he was faking the beliefs that he kept talking about, but he was able to use them to convince himself that they justified whatever the most convenient thing for himself was. During the nightmare he often invoked providing for his family as a reason to cut salaries and hoard the benefits of success, which sounds great until you saw it meant a second home for him while employees struggled to afford healthcare for their kids.

Since then I've been much more comfortable with 'coin-operated machines', as a former partner described himself. I find someone who's up-front and honest about their motivations is a lot easier to deal with than anyone who claims they're acting in your best interests.

On Hacker News, a commenter pointed out that Steve Jobs is a hippy, which is true, but I don't think it's possible to find someone who's more blunt and straightforward in his reactions than Steve! All I want is honesty and trust, and I find that's a lot easier to achieve with someone who's unafraid to admit selfish behavior than anyone who's worried about preserving a virtuous self-image.

This is one of the hardest posts I've had to write. I'm admitting a strong prejudice based on a small sample size, and I got a lot of flak when I posted my original comment on HN. In the spirit of openness I'm trying to be honest about what my biases are and how I got to them, even if they aren't particularly flattering. I look forward to the comments!

November 10, 2009 | Permalink | Comments (0) | TrackBack (0)

Getting Tokyo Tyrant to work with files larger than 2GB

Godzillavskitten
Photo by Gen Kanai

I use Tokyo Tyrant/Cabinet as the key-value database for Mailana, and after some initial hiccups I've been very happy with its performance. Last night though it stopped working in the middle of preparing several hundred nightly emails, and I wanted to document the problem and the fix to help anyone else who hits this.

After a bit of investigation, I noticed that the Tyrant server kept dying with "File size limit exceeded". My casket.tch hash database file had grown to 2GB, and running on a 32 bit EC2 server Tokyo couldn't cope with anything larger. There's a standard called Large File Support on Linux that allows you to access >2GB files, but it requires a few things to work:

- A modern version of Linux. I'm on 2.6, so it has support for LFS built in.

- A modern file system that supports large files. I'm on XFS, so that was also ok.

- You need to recompile your program to use the 64 bit versions of file operations. Happily Tokyo was using the correct off_t type for file offsets, rather than int, so I was able to add the -D_FILE_OFFSET_BITS=64 compile flag to the configure script in both Cabinet and Tyrant and rebuild them both. They then ran with 64 bit file offsets on a 32 bit system.
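For reference, the 2GB ceiling comes from signed 32 bit file offsets, which top out at 2^31 - 1 bytes. A nightly job could watch for a database file approaching that size with something like the sketch below. The constant and function names are hypothetical, and the size is passed in as a number because PHP's own filesize() can misreport very large files on 32 bit builds.

```php
<?php
// Largest byte offset a signed 32-bit integer can hold: 2^31 - 1.
define('MAX_SIGNED_32BIT_OFFSET', 2147483647);

// True if a file of $size_bytes cannot be addressed with 32-bit offsets.
function needs_large_file_support($size_bytes) {
    return $size_bytes > MAX_SIGNED_32BIT_OFFSET;
}

// How many bytes are left before the 32-bit limit is reached (never negative).
function bytes_until_32bit_limit($size_bytes) {
    return max(0, MAX_SIGNED_32BIT_OFFSET - $size_bytes);
}
```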

There was one other quirk I discovered. By default Tokyo only uses a 32 bit index for the hash database, so you also need to pass in the l option at runtime to cope with the larger files, eg:

/usr/local/bin/ttserver -host /sqlvol/tokyo.sock -port 0 -le /sqlvol/casket.tch#opts=l

After doing those changes, I was able to restart my server and run the daily email updates again. The meta-data for my database seemed to have been corrupted by the issue, but all my data integrity checks passed, so I patched around the problem. Specifically in tchdb.c:tchdbopenimpl() the file size returned from fstat() didn't match the one stored in the meta-data header, so I skipped the check:

sbuf.st_size < hdb->fsiz

October 31, 2009 | Permalink | Comments (0) | TrackBack (0)

Plug and Play Tech Center spam

I don't usually post spam, but for anyone out there who gets an email like this and googles it, no, I don't think it's that dream investor you've been waiting for. The fact they can't even figure out my first name is a strong sign, and I'm not the only one getting these.

From: Nickolas Turner <nturner@plugandplaytechcenter.com>

Subject: Funding Opportunity through Plug and Play Tech Center

Dear Mailana,

Are you looking for funding? Please contact Alireza@plugandplaytechcenter.com to get in touch with our seed and early stage venture arm, as well as our partners.

Best of luck in your ventures.

Regards,

Nick Turner

Business Relationship Associate

Plug and Play Tech Center

(650) 207-7001

October 29, 2009 | Permalink | Comments (0) | TrackBack (0)

Hate your bank? Use a credit union

Bankteller
Photo by Ronn Ashore

I spent several years suffering with grotty customer service at Citibank, and then I was hit by a check fraud that spiraled into a Kafkaesque nightmare. A house-mate snuck into my room, stole a check, forged my signature (poorly) and then cashed it for $1000. Einstein that he was, he'd had to write his driver's license and social security number on the back, which showed up when I got the photocopy back. Not wanting to tip him off, the other house-mates and I contacted the police, who were very helpful and interested. Now all we needed was the location where the check was cashed, which didn't show up on the statement.

After 3 months of both me and the police constantly calling and visiting Citibank, they refused to provide us with any details. I was constantly fobbed off with bogus excuses, since the case was allegedly in the hands of their fraud department who must live on an island somewhere in the south Atlantic with no means of communicating with the outside world, since I was never able to get a phone number or address to contact them. I finally received a refund after blowing my top at the local branch, and then promptly closed my account, threw the house-mate's possessions out on the front lawn and sent a copy of the forged check to his parents.

I was reminded of that when I saw this article on someone being hit with a $888,888.88 bank charge, with no explanation or help from the bank staff. It sounds like exactly the same sort of organizational failure that stymied my efforts to get help. From what I can see, the big banks have spent the last decade trying to build automated systems and procedures so they can get rid of expensive staff. That mostly works for routine operations, but as soon as something unusual happens you need somebody with judgement and authority to make decisions.

So what's the answer? I moved my account to a credit union eight years ago and I've been incredibly happy with them ever since:

- The customer service has been fantastic. They have trained, motivated bank staff able and willing to sort out problems for me, both in the branch and on the phone.

- I pay zero ATM fees, even when I'm traveling, since I can use any other credit union's machine for free.

- They don't gouge me with any other fees either. The big banks make nearly 40% of their revenue from 'non-interest income', and the bigger they are, the more they rely on those fees. Even worse, the 20% of households that pay the most in overdraft fees (i.e. the poorest) pay 80% of the total, averaging around $1300 each annually.

- I also get a warm glow inside because my deposits are funding straight-forward loans to local people and businesses, not financial speculation or empire building by bank CEOs. I'd rather be helping George Bailey than Gordon Gekko.

My personal account is with Keypoint Credit Union, and my business one with Lockheed, and they've both been stellar. If you're sold on the idea, there's almost certainly one that you can join, either because of where you live or the industry you work in. If you're currently with a large bank, you won't regret switching.

October 28, 2009 | Permalink | Comments (0) | TrackBack (0)

Super-simple A/B testing in PHP

Alphabeta
Photo by Roadside Pictures

To really learn about what your users want you need to see how they respond to the different alternatives. Running A/B tests is a great way to do this, but even though the concept is simple, I always felt like it would require some complex coding and database setup to implement. I was wrong: inspired by Eric Ries's tips from a recent workshop I've been getting a lot of valuable feedback using just a 32-line PHP module and plain old logging to a file.

To use it yourself, all you need to do is think up a name for your test, and surround your alternatives with an if (should_ab('yourtestname', $userid)). That's it. I've deliberately made it so there's zero configuration, you can just pick an arbitrary test name, to encourage myself to test early and often. It's best if you have a proper user id to supply to the test function, but if you omit it, the client IP address will be used instead.

Now when your users load up a page they should see one version or another based on who they are, but how do you gather the information about which one worked? I'm logging all my user events to a file on the server using my custom_log() function, so whenever a user views a page I want to store what options they viewed it with. To do that, the only other function in the module returns an array containing what A/B choices were made for the current page. With that appended as a JSON string to each log entry, I can run analytics on the user's subsequent behavior, to tell which version of a front page led to the most conversions for example. The only tricky part of this approach is that you need to make sure you're logging the event at the end of the page, after all the choices have been made.
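Here is a rough sketch of the kind of analytics pass this makes possible. The log layout is an assumption made purely for illustration (tab-separated time, IP, event name and the JSON-encoded choices), not the real custom_log() format, and tally_conversions() and the event names are hypothetical.

```php
<?php
// Tally views and conversions per A/B variant from log lines.
// ASSUMED line format (hypothetical, not the real custom_log() layout):
//   <time>\t<ip>\t<event>\t<json of A/B choices>
function tally_conversions(array $lines, $testname, $view_event, $conversion_event) {
    $tally = array(
        'on'  => array('views' => 0, 'conversions' => 0),
        'off' => array('views' => 0, 'conversions' => 0),
    );
    foreach ($lines as $line) {
        $fields = explode("\t", rtrim($line, "\r\n"));
        if (count($fields) < 4) {
            continue; // Skip malformed lines
        }
        $choices = json_decode($fields[3], true);
        if (!is_array($choices) || !isset($choices[$testname])) {
            continue; // This entry was not part of the test
        }
        // should_ab() returned true for 'on' users and false for 'off' users
        $variant = $choices[$testname] ? 'on' : 'off';
        if ($fields[2] === $view_event) {
            $tally[$variant]['views']++;
        } elseif ($fields[2] === $conversion_event) {
            $tally[$variant]['conversions']++;
        }
    }
    return $tally;
}
```

This counts raw events rather than unique users, which is usually enough for a first look at which variant is converting better.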

If you want to dive deeper, there are lots of strong frameworks out there for split-testing (I particularly like kissmetrics' approach), but even using something as brain-dead as my 32 line module will be a massive leap forward if you're a non-split-tester like I was.

[Update - Doh! I got the random generator wrong, it only returned true about 30% of the time using the md5 test. I've switched it over to crc32 below and in the file]

Download abtesting.php

<?php
// A module to let you do simple A/B split testing.
// By Pete Warden ( http://petewarden.typepad.com ) - freely reusable with no restrictions

// An array to keep track of the choices that have been made, so we can log them
$g_ab_choices = array();

function should_ab($testname, $userid=null) {
    // If no user identifier is supplied, fall back to the client IP address
    if (empty($userid))
        $userid = $_SERVER['REMOTE_ADDR'];
   
    global $g_ab_choices;
    if (isset($g_ab_choices[$testname]))
        return $g_ab_choices[$testname];
       
    // Hash test name plus user id; the low bit gives a stable, roughly even split
    $key = $testname.$userid;
    $keycrc = crc32($key);
    $result = (($keycrc&1)==1);
   
    $g_ab_choices[$testname] = $result;
   
    return $result;
}

function get_ab_choices()
{
    global $g_ab_choices;
    return $g_ab_choices;
}
?>
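Given the md5 slip in the update above, a quick sanity check on the split seems worth having. This is only a sketch: the split_fraction() helper and the 'user0', 'user1', ... ids are invented for the check, and real user ids may distribute differently.

```php
<?php
// Fraction of made-up user ids that land on the "true" side of a test
function split_fraction($testname, $usercount) {
    $truecount = 0;
    for ($i = 0; $i < $usercount; $i += 1) {
        $key = $testname.'user'.$i; // hypothetical ids: user0, user1, ...
        if ((crc32($key)&1)==1)
            $truecount += 1;
    }
    return $truecount / $usercount;
}

$fraction = split_fraction('frontpage', 10000);
printf("%.1f%% of users see the first alternative\n", $fraction * 100);
```

If the number printed is a long way from 50%, the two groups will be lopsided and one side will take much longer to gather enough data.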

October 27, 2009 | Permalink | Comments (0) | TrackBack (0)

How to log to custom files from PHP

Logcabin
Photo by Old Shoe Woman

I needed a function in PHP that worked like error_log(), but appended to a set of custom files rather than to the standard error_log. I wanted an easier way to organize the different types of information, so that important messages weren't buried in an avalanche of less-crucial warnings. This sort of thing is also great fodder for analytics if you write user events to their own file.

The result is custom_log(). It takes two arguments, a category name that determines which file to write to, and the message you want to log. The message gets written to that file, prefixed with the time and client IP. You can download the code as customlog.zip or it's included below:

<?php

// A module to write out events to a set of log files. Similar to error_log(),
// but with multiple output files.
//
// You'll need to set up a directory that the process running PHP (eg Apache) has
// permission to write to. You'll also need to keep an eye on the size of the log
// files, rotate out old ones once they get too large, etc.
//
// By Pete Warden ( http://petewarden.typepad.com ) - freely reusable with no restrictions

// Edit this to set it to the folder on your server where you want the logs to live
//define('CUSTOM_LOG_ROOT_DIRECTORY', '/private/var/log/apache2/'); // OS X default Apache log directory

define('CUSTOM_LOG_ROOT_DIRECTORY', '/var/log/httpd/'); // Red Hat Linux default Apache log directory

$g_custom_log_categories = array();
$g_custom_log_shutdown_registered = false;

// This function works like error_log(), but takes an extra category argument that
// determines which file the message is appended to.
function custom_log($category, $message)
{
    global $g_custom_log_categories;
   
    // If the file hasn't been opened for appending yet, create a new file handle
    if (!isset($g_custom_log_categories[$category]))
    {
        // Make sure there's no shenanigans with special characters like ../ that
        // could be abused to write outside of the specified directory
        $sanitizedcategory = preg_replace('/[^a-zA-Z0-9]/', '_', $category);
        $filename = CUSTOM_LOG_ROOT_DIRECTORY.$sanitizedcategory;
        $filehandle = fopen($filename, 'a');
        if (empty($filehandle))
        {
            error_log("Failed to open file '$filename' for appending");
            return;
        }

        // To close any open files once the script is done, and so ensure that
        // all the messages are written to disk, register a global shutdown
        // function that fclose()'s any open handles
        global $g_custom_log_shutdown_registered;
        if (!$g_custom_log_shutdown_registered)
        {
            register_shutdown_function('custom_log_on_shutdown');
            $g_custom_log_shutdown_registered = true;
        }
       
        // Urghh, this is required to prevent a spew of warnings when more recent
        // PHP versions are set to strict errors
        if (!ini_get('date.timezone'))
            date_default_timezone_set('UTC');
       
        $g_custom_log_categories[$category] = array('filehandle' => $filehandle);
    }

    // Create the full message and append it to the file
    $categoryinfo = $g_custom_log_categories[$category];   
    $filehandle = $categoryinfo['filehandle'];
   
    $timestring = date('D M j H:i:s Y');
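    // REMOTE_ADDR isn't set when PHP runs from the command line, so fall back to a placeholder
    $ipaddress = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '-';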
    $fullmessage = "[$timestring] [$category] [client $ipaddress] $message\n";
   
    fwrite($filehandle, $fullmessage);
}

// A clean-up function called to make sure all open file handles are closed
function custom_log_on_shutdown()
{
    global $g_custom_log_categories;
    foreach ($g_custom_log_categories as $category => $categoryinfo)
        fclose($categoryinfo['filehandle']);
}

?>
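To make the output concrete, here's a small sketch that mirrors two pieces of the module above without touching the disk: the filename rule and the line layout. The two helper functions are hypothetical stand-ins written for this example, and they use gmdate() with a fixed timestamp (rather than date() and the current time) purely so the result doesn't depend on the server's timezone.

```php
<?php
// Mirrors the filename rule in custom_log(): anything outside a-z, A-Z, 0-9 becomes '_'
function custom_log_filename($rootdirectory, $category) {
    $sanitizedcategory = preg_replace('/[^a-zA-Z0-9]/', '_', $category);
    return $rootdirectory.$sanitizedcategory;
}

// Mirrors the line layout in custom_log(), with the time and IP passed in
function custom_log_line($category, $message, $timestamp, $ipaddress) {
    $timestring = gmdate('D M j H:i:s Y', $timestamp);
    return "[$timestring] [$category] [client $ipaddress] $message\n";
}

// A normal category maps straight to a file in the log directory
echo custom_log_filename('/var/log/httpd/', 'userevents')."\n";
// An attempt to climb out of the directory is flattened into underscores
echo custom_log_filename('/var/log/httpd/', '../../etc/passwd')."\n";
// What a single entry looks like
echo custom_log_line('userevents', 'Viewed front page', 0, '127.0.0.1');
```

In the real module you'd just call custom_log('userevents', 'Viewed front page') and let it fill in the time and client address itself.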

October 25, 2009 | Permalink | Comments (0) | TrackBack (0)

Balsamiq: So simple, even a programmer can use it


Mock me mercilessly, I deserve it, but I've really been struggling to prototype on paper before I code. Back at Apple there was always a white-board handy and a bunch of colleagues and customer-surrogates I had to collaborate with on any feature, so I did plenty of documentation before doing any serious engineering. As a lone founder, it's seriously tempting to think I have a good enough picture in my head to just go ahead and try it out.

Wrong, wrong, wrong! For one thing I end up involving users way too late in the process, since it takes a whole bunch of coding effort before I can show them something. Even ignoring that, I've never thought things through as completely as I think I have. Just a few minutes trying to sketch out the result I'm trying to achieve will always show me something I'd missed, and that's a lot cheaper than spending hours of programming to get to the same conclusion.

One of my mental blocks to prototyping was that I couldn't find a method I felt comfortable with. I'd tried the Pencil Sketch Firefox plugin, but it just didn't work the way I wanted. OmniGraffle is fantastic for creating beautiful diagrams, but it's painful to build something that looks like a UI sketch out of its primitives. I'd fallen back to using pen and paper, but it's really hard to alter and evolve hard copy, and you have to scan it in to share it remotely. Finally I tried out Balsamiq last week, and I'm in love.

I could rhapsodize about its ease of use, but the single best feature is that it looks like a sketch. This visual metaphor is really important: it clearly marks the results out as conceptual designs, not detailed blueprints. This stops both other people and myself from nit-picking the look-and-feel, and forces a focus on the big questions about content and placement. I don't spend hours obsessing about aligning elements, because they naturally look a bit wonky, so I'm freed to think about what the overall content should be.

You can give it a try for yourself with the online version, and the full desktop product is $79, though I got it for $40 with a Techstars discount. If you're at all involved in product development, I think you'll end up buying it too.

October 21, 2009 | Permalink | Comments (0) | TrackBack (0)

Blogs I'm reading now

Booklist
Photo by MargoLove

Paul Jozefak just posted a list of the startup-related blogs he's reading, and that reminded me that I'd been intending to highlight some of my favorites too. I'm skipping the obvious ones (Brad, Fred, Eric Ries) to focus on lesser-known gems I'd love to see more widely read.

Bill Flagg

Bill's a Boulder entrepreneur with several great companies under his belt, but what really makes him stand out is that he's a boot-strapper. During TechStars he was a great counter-point to the focus on raising money, and he posts some awesome advice on building a company that actually generates cash. How about a billing department that encourages customers to mark down their invoices if they didn't feel like they got their money's worth? It's working for RegOnline.

Rick Segal

I love Rick's blog because of his willingness to risk offending people. I actually got fairly irate at a post he did last year, but I wouldn't have him any other way. What's even more interesting is that he's recently started on a journey from VC to startup founder, so there have been lots of great "Eat your own dogfood" posts, including a mea culpa on ever uttering the words 'lifestyle business' as a VC.

Highway 12 Ventures

Mark and George were very active in TechStars, but I never realized they blogged until Mark's stellar "Don't let the bastards grind you down". Since then I've been working through their archive, and it's chock-full of other great posts, even tips from a hostage negotiator!

Jay Parkhill

Talking of negotiations, Jay's latest post on telling who wants to actually do a deal and who's just there to argue is a must-read. He's a lawyer specializing in startups, so there's loads of other great advice like how to cope with the loss of co-founders without sinking the business.

October 19, 2009 | Permalink | Comments (0) | TrackBack (0)

Accidental Haiku




I've always been fascinated by haiku, and the launch of Drunken Haiku by a good friend gave me a brain wave. There's a massive number of updates on Twitter, so some of them must be unintentional haiku!

A couple of hours later, Accidental Haiku was born. It sits on Twitter looking for messages with the right syllable patterns. It's not always perfect at counting the sounds, and being Twitter there's lots of fluff, but if you watch it update I guarantee you'll see some gems.
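To give a flavor of the syllable-pattern idea, here's a rough sketch of one way to test a message for a 5-7-5 shape. It is not the code running on the site: the vowel-run heuristic and both function names are assumptions picked for illustration, and like any simple counter it gets plenty of English words wrong.

```php
<?php
// Rough syllable estimate: count runs of vowels, then discount a silent trailing 'e'
function estimate_syllables($word) {
    $word = strtolower(preg_replace('/[^a-zA-Z]/', '', $word));
    if ($word === '')
        return 0;
    $count = preg_match_all('/[aeiouy]+/', $word, $matches);
    if ($count > 1 && preg_match('/[^aeiouy]e$/', $word))
        $count -= 1;
    return max(1, $count);
}

// Returns the three lines if the words fall exactly into 5, 7 and 5 syllables, else null
function split_into_haiku($text) {
    $words = preg_split('/\s+/', trim($text));
    $targets = array(5, 7, 5);
    $lines = array(array(), array(), array());
    $lineindex = 0;
    $syllables = 0;
    foreach ($words as $word) {
        if ($lineindex > 2)
            return null; // words left over after the third line
        $lines[$lineindex][] = $word;
        $syllables += estimate_syllables($word);
        if ($syllables > $targets[$lineindex])
            return null; // a word straddles a line break
        if ($syllables == $targets[$lineindex]) {
            $lineindex += 1;
            $syllables = 0;
        }
    }
    if ($lineindex != 3)
        return null;
    return array_map(function ($line) { return implode(' ', $line); }, $lines);
}

$tweet = 'An old silent pond a frog jumps into the pond splash! Silence again';
$haiku = split_into_haiku($tweet);
if ($haiku !== null)
    echo implode("\n", $haiku)."\n";
```

Requiring the line breaks to fall between words is what throws out most ordinary messages, even ones that happen to total seventeen syllables.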

It's all pulled together from open source components and you can download the modified phirehose code here. Now if I could just learn to write good haiku myself...


October 15, 2009 | Permalink | Comments (0) | TrackBack (0)

Information wants to be free, even at WalMart

 Walmart
Photo by El Neato

I was reading The Wal-Mart Effect when I came across a passage that summed up exactly how I want to change the world. Sara Lee had a business relationship with Wal-Mart, and as one of the negotiators recalls:

Senior officials were always coming down there [to Bentonville] for meetings, and they always had their sheets of paper bent up so the Wal-Mart person couldn't see them. The idea was, why didn't we just put the sheets of paper on the table?

So they opened up traditionally closed information, and immediately discovered ways of saving money that benefited both companies. Wal-Mart had empty trucks returning from Florida that could transport Sara Lee's stock after it was shipped from South America. Underwear cartons were too large, so Wal-Mart wasted time and money splitting them to send the contents to different stores, and Sara Lee shrank the carton size. As the book puts it, all of these efforts eliminated pure waste, the equivalent of turning off a light in an empty room.

I spent years in a corporate environment where I saw hundreds of opportunities to save money and make the world better for everyone, if only people would talk and share information. I was surprised to see I had that in common with Wal-Mart, but it makes sense given their fanatical approach to efficiency. If you're really trying to be productive, it just doesn't pay to be secretive.

Are there downsides to this? One of the biggest hurdles is trust. Knowledge is power, so you're handing over power to people whose interests may not align with yours. Wal-Mart is the 800lb gorilla with a history of using its market power ruthlessly, and one of the strengths of the book is its detail on the negative side of their dominance. I'd argue that this trust argument is usually a cop-out, hiding worries about turf and control. In most cases it's clear that it's not in the other party's best interest to screw you over, and if it is, why are you dealing with them at all? The worst cases I saw were between departments within the same company; often we shared more information with competitors than with the guys down the hall.

Once you're in a business relationship, there's a lot to be gained by putting all the sheets of paper on the table.

October 08, 2009 | Permalink | Comments (0) | TrackBack (0)

A lovely little online icon editor


My design skills are non-existent, but I often need functional little buttons or badges. Using Photoshop for that sort of thing is like taking a sledgehammer to a nut, so I was extremely happy when I found iconfu.

It appears to be pure JavaScript, which is impressive just as a technical feat, but it's also an extremely usable and surprisingly full-featured tool for building tiny icons. It's got undo, nicely anti-aliased primitives and some handy filters. Even better, it's completely free for up to 16x16 images. It's no Photoshop so don't expect to see layers or freehand, but that's part of what I love about it. It takes me back to the paint programs I'd use in the late 80's, and the hours I spent clicking individual pixels to create a massive 320x240 demo background.

The only drawback is that Internet Explorer is not supported, but if you're doing any web design work I'm sure you'll have an alternative browser installed anyway.

October 02, 2009 | Permalink | Comments (0) | TrackBack (0)

Get visual bug reports with SnapABug

Ladybug
Photo by Hamed Saber

One of the most frustrating parts of trying to fix a customer's problem is trying to understand what on earth the problem is. I've spent enormous amounts of time bouncing emails back and forth, or talking on the phone, just to get enough information to start debugging. I've long been a fan of tools like CrossLoop that let you share screens with a remote user, but I'm really excited to see what my fellow Techstars alumni Timzon have come up with.

SnapABug is a small widget you can include in any web page, and it gives users a button they can press to take a screenshot and email it off to your support team along with some notes. Dead simple but incredibly useful! Jerome, Jerome and Tony have done an awesome job of identifying a great market for their technology; I can definitely see this appearing on a lot of sites and becoming a valuable product.

September 29, 2009 | Permalink | Comments (0) | TrackBack (0)
