Image by Yankee in Canada
One of my favorite parts of my own Facebook research has been discovering existing work in this area that I didn't know about. Here are some of the most interesting papers:
Inference of Profile Elements of Individuals Using Publicly Available Social Web Data
Using Rapleaf's massive store of publicly available social network data, Piotr Kozikowski wrote his master's thesis on inferring attributes like gender, location and age from other known information about a person.
Contains details on the EuroSys '09 academic data set containing both connections and interactions for
A paper on how geography influences social networks, using public friendship data for 30,000 users of a German social network.
Arvind's got a few notes about the LiveJournal, Twitter and Flickr data they're using. It sounds like Mislove has been willing to share LiveJournal network data with other academics in the past.
Cameron Marlow is the head of Facebook's data mining team, and covers their internal research on his blog.
Finally, although it's in a different area, one of the scariest datasets I've run across is the Enron collection of 500,000 emails, released as part of the investigation into the company. I used it heavily while developing my email services, but I'm still amazed it's out there!