
How to datamine Facebook (without getting sued)

Paul Anthony / October 24, 2010


Facebook, whilst hugely popular, remains relatively difficult to data mine without legal implications. It’s no surprise really. Firstly, there are the privacy issues, which Facebook have struggled with for some time. Secondly, they are protecting themselves from Google, who would simply prefer to crawl their data and use it for their own core search product. Facebook are all too aware that Google’s expertise in data mining makes them a leader in the search space, a market that Facebook are clearly moving in on. Their relationship with Microsoft / Bing makes this more interesting again, with Microsoft likely seeing Facebook as a one-way ticket to stealing market share.

There are, however, a couple of ways of getting information out of Facebook using existing services on the web today, without going to the lengths of crawling their data yourself. I’ve listed a few techniques here for finding information within Facebook that you wouldn’t ordinarily get through their existing search interface.

Advertising Platform

Facebook provide information on their demographics free of charge via their advertising platform. Yes, that’s right: you can easily find out how many people on Facebook have particular interests by following this short tutorial.

Step 1. Make sure you are logged into Facebook. At the bottom of every Facebook page you will see a standard set of links. Click on the ‘Advertising’ link.

Step 2. Click the ‘Create an Advert’ button.

Step 3. Create your fake advertisement, selecting a destination URL. You aren’t actually going to go ahead with payment for this, so it doesn’t really matter what you type in here at this stage. As long as it is a valid web address, anything should be fine.

Step 4. Change the audience parameters of your advert to reveal interesting data. For example, setting the location to Belfast shows that there are currently 112,960 people registered on Facebook listing Belfast as their primary home.

Step 5. Record this information regularly in a spreadsheet to reveal changes in your chosen audience segment. For example, you may be interested in how many people from Northern Ireland are married, divorced or single over a period of time. Provided you are disciplined enough, Facebook’s advertising statistics allow you to keep an eye on this granular demographic data.
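The record-keeping in Step 5 can be sketched in a few lines of Python if you would rather not maintain the spreadsheet by hand. This is only a rough illustration: the function names are my own, the figures are still typed in manually from the advert screen (nothing here talks to Facebook), and the second Belfast figure is a made-up placeholder rather than a real reading.

```python
import csv
import io
from datetime import date


def record_estimate(csv_file, day, segment, audience):
    """Append one manually read audience estimate as a CSV row."""
    csv.writer(csv_file).writerow([day.isoformat(), segment, audience])


def changes_by_segment(csv_file):
    """Return {segment: [(date, audience, change_since_previous), ...]}."""
    history = {}
    for day, segment, audience in csv.reader(csv_file):
        rows = history.setdefault(segment, [])
        count = int(audience)
        # The first reading for a segment has nothing to compare against.
        change = count - rows[-1][1] if rows else None
        rows.append((day, count, change))
    return history


# An in-memory file stands in for a real CSV or spreadsheet export on disk.
log = io.StringIO()
record_estimate(log, date(2010, 10, 24), "Belfast", 112960)  # figure from Step 4
record_estimate(log, date(2010, 11, 24), "Belfast", 115000)  # hypothetical follow-up
log.seek(0)

history = changes_by_segment(log)
for day, count, change in history["Belfast"]:
    print(day, count, change)
```

Swapping the `io.StringIO()` for `open("audience.csv", "a", newline="")` when recording, and for a read-mode handle when analysing, would keep the log between sessions.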

CheckFacebook is a great example of how this data can be collated and presented to an audience. With the wealth of information that Facebook currently hold on their users, this is certain to become more and more relevant as time passes.

Social Widgets

Facebook provide a number of social widgets for determining what is happening on your own, and indeed other people’s, websites. You can use these widgets to determine, for example, what URLs others are sharing and what content is working for other people. A number of third-party websites utilise the social widgets and parts of the Facebook API to understand the web that bit more.

ItsTrending, WeRSocial and LikeButton.Me all use these widgets in unison to provide insight into what your friends and other people are sharing on the network. They are another addition to your data mining toolkit for Facebook.


OpenBook

URL: http://youropenbook.org

Created with the main aim of exposing just how much of your personal data is available through Facebook’s API, OpenBook remains one of the best ways of searching across public updates. It also offers searching filtered by gender, giving access to both images and text from within Facebook by keyword.


Bing

URL: http://www.bing.com

Bing will undoubtedly continue to integrate further with Facebook, and we can expect to see a much improved people search filling the void in the weeks and months ahead. Just this month they announced further plans to integrate friend relationships and photos on particular search queries. The video below provides some background on this. I’ve previously used Bing to do sitewide searches for brands, as their RSS allows you a certain amount of control that Google doesn’t.

Simply try this search on Bing: site:facebook.com {query} to find the information that they are currently indexing. My guess is that it will become much more comprehensive than Google’s as they join forces going forward.
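If you want to run the same site: search for a list of brands or keywords, the search URL can be built programmatically. The sketch below only constructs the URL; it doesn't fetch or parse any results. The function name is my own, and the optional format=rss parameter reflects my understanding of how Bing has exposed results as RSS, so treat it as an assumption and check that it still works before relying on it.

```python
from urllib.parse import urlencode


def bing_site_search_url(query, site="facebook.com", rss=False):
    """Build a Bing search URL restricted to one site via the site: operator."""
    params = {"q": f"site:{site} {query}"}
    if rss:
        # Assumption: Bing returns the results as an RSS feed when this
        # parameter is present. Verify before depending on it.
        params["format"] = "rss"
    return "https://www.bing.com/search?" + urlencode(params)


for keyword in ["belfast marathon", "data mining"]:
    print(bing_site_search_url(keyword))
```

Pasting one of the printed URLs into a browser should give the same results as typing the site:facebook.com {query} search into Bing by hand.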

