Our Collective 500 Terabyte-a-Day Facebook Addiction

The amount of data Facebook collects from its nearly one billion users is astounding.

The highlights from GigaOM's report that Facebook is collecting your data, 500 terabytes a day:

  • 2.5 billion content items shared per day (status updates + wall posts + photos + videos + comments)
  • 2.7 billion Likes per day
  • 300 million photos uploaded per day
  • 100+ petabytes of disk space in one of Facebook’s largest Hadoop (HDFS) clusters
  • 105 terabytes of data scanned via Hive, Facebook’s Hadoop query language, every 30 minutes
  • 70,000 queries executed on these databases per day
  • 500+ terabytes of new data ingested into the databases every day
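To put those daily totals on a more human scale, here is a rough back-of-the-envelope sketch that converts them into per-second and per-user rates. It assumes decimal units (1 TB = 1,000 GB = 1,000,000 MB), treats the "500+" ingest figure as exactly 500, and spreads activity evenly across the day and across the 950 million users, so take the results as ballpark numbers only. The variable and function names are made up for this illustration.

```python
# Back-of-the-envelope rates from the figures quoted in the list above.
# Assumptions: decimal units, activity spread evenly over 24 hours.

SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
HALF_HOURS_PER_DAY = 48

content_items_per_day = 2.5e9    # shared content items
likes_per_day = 2.7e9
photos_per_day = 300e6
ingest_tb_per_day = 500          # "500+" in the source, so a lower bound
hive_tb_per_half_hour = 105      # data scanned via Hive every 30 minutes
users = 950e6                    # user count quoted in the post


def per_second(daily_total):
    """Average per-second rate for a per-day total."""
    return daily_total / SECONDS_PER_DAY


likes_per_sec = per_second(likes_per_day)
content_items_per_sec = per_second(content_items_per_day)
photos_per_sec = per_second(photos_per_day)

# 1 TB = 1,000 GB under the decimal-unit assumption.
ingest_gb_per_sec = per_second(ingest_tb_per_day * 1_000)

# Total data scanned by Hive over a full day, if the 30-minute rate holds.
hive_tb_per_day = hive_tb_per_half_hour * HALF_HOURS_PER_DAY

# New data per user per day, in megabytes (1 TB = 1,000,000 MB).
ingest_mb_per_user = ingest_tb_per_day * 1_000_000 / users

print(f"Likes per second:          {likes_per_sec:,.0f}")
print(f"Content items per second:  {content_items_per_sec:,.0f}")
print(f"Photos per second:         {photos_per_sec:,.0f}")
print(f"New data ingested:         {ingest_gb_per_sec:.2f} GB/s")
print(f"Hive data scanned per day: {hive_tb_per_day:,} TB")
print(f"New data per user per day: {ingest_mb_per_user:.2f} MB")
```

One thing the arithmetic makes visible: if the 30-minute Hive figure holds around the clock, the data scanned in a day is roughly ten times the data newly ingested in a day.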

Facebook is a single location for 950 million users, and it’s a public company. So, this type of data can be tracked, providing perspectives that have never been achieved for the Internet at large.

Interesting stuff, and scary, given that one company controls it all.

About Jake

a.k.a.: jkuramot

3 comments

  1. The scarier part is that in the not too distant future, someone will comment, “One of Facebook’s clusters ONLY held 100 petabytes of data?” Perhaps the commenter’s phone/communication device won’t have 100 petabytes of storage capacity, but that will happen some day.

  2. Storage capacity is one of those mind-blowing topics. I remember how stoked I was to get a 1GB hard drive back in 1997. My TV probably rocks a bigger disk now, plus it’s solid state. Mind = blown.
