Little Data

You’re probably familiar with the term big data, which tends to be difficult to define precisely, but always pertains to the analysis of huge data sets.

Over beers with my pal Rick (@turoczy) earlier this week, I used the term little data to describe the potential I see for the Internet of Things. After some discussion and a few days, I like the term and plan to keep using it.

Little data is, unsurprisingly, the opposite of big data: a relatively small data set, but one that is very focused and very personal. Little data is all about me and my environment, not just what I do actively, but what happens around me and how I react. The goal of little data is to passively make decisions for me and suggestions to me, based on what is known about me.

From “Hackers,” a picture of data.

Little data is a bit of a personal assistant, like Siri or Google Now, but more passive. Rather than waiting for me to ask for something or check something, little data just does things, e.g. my IoT examples:

Nest could adjust the temperature in real time based on data collected by the Netatmo Urban Weather Station, applying your usage history and thresholds to keep your environment optimally temperatured.

Sonos could play music based on the weather data from Netatmo, picking upbeat music on gray days, soothing tunes on very hot ones.

FitBit could alter the calorie figures for very hot days, factoring in the extra exertion.

The Withings Smart Baby Monitor, which measures temperature and humidity, could send alerts to Nest based on baby movement correlated with temperature changes, e.g. when the AC runs, the baby moves.
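To make the Sonos idea a bit more concrete, here is a minimal sketch of the kind of rule little data might apply. It does not call any real Sonos or Netatmo API; the function name, the mood labels, and the thresholds are all assumptions made up for illustration. Readings are optional, so the rule still returns something sensible when a device is missing.

```python
from typing import Optional


def pick_playlist_mood(
    cloud_cover_pct: Optional[float],
    outdoor_temp_c: Optional[float],
) -> str:
    """Return a hypothetical playlist mood from whatever weather readings exist."""
    # Thresholds below are illustrative guesses, not values from any product.
    if outdoor_temp_c is not None and outdoor_temp_c >= 32.0:
        return "soothing"  # very hot day
    if cloud_cover_pct is not None and cloud_cover_pct >= 80.0:
        return "upbeat"  # gray day
    return "default"  # no strong signal, or no weather station at all
```

Calling `pick_playlist_mood(None, None)` falls back to the default mood, which is the point: the rule works with what you have and gets more specific as inputs are added.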

With all the sensors on a smartphone, little data could help outside the home too. Say you’re running to catch the bus for an appointment. Little data knows the bus schedule and when your appointment is, and it can calculate how fast you’re running. So it could tell you to forget it, you’re not making it by bus, and call a cab instead; it could tell you to pick up the pace to make the bus; or it could tell you not to worry, the bus is running late.
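The bus scenario boils down to comparing the speed you need with the speed you have. Here is a rough sketch under some stated assumptions: the distance and current speed come from the phone, the time until departure already includes any reported delay, and `max_speed_mps` is a made-up personal sprint limit. None of these names come from a real transit or phone API.

```python
from dataclasses import dataclass


@dataclass
class BusSituation:
    distance_to_stop_m: float       # assumed to come from phone GPS
    current_speed_mps: float        # assumed to come from phone sensors
    seconds_until_departure: float  # scheduled departure plus any reported delay
    max_speed_mps: float = 5.0      # hypothetical personal sprint limit


def advise(s: BusSituation) -> str:
    """Return one of three suggestions for someone running to a bus stop."""
    if s.seconds_until_departure <= 0:
        return "call a cab"  # the bus has already left
    # Speed needed to cover the remaining distance before the bus leaves.
    required_mps = s.distance_to_stop_m / s.seconds_until_departure
    if required_mps <= s.current_speed_mps:
        return "no rush"  # current pace is enough
    if required_mps <= s.max_speed_mps:
        return "pick up the pace"  # reachable, but only by going faster
    return "call a cab"  # not reachable even at a sprint
```

A late bus simply shows up as a larger `seconds_until_departure`, which lowers the required speed and can flip the answer from "pick up the pace" to "no rush".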

Little data can help with work too: turning off distractions (e.g. logging out of IM, closing email, routing the phone to voicemail, putting the smartphone in airplane mode) when you need to concentrate, scouring the corporate intranet to find relevant information for your meetings, and suggesting people who could help with projects. It could surface only the key notifications and communications that really matter, saving the rest for later.

The beauty of little data is that it can work with what you have, and as you add devices or inputs, it gets better.

Of course, this is mostly a pipe dream right now, but I expect pieces of little data, or at least my vision of it, to arrive over the next few years. Siri and Google Now will continue to expand the role of the digital assistant. Internet-connected devices will proliferate, and many companies will open APIs to developers who want to hack. Smartphones will continue to get better.

It’s all a matter of gluing together services that matter.

Anyway, what do you think? Aside from the obvious privacy concerns, would you even find little data useful?

Find the comments.




  1. I think you inspired me subconsciously. So, thanks for that. Siri is a gimmick, as is Google Now, but they’ll quickly evolve, especially as they open ecosystems. Google Now will be a lot more useful with Project Glass, although the heads-up display could be problematic.

  2. If we’re going anti-big data by talking about little data, why not go anti-cloud and talk about local storage and ownership of the little data? Rather than having these calculations be made in Apple’s or Google’s server farms, what would happen if we personally controlled the little data and authorized what the apps could do with the little data?

  3. Wasn’t aiming for anti-big data, just using that terminology to frame an area that I think could use some focus.

    Cloud makes little data work, although there would have to be a local piece, given the amount of information to crunch. It might be interesting to try an algorithm running locally. I think a two-tiered approach would be best to allow for offline signal processing.
