Wired’s Jon Stokes has a really interesting post on HP’s newly unveiled data server strategy to address two different big data challenges.
As Stokes describes (crediting a presentation by Twitter’s Nathan Marz), there are “fast” big data and “slow” big data problems. For the “fast” problems, you apply a set of pre-developed algorithms and tools to the incoming datastream, looking for events that match certain patterns so that your platform can react in real time. Sometimes, though, you need to ask questions of the data and then analyze the results, which can’t be done effectively in real time. These are the “slow” problems, or as Stokes puts it, where you gather information and test hypotheses by running queries against a vast backlog of historical data.
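To make the distinction concrete, here is a minimal Python sketch of the two paths. The event fields, the threshold, and the function names are illustrative assumptions, not anything from Stokes’s post or Marz’s presentation:

```python
from statistics import mean

# --- "Fast" big data: react to each incoming event as it arrives ---
def handle_event(event, alert):
    """Apply a pre-built rule to a single event in real time."""
    # Illustrative rule: flag unusually large purchases immediately.
    if event["type"] == "purchase" and event["amount"] > 10_000:
        alert(f"Large purchase from user {event['user_id']}")

# --- "Slow" big data: ask a question of the historical backlog ---
def average_purchase_by_user(history):
    """Scan stored events to test a hypothesis after the fact."""
    totals = {}
    for event in history:
        if event["type"] == "purchase":
            totals.setdefault(event["user_id"], []).append(event["amount"])
    return {user: mean(amounts) for user, amounts in totals.items()}

# Example usage:
history = [
    {"type": "purchase", "user_id": 1, "amount": 120.0},
    {"type": "purchase", "user_id": 1, "amount": 15_000.0},
]
handle_event(history[-1], alert=print)    # fast path: fires as the event arrives
print(average_purchase_by_user(history))  # slow path: runs over the stored backlog
```

The point is the shape of the two workloads, not the specific rule: the fast path touches each event once and must answer now; the slow path is free to walk the whole history.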
It turns out that the natural evolution of analytics is to go from “slow” problems to “fast” problems, turning the inquisitive, analysis-heavy understanding of the data into faster number-crunching analytics. Knowing the right way to generate these “fast” analytics requires a solid analytics engineering discipline, especially as the questions being answered get harder and harder.
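One hypothetical way to picture that evolution, continuing the sketch above: a threshold discovered through slow, exploratory analysis gets baked into a rule the streaming side applies to every new event. None of these names or numbers come from the article; they are only an illustration of how a slow analysis can be promoted into a fast one:

```python
from statistics import mean, stdev

def learn_threshold(history, sigmas=3):
    """Slow path: derive an anomaly threshold from the historical backlog."""
    amounts = [e["amount"] for e in history if e["type"] == "purchase"]
    return mean(amounts) + sigmas * stdev(amounts)

def make_fast_rule(threshold):
    """Fast path: return a per-event check that uses the learned threshold."""
    def check(event):
        return event["type"] == "purchase" and event["amount"] > threshold
    return check

# Offline analysis produces the rule; the streaming platform then applies it:
# rule = make_fast_rule(learn_threshold(history))
# if rule(incoming_event): react_in_real_time(incoming_event)
```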
Big data solutions are currently focused on speed and data management platforms, but there is still a need for understanding the science behind developing the right analytics.