You've got the tools and the power of the cloud to capture big data, but figuring out what you want from it and how to extract it is the final, crucial challenge.
Advances in data networks and storage mean organizations capture far more data than they ever have - perhaps a stream of measurements from manufacturing equipment, from vehicles, or from game-changers like web-enabled refrigerators (no, I’ve never seen one either).
The enterprise CTO may have the data storage part all figured out - their MongoDB cloud database is in place, or they rent DBaaS from Cloudant.
But why? What does an enterprise do with all this unstructured data?
The first thing is to identify what the enterprise wants.
Analytics can be an area of blind faith – if the enterprise is not
clear about its big data needs, it may just hope that something good pops
out.
Identify the big data needs.
Big data analytics, like all IT, is subordinate to business needs. An
organization must figure out its requirements before working on big data.
No two organizations are the same, so needs always vary. The IT
department may receive requirements like these:
- Crunch data for instant reports.
- Decode telemetry on the fly.
- Find a needle in a haystack in a vast quantity of signals.
- Find the regular operational patterns in a vast quantity of signals.
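The last two requirements are two sides of the same job: work out what normal looks like, then flag whatever does not fit. Here is a minimal sketch in Python, assuming the signals arrive as a plain list of numbers. The function name `split_signal`, the sample readings and the 3.5 cut-off are illustrative assumptions, not anything a particular product provides.

```python
from statistics import median


def split_signal(readings, threshold=3.5):
    """Return (typical_level, outliers) for a list of numeric readings.

    Hypothetical helper: uses the median as the "regular pattern" and a
    modified z-score (based on median absolute deviation) to find needles.
    """
    typical = median(readings)
    # Median absolute deviation: a measure of spread that outliers barely move.
    mad = median(abs(r - typical) for r in readings)
    if mad == 0:
        # At least half the readings equal the median; anything else stands out.
        return typical, [r for r in readings if r != typical]
    # 0.6745 rescales the score so it is comparable to a standard z-score.
    outliers = [
        r for r in readings
        if 0.6745 * abs(r - typical) / mad > threshold
    ]
    return typical, outliers


# Made-up temperature readings with one obvious needle in the haystack.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 35.2, 20.2, 20.1]
typical, outliers = split_signal(readings)
print("typical level:", typical)
print("outliers:", outliers)
```

Real telemetry is rarely this tidy. Seasonal patterns, drift and missing values all need handling, which is where the analytics tools and the data scientists come in.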
Analytics is a service-oriented area, so the CTO could just finish
their work there and outsource the rest. If they decide to keep it in-house,
they need a few more things.
Get some analytics applications.
Analytics applications help turn large data sets into business
value. The enterprise uses analytics tools to tackle the difficult job of doing
something useful with its unstructured data.
Data analytics products are one of the big
data technologies and live in a data scientist’s toolbox. Analytics products
don’t usually deliver ready-made business value.
When an organization purchases analytics applications, it
must leave plenty of cash for the training budget. Complex tools are not
intuitive.
Write a big data policy.
Managing large data sets is a difficult job. The big data
manager has plenty of moving parts to configure, and the policy must answer questions like these.
- What is the retention policy? What parts of the data pool can be deleted, and when? What happens to the rest of the historical data?
- What is the data protection policy? Who gets to view data? What are the privacy implications? What are the legal restrictions?
- Where is the data stored? If a cloud provider is holding the data, how do we get it back?
- What kind of metadata is required? How can anyone identify the purpose of a big data store?
- How many data sets are there, and how can they be blended?
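One way to make a policy concrete is to record the answers next to each data set, so they can be checked by a program rather than looked up in a document. The sketch below is only an illustration in Python. The class `DataSetPolicy`, its fields and the example values are all hypothetical, and a real policy would need legal and privacy input that no code can supply.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DataSetPolicy:
    """Hypothetical catalog entry answering the policy questions for one data set."""
    name: str
    purpose: str          # metadata: why this data store exists
    location: str         # where the data is held
    retention_days: int   # how long raw records are kept
    allowed_roles: tuple  # who gets to view the data

    def is_expired(self, record_date: date, today: date) -> bool:
        # A record is past retention once it is older than retention_days.
        return today - record_date > timedelta(days=self.retention_days)

    def can_view(self, role: str) -> bool:
        return role in self.allowed_roles


# Example values are invented for illustration.
policy = DataSetPolicy(
    name="vehicle-telemetry",
    purpose="Engine sensor readings for maintenance planning",
    location="cloud provider, EU region",
    retention_days=365,
    allowed_roles=("data-scientist", "fleet-manager"),
)
print(policy.is_expired(date(2013, 1, 1), today=date(2014, 6, 1)))
print(policy.can_view("marketing"))
```

Keeping entries like this for every data set also goes some way to answering the last question, because it shows how many data sets there are and what each is for.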
Assemble an analysis team.
The first part of building a team is partnering up a business
executive and an IT sponsor. Both are required.
There may be a data warehouse and data miners in the
organization, but probably no data scientists. There are a few ways of getting some.
- Hire experts. Pros are in demand.
- Hire people with the right capability and let them learn.
- Spot the budding statisticians in your organization and grab them.
Spotting capability means looking for clues. John Foreman is chief scientist at
Mailchimp and writes a blog on data science. If someone is a fan of his work,
that’s a clue. Perhaps one of the data miners has an artistic streak. The person
obsessively dragging consumer behaviour out of click trails is worth talking to.
That still leaves some gaps.
A few huge organizations, like telecoms companies and global
retailers, have been battling with the problem of analytics for decades. They
have specialist teams, home-grown tools, and years of experience. Alongside
their expensive specialized capabilities, a brave new world of big data and commoditized
data analytics is appearing. There is quite a way to go.
- The enterprise is doing new things with existing data sets, rather than collecting new data.
- Plenty of big data tools exist, but few are ready for business users.
- Organizations in many parts of the world have not started exploiting big data.
- Better machine learning is required to extract signal from noise.
It takes statistical, technical and business expertise to get value
from big data. Even where the analytics tools exist, they must be tailored for
business needs - it’s not a one-size-fits-all world.
Over to you, big data startups around the world.
Plug those gaps.