When it comes to big data and analytics, many of us get stuck in a quagmire of the buzzwords and platitudes that have emerged around them. But amid the uncertainty, analytics has continued to develop all the same, and some of those developments are finally cutting through the haze. In an article for InformationWeek, Nick Elprin and Philip Levinson identify three particular developments you should be aware of:
- Big data’s diminishing returns: De-emphasizing the size of the data
- CIOs dealing with data science “Wild West”: IT teams bringing order to data and analytics
- The need to show your work: Increasing model risk management and oversight
As some have already realized, more data does not always mean more solutions. Often, more data just means more considerations, some of them superfluous, and that slows down the decision-making process. In particular, it slows down model building and testing for the analytics team. One way to rectify this issue is to get more efficient at selecting representative data subsets. Iterate with smaller sets of data that allow you to make predictions faster. See what hits the mark and revise as you go. This is better than sluggishly juggling petabytes of whatever.
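One possible way to pick a representative subset is stratified sampling, where each group in the data is sampled in proportion to its size. The sketch below is only an illustration and is not taken from the original article: the `stratified_sample` helper, the `segment` field, and the row counts are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, fraction, seed=0):
    """Return roughly `fraction` of `rows`, sampled separately within each group."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible

    # Bucket the rows by the grouping key (for example, customer segment).
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)

    sample = []
    for members in groups.values():
        # Keep at least one row per group so rare segments are not lost.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical data: 900 "retail" rows and 100 "enterprise" rows.
rows = [{"segment": "retail", "id": i} for i in range(900)]
rows += [{"segment": "enterprise", "id": i} for i in range(900, 1000)]

subset = stratified_sample(rows, key=lambda r: r["segment"], fraction=0.1)
print(len(subset))
```

The subset keeps the original nine-to-one mix of segments at a tenth of the size, so a model can be built and tested on it quickly before committing to the full data set.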
Another development is that CIOs are realizing the way analytics is conducted needs more governance. Right now, analytics teams may be using all manner of “shadow IT infrastructure” in the pursuit of results. The authors cite RStudio, Jupyter, and Anaconda as examples. But for the sake of greater transparency and efficiency, CIOs are working on re-centralizing infrastructure and taming the “Wild West” of data science.
A second aspect of governance is model risk management. The authors say this:
With the EU GDPR [General Data Protection Regulation] going into effect in May 2018, along with increased worldwide scrutiny on data model use in regulated industries, data governance is more important than ever. Many of the most data-oriented industries – finance, banking, insurance, healthcare, utilities – are among the most heavily regulated. In these sectors, with key decisions surrounding pricing, lending, marketing and product availability being increasingly driven by data science, policymakers are taking notice. The regulation extends beyond just what data is used but also how it is used and by whom, adding significant complexity.
These are new but necessary hoops to jump through. Analytics is not getting any easier, but at least people are gradually working out what not to do.
You can view the original article here: http://www.informationweek.com/big-data/three-big-data-developments-no-one-is-talking-about/a/d-id/1329375