How to Find the Right Tool to Unlock the Power of Data

“Too much information” is a phrase that is usually used in response to something like your grandmother’s description of what happens to her feet after a long walk.   However, in the world of data, too much information can actually be a positive thing, if you know how to sort through it.   An article from The Economic Times highlights the ways in which one can make sense of all the models, statistical information, and technological assistance flying around out there.

While some managers show off by attempting to employ every single tool they can think of, other managers waste vast amounts of time trying to handpick the perfect model. Ideally, although it is not always immediately possible, managers would review the data and quickly select the most appropriate model. Doing so would also forgo the need for extensive analytical work.

In one study of this problem, a Wharton student and two professors gathered 64 different “real world” data sets (retailing, online media, etc.) and ran four different models on each.

A typical data set might have two years’ worth of information. For each of the 256 (4 × 64) combinations of data set and model, the computer was fed, for example, the first year of data and then told to use the model to predict the numbers for the second year. The results were scored and sorted by accuracy. In all, the number crunching would have required 24,000 hours of computing time on a single CPU. But because it was run in parallel across many different machines on Amazon’s Elastic Compute Cloud (EC2), in connection with a research grant from the company, the job took just two days to complete.
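
To make the procedure described above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the researchers' actual code. The four forecasters are hypothetical stand-ins (the models used in the study are not named here), the two toy data sets are invented, mean absolute error is an assumed accuracy score, and a local thread pool stands in for the many EC2 machines.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from statistics import mean

# Hypothetical stand-in forecasters. These are NOT the four models
# from the study; they only illustrate the evaluation loop.
def naive_last(train, horizon):
    # Repeat the last observed value.
    return [train[-1]] * horizon

def historical_mean(train, horizon):
    # Repeat the average of the training period.
    return [mean(train)] * horizon

def drift(train, horizon):
    # Extend the straight line from the first to the last training point.
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * (h + 1) for h in range(horizon)]

def seasonal_naive(train, horizon, season=4):
    # Repeat the last full "season" of training values.
    return [train[-season + (h % season)] for h in range(horizon)]

MODELS = {
    "naive_last": naive_last,
    "historical_mean": historical_mean,
    "drift": drift,
    "seasonal_naive": seasonal_naive,
}

def mean_absolute_error(actual, predicted):
    # Assumed accuracy score; lower is better.
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def evaluate(task):
    # Fit on the first half ("year one"), predict the second half ("year two").
    (data_name, series), (model_name, model) = task
    half = len(series) // 2
    train, test = series[:half], series[half:]
    forecast = model(train, len(test))
    return data_name, model_name, mean_absolute_error(test, forecast)

def rank_models(datasets, models=MODELS, workers=4):
    # One task per (data set, model) pair; tasks are independent,
    # so they can run in parallel.
    tasks = list(product(datasets.items(), models.items()))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(evaluate, tasks))
    # Sort by error so the most accurate combinations come first.
    return sorted(results, key=lambda r: r[2])

# Invented toy data: eight periods each, split into two "years" of four.
datasets = {
    "steady_growth": [10, 12, 14, 16, 18, 20, 22, 24],
    "flat": [5, 5, 5, 5, 5, 5, 5, 5],
}

for data_name, model_name, error in rank_models(datasets):
    print(f"{data_name:14} {model_name:16} error={error:.2f}")
```

Because each data-set-and-model pair is scored independently, the work divides cleanly across machines. As a rough check on the figures above, 24,000 CPU-hours completed in two days (48 hours) implies on the order of 500 machines' worth of parallelism, assuming near-perfect scaling.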

The biggest surprise of the paper was how much simplicity lay within the massive amount of information. The data clustered itself into easily identifiable groups. The best news for the researchers was that each pattern could be linked with one of the four models tested. Most importantly, they showed that choosing the model that best fits the data, rather than the most complex one, is the most efficient business policy.

About Anne Grybowski

Anne is a former staff writer for CAI's Accelerating IT Success, with a degree in Media Studies from Penn State University.
