Maximal Information Coefficient Teases Out Multiple Vast Data Sets

Posted In Computers, Softwares - By GeekMan On Sunday, December 18th, 2011

Using advanced technology to gather large, complex data sets is nothing new. Many smart computer programs can search these data sets at great speed, “but fall short in even-handedly detecting different kinds of patterns in large data collections, essential for more sophisticated analysis.” A team of researchers reportedly announced on Friday that it has developed a tool called “Maximal Information Coefficient” (MIC) that can tease out patterns across multiple, vast data sets in a way that no other software program can.

What might we be missing in large datasets? If researchers printed on paper each potential relationship in a recent data set containing abundance levels of bacteria in the human gut, the stack of paper would reach to a height of 1.4 miles, 6 times the height of the Empire State Building. Credit: Sigrid Knemeyer

According to a press release, researchers from the Broad Institute and Harvard University recently developed MIC with support from the National Science Foundation (NSF). The tool is capable of teasing out multiple, recurring events or sets of data hidden in health information from around the globe, in the changing bacterial landscape of the gut, or even in statistics amassed from a season of competitive sports, and much more.

According to the Broad Institute’s press release, Maximal Information Coefficient (MIC) is part of a suite of statistical tools called “Maximal Information-based Nonparametric Exploration” (MINE). It can sort through today’s mass of research variables, from attempts to track hurricanes, model earthquakes and identify the Higgs boson to efforts to glean insights into what affects the world economy and into social networking interactions such as those on Facebook.

The researchers report their findings in the Dec. 16th issue of the journal Science. As quoted in the release, Pardis Sabeti, an author of the paper and an assistant professor at the Center for Systems Biology at Harvard University, says:

“There are massive data sets that we want to explore, and within them, there may be many relationships that we want to understand. The human eye is the best way to find these relationships, but these data sets are so vast that we can’t do that. This toolkit gives us a way of mining the data to look for relationships.”

According to the same release, one of the greatest strengths of this newly developed tool within MINE

“is its ability to detect and analyze a broad spectrum of patterns and characterize them according to a number of different parameters a researcher might be interested in. Other statistical tools work well for searching for a specific pattern in a large data set, but cannot score and compare different kinds of possible relationships. Researchers can also use MINE to generate new ideas and connections.”
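To make the contrast in that quote more concrete, here is a rough Python sketch. It is not the MIC algorithm from the paper: as we understand it, MIC searches over many grid resolutions, while this toy version uses one fixed equal-width grid and simply normalizes the mutual information of the binned data. The names `grid_score` and `bins`, and the sample data, are our own assumptions for illustration. The idea is only to show how a tool tuned to one specific pattern (Pearson correlation, which looks for straight lines) can miss a relationship that a grid-based score still picks up.

```python
import math


def pearson(xs, ys):
    """Classic Pearson correlation: tuned to linear relationships."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def bin_index(value, lo, hi, bins):
    """Map a value to one of `bins` equal-width bins over [lo, hi]."""
    if hi == lo:
        return 0
    return min(int((value - lo) / (hi - lo) * bins), bins - 1)


def grid_score(xs, ys, bins=4):
    """Hypothetical simplified score: mutual information of the data on a
    fixed bins-by-bins grid, divided by log2(bins) so it lies in [0, 1].
    The real MIC maximizes over many grids; this sketch does not."""
    n = len(xs)
    x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)
    joint, x_counts, y_counts = {}, {}, {}
    for x, y in zip(xs, ys):
        i = bin_index(x, x_lo, x_hi, bins)
        j = bin_index(y, y_lo, y_hi, bins)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        x_counts[i] = x_counts.get(i, 0) + 1
        y_counts[j] = y_counts.get(j, 0) + 1
    mi = 0.0
    for (i, j), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy * n * n / (x_counts[i] * y_counts[j]))
    return mi / math.log2(bins)


# Made-up sample data: one straight line and one parabola over the same x values.
xs = [i / 100 for i in range(-100, 101)]
line = [2 * x + 1 for x in xs]
parabola = [x * x for x in xs]

for name, ys in (("line", line), ("parabola", parabola)):
    print(name, "pearson:", round(pearson(xs, ys), 3),
          "grid score:", round(grid_score(xs, ys), 3))
```

On this sample the parabola is a perfectly deterministic relationship, yet its Pearson correlation is essentially zero because the curve is symmetric, while the grid score stays clearly above zero for both shapes. A fixed 4-by-4 grid is a crude choice, which is presumably why the real method tries many grids instead of one.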

(Via: “Tool detects patterns hidden in vast data sets” by the Broad Institute of Harvard and MIT).
