The next evolution in Predictive Analytics
Hadoop and the analytics built on it are having a revolutionary impact for some companies, which can now tackle data challenges at scale (the 3 V’s: volume, velocity, and variety) with a responsiveness never before imagined. At conferences like Strata + Hadoop World we learn about these success stories, and our envy grows as we want to achieve similar successes for our own enterprises.
The Hadoop platform vendors (Cloudera, Hortonworks, MapR) are hardening and expanding the platform, making it enterprise ready while staying true to its open source roots. We have seen that the pace of innovation within open source projects linked to Hadoop is great. Hadoop has evolved from a batch-oriented paradigm to embrace real-time use cases with the introduction of engines like Spark and Flink. The vendor community has recently come together to establish the Open Data Platform. We applaud this move, as it will ensure that the pace of innovation continues, that the platform keeps maturing at pace, and that enterprise customers will be more willing to embrace Hadoop.
Additionally, companies are building on this by providing advanced analytics capabilities (Pivotal, Teradata, …). As we grapple with our new “Big Data” data sources, we see the emergence of Data Wrangling solutions (Trifacta, Pentaho, ...) that aim to increase our productivity. Finally, the importance of Visualization (Tableau, Qlik, ...) can never be overlooked as we seek to convey the insights gained to our Business Sponsors. Based on this, we believe the evolution of predictive analytics will follow the roadmap below:
As a technologist and evangelist for Big Data, I am really excited by the pace of innovation, the broad range of emerging tools, and the important role that open source is playing. However, if I were a business owner, none of this would interest me. I would only be interested in insights that I can trust: proven insights.
This is where Singularities comes in. Our mission is to provide the most natural, mathematically sound, and efficient beliefs and behaviours modelling platform that permits creation, execution, and interaction with the models in reliable ways.
Our vision is that future systems will be a lot more capable, autonomous, and conscious if they operate on abstractions of persons that resemble the modular and introspectable images we have of colleagues, friends, customers, and ourselves. Intuition is now a Science.
When I was director of database systems for a large online poker site, one of the teams that reported to me was the poker data analytics team. These guys were poker experts, but more than this, they could find interesting patterns in the data from how different hands of poker were played. We had tens of millions of hands of poker we could analyze, but the sort of questions we wanted to ask could not be asked via relational technology in a performant way. One of the senior people in the company came across a company called Aster Data (acquired by Teradata in 2011), and we built out a solution holding 30+ TB of active data on top of commodity hardware. We could now ask questions and gain insights at speeds we never thought possible. Using this platform, we were able to model player behavior and identify bots (software programs) playing on the site. We found a lot of them! We brought our results to the security team, and they were very happy but responded that they needed to review each case. This wasn’t practical given the numbers, so we introduced a confidence metric and kept sending them sample cases at different metric levels until we had calibrated it to a point where anything above that number was a proven bot. We were then able to move to an automated response, and in one day we closed many, many accounts that were bots and seized their funds, as they were playing against the rules of the site.
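To make the calibration idea concrete, here is a minimal sketch in Python. It is an illustration only, not the code we ran on that platform: the function names, the `min_confirmed` parameter, and the sample scores are all hypothetical. It assumes each reviewed case comes back from the security team as a pair of confidence score and verdict, and it looks for the lowest score at or above which every reviewed case was a confirmed bot.

```python
from typing import Iterable, Optional, Tuple


def calibrate_threshold(
    reviewed: Iterable[Tuple[float, bool]],
    min_confirmed: int = 3,
) -> Optional[float]:
    """Return the lowest score t such that every reviewed case with
    score >= t was confirmed as a bot, or None if fewer than
    min_confirmed cases support that threshold.

    `reviewed` holds (confidence_score, confirmed_bot) pairs; both the
    shape of the data and `min_confirmed` are assumptions of this sketch.
    """
    # Walk from the most confident case down to the least confident one.
    ordered = sorted(reviewed, key=lambda case: case[0], reverse=True)
    threshold = None
    confirmed = 0
    i = 0
    while i < len(ordered):
        score = ordered[i][0]
        # Group cases that share the same score so that a tie containing
        # one human player disqualifies that whole score level.
        j = i
        all_bots = True
        while j < len(ordered) and ordered[j][0] == score:
            all_bots = all_bots and ordered[j][1]
            j += 1
        if not all_bots:
            break  # first human found: the threshold cannot go lower
        confirmed += j - i
        threshold = score
        i = j
    return threshold if confirmed >= min_confirmed else None


def should_auto_close(score: float, threshold: Optional[float]) -> bool:
    """Automated response applies only at or above a calibrated threshold."""
    return threshold is not None and score >= threshold


if __name__ == "__main__":
    # Hypothetical review results: (confidence score, confirmed bot?)
    sample = [(0.99, True), (0.95, True), (0.90, True),
              (0.85, False), (0.80, True), (0.60, False)]
    t = calibrate_threshold(sample)
    print("calibrated threshold:", t)
    print("auto-close a 0.97 account:", should_auto_close(0.97, t))
```

Requiring a minimum number of confirmed cases reflects why we kept sending samples rather than trusting the first clean batch: a threshold backed by only one or two reviews is not yet a proven one.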
We really believe an inflection point is happening in the evolution of predictive analytics, facilitated by Hadoop, and we are excited to play our part. We read with interest a recent McKinsey report titled “An executive’s guide to machine learning”. We like their take on the future:
“It’s true that change is coming (and data are generated) so quickly that human-in-the-loop involvement in all decision making is rapidly becoming impractical. Looking three to five years out, we expect to see far higher levels of artificial intelligence, as well as the development of distributed autonomous corporations (DAC). These self-motivating, self-contained agents, formed as corporations, will be able to carry out set objectives autonomously, without any direct human supervision. Some DACs will certainly become self-programming.
One current of opinion sees distributed autonomous corporations as threatening and inimical to our culture. But by the time they fully evolve, machine learning will have become culturally invisible in the same way technological inventions of the 20th century disappeared into the background. The role of humans will be to direct and guide the algorithms as they attempt to achieve the objectives that they are given. That is one lesson of the automatic-trading algorithms which wreaked such damage during the financial crisis of 2008.”
We are interested in your thoughts on the evolution happening within predictive analytics. You can write to me at [email protected].