Real-Time Big Data Stream Analytics
Big Data is a new term used in Business Analytics to identify datasets that we can not manage with current methodologies or data mining software tools due to their large size and complexity. Big Data mining is the capability of extracting useful information from these large datasets or streams of data. New mining techniques are necessary due to the volume, variability, and velocity, of such data.
In this talk, we will focus on advanced techniques in Big Data mining in real time using evolving data stream techniques: using a small amount of time and memory resources, and being able to adapt to changes. We will discuss a social network application of data stream mining to compute user influence probabilities. And finally, we will present the MOA software framework with classification, regression, and frequent pattern methods, and the SAMOA distributed streaming software that runs on top of Storm, Samza and S4.