I recently attended what is probably one of the best data conferences (data, not Machine Learning or AI): Conf2017 by Splunk.
Why was it the best data conference?
A lot of data conferences these days are very high-level, while this one is very technology-oriented and draws a corresponding crowd. The conversations are interesting, and the attendees are engaged, curious, exploratory, and quite intelligent, which makes them a pleasure to talk to.
Why was it data vs ML/AI?
There are a few very good Machine Learning conferences (MLConf is a good example) where the state of the art in ML and its applications is discussed, typically within the realm of research or product. Splunk.Conf 2017 is not about research. It has data at its roots and is very general purpose, so it will likely remain broad at its core. As a result, it will cover ML at a high level but is unlikely to go in depth into very specific vertical use cases.
Yes, Splunk no doubt has security and even IT Operations offerings; however, the combination of ML with those still remains very rigid and targets technologies. But should it? Or even can it?
State of the Art?
Can it? That’s what I would like to focus on first. As an ML practitioner for several decades, I have to state that ML/AI is pretty hard work unless you have the background and experience and have built a few such products and models, and even then you always learn new things. There are a lot of questions that need to be answered about the data, the features, the model, the performance, and so on, that simply cannot be answered by general-purpose algorithms. Not to mention, once again, that one must be skilled in the art; few people are, and not many companies can afford a team of ML engineers.
So, while Splunk does address the data and attempts to “democratize” ML by making some primitives available, I do not believe that it will scale or be useful beyond hobbyists and tinkerers, since once it becomes the “real deal” it is quite challenging from almost every angle (including scalability, which is outside the science of Machine Learning).
What is the state of the art anyway?
If your organization already has a strategy, architecture, and implementation to gather data as well as to introduce new data points, you are ahead of 80% of the people who use data for “production” purposes. I have to state that my sample is biased towards IT/AIOps and is significantly smaller than even a consumer base, but after a number of conversations at the conference and before it, that’s the conclusion I have arrived at.
Well… the next logical step is, of course, applying some form of analytics (which may, and likely will, include Machine Learning), but the question is how? Why? What questions to ask? What features to use? And so on. As I have mentioned above, I believe that simply having a database (a Splunk-based one would be analogous) and R (the ML Toolkit would be the analog of that) will not solve the problem.
I believe that in order to make it work in IT/AIOps as well as in other domains, it is important for new companies to leap beyond the available toolset and actually create a product that, out of the box, answers questions for a particular industry and is able to forecast/predict outcomes. Basically, that means delivering a data scientist in a box that understands the data, the vertical, its use cases, and its challenges. That’s where all data-driven industries should go, and Splunk simply cannot do it all (and probably does not want to).
Do such products exist, and are they being developed? Yes. SIOS IQ from SIOS is a product that partners with Splunk, AWS, Azure, and VMware. It not only identifies anomalies in the interplay between data features (mind you, beyond the traditional single-feature anomaly) but also identifies the root cause of the problem and predicts the issue up to 7 days in advance.
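To make the difference between a single-feature anomaly and an anomaly in the interplay between features more concrete, here is a minimal sketch. It is my own illustration and not how SIOS IQ or Splunk actually works: the CPU and latency readings are invented, the function names are hypothetical, and I use a plain Mahalanobis distance as a stand-in for whatever a real product does.

```python
import math
import statistics

# Hypothetical readings: CPU utilisation (%) and request latency (ms) that
# normally rise and fall together. The numbers are invented for illustration.
CPU = [10, 20, 30, 40, 50, 60, 70, 80, 90]
LATENCY = [12, 19, 31, 41, 49, 62, 69, 81, 88]


def z_score(history, value):
    """Single-feature check: distance from the mean in standard deviations."""
    return (value - statistics.fmean(history)) / statistics.pstdev(history)


def mahalanobis_2d(xs, ys, point):
    """Two-feature check: distance that accounts for how xs and ys co-vary."""
    n = len(xs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    det = sxx * syy - sxy ** 2  # zero if the features are perfectly correlated
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, written out for 2x2.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


# Low CPU with high latency: each value alone is within the observed range.
suspect = (20, 80)
print("cpu z-score:    ", round(z_score(CPU, suspect[0]), 2))
print("latency z-score:", round(z_score(LATENCY, suspect[1]), 2))
print("joint distance: ", round(mahalanobis_2d(CPU, LATENCY, suspect), 2))
```

Looked at one feature at a time, the suspect reading stays within two standard deviations of the mean on both axes, so a per-feature threshold would let it through. The joint distance is far larger, because low CPU paired with high latency breaks the relationship the two features normally have.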
I am not saying that libraries that let you create models and run algorithms are not useful, but they are not for the general public, which has other wars to fight. The algorithms are better left to products that ride on top of the data and extract value from it using Machine Learning.
Who was impacted by the newly announced features?
I gathered a number of very good announcements about improvements and new features in Splunk products across the board. However, over the past 4 years I have observed the emergence of a number of companies that focus on some form of event analysis, leveraging some form of classification approach. I will not explicitly point any fingers or name any names, but those are the ones that were impacted by the announcement of the event analysis and correlation features implemented inside the ITSI product. Now, if you are looking at event analysis, you don’t have to go any further, since it is now part of the “deal”.
Overall, a great conference, an excellent location, and an intelligent audience!
Looking forward to next year!