“Big Data” has brought “Predictive Analytics” (long available but hidden in the academic world under the names “Machine Learning” and “Data Mining”) into the spotlight of modern Business Analytics. These days you will find many examples of analytics enabling business decisions by supporting a path from data to decisions and actions. Below I will briefly discuss the current positioning of Business Analytics and then say more about OpenRules' own experience in this area, including the OpenRules Rule Learner.
“The buzz around Big Data and Analytics seems to be growing exponentially. The concept of Big Data and Analytics appears to have captured the imagination of practically everyone across enterprises. It is indeed interesting to see why data and analytics are under the spotlight and the implications they have on the business” – read more here.
In the following picture, Gartner describes four different types of analytics:
While answering different questions and using different techniques, they all have a common objective: better decision making. You may find good practical advice about these techniques in the recent posts by Jean-Francois Puget from IBM/ILOG.
OpenRules recognized the importance of machine learning for automated business rules discovery 10 years ago. In 2006 we added a special component, “Rule Learner”, to our BRMS. Being among the first vendors to actually integrate Business Rules and Machine Learning, we quickly landed a few related projects, the most prominent of which was the US government contract “Automating Business Rules Creation Using Machine Learning Models” issued by the National Headquarters Office of Research, Internal Revenue Service (IRS). Over the next several years OpenRules, Inc. successfully completed two contracts with the IRS related to the integrated use of machine learning and business rules technologies.
These projects demonstrated that the generated rules could yield substantial savings when applied instead of the existing rules (which, of course, were unknown to us). Here is a quote from the IRS Performance Report for the OpenRules contract (2010-2012):
“IRS Management was satisfied with the services being performed by the Contractor and its staff members. The overall performance of Dr. Jacob Feldman and staff on behalf of The OpenRules, Inc., was very good. This company would be highly recommended to work for the Internal Revenue Service in the future in regards to Machine Learning (ML) Models Performance in IRS Enforcement Programs.”
The most important result was that, after analyzing ~50K historical records with known audit results, Rule Learner managed to discover meaningful business rules capable of diagnosing suspicious tax returns. The discovered rules were automatically generated in two formats:
- Human-oriented, so subject matter experts were able to understand, interpret, and if necessary easily modify or augment the rules
- Machine-oriented, so a rule engine (OpenRules) could execute the generated rules.
The key to this success was our Rule Trainer, which addresses a crucial issue that arises when people apply machine learning methods to large data sets:
“Algorithms are important, but thoughtfully constructed independent variables are more so” – see this recent article
We tried 21 different ML algorithms and selected the best one for our purposes (RIPPER). However, it would be disastrous to apply even the best algorithm directly to the entire historical data provided by a customer without understanding the underlying relationships between key decision variables (attributes) inside the data sets. So, we created a friendly custom GUI that allowed IRS business analysts to create so-called “training rules”, which automatically select semantically meaningful positive and negative examples, limiting the resulting training sets to around a thousand records instead of the tens of thousands of available records. By modifying the training rules in business terms in the GUI, business analysts are able to immediately:
- Generate the proper training sets
- Run the selected machine learning method to generate the classification (production) rules
- Apply the classification rules to all (!) available data records and analyze the produced results using various statistics such as a hit rate.
This way, after several experiments with training rules, the business analysts themselves, without becoming machine learning gurus, were able to discover reasonably good sets of classification rules.
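The three steps above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record fields (`deduction_pct`, `flagged`), the synthetic data, the two training rules, and the simple single-threshold learner, which stands in for RIPPER and is not the Rule Learner implementation. "Hit rate" is taken here to mean the share of records where the generated rule agrees with the known audit result.

```python
# Hypothetical tax-return records; the field names and the data are made up.
def make_records(n=6000):
    records = []
    for i in range(n):
        pct = (i * 7) % 60             # deductions as a percent of income
        truly_suspicious = pct > 40    # hidden pattern in the synthetic data
        noisy = (i % 11 == 0)          # some audit results contradict the pattern
        records.append({"id": i, "deduction_pct": pct,
                        "flagged": truly_suspicious != noisy})
    return records

# "Training rules": business-level conditions that pick clear-cut examples.
def is_positive_example(r):
    return r["flagged"] and r["deduction_pct"] >= 45

def is_negative_example(r):
    return (not r["flagged"]) and r["deduction_pct"] <= 35

def build_training_set(records, max_per_class=500):
    """Step 1: keep only the examples selected by the training rules."""
    positives = [r for r in records if is_positive_example(r)][:max_per_class]
    negatives = [r for r in records if is_negative_example(r)][:max_per_class]
    return positives + negatives

def learn_threshold_rule(training, attribute):
    """Step 2: a stand-in learner that finds the best 'attribute > t' rule."""
    values = sorted({r[attribute] for r in training})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def accuracy(t):
        hits = sum((r[attribute] > t) == r["flagged"] for r in training)
        return hits / len(training)

    return max(candidates, key=accuracy)

def hit_rate(records, attribute, threshold):
    """Step 3: apply the generated rule to all records and score it."""
    hits = sum((r[attribute] > threshold) == r["flagged"] for r in records)
    return hits / len(records)

records = make_records()
training = build_training_set(records)
threshold = learn_threshold_rule(training, "deduction_pct")
rate = hit_rate(records, "deduction_pct", threshold)

print(f"training records: {len(training)} of {len(records)}")
print(f"rule: IF deduction_pct > {threshold} THEN flag")
print(f"hit rate on all records: {rate:.3f}")
```

In this sketch the training rules filter out the contradictory records before learning, so the learner sees a small, clean training set, while the hit rate is still measured against every available record. Changing the two training-rule functions and rerunning mirrors the experiment loop described above.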
We usually ask our customers not to place too much trust in the discovered rules merely because they rely on “scientifically sound” algorithms. Business analysts who work for a particular customer are the only experts who understand the actual underlying relationships between different decision variables. If they share this knowledge via a friendly Rule Trainer, then predictive analytics will be able to produce “reasonably good” production rules. “Reasonably good” means that the automatically discovered rules will produce results that are at least better than the results produced by human-defined rules.
This experience also showed that a certain level of customization is usually unavoidable, especially considering that customers rarely have in-house expertise in predictive analytics. As a result, we did not include Rule Learner in our standard downloadable software, providing it only to a few new machine learning customers together with our consulting assistance.
Apparently, the market situation has changed in the last few years. As “Predictive Analytics is Becoming Mainstream”, we now receive more and more requests to download our Rule Learner. So, we decided to make Rule Learner (at least its light-weight version) publicly available. We are preparing the proper software along with the documentation and plan to include it in the upcoming OpenRules release. Meanwhile, if you want to try Rule Learner, send a request to our Technical Support at firstname.lastname@example.org.
It is interesting to note that Predictive Analytics does not necessarily assume the use of Machine Learning methods. For example, we have another customer who initially planned to use our Rule Learner but ended up with quite an ingenious use of our classical rules-based technology to address their predictive/prescriptive problem. I’ve already described this use case in my recent comment to this Jean-Francois post:
People who actually do business analytics care more about the methods they use than about what we call them (predictive vs prescriptive). You may do predictions with or without machine learning, or find a reasonably good solution among multiple alternatives with or without optimization methods. For example, one of our customers provides field scheduling software that usually requires a lengthy configuration process to set up “who can do what and where”. So, they wanted to use predictive analytics to learn this information from the actual historical work assignments. Instead, they ended up defining intuitive business rules that analyze work assignments and automatically generate the proper configuration information. They use a rule engine (OpenRules) to define skills, service territories, and preferences for each worker, and feed this information to the scheduling system customized for every service provider. Is this an example of predictive or prescriptive analytics? They do not care, as it works really well.
More importantly, such use of business analytics allowed them to add two important features to their product:
- Minimal initial configuration for every customer
- An ever-learning cycle, as the actual work assignments keep changing.
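To make the idea of deriving “who can do what and where” from work assignments more concrete, here is a small hypothetical sketch. The assignment tuples, the worker names, the `MIN_JOBS` threshold, and the rule itself (a worker gets a skill or territory after at least that many matching assignments) are all assumptions for illustration; the customer's actual rules are not described in this post.

```python
from collections import Counter, defaultdict

# Hypothetical historical work assignments: (worker, job_type, zip_code).
ASSIGNMENTS = [
    ("ann", "install", "08817"),
    ("ann", "install", "08817"),
    ("ann", "repair", "08820"),
    ("bob", "repair", "08820"),
    ("bob", "repair", "08820"),
    ("bob", "repair", "08817"),
]

MIN_JOBS = 2  # assumed rule threshold: evidence needed to grant a skill or territory

def derive_configuration(assignments, min_jobs=MIN_JOBS):
    """Apply simple counting rules to produce per-worker configuration."""
    skill_counts = defaultdict(Counter)
    area_counts = defaultdict(Counter)
    for worker, job_type, zip_code in assignments:
        skill_counts[worker][job_type] += 1
        area_counts[worker][zip_code] += 1

    config = {}
    for worker in skill_counts:
        config[worker] = {
            # Rule: a worker has a skill after min_jobs assignments of that type.
            "skills": sorted(s for s, n in skill_counts[worker].items()
                             if n >= min_jobs),
            # Rule: a zip code is in the worker's territory after min_jobs visits.
            "territories": sorted(z for z, n in area_counts[worker].items()
                                  if n >= min_jobs),
        }
    return config

config = derive_configuration(ASSIGNMENTS)
for worker, settings in config.items():
    print(worker, settings)
```

Rerunning `derive_configuration` whenever new assignments arrive is one simple way to picture the ever-learning cycle: the configuration is regenerated from current data rather than maintained by hand.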