Archive for the ‘BI’ Category
While being able to build predictive models on mountains of data without moving it out of the database is pretty cool in itself, I feel analysis without action is pretty much pointless. Tom Davenport describes this common data mining conundrum in Competing on Analytics.
Many firms are able to segment their customers and determine which ones are most profitable or which are most likely to defect. However, they are reluctant to treat different customers differently—out of tradition or egalitarianism or whatever. With such compunctions, they will have a very difficult time becoming successful analytical competitors—yet it is surprising how often companies initiate analyses without ever acting on them. The “action” stage of any analytical effort is, of course, the only one that ultimately counts.
The OBE tutorial describes a scenario in which a business wants to identify customers who are most likely to purchase insurance. Through a set of simple steps, a (decision tree) classification model is built that can be used to predict whether a particular customer is likely to purchase based on historic data.
In a classical data mining approach, the predictions of this model would be written to some OUTPUT_TABLE where they would be available for subsequent processing. Growing staler every minute—and soon forgotten when its newer sibling OUTPUT_TABLE_NEW_FINAL_2 is inevitably created—our precious business intelligence slowly withers away in a disregarded section of the database until ultimately dropped by a careless DBA.
Output tables are where analytical insight goes to die.
If all we were interested in was building models, we’d be better off glueing choo-choos. It is the new ways in which we can utilise these database-resident models that make this technology really interesting. With a few simple additional steps, this same model can be used in real time to provide inline predictions based on up-to-date customer data, as well as for new customers.
All we need is a view
and a join.
Update (October 3rd, 2012): as Marcos points out in the comments, I was making things far too complicated. No need for a separate join; simply select the output columns you need and pass everything directly to the view.
The join operation glues the original data and the prediction models together; the view allows us to look at the harmonised results directly. When a customer record is selected from the view, the source data for this record is passed to the model to generate the predicted values in real time. When source data changes, so does the prediction. When new source records are added, they are automatically processed in the same way.
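To make the mechanism concrete, here is a minimal sketch in Python, with SQLite standing in for Oracle. It is not the Oracle Data Mining setup from the tutorial: the "model" is a hand-written CASE expression with a made-up threshold, the table is cut down to three columns, and the helper name `predict_for` is hypothetical. All it illustrates is that a view scores rows at the moment they are read, so new and updated rows are picked up without any refresh step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down stand-in for the source table.
    CREATE TABLE insur_cust_ltv_sample (
        customer_id     TEXT PRIMARY KEY,
        bank_funds      REAL,
        checking_amount REAL
    );

    -- Stand-in for the real model: a one-split "decision tree" with an
    -- arbitrary threshold, evaluated per row whenever the view is read.
    CREATE VIEW insur_cust_ltv_prediction AS
    SELECT customer_id,
           CASE
               WHEN COALESCE(bank_funds, 0) + COALESCE(checking_amount, 0) >= 500
               THEN 'Yes'
               ELSE 'No'
           END AS insur_pred
    FROM insur_cust_ltv_sample;
""")


def predict_for(customer_id):
    """Read the current prediction for one customer from the view."""
    row = conn.execute(
        "SELECT insur_pred FROM insur_cust_ltv_prediction WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None


# A brand-new customer is scored as soon as the row exists.
conn.execute("INSERT INTO insur_cust_ltv_sample (customer_id) VALUES ('CU123')")
before = predict_for("CU123")

# Changing the source data changes the prediction on the next read.
conn.execute(
    "UPDATE insur_cust_ltv_sample "
    "SET bank_funds = 500, checking_amount = 100 "
    "WHERE customer_id = 'CU123'"
)
after = predict_for("CU123")

print(before, after)
```

Nothing is stored: the view holds only the scoring expression, so there is no output table to go stale.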
-- Create a new customer.
INSERT INTO INSUR_CUST_LTV_SAMPLE (CUSTOMER_ID, LAST, FIRST)
VALUES ('CU123', 'VERMEER', 'LUKAS');

1 rows inserted.
Elapsed: 00:00:00.003

-- Get prediction and probability for the new customer.
SELECT CUSTOMER_ID, insur_pred, insur_prob
FROM insur_cust_ltv_prediction
WHERE CUSTOMER_ID = 'CU123';

CUSTOMER_ID INSUR_PRED INSUR_PROB
----------- ---------- ----------
CU123       No         0.7262813

Elapsed: 00:00:00.004

-- Update customer data.
UPDATE INSUR_CUST_LTV_SAMPLE
SET bank_funds = 500, checking_amount = 100
WHERE CUSTOMER_ID = 'CU123';

1 rows updated.
Elapsed: 00:00:00.003

-- Get prediction and probability for the updated customer.
SELECT CUSTOMER_ID, insur_pred, insur_prob
FROM insur_cust_ltv_prediction
WHERE CUSTOMER_ID = 'CU123';

CUSTOMER_ID INSUR_PRED INSUR_PROB
----------- ---------- ----------
CU123       Yes        0.6261398

Elapsed: 00:00:00.004
Seamless. Any system that can read data from an Oracle database can now utilise Oracle Data Mining models. No need to move your data. No need to build new applications.
Applications reading data from the view need never know the difference between the original source data and machine generated predictions. Oracle Business Intelligence Publisher can easily display this data in forecasting reports; or use it to power pro-active alerts. In Oracle Real-Time Decisions, rules can be built around the outcomes of these models; or predictions from multiple sources can be fed into combined likelihood models for increased accuracy.
This is huge. Trust me. Stop over-analysing and start taking action. After all, that’s the only step that ultimately counts.
The reason why a large percentage of business intelligence (BI) applications fail is not due to technology. To a large degree these applications fail because of organizational, cultural, and infrastructure dysfunctions.
This observation from an article by Larissa T. Moss seems to be common knowledge in the BI community. Too often have we built wonderful BI applications, only to discover that the business has changed its mind, wants something else, or has no idea how to use the reports available. Our baby is thrown out with the bathwater.
We sigh, shrug, consider the money spent on the project wasted and move on. Unused reports are unused, so no real harm done there. Obviously there is also some business opportunity lost, but it is difficult to estimate the real cost of a failed BI project.
Not so with Oracle Real-time Decisions (RTD).
BI reporting solutions support strategic decisions by providing reports and insight. Humans are then required to interpret the performance indicators provided and convert data and knowledge into actions to meet business goals. If the reports and insights are not aligned with business goals, the human interpreters will simply not use them and come up with answers in other ways. Sure, these answers will probably not be perfect, but they will still be directed towards meeting the actual business goals.
Conversely, RTD supports tactical decisions by combining business rules and predictive analytics to automatically provide answers that optimize the defined business goals. Humans are required to monitor results, tweak rules and set priorities. If the business goals defined do not correlate to what the business actually wants, RTD will still (attempt to) optimize whatever it is that you are measuring.
The difference is that RTD will optimize regardless of whether the defined goals match the reality of the business; and RTD is pretty damn good at optimizing the hell out of anything.
Optimizing the right thing(s).
As an example, imagine we implement RTD at a call centre to advise agents which products to sell to customers dialing in. Call centre operators care a lot about call handling time (CHT): the average time it takes agents to handle a single call. We will (naively) use this as our only business goal; we want to minimize CHT. RTD will recommend a single product to sell and is informed of the results (total CHT for the call and whether the customer accepted the product) at the end of each call.
What do you think will happen?
The shortest calls are those where you sell nothing; the customer simply says “no” and hangs up. Thus more rejected offers lead to lower CHT. Since RTD was told to minimize CHT, it will quickly start recommending things it expects to be rejected. It will recommend things it knows have little chance of actually being sold.
What do you say? You’re saying you want to sell stuff? Well, too bad for you. That is not one of the business goals. All we said we cared about was CHT; and we’re doing a pretty good job minimizing CHT once we stop selling stuff.
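The effect is easy to reproduce with a toy simulation. The sketch below is not how RTD learns internally; the product names, acceptance probabilities and call lengths are all invented for illustration. It simply measures the average CHT per product over a batch of simulated calls and then recommends whichever product minimizes it.

```python
import random

# Hypothetical products with made-up acceptance probabilities.
ACCEPT_PROB = {
    "premium_bundle": 0.05,
    "phone_upgrade": 0.30,
    "loyalty_discount": 0.60,
}

# Assumed call lengths in seconds: a "no" ends the call quickly,
# while closing a sale takes a lot longer.
REJECTED_SECONDS = 60
ACCEPTED_SECONDS = 300


def simulate_call(product, rng):
    """Return (call handling time, accepted?) for one simulated call."""
    accepted = rng.random() < ACCEPT_PROB[product]
    return (ACCEPTED_SECONDS if accepted else REJECTED_SECONDS), accepted


def learn_average_cht(calls_per_product=1000, seed=42):
    """Offer every product the same number of times and average the CHT."""
    rng = random.Random(seed)
    averages = {}
    for product in ACCEPT_PROB:
        total = sum(simulate_call(product, rng)[0] for _ in range(calls_per_product))
        averages[product] = total / calls_per_product
    return averages


average_cht = learn_average_cht()

# The only goal we defined: minimize CHT.
recommended = min(average_cht, key=average_cht.get)

for product, cht in sorted(average_cht.items(), key=lambda item: item[1]):
    print(f"{product:<18} avg CHT {cht:6.1f}s  acceptance {ACCEPT_PROB[product]:.0%}")
print("Recommended:", recommended)
```

Under these assumptions the optimizer settles on the product customers are least likely to buy, which is exactly what it was asked to do.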
In BI reporting projects, lack of business involvement and ill-defined business goals might lead to unused reports. In RTD projects it could lead to disaster. We will waste more than just money (and RTD’s built-in reporting will be able to show just how much is lost).
We really, really, really need the business involved.
A representative of a large company (which shall remain unnamed) recently called on me for some advice. They had accidentally loaded polluted data into their data warehouse and wanted to know if there was anything they could do to get rid of it. I told them that restoring their most recent backup and reloading any more recent, unpolluted data was probably the simplest solution. They concurred, but regretted to inform me that they had not made a single backup of their production system since I helped them set it up; many, many months ago.
Let me repeat that for you. Large company. Production system. No backups. Ever.
Now, you’re probably thinking that it is pretty silly for a large company to not have backups of a production system. Well, wait until you hear what happened next.
A few days later, a different representative from the same company demanded to know where exactly in the documentation we had explained that they were expected to create backups of their production system. How were they supposed to know that anything could possibly go wrong? It wasn’t their fault that they had not anticipated this disaster, right?
I’m glad to report that, in the end, everything worked out and everyone lived happily ever after. But for future reference, here’s a free (as in beer, not as in speech) tip.
Backup. Your. Production. Database.
[Also, do not attempt to dry your poodle in a microwave oven]
Most people I asked, and most sources I referred to, define an organization similarly as “a group of people that share the same goals and objectives”.
[…] Working with this definition of an organization, leads you to think that stakeholders all share a set of central goals and objectives, and can be aligned in this direction. In reality, nothing could be further from the truth. In fact, many of the goals and objectives live at odds with one another. Shareholders want the highest possible shareholder value; employees look for job security and a place to build their skills and make a career; customers want a good price and a decent product or service; and suppliers want to sell as much as they can.
He subsequently proposes an alternative definition.
I have adopted what I think is a better definition of what constitutes an organization: An organization is a unique collaboration of stakeholders for the purpose of realizing goals they could not achieve by themselves. The trick to performance management is not to align everyone to the same goals and objectives, but in finding ways to bridge conflicting goals and objectives.
In my view, Frank is being modest here. This is not just the trick to performance management; it is the trick (but not the answer) to life, the universe and everything. Once you realize that not everyone wants the same things you do, the world gains quite a few interesting dimensions.
So don’t try to align everyone to your goals and objectives, but find ways to bridge the gaps and resolve conflicts. There is no need for everyone to agree to want the same thing, if we can find a solution where everyone can have what he or she wants.
Please, feel free to disagree with me in the comments; I’m sure we can work things out, and learn a thing or two on the way. 🙂
At first, the term Business Intelligence (abbr. BI) had me a bit confused. Contrary to my initial interpretation, BI does not concern the mental acuity of corporations or their employees.
Intelligence, as in ‘Feudal Japan often used ninja to gather intelligence.’
Rather than ‘Corporate IQ’, Business Intelligence should thus be interpreted more like ‘Decision Information’.
We are not trying to make business people smart. We are simply working to give business people the information they need to make informed decisions.
We are Corporate Ninjas.
[Disclaimer: I have no intention of implying that BI specialists or business people lack mental skills; I am simply trying to explain how I think the term BI should be interpreted, and consequently what my focus as a consultant should be.]