Forever Learning

Forever learning and helping machines do the same.

Author Archive

Art-Rock Allegory

Contrary to what it may seem like, I have been busy writing; just not for this particular blog.

[ Crossposting from Quora where I tried to answer the question "Why hasn't anyone ever seen an animal give birth to an animal of a different species?" ]

An answer in an art-rock allegory.

Imagine you are carving a statue out of a huge chunk of solid rock. Before you begin, the rock is just a rock. After you are done, the rock is gone and replaced by a marvelous (one would hope) piece of art.

As you are chiseling away you start to wonder: with which exact blow of the hammer does the rock suddenly transform from stone into sculpture? Why do we not suddenly see art appear where there was only a boulder before?

Individual animals are like chips falling to the ground as mother nature patiently carves up new marvels. Species are imaginary constructs we use to group individual pieces of the puzzle, but they can only hint at the true nature of the masterpiece they were hewn from.

When considered up close, each animal is neither rock nor art.

Art rock.

Written by Lukas Vermeer

May 13, 2012 at 19:47

Posted in Meta

The Middle Way

James Taylor is spot-on.

Too many analytic professionals think that only the data speaks and that business rules are, as someone once said to me, “for people too stupid to analyze their data”. Similarly too many IT professionals think that everything can be reduced to business rules or to code using explicit analysis. The reality for most decisions is somewhere in between.

In order to truly achieve business transcendence one must follow the Middle Way.

Written by Lukas Vermeer

May 2, 2012 at 14:59

Waterfall Predictions in Oracle Real-Time Decisions

[ Crossposting from the Oracle Real-Time Decisions Blog. ]

Facet Based Predictions are a powerful method to increase predictive accuracy and facilitate rapid learning and knowledge transfer, but the simple approach described in an earlier post comes at a price. By using a single facet rather than individual choices for prediction, we decrease the granularity of our predictions. Choices that share the same facet value will be treated as equals by our predictive model; even when important distinctions could be made after sufficient feedback is collected, our simple facet based model will never learn to exploit these differences.

In most cases, the advantages of facet based models will outweigh the drawback of a reduction in granularity. This is especially true in implementations where shelf-life is short and no individual choice is ever expected to gather enough responses to build a predictive model. However, sometimes we will want to combine the power of facet based prediction with the accuracy of models defined at the lowest level of granularity; for instance when some choices are expected to collect sufficient feedback while others are not.

The complete and open decision management framework architecture of Oracle Real-Time Decisions allows us to blend predictive models in several ways. In this post, we will describe how we can mix-and-match two models at different levels of detail using an approach we call waterfall prediction.

Waterfall Prediction

In essence, the waterfall method described here will try to predict a likelihood at the lowest possible level of granularity. If the choice based model at this grade has not received enough feedback to be considered mature, we will resort to a facet based model.

This implementation will build on the example described in our previous post about facet based prediction.

Product Model Setup

In addition to the facet based category events model configured earlier we will need a choice based event model Product Events. This model will predict likelihoods of events (Accepted and Ordered) based on feedback for individual choices.

Product Model Setup

Recording Events

Previously, we would record feedback events only for our facet based model. As we now have two models at different levels of granularity, we will alter our code slightly to ensure we record any event against both models.

// create a new choice to represent the product
ProductsChoice p = Products.getChoice(request.getChoice());
// create a new choice to represent the category attribute
CategoriesChoice c = new CategoriesChoice(Categories.getPrototype());

// set properties of the category choice
c.setSDOId("Categories$" + p.getCategory());

// record choice in models (catching an exception just in case)
try { p.recordEvent(request.getEvent()); } catch (Exception e) { logError(e); }
try { c.recordEvent(request.getEvent()); } catch (Exception e) { logError(e); }

Note that we are recording the same event twice, but against two separate choices p and c representing the two different levels of granularity. The Oracle Real-Time Decisions framework will automatically ensure the relevant models are updated accordingly.
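To make the double bookkeeping concrete, here is a minimal sketch in plain Java. It does not use the RTD API; the class DualLevelTally and its methods are hypothetical stand-ins that simply count events per key, assuming the same "{ChoiceGroupId}${ChoiceLabel}" key format used above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the two event models; not part of the RTD API.
final class DualLevelTally {
    // event counts keyed by "<SDOId>|<event>"
    private final Map<String, Integer> counts = new HashMap<>();

    // record one event against both the product and its category
    void record(String product, String category, String event) {
        increment("Products$" + product, event);
        increment("Categories$" + category, event);
    }

    // number of times the event was recorded against the given key
    int count(String sdoId, String event) {
        return counts.getOrDefault(sdoId + "|" + event, 0);
    }

    private void increment(String sdoId, String event) {
        counts.merge(sdoId + "|" + event, 1, Integer::sum);
    }

    public static void main(String[] args) {
        DualLevelTally tally = new DualLevelTally();
        tally.record("Espresso", "Coffee", "Accepted");
        tally.record("Latte", "Coffee", "Accepted");
        System.out.println(tally.count("Products$Espresso", "Accepted"));
        System.out.println(tally.count("Categories$Coffee", "Accepted"));
    }
}
```

In this toy run each product has a single recorded event, while the shared category has already collected two. That difference in volume is what the waterfall approach exploits.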

Predicting Likelihoods

In this implementation, a new function will be used to predict likelihoods for our products. Rather than just returning the likelihood at the category level like before, this function will first check whether the more granular model has received enough feedback (in this case 100 positive events) to be considered mature. If the product model is deemed sufficiently trained, the function will use this model instead of the more general facet based one.

// get instance of the model used for predicting Product Events
ProductEvents productmodel = ProductEvents.getInstance();
// get instance of the model used for predicting Category Events
CategoryEvents categorymodel = CategoryEvents.getInstance();

// check if model for Product Events is sufficiently trained
if (productmodel.getChoiceEventModelCount(ModelCount.POSITIVE_COUNT, product.getSDOId(), event) >= 100)
{
    // return the likelihood based on the Product Event model
    return productmodel.getChoiceEventLikelihood("Products$"+product.getSDOId(), event);
}
// else if the model is not sufficiently trained
else {
    // return the likelihood based on the Category Event model
    return categorymodel.getChoiceEventLikelihood("Categories$"+product.getCategory(), event);
}

Waterfall Function

Proper decision design and continuous in-live testing are crucial here. Precisely how much feedback should be considered “enough feedback” can differ wildly between implementations and use-cases. Moreover, some implementations might also permit the use of the quality of the models as reported by RTD, rather than the number of positive events, to determine the cascade threshold. Decision Center is an indispensable tool in this process.
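Because the right threshold depends on the implementation, it can help to treat it as a parameter rather than a hard-coded 100. The following is a rough sketch in plain Java, not RTD code: WaterfallSketch is a hypothetical class, and its "likelihood" is just the raw fraction of positive events, which is far cruder than a real RTD model.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory illustration of a two-level waterfall; not the RTD API.
final class WaterfallSketch {
    // per key: [positive events, total events]
    private final Map<String, int[]> stats = new HashMap<>();
    // positive events needed before the product level counts as mature
    private final int threshold;

    WaterfallSketch(int threshold) {
        if (threshold < 1) throw new IllegalArgumentException("threshold must be at least 1");
        this.threshold = threshold;
    }

    // record one outcome against both the product and its category
    void record(String product, String category, boolean positive) {
        for (String key : new String[] {"Products$" + product, "Categories$" + category}) {
            int[] s = stats.computeIfAbsent(key, k -> new int[2]);
            if (positive) s[0]++;
            s[1]++;
        }
    }

    // use the product level when mature, otherwise fall back to the category
    double likelihood(String product, String category) {
        int[] p = stats.getOrDefault("Products$" + product, new int[2]);
        if (p[0] >= threshold) {
            return (double) p[0] / p[1];
        }
        int[] c = stats.getOrDefault("Categories$" + category, new int[2]);
        return c[1] == 0 ? 0.0 : (double) c[0] / c[1];
    }

    public static void main(String[] args) {
        WaterfallSketch w = new WaterfallSketch(2);
        w.record("Espresso", "Coffee", true);
        w.record("Espresso", "Coffee", false);
        w.record("Latte", "Coffee", false);
        w.record("Latte", "Coffee", false);
        System.out.println(w.likelihood("Espresso", "Coffee")); // category level
        w.record("Espresso", "Coffee", true);
        System.out.println(w.likelihood("Espresso", "Coffee")); // product level
    }
}
```

With a threshold of two positive events, the first call falls back to the category fraction (0.25). After one more positive event the product is considered mature and its own fraction (two out of three) is returned.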

Choice Group Scores Setup

Similar to before, on the scores tab for the Products choice group we configure the Likelihood performance goal to be populated by the new WaterfallLikelihood function instead of the PredictLikelihood function.

Waterfall Score

These simple changes to our previous example empower our new implementation to benefit from two models at varying levels of granularity; leveraging both the accuracy of choice based models and the advantages of facet based prediction.

The Power of Waterfall Prediction

Waterfall prediction is a compelling example of how Oracle Real-Time Decisions enables us to blend multiple real-time models for use in rapid decisioning. This advanced approach to modeling can easily be expanded to cover more than just two levels and more than one hierarchy to further improve the predictive prowess of an RTD implementation.
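As an illustration of what more than two levels could look like, here is a hedged sketch in plain Java. It is not RTD code: CascadeSketch, the "Departments" level and the raw-frequency likelihood are all assumptions made for the example. The caller passes keys ordered from most to least granular, and the first level with enough positive events wins.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical multi-level cascade; not part of the RTD API.
final class CascadeSketch {
    // per key: [positive events, total events]
    private final Map<String, int[]> stats = new HashMap<>();
    private final int threshold;

    CascadeSketch(int threshold) {
        if (threshold < 1) throw new IllegalArgumentException("threshold must be at least 1");
        this.threshold = threshold;
    }

    // record one outcome against every level in the hierarchy
    void record(List<String> keys, boolean positive) {
        for (String key : keys) {
            int[] s = stats.computeIfAbsent(key, k -> new int[2]);
            if (positive) s[0]++;
            s[1]++;
        }
    }

    // keys are ordered from most to least granular
    double likelihood(List<String> keys) {
        int[] last = null;
        for (String key : keys) {
            int[] s = stats.get(key);
            if (s != null && s[0] >= threshold) {
                return (double) s[0] / s[1];
            }
            last = s;
        }
        // no level is mature: use whatever the coarsest level has seen
        return (last == null || last[1] == 0) ? 0.0 : (double) last[0] / last[1];
    }

    public static void main(String[] args) {
        CascadeSketch c = new CascadeSketch(2);
        List<String> espresso = Arrays.asList("Products$Espresso", "Categories$Coffee", "Departments$Drinks");
        List<String> cola = Arrays.asList("Products$Cola", "Categories$Soda", "Departments$Drinks");
        c.record(espresso, true);
        c.record(espresso, false);
        c.record(cola, true);
        c.record(cola, false);
        System.out.println(c.likelihood(espresso));
    }
}
```

In the demo neither the product nor the category has two positive events yet, so the prediction comes from the department level. A second hierarchy (say, brand) could be handled with a second ordered list of keys.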

Cascading models allows businesses to express various forms of decision logic that are far more subtle than simple one-step prediction models. In implementations where convergence time is incompatible with business requirements – or there is a low tolerance for random behavior in the initial stages of deployment – this flexibility in designing models and decisions is crucial.

In a future post, we will discuss another approach to amalgamate models, taking us to a whole new level of predictive modeling and analytical insights; combination models predicting likelihoods using multiple child models.

Written by Lukas Vermeer

April 16, 2012 at 17:02

Extrapolation

There are two types of people in this world:

  1. Those who can extrapolate from incomplete data.

[ Via @professorkitteh. Original source unknown. ]

Written by Lukas Vermeer

April 12, 2012 at 11:15

Handshake Authentication

Bruce And Me

London on a Thursday afternoon. The man sitting a few feet away from me reminds me of that infamous security guru whose work I greatly admire; a man who I’d been reading about for years.

Actually, he looks a lot like Bruce Schneier.

But I hesitate; he seems so normal. No cohort of press, no masses of screaming fans and no burly bodyguards; if I can muster the courage I can simply walk up to him and ask.

So I did.

I don’t want to be rude, but is it you?

In retrospect, that was probably one of the most moronic questions I have ever put forward to anyone ever. But Bruce seemed unfazed. I probably wasn’t the first star-struck geek dazzled by his appearance.

Yes, I am him.

Completely flabbergasted, I mumbled something about reading his blog. He asked if I had also read his latest book. I was ashamed to admit that I had not. Someone took our picture and we shook hands.

He had a pretty firm grip and a mighty powerful handshake; that’s when I was certain that this was the man who already has a backup plan for when a 2nd person discovers that P = NP.

Real-life handshake authentication.

Written by Lukas Vermeer

March 16, 2012 at 17:37

Posted in Meta

Selling Ice to Eskimos

Facebook has recently discovered that beyond the uncanny valley of personalized marketing lies the bottomless pit of invasive identity misappropriation.

But there is a deeper problem here. I’ve said it before and I will say it again. Facebook has the data, but they do not have users with shopping intent. Nobody goes to Facebook to buy stuff. Facebook is for meeting friends, like a bar or a club.

Even with the best products in the world and the most detailed private information it is not easy to sell stuff to strangers in bars; unless you’re selling beer.

Written by Lukas Vermeer

February 29, 2012 at 11:00

Posted in Marketing

Big Data is Big

We happen to have one sat in the next building over. Would you guys like to see it?

Oh, boy! Would we!

About twenty other Oracle employees and I are attending a Cloudera training on Hadoop in the Oracle Reading office. Five days packed with information covering a whole new ecosystem filled with some pretty crazy beasts.

Our heads are spinning like a room full of network-attached storage and our pens are humming like a data center cooling system as we attempt to map and reduce every little piece of data they throw at us.

During one of the breaks, we get the opportunity to go see the Oracle Big Data Appliance. Standing in front of this enormous machine, it finally dawns on me what a massive bulk of raw power this really is. A seemingly countless number of disks are mounted in a box higher and wider than myself. Each disk can hold three terabytes of data.

Big Data is Big!

Written by Lukas Vermeer

February 23, 2012 at 19:24

Facet Based Predictions in Oracle Real-Time Decisions

[ Crossposting from the Oracle Real-Time Decisions Blog. ]

The analytical models detailed in a previous post are not only extremely valuable for reporting, but can also be used to predict likelihoods for things other than regular choices. We can for instance generate predictions based on statistics for an attribute of a choice, rather than the choice itself. We use the term facet based prediction to describe this advanced form of generating predictions.

This novel approach to modeling can be applied to significantly improve predictive accuracy and model quality. It can also facilitate the rapid transfer of existing learnings to newly created choices based on their facet values. These capabilities can be of use to practically all implementations, but they are of utmost importance in cases where the number of choices is very high or individual choices have a short shelf life. In these instances, there might simply not be enough time or data to be able to predict likelihoods for individual choices. We can, however, predict likelihoods for certain facets of our choices, as long as their cardinality remains relatively low.

Consider the following example in which we recommend products based on the acceptance of other products in the same category. In our ILS, Oracle Real-Time Decisions will be used to recommend a single product based on a single performance goal: Likelihood.

Choice Groups Setup

Products that may be recommended are stored in a choice group Products (we will use static choices, but this approach could be implemented for dynamic choices also). Product choices have an attribute Category which will contain a category name. We will use a second and separate dynamic choice group Categories to record acceptance of the different product categories.

Choice Group Setup

Note that we never intend to return any choices from the Categories choice group to a client. It is configured using a dummy source and will not contain any actual choices. This group is only used within the ILS for predicting likelihoods. Statistics for this group may however be viewed in decision center reports.

Recording Events

Similar to the example for analytical models, we will record events against a dynamically generated choice representing a facet value rather than against the actual choice. In this example, both the actual choice and the event to record will be passed through a request represented as Strings.

// create a new choice to represent the category facet
CategoriesChoice c = new CategoriesChoice(Categories.getPrototype());
// set properties of the choice (SDOId should be of the form "{ChoiceGroupId}${ChoiceLabel}")
c.setSDOId("Categories" + "$" + Products.getChoice(request.getChoice()).getCategory());
// record event in model (catching an exception just in case)
try { c.recordEvent(request.getEvent()); } catch (Exception e) { logError("Exception: " + e); }

Model Setup

Our model setup is practically identical to before, but this time we’ll enable “Use for prediction”.

Model Setup

Predicting Likelihoods

A function PredictLikelihood will be used to predict likelihoods for our products. The function takes a Products choice and an Event (String) as parameters and returns a Double value representing the predicted likelihood.

// get instance of the model used for predicting Category Events
CategoryEvents m = CategoryEvents.getInstance();
// return the likelihood based on the generated SDOId and the event passed in
return m.getChoiceEventLikelihood("Categories$"+product.getCategory(), event );

Prediction Function

Choice Group Scores Setup

On the scores tab for the Products choice group we configure the Likelihood performance goal to be populated by the PredictLikelihood function using parameters this and “Accepted”. The keyword this refers to the particular choice being scored and will ensure each choice is scored according to its category facet.

Scoring Setup

That is all that is required to score choices against a facet. We can now create decisions and advisors that use these predictions to recommend products based on their categories.

In this example, we have predicted likelihoods based on a single product facet. As a result, products in the same category will be scored the same. In practical implementations this will rarely be an issue, because there will presumably be multiple performance goals. Also, likelihoods may be mixed with product specific attributes like price or cost; resulting in score differentiation between products regardless of equality in likelihoods.
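One way to picture this differentiation is to multiply the shared category likelihood by a product specific price, giving an expected-revenue style score. This is only a sketch under that assumption: BlendedScoreSketch is a hypothetical class in plain Java, and the post does not prescribe this particular formula.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical scoring illustration; not part of the RTD API.
final class BlendedScoreSketch {
    // assumed blend: shared category likelihood times product price
    static double score(double categoryLikelihood, double price) {
        return categoryLikelihood * price;
    }

    // name of the product with the highest blended score, or null if there are none
    static String best(Map<String, Double> prices, double categoryLikelihood) {
        String bestName = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : prices.entrySet()) {
            double s = score(categoryLikelihood, e.getValue());
            if (s > bestScore) {
                bestScore = s;
                bestName = e.getKey();
            }
        }
        return bestName;
    }

    public static void main(String[] args) {
        // two products in the same category share one likelihood
        Map<String, Double> prices = new LinkedHashMap<>();
        prices.put("Espresso", 2.50);
        prices.put("Latte", 3.50);
        System.out.println(best(prices, 0.2));
    }
}
```

Both products receive the same likelihood from their category, yet the blended score still ranks them differently because their prices differ.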

In a later post, we will discuss how we can expand on this to include multiple product facets in our likelihood prediction.

Written by Lukas Vermeer

February 17, 2012 at 12:15

Marketing Personalization and the Uncanny Valley

Dear [prospect.first_name],

Following our last discussion on [prospect.last_contact_date] concerning [prospect.subject_area] I think the following article would be of particular interest to you.

Seth Godin writes:

Sure, it’s easy to grab a first name from a database or glean some info from a profile.

But when you pretend to know me, you’ve already started our relationship with a lie. You’ve cheapened the tools we use to recognize each other and you’ve tricked me, at least a little.

Increased familiarity begets heightened expectations. Personalization has its own uncanny valley.

The uncanny valley is a hypothesis in the field of robotics and 3D computer animation, which holds that when human replicas look and act almost, but not perfectly, like actual human beings, it causes a response of revulsion among human observers.

When you treat your customers as though you know them personally they will be personally offended if you do not. Beware of the eerie hollow of broken promise.

Written by Lukas Vermeer

February 2, 2012 at 15:33

Analytical Models in Oracle Real-Time Decisions

[ Crossposting from the Oracle Real-Time Decisions Blog. ]

As explained in a previous post, we can record events against unsourced dynamic choices created on-the-fly using the getPrototype method. Choices instantiated in this fashion, and the events recorded against them, will be visible in decision center reports.

This enables us to create extensive reporting based on arbitrary input from different sources without the need to specify all the possible choice values upfront. Creating so-called analytical models can be very useful for analysis.

Recording Client Input

Consider the following example which shows how this approach can be used to create an analytical model based on informant input. In our ILS, Oracle Real-Time Decisions will be used to find and report on correlations between a regular session attribute and arbitrary codes passed through an informant.

Choice Group Setup

A choice group Reason is used to store codes passed through the informant. During initialization, the choice group will attempt to grab choices from the ReasonEntityArray, but the array is a dummy entity that will always return nothing, because we’ve not defined a value for it.

Reason choice group configuration, dynamic choices tab.

Reason choice group configuration, group attributes tab.

Informant Setup

When invoked, a RecordReason informant will record an event for the ReasonCode input parameter. The logic for this informant is pretty straightforward.

// create a new choice based on the request attribute (a string that describes the reason)
ReasonChoice c = new ReasonChoice(Reason.getPrototype());
// set properties of the choice (SDOId should be of the form "{ChoiceGroupId}${ChoiceLabel}")
c.setSDOId("Reason" + "$" + request.getReasonCode());
// record choice in model (catching an exception just in case)
try { c.recordChoice(); } catch (Exception e) { logTrace("Exception: " + e); }

Model Setup

In order to actually find and report on correlations, we will need to define at least one event model on our choice group. For this example, we’ll keep things as simple as possible.

Reasons choice event model configuration.

Reports

The reports in decision center will show the reason codes sent to the informant as if they were dynamic choices and calculate statistics and correlations against session attributes.

(In this example, an Oracle Real-Time Decisions Load Balancer script was used to send four different codes to the ILS with a severe bias towards certain age groups.)

Decision center report for Reason choice, analysis tab.

This approach enables us to generate detailed reporting and analysis of more than just regular choices in the familiar decision center environment. In this example we were using informant input, but this technique can also be applied using the attributes of other choices to gain additional insight into the correlations between session attributes and choice attributes like product group or category (rather than individual choices).
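To give a feel for the kind of statistic such a report surfaces, here is a small sketch in plain Java that tallies arbitrary reason codes against a session attribute. It is not RTD code: ReasonReportSketch, the sample codes and the age groups are all made up for illustration, and the real reports compute considerably more than a simple share.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical tally of reason codes per session attribute value; not the RTD API.
final class ReasonReportSketch {
    // counts.get(reasonCode).get(ageGroup) = number of recorded events
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    // codes need not be known upfront; they are created on first use
    void record(String reasonCode, String ageGroup) {
        counts.computeIfAbsent(reasonCode, k -> new HashMap<>())
              .merge(ageGroup, 1, Integer::sum);
    }

    // share of events for a reason code that came from the given age group
    double share(String reasonCode, String ageGroup) {
        Map<String, Integer> byAge = counts.get(reasonCode);
        if (byAge == null) {
            return 0.0;
        }
        int total = 0;
        for (int n : byAge.values()) {
            total += n;
        }
        return total == 0 ? 0.0 : (double) byAge.getOrDefault(ageGroup, 0) / total;
    }

    public static void main(String[] args) {
        ReasonReportSketch report = new ReasonReportSketch();
        report.record("PRICE", "18-25");
        report.record("PRICE", "18-25");
        report.record("PRICE", "18-25");
        report.record("PRICE", "60+");
        System.out.println(report.share("PRICE", "18-25"));
    }
}
```

In this toy data three out of four "PRICE" events come from the youngest age group, the sort of bias the load-balancer script above was set up to produce.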

This method can also be used in conjunction with predictive models. We will explore this possibility and its applications in future posts.

Written by Lukas Vermeer

January 24, 2012 at 11:48
