DTRules: Balanced vs Unbalanced Decision Tables

A Balanced Decision Table has little if any relationship to the idea of balancing a Decision Tree. With Balanced Decision Tables, the user defines *every* path through the logic; with Unbalanced Decision Tables, the user defines only a subset of paths through the logic, and an additional rule ("First" rule or "All" rule) is used by the system to define the missing paths.

The process of balancing a decision table often makes the resulting logic very difficult to understand.

This question has some very real-world applications. One of the most difficult issues in creating Decision Tables using the Rules Engine from the Texas TIERS project is the fact that all decision tables have to be balanced. This same Rules Engine is also used on the Colorado CBMS project, the California CalWIN project, and the Michigan BRIDGES project. All of these projects require the Decision Tables used by the Rules Engine to be balanced, and all of them have on the order of 3000 decision tables!

In this post, we will examine one such table taken from the Texas TIERS project:

We are going to concentrate on the four lower quadrants of the table, which hold the Conditions and Actions, shown below. What we are ignoring is the actual name of the table (the first line) and various comments that aid developers in developing and maintaining the tables.

So considering only the hardcore "Decision Table" part of this table, we have:

Someone trying to maintain this decision table might ask any one of a number of obvious questions:

Under what conditions will an individual be Excluded?
Under what conditions will they be Included?
Under what conditions will a notice Reason be set?

These questions *can* be answered. But since the individual is excluded from the eligibility group in 7 discontinuous columns, and included in the eligibility group in 5 discontinuous columns, answering questions like why an individual is included or excluded just by looking at this table can be very difficult.

And if you find the table difficult to understand, clearly modifying it (such as adding a new policy that might further exclude individuals from the eligibility group) is going to be very difficult. Such a modification requires one to understand what qualifies the client to be included, or excluded, as well as the conditions under which the notice reasons are set. One then makes the modification and then must re-balance the table. (A future post will take this example, demonstrate the difficulty in modifying a balanced table, and compare that to modifying an unbalanced table.)

Unbalanced tables sacrifice the goal of documenting every path through the logic in favor of documenting the policy more clearly. The term "sacrifice" will ultimately prove to be too strong a word, as one can rely on one's development tools to provide the balanced table.

In this post we will modify the given Balanced Decision Table to create an Unbalanced Decision Table, and compare the two. In the future post promised above, where we modify balanced tables, we will pretend to develop this table, then extend the development to include some changes.

Two types of Unbalanced Decision Tables are currently supported by DTRules; each type is balanced automatically by DTRules using one of the following two rules, respectively:

The "First Rule" is used to balance First Tables.

A "First" Decision Table executes the First column which has all of its conditions satisfied. Logically then, the First column is evaluated, then the Second, then the Third, etc. Once a column has been found for which all its stated conditions are met, it is executed, and the execution of the Decision Table is complete. This is the one we are going to use in this example.

The "All Rule" is used to balance All Tables.

An "All" Decision Table executes the actions in every column for which all of its conditions are satisfied. We will discuss this type of Unbalanced Decision Table in a future post.
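The difference between the two rules can be sketched in a few lines of Python. This is only an illustration, not how DTRules represents or executes tables: the dictionary layout, the function names, and the condition names `cond1` and `cond2` are all invented for this example, and running matching columns left to right in the All case is an assumption.

```python
def matches(column, facts):
    """True if every condition the column states agrees with the facts.

    Conditions a column leaves out are treated as "don't care".
    """
    return all(facts[name] == (state == "y")
               for name, state in column["conditions"].items())


def run_first(columns, facts):
    """First Table: run only the leftmost column that matches."""
    for column in columns:
        if matches(column, facts):
            return list(column["actions"])
    return []  # no column matched


def run_all(columns, facts):
    """All Table: run every matching column (left to right is assumed)."""
    actions = []
    for column in columns:
        if matches(column, facts):
            actions.extend(column["actions"])
    return actions


# Hypothetical two-column table; the condition names are invented.
columns = [
    {"conditions": {"cond1": "y"}, "actions": ["set notice EL0007"]},
    {"conditions": {"cond2": "y"}, "actions": ["set notice EL0008"]},
]
facts = {"cond1": True, "cond2": True}

print(run_first(columns, facts))  # ['set notice EL0007']
print(run_all(columns, facts))    # ['set notice EL0007', 'set notice EL0008']
```

With both conditions true, the First rule stops at the first column, while the All rule runs both.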

A Note on Efficiency

Unbalanced Decision Tables are just as efficient as Balanced Decision Tables. In fact, all Unbalanced Decision Tables are converted into Balanced Decision Tables for execution by the system.
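To make the idea of that conversion concrete, here is a rough sketch of expanding a First Table into one entry per combination of condition outcomes. It is not the DTRules algorithm, which this post does not show; the data layout, the function name `balance_first_table`, and the conditions `cond1` and `cond2` are assumptions made for the example. A real balanced table would also merge paths that share actions, which this sketch does not attempt.

```python
from itertools import product


def balance_first_table(condition_names, columns):
    """Expand a First Table into one entry per combination of outcomes."""
    balanced = {}
    for outcome in product([True, False], repeat=len(condition_names)):
        facts = dict(zip(condition_names, outcome))
        balanced[outcome] = []  # stays empty if no column matches
        for column in columns:
            stated = column["conditions"].items()
            if all(facts[name] == (state == "y") for name, state in stated):
                balanced[outcome] = list(column["actions"])
                break  # First rule: later columns are ignored
    return balanced


# Hypothetical unbalanced table: two exclusion columns and a default.
columns = [
    {"conditions": {"cond1": "y"}, "actions": ["exclude"]},
    {"conditions": {"cond2": "y"}, "actions": ["exclude"]},
    {"conditions": {}, "actions": ["include"]},  # default, always matches
]

table = balance_first_table(["cond1", "cond2"], columns)
for outcome, actions in table.items():
    print(outcome, actions)
```

Every one of the four paths through the two conditions now has an explicit set of actions, which is what "balanced" means here.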

Converting the Balanced Decision Table into an Unbalanced, "First" Decision Table

In this case, the table describes a set of circumstances under which the individual will be marked as ineligible. Under each of these conditions, different notice reasons are set to indicate why the individual was excluded from the eligibility group. A careful examination shows that if the individual is determined to be included in the certified group, then the set of actions is always the same. If all the actions are the same when included, then we can combine all of these columns!

The first step in simplifying this table is to remove all the columns that mark the individual as being included in the certified group. These are the columns in which eligibilityGroupIndicator gets set to InCG, marked in yellow below:

We will delete them, and add a default column at the end which always matches. The idea here is that we will have a list of columns up front that test for all the conditions under which we exclude the individual. If the individual is excluded, then we set the NoticeReason to the code that tells why. This is a First Table. Once a column matches, the following columns will be ignored. This fits the logic of this table nicely!
Note that if we cannot find a reason to exclude the individual, we should mark the individual as certified in the group (i.e. set the eligibilityGroupIndicator = InCG):

This is about all we can do to eliminate columns. Now the question is whether the order of the columns is the best one for describing the policy. If you notice, one notice reason ('EL0009') is set in columns 3 and 5, while another ('EL0005') is set in columns 4, 6, and 7. The two notice reasons are tangled together. Let's look at what happens when we simply move the columns around so that the columns driving the policy for 'EL0009' sit together, followed by the columns that drive the 'EL0005' notice. All we have to do is swap columns 4 and 5:
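One caution about reordering: in a First Table, two adjacent columns can be swapped without changing behavior only if they can never both match. A simple sufficient test is that some condition is stated in both columns with opposite states. The sketch below shows that test; the function name and the condition names are invented, and it assumes the conditions are independent yes/no tests.

```python
def mutually_exclusive(col_a, col_b):
    """True if some condition is stated in both columns with opposite states."""
    return any(name in col_b and col_b[name] != state
               for name, state in col_a.items())


# Hypothetical columns that still carry explicit 'y' and 'n' states.
col_4 = {"cond3": "y", "cond4": "n"}
col_5 = {"cond3": "n", "cond4": "y"}
print(mutually_exclusive(col_4, col_5))  # True: safe to swap

# Hypothetical columns that share no stated condition.
col_a = {"cond1": "y"}
col_b = {"cond2": "y"}
print(mutually_exclusive(col_a, col_b))  # False: order matters
```

Columns that still carry all of their explicit states from a balanced table pass this test, so they can be moved freely. Columns with don't-care entries may not pass it, and then their order is part of the logic.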

Now we cannot do anything more with the columns, but perhaps we can reduce the noise in the tables themselves.

Remember this is a "First" Table. Column 1 states only one condition, a 'y' on condition 1. If condition 1 is true, column 1 matches and executes, and none of the following columns is ever reached. So any later column is only evaluated when condition 1 is false. Yes, you can flood the rest of the columns with 'n' entries for condition 1, but they are simply not required. And if one of the following columns mistakenly had a 'y' on condition 1, the surrounding 'n' entries would only hide the fact that such a column is dead. Good tools would warn you about this, but the point is that noise makes catching errors harder.

With a First Decision Table, we can get rid of all the 'n' states that follow the 'y' on condition 1 in our example. Basically the rest of the table can pretend the first row just doesn't exist. Then for the same reasons, we can get rid of all the 'n' states following the 'y' in column 2, condition 2.

Note we cannot do this for any of the rest of the conditions since column 3 might fail on any one of the remaining 3 conditions for which it specifies a state.
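The claim that those 'n' entries are redundant can be checked by brute force on a small stand-in table. The sketch below compares a "noisy" First Table with its cleaned-up version over every combination of condition outcomes. The tables are hypothetical (only the notice codes and InCG come from this post), and the helper names are invented for the example.

```python
from itertools import product


def first_match(columns, facts):
    """Return the result of the leftmost column whose conditions all hold."""
    for conditions, result in columns:
        if all(facts[name] == (state == "y")
               for name, state in conditions.items()):
            return result
    return None  # no column matched


def same_behavior(table_a, table_b, condition_names):
    """Compare two First Tables on every combination of outcomes."""
    for outcome in product([True, False], repeat=len(condition_names)):
        facts = dict(zip(condition_names, outcome))
        if first_match(table_a, facts) != first_match(table_b, facts):
            return False
    return True


# Hypothetical tables with invented condition names.
noisy = [
    ({"cond1": "y"}, "EL0007"),
    ({"cond1": "n", "cond2": "y"}, "EL0008"),
    ({"cond1": "n", "cond2": "n"}, "InCG"),
]
clean = [
    ({"cond1": "y"}, "EL0007"),
    ({"cond2": "y"}, "EL0008"),
    ({}, "InCG"),  # default column, always matches
]

print(same_behavior(noisy, clean, ["cond1", "cond2"]))  # True
```

The same check fails if the cleaned columns are reordered, which is a reminder that once the 'n' entries are gone, column order carries meaning.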

Now our example looks like this:

We also can suddenly see clearly that the first action is exactly the same as the 7th action! This mistake would be difficult to miss with an Unbalanced Decision Table, because the noise is reduced, and the reviewer can look for patterns (like two sets of actions that exactly match each other!).

We can clean up the actions further by getting rid of the Exit statement. Exit statements never did anything in Texas, or any of the projects that have inherited this rules engine. Exit statements are a way of documenting which tables call which tables. They are a legacy artifact, and can be removed to reduce the "Noise" in the table.

Other Simplifications

I just used a number of these in my examples without comment, but I'll comment here. In the interest of cleaning up the editing, DTRules no longer requires the '-' dash for a "don't care". This alone eliminates all sorts of useless typing. Blank columns and blank rows can be used to separate logic within a decision table (if that makes things clearer). The '*' has come to mean a condition that always matches. It can be used to catch all the defaults at the end of a First Table, as in this example.

Deleting the redundant first action and the unneeded Exit statement in the last action leaves us with:

Is this version of the Decision Table a big improvement? YES! Let's go back to the questions that we might ask about a decision table listed before we started this!

Under what conditions will an individual be Excluded?
Under what conditions will they be Included?
Under what conditions will a notice Reason be set?

Answer to Question 1: Columns 1-7, as indicated by tests circled in orange below

Answer to Question 2: Column 8, i.e. if not excluded, the client will be Included. Circled in red below

Answer to Question 3: More complex, but each set circled in blue below

'EL0007' See Column 1
'EL0008' See Column 2
'EL0009' See Columns 3 and 4
'EL0005' See Columns 5, 6, and 7

Could we have done the same thing with the original table? After all, it was just four more columns, right? Well yes, but here is what THAT looks like:

But is this First Table the same as the Original Balanced Table?

Do we have a way to prove the simple logic does the same thing that the complex logic does? The short answer is "Yes." DTRules produces the balanced version for all decision tables at compile time. If we look in the balance.txt file, we will find the following balanced conditions generated from our simplified, Unbalanced Decision Table:

This exactly matches the conditions and actions we started with (less the redundant first action and the non-functional Exit statement). That is because DTRules uses the same algorithm to print the balanced form of the decision table that a developer would use to format the decision table to validate the table as balanced.

Future Posts

Of course, this post is only scratching the surface. Unbalanced Decision Tables are not only easier to understand, but easier to modify. We will have a post for that. We also mentioned but didn't explain "All" Decision Tables. We will have a post for that. The actual language used in DTRules will be factored out to allow a project to choose the language it will use without any modification to DTRules. We will have a post for that.

DTRules

Sunday, March 14, 2010
