Don’t Fear the Statistics – Using OBI for Statistical Analysis Part 1

Recently, Ranzal has been working with a client in the healthcare space implementing Oracle Business Intelligence (OBI), and a requirement surfaced to translate a scorecard report into an OBI dashboard. One of the data elements was simply captioned “Trend” and colored red, yellow, and green.  It was discovered that this Trend was the slope of a linear regression plot (more on what that means in a moment) and the color was based on an arbitrarily chosen number.  This immediately raised some concerns from the Ranzal team who then made some suggestions for more pertinent statistical analysis.

To set the stage, this healthcare client’s summarized (and greatly simplified) income statement divides Revenue into Inpatient and Outpatient and Expenses into Total Labor and Non Labor. Revenue and expenses are the primary focus of much of the analytics at an aggregate level.  A single (seemingly arbitrarily chosen) number was used to determine the colored flags for each of these measures.  This was despite Inpatient Revenue and Non Labor Expenses comprising the majority of the revenue and expense amounts (respectively).  If we were to plot out these categories for the first five months of a fiscal year, we see the following (all data have been altered to preserve client confidentiality without overly affecting the overall analytic output):

figure-1

Figure 1 Revenue and Expense Trend Plot

The trouble with plotting a trend of numbers is that it is sometimes difficult to understand, at a glance, how the organization is performing. In the plots above, clear downward and upward trends can be seen for Inpatient Revenue and Total Labor Expense (respectively).  However, upon closer examination of Outpatient Revenue and Non Labor Expense, there are two upward trending months and two downward trending months.  The overall trend is difficult to discern.

With the introduction of Oracle Business Intelligence Enterprise Edition (OBIEE)12c, a Trendline function was introduced that allows the creation of a linear regression trendline. Once this is applied, the above trend plots can be augmented to get a clearer picture of performance:

figure-2

Figure 2 Revenue and Expense Linear Regression

This trendline uses a simple linear regression formula that is comprised as the slope (commonly represented by the letter m) and Intercept (commonly represented by the letter b) in the following formula:

y = mx + b

In our trend plots, the letter y represents the revenue and expense categories and x represents the fiscal periods.

The intercept is where the trendline crosses the y-axis when x is equal to zero. For most statistical analyses, the intercept is unimportant.  The slope can be thought of the average change over the two parameters.  Using OBI, the slope of each revenue and expense category can be calculated and the dashboard updated:

figure-3

Figure 3 Linear Regression Slope

In the example above, the slope of the Inpatient Revenue can be thought as decreasing an average of $291,000 a month.

One issue with using the slope is that it is subjective. As was mentioned, our healthcare client had chosen a single arbitrary slope for each of the revenue and expense categories.  The slopes in the example above range from 29 thousand to -291 thousand.  Complicating matters, the client wanted the ability to run these Analysis for individual hospitals which can dramatically affect the slope.  For instance, a hospital operating in Kansas City will probably not have the same revenue growth (or shrinkage) as a hospital operating in New York City.  To use the slope as a quantifiable objective properly, a target slope would have to be determined for the enterprise and at each granular level expected to be benchmarked (hospital, department, etc.).  This creates some obvious maintenance issues.

A more objective approach is to use the correlation coefficient, a number on a range from negative one to positive one. A correlation ranking of one indicates a positive correlation while a ranking of negative one indicates a negative correlation.  For instance, for most companies, the number of units sold is often has a high degree of positive correlation to revenue.  This would correspond to a correlation coefficient of close to one.  For many companies working in the commodities market, the more competitor’s revenue increases, the lower the possible market share.  This would be a negative correlation and result in a correlation coefficient calculation of negative one.  A correlation coefficient of zero indicates a lack of any correlation.  For instance, the number of broken arms set in a New York hospital is probably uncorrelated to the number of bowls of soup served by Panera Bread in Kansas City.

It is worth noting that correlation does not mean causation. For example, consider the number of pirate attacks and users of Microsoft Internet Explorer (IE) users:

figure-4

Figure 4 IE Usage and Pirate Attacks

The number of pirate attacks and IE users have both been in decline since 2009. As can be seen by the scatter graph on the right, the more pirate attacks, the greater the use of IE.  Regardless, naval security experts are probably not asking for adoption rate reports from Microsoft.

Returning to the client’s use case, adding the correlation coefficient to the dashboard provides a greater understanding of how the company is objectively performing:

figure-5

Figure 5 Month and Revenue / Expense Category Figure Correlation

Inpatient Revenue has a correlation of -0.69, which is moderately significant for a metric most businesses want to increase. Conversely, the Outpatient Revenue has a slightly negative correlation of -0.36.  While this should be a cause for concern, a “wait and see” approach (or deeper dive into Outpatient Revenue Categories) might be more prudent.  Because the range of the correlation coefficient is negative one to one, filtering this analysis down to a more granular level, such as a hospital or department, will return an objective number that can be subjected to independent interpretation.

There are cases in which the subjectivity of the slope is particularly useful. In the case of our client, a full year budget was prepared at the beginning of the fiscal year and periodically updated as the year progressed. The slope of this budget could be used to generate the average dollar change desired per month.  The advantage of this is that it reduces the possible volatility of a particular month into a single number that can be compared to the benchmark.  As a final addition to the dashboard, a full year budget slope was added:

figure-6

Figure 6 Full Year Budget Slope

With the exception of Non Labor Expenses, this organization is missing the mark on all of their budgetary goals, and the trend indicated by the actual slope and correlation coefficient means this situation is likely to get worse.

A word of warning about statistics in general and the use of slope and correlation coefficient in particular: micro and macro trends can should be considered and extreme outliers can mask actual trends.

For an example of micro and macro trends, consider JCPenney, a retailor that has been struggling since 2010. The following visualization (created using Oracle Data Visualization Desktop) charts the quarterly revenue from 2004 Q3 to 2016 Q4 along with the trendline for the entire period.  The bars represent the correlation coefficient to that particular quarter (i.e. the first bar is the correlation between 2004 Q3 and 2004 Q4 while the second bar is the correlation between 2004 Q3, 2004 Q4, and 2005 Q1, etc.):

figure-7

Figure 7 JCPenney Revenue Trend and Correlation

Notice that the first correlation bar is equal to one. When there are only two data points, the correlation coefficient will be equal to one, negative one, or zero.  The next data point and correlation for 2005 Q1 (JCPenney recognizes holiday revenue in Q1 of each year) continues the high correlation streak, however, the following quarter drops the correlation down to 0.35.  The correlation fluctuates quarterly until about 2012 Q2 when the definite downward trend is established.

A savvy analyst will break JCPenney’s performance during this time range into three distinct trends. Upward trending from 2004 to 2008 Q1, diminished upward trend from 2008 Q2 to 2012 Q1, and then a flat, but greatly reduced revenue from there:

figure-8

Figure 8 JCPenney Distinct Trends

As an example of how an extreme outlier can affect statistical analysis, consider GTx Incorporated, a pharmaceutical drug developer. In December 2010, GTx recognized $49.9 million dollars in revenue from a partnership with Merck& Co., Inc., which spiked GTx’s revenue (previously averaging $2 million a quarter) to $56.7 million dollars:

figure-93

Figure 9 GTx Incorporated Revenue Trend

In the visualization above, the orange projected trendline was calculated using revenue from 2004 Q1 through 2009 Q4. The purple trendline is the projected calculated using 2010 Q1, which includes the huge revenue spike.  Obviously, the orange trendline is the more accurate due exclusion of the extreme data point.

Statistical analytics is part science/technology and part art. As with any data and visualizations, a certain degree of intelligent interpretation is needed to determine what it all really means.  Functional users should be focused on what the various statistical interpretations mean and not be distracted on the complexity of the underlying mathematical functions.  Trend visualizations can aid users in understanding how to interpret these statistical calculations.  Many organizations miss opportunities because of individuals unwilling to embrace statistical methods due to the lack of solid education and guidance about what these numbers really mean.  Training, change management, and the creation of rich visualizations can help enterprises harness the capabilities of statistical analysis and extend the role of their business intelligence systems.

Oracle Business Intelligence Essbase Cube Builder: Continued Integrations with Essbase and OBIEE

Oracle has been busy focusing on improved consolidation of their BI applications this past year. This next offering within Oracle’s arsenal of BI apps will allow for enhanced integrations between the multi-dimensional and relational worlds of Essbase and OBIEE. The new application is called Oracle Business Intelligence Essbase Cube Builder (OECB) previously referenced as Oracle Essbase Integrator (OEI) during beta testing and its purpose will help refine and make for more consistent binary integrations. OECB will permit OBI apps to push metadata and data into Essbase cubes, a significant advancement for OBIEE users where leveraging in-depth ad hoc and what-if scenarios can be available. You can download OECB on the Oracle e-delivery site:

OECB Download

Oracle Business Intelligence Essbase Cube Builder

OECB will bestow an integrated BI platform that supports OLAP capabilities and dashboards among others. OECB will also allow Essbase to leverage metadata and data from OBI and OBI apps. Other supports will include what-if scenarios, allocations, spreading, changing attributes, and multiple hierarchy comparisons.

From a user’s perspective, OECB will provide seamless authentication from OBI to Essbase which reduces maintenance and improves the user experience. Users can also feel confident that metadata and data will have the same context across platforms which mean users will see the same dimensions, members and measures between their OBI and Essbase apps. The environments will be synchronized as updated changes will be retained and applied with data values matching between the two apps. OBI dashboard metadata and data contents will be synchronized to Smartview. OBI physical/business model and mapping/presentation layers will be synchronized to Essbase. Additional support features indicated above affords users with the ability for more robust development. All of this becomes available starting with version 11.1.2. Prior versions will also become available after the official release of OECB.

The implication here is if you’re a traditional OBIEE shop wanting to add flexibility to improve analysis with a powerful application like Essbase, OECB will be your opportunity. As Oracle continues to provide expanded integrations between Essbase and OBIEE, see how this new offering called Oracle Business Intelligence Essbase Cube Builder can be leveraged within your organization.

Creating an OBIEE Repository for Relational Data Source

OBIEE (Oracle Business Intelligence Enterprise Edition)

In this part of my OBIEE blog, I’ll guide you through basic steps to create an Oracle server repository and use it to bring over a relational data source for use in OBI Answers. As of OBIEE version 10.1.3.3.2, Essbase is supported as an OBI data source allowing the user to integrate Essbase data with OBI Answers, OBI Interactive Dashboards, and OBI Publisher. The latest version of OBIEE is 10.1.3.4x. Refer to my part 2 blog (OBIEE and Essbase – Defining OLAP Integration) for importing Essbase content into OBIEE.

For the below steps, I will be using SQL Server 2005 relational database as my source but you can use Oracle 10g or older versions of either tool.

To create an Oracle repository, open the Oracle BI Administration Tool and select File | New and name your new repository.

113

2

Set security by selecting Manage | Security. Assuming you are the admin, select Users | Administrator. Right click Administrator, select Properties to enter an admin password and confirm the password. Close out Security Manager.

32

42

52

62

To import a relational data source, select File | Import | from Database.

71

Select a connection type.

81

Select the relational data source, enter login credentials and click OK.

94

Enter the relational table to import. Deselect all but Tables, Keys, and Foreign Keys and click Import. If you are using Views instead of Tables, change accordingly.

102

112

The relational table will import into the Physical layer of the Administration Tool.

122

132

Test the import by right clicking a column (Branch_Name is selected in this example) and select View Data. Values for the selected column should populate in a view data table.

142

152

Drag the imported relational table folder from the Physical layer to the Business Model and Mapping (BMM) layer. In the BMM layer, here you can create logical tables and joins to develop the type of model needed for presentation.

162

Next, drag the relational table from the BMM layer to the Presentation layer to finalize its presentation for the user. Manipulation by presentation can be performed in this layer for Subject Area use in OBI Answers.

172

For your reference and as a recap from my part 2, here is the significance of each layer:

1. Physical layer – imported tables and views come from the relational data source; physical joins can be performed here
2. BMM layer – this layer organizes imports from the physical layer into logical categories
3. Presentation layer – BMM entities are organized for user presentation

Once your Presentation layer is complete, it can be made available for OBI Answers to create dashboard content bringing both relational and multidimensional data sources into one view. A sample of an Answers view is displayed below combining both relational and multidimensional data sources into a combo box allowing for choice among regions. Any selection of region updates both relational and multidimensional tables for view.

182

192

202

212

This is just one simple example of the product’s capabilities. OBIEE is redefining how we approach BI with the evolution of this product improving on how we develop it. There is a great deal of flexibility within OBIEE for relational and multidimensional reporting and those who understand how to leverage this tool will see its impact upon their organization for the better.