Don’t Fear the Statistics – Using OBI for Statistical Analysis Part 2

Nearly every client Edgewater Ranzal partners with uses statistical averages in their analytic and reporting solutions. As far as statistical functions go, it is probably the easiest to understand, however; the limitation of using the average is that it can be difficult to determine how to rate the individual performance of contributors to that average.  Consider the following examples:

  • The average cost of a gallon of milk is $3.20 and the corner convenience store is selling it for $3.45, is that a significant deviation from the average?
  • If the average NFL player’s base salary is $1.86 million and Tennessee Titan’s Marcus Mariota made $5.5 million, is this an exceptional payout? Is the salary significant when his role as the team’s starting Quarterback is considered?
  • Suppose the average gross margin percent for a company’s business units is 58% and one particular business unit’s actual gross margin is 46%. Is that business unit truly underperforming?

It turns out that the average of a particular measurement is very subjective. In this post, we explore how the standard deviation of the average can be used to mitigate subjectivity and how it can be incorporated into data visualizations to identify true outliers.

The NASDAQ-100 is comprised of the largest domestic and international non-financial companies (based on market capitalization) listed on the Nasdaq Stock Exchange. It includes technology giants such as Apple and Alphabet (parent company of Google) along with consumer services such as Bed, Bath, & Beyond.  The quarterly gross margin percent from 2007 to Q3 2016 was downloaded and loaded into a data mart leveraged by Oracle Business Intelligence Enterprise Edition (OBIEE) 12c.  (Q4 2016 data was not available for all companies).  With the exception of Figure 1, the following visualizations were created in OBIEE 12c.

The standard deviation can be thought of as ranges that can be used to classify individual contributors to the average. For instance, the average gross margin percent for the NASDAQ-100 in Q4 2014 was calculated to be 59.9% with a standard deviation of 22.7%.  This can be visualized on a number line as such:

Figure 1 NASDAQ-100 Q4 2014 Gross Margin % Performance Ranges

dont-fear-statistics-part-2-figure-1

Many real world events that have variability follow a predictable distribution pattern. For instance, it is expected that approximately 34.1% of the contributors will fall between the average and one standard deviation up.  From the figure above, it is estimated that approximately 34 of the NASDAQ-100 will have a gross margin percent between 37.2% and 59.9%.  The actual distribution can be visualized as such:

Figure 2 Distribution of NASDAQ-100 Gross Margin %

dont-fear-statistics-part-2-figure-2

The NASDAQ-100 companies do not perfectly follow the distribution; there is a fatter spread into the Negative and Positive buckets (Two Standard Deviations down and up). Other, more advanced statistical methods can be used to redefine ranges, but are beyond the scope of this post.

Of course, this visualization simply confirms statistical theories that were proven over a hundred years ago. The true value of analytics is to take statistical theories and turn them into informative visuals.  One method of visualizing the ranking of companies using the standard distribution in OBIEE 12c is through a Treemap:

Figure 3 NASDAQ-100 Distribution Treemap Visualization

dont-fear-statistics-part-2-figure-3

The size of the box represents the Gross Margin % while the color aligns with the distribution ranking from Figures 1 and 2. This visualization allows the viewer to understand both the rankings and relative performance at a glance.  It is easy to discern the delineation between above and below average (border between yellow and light green) as well as which companies are herding together.

One of the most powerful and essential aspects of business analytics is the ability to dimensionalize data so it can be sliced and diced. One (of many) reasons this is done is to be sure that there is an “apples to apples” comparison.  For instance, comparing the gross margin percent comparison between Qualcomm (QCOM), a semiconductor and telecommunications company, and Ross Stores (ROST), a discount department store, can create misconstrued distributions.  Filtering the visualization in Figure 3 by the NASDAQ industry classifications for Technology companies results in the following Treemap:

Figure 4 NASDAQ-100 Technologies Companies Treemap

dont-fear-statistics-part-2-figure-4

Notice that Qualcomm has slipped from “Moderately Positive” to “Moderately Negative.” Averages and standard deviations can change dramatically when looking at the components of the whole.  To demonstrate this, consider the following visualization comparing the average and deviation spread of the three largest categories (by number of companies) of the NASDAQ-100:

Figure 5 Average and Standard Deviation by Categories

dont-fear-statistics-part-2-figure-5

The border between yellow and light green represents the average while each band represents one standard deviation. Notice that the average gross margin % as well as the standard deviation is higher for Healthcare than for Technology.  Healthcare companies are going to skew the performance perspective of Technology companies.  This skew worsens when comparing against companies classified as Consumer Service.

As a general rule, a single point is not the best indicator of long term performance. Although the average and standard deviation for a single quarter was calculated through the agglomeration of one hundred companies, it should be considered a single data point.  Consider the following visualizations that show a comparative trend for four different companies for the entire date range downloaded:

Figure 6 Gross Margin % Trend for Adobe, Amazon, Electronic Arts, and Priceline

dont-fear-statistics-part-2-figure-6

At a glance, viewers can see that Adobe (upper left) consistently beats the average performance while consumer goods and technology giant Amazon (upper right) has been performing below average until recently. Electronic Arts (lower left), a video game developer, seems to have erratic gross margin % returns; however, looking past the noise, the company is nearly always between moderately positive and moderately negative when compared against other NASDAQ-100 companies.  Finally, Priceline (lower right) has been increasing gross margin % consistently and steadily pulling ahead of other NASDAQ-100 companies.  If Priceline’s gross margin % trend continues and the performance of the other companies remains constant, Priceline will move into the “Extremely Positive” gross margin % ranking in Q4 2016 or Q1 2017.

Returning to the questions posed at the beginning of this post:

  • The average cost of a gallon of milk is $3.20 with a standard deviation of $0.08. The corner grocery store selling milk for $3.44 is three standard deviations above the average!
  • The average NFL base salary is $1.86 million with a standard deviation of $2.80 million. Comparatively, Marcus Mariota’s $5.50 million salary is one standard deviation above average. However, with the average quarterback base salary being $5.69 million with a standard deviation of $7.17 million, he is actually minimally undercompensated.

For the final question, we ask the reader to evaluate his enterprise:

  • Calculate the average gross margin percent for your company’s business units for the quarter and find the business unit that is approximately 10% less than that average. Are they truly underperforming? Are you able to properly classify these business units to gain the greatest insight into relative performance?

Average and standard deviation can be applied to any metric by which a company wishes to evaluate itself. It can be used in combination with external data to create industry benchmarks.  For instance, if you were to plot your company’s gross margin % performance against the trends above, how would it look?

We want to close this post with the same idea that we closed Part 1 of the “Don’t Fear the Statistics” post: statistical analytics is part science/technology and part art.  Reducing statistical calculations to consumable visualizations is the key.  In the visualizations above, references to “standard deviation” were diligently omitted in favor of familiar terms such as “Moderately Negative.”  Approaches such as this help with change management, adoption, and the acceleration from simple reporting to true analytical insight into business process improvement based on data.

Don’t Fear the Statistics – Using OBI for Statistical Analysis Part 1

Recently, Ranzal has been working with a client in the healthcare space implementing Oracle Business Intelligence (OBI), and a requirement surfaced to translate a scorecard report into an OBI dashboard. One of the data elements was simply captioned “Trend” and colored red, yellow, and green.  It was discovered that this Trend was the slope of a linear regression plot (more on what that means in a moment) and the color was based on an arbitrarily chosen number.  This immediately raised some concerns from the Ranzal team who then made some suggestions for more pertinent statistical analysis.

To set the stage, this healthcare client’s summarized (and greatly simplified) income statement divides Revenue into Inpatient and Outpatient and Expenses into Total Labor and Non Labor. Revenue and expenses are the primary focus of much of the analytics at an aggregate level.  A single (seemingly arbitrarily chosen) number was used to determine the colored flags for each of these measures.  This was despite Inpatient Revenue and Non Labor Expenses comprising the majority of the revenue and expense amounts (respectively).  If we were to plot out these categories for the first five months of a fiscal year, we see the following (all data have been altered to preserve client confidentiality without overly affecting the overall analytic output):

figure-1

Figure 1 Revenue and Expense Trend Plot

The trouble with plotting a trend of numbers is that it is sometimes difficult to understand, at a glance, how the organization is performing. In the plots above, clear downward and upward trends can be seen for Inpatient Revenue and Total Labor Expense (respectively).  However, upon closer examination of Outpatient Revenue and Non Labor Expense, there are two upward trending months and two downward trending months.  The overall trend is difficult to discern.

With the introduction of Oracle Business Intelligence Enterprise Edition (OBIEE)12c, a Trendline function was introduced that allows the creation of a linear regression trendline. Once this is applied, the above trend plots can be augmented to get a clearer picture of performance:

figure-2

Figure 2 Revenue and Expense Linear Regression

This trendline uses a simple linear regression formula that is comprised as the slope (commonly represented by the letter m) and Intercept (commonly represented by the letter b) in the following formula:

y = mx + b

In our trend plots, the letter y represents the revenue and expense categories and x represents the fiscal periods.

The intercept is where the trendline crosses the y-axis when x is equal to zero. For most statistical analyses, the intercept is unimportant.  The slope can be thought of the average change over the two parameters.  Using OBI, the slope of each revenue and expense category can be calculated and the dashboard updated:

figure-3

Figure 3 Linear Regression Slope

In the example above, the slope of the Inpatient Revenue can be thought as decreasing an average of $291,000 a month.

One issue with using the slope is that it is subjective. As was mentioned, our healthcare client had chosen a single arbitrary slope for each of the revenue and expense categories.  The slopes in the example above range from 29 thousand to -291 thousand.  Complicating matters, the client wanted the ability to run these Analysis for individual hospitals which can dramatically affect the slope.  For instance, a hospital operating in Kansas City will probably not have the same revenue growth (or shrinkage) as a hospital operating in New York City.  To use the slope as a quantifiable objective properly, a target slope would have to be determined for the enterprise and at each granular level expected to be benchmarked (hospital, department, etc.).  This creates some obvious maintenance issues.

A more objective approach is to use the correlation coefficient, a number on a range from negative one to positive one. A correlation ranking of one indicates a positive correlation while a ranking of negative one indicates a negative correlation.  For instance, for most companies, the number of units sold is often has a high degree of positive correlation to revenue.  This would correspond to a correlation coefficient of close to one.  For many companies working in the commodities market, the more competitor’s revenue increases, the lower the possible market share.  This would be a negative correlation and result in a correlation coefficient calculation of negative one.  A correlation coefficient of zero indicates a lack of any correlation.  For instance, the number of broken arms set in a New York hospital is probably uncorrelated to the number of bowls of soup served by Panera Bread in Kansas City.

It is worth noting that correlation does not mean causation. For example, consider the number of pirate attacks and users of Microsoft Internet Explorer (IE) users:

figure-4

Figure 4 IE Usage and Pirate Attacks

The number of pirate attacks and IE users have both been in decline since 2009. As can be seen by the scatter graph on the right, the more pirate attacks, the greater the use of IE.  Regardless, naval security experts are probably not asking for adoption rate reports from Microsoft.

Returning to the client’s use case, adding the correlation coefficient to the dashboard provides a greater understanding of how the company is objectively performing:

figure-5

Figure 5 Month and Revenue / Expense Category Figure Correlation

Inpatient Revenue has a correlation of -0.69, which is moderately significant for a metric most businesses want to increase. Conversely, the Outpatient Revenue has a slightly negative correlation of -0.36.  While this should be a cause for concern, a “wait and see” approach (or deeper dive into Outpatient Revenue Categories) might be more prudent.  Because the range of the correlation coefficient is negative one to one, filtering this analysis down to a more granular level, such as a hospital or department, will return an objective number that can be subjected to independent interpretation.

There are cases in which the subjectivity of the slope is particularly useful. In the case of our client, a full year budget was prepared at the beginning of the fiscal year and periodically updated as the year progressed. The slope of this budget could be used to generate the average dollar change desired per month.  The advantage of this is that it reduces the possible volatility of a particular month into a single number that can be compared to the benchmark.  As a final addition to the dashboard, a full year budget slope was added:

figure-6

Figure 6 Full Year Budget Slope

With the exception of Non Labor Expenses, this organization is missing the mark on all of their budgetary goals, and the trend indicated by the actual slope and correlation coefficient means this situation is likely to get worse.

A word of warning about statistics in general and the use of slope and correlation coefficient in particular: micro and macro trends can should be considered and extreme outliers can mask actual trends.

For an example of micro and macro trends, consider JCPenney, a retailor that has been struggling since 2010. The following visualization (created using Oracle Data Visualization Desktop) charts the quarterly revenue from 2004 Q3 to 2016 Q4 along with the trendline for the entire period.  The bars represent the correlation coefficient to that particular quarter (i.e. the first bar is the correlation between 2004 Q3 and 2004 Q4 while the second bar is the correlation between 2004 Q3, 2004 Q4, and 2005 Q1, etc.):

figure-7

Figure 7 JCPenney Revenue Trend and Correlation

Notice that the first correlation bar is equal to one. When there are only two data points, the correlation coefficient will be equal to one, negative one, or zero.  The next data point and correlation for 2005 Q1 (JCPenney recognizes holiday revenue in Q1 of each year) continues the high correlation streak, however, the following quarter drops the correlation down to 0.35.  The correlation fluctuates quarterly until about 2012 Q2 when the definite downward trend is established.

A savvy analyst will break JCPenney’s performance during this time range into three distinct trends. Upward trending from 2004 to 2008 Q1, diminished upward trend from 2008 Q2 to 2012 Q1, and then a flat, but greatly reduced revenue from there:

figure-8

Figure 8 JCPenney Distinct Trends

As an example of how an extreme outlier can affect statistical analysis, consider GTx Incorporated, a pharmaceutical drug developer. In December 2010, GTx recognized $49.9 million dollars in revenue from a partnership with Merck& Co., Inc., which spiked GTx’s revenue (previously averaging $2 million a quarter) to $56.7 million dollars:

figure-93

Figure 9 GTx Incorporated Revenue Trend

In the visualization above, the orange projected trendline was calculated using revenue from 2004 Q1 through 2009 Q4. The purple trendline is the projected calculated using 2010 Q1, which includes the huge revenue spike.  Obviously, the orange trendline is the more accurate due exclusion of the extreme data point.

Statistical analytics is part science/technology and part art. As with any data and visualizations, a certain degree of intelligent interpretation is needed to determine what it all really means.  Functional users should be focused on what the various statistical interpretations mean and not be distracted on the complexity of the underlying mathematical functions.  Trend visualizations can aid users in understanding how to interpret these statistical calculations.  Many organizations miss opportunities because of individuals unwilling to embrace statistical methods due to the lack of solid education and guidance about what these numbers really mean.  Training, change management, and the creation of rich visualizations can help enterprises harness the capabilities of statistical analysis and extend the role of their business intelligence systems.

A Comparison of Oracle Business Intelligence, Data Visualization, and Visual Analyzer

We recently authored The Role of Oracle Data Visualizer in the Modern Enterprise in which we had referred to both Data Visualization (DV) and Visual Analyzer (VA) as Data Visualizer.  This post addresses readers’ inquiries about the differences between DV and VA as well as a comparison to that of Oracle Business Intelligence (OBI).  The following sections provide details of the solutions for the OBI and DV/VA products as well as a matrix to compare each solution’s capabilities.  Finally, some use cases for DV/VA projects versus OBI will be outlined.

For the purposes of this post, OBI will be considered the parent solution for both on premise Oracle Business Intelligence solutions (including Enterprise Edition (OBIEE), Foundation Services (BIFS), and Standard Edition (OBSE)) as well as Business Intelligence Cloud Service (BICS). OBI is the platform thousands of Oracle customers have become familiar with to provide robust visualizations and dashboard solutions from nearly any data source.  While the on premise solutions are currently the most mature products, at some point in the future, BICS is expected to become the flagship product for Oracle at which time all features are expected to be available.

Likewise, DV/VA will be used to refer collectively to Visual Analyzer packaged with BICS (VA BICS), Visual Analyzer packaged with OBI 12c (VA 12c), Data Visualization Desktop (DVD), and Data Visualization Cloud Service (DVCS). VA was initially introduced as part of the BICS package, but has since become available as part of OBIEE 12c (the latest on premise version).  DVD was released early in 2016 as a stand-alone product that can be downloaded and installed on a local machine.  Recently, DVCS has been released as the cloud-based version of DVD.  All of these products offer similar data visualization capabilities as OBI but feature significant enhancements to the manner in which users interact with their data.  Compared to OBI, the interface is even more simplified and intuitive to use which is an accomplishment for Oracle considering how easy OBI is to use.  Reusable and business process-centric dashboards are available in DV/VA but are referred to as DV or VA Projects.  Perhaps the most powerful feature is the ability for users to mash up data from different sources (including Excel) to quickly gain insight they might have spent days or weeks manually assembling in Excel or Access.  These mashups can be used to create reusable DV/VA Projects that can be refreshed through new data loads in the source system and by uploading updated Excel spreadsheets into DV/VA.

While the six products mentioned can be grouped nicely into two categories, the following matrix outlines the differences between each product. The following sections will provide some commentary to some of the features.

Table 1

Table 1:  Product Capability Matrix

Advanced Analytics provides integrated statistical capabilities based on the R programming language and includes the following functions:

  • Trendline – This function provides a linear or exponential plot through noisy data to indicate a general pattern or direction for time series data. For instance, while there is a noisy fluctuation of revenue over these three years, a slowly increasing general trend can be detected by the Trendline plot:
Figure 1

Figure 1:  Trendline Analysis

 

  • Clusters – This function attempts to classify scattered data into related groups. Users are able to determine the number of clusters and other grouping attributes. For instance, these clusters were generated using Revenue versus Billed Quantity by Month:
Figure 2

Figure 2:  Cluster Analysis

 

  • Outliers – This function detects exceptions in the sample data. For instance, given the previous scatter plot, four outliers can be detected:
Figure 3

Figure 3:  Outlier Analysis

 

  • Regression – This function is similar to the Trendline function but correlates relationships between two measures and does not require a time series. This is often used to help create or determine forecasts. Using the previous Revenue versus Billed Quantity, the following Regression series can be detected:
Figure 4

Figure 4:  Regression Analysis

 

Insights provide users the ability to embed commentary within DV/VA projects (except for VA 12c). Users take a “snapshot” of their data at a certain intersection and make an Insight comment.  These Insights can then be associated with each other to tell a story about the data and then shared with others or assembled into a presentation.  For those readers familiar with the Hyperion Planning capabilities, Insights are analogous to Cell Comments.  OBI 12c (as well as 11g) offers the ability to write comments back to a relational table; however, this capability is not as flexible or robust as Insights and requires intervention by the BI support team to implement.

Figure 5

Figure 5:  Insights Assembled into a Story

 

Direct connections to a Relational Database Management System (RDBMS) such as an enterprise data warehouse are now possible using some of the DV/VA products. (For the purpose of this post, inserting a semantic or logical layer between the database and user is not considered a direct connection).  For the cloud-based versions (VA BICS and DVCS), only connections to other cloud databases are available while DVD allows users to connect to an on premise or cloud database.  This capability will typically be created and configured either by the IT support team or analysts familiar with the data model of the target data source as well as SQL concepts such as creating joins between relational tables.  (Direct connections using OBI are technically possible; however, they require the users to manually write the SQL to extract the data for their analysis).  Once these connections are created and the correct joins are configured between tables, users can further augment their data with data mashups.  VA 12c currently requires a Subject Area connected to a RDBMS to create projects.

Leveraging OLAP data sources such as Essbase is currently only available in OBI 12c (as well as 11g) and VA 12c. These data sources require that the OLAP cube be exposed as a Subject Area in the Presentation layer (in other words, no direct connection to OLAP data sources).  OBI is considered very mature and offers robust mechanisms for interacting with the cube, including the ability to use drillable hierarchical columns in Analysis.  VA 12c currently exposes a flattened list of hierarchical columns without a drillable hierarchical column.  As with direct connections, users are able to mashup their data with the cubes to create custom data models.

While the capabilities of the DV/VA product set are impressive, the solution currently lacks some key capabilities of OBI Analysis and Dashboards. A few of the most noticeable gaps between the capabilities of DV/VA and OBI Dashboards are the inability to:

  • Create the functional equivalent of Action Links which allows users to drill down or across from an Analysis
  • Schedule and/or deliver reports
  • Customize graphs, charts, and other data visualizations to the extent offered by OBI
  • Create Alerts which can perform conditionally-based actions such as pushing information to users
  • Use drillable hierarchical columns

At this time, OBI should continue to be used as the centerpiece for enterprise-wide analytical solutions that require complex dashboards and other capabilities. DV/VA will be more suited for analysts who need to unify discrete data sources in a repeatable and presentation-friendly format using DV/VA Projects.  As mentioned, DV/VA is even easier to use than OBI which makes it ideal for users who wish to have an analytics tool that rapidly allows them to pull together ad hoc analysis.  As was discussed in The Role of Oracle Data Visualizer in the Modern Enterprise, enterprises that are reaching for new game-changing analytic capabilities should give the DV/VA product set a thorough evaluation.  Oracle releases regular upgrades to the entire DV/VA product set, and we anticipate many of the noted gaps will be closed at some point in the future.

Upgrading Oracle Business Intelligence to 12c

Ranzal was recently invited to participate in a number of chalk talks for the Healthcare Industry User Group (HIUG) in San Antonio, TX. One of these chalk talks covered how an organization should prepare for and execute an upgrade to Oracle Business Intelligence (OBI) 12c.  Since the technical steps are already covered in numerous blog posts as well as Oracle documentation, our conversation focused on a strategic approach to the upgrade.  Our conversation essentially came down to four topics:

  1. An overview of the simplicity of the upgrade from 11g to 12c
  2. Organizational mindset while preparing for the upgrade
  3. The new technical infrastructure and the implications for the organization
  4. How to introduce users to the new features and how this might impact governance

We wanted to formally document this lively and fast-paced discussion to help other organizations as well as the HIUG chalk talk participants who were furiously scribbling notes and may not have had the opportunity to take it all in.

We started our discussion with how relatively easy Oracle has made this upgrade. For those of you who experienced the difficult and buggy process of upgrading from OBI 10g to 11g and may be dreading the upgrade to 12c, we have some advice:  relax.  First, the upgrade to 12c is an “in place upgrade” which means your 11g environment remains intact while the metadata and configuration gets “lifted and shifted” into 12c.  Speaking of “lift and shift,” 12c comes with a tool that extracts the metadata and much of the configuration from 11g into one tidy package that is then pushed into 12c.  There is a small amount of manual configuration that has to occur; however, this will only slow down customers with highly customized environments.  Once this lift and shift has occurred, an Oracle validation tool checks that your Dashboards and Analysis are working and alerts you to potential issues.  While there are bugs with 12c (what new or old software does not have bugs?), we have not found any major issues that will cause a full stop for an organization.

So how does an organization prepare for this upgrade? First, we encourage clients to view this upgrade as an opportunity to “clean house,” especially for customers that have been building OBI assets for years.  Used properly, analytic tools lend themselves to experimentation and evolution.  Experimentation can result in partially formed or broken logical objects such as presentation columns and facts, Analysis, or entire Dashboards.  The developers of these objects have the best intention of coming back to either fix or complete development on these objects or, at the very least, delete them.  Developers are busy people though, and eventually the existence of these objects is forgotten.  Evolution of both the tool and the focus on analytics results in objects becoming stale and/or obsolete.  Take this opportunity to clean house on your OBI environment.  Run the consistency check in the BI Administration tool and resolve those lingering issues.  Evaluate your usage statistics and determine if unused Analysis and Dashboards are still needed.  Fix or discard broken Analysis and Dashboards.  As with any technology tool, have a process in place to document, communicate, archive, backup, and restore, if necessary.

The most substantial change to come with OBI 12c is the underlying technical architecture. Fusion Middleware and Enterprise Management has a new look and feel.  Additionally, some actions are no longer performed within these tools.  For instance, deploying the RPD is no longer done through Enterprise Manager (in fact, the RPD is now the BAR file).  The directory has significantly changed which is a bit unfortunate for those of us who like to go directly to the log files in the directory rather than relying on Enterprise Manager and Session Manager.  Finally, many of the RPD . . . that is . . . BAR functions, such as deploying and copying, are done through a new command line interface.  So, while most of your users will be able to log into 12c and quickly adapt to the new look and feel, your OBI support team will have some learning to do.  Again, the goal of this post is to provide an overarching upgrade strategy, so we will not delve any deeper into these changes.  There is plenty of quality content online regarding these changes and you should always review Oracle documentation before performing implementation or upgrades.

End users will initially feel that, with the exception of a new look and feel, nothing much has changed. However, with new graphing options, statistical analytics capabilities based on R, as well as Visual Analyzer, users have an opportunity to expand their analytical capacity.  The organizational challenge is to go back to the change management playbook that was used when OBI was initially introduced, re-evaluate, and update so that end users can get the most out of this upgrade.  Evaluate how to train users on where to properly use the new graphs and charts.  Determine (or re-determine) who your power users are who need or want the new statistical capabilities.  Review existing Dashboards and Analysis and make appropriate upgrades.

Potentially the biggest challenge will be evaluating and understanding the capabilities of the new Visual Analyzer tool which, among other features, allows you to perform data mashups. This new tool will require that your organization determines some use cases and user groups as well as some additional training.  While users uploading data into the OBI system and combining it with existing data models opens up entirely new possibilities for insight, it also creates a governance challenge.  How do you separate and maintain the organizational “one version of the truth” while encouraging and properly promoting new analytic insight?  How will security be handled and users trained to adhere to this model?  How will you handle the archiving and deletion of potentially huge numbers of Excel spreadsheets uploaded onto the OBI server?  While this all sounds intimidating, keep in mind that your organization has already been through these exercises once during the original OBI implementation.  Adapt your existing knowledge.

Thus far, Ranzal has had a positive experience with the 12c upgrade. The underlying technical architecture has resulted in some real gains in performance, especially when leveraging EPM as a data source.  The upgrade is well thought out and simple, especially if you go through a system checkup and resolve issues.  While your technical BI support team will have some homework and learning to do to continue to fill that role, your users will be able to jump right into using 12c.  Despite this ease of user adoption, be sure to have a change management plan in place and take advantage of the new features and capabilities of 12c.

If you are thinking of doing an upgrade and have questions, feel free to reach out to us. Also, keep an eye out for an upcoming webcast on upgrading to 12c with an interactive question and answer session.

Oracle Business Intelligence EPM and Relational Federation – A Strategic Approach

The federation of EPM and relational data sources through Oracle Business Intelligence (OBI) seems straightforward: import the cube, federate and rename, expose it all, and create dashboards and analysis. Due to the technical simplicity of EPM and relational federation, many organizations underestimate the amount of effort needed to implement an OBI solution that properly leverages and extends the capabilities of the EPM and relational data sources.  The OBI implementation process should not be an afterthought, especially if OBI is to be the primary method by which users consume organizational data.  We have assembled ten “Dos and Don’ts” that cover the full lifecycle implementation to help organizations get the most out of their OBI solution.

Do – Design and develop the data sources with input from the OBI implementation team

Especially in implementations where OBI is to be the primary method of consuming data, the OBI implementation team should have been heavily involved in Dashboard and Analysis requirements and design. As such, this team will have the knowledge of what data structure is needed to support an efficient and easy-to-use analytic solution.  Asking the OBI implementation team to come in after the data model has been set and create Dashboards and Analysis will often result in workarounds that are error prone, difficult to maintain, and challenging or impossible to scale.

Don’t – View OBI as a one-size-fits-all analytics and reporting tool for the organization

OBI is a powerful and versatile tool capable of addressing a slew of needs; however, it is not a magic bullet. Depending on the application and needs of the organization, Smart View, Financial Reporting, and even BI Publisher have their places in the organization.  Attempting to replicate the capabilities of other analytic and reporting tools through OBI may provide the illusion of capability, but will fall short of user expectations and possibly harm adoption by the rest of the organization.

Do – Have a metadata management process in place before federating data sources

We discussed the rationale of this best practice thoroughly in the post Oracle Business Intelligence – Synchronizing Hierarchical Structures to Enable Federation. To summarize, unsynchronized hierarchical structures between data sources can result in analysis with outcomes that are irreconcilable, seemingly reorganize while drill down or up, display erroneously shared members, or simply result in errors in OBI.  A centralized process for managing this metadata as well as ensuring that all relevant data sources are updated simultaneously is imperative when federating data sources.

Don’t – Treat OBI as a metadata or master data management tool

This is typically a symptom of not having the OBI implementation team involved during the design of the data models. As a result of this misalignment, clients attempt to shoehorn analysis into the data model by using the BI Administration tool (RPD) to excessively manipulate the data model.  Properly leveraged, the BI Administration tool can create an agile analytics solution; however, relying on this tool to fill large gaps between the data model and analytics will result in performance and maintenance issues.

Do – Define a use case, user community, and requirements for all implementations

From proof of concept to full implementations, having the right people involved is imperative. Within your organization:  Who understands the reporting and analytic needs and gaps?  Who understands where the data is coming from?  Who understands what capabilities are needed?  Who is positioned to help user adoption?  Who is asking questions that the organization is struggling to answer?  Any technology implementation that is done with the intent to “throw it against a wall and hope something sticks” is destined to fail; OBI is no different.

Don’t – Expect that users will flock to OBI if EPM is the only data available

We find that when there are both EPM and relational data sources, EPM is often the first to be implemented and exposed through OBI. During these implementations, users are extensively exposed to Smart View and finance users become especially enamored with the tool and struggle to immediately see value in OBI.  A Pavlovian response is to simply federate the EPM cube’s relational data source which typically provides a lower level of detail (or granularity).  While this is sometimes useful to users, it is still not providing the additional insight users cannot readily get elsewhere.  Federating additional data sources with EPM cubes should provide additional attributes or measures or provide a simple path to jump from one organizational view of the data to another.  For instance, a financial consolidation EPM cube federated with an operational relational data source provides an easy-to-use analytical solution for managers with responsibilities that straddle both worlds.  These users will quickly adopt OBI and help with future user adoption.

Do – Empower the users

Guided analysis through Dashboards, Analysis, Alerts, and Scorecards is a powerful tool; however, an organization will never address every scenario through this method. Guided analysis should be an introduction to OBI for users which should quickly be developed into self-service.  Within a few months of rolling out the OBI solution, power users should be assembling ad hoc analysis and putting together their own dashboards.  Within a year, most users should be answering basic questions on their own.  Organizations that empower users are not only improving the ROI on OBI, but they are also more agile in addressing changing business landscapes, accelerating user adoption, and reducing the load on (often) overburdened IT organizations.

Don’t – Neglect the performance of any data sources

The demand for data is the epitome of just-in-time logistics. Especially when users are empowered, many organizations find that their data sources and caching strategies are not sufficient for how users are actually leveraging the data.  EPM and relational data sources both have performance monitoring capabilities that should be frequently evaluated during the months after initial rollout and periodically evaluated thereafter and any deficiencies addressed.  Failing to address performance issues will result in users abandoning and circumventing the analytic tool, resulting in loss of productivity and data quality issues.

Do – Pivot to using OBI as an analytics tool instead of simply another reporting tool

Tabular reporting is typically (and should be) the first use for OBI that clients turn to, but this should be viewed as an insertion point and not the final rally point. With capabilities such as graphs, heat matrices, treemaps, gauges, alerts, and trellis, pivoting from reporting to analytics should be the goal.  Answering business-critical questions, quickly understanding the business landscape, and gaining insight is where the true value of OBI lies.  Simply leveraging OBI as another reporting solution is severely handicapping the tool’s return on investment.

Don’t – Let OBI data sources become static

Analytics is one of the few tools that simultaneously changes a business in a deliberate and serendipitous manner. A well-led and strategically executed analytics program can have a lasting contribution to an organization’s goals.  At the same time, users will develop new skills and capabilities as they become familiar with both the tool and the data and begin to ask new questions.  As both the competitive landscape changes and organizational capabilities expand, data models should be evolving to address these new needs.  OBI has the ability to easily expose, slice and dice, and visualize data to answer these questions; the challenge is to not become complacent in providing new data resources to users.

If OBI is to play a role in your organization’s analytic strategy, it should not be an afterthought. Involving implementation team members with the knowledge of OBI’s capabilities from the start can help ease implementation during the later phases, accelerate user adoption, and increase the long term ROI.  Edgewater Ranzal has both the technical and functional implementation experience with OBI to help you evaluate, adjust, and execute your analytic strategy according to these ten “Dos and Don’ts.”

Oracle Business Intelligence – Synchronizing Hierarchical Structures to Enable Federation

More and more Oracle customers are finding value in federating their EPM cubes with existing relational data stores such as data marts and data warehouses (for brevity, data warehouse will refer to all relational data stores). This post explains the concept of federation, explores the consequences of allowing hierarchical structures to get out of synchronization, and shares options to enable this synchronization.

In OBI, federation is the integration of distinct data sources to allow end users to perform analytical tasks without having to consider where the data is coming from. There are two types of federation to consider when using EPM and data warehouse sources:  vertical and horizontal.  Vertical federation allows users to drill down a hierarchy and switch data sources when moving from an aggregate data source to a more detailed one.  Most often, this occurs in the Time dimension whereby the EPM cube stores data for year, quarter, and month, and the relational data sources have details on daily transactions.  Horizontal federation allows users to combine different measures from the distinct data sources naturally in an OBI analysis, rather than extracting the data and building a unified report in another tool.

Federation makes it imperative that the common hierarchical structures are kept in sync. To demonstrate issues that can occur during vertical federation when the data sources are not synchronized, take the following hierarchies in an EMP application and a data warehouse:

Figure 1: Unsynchronized Hierarchies

Jason Hodson Blog Figure 1.jpg

Notice that Colorado falls under the Western region in the EPM application, but under the Southwestern region in the data warehouse. Also notice that the data warehouse contains an additional level (or granularity) in the form of cities for each region.  Assume that both data sources contain revenue data.  An OBI analysis such as this would route the query to the EPM cube and return these results:

Figure 2: EPM Analysis – Vertical Federation

Jason Hodson Blog Figure 2

However, if the user were to expand the state of Washington to see the results for each city, OBI would route the query to the data warehouse. When the results return, the user would be confronted with different revenue figures for the Southwest and West regions:

Figure 3: Data Warehouse – Vertical Federation

Jason Hodson Blog Figure 3

When the hierarchical structures are not aligned between the two data sources, irreconcilable differences can occur when switching between the sources. Many times, end users are not aware that they are switching between EPM and a data warehouse, and will simply experience a confusing reorganization in their analysis.

To demonstrate issues that occur in horizontal federation, assume the same hierarchies as in Figure 1 above, but the EPM application contains data on budget revenue while the data warehouse contains details on actual revenue. An analysis such as this could be created to query each source simultaneously and combine the budget and actual data along the common dimension:

Figure 4: Horizontal Federation

Jason Hodson Blog Figure 4

However, drilling into the West and Southwest regions will result in Colorado becoming an erroneously “shared” member:

Figure 5: Colorado as a “Shared” Member

Jason Hodson Blog Figure 5

In actuality, the mocked up analysis above would more than likely result in an error since OBI would not be able to match the hierarchical structures during query generation.

There are a number of options to enable the synchronization of hierarchical structures across EPM applications and data warehouses. Many organizations are manually maintaining their hierarchical structures in spreadsheets and text files, often located on an individual’s desktop.  It is possible to continue this manual maintenance; however, these dispersed files should be centralized, a governance processes defined, and the EPM metadata management and data warehouse ETL process redesigned to pick up these centralized files.  This method is still subject to errors and is inherently difficult to properly govern and audit.  For organizations that are already using Enterprise Performance Management Architect (EPMA), a scripting process can be implemented that extracts the hierarchical structures in flat files.  A follow on ETL process to move these hierarchies into the data warehouse will also have to be implemented.

The best practices solution is to use Hyperion Data Relationship Management (DRM) to manage these hierarchical structures. DRM boasts robust metadata management capabilities coupled with a system-agnostic approach to exporting this metadata.  DRM’s most valuable export method allows pushing directly to a relational database.  If a data warehouse is built in tandem with an EPM application, DRM can push directly to a dimensional table that can then be accessed by OBI.  If there is a data warehouse already in place, existing ETL processes may have to be modified or a dimensional table devoted to the dimension hierarchy created.  Ranzal has a DRM accelerator package to enable the synchronization of hierarchical structures between EPM and data warehouses that is designed to work with our existing EPM application DRM implementation accelerators.  Using these accelerators, Ranzal can perform an implementation in as little as six weeks that provides metadata management for the EPM application, establishes a process for maintaining hierarchical structure synchronization between EPM and the data warehouse, and federation of the data source.

While the federation of EPM and data warehouse sources has been the primary focus, it is worth noting that two EPM cubes or two data warehouses could be federated in OBI. For many of the reasons discussed previously, data synchronization processes will have to be in place to enable this federation.  The previous solutions for maintaining metadata synchronization may be able to be adapted to enable this federation.

The federation of EPM and data warehouse sources allows an enterprise to create a more tightly integrated analytical solution. This tight integration allows users to transverse the organization’s data, gain insight, and answer business essential questions at the speed of thought.  As demonstrated, mismanaging hierarchical structures can result in an analytical solution that produces unexpected results that can harm user confidence.  Enterprise solutions often need enterprise approaches to governance; therefore, it is often imperative to understand and address shortcomings in hierarchical structure management.  Ranzal has a deep knowledge of EPM, DRM, and OBIEE, and how these systems can be implemented to tightly work together to address an organization’s analytical and reporting needs.

Leveraging Your Organization’s OBI Investment for Data Discovery

Coupling disparate data sets into meaningful “mashups” is a powerful way to test new hypotheses and ask new questions of your organization’s data.  However, more often than not, the most valuable data in your organization has already been transformed and warehoused by IT in order to support the analytics needed to run the business.  Tools that neglect these IT-managed silos don’t allow your organization to tell the most accurate story possible when pursuing their discovery initiatives.  Data discovery should not focus only on the new varieties of data that exist outside your data warehouse.  The value from social media data and machine generated data cannot be fully realized until it can be paired with the transactional data your organization already stockpiles.

Judging by the heavy investment in a new “self-service” theme in the recently released version 3.1 of Endeca Information Discovery, this truth has not been lost on Oracle.

Companies that are eager to get into the data discovery game, yet are afraid to walk away from the time and effort they’ve poured into their OBI solution, can breathe a little easier.  Oracle has made the proper strides in the Endeca product to incorporate OBI into the discovery experience.

And unlike other discovery products on the market today, the access to these IT-managed repositories (like OBI) is centrally managed.  By controlling access to the data and keeping all data “on the platform”, this centralized management allows IT to avoid the common “spreadmart” problem that plagues other discovery products.

Rather than explain how OBI has been introduced into the discovery experience, I figured I would show you.  Check out this short 4 minute demonstration which illustrates how your organization can build their own data “mashups” leveraging the valuable data tied up in OBI.

 

 

Chances are that a handful of these tested hypotheses will unlock new ways to measure your business.  These new data mashups will warrant permanent applications that are made available to larger audiences within your organization.  The need for more permanent applications will require IT to “operationalize” your discovery application — introducing data updates, security, and properly sized hardware to support the application.

For these IT-provisioned applications, Oracle has also provided some tooling in Endeca to make the job more straightforward.  Specifically, when it comes to OBI, the product now boasts a wizard that will produce a Integrator project with all of the plumbing necessary to pull data tied up in OBI into a discovery application in minutes.  Check out this video to see how:

 

 

It is product investments like these that will allow organizations to realize the transformative effects data discovery can have on their business without having to ignore the substantial BI investments already in place.

As always, please direct any questions or comments to [at] ranzal.com.