Oracle Business Intelligence Cloud Service (BICS) September Update

The latest upgrade for BICS happened last week and, while there are no new end user features, it is now easier to integrate data. New to this version is the ability to connect to JDBC data sources through the Data Sync tool.  This allows customers to set up automated data pulls from Salesforce, Redshift, and Hive among others.  In addition to these connections, Oracle RightNow CRM customers have the ability to pull directly from RightNow reports using Oracle Data Sync.  Finally, connections to on premise databases and BICS can be secured using Secure Socket Layer (SSL) certifications.

After developing a customer script using API calls to pull data from Salesforce, I am excited about the ability to connect directly to Salesforce with Data Sync. Direct connections to the Salesforce database allows you to search and browse for relevant tables and import the definitions with ease:


Once the definitions have been imported, standard querying clauses can create the ability to include only relevant data, perform incremental ETLs, and further manipulate the data.

While there are no new features for end users, this is a powerful update when it comes to data integration. Using APIs to extract data from Salesforce meant that each extraction query had to be written by hand which was time consuming and prone to error.  With these new data extraction processes, BICS implementations and integrating data becomes much faster, furthering the promise of Oracle Cloud technologies.

A Comparison of Oracle Business Intelligence, Data Visualization, and Visual Analyzer

We recently authored The Role of Oracle Data Visualizer in the Modern Enterprise in which we had referred to both Data Visualization (DV) and Visual Analyzer (VA) as Data Visualizer.  This post addresses readers’ inquiries about the differences between DV and VA as well as a comparison to that of Oracle Business Intelligence (OBI).  The following sections provide details of the solutions for the OBI and DV/VA products as well as a matrix to compare each solution’s capabilities.  Finally, some use cases for DV/VA projects versus OBI will be outlined.

For the purposes of this post, OBI will be considered the parent solution for both on premise Oracle Business Intelligence solutions (including Enterprise Edition (OBIEE), Foundation Services (BIFS), and Standard Edition (OBSE)) as well as Business Intelligence Cloud Service (BICS). OBI is the platform thousands of Oracle customers have become familiar with to provide robust visualizations and dashboard solutions from nearly any data source.  While the on premise solutions are currently the most mature products, at some point in the future, BICS is expected to become the flagship product for Oracle at which time all features are expected to be available.

Likewise, DV/VA will be used to refer collectively to Visual Analyzer packaged with BICS (VA BICS), Visual Analyzer packaged with OBI 12c (VA 12c), Data Visualization Desktop (DVD), and Data Visualization Cloud Service (DVCS). VA was initially introduced as part of the BICS package, but has since become available as part of OBIEE 12c (the latest on premise version).  DVD was released early in 2016 as a stand-alone product that can be downloaded and installed on a local machine.  Recently, DVCS has been released as the cloud-based version of DVD.  All of these products offer similar data visualization capabilities as OBI but feature significant enhancements to the manner in which users interact with their data.  Compared to OBI, the interface is even more simplified and intuitive to use which is an accomplishment for Oracle considering how easy OBI is to use.  Reusable and business process-centric dashboards are available in DV/VA but are referred to as DV or VA Projects.  Perhaps the most powerful feature is the ability for users to mash up data from different sources (including Excel) to quickly gain insight they might have spent days or weeks manually assembling in Excel or Access.  These mashups can be used to create reusable DV/VA Projects that can be refreshed through new data loads in the source system and by uploading updated Excel spreadsheets into DV/VA.

While the six products mentioned can be grouped nicely into two categories, the following matrix outlines the differences between each product. The following sections will provide some commentary to some of the features.

Table 1

Table 1:  Product Capability Matrix

Advanced Analytics provides integrated statistical capabilities based on the R programming language and includes the following functions:

  • Trendline – This function provides a linear or exponential plot through noisy data to indicate a general pattern or direction for time series data. For instance, while there is a noisy fluctuation of revenue over these three years, a slowly increasing general trend can be detected by the Trendline plot:
Figure 1

Figure 1:  Trendline Analysis


  • Clusters – This function attempts to classify scattered data into related groups. Users are able to determine the number of clusters and other grouping attributes. For instance, these clusters were generated using Revenue versus Billed Quantity by Month:
Figure 2

Figure 2:  Cluster Analysis


  • Outliers – This function detects exceptions in the sample data. For instance, given the previous scatter plot, four outliers can be detected:
Figure 3

Figure 3:  Outlier Analysis


  • Regression – This function is similar to the Trendline function but correlates relationships between two measures and does not require a time series. This is often used to help create or determine forecasts. Using the previous Revenue versus Billed Quantity, the following Regression series can be detected:
Figure 4

Figure 4:  Regression Analysis


Insights provide users the ability to embed commentary within DV/VA projects (except for VA 12c). Users take a “snapshot” of their data at a certain intersection and make an Insight comment.  These Insights can then be associated with each other to tell a story about the data and then shared with others or assembled into a presentation.  For those readers familiar with the Hyperion Planning capabilities, Insights are analogous to Cell Comments.  OBI 12c (as well as 11g) offers the ability to write comments back to a relational table; however, this capability is not as flexible or robust as Insights and requires intervention by the BI support team to implement.

Figure 5

Figure 5:  Insights Assembled into a Story


Direct connections to a Relational Database Management System (RDBMS) such as an enterprise data warehouse are now possible using some of the DV/VA products. (For the purpose of this post, inserting a semantic or logical layer between the database and user is not considered a direct connection).  For the cloud-based versions (VA BICS and DVCS), only connections to other cloud databases are available while DVD allows users to connect to an on premise or cloud database.  This capability will typically be created and configured either by the IT support team or analysts familiar with the data model of the target data source as well as SQL concepts such as creating joins between relational tables.  (Direct connections using OBI are technically possible; however, they require the users to manually write the SQL to extract the data for their analysis).  Once these connections are created and the correct joins are configured between tables, users can further augment their data with data mashups.  VA 12c currently requires a Subject Area connected to a RDBMS to create projects.

Leveraging OLAP data sources such as Essbase is currently only available in OBI 12c (as well as 11g) and VA 12c. These data sources require that the OLAP cube be exposed as a Subject Area in the Presentation layer (in other words, no direct connection to OLAP data sources).  OBI is considered very mature and offers robust mechanisms for interacting with the cube, including the ability to use drillable hierarchical columns in Analysis.  VA 12c currently exposes a flattened list of hierarchical columns without a drillable hierarchical column.  As with direct connections, users are able to mashup their data with the cubes to create custom data models.

While the capabilities of the DV/VA product set are impressive, the solution currently lacks some key capabilities of OBI Analysis and Dashboards. A few of the most noticeable gaps between the capabilities of DV/VA and OBI Dashboards are the inability to:

  • Create the functional equivalent of Action Links which allows users to drill down or across from an Analysis
  • Schedule and/or deliver reports
  • Customize graphs, charts, and other data visualizations to the extent offered by OBI
  • Create Alerts which can perform conditionally-based actions such as pushing information to users
  • Use drillable hierarchical columns

At this time, OBI should continue to be used as the centerpiece for enterprise-wide analytical solutions that require complex dashboards and other capabilities. DV/VA will be more suited for analysts who need to unify discrete data sources in a repeatable and presentation-friendly format using DV/VA Projects.  As mentioned, DV/VA is even easier to use than OBI which makes it ideal for users who wish to have an analytics tool that rapidly allows them to pull together ad hoc analysis.  As was discussed in The Role of Oracle Data Visualizer in the Modern Enterprise, enterprises that are reaching for new game-changing analytic capabilities should give the DV/VA product set a thorough evaluation.  Oracle releases regular upgrades to the entire DV/VA product set, and we anticipate many of the noted gaps will be closed at some point in the future.

The Role of Oracle Data Visualizer in the Modern Enterprise

Chess as a metaphor for strategic competition is not a novel concept, and it remains one of the most respected due to the intellectual and strategic demand it places on competitors. The sheer combination of moves in a chess game (estimated to be more than the number of atoms in the universe) means that it is entirely possible that no two people have unintentionally played the same game.  Of course, many of these combinations result in a draw and many more set a player down the path of an inevitable loss after only a few moves.  It is no surprise that chess has pushed the limits of computational analytics which in turn has pushed the limits of players.  Claude Shannon, the father of information theory, was the first to state the advantages of the human and computer competitor attempting to wrest control of opposing kings from each other:

The computer is:

  1. Very fast at making calculations;
  2. Unable to make mistakes (unless the mistakes are part of the programmatic DNA);
  3. Diligent in fully analyzing a position or all possible moves;
  4. Unemotional in assessing current conditions and unencumbered by prior wins or losses.

The human, on the other hand, is:

  1. Flexible and able to deviate from a given pattern (or code);
  2. Imaginative;
  3. Able to reason;
  4. Able to learn [1].

The application of business analytics is the perfect convergence of this chess metaphor, powerful computations, and the people involved. Of course, the chess metaphor breaks down a bit since we have human and machine working together against competing partnerships of humans and machines (rather than human against machine).

Oracle Business Intelligence (along with implementation partners such as Edgewater Ranzal) has long provided enterprises with the ability to balance this convergence. Regardless of the robustness of the tool, the excellence of the implementation, the expertise of the users, and the responsiveness of the technical support team, there has been one weakness:  No organization can resolve data integration logic mistakes or incorporate new data as quickly as users request changes.  As a result, the second and third computer advantages above are hindered.  Computers making mistakes due to their programmatic DNA will continue to make these mistakes until corrective action can be implemented (which can take days, weeks, or months).  Likewise, all possible positions or moves cannot be analyzed due to missing data elements.  Exacerbating the problem, all of the human advantages stated previously can be handicapped; increasingly so depending on the variability, robustness, and depth of the missing or wrongly calculated data set.

With the introduction of Visual Analyzer (VA) and Data Visualization (DV), Oracle has made enormous strides in overcoming this weakness. Users now have the ability to perform data mashups between local data and centralized repositories of data such as data warehouses/marts and cubes.  No longer does the computer have to make data analysis without the availability of all possible data.  No longer does the user have to make educated guesses about how centralized and localized data sets correlate and how it will affect overall trends or predictions.  Used properly, users and enterprises can leverage VA/DV to iteratively refine and redefine the analytical component that contributes to their strategic goals.  Of course, all new technologies and capabilities come with their own challenges.

The first challenge is how an organization can present these new views of data and compare and contrast them with the organizational “one version of the truth”. Enterprise data repositories are a popular and useful asset because they enable organizations to slice, dice, pivot, and drill down into this centralized data while minimizing subjectivity.  Allowing users to introduce their own data creates a situation where they can increase data subjectivity.  If VA/DV is to be part of your organization’s analytics strategy, processes must be in place to validate the result of these new data models.  The level of effort that should be applied to this validation should increase according to the following factors:

  • The amount of manual manipulation the user performed on the data before performing the mashup with existing data models;
  • The reputability of the data source. Combining data from an internal ERP or CRM system is different from downloading and aligning outside data (e.g. US Census Bureau or Google results);
  • The depth and width of data. In layman’s terms, this corresponds to how many rows and columns (respectively) the data set has;
  • The expertise and experience of the individual performing the data mashup.

If you have an existing centralized data repository, you have probably already gone through data validation exercises. Reexamine and apply the data and a metadata governance processes you went through when the data repository was created (and hopefully maintained and updated).

The next challenge is integrating the data into the data repository. Fortunately, users may have already defined the process of extracting and transforming data when they assembled the VA/DV project.  Evaluating and leveraging the process the user has already defined can shorten the development cycle for enhancing existing data models and the Extract, Transform, and Load (ETL) process.  The data validation factors above can also provide a rough order of magnitude of the level of effort needed to incorporate this data.  The more difficult task may be determining how to prioritize data integration projects within an (often) overburdened IT department.  Time, scope, and cost are familiar benchmarks when determining prioritization, but it is important to take revenue into account.  Organizations that have become analytics savvy and have users demanding VA/DV data mashup capabilities have often moved beyond simple reporting and onto leveraging data to create opportunities.  Are salespeople asking to incorporate external data to gain customer insight?  Are product managers pulling in data from a system the organization never got around to integrating?  Are functional managers manipulating and re-integrating data to cut costs and boost margins?

To round out this chess metaphor, a game that seems to be nearly a draw or a loss can breathe new life by promoting a pawn to a lost queen. Many of your competitors already have a business intelligence solution; your organization can only find data differentiation through the type of data you have and how quickly it can be incorporated at an enterprise level.  Providing VA/DV to the individuals within your organization with a deep knowledge of the data they need, how to get it, and how to deploy it can be the queen that checkmates the king.

[1] Shannon, C. E. (1950). XXII. Programming a computer for playing chess. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 41(314), 256-275. doi:10.1080/14786445008521796

Upgrading Oracle Business Intelligence to 12c

Ranzal was recently invited to participate in a number of chalk talks for the Healthcare Industry User Group (HIUG) in San Antonio, TX. One of these chalk talks covered how an organization should prepare for and execute an upgrade to Oracle Business Intelligence (OBI) 12c.  Since the technical steps are already covered in numerous blog posts as well as Oracle documentation, our conversation focused on a strategic approach to the upgrade.  Our conversation essentially came down to four topics:

  1. An overview of the simplicity of the upgrade from 11g to 12c
  2. Organizational mindset while preparing for the upgrade
  3. The new technical infrastructure and the implications for the organization
  4. How to introduce users to the new features and how this might impact governance

We wanted to formally document this lively and fast-paced discussion to help other organizations as well as the HIUG chalk talk participants who were furiously scribbling notes and may not have had the opportunity to take it all in.

We started our discussion with how relatively easy Oracle has made this upgrade. For those of you who experienced the difficult and buggy process of upgrading from OBI 10g to 11g and may be dreading the upgrade to 12c, we have some advice:  relax.  First, the upgrade to 12c is an “in place upgrade” which means your 11g environment remains intact while the metadata and configuration gets “lifted and shifted” into 12c.  Speaking of “lift and shift,” 12c comes with a tool that extracts the metadata and much of the configuration from 11g into one tidy package that is then pushed into 12c.  There is a small amount of manual configuration that has to occur; however, this will only slow down customers with highly customized environments.  Once this lift and shift has occurred, an Oracle validation tool checks that your Dashboards and Analysis are working and alerts you to potential issues.  While there are bugs with 12c (what new or old software does not have bugs?), we have not found any major issues that will cause a full stop for an organization.

So how does an organization prepare for this upgrade? First, we encourage clients to view this upgrade as an opportunity to “clean house,” especially for customers that have been building OBI assets for years.  Used properly, analytic tools lend themselves to experimentation and evolution.  Experimentation can result in partially formed or broken logical objects such as presentation columns and facts, Analysis, or entire Dashboards.  The developers of these objects have the best intention of coming back to either fix or complete development on these objects or, at the very least, delete them.  Developers are busy people though, and eventually the existence of these objects is forgotten.  Evolution of both the tool and the focus on analytics results in objects becoming stale and/or obsolete.  Take this opportunity to clean house on your OBI environment.  Run the consistency check in the BI Administration tool and resolve those lingering issues.  Evaluate your usage statistics and determine if unused Analysis and Dashboards are still needed.  Fix or discard broken Analysis and Dashboards.  As with any technology tool, have a process in place to document, communicate, archive, backup, and restore, if necessary.

The most substantial change to come with OBI 12c is the underlying technical architecture. Fusion Middleware and Enterprise Management has a new look and feel.  Additionally, some actions are no longer performed within these tools.  For instance, deploying the RPD is no longer done through Enterprise Manager (in fact, the RPD is now the BAR file).  The directory has significantly changed which is a bit unfortunate for those of us who like to go directly to the log files in the directory rather than relying on Enterprise Manager and Session Manager.  Finally, many of the RPD . . . that is . . . BAR functions, such as deploying and copying, are done through a new command line interface.  So, while most of your users will be able to log into 12c and quickly adapt to the new look and feel, your OBI support team will have some learning to do.  Again, the goal of this post is to provide an overarching upgrade strategy, so we will not delve any deeper into these changes.  There is plenty of quality content online regarding these changes and you should always review Oracle documentation before performing implementation or upgrades.

End users will initially feel that, with the exception of a new look and feel, nothing much has changed. However, with new graphing options, statistical analytics capabilities based on R, as well as Visual Analyzer, users have an opportunity to expand their analytical capacity.  The organizational challenge is to go back to the change management playbook that was used when OBI was initially introduced, re-evaluate, and update so that end users can get the most out of this upgrade.  Evaluate how to train users on where to properly use the new graphs and charts.  Determine (or re-determine) who your power users are who need or want the new statistical capabilities.  Review existing Dashboards and Analysis and make appropriate upgrades.

Potentially the biggest challenge will be evaluating and understanding the capabilities of the new Visual Analyzer tool which, among other features, allows you to perform data mashups. This new tool will require that your organization determines some use cases and user groups as well as some additional training.  While users uploading data into the OBI system and combining it with existing data models opens up entirely new possibilities for insight, it also creates a governance challenge.  How do you separate and maintain the organizational “one version of the truth” while encouraging and properly promoting new analytic insight?  How will security be handled and users trained to adhere to this model?  How will you handle the archiving and deletion of potentially huge numbers of Excel spreadsheets uploaded onto the OBI server?  While this all sounds intimidating, keep in mind that your organization has already been through these exercises once during the original OBI implementation.  Adapt your existing knowledge.

Thus far, Ranzal has had a positive experience with the 12c upgrade. The underlying technical architecture has resulted in some real gains in performance, especially when leveraging EPM as a data source.  The upgrade is well thought out and simple, especially if you go through a system checkup and resolve issues.  While your technical BI support team will have some homework and learning to do to continue to fill that role, your users will be able to jump right into using 12c.  Despite this ease of user adoption, be sure to have a change management plan in place and take advantage of the new features and capabilities of 12c.

If you are thinking of doing an upgrade and have questions, feel free to reach out to us. Also, keep an eye out for an upcoming webcast on upgrading to 12c with an interactive question and answer session.

Oracle Business Intelligence EPM and Relational Federation – A Strategic Approach

The federation of EPM and relational data sources through Oracle Business Intelligence (OBI) seems straightforward: import the cube, federate and rename, expose it all, and create dashboards and analysis. Due to the technical simplicity of EPM and relational federation, many organizations underestimate the amount of effort needed to implement an OBI solution that properly leverages and extends the capabilities of the EPM and relational data sources.  The OBI implementation process should not be an afterthought, especially if OBI is to be the primary method by which users consume organizational data.  We have assembled ten “Dos and Don’ts” that cover the full lifecycle implementation to help organizations get the most out of their OBI solution.

Do – Design and develop the data sources with input from the OBI implementation team

Especially in implementations where OBI is to be the primary method of consuming data, the OBI implementation team should have been heavily involved in Dashboard and Analysis requirements and design. As such, this team will have the knowledge of what data structure is needed to support an efficient and easy-to-use analytic solution.  Asking the OBI implementation team to come in after the data model has been set and create Dashboards and Analysis will often result in workarounds that are error prone, difficult to maintain, and challenging or impossible to scale.

Don’t – View OBI as a one-size-fits-all analytics and reporting tool for the organization

OBI is a powerful and versatile tool capable of addressing a slew of needs; however, it is not a magic bullet. Depending on the application and needs of the organization, Smart View, Financial Reporting, and even BI Publisher have their places in the organization.  Attempting to replicate the capabilities of other analytic and reporting tools through OBI may provide the illusion of capability, but will fall short of user expectations and possibly harm adoption by the rest of the organization.

Do – Have a metadata management process in place before federating data sources

We discussed the rationale of this best practice thoroughly in the post Oracle Business Intelligence – Synchronizing Hierarchical Structures to Enable Federation. To summarize, unsynchronized hierarchical structures between data sources can result in analysis with outcomes that are irreconcilable, seemingly reorganize while drill down or up, display erroneously shared members, or simply result in errors in OBI.  A centralized process for managing this metadata as well as ensuring that all relevant data sources are updated simultaneously is imperative when federating data sources.

Don’t – Treat OBI as a metadata or master data management tool

This is typically a symptom of not having the OBI implementation team involved during the design of the data models. As a result of this misalignment, clients attempt to shoehorn analysis into the data model by using the BI Administration tool (RPD) to excessively manipulate the data model.  Properly leveraged, the BI Administration tool can create an agile analytics solution; however, relying on this tool to fill large gaps between the data model and analytics will result in performance and maintenance issues.

Do – Define a use case, user community, and requirements for all implementations

From proof of concept to full implementations, having the right people involved is imperative. Within your organization:  Who understands the reporting and analytic needs and gaps?  Who understands where the data is coming from?  Who understands what capabilities are needed?  Who is positioned to help user adoption?  Who is asking questions that the organization is struggling to answer?  Any technology implementation that is done with the intent to “throw it against a wall and hope something sticks” is destined to fail; OBI is no different.

Don’t – Expect that users will flock to OBI if EPM is the only data available

We find that when there are both EPM and relational data sources, EPM is often the first to be implemented and exposed through OBI. During these implementations, users are extensively exposed to Smart View and finance users become especially enamored with the tool and struggle to immediately see value in OBI.  A Pavlovian response is to simply federate the EPM cube’s relational data source which typically provides a lower level of detail (or granularity).  While this is sometimes useful to users, it is still not providing the additional insight users cannot readily get elsewhere.  Federating additional data sources with EPM cubes should provide additional attributes or measures or provide a simple path to jump from one organizational view of the data to another.  For instance, a financial consolidation EPM cube federated with an operational relational data source provides an easy-to-use analytical solution for managers with responsibilities that straddle both worlds.  These users will quickly adopt OBI and help with future user adoption.

Do – Empower the users

Guided analysis through Dashboards, Analysis, Alerts, and Scorecards is a powerful tool; however, an organization will never address every scenario through this method. Guided analysis should be an introduction to OBI for users which should quickly be developed into self-service.  Within a few months of rolling out the OBI solution, power users should be assembling ad hoc analysis and putting together their own dashboards.  Within a year, most users should be answering basic questions on their own.  Organizations that empower users are not only improving the ROI on OBI, but they are also more agile in addressing changing business landscapes, accelerating user adoption, and reducing the load on (often) overburdened IT organizations.

Don’t – Neglect the performance of any data sources

The demand for data is the epitome of just-in-time logistics. Especially when users are empowered, many organizations find that their data sources and caching strategies are not sufficient for how users are actually leveraging the data.  EPM and relational data sources both have performance monitoring capabilities that should be frequently evaluated during the months after initial rollout and periodically evaluated thereafter and any deficiencies addressed.  Failing to address performance issues will result in users abandoning and circumventing the analytic tool, resulting in loss of productivity and data quality issues.

Do – Pivot to using OBI as an analytics tool instead of simply another reporting tool

Tabular reporting is typically (and should be) the first use for OBI that clients turn to, but this should be viewed as an insertion point and not the final rally point. With capabilities such as graphs, heat matrices, treemaps, gauges, alerts, and trellis, pivoting from reporting to analytics should be the goal.  Answering business-critical questions, quickly understanding the business landscape, and gaining insight is where the true value of OBI lies.  Simply leveraging OBI as another reporting solution is severely handicapping the tool’s return on investment.

Don’t – Let OBI data sources become static

Analytics is one of the few tools that simultaneously changes a business in a deliberate and serendipitous manner. A well-led and strategically executed analytics program can have a lasting contribution to an organization’s goals.  At the same time, users will develop new skills and capabilities as they become familiar with both the tool and the data and begin to ask new questions.  As both the competitive landscape changes and organizational capabilities expand, data models should be evolving to address these new needs.  OBI has the ability to easily expose, slice and dice, and visualize data to answer these questions; the challenge is to not become complacent in providing new data resources to users.

If OBI is to play a role in your organization’s analytic strategy, it should not be an afterthought. Involving implementation team members with the knowledge of OBI’s capabilities from the start can help ease implementation during the later phases, accelerate user adoption, and increase the long term ROI.  Edgewater Ranzal has both the technical and functional implementation experience with OBI to help you evaluate, adjust, and execute your analytic strategy according to these ten “Dos and Don’ts.”

Oracle Business Intelligence – Synchronizing Hierarchical Structures to Enable Federation

More and more Oracle customers are finding value in federating their EPM cubes with existing relational data stores such as data marts and data warehouses (for brevity, data warehouse will refer to all relational data stores). This post explains the concept of federation, explores the consequences of allowing hierarchical structures to get out of synchronization, and shares options to enable this synchronization.

In OBI, federation is the integration of distinct data sources to allow end users to perform analytical tasks without having to consider where the data is coming from. There are two types of federation to consider when using EPM and data warehouse sources:  vertical and horizontal.  Vertical federation allows users to drill down a hierarchy and switch data sources when moving from an aggregate data source to a more detailed one.  Most often, this occurs in the Time dimension whereby the EPM cube stores data for year, quarter, and month, and the relational data sources have details on daily transactions.  Horizontal federation allows users to combine different measures from the distinct data sources naturally in an OBI analysis, rather than extracting the data and building a unified report in another tool.

Federation makes it imperative that the common hierarchical structures are kept in sync. To demonstrate issues that can occur during vertical federation when the data sources are not synchronized, take the following hierarchies in an EMP application and a data warehouse:

Figure 1: Unsynchronized Hierarchies

Jason Hodson Blog Figure 1.jpg

Notice that Colorado falls under the Western region in the EPM application, but under the Southwestern region in the data warehouse. Also notice that the data warehouse contains an additional level (or granularity) in the form of cities for each region.  Assume that both data sources contain revenue data.  An OBI analysis such as this would route the query to the EPM cube and return these results:

Figure 2: EPM Analysis – Vertical Federation

Jason Hodson Blog Figure 2

However, if the user were to expand the state of Washington to see the results for each city, OBI would route the query to the data warehouse. When the results return, the user would be confronted with different revenue figures for the Southwest and West regions:

Figure 3: Data Warehouse – Vertical Federation

Jason Hodson Blog Figure 3

When the hierarchical structures are not aligned between the two data sources, irreconcilable differences can occur when switching between the sources. Many times, end users are not aware that they are switching between EPM and a data warehouse, and will simply experience a confusing reorganization in their analysis.

To demonstrate issues that occur in horizontal federation, assume the same hierarchies as in Figure 1 above, but the EPM application contains data on budget revenue while the data warehouse contains details on actual revenue. An analysis such as this could be created to query each source simultaneously and combine the budget and actual data along the common dimension:

Figure 4: Horizontal Federation

Jason Hodson Blog Figure 4

However, drilling into the West and Southwest regions will result in Colorado becoming an erroneously “shared” member:

Figure 5: Colorado as a “Shared” Member

Jason Hodson Blog Figure 5

In actuality, the mocked up analysis above would more than likely result in an error since OBI would not be able to match the hierarchical structures during query generation.

There are a number of options to enable the synchronization of hierarchical structures across EPM applications and data warehouses. Many organizations are manually maintaining their hierarchical structures in spreadsheets and text files, often located on an individual’s desktop.  It is possible to continue this manual maintenance; however, these dispersed files should be centralized, a governance processes defined, and the EPM metadata management and data warehouse ETL process redesigned to pick up these centralized files.  This method is still subject to errors and is inherently difficult to properly govern and audit.  For organizations that are already using Enterprise Performance Management Architect (EPMA), a scripting process can be implemented that extracts the hierarchical structures in flat files.  A follow on ETL process to move these hierarchies into the data warehouse will also have to be implemented.

The best practices solution is to use Hyperion Data Relationship Management (DRM) to manage these hierarchical structures. DRM boasts robust metadata management capabilities coupled with a system-agnostic approach to exporting this metadata.  DRM’s most valuable export method allows pushing directly to a relational database.  If a data warehouse is built in tandem with an EPM application, DRM can push directly to a dimensional table that can then be accessed by OBI.  If there is a data warehouse already in place, existing ETL processes may have to be modified or a dimensional table devoted to the dimension hierarchy created.  Ranzal has a DRM accelerator package to enable the synchronization of hierarchical structures between EPM and data warehouses that is designed to work with our existing EPM application DRM implementation accelerators.  Using these accelerators, Ranzal can perform an implementation in as little as six weeks that provides metadata management for the EPM application, establishes a process for maintaining hierarchical structure synchronization between EPM and the data warehouse, and federation of the data source.

While the federation of EPM and data warehouse sources has been the primary focus, it is worth noting that two EPM cubes or two data warehouses could be federated in OBI. For many of the reasons discussed previously, data synchronization processes will have to be in place to enable this federation.  The previous solutions for maintaining metadata synchronization may be able to be adapted to enable this federation.

The federation of EPM and data warehouse sources allows an enterprise to create a more tightly integrated analytical solution. This tight integration allows users to transverse the organization’s data, gain insight, and answer business essential questions at the speed of thought.  As demonstrated, mismanaging hierarchical structures can result in an analytical solution that produces unexpected results that can harm user confidence.  Enterprise solutions often need enterprise approaches to governance; therefore, it is often imperative to understand and address shortcomings in hierarchical structure management.  Ranzal has a deep knowledge of EPM, DRM, and OBIEE, and how these systems can be implemented to tightly work together to address an organization’s analytical and reporting needs.

Visualizing Big Data

This post explores using Tableau Server to visualize data in a Hadoop cluster.

More and more businesses are finding value or savings in replumbing their databases to be deployed on commodity hardware running open-source software like Cloudera Distribution of Hadoop. With tools like Impala, realtime queries of massive datasets become possible. To get the most insight, compelling interactive data exploration and visualization is necessary. We wanted to explore how Tableau works in this regard, and found Tableau Server visualizing data from CDH using Impala proved facile. This combination provides data exploration at the speed of thought with beautiful intuitive visualizations, resulting in a quick front-end for big data.

To demonstrate this, I took the classic AdventureWorks Bike Store data that Microsoft uses to demo a data warehouse in SQL Server (also used in our Endeca Information Discovery demo) and loaded it in our CDH cluster using Impala. I downloaded Tableau Desktop, as well as the Cloudera ODBC Driver for Impala, and spun up a VM running Windows Server to host Tableau Server. After configuring the Impala driver to point at our cluster, I launched Tableau and added a new data source. Tableau makes it pretty simple to choose from a menu of data sources, from a simple CSV to a massive CDH cluster. After selecting the Cloudera Hadoop option, I input our cluster DNS and the port and credentials for Impala. I selected my new Bike Store database and table, and was ready to whip up some visualizations.

Tableau provides tools for ETL, including a pretty nice GUI for simple joins, but since I was trying to denormalize a star schema I did the transformations using impala-shell where I have more control and the operations are more visible. Cluster-side ETL would also help these visualizations run at the speed of thought, even if working with big data at scale.

Using Tableau provides really easy creation of standard charts and even more complicated visualizations like maps can be rendered automagically with geographic attribute detection. I found the date detection to be less dependable. You can add filters to slice and dice on different attributes, and create dashboards to combine several worksheets, or create “Stories” to walk a viewer through a series of visualizations.

Overall, I found some aspects of Tableau a little bit like using Apple products: you get great design and intuitive functionality at the cost of robust configurability. It’s a fantastic tool to get pretty visualizations up and running quickly and intuitively, to add data sources on the fly with little complexity, and to quickly share the results. But in spite of those strengths, Tableau isn’t a replacement for enterprise-scale BI offerings.

Interact with the visualization we built and embedded here: Store Dashboard