What You Can Do…

Last week, we announced general availability of our Advanced Visualization Framework (AVF) for Oracle Endeca Information Discovery.  We’ve received a lot of great feedback and we’re excited to see what our customers and partners can create and discover in a matter of days. Because the AVF is a framework, we’ve already gotten some questions and wanted to address some uncertainty around “what’s in the box”.  For example: Is it really that easy? What capabilities does it have? What are the out of the box visualizations I get with the framework?

Ease of Use

If you haven’t already registered and downloaded some of the documentation and cookbook, I’d encourage you to do so.  When we demoed the first version of the AVF at the Rittman Mead BI Forum in Atlanta this spring, we wrapped up the presentation with a simple “file diff” of a Ranzal AVF visualization.  It compared our AVF JavaScript and the corresponding “gallery entry” from the D3 site that we based it on.  In addition to allowing us to plug one of our favorite utilities (Beyond Compare 3), it illustrated just how little code you need to change to inject powerful JavaScript into the AVF and into OEID.

Capabilities

Talking about the framework is great, but the clearest way to show the capabilities of the AVF is by example.  So, let’s take a deep dive into two of the visualizations we’ve been working on this week.

First up, and it’s a mouthful, is our “micro-choropleth”. We started with a location-specific choropleth (follow the link for a textbook definition) centered on the City of Chicago.  Using the multitude of publicly available shape files for Chicago, the gist of this visualization is to display publicly available data at a micro level, in this case crime statistics by neighborhood.  It’s completely interactive: it reacts to guided navigation, gives contextual information on mouse-over, and even gives you the details about individual events (i.e. crimes) when you click in.

Great stuff, but what if I don’t want to know about crime in Chicago?  What if I want to track average length of stay in my hospital by where my patients reside?  Similar data, same concept; how can I make that transition easily?  Well, our micro-choropleth has two key capabilities, both enabled by the framework, to account for this.  Not only does it contain a number of different shape layers by default (JavaScript objects for USA state-by-state, USA states and counties, etc.), it also gives you the ability to add additional ones via Studio (no XML, no code).  Once I’ve added the new JavaScript file containing the data shape, I can simply set some configuration to load this totally different geographic data frame rather than Chicago.  I can then switch my geographic configuration (all enabled in my visualization’s definition) to indicate that I’ll be using zip codes rather than Chicago neighborhoods for my shapes.  Note that our health care data and medical notes are real, but we de-identify the data, leaving our “public data” at the zip code level of granularity.  From there, I simply change my query to hit population health data, calculate a different metric (length of stay in days), and I’m done!

That’s a pretty “wholesale” change that just got knocked out in a matter of minutes.  It’s even easier to make small tweaks.  For example, notice there are areas of white in my map that can look a little washed out.  These are areas (such as the U.S. Naval Observatory) that have zip codes but lack any permanent residents.  To increase the sharpness of my map, maybe I want to flip the line colors to black.  I can go into the Preferences area and edit CSS to my heart’s content.  In this case, I’ll flip the border class to “black” right through Studio (again, no cracking open the code)… …and see the changes occur right away.

The same form factor holds for other visualizations we’ve been working on.  The following visualization leverages a D3 force layout to show a node-link analysis between NFL skill position players (it’s Fantasy Football season!) and the things they share in common (college attended, draft year, draft round, etc.).  Below, I’ve narrowed down my data (approximately 10 years’ worth) by selecting some of the traditional powers in the SEC East and limiting to active players.

This is an example of one of our “template visualizations”.  It shows you relationships and interesting information, but it is really intended to show what you can do with your own data.  I don’t think the visualization below will help you win your fantasy league, though it may help you answer a trivia question or two.

However, the true value is in realizing how this can be used in real data scenarios.  For example, picture a network of data related to intelligence gathering.  I can visualize people, say known terrorists, and the organizations they are affiliated with.  From there, I can see others who may be affiliated with those organizations in a variety of ways (family relations, telephone calls, emails).  The visualization is interactive; it lends itself to exploration through panning, scanning and re-centering.  It can show all available detail about a given entity or relationship and provide focused detail when things get to be a bit of a jumble.

And again, the key is configuration and flexibility over coding.  The icons for each college are present on my web server but are driven entirely by the data, and retrieved and rendered using the framework.  The color and behavior of my circles are configurable via CSS.

What’s In The Box?

So, you’re seeing some of the great stuff we’ve been building inside our AVF.  Some of the visualizations are still in progress and some are proofs of concept, but a lot is already packaged up and included. We ship with visualizations for Box Plots, Donut Charts, the Animated Timeline (aka “Health and Wealth of Nations”), and our Tree Map.  In addition, we ship with almost a dozen code samples for other use cases that can give you a jump start on what you’re trying to create, including a US Choropleth (States and Counties), a number of hierarchical and parent-child discovery visualizations, and a Sunburst chart. We’ll also be “refreshing the library” on a monthly basis with new visualizations and updates to existing ones.  These updates might range from simple demonstrations of best practices and design patterns to fully fledged, supported visualizations built by the Engineering team here in Chicago.  Our customers and partners who are using the framework can expect an update on that front around the first of the month.

As always, feedback and questions welcome at product [at] ranzal.com.

Introducing Visualizations 2.0

Ranzal is proud to announce general availability of our Advanced Visualization Framework (AVF) 2.0 for Oracle Endeca Information Discovery (OEID) today. A few months back, we released our first cut at a “framework within the framework” for building powerful data visualizations rapidly in an enterprise-class discovery environment. It allowed both our internal team and our customers to build visualizations that deliver powerful insights in a matter of days and hours (or sometimes minutes) rather than weeks and months.  You find something cool, you plug the JavaScript into our framework, fill out some XML configuration and you’re on your way.

So What’s New?

This new release builds on top of our previous work and makes vast improvements both to how visualizations get built (using JavaScript) and configured (inside of OEID Studio).  The most common piece of feedback we received the first time around was that, once an advanced visualization was ready to go, configuring it could be exceedingly difficult even for a seasoned user. In this release, we’ve made great strides in improving what we call the configuration experience, investing primarily in ease of use and user guidance.

Ease of Use

We set out in this release to make every part of the configuration screens easier to understand.  For each tab in our set of preferences, we’ve either streamlined the set of required fields and/or given the user a set of tools and information to use as a reference.

For example, take the query writing process.  In the previous version, users needed to enter their EQL essentially “without a net”.  There was no config-side validation, no visual help with simple constructs such as the field names available in a view.  It was hard, if not impossible, to get things right without some back and forth or three different browser tabs open. In the AVF 2.0, the EQL authoring process looks like this:

The user no longer has to remember field names; an at-a-glance reference showing display names, attribute keys and types is front and center.  In addition, there is now on-the-spot validation of the query (upon hitting Save) to help diagnose any syntactical errors that may be present.

Throughout the configuration experience, we’ve made things easier to use.  But how does the AVF 2.0 help the user through the full process of configuring a visualization?  Read on.

Guided Configuration

To that end, we set out to make it easy for developers to guide their users in configuring the data elements (queries, CSS, etc.) that provide the backing for a visualization.  It was apparent very early on that, in many cases, building an advanced visualization requires some advanced capabilities.  This can be illustrated in the famous “Wealth and Health of Nations” visualization that we call the Animated Timeline:

It’s a really cool visualization with a nice combination of interactivity and dynamic playback.  However, the first time we encountered it, it took us a moment to wrap our heads around questions such as “How many metrics?  How many group-bys?”  It takes a fair amount of understanding to pull off generating the data for such a complex visualization.*

For a complex visualization, the developer who wrote the JavaScript has this in-depth understanding.  The trick is to provide a “no-code” capability for the developer to help the user along the configuration path.  In this release, every visualization and nearly every configurable field for a visualization can be enabled for contextual help.

This includes the visualization itself….

…its queries….


and even Custom Preferences of your own design.

Simply adding description attributes to the XML configuration for a given visualization type allows the developer to provide the power user with all the help they need.

*For the record, the Animated Timeline uses 3 metrics (X, Y, and size of the bubble) and 3 group-by attributes (Detail, Color and Time).

Pruning the Hedges

Frankly, the first time we released this offering, we tried to make it too configurable and too open.  Call it the software framework corollary of “Four Drafts”.

To take one egregious example, AVF 1.0 introduced the idea of configurable tokens.  Configurable tokens are still available because they’re extremely flexible and valuable.  However, we also had something called “conditional dynamic tokens”.  These tokens came with their own grammar, best described as hieroglyphic, that governed their values at all times (sort of a symbolic case statement).  It’s an extremely powerful construct for the developer (each token potentially saves you 5-10 lines of JavaScript) but completely confusing to the power user trying to configure a chart.  Constructs like that are gone.  The developer is left to do a little more work, but the 99% of our users who never touched this functionality will find the experience a lot easier to navigate.  Less is more.

What’s Next

We’re also making a concerted effort to bring more developers into the fold.  To that end, while we don’t make our framework available for download without some agreements in place, our Installation Guide and AVF Developer’s Guide can be found on our Downloads page and are available to registered users.  To register at Bird’s Eye View, simply click here or use the registration link on the right.

The most exciting part of the whole documentation set (talk about an oxymoron) is a new section of the Developer’s Guide called the Cookbook.  It’s a set of small, simple examples that allows developers to quickly come up to speed on the framework and start writing visualizations that much faster (think 15 minutes).

If you’re interested in learning more, don’t hesitate to comment below or drop us a line at product [at] ranzal.com.

Return to Action

Late last year, we introduced some incredibly compelling capabilities that allowed users to collaborate with each other inside of Oracle Endeca Information Discovery (OEID) 3.0. Our collaborative discovery extension allowed users with certain permissions to delete records, edit attributes of records, and add attribute values to existing attributes, all from within OEID Studio.  It’s a powerful way to assist in data cleanup, data flagging, or grouping certain records together, with applicability to almost every data discovery scenario. We’re pleased to announce that we’ve re-launched this functionality and it is now available for licensed users of OEID 3.1.  The same great capabilities that have always been there remain, but we’ve given it a bit of a facelift as part of the upgrade, as you can see below.

Deleting Misleading or Invalid Records


Occasionally, incorrect or misleading data will find its way into a given application.  If it has no business being there or has an unwanted, adverse effect, let’s get rid of it! Hey, this tweet has nothing to do with the Olympics! Bye, bye, spammer!

Augmenting Existing Records With More Attribution

In addition, users may find something interesting on a record (or set of records) and want to take action to augment the attribution of the record.  Below, two terrorist incidents (from one of our internal applications) have been identified as possibly having links to ISIS based on location. The data can be augmented by selecting the field to hold this additional information (for simplicity’s sake, we added it to “Themes”)… …and adding the additional value (or values):

Replacing Existing Attributes on Records

In the same vein as the first use case, users may find a record or set of records where they want to set a brand new value for an attribute.  It could be changing a Status from Open to Closed, or from Valid to Invalid, or maybe correcting an error made during ingest, such as a poorly calculated sentiment score. After each change, we update the index upon selection of “Apply Changes”.  If you look below, you can see the result reflected in the application immediately.

Now, there’s one final piece that hasn’t been mentioned, and it completely closes the loop.  Since an Oracle Endeca discovery application is typically not the “system of record”, there’s the possibility that an update sourced from an upstream system could override changes made by users. We’ve accounted for that as well by persisting all changes to a dedicated database store that can be integrated into the update process.  For example, if I’ve deleted a record from my application, I can use the “log of deletes” in the database as a data source in any ETL processes that happen subsequently.  Simply filter the incoming data stream using the data stored in the database and you’re good to go.  Attribute replacements and additions work the same way and are tracked and logged appropriately.
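To make that filtering step concrete, here is a minimal Python sketch of the idea.  The table and column names (delete_log, record_id) are hypothetical, and in practice the filter would live in your ETL tool of choice:

import sqlite3

# Hypothetical store where the extension persists its "log of deletes".
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE delete_log (record_id TEXT)")
conn.execute("INSERT INTO delete_log VALUES ('tweet-123')")

# Records arriving from the upstream system of record.
incoming = [{'id': 'tweet-123', 'text': 'Buy followers now!'},
            {'id': 'tweet-456', 'text': 'Go Team USA!'}]

# Drop anything a user has deleted before it gets re-indexed.
deleted = set(row[0] for row in conn.execute("SELECT record_id FROM delete_log"))
to_index = [rec for rec in incoming if rec['id'] not in deleted]
print(to_index)  # only 'tweet-456' survives the filter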

If you’re interested in pricing or capabilities, or just want to give feedback, drop us a line at product [at] ranzal.com.  It was already delivered to a customer in Spain last month, and we’re looking forward to seeing more and more people in the community get their hands on it.

Why Jython for FDMEE

Originally published May 19, 2014 on ODTUG.com

Contributed by:
Tony Scalese, Integration Practice Director
Hyperion Certified
Oracle ACE
Edgewater Ranzal
ascalese@ranzal.com

In the 11.1.2.3 release of the EPM stack, Oracle introduced Financial Data Quality Management, Enterprise Edition, or FDMEE. FDMEE was not entirely a new product but rather a rebranding of ERP Integrator (ERPi), which was released in 11.1.1.3. FDMEE actually represents the convergence of two products – FDM (now known as FDM Classic) and ERPi – and represents the best of both products. Besides the obvious changes – a new user interface (UI) that is integrated into Workspace, leveraging Oracle Data Integrator (ODI) as its data engine and direct integration to many Oracle and SAP ERP source systems – FDMEE introduced one rather significant change: Jython as its scripting language.

For organizations that have a significant investment in FDM Classic, a new scripting language likely represents one of the most daunting changes in the new data management tool known as FDMEE. Before we continue, let’s briefly talk about scripting in FDMEE. Customers face a choice with FDMEE scripting – VBScript or Jython. I have spoken with a number of customers that have asked, “Can’t I just stick with VBScript, because it’s very similar to FDM scripting, which was basically VBScript?” The technical answer is, in most cases, yes, you likely could. The more thought-out answer is, “Have you considered what you are giving up by sticking with VBScript?” Well, that really isn’t an answer, is it?

Let’s take a moment to understand why I ask that question and consider, at a high level, the differences between these two languages. From Wikipedia: VBScript (Visual Basic Scripting Edition) is an Active Scripting language developed by Microsoft that is modeled on Visual Basic. It is designed as a “lightweight” language with a fast interpreter for use in a wide variety of Microsoft environments. Jython, successor of JPython, is an implementation of the Python programming language written in Java.
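That last definition is worth pausing on. Because Jython runs on the Java Virtual Machine, a Jython script can import and use Java classes directly alongside Python syntax. A quick illustrative sketch (generic Jython, not FDMEE-specific API):

from java.util import ArrayList  # a Java class, imported like a Python module

names = ArrayList()              # instantiate the Java object directly
names.add('FDM Classic')
names.add('ERPi')
print names.size()               # Jython 2.x uses Python 2 print syntax; prints 2

This only runs under Jython, not standard CPython, but it shows why the language is such a natural fit for a Java-based stack like EPM.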

Take a moment to consider the Enterprise Performance Management (EPM) stack at Oracle. Have you noticed any trends over the past two to three years? In 11.1.2.2, Oracle rewrote the UI for HFM using the Oracle ADF framework. In 11.1.2.3, Oracle removed all but the most basic functionality from the HFM Win32 client. In 11.1.2.4, HFM is planned to be platform agnostic, meaning it can run on virtually any operating system, including Linux and UNIX. Have you heard about this nifty new appliance called Exalytics? My point in this trip down memory lane is that Oracle is clearly moving away from any reliance on Microsoft technology in its product stack. Any time I have said this, the question inevitably is asked: “Do you think we’ll still be able to run EPM on Windows servers?” My answer is a resounding YES. Oracle may not be the biggest fan of integrating Microsoft technology into its software solutions, but they are smart enough to understand that failing to support Windows as a server platform would lock them out of too large a share of the market. So breathe easily; I don’t see Oracle producing EPM software that can’t be deployed on Windows servers.

The EPM stack is moving toward becoming fully platform agnostic. Exalytics is, for those of you who are not familiar, an in-memory Linux or Solaris machine that delivers extreme performance for Business Intelligence and Enterprise Performance Management applications. At a very high level, these machines have an extraordinary amount of memory (RAM) that allows the entire database to be brought into memory. The result is incredible performance gains particularly for large applications.

There are at least two constants with technology. First, data continues to grow, and the demand for more data to support business decisions is no exception. The other constant is that hardware continually improves while its cost continually drops. I can’t envision Exalytics being an exception to this. Today’s Exalytics machines often cost six figures, and that may not be an expense your organization can justify. However, in two to five years, your organization may require an Exalytics machine, and it may well be an expense you can justify.

Given this bit of background, let’s talk about why I firmly believe Jython is the better choice for your FDMEE application. As the EPM stack moves toward being platform agnostic, I believe that support for Microsoft technologies such as VBScript will slowly diminish. The application programming interface (API) will continue to be enriched for the Jython language set, while the API for VBScript will be less robust. Please keep in mind that this is just my prediction at this point. But Oracle is no different than any other organization that undertakes technology projects. They employ the same three levers that every project does – scope, time, and budget. As a customer of EPM, you have noticed the speed at which new releases have been deployed. To continue to support two sets of APIs within these accelerated development timelines will require one of the remaining levers to be “pulled.”

That leaves us the scope and budget levers. To maintain complete parity (scope) with a fixed timeline, the remaining lever is budget. Budget for any technology project is heavily correlated with people: add more people to the project, and the cost goes up. As I said before, Oracle is no different than any other organization. Project costs must be justified. So the development head of FDMEE would need to justify to his senior management the need to add additional resources to support having two sets of APIs – one of which is specifically for Microsoft technologies. One can imagine how that conversation might go.

So we’re left with the scope lever. There are two APIs – one to support Jython (Python on the JVM) and a second to support VBScript (a Microsoft technology). Let’s not forget that Oracle owns Java. Which do you think wins? I hope that I have built a case to support my previous conjecture about the expected richness of the Jython API vs. the VBScript API.

Let’s say you believe my above prediction is wrong. That’s OK. Let’s focus on one key difference in these technologies – error handling. Throughout my years of developing scripts for FDM Classic, the recurring theme I heard from customers was: when a process fails, can the system alert me? The technical answer is, most likely. The practical answer is no. While in a VBScript routine I can leverage On Error Resume Next in conjunction with If Err.Number = 0, I would need to do this after every line of code, and that is simply not realistic. The best solution I have found is writing scripting operations to a log file that can then be reviewed to identify the point at which a script fails. While this approach has helped, it’s not nearly as elegant as the true error handling available in Jython.

Jython provides error handling through the use of the except keyword. If you have ever written (not recorded) Excel VBA macros, you may be familiar with this functionality. In VBA, you would code On Error Goto ErrHandler and then have code within an ErrHandler section of the script that performs some operation in the event of an error. Within Jython, there is a similar, albeit more robust, concept with the try/except keywords. For example:

def divide_numbers(x, y):
    try:
        return x / y
    except ZeroDivisionError:
        # anticipated failure: return a friendly message instead of crashing
        return 'You cannot divide by zero, try again'

In the above example, the except clause is used to handle division by zero. With Jython, you can have multiple except clauses to handle different anticipated failures in a process, plus a bare except clause as a catch-all for unexpected failures (and a finally clause for cleanup that runs no matter what). A key piece of functionality with the except clause is the ability to capture the line in the script that caused the failure. This is a key improvement over VBScript.
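To sketch how those pieces fit together (again, generic Python/Jython rather than the FDMEE API), the standard traceback module is one way to capture the failing line:

import traceback

def load_file(path):
    try:
        fh = open(path)
        try:
            return fh.read()
        finally:
            fh.close()       # cleanup that runs whether or not an error occurred
    except IOError:
        return None          # anticipated failure: missing or unreadable file
    except Exception:
        # catch-all: log the full stack trace, including the failing line number
        traceback.print_exc()
        raise

A pattern like this lets you alert on failure from one place, rather than checking an error code after every line the way the VBScript approach forces you to.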

We could continue to delve further into the technical details of why Jython is a more robust language, but when I think about this conversation in the context of, “Why do I want to use Jython instead of VBScript for my application?” I think the above arguments are compelling on their own. If you are interested in learning more about Jython scripting for FDMEE, please attend my session at Kscope14:

Jython Scripting in FDMEE: It’s Not that Scary on Tuesday, June 24, at 11:15 AM.
http://kscope14.com/component/seminar/seminarslist#Jython Scripting in FDMEE: It’s Not That Scary

Tag 100 Times Faster — Introducing Branchbird’s Fast Text Tagger

Text Tagging is the process of using a list of keywords to search and annotate unstructured data. This capability is frequently required by Ranzal customers, most notably in the healthcare industry.

Oracle’s Endeca Data Integrator provides three different ways to text tag your data “out of the box”:

  • The first is the “Text Tagger – Whitelist” component which is fed a list of keywords and searches your text for exact matches.
  • The second is the “Text Tagger – Regex” component which works similarly but allows for the use of regular expressions to expand the fuzzy matching capabilities when searching the text.
  • The third is using Endeca’s “Text Enrichment” component (OEM’ed from Lexalytics) and supplying a model (keyword list) that takes advantage of the component’s model-based entity extraction.

Ranzal began working on a custom text tagging component due to challenges with the aforementioned components at scale. All of the above text taggers are built to handle tagging with relatively small inputs — both in the size of the supplied dictionary and in the number (and size) of documents. The throughput difference at scale is dramatic:

Tagger                           1,000 EMRs      10,000 EMRs      100,000 EMRs     1,000,000 EMRs
Fast Text Tagger (FTT)           250 docs/sec    1,428 docs/sec   4,347 docs/sec   6,172 docs/sec
Text Enrichment (TE), 1 thread   6.5 docs/sec    5 docs/sec       N/A              N/A
TE, 4 threads                    17.5 docs/sec   15 docs/sec      15 docs/sec      N/A

In one of our most common use cases, customers analyzing electronic medical records with Endeca need to enrich large amounts of free text (typically physician notes) using a medical ontology such as SNOMED-CT or MeSH. Each of these ontologies has a large number of medical “concepts” and their associated synonyms. For example, the US version of SNOMED-CT contains nearly 150,000 concepts. Unfortunately, the “out of the box” text tagger components do not perform well beyond a couple hundred keywords. To realize slightly better throughput during tagging, Endeca developers have traditionally leveraged the third component listed above — the Lexalytics-based “Text Enrichment” component — which offers better performance than the other two options.

However, after extensive use of the “Text Enrichment” component, it became clear that not only was the performance still not acceptable at high scale, but the recall of the component was also inadequate, especially with Electronic Medical Records (EMRs). The Text Enrichment component is NLP-based and relies on accurately parsing sentence structure and word boundaries to tokenize the document before entity extraction begins. EMRs typically have very challenging sentence structure, due both to the ad hoc writing style of clinicians at point of entry and to the observational metrics embedded in the record. Because of this, Text Enrichment of even small documents at high scale can be prohibitive for simple text tagging. A recent customer of ours, using very high-end enterprise equipment, was experiencing 24-hour processing times when using Text Enrichment to tag approximately six million EMRs with SNOMED-CT concepts.

To address both the performance and recall issues, Ranzal set out to build a simple text tagger component for Integrator that would be easy to set up and use. The Ranzal “Fast Text Tagger” was built using a high performance dictionary matching algorithm that ingests the list of terms (and phrases) into a finite state pattern matching machine, which can then be used to process the documents. One of the largest benefits of this family of search algorithms is that the document text only needs to be parsed once to find all possible matches within the supplied dictionary.
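That “parse once, match everything” behavior is characteristic of the Aho-Corasick family of algorithms: build a trie of the dictionary, add failure links, then stream each document through the automaton in a single pass. A minimal, illustrative Python sketch of the idea (not the shipped component):

from collections import deque

def build_automaton(keywords):
    # Phase 1: build a trie of the dictionary.
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # failure link per state
    out = [[]]     # keywords ending at each state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt].extend(out[fail[nxt]])  # inherit matches from the suffix state
    return goto, fail, out

def find_matches(text, goto, fail, out):
    # Single pass over the text; yields (start_index, keyword) pairs.
    state = 0
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            yield (i - len(word) + 1, word)

keywords = ['hypertension', 'type 2 diabetes', 'asthma']
goto, fail, out = build_automaton(keywords)
note = 'pt with hypertension and type 2 diabetes'
print(list(find_matches(note, goto, fail, out)))
# [(8, 'hypertension'), (25, 'type 2 diabetes')]

A production tagger would also normalize case and respect word boundaries, but even this sketch shows why throughput scales with document length rather than with dictionary size.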

The Ranzal Fast Text Tagger is intended to replace the stock “Text Tagger – Whitelist” component and the use of the “Text Enrichment” component for whitelisting. Our text tagger is intended for straight text matching, with optional restrictions to allow for matching on word boundaries. If your use cases require fuzzier text matching, then you should continue to use the “Text Tagger – Regex” at low scale and “Text Enrichment” at higher scales.

Performance Observations

Expanding on the metrics shown above (duplicated here), you can see the remarkable performance of the Ranzal Fast Text Tagger as compared to “Text Enrichment”, even when Text Enrichment is configured to consume 4 threads. Furthermore, the rate of the BB FTT tends to increase with the number of documents, before starting to level off near 1 million documents, whereas Text Enrichment stays relatively constant.

Tagger              1,000 EMRs      10,000 EMRs      100,000 EMRs     1,000,000 EMRs
BB FTT (1 thread)   250 docs/sec    1,428 docs/sec   4,347 docs/sec   6,172 docs/sec
TE (1 thread)       6.5 docs/sec    5 docs/sec       N/A              N/A
TE (4 threads)      17.5 docs/sec   15 docs/sec      15 docs/sec      N/A

As a final performance note, recall the previously mentioned customer whose graph ran for 24 hours just for text tagging: the same process completed on this test harness, with the same data, in just shy of 20 minutes. It took longer to read the data from disk than it took to stream it all through the Fast Text Tagger.  This implies that, in typical use cases, the Fast Text Tagger will not be a limiting component in your graph. For those of you curious about the benchmarking methods used, please continue below.

Test Runs

We built a graph that could execute the different test configurations sequentially and then compile the results. Shown below are four separate test runs and a screen capture of Integrator at test completion. Below each screen cap is a list of metrics:

  • Match AVG: The average number of concepts extracted per document over the corpus
  • Total Match: The total number of concepts extracted over the corpus
  • Misses: The number of non-empty EMRs where no concept was found
  • Exec Time: The total execution time of the test configuration

Note that Text Enrichment’s poor recall negatively impacts its precision (Match AVG). If you remove the (significant number of) misses, TE has precision nearly as high as our Fast Text Tagger.

Test 1: 1,000 EMRs


Test ID             Match AVG   Total Match   Misses   Exec Time
BB FTT (1 thread)   9           9,126         14       4 secs
TE (1 thread)       4           4,876         260      153 secs
TE (4 threads)      4           4,876         260      57 secs

Test 2: 10,000 EMRs


Test ID             Match AVG   Total Match   Misses   Exec Time
BB FTT (1 thread)   12          127,617       14       7 secs
TE (1 thread)       5           55,567        3,739    2,010 secs
TE (4 threads)      5           55,567        3,739    675 secs

Test 3: 100,000 EMRs


Test ID             Match AVG   Total Match   Misses   Exec Time
BB FTT (1 thread)   13          1,380,258     17       23 secs
TE (4 threads)      5           546,598       38,466   6,555 secs

Test 4: 1,000,000 EMRs


Test ID             Match AVG   Total Match   Misses   Exec Time
BB FTT (1 thread)   14          14,834,247    17       162 secs

Benchmarking Notes

Tests were conducted on OEID 3.1 using the US SNOMED-CT concept dictionary (148,000 concepts) against authentic Electronic Medical Records. Physical hardware used: a PC with a 4-core i7 (hyperthreading enabled), 32 GB of RAM, and SSD drives.

The “Text Tagger – Whitelist” was discarded as unusable for this test setup. “Text Enrichment” with 1 thread was discarded after the 10,000 document run and TE with 4 threads was discarded after the 100,000 document run.

Advanced Visualizations on Oracle Endeca 3.1

Ranzal is pleased to announce that our Advanced Visualization Framework is now generally available.  Spend more time discovering, less time coding.

The calendar has turned, the air is frozen and, with a new year, comes the annual deluge of “predictions and trends” articles for 2014.  Spoiler Alert: Hadoop isn’t going away and data, and what you do with it, is everything right now.

Maybe you’ve seen some of these articles, but one section of one in particular, Forbes’ “Top Four Big Data Trends”, caught our eye.  It’s simple, it’s casual (possibly too much so), but it really resonates:

“Visualization allows people who are not analytics gurus to get insights out of complex stuff in a way they can absorb.”

At Ranzal, we believe the goal of Business Intelligence and Data Discovery is to democratize the discovery process and allow anyone who can read a chart to understand their data.  It’s not about the fancy query or the massive map/reduce, it’s what you do with it.
