Consuming “Data-as-a-Service” with OEID

A growing trend in the Big Data arena is for data to be offered as a service (DaaS).  Gone are the days when data was bought and shipped on storage devices.  Online data services are springing up all across the web, offering subscription-based access to real-time data, and new companies are forming to monetize these developer-friendly, consumable services.  Datasift, Infochimps, and Oracle’s Collective Intellect are recent, concrete examples of this DaaS proliferation.  The emerging best practice is for these services to deliver their data over HTTP with JSON responses.  JSON, like XML, can offer schema-less, self-describing records that are well suited to the sparsely attributed information typical of Big Data.

Traditional BI systems, which are built to work with tabular, rectangular data, are ill-equipped to consume these new, semi-structured data sources.  NoSQL databases like MongoDB or HBase, and data discovery products like Oracle’s Endeca Information Discovery (OEID), are purpose-built to handle these sparsely attributed, self-describing records.

A product like OEID that handles semi-structured data so naturally should offer a simple way to consume it.  Enter Ranzal’s Cloud Reader.


The Ranzal Cloud Reader’s approach is simply to take each incoming JSON “record” from any of the web’s HTTP-based data services and normalize it into its key-value-pair (KVP) parts.  The reader is built to consume either JSON text files or persistent HTTP connections; the latter allows data to “trickle” into your OEID application as it arrives.

For example, given this JSON response/record:

{"interaction":{"source":"twitterfeed", "author":{"username":"dan.brock","name":"Dan Brock","id":"123456789"}, "type":"twitter", "id":"1e1334c7fa6ea280e074bdb552477448"} }

 

The Ranzal Cloud Reader will consume the JSON record and output all of its key-value-pair attributes.  Where there is hierarchy in the JSON record, the reader simply prefixes each attribute name with the names of its ancestors, joined by underscores.  For the JSON example above, the following flattened KVP output would be produced (header row included for clarity):

ID|Key|Value
1e1334c7fa6ea280e074bdb552477448|interaction_id|1e1334c7fa6ea280e074bdb552477448
1e1334c7fa6ea280e074bdb552477448|interaction_source|twitterfeed
1e1334c7fa6ea280e074bdb552477448|interaction_author_username|dan.brock
1e1334c7fa6ea280e074bdb552477448|interaction_author_name|Dan Brock
1e1334c7fa6ea280e074bdb552477448|interaction_author_id|123456789
1e1334c7fa6ea280e074bdb552477448|interaction_type|twitter
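The flattening step can be illustrated with a few lines of Python.  This is only a minimal sketch of the idea, not the Cloud Reader’s actual implementation: the function names `flatten` and `to_kvp_rows` are hypothetical, the choice of `interaction_id` as the record ID is an assumption based on the sample output above, and JSON arrays are not handled.

```python
import json

def flatten(node, prefix=""):
    """Yield (key, value) for every leaf, prefixing ancestor names with '_'."""
    for name, child in node.items():
        key = f"{prefix}_{name}" if prefix else name
        if isinstance(child, dict):
            # Nested object: recurse, carrying the ancestor path along.
            yield from flatten(child, key)
        else:
            yield key, child

def to_kvp_rows(raw_json, id_key="interaction_id"):
    """Return 'ID|Key|Value' rows for one JSON record (id_key is an assumption)."""
    pairs = list(flatten(json.loads(raw_json)))
    record_id = dict(pairs)[id_key]
    return [f"{record_id}|{key}|{value}" for key, value in pairs]

record = ('{"interaction":{"source":"twitterfeed", '
          '"author":{"username":"dan.brock","name":"Dan Brock","id":"123456789"}, '
          '"type":"twitter", "id":"1e1334c7fa6ea280e074bdb552477448"} }')

for row in to_kvp_rows(record):
    print(row)
```

In this sketch the rows come out in the order the attributes appear in the JSON, so the `interaction_id` row is emitted last rather than first; the set of rows is otherwise the same as the output shown above.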

 

Keeping the data in this flattened KVP form and tying it to Integrator’s “KVP Writer” allows every attribute offered by the data service to become immediately navigable in your application.  This removes the need to keep revisiting your data model and metadata edge definitions in Integrator whenever the service adds or changes attributes.

Rumor has it that Oracle will offer its own “JSON reader” with the next version of its OEID product (v2.4) which will offer many of these same capabilities.

If you’re interested in learning more about this custom reader, or in how Ranzal can help your organization unlock value in the thousands of data services that the web offers, please visit us at ranzal.com.

 

 
