For the last decade, since its first appearance on Flickr, the tag cloud has been ubiquitous as the default construct for visualizing concepts and topics found within text. There’s a beauty in its simplicity: the larger the word on your screen, the more frequently it appears in the data you are analyzing.
Now, nothing against the tag cloud, but we’ve been thinking it’s gone a little stale lately. The visuals can be a bit muddled and, while it’s great for identifying single concepts, it does nothing to inform the user of relationships between the identified terms or any sentiment that may be attached to a given concept. The sum total of the visual is differently sized fonts, and good luck to you if you’re trying to draw any conclusions from your data.
Enter the Concept Cloud
Straight out of the lab at Ranzal’s HQ, the Concept Cloud is, we feel, a huge stride forward in visualizing unstructured concepts and relationships. In the above screen capture, we’ve highlighted six terms from a series of physician’s notes related to a patient’s heart issue.
The first thing you’ll notice above is that you still have the visual cue of “size equals frequency”. The most frequently referenced anatomical terms in the data are represented by the largest circles. We then take this concept and go deeper. Wave the mouse over one of the circles and you’ll see that we surface the exact number of references for the associated term.
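To make the “size equals frequency” idea concrete, here is a minimal Python sketch. It is not the Concept Cloud's actual implementation: the function names, the sample notes, the case-insensitive substring matching and the choice to scale circle area (rather than radius) with frequency are all assumptions made for illustration.

```python
import math
from collections import Counter

def term_frequencies(records, terms):
    """Count how many records mention each term (case-insensitive substring match)."""
    counts = Counter()
    for text in records:
        lowered = text.lower()
        for term in terms:
            if term.lower() in lowered:
                counts[term] += 1
    return counts

def circle_radius(count, max_count, max_radius=60.0):
    """Use a square-root scale so that circle AREA is proportional to frequency."""
    if max_count <= 0:
        return 0.0
    return max_radius * math.sqrt(count / max_count)

# Invented sample notes, purely for illustration.
notes = [
    "Patient reports chest pain; left ventricle enlarged.",
    "Aorta normal. Left ventricle function reduced.",
    "Mitral valve regurgitation noted; aorta unremarkable.",
]
terms = ["left ventricle", "aorta", "mitral valve"]

counts = term_frequencies(notes, terms)
largest = max(counts.values())
radii = {term: circle_radius(counts[term], largest) for term in terms}
```

The raw `counts` value is what a hover tooltip would surface, while `radii` would drive the drawn size of each circle.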
In addition, when you plug this visualization into a Data Discovery environment like Oracle Endeca Information Discovery, you get the ability to drill down and further investigate your data and have the cloud react accordingly. Below, we have another data set featuring key persons and concepts from everyone’s favorite topic, American Politics. You have the ability to click a circle and narrow your data to records that contain the term you’ve selected.
The Concept Cloud is the Tag Cloud “all grown up”. The traditional visual cues of size and frequency remain and, through shaping and shading, they are enhanced with sentiment analysis and a nicer visual experience. The final advance that the Concept Cloud provides, and the problem that drove us to create this solution, is in the connections between the circles. It’s great that key concepts present in the data are identified. However, what about the relationships?
This is where we think this visualization separates itself from the pack. Using some “relational/set magic”, the number of times these terms are found in proximity to each other is calculated and used to inform the user visually. When you wave over a line, linked terms are brought to the fore and unrelated terms are faded. And, just as larger circles can be highlighted to show the exact frequency of terms, the edges or connections provide the same level of precision and can show the exact number of links in common.
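We won't spell out the “relational/set magic” here, but one simple way to get link counts of this kind is to treat each record as a set of terms and count every pair that shows up together. The Python sketch below assumes that “proximity” means “in the same record”; the function name and the sample notes are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(records, terms):
    """Count, for every pair of terms, how many records contain both."""
    pair_counts = Counter()
    for text in records:
        lowered = text.lower()
        # The set of known terms present in this record, sorted so that
        # each pair is always stored in the same (alphabetical) order.
        present = sorted({t for t in terms if t.lower() in lowered})
        for first, second in combinations(present, 2):
            pair_counts[(first, second)] += 1
    return pair_counts

# Invented sample notes, purely for illustration.
notes = [
    "Patient reports chest pain; left ventricle enlarged.",
    "Aorta normal. Left ventricle function reduced.",
    "Mitral valve regurgitation noted; aorta unremarkable.",
]
terms = ["left ventricle", "aorta", "mitral valve"]

edges = cooccurrence_counts(notes, terms)
```

Each entry in `edges` corresponds to one line between two circles, and its value is the exact number a hover over that line would show. Pairs that never occur together simply have no entry.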
If you look back at the original graphic, it should be apparent that the terms are actually laid out according to how closely they are linked. Terms that frequently appear together are arranged more closely than those with a looser association. Terms that are totally unrelated will still be shown on the page, but totally disconnected from their cohort.
Also, because this visualization is based on platform-neutral technology (including D3 and SVG) and not Flash, it looks great on mobile, supports zooming in and out, and scales beautifully.
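As a rough illustration of how link strength could drive the layout, the Python sketch below turns link counts into target distances that a force-directed layout could then try to honor, and picks out the terms with no links at all. The linear mapping, the pixel ranges, the function names and the sample counts are all assumptions, not a description of how the Concept Cloud does it.

```python
def link_distances(pair_counts, min_dist=40.0, max_dist=200.0):
    """Map positive link counts to target distances: stronger link, shorter line."""
    if not pair_counts:
        return {}
    strongest = max(pair_counts.values())
    return {
        pair: max_dist - (max_dist - min_dist) * (count / strongest)
        for pair, count in pair_counts.items()
    }

def isolated_terms(terms, pair_counts):
    """Return the terms that take part in no link and so float free of the rest."""
    linked = {term for pair in pair_counts for term in pair}
    return [term for term in terms if term not in linked]

# Invented link counts, purely for illustration.
pair_counts = {("aorta", "left ventricle"): 8, ("aorta", "mitral valve"): 2}
terms = ["aorta", "left ventricle", "mitral valve", "femur"]

distances = link_distances(pair_counts)
loners = isolated_terms(terms, pair_counts)
```

With these sample counts, the strongly linked pair gets the shortest target distance, the weakly linked pair sits farther apart, and "femur" ends up in `loners`, drawn on the page but unattached.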
One other thing to note on the last diagram is the color coding of different concepts. When pairing this technology with sentiment analysis engines, such as Lexalytics, we can appropriately shade the representative circles for terms and concepts to indicate whether they are being referred to in a positive or negative fashion.
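For the curious, shading by sentiment can be as simple as blending from a neutral gray toward green or red. The Python sketch below assumes the sentiment engine's output has already been normalized to a score between -1 (negative) and 1 (positive); that range, the function name and the specific colors are assumptions for illustration, not the Lexalytics API or our actual palette.

```python
def sentiment_color(score):
    """Return a hex fill color for a sentiment score, clamped to [-1, 1]."""
    score = max(-1.0, min(1.0, score))
    neutral = (160, 160, 160)                      # gray for neutral mentions
    target = (46, 160, 67) if score >= 0 else (207, 34, 46)  # green or red
    weight = abs(score)                            # 0 = neutral, 1 = full color
    rgb = [round(n + (t - n) * weight) for n, t in zip(neutral, target)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)

positive_fill = sentiment_color(1.0)
neutral_fill = sentiment_color(0.0)
negative_fill = sentiment_color(-1.0)
```

The returned string can be used directly as the fill of an SVG circle.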
We’re starting to integrate this capability on our current engagements, so please contact us if you’d like to learn more or even if you’d just like to see what this would look like on top of your own data. We always welcome any feedback or suggestions: comment below, tweet at us or send us a message at info [at] ranzal.com.