Data for dummies: 6 data-analysis tools anyone can use

I’m a big fan of measurement, metrics and visualisations. I’m not a big fan of the amount of work it can take to set up those metrics. So I was excited back in 2013 to find an article listing six tools to help visualise data. My original post is below, and the original article I referenced is still online.

Five years on and the tools are mostly still around.

  1. BigML
    The company still exists and provides comprehensive data analysis, visualisation and predictions. There are strong integrations with other tools and programming languages. They’re aiming at the corporate market with enterprise pricing options. There’s still a free option and a subscription option for individuals, but their focus seems to be more at the company scale.
  2. Google Fusion Tables
    Disappeared? I suspect some of the data features are still available, but the tool itself has been retired.
  3. Infogram
    Makes infographics in a really simple way, but you need to upgrade for any downloads. I so want this tool at work though; I made the infographic scorecard on the right in a matter of minutes.
  4. Many Eyes
    Doesn’t seem to be a standalone product any more; it has been absorbed into IBM’s data tools. That’s fine, but IBM focuses on the corporate market and, in my experience, is often at the pricier end of it.
  5. Statwing
    Statwing was bought by Qualtrics in 2016 and remains a standalone tool with the same focus on statistics. You’ll get insights into data and be able to apply statistical tests easily, but the output isn’t as designed as some of the other tools.
  6. Tableau
    As luck would have it, I now have access to the subscription version of this at work but have yet to play with it. It’s a powerful tool for visualising data and you can wander through the public gallery to see what the possibilities are – including a visualisation of the shape of national happiness. The great thing is that you can always drill down into the data. We will be doing some stakeholder analysis early next year and I think this tool will be a great way to visualise the results.

Five years since my original post and I’m still geekily enthusiastic about data visualisation tools.

 


I’ve spent about an hour playing with these tools. I’m loving Statwing, and will use it to analyse some of the data we’ve got on adoption of new technology. The Infogram tool also has potential to help present data in a more appealing way.

 

Image: data

Big Data

Big Data is often touted as a solution to all our problems, a panacea for all ills, often by people who struggle to define it. So what is big data and what kind of problems has it solved?

Big data refers to sets of data so big and complex that they cannot be analysed by traditional methods and tools, but which release new value when analysis is achieved.

Google Translate is an example of a problem solved by the use of big data. Although the translations are imperfect, they are often good enough to give an understanding of what the writer intended, whatever language it was written in. Google does this by statistically analysing millions of documents online that exist in multiple languages and figuring out what is most likely to be a correct translation. The more documents available that have been accurately translated by humans, the more accurate the Google translation will be.
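
As a rough illustration of "figuring out what is most likely to be a correct translation", here is a toy sketch that counts how often words appear together in paired sentences. The five sentence pairs and the function names `train` and `best_translation` are made up for this example, and the scoring (a Dice coefficient on co-occurrence counts) is a simple stand-in, not a description of how Google Translate actually works.

```python
from collections import Counter, defaultdict

# Hypothetical hand-made parallel corpus (Dutch, English); illustration only.
PAIRS = [
    ("het huis", "the house"),
    ("het boek", "the book"),
    ("een huis", "a house"),
    ("een boek", "a book"),
    ("het rode huis", "the red house"),
]


def train(pairs):
    """Count how often each source word shares a sentence pair with each target word."""
    cooc = defaultdict(Counter)
    src_count = Counter()
    tgt_count = Counter()
    for src, tgt in pairs:
        src_words = set(src.split())
        tgt_words = set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        for s in src_words:
            for t in tgt_words:
                cooc[s][t] += 1
    return cooc, src_count, tgt_count


def best_translation(word, model):
    """Return the target word with the highest Dice score, or None if the word is unseen."""
    cooc, src_count, tgt_count = model
    candidates = cooc.get(word, {})
    if not candidates:
        return None
    # Dice coefficient: rewards words that appear together and rarely apart.
    scores = {
        t: 2 * c / (src_count[word] + tgt_count[t])
        for t, c in candidates.items()
    }
    return max(scores, key=scores.get)


model = train(PAIRS)
for dutch in ["huis", "boek", "rode"]:
    print(dutch, "->", best_translation(dutch, model))
```

With only five sentences the guesses are fragile, which is the point of the paragraph above: the more accurately translated pairs there are, the better the statistics become.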

Big data analysis has been used in predicting maintenance needs for UPS, New York city council and various car manufacturers. It’s been used in healthcare to predict the onset of infections in newborns, and outbreaks of flu.

So it sounds like it could solve some tough business problems, and it can. But it has limits.

  • messiness of data means it is tricky to analyse and interpret – Google Translate occasionally gets the translation between Dutch and English completely wrong, and this is a language pair that must have millions of documents. You need good analytical expertise and data governance to get the valuable insights out of the data.
  • hidden biases in data collection; for example, if you’re relying on smartphone data you are probably selecting against the lowest income earners.
  • identifies correlation, but that doesn’t explain causality and doesn’t necessarily tell you what to do.
  • privacy concerns relating to the collection, use and reuse of data. People may not realise that if enough anonymised data is combined it is possible to identify an individual.
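
The privacy point in the list above can be made concrete with a small sketch. Everything here is invented for illustration: the records, the names, the choice of postcode, birth year and gender as the linking fields, and the function name `reidentify`. It only shows the general idea that a dataset with names removed can be matched against a second dataset that still has them.

```python
from collections import defaultdict

# Hypothetical "anonymised" records: names removed, other details kept.
anonymised = [
    {"postcode": "1011", "birth_year": 1975, "gender": "F", "diagnosis": "asthma"},
    {"postcode": "1011", "birth_year": 1982, "gender": "M", "diagnosis": "diabetes"},
    {"postcode": "2042", "birth_year": 1975, "gender": "F", "diagnosis": "migraine"},
]

# Hypothetical public list that still carries names.
public = [
    {"name": "Alice", "postcode": "1011", "birth_year": 1975, "gender": "F"},
    {"name": "Bob", "postcode": "1011", "birth_year": 1982, "gender": "M"},
    {"name": "Carol", "postcode": "2042", "birth_year": 1975, "gender": "F"},
    {"name": "Dana", "postcode": "2042", "birth_year": 1975, "gender": "F"},
]

KEYS = ("postcode", "birth_year", "gender")


def reidentify(anon_rows, public_rows, keys=KEYS):
    """Link anonymised rows to names when the shared fields match exactly one person."""
    names_by_key = defaultdict(list)
    for person in public_rows:
        names_by_key[tuple(person[k] for k in keys)].append(person["name"])

    matches = {}
    for row in anon_rows:
        names = names_by_key.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # a unique match means the row is no longer anonymous
            matches[names[0]] = row["diagnosis"]
    return matches


print(reidentify(anonymised, public))
```

In this made-up data two of the three records link to a single name, while the third stays ambiguous because two people share the same details.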

And sometimes all that extra data may induce a sort of paralysis by analysis, a belief that you could make the perfect decision with just a little more data.

Right now we’re only beginning to unlock the value of big sets of data, and it’s still very much in the hands of the experts. It’s going to take some re-learning for managers/business leaders to ask questions that big data can answer, and to understand that correlation does not imply causation.
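
To show what "correlation does not imply causation" can look like in practice, here is a small simulation. The scenario (temperature driving both ice cream sales and sunburn cases), the numbers and the function names are all assumptions made up for this sketch. Because the data is generated here, the true effect of temperature is known and can be subtracted; with real data you would not have that luxury.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n=500, seed=0):
    """Generate made-up daily data where temperature drives both other series."""
    rng = random.Random(seed)
    temperature = [rng.uniform(10, 35) for _ in range(n)]
    ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
    sunburn = [1.0 * t + rng.gauss(0, 3) for t in temperature]
    return temperature, ice_cream, sunburn


temperature, ice_cream, sunburn = simulate()

# The two series move together even though neither causes the other.
raw = pearson(ice_cream, sunburn)

# Remove the (known, simulated) effect of temperature and compare what is left.
ice_left = [i - 2.0 * t for i, t in zip(ice_cream, temperature)]
burn_left = [s - 1.0 * t for s, t in zip(sunburn, temperature)]
adjusted = pearson(ice_left, burn_left)

print(f"raw correlation: {raw:.2f}")
print(f"after removing temperature: {adjusted:.2f}")
```

The raw correlation is strong, but once the shared driver is taken out there is little left, so banning ice cream would do nothing for sunburn.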

image: geralt via pixabay

Infographics Rant

I am sick of infographics.

There I said it.

So what is an infographic? Wikipedia gives this definition:

Information graphics or infographics are graphic visual representations of information, data or knowledge. These graphics present complex information quickly and clearly, such as in signs, maps, journalism, technical writing, and education. With an information graphic, computer scientists, mathematicians, and statisticians develop and communicate concepts using a single symbol to process information.

There are infographics that are useful; the stylised subway maps are much easier to use than a true and accurate map would be. They are also fantastic for visualising huge amounts of data; it would take volumes to convey the information that Hans Rosling gets across in his data visualisations. Here’s his explanation of improving health in history, but all his videos are fascinating.

Recently there has been a fashion for infographics, and there is now a plethora of infographics on every conceivable subject.

Social media seems to be a particularly fertile ground for infographics, with 29 million results for the search query “infographics social media”, which is about 10 million more than for “infographics” alone. Here’s a selection from Pinterest.

The use of infographics is spreading and some are now thinly disguised advertising material, including the most pointless graphic I’ve found (so far), the “what your luggage says about you” one. It offers the startling conclusion that a woman with a stroller is a multi-tasking mum, someone with carry-on is on business, and someone with a backpack is not.

There are too many pointless infographics out there, ones that:

  • use very long images that require you to scroll to the bottom of the page
  • present data in rather suspect ways, such as 3D bar graphs
  • make rather dodgy connections between data sets
  • present information that could as easily have been presented in a single paragraph or a short list
  • one last complaint – what’s with the use of retro styling?

Just before I got completely fed up with infographics I found a fabulous selection of infographics that specifically mock infographics. Very meta, very 2012.

 

image infographics

“Let the Data set change your Mind Set”

Another fantastic TED talk on data visualisation, and how it might help us understand the enormous amounts of complex information we’re facing.

I love working on visualisation of information, and have had mild success in simplifying problem statements or project goals into single images. I admit I get a kick out of the moment when the data/information “clicks” into place and the diagram becomes clear and simple. I get another click when someone else’s response is along the lines of “ah, now I get it”.

Hacking for Good

Governments collect huge quantities of data, but rarely display it in ways that are both useful and engaging. The Australian government is addressing this by creating “Gov Hack” events, where designers and developers go to work for a day and a half to find ways to present the data online. It’s part of the Government 2.0 Taskforce which aims to increase open access to data.

LobbyClue; Visualising the relationship of various lobby and supply groups.

The winners of the last event found a way to present the inter-relationship between lobbyists and a government department using something similar to an aquabrowser, called LobbyClue. Their stated goal was to “correlate data about Government contracts, business details and politician responsibilities to show the relationships between these items”.

The function makes it fun to explore, but I’m not sure what kinds of analysis are possible; it seems to require mousing over the entity’s name to get the details of the deal or relationship.

My favourite was “know where you live”. It’s perhaps less ambitious in terms of what is done with the data, but the clean presentation and the ease of use appeal. I can imagine this being a useful tool for home buyers. I wish there was something similar for my current city; perhaps I should send them an email.

my old neighbourhood

Data Visualisation

Data visualisation techniques can give new insights into large amounts of data, and the results can be quite artistic. Because so much of what we do online is now tagged and categorised, there are some tools out there to help us analyse patterns on the web in close to real time, and some data visualisations become new ways of navigating information online – occasionally they reveal more information at a meta level in the process.

Just for fun the people at Pitch Interactive created a visualisation of Oscar winning actors and directors (positioned on the inner ring) and their connections to non-Oscar winning actors (positioned on the outer ring).

The density of the connecting lines indicates that there are a few non-Oscar winners who repeatedly work with Oscar winners. It’s a pattern that would have been very hard to see in the original data.
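
The pattern described above, a few non-winners who keep turning up alongside winners, is the kind of thing a simple count can also surface once the connections are written down as pairs. The names and pairs below are placeholders, not the data behind the Pitch Interactive visualisation, and `frequent_collaborators` is a hypothetical helper written for this sketch.

```python
from collections import Counter

# Hypothetical (Oscar winner, non-winner) collaborations; placeholder names only.
EDGES = [
    ("Winner A", "Actor X"),
    ("Winner B", "Actor X"),
    ("Winner C", "Actor X"),
    ("Winner A", "Actor Y"),
    ("Winner B", "Actor Z"),
    ("Winner C", "Actor Z"),
    ("Winner A", "Actor X"),  # repeat pairing, counted once below
]


def connection_counts(edges):
    """Count how many distinct winners each non-winner has worked with."""
    unique_pairs = set(edges)
    return Counter(non_winner for _, non_winner in unique_pairs)


def frequent_collaborators(edges, min_links=2):
    """Return non-winners linked to at least `min_links` different winners."""
    counts = connection_counts(edges)
    return sorted(name for name, links in counts.items() if links >= min_links)


print(connection_counts(EDGES).most_common())
print(frequent_collaborators(EDGES))
```

A table of counts gives the same answer as the picture, but the picture lets you spot it without knowing what question to ask first.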


We Feel Fine searches for “feel” and “feelings” on the internet, and presents them in several ways. Madness is the format shown at right, with a cluster of emotions and the text of the selected emotion above.

One of the dataviews is a bar graph of the terms used. It’s sobering: apparently we specify our feelings online when we’re feeling low. Frighteningly, the word most often used was “whatever”.

 

Amaztype have a freakishly mesmerising way of presenting search results on Amazon. You can search by title or author, and the results are displayed as book covers that appear in the form of your search term.

The Newsmap offers a great way of viewing news across a range of categories and several countries based on Google News. The size and shade of colour give information on the ranking and age of the article. Comparing countries gives an insight into what’s important locally.

My favourite is the Allosphere, a collaboration between musicians, visual artists and scientists. The results are presented on the inside of a 10m diameter sphere in 3D. It is being used by scientists to understand biology at the molecular level and chemistry at the atomic level. The image at right is a visualisation of a lattice of atoms of hydrogen, oxygen and zinc which forms a new material for transparent solar cells.

It almost makes me wish I’d stayed in Science.

image binary code via pixabay