What is so interesting about Big Data?

Dr Thomas Montague, State Manager (Victoria and Tasmania), Digital Careers, writes about big data.

In the second decade of the 21st century Big Data is big. It’s big because people are talking about it, it’s big because people are worried about it and it’s big because it’s useful. The world is now awash with Big Data. But before diving straight into the whys and wherefores it is perhaps useful to take a step back and think about data and why it is useful.

My old boss and data guru Prof. Mike Cullen used to say “if you don’t measure it you don’t know”. As a behavioural psychologist his interest was in understanding and recording what animals do and how they behave. More specifically, we were interested in the Little Penguins at Phillip Island, how many penguins there were and the health of the colony, because in Victoria penguins were, and still are, big business.

At the time we were looking to better understand why the population was declining and what could be done to turn this problem around. Fortunately for us we had help from a dedicated group of people, the Penguin Study Group, who started weighing and banding Little Penguins as far back as 1968. They had data, and lots of it. In fact, by the time Mike and I came along they had tagged over 60,000 penguins and had caught some individual penguins several times per year. This generated a lot of data and the opportunity to help analyse it.

It was then that my love affair with data began. I didn’t know it at the time, but the reason anyone collects data is to make predictions. In our case it was to predict whether the number of penguins at Phillip Island would increase or decrease over time.

Not surprisingly, the need and desire to make predictions has been around for centuries because it can give you an advantage, especially if you gamble. Indeed, two famous French mathematicians, Blaise Pascal and Pierre de Fermat, developed the theory of probability in the mid-1600s precisely because of their interest in gambling problems. Predicting what the weather will do can bring life-changing benefits, and the Australian Bureau of Meteorology (the BOM) delivers these benefits every day.

Now that I hope I’ve convinced you that predictions based on data can be useful, let’s look at what’s different about big data.

Big data has been around for as long as computers, and even before, but before computers data took a long time to process and was expensive to store. What’s changed is that our ability to generate, capture and store data has vastly improved.

Up until the 1980s and the birth of the PC, data was largely stored and analysed on mainframe computers owned by the defence industry, banks, universities and governments. Even entering the data was a challenge, but these days it can be automatically recorded using an array of sensors and ID tags and transmitted and stored almost in real time. Data is everywhere, and those who can make sense of it and convert data to information and knowledge will own the future. Sometimes the analysis will be done almost in real time at the point of collection by a chip on the sensor, but other times analysis will require more processing grunt and consideration.

So who collects big data and why do they collect it? Meteorologists (weather guessers) have satellites and automatic weather stations that collect data for monitoring atmospheric conditions. Similarly oceanographers use satellites to monitor sea surface temperatures. Medical people collect huge amounts of data on their patients to assess and predict the benefits of a range of treatments. Government systems such as Australia’s Medicare help track the national spend on healthcare, and supermarkets, such as Coles, track the spend of their customers using their Flybuys card system.

Innovations like smartphones generate and transmit huge amounts of data that might include things such as where you are, the websites you have looked at, the messages you send and whom you have called. All this data helps create a picture of our behaviour, the people we know, where we go, what we do and don’t know and what concerns and interests us. Whether you should be worried about all this big data is the subject of another blog. Sadly, though, much, if not most, of our big data goes offshore, where we don’t have a say in how it is used, who it is sold to or what is actually done with it.

But it’s not all doom, gloom and worry. To me big data is a two-edged sword. It delivers a great number of benefits and has huge potential to make predictions about things we never thought possible, and it can even help us better understand how to look after the Little Penguins at Phillip Island.

As it turns out, people are still capturing penguins and their data at Phillip Island, and we now know a lot more about where they go, how much they eat and their survival rates and, fortunately, that their numbers are now increasing.

