An examination of big data: its characteristics; cons; potential impact; and how Caribbean businesses might benefit from it.
Over the last year or so, “big data” has become a major buzzword in the ICT/tech space, with projections that it will revolutionise business and how data is used and processed. Although we may all intuitively understand what is meant by the term “big data”, its implications may not be as apparent. This post provides an overview of big data, along with key trends and challenges, and discusses some of the implications of big data for the Caribbean.
What is big data?
Typically, “big data” speaks to very large data sets that cannot be readily processed or analysed by traditional software tools, such as databases, within a reasonable period of time. There is no single, fixed definition of big data, as it can be generated from a variety of sources, including but not limited to:
- web server logs
- traffic flow sensors
- satellite imagery
- broadcast audio streams
- banking transactions
- social media chatter
- web pages
- scans of government documents
- financial market data, etc.
That said, the following three characteristics (the “3 Vs”) tend to be evident when speaking about big data:
- Volume. Volume speaks to the size of the data sets that are being created and must be processed. The reference size is continually changing, from gigabytes to terabytes and petabytes, and beyond. These exceedingly large data sets make collation and manipulation time-consuming and expensive when conventional data processing tools are used.
- Velocity. Velocity refers to the rate at which the data is being generated, and so must be captured, and/or the rate at which it must be processed to produce the desired outputs. For example, it might be essential for a business to be able to process data without significant delay (as close to real time as possible), in order to identify potential fraud, or to update user recommendations.
- Variety. As mentioned above, big data can be generated from a variety of sources. However, there is also a growing appreciation that more meaningful insights can be generated when different streams of data are analysed together.
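To make the velocity point a little more concrete, here is a minimal sketch in Python of checking transactions one at a time as they arrive, rather than waiting to analyse a full batch. It is purely illustrative: the function name `flag_unusual`, the window of five transactions, the threshold factor of three and the sample amounts are all assumptions made for this example, and real fraud detection is considerably more sophisticated.

```python
from collections import deque
from statistics import mean


def flag_unusual(amounts, window=5, factor=3.0):
    """Yield (amount, is_suspicious) for each transaction as it arrives.

    A transaction is flagged when it exceeds `factor` times the mean of
    the previous `window` amounts. Both values are illustrative only.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` amounts
    for amount in amounts:
        # Only judge a transaction once a full window of history exists.
        suspicious = len(recent) == window and amount > factor * mean(recent)
        yield amount, suspicious
        recent.append(amount)


# Hypothetical stream of transaction amounts, processed one by one.
stream = [20, 25, 22, 18, 24, 500, 21]
flags = [flag for _, flag in flag_unusual(stream)]
print(flags)
```

Because only a small window of recent values is held in memory, the same approach works whether the stream contains seven transactions or several million, which is the kind of design that high-velocity data tends to demand.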
Big trends and concerns associated with big data
Outlined below are a few of the trends anticipated for big data, along with some of the concerns or disadvantages likely to eventuate.
- As processing speeds, storage, analytic tools and technologies improve, reliance on big data will become increasingly commonplace, and may not be limited to large corporations as currently obtains.
- Data will become even more commoditised than it is today. The value of information is likely to become more quantifiable, but due to its value to organisations, it is also likely to be seen as highly proprietary by those who own the data.
- Big data is expected to be a major contributor to efforts to improve user/customer experience. As indicated in our reports from the e-G8 Forum held in May 2011, there is a growing shift to provide ultra-personalised, and even individualised, products and services. Developing such capabilities requires considerable volumes of individual user data to be processed and analysed in an iterative manner in order to produce (hopefully) accurate and timely results.
- Many experts are predicting that although considerable growth in big data is anticipated, there will be a shortage of suitably skilled persons to serve that area. For example, the global management consulting firm McKinsey has predicted that:
…[b]y 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions…
- On the other hand, consumers may need to be prepared to make even greater trade-offs with regard to privacy and personal security in order to facilitate the data mining necessary to produce better predictive models and more accurate outputs.
- Finally, although big data may offer new and different opportunities for our societies, its capture, curation, storage and analysis might remain relatively expensive into the foreseeable future. As a result, it is unlikely to contribute, in any significant degree, to the narrowing of the digital divide.