Posted on: November 28, 2016

Issue 7: Data Science. The Biggest Buzzword of the Decade.

Data science is about analyzing data with mathematical models and deriving algorithms that support future decisions. The field offers many employment opportunities because of the rapid growth of data in recent years. Anyone interested in entering it should be prepared to invest time and hard work in mastering the concepts of data analysis and design.

Data is growing at an astronomical pace. Companies now generate data by the gigabyte and terabyte, which makes an IT administrator's job of storing and reusing it difficult. The data explosion, however, is also giving rise to new and innovative ways of storing and processing data. Companies are moving away from older ways of making decisions toward data-driven decision making for better results.
Traditionally, companies made decisions based on previous results and on perceptions formed from those results. Now, the availability of large volumes of data helps them take meaningful steps and make informed decisions based on the results of processing that data. This approach gives a clearer view of the present and future outcomes of a decision and supports more accurate predictions. In this context, it is necessary to find ways of handling the data. The science that deals with handling and processing data for decision making is called data science, and the people who practice it are called data scientists.


Today, more than 90% of the data generated worldwide is unstructured. This data needs to be structured before it can be used for decision making. Data science involves collecting the data, cleaning it up, and building a technique that extracts new information from it. Companies today face a challenging situation: the data they generate is either obsolete or insufficient, or there is far more of it than is needed. When the data is insufficient, one has to find a way to locate the essential data and convert it into a form useful for decision making. When there is too much data, the scientist has to select the necessary data from the larger set and use that to make the final decision.
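The collect, clean, and extract steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer comments, the function names, and the choice of word counting as the "technique" are all hypothetical and are not taken from any particular company or tool.

```python
import re
from collections import Counter

# Hypothetical raw, unstructured input: free-text customer comments.
RAW_COMMENTS = [
    "  Delivery was LATE again!! ",
    "Great price, fast delivery.",
    "",
    "late delivery, poor packaging",
    "Price is great",
]


def clean(comment):
    """Lowercase the text, drop punctuation, and split it into words."""
    return re.findall(r"[a-z]+", comment.lower())


def structure(comments):
    """Clean every comment and discard records that end up empty."""
    rows = [clean(comment) for comment in comments]
    return [row for row in rows if row]


def extract_top_terms(rows, n=3):
    """Count how many comments mention each word; return the n most common."""
    counts = Counter(word for row in rows for word in set(row))
    return counts.most_common(n)


if __name__ == "__main__":
    rows = structure(RAW_COMMENTS)
    print(extract_top_terms(rows, n=1))
```

In this toy data set, the word mentioned in the most comments is "delivery", which is the kind of signal a company might then investigate further. A real project would use far larger data and more careful techniques, but the sequence of steps is the same.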


Over the last several years, data has become cheaper to collect and store. Many free computing tools are available to deal with the data deluge now prominent across different areas of science and business. Big data is new and cutting-edge in the sense that we now have data in areas where we did not use to have it. We did not have access to GPS information from cars or to the gene structure of various living organisms; now we have access to many kinds of data, which creates new avenues for answering questions that previously went unanswered. This data is usually unstructured and often filled with uncertainty. That uncertainty can be handled with concepts from statistics and machine learning, both of which are part of data science.


Data science has existed for a long time but has attained significance only in the last couple of years because of the explosion of data. This has generated many opportunities in the information technology sector. It is estimated that by 2020, there will be a threefold increase in the number of job opportunities for data scientists. According to the McKinsey Global Institute’s research, by 2018 the United States will experience a shortage of 190,000 skilled data scientists and 1.5 million managers and analysts capable of handling such large data sets and developing meaningful insights from them. With an estimated 40,000 exabytes of data being collected by 2020, up from 2,700 exabytes in 2012, the implications of this shortage become apparent. Further driving this explosion in data collection and the demand for skilled practitioners is the wide range of sectors that will leverage big data analytics in the next decade. These sectors include retail, manufacturing, health care, and government services, in addition to existing users. For anyone considering this high-growth field, it is better to start late than never.
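To put the two data-volume figures above in perspective, the short Python sketch below works out the overall growth and the yearly rate it implies. The steady compound growth between 2012 and 2020 is an assumption made here purely for illustration, and the function names are hypothetical.

```python
# Figures quoted in the paragraph above, in exabytes.
DATA_2012_EB = 2_700
DATA_2020_EB = 40_000
YEARS = 2020 - 2012


def growth_factor(start, end):
    """How many times larger the end value is than the start value."""
    return end / start


def annual_growth_rate(start, end, years):
    """Yearly rate implied by the two values, assuming steady compound growth."""
    return (end / start) ** (1 / years) - 1


factor = growth_factor(DATA_2012_EB, DATA_2020_EB)
rate = annual_growth_rate(DATA_2012_EB, DATA_2020_EB, YEARS)

print(f"Overall growth: {factor:.1f}x")
print(f"Implied annual growth: {rate:.0%}")
```

Under that assumption, the estimate amounts to nearly a fifteenfold increase in eight years, or roughly 40 percent more data every year.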


Venu Ammanabrolu is pursuing his Master of Science in Information Systems at Fairfax University of America. He has nearly a decade of experience working as a senior business analyst in India. He is presently pursuing his passion to become a data scientist after getting involved in the decision making process.