Data Science: Overview, History and FAQs

What Is Data Science?

Data science is a field of applied mathematics and statistics that provides useful information based on large amounts of complex data, often called big data.

Data science, or data-driven science, combines aspects of different fields with the aid of computation to interpret reams of data for decision-making purposes.

Key Takeaways

  • Data science uses techniques such as machine learning and artificial intelligence to extract meaningful information and to predict future patterns and behaviors.
  • Advances in technology, the internet, and social media have all increased access to big data.
  • The field of data science is growing as technology advances and big data collection and analysis techniques become more sophisticated.

Understanding Data Science

Data is drawn from different sectors, channels, and platforms, including cell phones, social media, e-commerce sites, healthcare surveys, and internet searches. The increase in the amount of data available opened the door to a new field of study based on big data: the massive data sets that contribute to the creation of better operational tools in all sectors.

The continually increasing access to data is possible due to advancements in technology and collection techniques. Individuals' buying patterns and behavior can be monitored, and predictions can be made based on the information gathered.

However, the ever-increasing data is unstructured and requires parsing for effective decision-making. This process is complex and time-consuming for companies—hence, the emergence of data science.

The Purpose of Data Science

Data science, or data-driven science, uses big data and machine learning to interpret data for decision-making purposes.

A Brief History of Data Science

The term "data science" has been in use since the early 1960s, when it was used synonymously with "computer science." Later, the term was made distinct to describe the survey of data processing methods used in a range of different applications.

In 2001, William S. Cleveland used the term "data science" for the first time to refer to an independent discipline. In 2012, the Harvard Business Review published an article describing the role of the data scientist as the "sexiest job of the 21st century."

How Data Science Is Applied

Data science incorporates tools from multiple disciplines to gather a data set, process it, extract meaningful data from it, and interpret that data for decision-making purposes. The disciplinary areas that make up the data science field include data mining, statistics, machine learning, analytics, and programming.

Data mining applies algorithms to the complex data set to reveal patterns that are then used to extract useful and relevant data from the set. Statistical measures or predictive analytics use this extracted data to gauge events that are likely to happen in the future based on what the data shows happened in the past.
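As a rough illustration of gauging future events from past ones, the sketch below estimates how often an event occurred in each group of historical records and treats that share as the likelihood of it happening again. The customer segments, the sample records, and the function name estimate_event_rates are hypothetical and chosen only for this example; real predictive models are usually far more elaborate.

```python
from collections import defaultdict


def estimate_event_rates(history):
    """Return, per group, the share of past records in which the event occurred."""
    counts = defaultdict(lambda: [0, 0])  # group -> [events, total records]
    for group, happened in history:
        counts[group][0] += int(happened)
        counts[group][1] += 1
    return {group: events / total for group, (events, total) in counts.items()}


# Hypothetical past observations: (customer segment, made a repeat purchase?)
past = [
    ("new", False), ("new", False), ("new", True), ("new", False),
    ("returning", True), ("returning", True), ("returning", False), ("returning", True),
]

rates = estimate_event_rates(past)
print(rates)  # {'new': 0.25, 'returning': 0.75}
```

Under these made-up records, a returning customer would be assigned a 75% likelihood of buying again, compared with 25% for a new customer.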

Machine learning is an artificial intelligence tool that processes quantities of data that a human would be unable to process in a lifetime. Machine learning refines the decision model produced by predictive analytics by comparing the predicted likelihood of an event with what actually happened at the predicted time.
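One common way to compare predicted likelihoods with what actually happened is the Brier score, the average squared gap between each forecast probability and the observed outcome. The sketch below is a minimal version of that measure, not a description of any particular system; the forecasts and outcomes are invented for illustration.

```python
def brier_score(predicted, actual):
    """Mean squared difference between forecast probabilities and outcomes (0 is perfect)."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equal in length")
    return sum((p - int(a)) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical forecasts and what was later observed
forecasts = [0.9, 0.2, 0.7, 0.1]
outcomes = [True, False, True, False]

score = brier_score(forecasts, outcomes)
print(round(score, 4))  # 0.0375
```

A lower score means the forecasts were closer to reality. A model that always predicts 50% scores 0.25, so a score well below that suggests the model has learned something useful from the data.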

Using analytics, the data analyst collects and processes the structured data from the machine learning stage using algorithms. The analyst interprets, converts, and summarizes the data into a cohesive language that the decision-making team can understand. Data science is applied to practically all contexts and, as the data scientist's role evolves, the field will expand to encompass data architecture, data engineering, and data administration.

Fast Fact

Employment of computer and information research scientists is expected to grow 15% from 2019 to 2029, much faster than the average for all occupations, according to the U.S. Bureau of Labor Statistics.

Data Scientists

A data scientist collects, analyzes, and interprets large volumes of data, often to improve a company's operations. Data scientists develop statistical models that analyze data and detect patterns, trends, and relationships in data sets. This information can be used to predict consumer behavior or to identify business and operational risks.

The data scientist role is often that of a storyteller presenting data insights to decision-makers in a way that is understandable and applicable to problem-solving. 

Data Science Today

Companies are applying big data and data science to everyday activities to bring value to consumers. Banking institutions are capitalizing on big data to improve their fraud detection. Asset management firms are using big data to predict the likelihood of a security's price moving up or down at a stated time.

Companies such as Netflix mine big data to determine what products to deliver to their users. Netflix also uses algorithms to create personalized recommendations for users based on their viewing history. Data science is evolving at a rapid rate, and its applications will continue to change lives into the future.
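To give a sense of how recommendations can be drawn from viewing history, the sketch below scores unseen titles by how much the people who watched them overlap with the target user's own history. This is a simple co-occurrence approach assumed for illustration only, not Netflix's actual algorithm; the function name recommend and the placeholder titles are hypothetical.

```python
from collections import Counter


def recommend(target_history, other_histories, top_n=2):
    """Suggest unseen titles, weighted by overlap with similar viewers' histories."""
    seen = set(target_history)
    scores = Counter()
    for history in other_histories:
        titles = set(history)
        overlap = len(seen & titles)
        if overlap == 0:
            continue  # this viewer shares nothing with the target user
        for title in titles - seen:
            scores[title] += overlap
    # Highest score first; ties broken alphabetically for a stable result
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, _ in ranked[:top_n]]


# Hypothetical viewing histories using placeholder titles
others = [["A", "B", "C"], ["A", "D"], ["E", "F"], ["B", "C", "D"]]
suggestions = recommend(["A", "B"], others)
print(suggestions)  # ['C', 'D']
```

Title C ranks first because it was watched by the viewers whose histories most resemble the target user's, while E and F are never suggested because their only viewer shares no titles with that user.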

Don't All Sciences Use Data?

Yes, all empirical sciences collect and analyze data. What separates data science is that it specializes in using sophisticated computational methods and machine learning techniques to process and analyze big data sets. Often, these data sets are so large or complex that they can't be properly analyzed using traditional methods.

What Is Data Science Useful For?

Data science can identify patterns in seemingly unstructured or unrelated data, making it possible to draw inferences and make predictions. Tech companies that collect user data can use these techniques to turn what's collected into sources of useful or profitable information.

What Are Some Downsides of Data Science?

Data mining and efforts by social media companies to commoditize personal data have come under criticism in light of several scandals, such as the Cambridge Analytica case, in which personal data was used by data scientists to influence political outcomes or undermine elections.

Article Sources
Investopedia requires writers to use primary sources to support their work. These include white papers, government data, original reporting, and interviews with industry experts. We also reference original research from other reputable publishers where appropriate. You can learn more about the standards we follow in producing accurate, unbiased content in our editorial policy.
  1. MIT Computer Science and Artificial Intelligence Laboratory: Courses Server. "50 years of Data Science," Pages 1, 14, and 17. Accessed Sept. 12, 2021.

  2. MIT Computer Science and Artificial Intelligence Laboratory: Courses Server. "50 years of Data Science," Page 1. Accessed Sept. 12, 2021.

  3. William S. Cleveland. "Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics." International Statistical Review, 2001, Vol. 69, No. 1, Pages 21-26.

  4. Harvard Business Review. "Data Scientist: The Sexiest Job of the 21st Century." Accessed Sept. 12, 2021.

  5. U.S. Bureau of Labor Statistics. "Occupational Outlook Handbook: Computer and Information Research Scientists." Accessed Sept. 12, 2021.

  6. Federal Trade Commission. "Opinion of the Commission: In the Matter of Cambridge Analytica, LLC," Pages 1 and 2. Accessed Sept. 12, 2021.