Definition of

Data science

Qualification

Recommender systems are based on data science.

Data science is a discipline that combines techniques and resources from statistics , mathematics , computer science and artificial intelligence (AI) to obtain information of interest. It is a scientific and academic field focused on the analysis of massive data (big data) for the extraction of valuable or useful content.

The mission of data science is to make visible knowledge that is hidden or implicit in databases . This requires the development of a series of procedures that enable the processing, cleaning and classification of said data to provide access to the information in question.

History of data science

The history of data science began in the 1960s . The American John W. Tukey (1915-2000) is mentioned as a pioneer when reflecting on data analysis and the evolution or progress of mathematical statistics. Another promoter at that same time was the Danish Peter Naur (1928-2016), who contributed to the dissemination of the concept.

In 2001 , meanwhile, the North American William S. Cleveland postulated data science as an autonomous discipline that transcended statistics and incorporated notions of computer science. The following year, “Data Science Journal” was born, the first specialized publication on the subject.

Market segmentation

Data science can contribute to market segmentation.

Your tasks

As we already indicated, data science is applied through different procedures, which can be associated with stages or phases.

The first task carried out by the discipline is the collection of data , both unstructured and structured. This work can be carried out in different ways: transmitting data from various devices and systems, extracting it from the Web or entering it manually, for example.

Data storage takes place almost simultaneously. The following tasks consist of data cleaning (eliminating duplications, to mention one case) and exploratory data analysis (EDA) for the recognition of patterns, trends, ranges, etc. AI algorithms and neural networks are key in this data processing.

Finally comes the turn of data visualization (already processed and analyzed) and its communication so that it is possible to act based on them.

Medicine

Data science allows for predictive analysis in health.

Importance of data science

The constant and accelerated growth of the amount of existing data makes this discipline increasingly important. For companies, extracting valuable information from this immense set of data is crucial for planning and decision making .

A data scientist works together with experts in data engineering, computer science, and other branches of knowledge. They typically use Python , R , and other programming languages ​​to support data visualization and statistical inference.

Using predictive modeling , machine learning , deep learning , natural language processing (NLP) , and other tools, the data scientist achieves information extraction. Using his knowledge of this discipline, he is also in a position to transmit this knowledge to managers and any interested party, explaining how problems can be solved thanks to the results obtained.

Types of analysis

It is common for data analysis to be classified as follows:

  • Predictive analysis : Taking historical data and working with patterns, predictions are made about what may happen. For this, algorithms are used that allow the systems to be trained.
  • Prescriptive analysis : It involves the development of an analysis but also includes a recommendation on how to act upon the forecast. Within this framework, the possible effects of the different options are examined.
  • Diagnostic analysis : Focuses on the causes of phenomena through data mining and analysis of correlations.
  • Descriptive analysis : Based on an examination of the data to understand what is happening or has already happened.

Data science examples

We can find examples of data science in multiple areas. The application of this discipline can be seen in banking, when a FinTech uses automated risk analysis using AI to grant credits requested from an application. Through a prescriptive analysis, the system studies the client's background and establishes whether or not the loan is granted and under what conditions.

In the field of digital marketing, data science makes user behavior analysis possible, for example. From the data they generate with their different actions, useful information can be extracted to develop personalized campaigns and send them specific proposals. In this framework, a social network analysis can also be used.

Another example of data science occurs when AI is used to perform competitive analysis . This type of process is very valuable to define the commercial strategy taking into account the situation and position of competitors in the market. The development of forecasting based on data is another possibility offered by this discipline.