Sunday, February 9, 2020

What is Data Science and What Do Data Scientists Do?

Data science uses artificial intelligence and machine learning to extract meaningful information from data, answer specific questions, give management recommendations and advice, improve work performance, avoid problems with big data, and predict future patterns and behaviors.
In this article, we will discuss what data science is and what data scientists do.

What is Data Science?

Data science is an interdisciplinary field that uses scientific methods, algorithms, processes, and systems to understand and analyze real-world phenomena through data, and to extract knowledge and insights from structured and unstructured data.

Data science draws on several disciplines. It focuses primarily on understanding the data that a particular company or organization owns, and on using that data to solve a problem, answer specific questions, or give management recommendations that improve work or prevent problems.

Data scientists combine many skills, including computer science, mathematics (especially algebra, calculus, and statistics), and business knowledge, to analyze data collected from smartphones, sensors, websites, customers, and other sources.

Data science produces insights and reveals trends that companies or organizations can use to make better decisions and create more innovative products and advanced technology services. 

Data is the basis of innovation, but its value comes from the information that data scientists can extract from it and act on.

A Brief History of Data Science

As a field of expertise, data science is young. It grew out of data mining and statistical analysis. The term "data science" appeared in the title of a conference held in Japan in 1996.

The Data Science Journal was first published in 2002 by the Committee on Data for Science and Technology (CODATA) of the International Council for Science (ICSU). In 2007, the Research Center for Dataology and Data Science was founded in China.
Two researchers at the center later published a paper that defines data science as a new science, distinct from the natural and social sciences.

By 2008, the term "data scientist" had emerged and the field had begun to take off. Today, many colleges and universities offer degrees in data science.

In 2009, Zhu and others defined data science as a new science whose object of research is data. Many agree that data science differs from existing technologies and sciences and that it will be a promising research path in the future.

The Difference between Data Scientists and Data Engineers
Data scientists use programming languages and analytical techniques to extract information from data.
Data engineers build solutions to the technical challenges of processing data at high volume and high speed.
They prepare the data so that data scientists can extract useful information from it.

What Do Data Scientists Do?

In recent years, the demand for qualified data scientists has exceeded the supply. Data scientist has ranked among the top 50 jobs in America based on metrics such as job satisfaction, number of job openings, and average base salary.

Key responsibilities of a data scientist can include developing strategies for data collection and analysis; preparing data for analysis; exploring, analyzing, and visualizing data; creating models with data using programming languages such as Python and R; deploying models in applications; and using different types of reporting tools to detect patterns, trends, and relationships in data sets.

Data scientists rarely work alone. They usually work in teams that mine big data for information that can be used to predict customer behavior and identify business risks and opportunities.

These teams may include a business analyst who identifies the problem, a data engineer who prepares the data and provides access to it, an IT engineer who oversees the underlying processes and infrastructure, and an application developer who deploys models or analytical outputs in applications and products.

These teams are tasked with developing statistical learning models to analyze data, so they need experience with statistical tools as well as the ability to create and evaluate complex predictive models.

What are the Key Steps of a Data Science Project?
The process of analyzing and acting on data is iterative rather than linear, but this is how work typically flows in a data modeling project:

Plan: Defining the project and its potential outputs.

Setup: Creating a working environment and ensuring that data scientists have the right tools and access to the correct data and other resources, such as computing capacity.

Ingestion: Loading data in the work environment.

Exploration: Analyzing, exploring, and visualizing the data.

Modeling: Creating, training, and validating models so that they work as required.

Deployment: Releasing models into a production environment.
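
To make the flow above more concrete, here is a minimal sketch in Python of the ingestion, exploration, modeling, and deployment steps. Everything in it is an assumption made for illustration: the tiny inline data set (ad spend versus sales), the column names, and the function names are hypothetical, and a simple least-squares line stands in for whatever model a real project would use. The plan and setup steps happen before any code is written, so they appear only as comments.

```python
import csv
import io
import statistics

# Plan: predict sales from ad spend (a hypothetical goal for this sketch).
# Setup: only the Python standard library is needed here.

# Hypothetical inline data standing in for a real data source.
RAW_CSV = """ad_spend,sales
1,3
2,5
3,7
4,9
5,11
6,13
"""


def ingest(text):
    """Ingestion: load rows into the work environment as (x, y) floats."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(r["ad_spend"]), float(r["sales"])) for r in reader]


def explore(rows):
    """Exploration: summarize the data before modeling."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    return {
        "n": len(rows),
        "mean_x": statistics.mean(xs),
        "mean_y": statistics.mean(ys),
    }


def fit(rows):
    """Modeling: least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def validate(model, rows):
    """Modeling: mean absolute error on rows the model has not seen."""
    slope, intercept = model
    return statistics.mean(abs(y - (slope * x + intercept)) for x, y in rows)


def predict(model, x):
    """Deployment stand-in: the function an application would call."""
    slope, intercept = model
    return slope * x + intercept


rows = ingest(RAW_CSV)
train, holdout = rows[:4], rows[4:]  # keep two rows aside for validation
summary = explore(rows)
model = fit(train)
error = validate(model, holdout)
print(summary)
print("model (slope, intercept):", model)
print("holdout error:", error)
print("prediction for ad_spend=10:", predict(model, 10))
```

In a real project each of these functions would be far larger, and the team would loop back from validation to exploration or ingestion several times before deploying anything.
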
Who Oversees Data Science Operations?
Data science operations are usually supervised by three types of managers:

IT Managers:
Senior IT managers are responsible for planning the architecture and infrastructure that support data science operations. They continuously monitor processes and resource usage to ensure that data science teams work efficiently and safely.
IT managers may also be responsible for creating and updating the environments that data science teams work in.

Data Science Managers:
Data science managers supervise the data science team and its daily work. They are team builders who must balance team development with project planning and monitoring.

Business Managers:
Business managers work with the data science team to identify the problem and develop an analysis strategy. They may head a line of business such as finance, marketing, or sales, and have a data science team that reports to them periodically on current conditions.
Business managers work closely with the IT manager and the data science manager to ensure that projects are delivered.
