Saturday, June 23, 2018

COMPONENTS OF DATA SCIENCE



1. STATISTICS:

  1. Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, presentation and organization of data.
  2. Statistics dates back to ancient civilizations, at least to the 5th century BC, but it was not until the 18th century that it started to draw more heavily on calculus and probability theory.
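
As a small illustration of the "analysis" step, here is a minimal sketch using Python's built-in statistics module. The page-view numbers are made up for this example; they do not come from any real dataset.

```python
import statistics

# Hypothetical sample: daily page views for one week (made-up numbers).
page_views = [120, 135, 128, 150, 142, 160, 131]

mean_views = statistics.mean(page_views)      # arithmetic average
median_views = statistics.median(page_views)  # middle value after sorting
stdev_views = statistics.stdev(page_views)    # sample standard deviation

print(f"mean={mean_views} median={median_views} stdev={stdev_views:.1f}")
```

The mean and median summarize where the data is centered, while the standard deviation describes how spread out it is.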
2. VISUALIZATION:
    Visualization means presenting the results of a data science analysis in a simpler form using diagrams, charts and graphs.
    It can improve decision making, understanding of the work, customer relationships and financial performance.
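
To make the idea concrete, here is a rough sketch that turns a few numbers into a text bar chart using only the standard library. The function name text_bar_chart and the sales figures are invented for this example; in practice a plotting library would usually be used instead.

```python
def text_bar_chart(counts, width=30):
    """Return one line of text per label, with a bar scaled to the largest value."""
    largest = max(counts.values())
    lines = []
    for label, value in counts.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<10} {bar} {value}")
    return lines

# Hypothetical sales per region (made-up numbers).
sales_by_region = {"North": 40, "South": 25, "East": 12}

for line in text_bar_chart(sales_by_region):
    print(line)
```

Even in this crude form, the relative sizes are easier to compare at a glance than the raw numbers.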

3. MACHINE LEARNING:
     Machine learning is the study and construction of algorithms that can learn from data and make predictions on it.
  1. It is closely related to computational statistics.
  2. It is used to devise complex models and algorithms that lend themselves to prediction; in commercial use this is known as predictive analytics.
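
One of the simplest examples of "learning from data and making predictions" is fitting a straight line by least squares. The sketch below assumes a tiny made-up dataset (hours studied vs. exam score); the helper name fit_line is hypothetical, and real projects would normally rely on a library rather than hand-written formulas.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data (made-up numbers).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)

# Use the learned line to predict the score for an unseen input.
predicted_score = slope * 6 + intercept
print(f"slope={slope:.2f} intercept={intercept:.2f} prediction={predicted_score:.2f}")
```

The "learning" here is just estimating two numbers from the examples; the prediction step then applies them to new input.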
4. DEEP LEARNING:
        Deep learning is one of the few methods by which we can circumvent the challenges of manual feature extraction in machine learning. This is because deep learning models are capable of learning to focus on the right features by themselves, requiring little guidance from the programmer.
Therefore, we can say that Deep Learning is:
 1. A collection of statistical machine learning techniques
 2. Used to learn feature hierarchies
 3. Often based on artificial neural networks
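
To show what a "feature hierarchy" in a neural network looks like, here is a toy two-layer network that computes XOR. Note the big simplification: the weights are set by hand for illustration, whereas a real deep learning model would learn them from data. All function names are made up for this sketch.

```python
def step(z):
    """Threshold activation: the unit fires (1) only when its input is positive."""
    return 1 if z > 0 else 0

def hidden_layer(x1, x2):
    """First layer: two simple features built from the raw inputs."""
    h_or = step(x1 + x2 - 0.5)   # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)  # fires only if both inputs are 1
    return h_or, h_and

def xor_network(x1, x2):
    """Second layer: combines the first-layer features into XOR."""
    h_or, h_and = hidden_layer(x1, x2)
    return step(h_or - h_and - 0.5)  # "OR but not AND"

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

The output unit never looks at the raw inputs directly; it only combines the intermediate features, which is the layering idea behind point 2 above.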


Below is the skill set required to become a data scientist, with resources for learning each skill online:

1. Python
Learn Python Programming From Scratch by Udemy
Learn to program in Python by CodeCademy
LearnPython.org interactive Python tutorial

2. Machine Learning
Machine learning online
Operational Intelligence and Machine Data with Splunk

3. R Language
R Basics – R Programming Language Introduction by Udemy
Introduction to R at DataCamp
Learn R at Code school

4. Big Data
Big Data University
Big Data and Hadoop Essentials by Udemy
Basic overview of Big Data Hadoop by Udemy

5. Statistics
Statistics One by Coursera
Statistics and Probability
Probability & Statistics

6. Data Mining
Data Mining and Web Scraping: How to Convert Sites into Data by Udemy
Data Mining by Coursera

7. SQL
Interactive Online SQL Training for Beginners
Sachin Quickly Learns (SQL) – Structured Query Language by Udemy
SQL Tutorial by w3schools

8. Java
Learn Java: The Java Programming Tutorial For Beginners by Udemy
Learn Java – Free Interactive Java Tutorial
Learn Java Programming From Scratch – Udemy




Range of roles in a data science project:


Data Scientist
     This role covers mathematicians and statisticians who do predictive modelling, storytelling and visualization. They also clean and organise big data.

Data Analyst
    These people are data junkies. They understand data like no one else. They perform statistical data analysis and are database experts.

Data Architect
    They are the contemporary 'data modellers'. They provide data warehousing solutions and possess in-depth knowledge of database architecture.

Data Engineer
     The world sees them as 'software engineers'. They are 'jacks of all trades'. They develop, construct, test and maintain the data architecture.



      Data science is as much about scientific foundations as it is about practical methods. Data science, data analytics, big data (whatever you call it) can be applied to almost every field out there, be it agriculture, sports, the economy, finance or medicine, to name a few.


