Wednesday, May 23, 2018

DATA SCIENCE RESEARCH WITH SK

                           
DATA SCIENCE AND RESEARCH
                                        -SK
    
 




What is Data?

    Data is a raw, unorganized set of things that needs to be processed to have meaning. Raw data is like raw intelligence: useless on its own.


What is Information?
  Information is data that has been processed, organized, structured, or presented in a given context so as to make it useful.
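
  To make the difference concrete, here is a minimal sketch in Python. The sales records and the `summarize` helper are made up for illustration. The raw strings are the data; the average per region is the information you get after processing them.

```python
from statistics import mean

# Hypothetical raw data: unlabelled "region,amount" records as plain strings.
raw_data = ["north,120", "south,80", "north,100", "south,95", "east,60"]


def summarize(records):
    """Group raw 'region,amount' records and return the average amount per region."""
    grouped = {}
    for record in records:
        region, amount = record.split(",")
        grouped.setdefault(region, []).append(float(amount))
    # Sort by region name so the result is easy to read.
    return {region: mean(amounts) for region, amounts in sorted(grouped.items())}


information = summarize(raw_data)
print(information)  # {'east': 60.0, 'north': 110.0, 'south': 87.5}
```

  The list of strings tells you very little by itself. Once it is grouped and averaged, it can answer a question such as "which region sells the most per order?"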

What is Information Science?
 “Information Science” seems to be a more appropriate term than “Data Science”, but it is too late to go back and rename the field.


What is Data Science?
    A common claim is that “Data Science is when you are dealing with Big Data, large amounts of data”, but that is not true. Data Science can be applied to a data set with one thousand lines without any problem.
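
    As a small illustration of that point, the sketch below fits a straight line to a made-up data set of exactly one thousand rows, using only the Python standard library. The data, the true slope and intercept (3 and 5), and the `fit_line` helper are all assumptions chosen for the example, not something taken from a real project.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical data set of one thousand rows: y is roughly 3*x + 5 plus noise.
xs = [random.uniform(0, 10) for _ in range(1000)]
ys = [3.0 * x + 5.0 + random.gauss(0, 1) for x in xs]


def fit_line(xs, ys):
    """Ordinary least squares for one variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

    With only one thousand rows the estimated slope and intercept should already land close to the values used to generate the data, which is the whole point: the method matters, not the size of the data.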
 
   The creation of Data Science, in simple words: two sides that were not fully connected had to merge to keep up with the new fast-paced, technological world.

  1. Statistics/mathematics: formulate proper models to generate insights;
  2. Computer science: build the bridge between the models and the data, so that results arrive in a feasible time;
  3. There are only two sides because Machine Learning is entirely based on math and stats;
  4. Theoretical computer science could itself be considered a branch of mathematics.

What knowledge and skills do you need to become a Data Scientist?
     If you know these topics well, you are a Data Scientist.
    1. Linear algebra
    2. Non-linear systems, dynamic systems
    3. Analytical geometry
    4. Optimization
    5. Calculus
    6. Statistics and probability
    7. Programming languages (R, Python, SAS, JavaScript)
    8. Software: Excel, IBM SPSS, SAS Enterprise Miner
    9. General DS & MLaaS platforms:
      1. IBM Watson Studio & Analytics
      2. Azure Machine Learning
      3. Google Cloud Machine Learning
      4. H2O
      5. BigML
      6. RapidMiner and KNIME
      7. Amazon SageMaker
    10. Data visualization: Power BI, Tableau, R/Python using plotly/ggplot/highcharts
    11. Machine Learning (supervised, unsupervised and reinforcement learning)
    12. Big Data (MapR, Redshift, Snowflake, BigQuery, Cassandra, Hadoop, Spark)
    13. Hardware (CPU, GPU, TPU, FPGA, ASIC)


  So, will you become a data scientist?



Some helpful resources

As I worked on projects, I found these resources helpful.  Remember, resources on their own aren't useful -- find a context for them:
  • Khan Academy -- good basic statistics and linear algebra content.

