Hello, and welcome back, folks! In the previous article, we looked at career prospects in Data Science and how to choose between Data Scientist and Data Analyst as a career option. Both of these professions are based on Data Science, so first we need to understand what Data Science is. Today I am going to discuss what exactly Data Science is, why it has become so popular these days, how you should get started, and what your learning path towards Data Science should look like.
What is Data Science?
What exactly is Data Science? Is
it about creating cool visualizations? Is it about statistics? Or about coding?
Or about writing complex machine learning models? What is it?
Data Science, in simple words, is solving problems and creating impact using data and past experience. For example, a Data Scientist at a product-based company might work on new product recommendations, on improving existing products, or on Data Analytics platforms used by internal or external teams.
How did it start?
It all started back in 2001, when William S. Cleveland published his paper, “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics”, where he brought Data Mining and Computer Science together with statistics and made the practical use of statistics a lot more technical. We could now use computing power along with statistics, and this amalgamation was called Data Science.
Hierarchy of needs
So, now let's try to understand what the Data Science space looks like. It can be pictured as a Data Science pyramid, which shows the hierarchy of needs, or the skills, that make up the Data Science domain. The solution to any Data Science problem starts with the collection of data, which sits at the bottom of the pyramid. The second layer is about how easy it is to access the data and how efficient the data infrastructure is, which is the responsibility of data engineers.
Once the data infrastructure is set up, the analysts explore and transform this data to uncover hidden patterns, produce analytics, and create visualizations that make the data easy to read.
Then come the Scientists and Senior Analysts, who have expertise in AI, Deep Learning, designing experiments for A/B testing, and so on. So, you can see that the Data Science domain in itself gives rise to several job roles: Data Engineers, Data Analysts, Machine Learning Engineers, Data Scientists, Research Scientists, and Core Scientists. There are a bunch of opportunities that you can aim for.
Why has it become a Buzzword?
So, why has Data Science become a buzzword these days? We see a lot of news about Data Science based companies and startups raising a lot of money. We saw Neural Magic raise $15M, and Alteryx acquire the machine learning startup Feature Labs.
Companies are raising seed funding of 50 million dollars just to launch a machine learning framework, and the list doesn't stop. There is an immense number of possibilities in Data Science, and a large number of investors are ready to fund these companies. This is going to be the future, be it Banking, Finance, Healthcare, Agriculture, Gaming, Entertainment, Space Exploration, or self-driving vehicles. You name it.
What should we Focus on?
When it comes to learning Data Science, I believe there are four major subjects, or four major branches of the Data Science curriculum, that an individual should work on.
The first branch is programming tools, where you cover what Python is and how to program in Python or in R, depending on the language you choose. Then you learn how to use notebooks and various libraries like NumPy, Pandas, TensorFlow, Keras, etc.
The second subject is Data Engineering, where you learn how to engineer the data and how to extract it. This covers writing SQL queries, exploratory analysis, data wrangling, databases, and APIs.
The third subject is mathematics and statistics, where you learn about linear algebra, statistics, probability, hypothesis testing, A/B testing, and how to design your experiments.
The fourth and final subject is Algorithms and Systems, where you learn about machine learning and deep learning algorithms, how to build recommender systems, and other concepts. You have to practice all these concepts in the form of projects that solve real-world problems.
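To make the idea of a practice project a bit more concrete, here is a minimal sketch of what a tiny one could look like. It touches three of the branches above: Python, a SQL query, and an A/B test. Everything in it is an assumption made up for illustration: the visits table, the variant names A and B, the visitor counts, and the helper function two_proportion_z_test. It uses only Python's standard library, and the test shown is a textbook two-proportion z-test, not the only way to analyze an A/B experiment.

```python
import math
import sqlite3

# Hypothetical experiment log: one row per visitor, (variant, converted 0/1).
# These numbers are invented purely for illustration.
rows = (
    [("A", 1)] * 40 + [("A", 0)] * 360
    + [("B", 1)] * 60 + [("B", 0)] * 340
)

# Data engineering step: load the rows into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (variant TEXT, converted INTEGER)")
conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)

# SQL step: count visitors and conversions per variant.
query = """
    SELECT variant, COUNT(*) AS visitors, SUM(converted) AS conversions
    FROM visits
    GROUP BY variant
    ORDER BY variant
"""
summary = {variant: (n, conv) for variant, n, conv in conn.execute(query)}
conn.close()


def two_proportion_z_test(n_a, x_a, n_b, x_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = x_a / n_a, x_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


(n_a, x_a), (n_b, x_b) = summary["A"], summary["B"]
z, p_value = two_proportion_z_test(n_a, x_a, n_b, x_b)
print(f"A: {x_a}/{n_a} converted, B: {x_b}/{n_b} converted")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

In a real project the data would come from a database or an API rather than a hard-coded list, and you would decide the sample size and significance level before running the experiment.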
So, now we have understood that there is a lot of interest in pursuing Data Science, and for good reasons: high job satisfaction, high demand, high salaries, and high impact.
So how should you get started? A quick Google search for Data Science will give you a lot of resources to learn from: podcasts, forums, blogs, articles, online courses, self-directed curricula, boot camps, etc.
You definitely won't find all of this in a single resource. So, to cover the entire Data Science space, the four branches of the Data Science curriculum, and all the concepts mentioned in the Data Science pyramid, you need to search for them individually and learn from data scientists working at Google, Microsoft, Amazon, and other big companies. Also look up which companies are hiring for Data Science and what they look for in a Data Scientist.
Conclusion
Start by learning Python programming, and then continue to learn from various resources until you have a good command of each topic. With this, I would like to wrap up this article. I hope I have covered the details that you need to begin your journey in the Data Science field. Thank you for reading this article, and I would request you to also check out my article on the difference between Data Scientist and Data Analyst.