I guess you’re already familiar with R and have a sexy Data Scientist job, but you’ve heard the cool kids in the industry are also using Python. I said also because, as you can see from people’s job titles, Reddit, LinkedIn and so on, there are plenty of Data Scientists who use both.
You have probably figured that I’m an R enthusiast myself. I realized a long time ago that I also have to learn Python (I wanna be a cool kid as well). I struggled at first. Why? My current and previous roles were very much R based. Nobody used Python.
I knew it would be easier to move in this direction together with the team, so I proposed that all 5 of us (Data Scientists) on the team take a Python course. One colleague was on board. In the summer of 2018 we sat in on an in-class Python training. The training wasn’t too bad, but it was general Python programming, not Data Science specific. In order to get the diploma we had to do an assignment, so we did a Data Science one and started discovering a bit of Python for Data Science. I can tell you that it was a bit painful; using pandas was just weird.
I like how this guy describes the experience in the Reddit post below:
This is pretty much the whole 2018 Python experience that I had. And I stopped.
In 2019 I went on maternity leave to take care of Baby B (my 1st born). I felt like it was the perfect time to really learn Python for Data Science and put it on my CV. I bought a bunch of books.
(Note: I participate in affiliate programs. This post may contain affiliate links from Amazon or other publishers I trust (at no extra cost to you). I may receive a small commission when you buy using my links; this helps to keep the blog alive! See disclosure for details.)
I also enrolled in a number of really good Pluralsight courses. Pluralsight is golden for a Data Scientist, ML Engineer and so on:
- Doing Data Science with Python
- Understanding Machine Learning with Python
- Pandas Fundamentals
- Exploring Web Scraping with Python
I would say that in 2019 I got a good grasp of how Python works (compared to R):
- Python is a general-purpose programming language that also knows how to do what R does best 😜: data wrangling, engineering, feature selection, etc. Also, pandas is trying to copy dplyr;
- There are many IDEs where you can write and run Python code, and the majority of people use Jupyter Notebook (yuck 🤢);
- It’s a mess if you don’t know how to use a specific Python version and virtual environments;
- Python is pretty strong on deep learning, deployment and production of models.
In 2021 I went back to work and jumped on the first Python project in the team. In the meantime, another Data Scientist colleague had switched to Python and built a robust neural network project for fraud. It was the perfect opportunity for me to put my new skills to the test and see if my efforts in learning Python had paid off. They had: I was able to validate the fraud project and give some recommendations. For every big project we have a Development and a Validation part, so that at least 2 Data Scientists are involved in each project.
Was I able to develop my own project in Python? No, because I went on maternity leave (again 😅).
We’re now in 2022 and I think I found something that is super helpful in switching to Python: RStudio is now home to both R and Python (hurray!!). How cool is this?
When I found out, I jumped straight into updating my RStudio in order to be able to run Python. For the initial setup, just follow the official RStudio guidelines.
Do I think this helped? Oh yeah – it helped me a lot. So much so that I even decided to start blogging about my mummy & Data Science experience.
Python does not look so scary anymore:
One major difference is that now you have to “pip install” your libraries in the Terminal, not in the Console / Script the way you would in R – not a big deal.
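Since the install happens in the Terminal and the code runs somewhere else, it can be worth checking that both are talking to the same Python. Below is a small sketch (the `environment_report` helper is my own hypothetical name, not something RStudio provides) that prints which interpreter is running, whether it sits in a virtual environment, and whether it can see a given package:

```python
import importlib.util
import sys


def environment_report(packages):
    """Describe the running interpreter and which top-level packages it can import."""
    return {
        # Full path of the Python binary executing this code.
        "executable": sys.executable,
        "version": ".".join(str(part) for part in sys.version_info[:3]),
        # True inside a venv/virtualenv; conda environments are not detected this way.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # find_spec returns None when a top-level package cannot be found.
        "installed": {
            name: importlib.util.find_spec(name) is not None for name in packages
        },
    }


report = environment_report(["pandas", "json"])
for key, value in report.items():
    print(f"{key}: {value}")
```

If a package shows up as `False`, one way to make sure it lands in the right place is to run pip through that same interpreter in the Terminal, i.e. the printed executable path followed by `-m pip install` and the package name.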
So, RStudio as the IDE + a personal project = the way to go.
You know what they don’t say – 4th time is a charm :). I can thank RStudio for making me fall in love with Python.
Now I’m trying to learn OOP for Data Science. I’ll update you soon!
This is a personal blog. My opinion on what I share with you is that “All models are wrong, but some are useful”. Improve the accuracy of any model I present and make it useful!