Tags: data science, mentoring, ML, MLOps

Data science mentoring is my passion. I love helping data professionals step out of their comfort zones and achieve career growth. Recently, I had the opportunity to host a Gathers meetup called "Data Science Mentorship: A Win-Win Meetup." At the meetup, I shared my thoughts on the benefits of data science mentoring and answered questions from the audience.

This blog post is a summary of the questions and answers from the meetup. I hope this information is helpful to you, whether you are a mentor or a mentee.

Benefits of data science mentoring

- Mentors: Mentoring can help you develop your leadership skills, give back to the community, and learn new things from your mentees.
- Mentees: Mentoring can help you learn new skills, advance your career, and build relationships with experienced professionals.

Tips for mentors:

- Be supportive and encouraging. Your mentee needs to know that you believe in them and that you are there to help them succeed.
- Provide guidance and feedback. Help your mentee set goals, develop a plan, and identify resources.
- Be a role model. Share your experiences and insights with your mentee.

Tips for mentees:

- Be proactive. Don't be afraid to ask for help and advice.
- Be open to feedback. Be willing to learn from your mistakes and grow.
- Be respectful of your mentor's time and expertise.

Ready to jump right in and uncover answers to some of the burning questions in the world of data science mentorship?

1. What is Data Science?

Data science is a versatile field that equips data professionals with the tools to tackle complex problems and make informed decisions by applying mathematical and statistical concepts in a systematic and reproducible manner.

Another way of explaining this is how I explain it to my kids:

Data science is like playing a special game of hide and seek with your teddy bear. Imagine you really, really love your teddy bear, but you can't remember where you left it in your room. You want to find it so you can hug it and feel happy again.

So, you ask someone to help you, like a magic friend. This magic friend uses their superpowers to figure out where your teddy bear might be hiding. They look around your room, and when they get closer to the teddy bear, they say, "You're getting warmer!" But if they go in the wrong direction, they say, "You're getting colder!"

Data scientists are like those magic friends. They help grown-ups with important stuff, like making sure cars don't break down unexpectedly, deciding who can borrow money from a bank, and figuring out who might stop using a favorite game. They use their special skills to solve big problems and make the world a better place, just like how you want to find your teddy bear to make yourself happy again.

For a more formal and concise definition of data science that you can use during an interview, consider the following: Data science is the systematic application of scientific methods, algorithms, and data processing systems to extract knowledge and insights from diverse forms of data, encompassing both structured and unstructured sources.

Find here a short article and a mini quiz.

2. Where to start?

Where to start in your data science journey depends on your current background. If you have experience in data-related fields like data analysis, software development, or software engineering, you already have a solid foundation. However, for beginners, the first steps often involve gaining a grasp of fundamental concepts in statistics and algebra.
Here are some resources to help you get started:

- MIT OpenCourseWare: Statistics for Applications
- MIT OpenCourseWare: A 2020 Vision of Linear Algebra
- Data Science Roadmap

3. Which field should I master in?

Data scientists who are versatile and adaptable are the most successful. This means being able to quickly understand any business and learn new technologies.

Here are some tips for becoming a versatile data scientist:

- Learn how to learn. Data science is a constantly evolving field, so it is important to be able to learn new things quickly. This includes learning new programming languages, new machine learning algorithms, and new data science tools and technologies.
- Start with Python. Python is a popular programming language for data science because it is easy to learn and has a wide range of libraries and tools available. However, be open to learning other programming languages as well, such as Java, R, and Scala.
- Learn programming languages for general purposes, not just for data science. This will make you more versatile and adaptable. For example, learning Java will make it easier for you to work with big data technologies, and learning R will make it easier for you to work with statistical analysis tools.
- Learn clean coding practices. Clean coding is important for all software development, but it is especially important for data science, because data science code is often complex and needs to be easily understood and maintained by others. This is a good article to read on clean coding.
- Learn modularity and design patterns. Modularity and design patterns are important for writing maintainable and reusable code (see the short sketch after this list).
- Stay up to date with the latest trends and technologies. The field of data science is constantly evolving. Read industry publications and blogs, attend conferences and workshops, and take online courses.
- First, discover the business you are trying to help with ML / AI. Take the time to understand the business and the problem you are trying to solve. Spend time in the business understanding phase and interview your stakeholders to unlock insights about the problem and the needs of the people you are building for. This will help you develop effective machine learning solutions.

By following these tips, you can become a versatile and adaptable data scientist.
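To make the clean coding and modularity advice concrete, here is a minimal, hypothetical sketch of a small pipeline split into single-purpose functions. It is not from the meetup, and the file name and column names ("transactions.csv", "target", "amount") are illustrative assumptions:

    import numpy as np
    import pandas as pd

    def load_data(path: str) -> pd.DataFrame:
        """Read the raw data from a CSV file."""
        return pd.read_csv(path)

    def clean_data(df: pd.DataFrame) -> pd.DataFrame:
        """Drop duplicates and rows missing the target column."""
        return df.drop_duplicates().dropna(subset=["target"])

    def add_features(df: pd.DataFrame) -> pd.DataFrame:
        """Derive an example feature without mutating the input."""
        out = df.copy()
        out["amount_log"] = np.log1p(out["amount"])  # hypothetical column
        return out

    def run_pipeline(path: str) -> pd.DataFrame:
        """Compose the steps; each can be tested and reused on its own."""
        return add_features(clean_data(load_data(path)))

    if __name__ == "__main__":
        print(run_pipeline("transactions.csv").head())  # hypothetical file

Each function does one thing, so it can be unit tested, swapped out, or reused in a notebook on its own, which is exactly what modularity buys you in data science code.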

Two Babies Under 3

When is scientifically the best time to have your second? (posted by u/EFNich in r/ScienceBasedParenting)

Inspired by the above Reddit post and the different views on the topic of picking the best interpregnancy interval, I created the timeline below, based on the scientific evidence, of the best time to have your second baby. It adds value to the Reddit post by visually summarising all the information from the reference links and also spotting some contradictory outcomes.

Insights

The best period to have your second baby is between 18 and 24 months after delivering the first baby.

- STUDY 1: Conceiving less than 6 months after delivery was associated with an increased risk of adverse outcomes for mom and baby, but waiting 24 months may not be necessary in high-income countries.
- STUDY 2: Children conceived less than 18 months, or 60 or more months, after their mother's previous birth were more likely to have ASD than children conceived 18 to 59 months after the previous birth.
- STUDY 3: To reduce the risk of pregnancy complications and other health problems, research suggests waiting 18 to 24 months, but less than five years, after a live birth before attempting your next pregnancy.
- STUDY 4: For children conceived less than 12 months or more than 72 months after the birth of an older sibling, the risk of autism was two to three times higher than for those conceived 36 to 47 months later.
- STUDY 5: The biggest risk was recorded for children conceived less than 12 months after the birth of an older sibling.
- STUDY 6: The risk of preterm birth was high if the interpregnancy interval was under 6 months. The risk declined as the interval increased and reached its lowest level at intervals of 12 to 23 months. For intervals of 24 months or more, the risk gradually increased again, and it was high for intervals of 120 months or more.
- STUDY 7: An increased risk of preterm birth for children born after interpregnancy intervals of less than 13 months or more than 60 months, relative to the reference category of 19-24 months.
- STUDY 8: "We compared approximately 3 million births from 1.2 million women with at least three children and discovered the risk of adverse birth outcomes after an interpregnancy interval of less than six months was no greater than for those born after an 18-23 month interval," Dr Tessema said. "Given that the current recommendations on birth spacing are for a waiting time of at least 18 months to two years after live births, our findings are reassuring for families who conceive sooner than this. However, we found siblings born after a greater than 60-month interval had an increased risk of adverse birth outcomes."
- STUDY 9: The same Mayo Clinic guidance as STUDY 3; additionally, balancing concerns about infertility, people older than 35 might consider waiting only 12 months before becoming pregnant again.
- STUDY 10: Intervals shorter than 36 months and longer than 60 months are associated with an elevated risk of infant death and other adverse outcomes.
- STUDY 11: Compared to individuals whose first two children were born at most 18 months apart, individuals whose children were more widely spaced had a lower divorce risk.
References

- STUDY 1: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0255000
- STUDY 2: https://www.cdc.gov/ncbddd/autism/features/time-between-births.html
- STUDY 3: https://www.mayoclinic.org/healthy-lifestyle/getting-pregnant/in-depth/family-planning/art-20044072
- STUDY 4: https://time.com/4033506/autism-risk-siblings/
- STUDY 5: https://researchonline.lshtm.ac.uk/id/eprint/4663143/7/Schummers_etal_2021_Short-interpregnancy-interval-and-pregnancy.pdf and https://www.dovepress.com/association-of-short-and-long-interpregnancy-intervals-with-adverse-bi-peer-reviewed-fulltext-article-IJGM
- STUDY 6: https://www.michigan.gov/-/media/Project/Websites/mdhhs/Folder4/Folder15/Folder3/Folder115/Folder2/Folder215/Folder1/Folder315/200804IPI_PTB_LBW_SGA_2008-2018.pdf?rev=e978a7ae96db445ebb0a4cf6d31ea8f9
- STUDY 7: https://www.tandfonline.com/doi/full/10.1080/00324728.2020.1714701
- STUDY 8: https://www.sciencedaily.com/releases/2021/07/210719143421.htm
- STUDY 9: https://www.mayoclinic.org/healthy-lifestyle/getting-pregnant/in-depth/family-planning/art-20044072
- STUDY 10: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6667399/
- STUDY 11: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6993964/

Tags: Google BigQuery, Python

Get data from a Google BigQuery table using Python 3 – a specific task that took me a while to complete due to little or confusing online resources.

You might ask why a data scientist was stuck solving such a trivial data engineering task. Well... because most of the time there is no proper data engineering support in an organization.

Steps to follow if you want to connect to Google BigQuery and pull data using Python:

0. Ask your GCP admin to generate a Google Cloud service account key and save it in a JSON file.

1. Install the libraries (the umbrella google-cloud package is deprecated, so google-cloud-bigquery is enough):

    pip install google-cloud-bigquery
    pip install tqdm
    pip install pandas-gbq

2. Import the libraries:

    import os
    from google.cloud import bigquery
    import pandas_gbq

3. Set your Google credentials (the JSON file created at Step 0):

    os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path to your json file/filename.json'

4. Define a BigQuery client:

    project_id = 'yourprojectname'
    client = bigquery.Client(project=project_id)

5. Define the query and save it in a variable. BigQuery expects fully qualified table references of the form project.dataset.table:

    query = """
    SELECT *
    FROM `yourproject.yourdataset.yourtable`;
    """

6. Use pandas_gbq to read the results and save them in a DataFrame:

    queryResultsTbl = pandas_gbq.read_gbq(query, project_id=project_id, dialect="standard")

Putting it all together, something like this:

    import os
    from google.cloud import bigquery
    import pandas_gbq

    # point the libraries at the service account key created at Step 0
    os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'google-key_Cristina.json'

    project_id = "project-name"
    client = bigquery.Client(project=project_id)

    # fully qualified table reference: project.dataset.table
    query = """
    SELECT *
    FROM `project-name.dataset-name.table-name`;
    """

    queryResultsTbl = pandas_gbq.read_gbq(
        query,
        project_id=project_id,
        dialect="standard",
    )

    queryResultsTbl.head(10)
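If you would rather not depend on pandas_gbq, the official client can also return a DataFrame directly. Here is a minimal sketch under the same assumptions as above (the key file name and the project-name.dataset-name.table-name reference are illustrative placeholders); note that newer versions of google-cloud-bigquery may additionally require pip install db-dtypes for the DataFrame conversion:

    import os
    from google.cloud import bigquery

    # same service account key as in the example above
    os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'google-key_Cristina.json'
    client = bigquery.Client(project="project-name")

    # run the query and materialise the result as a pandas DataFrame
    df = client.query(
        "SELECT * FROM `project-name.dataset-name.table-name`"
    ).to_dataframe()

    print(df.head(10))

Both approaches give you the same DataFrame in the end; pandas_gbq is convenient if you already use it elsewhere, while client.query().to_dataframe() keeps you on the officially maintained BigQuery client with one less dependency.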