March '22 Heartbeat

This month you will find:

🇺🇦 A special note on the war in Ukraine,

🧘🏻‍♀️ MLOps is a mess, but that's ok,

🥰 Tutorials and workflows from the Community,

🗣 Upcoming Events,

🔺 MLOps Maturity Models,

💻 Online Course(s) updates,

📖 New doc,

🚀 Info on our growing team, and more!

  • Jeny De Figueiredo
  • March 17, 2022
  • 7 min read

On the war in Ukraine 🇺🇦

While the war in Ukraine has impacted the world, it has also greatly impacted our company, as we have team members living in Ukraine and Russia, and many with family ties to both. Our hearts are with our Iterative family in Ukraine, and we are committed to doing everything we can to support the safety of our Ukrainian colleagues, as well as the transition of our Russian colleagues, during this crisis.

We as a company are against this war. We have donated to the humanitarian efforts to help the people of Ukraine and are matching our team members' donations as well. We are proud of the perseverance, care, and support coming from our team at this time.

If you are able, we ask that you consider these resources as ways to help. Our hope is that the world will find a quick and peaceful end to this war and Ukraine will be restored, even stronger than before. 🇺🇦

🪙 Donations

❤️‍🩹 Other ways to help

AI/ML News

Mihail Eric: MLOps is a Mess But That's to be Expected

Mihail Eric writes a long but well-worth-it piece entitled MLOps is a Mess But That’s to be Expected. In it he discusses the allure of seeking a machine learning career, only to run smack into the giant wall of learning that encompasses the space, not the least of which is the multitude of tools to pick through once you get there. He reviews the state of machine learning and adds some DevOps history for perspective on MLOps.
You will find advice for newcomers and some final, thorough thoughts and predictions, especially as they relate to “ML at a reasonable scale” companies.
Definitely worth your review!

Gartner Hype cycle for MLOps (Source link)

Community News

Kevin Lu: Learn how to use Data Version Control to remove the third wheel from your relationship

In this hilarious post, Kevin Lu teaches us how to use DVC to disconnect from our unhealthy, addictive relationships with our computers and make room for more human relationships! You don't want to miss the humor, productivity and wisdom here, all while learning how each of DVC's commands helps your machine learning engineering exploits.

Thanakorn Panyapiang: Putting A Machine Learning model into production with Google Cloud Platform and DVC

Are you a data scientist new to putting models into production?
In this piece, Thanakorn Panyapiang describes various model deployment strategies for putting projects into production, including model-as-service, batch prediction and model-on-edge. In his example, he uses a batch prediction approach with an image segmentation model to identify clouds. He uses DVC as a model registry with Google Cloud Storage, and GitHub Actions to automate the Cloud Functions deployment. See all the steps he outlines in his piece to get real value out of your machine learning projects.

Data Pipeline (Source link: Author)

Matthew Upson: MLOps for Conversational AI with Rasa, DVC, and CML (Part II)

In the December Heartbeat, I told you about Matt Upson's first post in his series on using DVC, CML and Rasa together. In this second post, he goes through some Rasa basics and gets the DVC pipeline set up, with its train and test stages, params, dependencies, outs and metrics. He also covers syncing with DVC, making changes, the dvc repro command, the dvc.lock file, and pushing to remote storage. We're looking forward to the next installment, when we will see how CML can be used to automatically train the model.

DVC metrics diff in Rasa project (Source link)
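For readers new to the idea behind dvc repro and the lock file mentioned in Matt's post, here is a toy Python sketch of the general concept: a stage is re-run only when the recorded hashes of its dependencies no longer match the files on disk. This is not DVC's implementation or its API. The function names, the toy.lock.json file, and its JSON layout are all made up for illustration.

```python
from __future__ import annotations

import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def file_hash(path: Path) -> str:
    """Return the MD5 hex digest of a file's contents."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def load_lock(lock_file: Path) -> dict:
    """Read the toy lock file, or return an empty record if it is missing."""
    return json.loads(lock_file.read_text()) if lock_file.exists() else {}


def repro(stage: str, deps: list[Path], lock_file: Path,
          run: Callable[[], None]) -> bool:
    """Run `run` only if a dependency changed since the last recorded run.

    Returns True when the stage was executed, False when it was skipped.
    (Hypothetical helper for illustration, not part of DVC.)
    """
    lock = load_lock(lock_file)
    current = {str(dep): file_hash(dep) for dep in deps}
    if lock.get(stage) == current:
        return False  # nothing changed, skip the stage
    run()
    lock[stage] = current  # remember the hashes we just ran against
    lock_file.write_text(json.dumps(lock, indent=2))
    return True


def demo() -> list[bool]:
    """Run a fake 'train' stage three times against a temporary data file."""
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "data.csv"
        lock_file = Path(tmp) / "toy.lock.json"
        data.write_text("text,intent\nhello,greet\n")

        def train() -> None:
            pass  # stand-in for a real training command

        results = [repro("train", [data], lock_file, train)]      # first run
        results.append(repro("train", [data], lock_file, train))  # unchanged
        data.write_text("text,intent\nhello,greet\nbye,goodbye\n")
        results.append(repro("train", [data], lock_file, train))  # data edited
        return results


print(demo())  # [True, False, True]
```

The second call is skipped because nothing changed, which is the behavior that makes re-running a pipeline cheap after small edits.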

Sibanjan Das: MLOps for Enterprise AI

Sibanjan Das notes the trending of the MLOps keyword in his piece in DZone. Sibanjan gives an overview of MLOps and how it supports the AI/ML ecosystem to deliver return on investment for ML projects. He reviews the components of MLOps, including automated ML model building pipelines, model serving, model version control, model/data monitoring, and security and governance. He also discusses the MLOps maturity models of Google and Microsoft (see below). I found this part especially interesting as it mirrors what we see in our Community and how they develop using our tools as well. Finally, he outlines some tools that help in the process, including DVC.

Comparing Google's and Microsoft's maturity models (Source link)

Jagreet Kaur: Implementing DevOps for Machine Learning - A Quick Guide

Jagreet Kaur of Xenonstack authors a guide on applying DevOps to machine learning and, more generally, on what the continuous development life cycle looks like for machine learning projects. Jagreet goes over all the fun continuous topics, including continuous integration, continuous testing, continuous retraining, and continuous deployment. She gives an overview of the use of TensorFlow, PyTorch, and Docker, as well as DVC for version control, experiment management, deployment, and collaboration. Additional resources from Xenonstack are provided for further review.

Yuqi Li: Why MLOps should be Open Source

In this opinion piece in Towards Data Science, Yuqi Li gives an overview of the meaning and components of MLOps and identifies a number of good open-source tools in the space, which of course includes DVC. He also outlines a number of reasons why MLOps should be open source. Among the reasons making the cut:

  1. Cost-Effectiveness
  2. Ownership
  3. No privacy concern
  4. Build Community around the tool

Examine these reasons to determine if open source makes sense for your MLOps work. We think you will find that it does.

And speaking of Community…

Mert Bozkir: Community-Driven Learning

If you’ve been in our Discord server, been to one of our Meetups, or interacted with us on Twitter, you’ve surely come across DVC Community All-Star Mert Bozkir. Mert has written a great piece entitled Community Driven Learning, in which he describes why it is the best way to learn. He outlines his reasoning, including the support, encouragement, and motivation you can get from the Community to be persistent in your learning efforts. He also includes eight communities that are great for learning, with invites included. Be sure to check it out!

Community Driven Learning (Source link: Unsplash by john_cameron)

And speaking of learning…

Company News

Online Course(s) Updates

  • We now have over 250 students taking the course and 10 students who have completed it! 🎉 Thank you to all who have given us feedback. We are actively working on making adjustments to the course and improving the next one.

  • We have a new look! The website for our online course, Iterative Tools for Data Scientists and Analysts, has been streamlined to show more clearly what our students need in the course!

  • We have already begun working on the second course which will be more advanced (remember those maturity models outlined in the article from DZone above?) and will cover scenarios with CML. We are also working on creating an ebook for each video that will provide relevant information, diagrams, and links with the video content instead of being batched at the end of the module. The ebook format will also let you take your own notes as you study!

New Hires

Mike Moynihan joins us from Brooklyn, NY as an Account Executive. He previously worked at Code Climate as the Manager of Business Development and an Account Executive. Mike's really into biking and will be participating in the 5-Boro Bike Tour in NYC this year. He's also a baker and has been baking bread and other baked goods consistently for about 3 years now. Finally, when not working or biking or baking, you may find him playing one of the video or board games in his 500-strong collection.

Rob De Wit joins our team from Utrecht, the Netherlands as a Developer Advocate. Rob's first focus will be on developing those new ebooks for our new online courses mentioned above. He has a background in Information Sciences and previously worked at and Devoteam. When not working, Rob likes photo and video editing, board games, organizing meetups, and hiking (the Peaks of the Balkans are on his bucket list).
He also stays busy by learning Spanish and dabbling in local politics.

Upcoming Events

March Office Hours!

Be sure to join us at the March Office Hours Meetup, where Fabian Zills, PhD student at University of Stuttgart, will present his ZnTrack ("zinc track") project which creates, runs and benchmarks DVC pipelines in Python and Jupyter Notebooks.
Find the repo here!

March Office Hours - ZnTrack

RSVP for DVC Office Hours - ZnTrack - Create, Visualize, Run and Benchmark DVC Pipelines in Python & Jupyter Notebooks


📖 New Docs


CML has a new command reference for cml ci, a command that lets you prepare the Git repository for CML operations. For more info on cml ci, check out the docs.

Open Positions

Even with our amazing new additions to the team, we're still hiring! Use this link to find details of all the positions and share with anyone you think may be interested! 🚀

Iterative is Hiring (Source link)

Tweet Love ❤️

We were really excited to see the Sicara team all decked out in their DVC swag this month in this Tweet. If you haven't seen the video of Antoine Toubhans's integration with Streamlit, you can see it on our YouTube channel or catch the presentation at this year's PyCon Berlin.

How do you get some DVC swag, you ask? Write us some great content, contribute to our tools, or give a presentation at one of our Meetups! We'd love to have you!

Have something great to say about our tools? We'd love to hear it! Head to this page to record or write a Testimonial! Join our Wall of Love ❤️

Do you have any use case questions or need support? Join us in Discord!

Head to the DVC Forum to discuss your ideas and best practices.
