Join the DVC Ambassador Program!
We're launching our ambassador program for people all around the world to get involved in the DVC community. Our first ambassador, Marcel Ribeiro-Dantas, shares a guest blog about how ambassadors support open source projects through blog writing, public outreach, and code.
- Marcel Ribeiro-Dantas
- May 08, 2020 • 5 min read
DVC's software can be everywhere, but its developers can’t - that’s why ambassadors, folks who do outreach and community building around projects they love, are a key part of the open source community. DVC is starting an ambassador program to help people who are passionate about our mission get involved.
As the first DVC ambassador, and a Fedora ambassador before that, I can tell you a bit about the role. As a representative of open source projects, I've participated in lots of events, made friends, and traveled. Every single time I’ve contributed, I got this nice feeling that it was all worth it. I believe that if you agree with the core values of the project, a great relationship lies ahead :).
So what are the core values of DVC, exactly? DVC is founded on the principle of engineering solutions for making data science and machine learning rigorous and reproducible. If this matters to you, too, you might be a good fit for our ambassador program!
As an ambassador, you’ll act as a bridge between DVC and your community. There are lots of ways to do this, big and small. For example:
- Write a blog post talking about how you use DVC in your projects
- What about creating a network of DVC users and data scientists in your town? Even though we’re self-isolating now, you can still organize online meetups. We already did one! We help cover costs to organize meetups.
- Do you want to talk about DVC at your office, or at a conference? We help speakers develop talks, and we have some discretionary funds for travel on a case-by-case basis.
- Want to develop a feature for DVC? We welcome contributions to the code base, even if it’s your first pull request ever.
Being an ambassador means getting closer to the team in charge of DVC, but at the same time, it means going farther to reach people outside the organization, including people who don’t know about DVC yet, people who need some help getting started, and people who are already excited about our mission and want to find meaningful ways to pitch in.
DVC got started in 2017 as a personal project by Dmitry Petrov (we just celebrated our 3rd birthday). Previously, Dmitry worked at Microsoft as a data scientist and did a PhD in Computer Science. In 2018, Dmitry teamed up with his co-founder Ivan Shcheklein (co-founder of The Tweeted Times and Sedna contributor) to incorporate Iterative.ai and grow the project. Iterative.ai is building enterprise tools for collaboration on ML projects. Currently, Iterative.ai's open source flagship project is Data Version Control (DVC), an open source version control system for managing complex workflows, datasets, and models.
Development is ongoing in the core DVC project as well as new ventures into MLOps and Continuous Integration & Delivery (CI/CD) for data science. The team is small-and-mighty, with developers, engineers, and data scientists on four continents. The open source community is a huge part of all Iterative.ai projects; currently, DVC has more than 5,000 stars on GitHub and more than 100 individual contributors!
One of DVC’s main principles is adapting existing software engineering practices to machine learning. For example, DVC is built around Git version control: in an ML project using DVC, each experiment corresponds to a Git commit. When you check out any commit, you’ll see the source code as it was when you made the commit, as expected. But you’ll also see your datasets as they were and the exact pipeline of commands you ran in that experiment!
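To make the "experiment equals commit" idea concrete, here is a minimal toy sketch in Python. It is not DVC's implementation or API: the `cache` and `commits` dictionaries and the `commit` and `checkout` helpers are hypothetical names made up for illustration. The assumption is simply that each commit stores the code, the pipeline command, and a small content hash that points at the dataset, while the dataset itself lives in a separate store.

```python
import hashlib

# Toy model only: these names are illustrative, not DVC's real internals.
cache = {}    # content hash -> dataset bytes (stands in for a data store)
commits = {}  # commit id -> snapshot of code, data pointer, and command


def content_hash(data: bytes) -> str:
    """Return a hex digest that identifies this exact dataset content."""
    return hashlib.sha256(data).hexdigest()


def commit(commit_id: str, code: str, data: bytes, command: str) -> None:
    """Record one experiment: code, a pointer to the data, and the command run."""
    digest = content_hash(data)
    cache[digest] = data  # the (possibly large) data is kept outside the commit
    commits[commit_id] = {"code": code, "data_hash": digest, "command": command}


def checkout(commit_id: str) -> dict:
    """Return the code, dataset, and command as they were at that commit."""
    snapshot = commits[commit_id]
    return {
        "code": snapshot["code"],
        "data": cache[snapshot["data_hash"]],
        "command": snapshot["command"],
    }


# Two experiments with different code, data, and commands.
commit("exp-1", "lr = 0.1", b"rows: 100", "python train.py")
commit("exp-2", "lr = 0.01", b"rows: 250", "python train.py --epochs 20")

# Going back to the first experiment restores all three pieces together.
restored = checkout("exp-1")
print(restored["code"], restored["data"], restored["command"])
```

The point of the sketch is the design choice, not the details: because the commit only holds a small pointer to the data, going back to an old experiment brings back the matching dataset and command along with the code.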
Like any volunteer position, the main benefit is getting to be involved in a project you believe in. But there are some perks:
- Establishing a formal relationship with DVC that can go on your CV/resume. We'll boost your content on our social channels, too.
- Access to support from the DVC team, such as financial resources to organize your own meetup for local data scientists and ML enthusiasts
- Mentorship about crafting blogs and talks, if desired. DVC team members regularly help people in the community develop their presentations and blogs for accuracy and clarity
- A closer relationship with the DVC team means more chances to participate in conversations that guide our product decisions.
If you're a student or early-career professional, you can learn a lot by interacting with us! While you can certainly write a blog post or organize a meetup without being an ambassador, the program is a way to fast-track your learning: you'll have the creators of DVC helping you understand it well, and helping you discover features and best practices you might not have known about.
If you're already active in the open source or MLOps community, then becoming an ambassador is a solid way to cement your relationship with DVC. We'd love to recognize you for the amazing stuff you already do.
If you’re interested in becoming an ambassador, send us an email at [email protected] with the subject line “I want to be an ambassador!” Please tell us:
- A little about yourself and your professional background
- Any outreach work you’ve done before
- What kind of ambassador activities you’d be most interested in participating in
The program is structured to provide a lot of flexibility, so each ambassador can do outreach in ways that are personally motivating and enjoyable. There are a few guidelines:
- We ask for at least a one-year commitment
- We ask ambassadors to contribute at least four activities per year, about once every three months. There's no upper limit to how much you can do!
- For your first contribution, we ask for a blog post. This way, we can collaborate with you to help get all the technical details right. After that, it’s up to you!
Our official ambassador program is just starting, but our community already has a lot of folks making noise. Here are just a few contributions we admire. We think they’re pretty cool inspirations for future projects.
Shareable blogs are one of our most effective outreach strategies. They give visibility to the author and new ways to use DVC, so it's a win-win.
- Remote training with GitLab-CI and DVC, by Marcel Mikl and Bert Besser (Bert has also organized a DVC meetup in Berlin)
- Creating a solid Data Science development environment, by Gabriel dos Santos Goncalves
- Continuous Delivery for Machine Learning, by Danilo Sato, Arif Wider, and Christoph Windheuser
- Manage your Data Science Project in R was my first blog post about using DVC in an R project!
Community members have presented at events like PyCon, PyData, and local meetups.
- Version control for data science, by Alessia Marcolini @ PyCon DE & PyData Berlin
- How to easily set up and version control your machine learning pipelines, by Sarah Diot-Girard & Stephanie Bracaloni @ PyData Amsterdam
- ML models and dataset versioning, by Kurian Benoy @ PyCon India
Our GitHub repository has lots of open discussions about potential features, so it's a goldmine for ways to pitch in. For example:
Helge Munk Jacobsen took on an open issue in our code base about supporting hyperparameter tracking with DVC and made a pull request to add this feature.
Vera Sativa added directory support to the dvc import-url function, and she was our 100th contributor, so she won her own DeeVee the owl.
Vera (center, flashing a peace sign) thanked us with this lovely picture of DeeVee and her team, Odd Industries.