This month you will find:
As usual we have a ton of goodness from the Community! Let's jump in!
Antoine Toubhans of Sicara wrote a fantastic and detailed tutorial entitled "How to Build Customizable Web UI for Machine Learning with Streamlit and DVC," which integrates DVC with Streamlit to provide a customizable UI. The tutorial goes through the steps of setting up a pipeline, splitting a dataset, training and evaluating a model, tracking changes to data and model, using DVC metrics and plots, and then bridging the gap in visualizations using Streamlit. You won't want to miss this one!
DVC + Streamlit = ♥️! Source link
For our friends who speak Japanese, these slides created by Yusuke Shibui walk you through taking a machine learning project to production using DVC and CML. We love seeing our tools being used all around the world! 🌏
DVC and CML in Japanese! Source link
Miguel Méndez and his team at Gradiant struggled with reproducibility before using DVC for versioning their image dataset and annotations. The dataset and annotations are held in a shared storage space and used by the whole team. DVC enables the team to track changes and know which versions of the dataset produce the best results. His tutorial walks you through the steps to set it up!
We have been seeing an uptick in the number of jobs requiring knowledge of DVC. It's exciting to see that our tools are helping these companies in their MLOps workflows! 🎉
With all those DVC job opportunities out there, you better get on it! 😉
This week we kicked off our new DVC Learn Meetup series with Milecia McGregor. This set of three short, half-hour classes is designed to get you up and running with DVC. If you are just getting started with DVC or kicking the tires, this Meetup series is for you! Our next class on August 4th will get you started with experiments.
If you are interested in weighing in on what kinds of educational content you would like to see from us, we'd be grateful if you'd fill out this survey to help us plan! 🙏🏼
New research presented in the Data Science Journal aims to provide best practices for reproducibility in research datasets, which requires being able to pinpoint the exact version of the dataset that grounds any piece of research. In this work the authors reviewed 39 use cases from 33 organizations to arrive at six principles for versioning datasets: Revision, Release, Granularity, Manifestation, Provenance and Citation. See the full work below. 👇🏼
The June Office Hours Meetup was 🔥! Amazing discussion on experiments ignited by Sami Jawhar of Kernel around experiment use cases and workflows.
You can find the repo for his presentation here and watch all the great DVC discussion below.
Summer and vaccinations mean travel! ☀️💉 And that travel has enabled some of our team members to get together! Pictured below are Dmitry Petrov, Alexander Guschin, Max Shmakov, Mikhail Rozhkov, Sergey Kryukov, Mikhail Sveshnikov, and Guro Bokum… But not necessarily in that order.
The first person to guess the correct order of our teammates starting from the upper right of the picture moving clockwise, and post in the corresponding Twitter Heartbeat post, will win some DVC SWAG! Hint: If you've been wondering why there are random purple letters in this blog post, they're a clue to this cipher. 🧐
David de la Iglesia Castro is the third teammate joining us from Spain! 🇪🇸 And also the third David! He hails from Galicia and has been an active member of our Community for over two years. We are so excited to have him join the team as a software engineer, where he will work to improve DVC Live. When he's not contributing to DVC, David likes to go climbing, surfing or just hiking whenever he can! Welcome David!
And yes indeed, we are still hiring! Use this link to find details of all the positions including:
Please pass this info on to anyone you know that may fit the bill. We look forward to new team members! 🎉
Don't miss our Meetup on July 28th at 2:00 pm UTC (7:00 am PDT), where João Santiago of Billie will present "DVThis," a set of utility functions for DVC pipelines using R scripts. Additionally, the project aims to document the usual workflows of a DVC pipeline using these scripts and to create templates for using DVC and R together.
Following Santiago, team member Tapa Dipti Sitaula will give a demo of DVC Studio! Bring your questions; we look forward to seeing you!
Do you have any use case questions or need support? Join us in Discord!
Head to the DVC Forum to discuss your ideas and best practices.