October '22 Heartbeat

This month you will find:

🎙 Andrew Ng Intel Keynote talk,

🇺🇸 White House Blueprint for AI Bill of Rights,

🧐 CML in research,

🎥 Nadia Nahar video: Collaboration Challenges in ML-Enabled Systems,

🐉 DVC-Hydra integration,

🗣 CI/CD for Machine Learning upcoming webinar,

🚀 New hire, and more!

By Jeny De Figueiredo and Rob de Wit
October 20, 2022 · 6 min read

Welcome to October! As the days grow shorter or longer depending on your hemisphere, we bring you the latest and greatest from the Iterative Community.

In AI News

Andrew Ng at Intel's Innovation Conference - Democratizing AI through Data-Centric AI


At Intel’s Innovation conference, Andrew Ng gave a keynote on democratizing AI. He posits that while large companies have embraced AI, most smaller companies outside of the consumer-based domains still struggle. He provides two main reasons for this: small datasets and customization.

According to Ng, data-centric AI will be the key to unlocking that potential, forcing a paradigm shift away from code-centric AI. In this scenario, people could take mostly ready-built ML tech and focus on the data to ensure it captures all necessary domain knowledge.

For example, two companies that produce cornflakes and medication could take the same ML model and train it on their respective datasets. As long as they have the right tools and practices and provide a domain-representative dataset, the same model can produce effective results for both. If you want to see some of the tools Ng uses, make sure to check out his keynote.

What do you think? Does the average data scientist need a different set of skills in the near future? Are you in one of these smaller industries that are starting to embrace AI? We'd love to read your thoughts! Join us in our discussion of this topic on Discord!

Blueprint for an AI Bill of Rights

If you recall, last month's Heartbeat called your attention to the EU AI Act. That act proposes new rules that would require open source developers to adhere to guidelines across a spectrum of categories, including risk management, data governance, technical documentation and transparency, standards and accuracy, and cybersecurity. Not to be outdone, the US White House released a Blueprint for an AI Bill of Rights. The White House Office of Science and Technology Policy (OSTP) has defined five categories for these rights:

  1. Safe and Effective Systems
  2. Algorithmic Discrimination Protections
  3. Data Privacy
  4. Notice and Explanation
  5. Human Alternatives, Consideration, and Fallback

There's definitely some overlap here with the EU AI Act and some catching up with Data Privacy in the mix. There's lots to unpack, compare, and contrast on scope and philosophy between the two. It's nice to see that major attention is given to these issues.

We can read the AI Bill of Rights and Andrew Ng's talk together as signs that the AI space is maturing. To Ng's point, as we move from frenzied, all-important model development toward a data-centric approach and the democratization it brings, the focus shifts in a way that lets us adequately address these hard and important issues. More efficient tooling will help with this too. That's why we are here.

What do you think? Do the efficiencies we are gaining free up time and attention to bake protections into the process, or am I too hopeful? Head to Discord and share your thoughts!

Company News

Image: AI-generated rainbow-feathered dragon (DeeVee + Hydra)

DVC-Hydra Integration

Did you hear? DVC has a new integration with Hydra. Now you can use Hydra composition to configure your DVC experiments. You can also append and remove parameters on the fly, as well as run a grid search over parameters. Random search functionality is coming; weigh in on the issue here. Find out more in David de la Iglesia's blog post.
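
To make the grid search idea concrete, here is a minimal Python sketch of how a set of parameter overrides, each with several candidate values, expands into one experiment per combination. It is an illustration only, not DVC's or Hydra's implementation. The helper name `expand_grid`, the parameter names `train.lr` and `train.epochs`, and the comma-separated value format are all assumptions made for this example.

```python
from itertools import product


def expand_grid(overrides):
    """Expand overrides like {"train.lr": "0.1,0.01"} into one dict per combination.

    Hypothetical helper for illustration; not part of DVC or Hydra.
    """
    keys = list(overrides)
    # Split each comma-separated value list into its individual choices.
    choices = [[v.strip() for v in overrides[k].split(",")] for k in keys]
    # The Cartesian product yields every combination of choices.
    return [dict(zip(keys, combo)) for combo in product(*choices)]


if __name__ == "__main__":
    # Hypothetical parameters: two learning rates x two epoch counts.
    grid = expand_grid({"train.lr": "0.1,0.01", "train.epochs": "10,20"})
    for params in grid:
        print(params)
```

Two parameters with two candidate values each give four combinations, which is why grid searches grow quickly as parameters are added and why a random search can be an attractive alternative.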

October Meetup

If you missed the October Meetup, where Nadia Nahar presented her team's research on Collaboration Challenges in Building ML-Enabled Systems: Communication, Documentation, Engineering, and Process, don't worry: there's a video! Catch it below!


November Meetup

Join us for our next meetup on November 16th. Dmytro Filatov of DeepX will present Continuous Computer Vision with DVC and CML, and Jelle Bouwman will demo the Iterative Studio Model Registry. Be sure to register here!

Continuous Computer Vision with DVC and CML plus Iterative Studio Model Registry Demo

Join us on November 16th. Come see the possibilities with DVC, CML, and Iterative Studio Model Registry!

Alex Kim - CI/CD for Machine Learning Webinar with ODSC

Join Alex Kim on November 30th with ODSC to learn about CI/CD for Machine Learning. This webinar introduces CML, a project that helps ML and data science practitioners automate model training and evaluation using best practices and tools from software engineering, such as GitLab CI/CD (as well as GitHub Actions and Bitbucket Pipelines). The idea is to automatically train your model and test it in a production-like environment every time your data or code changes. In this talk, you'll learn how to:

  • Automatically allocate cloud instances (AWS, Azure, GCP) to train ML models, and automatically shut them down when training is over
  • Automatically generate reports with graphs and tables in pull/merge requests to summarize your model's performance, using any visualization library
  • Transfer data between cloud storage and computing instances with DVC
  • Customize your automation workflow with GitLab CI/CD

Sign up for the talk here.


It's Hacktoberfest!

It's Hacktoberfest month and we are participating! Find all the information in Mert Bozkir's blog post. If you just want to jump in, find all the open Hacktoberfest issues here. Follow along in the #hacktoberfest channel in Discord to stay up to date for the rest of the month, and be sure to read next month's Heartbeat to learn about the contributions!

New Hires

Ivan Longin joins us as a Senior Software Engineer on the Iterative Studio team from Zadar, Croatia. When Ivan's not working, he likes to spend time on outdoor activities, swimming in good weather, or just walking (and often running) after his one-year-old! Been there three times over! ❤️ Welcome, Ivan!

From the Community

This month was full of great content. We wanted to give a shout-out to all of it, so we are trying out a more abbreviated list.
Thanks to all these amazing Community members that are sharing their knowledge! 🚀

DVC

Data management

Data Pipelines

Experimentation

Other mentions

CML

MLEM

❤️ Tweet Love

I had a really hard time choosing this month, but I was excited to see this Tweet from Nick Sorros announcing the post from his colleague Matt Upson.


Have something great to say about our tools? We'd love to hear it! Head to this page to record or write a Testimonial! Join our Wall of Love ❤️

Do you have any use case questions or need support? Join us in Discord!

Head to the DVC Forum to discuss your ideas and best practices.
