DVC-Cloud: Query Management for Unstructured Data

Built for Computer Vision and Natural Language Processing

DVC-Cloud extends current DVC data set versioning so you can query, share, and version unstructured data like images and audio stored in clouds like AWS S3, Azure Blob Storage, and Google Cloud Storage.

Finding and creating the right data set for an ML model can take anywhere from days to weeks of tedious manual work. Reduce data preparation time by 90% with DVC-Cloud, which automatically indexes all your unstructured data and associated meta information like annotations.

  • Better search and collaboration

    Find and share the right data quickly, based on custom classifiers from automated, programmatic labeling for each data object

  • Lightning-fast data set creation

    Manage data regardless of location or cloud, using Iterative's custom DQL for unstructured data

  • Centralized visibility

    See all usage, lineage, versioning, and meta information around data sets in a single place

DQL: SQL for unstructured data

Query, organize, cleanse, and instantiate data objects with Iterative's custom data query language (DQL), built for unstructured data and machine learning use cases.

The Studio dashboard open to a file named untitled.dql, highlighting an example of DQL.

Use a first-of-its-kind query language to search and manage annotations and meta information. Manipulate your ML data quickly and efficiently to get the correct data for improving your models.

Flexible data set creation

Create data sets based on specific annotations instantly. Programmatically label and create signals for your data sets. Version these data sets as you develop your model, using data sources from any cloud location.

The Studio dashboard open to a file named clip_index.dql, highlighting the sources UI.

Use auxiliary features (e.g., file attributes, JSON files, helper ML models, and more), indexed by DVC-Cloud, to create the right data set for your models. Do this across any cloud or data source.
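DQL syntax isn't shown on this page, so here is a plain-Python sketch of the underlying idea rather than DQL itself: filter an index of per-object meta information to build a data set. The `ObjectRecord` type, its field names, the `select_dataset` helper, and the sample URIs are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectRecord:
    """One indexed data object. All field names are hypothetical."""
    uri: str
    size_bytes: int
    annotations: dict = field(default_factory=dict)


def select_dataset(index, label, min_confidence):
    """Return URIs of objects annotated with `label` at or above `min_confidence`."""
    return [
        rec.uri
        for rec in index
        if rec.annotations.get("label") == label
        and rec.annotations.get("confidence", 0.0) >= min_confidence
    ]


# Made-up index entries spanning several storage locations.
index = [
    ObjectRecord("s3://bucket-a/img/001.jpg", 120_000, {"label": "cat", "confidence": 0.97}),
    ObjectRecord("gs://bucket-b/img/002.jpg", 98_000, {"label": "dog", "confidence": 0.91}),
    ObjectRecord("s3://bucket-a/img/003.jpg", 143_000, {"label": "cat", "confidence": 0.42}),
    ObjectRecord("az://container-c/img/004.jpg", 110_000, {"label": "cat", "confidence": 0.88}),
]

cats = select_dataset(index, label="cat", min_confidence=0.8)
print(cats)
```

In this sketch the selection depends only on the indexed meta information, not on where each object is stored, which is the property the text above describes.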

Search across all data sets with an unstructured data catalog

Find the data set you're looking for across any cloud, with full details around lineage and use. Then use meta information to create custom data sets that can be shared and used across your team.

The Studio dashboard open to a file named open_image_select.dql.

Quickly search across any cloud and see who last used a data set, where it's stored, how it's used, and more. Eliminate the need for custom scripts and long waits asking team members how a data set was changed. All in a single place.

Start managing your unstructured data now

Reach out to our experts!