IBM launches annotation tool that leverages AI to label images


Data labeling is an arduous – if necessary – part of the AI model training process. Currently, it takes about 200 to 500 annotated image samples for a model to learn to detect a single object. Fortunately, there are tools available for free to automate the most monotonous subtasks, and IBM recently released a new one on GitHub. It’s part of the company’s Cloud Annotations project, which aims to develop simple and collaborative open-source image annotation tools for teams and individuals.

The new tool uses AI to help developers annotate data without having to manually draw labels on a full set of image data. Simply select the “Auto Label” button in the dashboard to automatically label uploaded sample images. The tool is backed by IBM Cloud Object Storage, which is optimized for data-intensive machine learning and cloud-native workloads.

Here’s how to access and use the new Cloud Annotations tool:

  • Upload and tag a subset of photos through the Cloud Annotations GUI.
  • Train a model by following the project’s instructions. The tool will use this model to label more photos.
  • Select “Auto Label” in the GUI.
  • Review the new labels.
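
The steps above follow a general pattern often called model-assisted labeling: a model trained on a small hand-labeled subset proposes labels for the remaining images, and a person reviews them. The following is a minimal Python sketch of that pattern, not the Cloud Annotations API. The names `Proposal`, `auto_label` and `toy_predict` are hypothetical, and `toy_predict` is only a stand-in for a trained model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Proposal:
    """A label suggested by the model for one image (hypothetical structure)."""
    image: str
    label: str
    confidence: float


def auto_label(
    unlabeled: List[str],
    predict: Callable[[str], Tuple[str, float]],
) -> List[Proposal]:
    """Propose a label for every unlabeled image.

    Proposals are returned with the least confident first, so a human
    reviewer sees the most doubtful suggestions at the top of the queue.
    """
    proposals = []
    for image in unlabeled:
        label, confidence = predict(image)
        proposals.append(Proposal(image, label, confidence))
    return sorted(proposals, key=lambda p: p.confidence)


def toy_predict(image: str) -> Tuple[str, float]:
    """Stand-in for a trained model (assumption): guesses from the file name."""
    if "cat" in image:
        return "cat", 0.9
    return "dog", 0.55


if __name__ == "__main__":
    for p in auto_label(["cat_01.jpg", "img_02.jpg"], toy_predict):
        print(f"{p.image}: {p.label} ({p.confidence:.2f})")
```

In a real workflow, `toy_predict` would be replaced by the model trained in the second step, and the reviewer would correct or accept each proposal before it is added to the training set.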

A number of companies offer tools that automatically label images for training machine learning models. In March 2019, Intel created the open-source Computer Vision Annotation Tool (CVAT), a data labeling toolkit that is deployed through Docker and accessed through a browser-based interface (or optionally built into platforms like Onepanel). About a year ago, Google released Fluid Annotation, which leverages AI to annotate class labels and describe every object and background region in an image.

It is estimated that the data annotation tools market could be worth $1.6 billion by 2025, and some companies are already taking advantage of it.

San Francisco-based Scale uses a combination of human data labelers and machine learning algorithms to sort raw and unlabeled feeds for clients including Lyft, General Motors, Zoox, Voyage, nuTonomy and Embark. Supervisely works on the same model: a combination of deep learning and mass collaboration models. Sweden-based Mapillary creates a database of street-level images and uses computer vision technology to analyze the data in those images. And Austin, Texas-based Alegion, which raised $12 million in venture capital in August 2019, provides a range of labeling and annotation services for enterprise data science teams.

Companies like DefinedCrowd take a different approach. The three-year-old Seattle-based startup, which describes itself as a ‘smart’ data curation platform, offers a bespoke training service for customers in customer service, automotive, retail, healthcare and business.
