label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format

Label Studio :sunny:

Website · Docs · Twitter · Join Slack Community

Label Studio is a Swiss Army knife of data labeling and annotation tools :v:

Try it now in a running app and check out the introductory post.

Its purpose is to help you label different types of data using a simple interface with a standardized output format. Are you working with a custom dataset and thinking about building your own tool? Don't: with Label Studio, you can save time and create a custom labeling tool and interface in minutes.

Quick Start

# Requires Python >= 3.5
pip install label-studio

# Initialize the project in labeling_project path
label-studio init labeling_project

# Start the server at http://localhost:8200
label-studio start labeling_project

Install on Windows

You don't need to install the Visual Studio compiler. Instead, download a prebuilt "regex" wheel (or a wheel for any other package that would otherwise need compiling) from the Gohlke builds that matches your Python version:
https://www.lfd.uci.edu/~gohlke/pythonlibs/#regex

Then run:

# Upgrade pip 
pip install -U pip

# Install regex
pip install <path-to-downloaded-package>.whl

# Install label studio
pip install label-studio

Local development

To run the latest Label Studio version locally without installing the package from pip:

# Install all package dependencies
pip install -e .
# Start the server at http://localhost:8200
python label_studio/server.py start labeling_project --init

Run docker

You can also start serving at http://localhost:8200 by using docker:

docker run --rm -p 8200:8200 heartexlabs/label-studio:latest

If you want to build a local image, run:

docker build -t heartexlabs/label-studio:latest .

Features :star2:

  • Simple: Crafted with minimal UI design. A simple design is the best design.
  • Configurable: Using a high-level config built from JSX-style tags, you can fully customize the visual interface for your data. It feels like building a custom labeling tool for your specific needs, and it's fast to do.
  • Collaborative Annotations: Have two or more people label the same task and compare the results.
  • Multiple Data Types: Label images, audio, text, HTML, and pairwise comparisons with labeling scenarios that you define yourself.
  • Import Formats: JSON, CSV, TSV, and RAR and ZIP archives.
  • Mobile-Friendly: Works on devices of different sizes.
  • Embeddable: It's an NPM package too. You can include it in your projects.
  • Machine Learning: Integration support for machine learning. Visualize and compare predictions from different models. Use the best ones for pre-labeling.
  • Stylable: Configure the visual appearance to match your company brand and distribute the labeling tasks as part of your product.
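
To make the "Configurable" point more concrete, here is a minimal sketch of what a tag-based labeling config for image classification might look like, wrapped in a few lines of Python that check it is well-formed. The tag names (View, Image, Choices, Choice) follow Label Studio's labeling config tags as we understand them; the `image_url` task key is borrowed from the prediction example later in this README, and the label values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical labeling config for image classification.
# "$image_url" refers to the "image_url" key of each task; the label
# values "Cat" and "Dog" are invented for this example.
LABEL_CONFIG = """
<View>
  <Image name="image" value="$image_url"/>
  <Choices name="label" toName="image">
    <Choice value="Cat"/>
    <Choice value="Dog"/>
  </Choices>
</View>
"""


def summarize_config(config_xml):
    """Parse the config and return (tag names in document order, choice values)."""
    root = ET.fromstring(config_xml.strip())
    tags = [element.tag for element in root.iter()]
    choices = [choice.get("value") for choice in root.iter("Choice")]
    return tags, choices


tags, choices = summarize_config(LABEL_CONFIG)
print(tags)
print(choices)
```

Parsing the config before handing it to the server is only a convenience for catching unbalanced tags early; it does not validate the config against Label Studio itself.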

Use Cases

The following table lists supported use cases for data annotation. Please contribute your own configs and feel free to extend the base types to support more scenarios. Note that this list is not exhaustive and covers only the major scenarios.

| Task | Description |
|-|-|
| Image | |
| Classification | Put images into categories |
| Object Detection | Detect objects in an image using bounding boxes or polygons |
| Semantic Segmentation | Assign each pixel the category of the object it belongs to |
| Pose Estimation | Mark positions of a person’s joints |
| Text | |
| Classification | Put texts into categories |
| Summarization | Create a summary that represents the most relevant information within the original content |
| HTML Tagging | Annotate things like resumes, research, legal papers, and Excel sheets converted to HTML |
| Audio | |
| Classification | Put audio clips into categories |
| Speaker Diarisation | Partition an input audio stream into homogeneous segments according to the speaker identity |
| Emotion Recognition | Tag and identify emotion from the audio |
| Transcription | Write down verbal communication in text |
| Comparison | |
| Pairwise | Compare entities in pairs to judge which of each pair is preferred |
| Ranking | Sort items in a list according to some property |

Machine Learning Integration

You can connect your favorite machine learning framework to Label Studio by using the Heartex SDK.

This lets you use:

  • Pre-labeling: Use model predictions for pre-labeling
  • Online Learning: Update (retrain) your model as new annotations come in
  • Active Learning: Perform labeling in active learning mode
  • Prediction Service: Instantly create a running, production-ready prediction service
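
Active learning mode is not described further here. As a rough illustration of the general idea only, one common strategy is least-confidence sampling: label first the tasks the model is least sure about. The sketch below is a generic example, not Label Studio's or the Heartex SDK's implementation; the function name, task ids, and probabilities are all made up.

```python
def least_confident_first(predictions):
    """Order task ids so the model's least confident predictions come first.

    `predictions` maps a task id to a list of class probabilities.
    """
    return sorted(predictions, key=lambda task_id: max(predictions[task_id]))


# Made-up class probabilities for three unlabeled tasks.
predictions = {
    "task-1": [0.95, 0.05],
    "task-2": [0.55, 0.45],
    "task-3": [0.70, 0.30],
}
print(least_confident_first(predictions))  # ['task-2', 'task-3', 'task-1']
```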

Here is a quick example of how to do that with simple image classification:

  1. Create a new project
    label-studio init --template=image_classification imgcls
    
  2. Clone pyheartex, and start serving:
    git clone https://github.com/heartexlabs/pyheartex.git
    cd pyheartex/examples/docker
    docker-compose up -d
    
  3. Specify the running server URL in imgcls/config.json:
    "ml_backend": {
      "url": "http://localhost:9090",
      "model_name": "my_image_classifier"
    }
    
  4. Launch Label Studio server:
    label-studio start imgcls
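
If you would rather script step 3 than edit the file by hand, a minimal sketch could look like the following. It assumes imgcls/config.json is a plain JSON object, as the fragment in step 3 suggests; the helper name `set_ml_backend` is hypothetical and not part of Label Studio.

```python
import json
from pathlib import Path


def set_ml_backend(config_path, url, model_name):
    """Add or replace the "ml_backend" entry in a project config file.

    All other keys in the file are left untouched.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["ml_backend"] = {"url": url, "model_name": model_name}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For the project above, the call would be `set_ml_backend("imgcls/config.json", "http://localhost:9090", "my_image_classifier")`, run after step 1 has created the project directory.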
    

Once you're satisfied with pre-labeling results, you can immediately send prediction requests via REST API:

curl -X POST -H 'Content-Type: application/json' -d '{"image_url": "https://go.heartex.net/static/samples/sample.jpg"}' http://localhost:8200/predict
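
The same request can be sent from Python using only the standard library. This is a sketch that mirrors the curl command above: the endpoint, header, and `image_url` payload key come from that command, while the function names are hypothetical. The response format is not documented here, so the parsed JSON is returned as-is.

```python
import json
import urllib.request


def build_predict_request(image_url, base_url="http://localhost:8200"):
    """Build (but do not send) the POST request for the /predict endpoint."""
    payload = json.dumps({"image_url": image_url}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(image_url, base_url="http://localhost:8200", timeout=10):
    """Send the request to a running server and return the decoded JSON response."""
    request = build_predict_request(image_url, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the Label Studio server from step 4 running, `predict("https://go.heartex.net/static/samples/sample.jpg")` sends the same payload as the curl example.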

Feel free to play around with other models and frameworks besides image classifiers! (see instructions here)

Label Studio for Teams, Startups, and Enterprises :office:

Label Studio for Teams is our enterprise edition (cloud & on-prem) that includes a data manager, high-quality baseline models, active learning, collaborator support, and more. Please visit the website to learn more.

Ecosystem

| Project | Description |
|-|-|
| label-studio | Server part, distributed as a pip package |
| label-studio-frontend | Frontend part, written in JavaScript and React, can be embedded into your application |
| label-studio-converter | Encode labels into the format of your favorite machine learning library |
| label-studio-transformers | Transformers library connected and configured for use with label studio |

License

This software is licensed under the Apache 2.0 LICENSE © Heartex. 2020

Overview

Name With Owner: HumanSignal/label-studio
Primary Language: JavaScript
Program language: Python (Language Count: 10)
License: Apache License 2.0
Release Count: 72
Last Release Name: 1.12.0.post0
First Release Name: v.0.1.0
Created At: 2019-06-19 02:00:44
Pushed At: 2024-05-18 04:34:03
Last Commit At: 2023-01-11 02:01:08
Stargazers Count: 16.8k
Watchers Count: 170
Fork Count: 2.1k
Commits Count: 3.4k
Issues Count: 2058
Issue Open Count: 719
Pull Requests Count: 2841
Pull Requests Open Count: 132
Pull Requests Close Count: 719