Label Studio is a Swiss Army knife of data labeling and annotation tools.
Try it now in a running app and check out the introductory post.
Its purpose is to help you label different types of data using a simple interface with a standardized output format. Working with a custom dataset and thinking about building your own tool? Don't. With Label Studio, you can save time and create a custom tool and interface in minutes.
# Requires Python >= 3.5
pip install label-studio
# Initialize the project in the labeling_project directory
label-studio init labeling_project
# Start the server at http://localhost:8080
label-studio start labeling_project
To run on Windows, first manually download the required wheel packages from Gohlke builds, making sure they match your Python version (the example below uses lxml).
Then install Label Studio:
# Upgrade pip
pip install -U pip
# Assuming you are running Win64 with Python 3.8, install the packages downloaded from Gohlke:
pip install lxml-4.5.0-cp38-cp38-win_amd64.whl
# Install Label Studio
pip install label-studio
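The wheel filename above encodes the interpreter version (cp38) and the platform (win_amd64). As a rough aid for picking the right file, the sketch below prints the tags for the interpreter that runs it. The function name `wheel_tags` is hypothetical and not part of Label Studio, and the platform logic assumes an x86 Windows machine.

```python
import struct
import sys


def wheel_tags():
    """Return (python_tag, platform_tag) for the running interpreter.

    Assumption: an x86 Windows machine, so the platform tag is either
    win_amd64 (64-bit Python) or win32 (32-bit Python).
    """
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    # Pointer size in bits tells us whether this Python build is 32- or 64-bit.
    bits = struct.calcsize("P") * 8
    platform_tag = "win_amd64" if bits == 64 else "win32"
    return python_tag, platform_tag


if __name__ == "__main__":
    py_tag, plat_tag = wheel_tags()
    # Look for a wheel whose name contains both tags.
    print("{0}-{0}-{1}".format(py_tag, plat_tag))
```

Run it with the same `python` you will use for `pip install`, then download the wheel whose filename ends with the printed tags.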
To install in an Anaconda environment:
conda create --name label-studio python=3.8
conda activate label-studio
pip install label-studio
If you see any errors during installation, try rerunning it with:
pip install --ignore-installed label-studio
To run the latest Label Studio version locally without installing the package from pip, run the following from the root of a source checkout:
# Install all package dependencies
pip install -e .
# Start the server at http://localhost:8080
python label_studio/server.py start labeling_project --init
You can also start the server at http://localhost:8080 using Docker:
docker run --rm -p 8080:8080 -v `pwd`/my_project:/label-studio/my_project --name label-studio heartexlabs/label-studio:latest label-studio start my_project --init
By default, this starts a blank project in the ./my_project directory.
Note: if the ./my_project folder already exists, an exception is thrown. Delete the folder or use the --force option.
You can override the default startup command by appending your own to the docker run command:
docker run -p 8080:8080 -v `pwd`/my_project:/label-studio/my_project --name label-studio heartexlabs/label-studio:latest label-studio start my_project --init --force --template text_classification
If you want to build a local image, run:
docker build -t heartexlabs/label-studio:latest .
You can also start the server at http://localhost:8080 using docker-compose.
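# First run: initialize a new project
INIT_COMMAND='--init' docker-compose up -d
# Later runs: reuse the existing project data
docker-compose up -d
# Reset the project data
INIT_COMMAND='--init --force' docker-compose up -d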
Alternatively, instead of passing INIT_COMMAND='...' on the command line, add this line to a .env file:
INIT_COMMAND=--init --force
The following table lists the supported use cases for data annotation. Please contribute your own configs and feel free to extend the base types to support more scenarios. Note that this list is not exhaustive and covers only the major scenarios.
Task | Description |
---|---|
Image | |
Classification | Put images into categories |
Object Detection | Detect objects in an image using a bounding box or polygons |
Semantic Segmentation | Assign each pixel the object category it belongs to |
Pose Estimation | Mark positions of a person’s joints |
Text | |
Classification | Put texts into categories |
Summarization | Create a summary that represents the most relevant information within the original content |
HTML Tagging | Annotate things like resumes, research, legal papers, and Excel sheets converted to HTML |
Audio | |
Classification | Put audio clips into categories |
Speaker Diarisation | Partition an input audio stream into homogeneous segments according to speaker identity |
Emotion Recognition | Tag and identify emotion in the audio |
Transcription | Write down verbal communication in text |
Video | |
Classification | Put videos into categories |
Comparison | |
Pairwise | Compare entities in pairs to judge which one is preferred |
Ranking | Sort items in the list according to some property |
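To give a feel for what a config for one of these use cases might look like, here is a minimal sketch for image classification, wrapped in a small Python check. The tag names (`View`, `Image`, `Choices`, `Choice`) follow Label Studio's XML-style labeling configs as I understand them; the labels "Cat" and "Dog" are placeholders, and `check_config` is a hypothetical helper, not part of Label Studio.

```python
import xml.etree.ElementTree as ET

# Illustrative image-classification config; the labels are placeholders.
CONFIG = """
<View>
  <Image name="image" value="$image"/>
  <Choices name="label" toName="image">
    <Choice value="Cat"/>
    <Choice value="Dog"/>
  </Choices>
</View>
"""


def check_config(xml_text):
    """Parse a config and return its choice values.

    Raises ValueError if any toName attribute points at a tag name
    that is not defined anywhere in the config.
    """
    root = ET.fromstring(xml_text)
    names = {el.get("name") for el in root.iter() if el.get("name")}
    for el in root.iter():
        target = el.get("toName")
        if target is not None and target not in names:
            raise ValueError("toName={!r} does not match any tag name".format(target))
    return [choice.get("value") for choice in root.iter("Choice")]


if __name__ == "__main__":
    print(check_config(CONFIG))
```

The check only confirms that the XML is well formed and that control tags point at an existing object tag; it does not validate the config against Label Studio itself.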
You can easily connect your favorite machine learning framework using the Label Studio Machine Learning SDK. It takes two simple steps:
This lets you use:
Label Studio for Teams is our enterprise edition (cloud and on-prem) that includes a data manager, high-quality baseline models, active learning, collaborator support, and more. Please visit the website to learn more.
Project | Description |
---|---|
label-studio | Server part, distributed as a pip package |
label-studio-frontend | Frontend part, written in JavaScript and React, can be embedded into your application |
label-studio-converter | Encode labels into the format of your favorite machine learning library |
label-studio-transformers | Transformers library connected and configured for use with Label Studio |
@misc{LabelStudio,
  title={{Label Studio}: Data labeling software},
  url={https://github.com/heartexlabs/label-studio},
  note={Open source software available from https://github.com/heartexlabs/label-studio},
  author={
    Maxim Tkachenko and
    Mikhail Malyuk and
    Nikita Shevchenko and
    Andrey Holmanyuk and
    Nikolai Liubimov},
  year={2020},
}
This software is licensed under the Apache 2.0 license. © Heartex, 2020.