Interactive OCR with Tesseract

Label Studio can be used to interactively work with OCR (Optical Character Recognition) models like Tesseract.

Create Interactive Model

You can use Label Studio ML Backend to start an interactive OCR model:

  1. Clone the repository: git clone https://github.com/HumanSignal/label-studio-ml-backend.git
  2. Go to label_studio_ml/examples/tesseract
  3. Run docker-compose up

This starts the ML backend server, which listens on http://localhost:9090.
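Before moving on, it can help to confirm that the server is reachable. The following is a minimal sketch, assuming the ML backend exposes a /health endpoint that returns JSON containing "status": "UP"; the endpoint, the response shape, and the helper name backend_is_up are assumptions for illustration, so adjust them to match your backend version.

```python
import json
import urllib.error
import urllib.request


def backend_is_up(base_url="http://localhost:9090", timeout=3.0):
    """Return True if the ML backend answers its health check.

    Assumes a GET /health endpoint that returns JSON such as
    {"status": "UP"}; this is an assumption, not a documented guarantee.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            payload = json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return False
    return isinstance(payload, dict) and payload.get("status") == "UP"


if __name__ == "__main__":
    print("ML backend reachable:", backend_is_up())
```

If the check returns False, verify that the docker-compose containers are still running and that port 9090 is not used by another process.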

Connect to Label Studio

Let’s connect to the running Label Studio instance. You need an API key, which can be found in the Account & Settings -> API Key section.

from label_studio_sdk.client import LabelStudio

ls = LabelStudio(
    base_url='http://localhost:8080',
    api_key='YOUR-API-KEY'
)
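A hard-coded key is fine for a quick trial, but you may prefer to keep it out of your source files. Here is a minimal sketch that reads the connection settings from the environment instead. The variable names LABEL_STUDIO_URL and LABEL_STUDIO_API_KEY and the helpers load_connection_settings and make_client are assumptions made for this example.

```python
import os


def load_connection_settings(env=None):
    """Return (base_url, api_key) read from an environment-style mapping.

    The variable names used here are illustrative assumptions.
    """
    env = os.environ if env is None else env
    base_url = env.get("LABEL_STUDIO_URL", "http://localhost:8080").rstrip("/")
    api_key = env.get("LABEL_STUDIO_API_KEY", "")
    if not api_key:
        raise RuntimeError(
            "Set LABEL_STUDIO_API_KEY to the key from Account & Settings"
        )
    return base_url, api_key


def make_client(env=None):
    """Build a LabelStudio client from the settings above."""
    # Imported lazily so the settings helper works without the SDK installed.
    from label_studio_sdk.client import LabelStudio

    base_url, api_key = load_connection_settings(env)
    return LabelStudio(base_url=base_url, api_key=api_key)
```

With both variables exported in your shell, ls = make_client() gives the same kind of client object as the snippet above.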

Create a project

To create a project, you need to specify a label_config, which defines the labeling interface and the label ontology.

project = ls.projects.create(
    title='Live OCR',
    description='A project to demonstrate live OCR with connected Tesseract model',
    label_config='''
    <View>
      <Image name="image" value="$ocr"/>

      <Labels name="label" toName="image">
        <Label value="Text" background="green"/>
        <Label value="Handwriting" background="blue"/>
      </Labels>

      <Rectangle name="bbox" toName="image" strokeWidth="3"/>

      <TextArea name="transcription" toName="image"
                editable="true"
                perRegion="true"
                required="true"
                maxSubmissions="1"
                rows="5"
                placeholder="Recognized Text"
                displayMode="region-list"
      />
    </View>'''
)
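Because the labeling config is plain XML, it can be inspected locally before a project is created. The sketch below uses only the Python standard library. It checks that every toName points at a declared name and lists the task data keys referenced through $-prefixed values. The helper inspect_label_config is hypothetical and is not a substitute for Label Studio's own validation.

```python
import xml.etree.ElementTree as ET

# The same labeling config as above, kept in a constant for inspection.
LABEL_CONFIG = '''
<View>
  <Image name="image" value="$ocr"/>
  <Labels name="label" toName="image">
    <Label value="Text" background="green"/>
    <Label value="Handwriting" background="blue"/>
  </Labels>
  <Rectangle name="bbox" toName="image" strokeWidth="3"/>
  <TextArea name="transcription" toName="image" editable="true"
            perRegion="true" required="true" maxSubmissions="1" rows="5"
            placeholder="Recognized Text" displayMode="region-list"/>
</View>
'''


def inspect_label_config(config_xml):
    """Summarize tag names, unresolved toName targets, and task data keys."""
    root = ET.fromstring(config_xml.strip())
    names, targets, data_keys = set(), set(), set()
    for element in root.iter():
        if "name" in element.attrib:
            names.add(element.attrib["name"])
        if "toName" in element.attrib:
            targets.add(element.attrib["toName"])
        value = element.attrib.get("value", "")
        if value.startswith("$"):
            # "$ocr" means the tag reads the "ocr" field of the task data.
            data_keys.add(value[1:])
    return {
        "names": sorted(names),
        "dangling": sorted(targets - names),
        "data_keys": sorted(data_keys),
    }


report = inspect_label_config(LABEL_CONFIG)
print(report)
```

For this config, the dangling list is empty and the only data key is ocr, so each imported task is expected to carry its image reference under the ocr field.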

Connect OCR Model to Project

To connect your running OCR model to the project, you need to specify the model URL and the project ID:

ls.ml.create(
    title='Tesseract OCR',
    description='A model to perform OCR using Tesseract',
    url='http://localhost:9090',
    project=project.id,
    is_interactive=True
)