NLP & DOCUMENT AI

Text & Document
Annotation Services

Named entity recognition, relation extraction, document parsing, text classification, and OCR correction — with domain expertise across medical, legal, financial, and technical documents in 30+ languages. Every annotation meets measurable F1 and IAA benchmarks.

2M+
Documents Processed
F1 ≥ 0.93
Entity-Level Accuracy
30+
Languages Supported
κ ≥ 0.85
Inter-Annotator Agreement
15+
Document Types
L1→L2→L3
QA Pipeline
Capabilities

Six Core NLP Annotation Services

Each capability includes specific accuracy benchmarks, throughput rates, and output format options. All configurable to your schema and training pipeline.

Capability 1

Named Entity Recognition (NER)

Custom entity extraction with nested, overlapping, and discontinuous span support

Precision entity extraction across unlimited custom types — persons, organizations, medical terms, financial instruments, legal clauses, product attributes, and domain-specific terminology. Support for nested entities, overlapping spans, discontinuous entities, and cross-sentence coreference chains.

TECHNICAL DETAILS
  • Custom entity schemas: unlimited types with hierarchical parent-child relationships
  • Overlapping spans: same text span tagged with multiple entity types
  • Cross-sentence coreference: “Dr. Smith… she… the physician” linked across paragraphs
  • Nested entities: “Bank of America” (ORG) with the nested LOC “America”
  • Discontinuous entities: “left and right ventricle” → two linked annotations
  • Domain taxonomies: ICD-10, SNOMED CT, MeSH, NAICS, legal citation standards
PERFORMANCE
F1
Entity-level F1 ≥ 0.93 (exact span match)
Agreement
Cohen’s κ ≥ 0.85
Throughput
500–2K entities/annotator/day
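
The F1 benchmark above uses exact span match: an entity is credited only when its start offset, end offset, and type all agree with gold. A minimal sketch of that scoring, with illustrative entity tuples:

```python
def entity_f1(gold, pred):
    """Entity-level F1 with exact span match: an entity counts as correct
    only if its (start, end, type) triple matches the gold set exactly."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = [(0, 9, "PERSON"), (24, 31, "ORG"), (45, 52, "DRUG")]
pred = [(0, 9, "PERSON"), (24, 31, "ORG"), (45, 50, "DRUG")]  # partial span: no credit
print(round(entity_f1(gold, pred), 2))  # 0.67
```

Running the same comparison per entity type (rather than pooled) is how per-type benchmarks are typically enforced.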
Capability 2

Relation Extraction & Knowledge Graphs

Entity-to-entity relationships for knowledge graph construction

Annotate directed relationships between entities — drug-disease interactions, cause-effect chains, organizational hierarchies, contractual obligations, and scientific claims. Build the structured knowledge graphs that power question answering, recommendation engines, and decision support systems.

TECHNICAL DETAILS
  • Directed relation annotation: subject → predicate → object triples
  • Evidence span marking: text supporting each relation linked to annotation
  • Cross-document relations: entities and relations linked across document sets
  • Relation types: causal, temporal, hierarchical, associative, negated
  • Confidence scoring: annotator confidence (high/medium/low) per relation
  • Knowledge graph-ready output: RDF triples, Neo4j import format, custom schema
PERFORMANCE
F1
Relation F1 ≥ 0.85
Agreement
κ ≥ 0.80
Throughput
200–500 relations/annotator/day
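
The subject → predicate → object triples described above serialize directly to graph-import formats. A minimal sketch of RDF N-Triples output — the `example.org` namespace and entity naming are placeholders, not a fixed schema:

```python
def to_ntriples(relations, base="http://example.org/kg/"):
    """Serialize annotated (subject, predicate, object) relations as
    N-Triples lines for knowledge-graph import. URIs under `base` are
    illustrative; a real project would supply its own namespace/schema."""
    lines = []
    for subj, pred, obj in relations:
        s = f"<{base}{subj.replace(' ', '_')}>"
        p = f"<{base}rel/{pred}>"
        o = f"<{base}{obj.replace(' ', '_')}>"
        lines.append(f"{s} {p} {o} .")
    return "\n".join(lines)

triples = [("warfarin", "interacts_with", "aspirin"),
           ("aspirin", "treats", "headache")]
print(to_ntriples(triples))
```

Evidence spans and confidence scores would ride alongside each triple as reified metadata or as extra columns in a Neo4j import CSV.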
Capability 3

OCR Correction & Document Parsing

Post-OCR validation with layout-aware field extraction

Post-OCR quality assurance, field extraction validation, and layout-aware document parsing for scanned forms, invoices, receipts, contracts, handwritten documents, and historical archives. We correct OCR errors, validate field extractions, and annotate document structure for training document AI models.

TECHNICAL DETAILS
  • OCR error correction: character-level correction with error type classification
  • Table annotation: row/column structure, cell content, header detection, spanning cells
  • Handwriting recognition validation: word-level and character-level correction
  • Field extraction validation: key-value pairs verified against source document
  • Document structure: heading hierarchy, paragraph boundaries, list detection
  • Historical documents: degraded scan handling, archaic typography, multi-language
PERFORMANCE
Accuracy
≥ 99% character accuracy post-correction
Throughput
100–300 pages/annotator/day
Field Accuracy
≥ 97% field extraction accuracy
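
One common way to verify the ≥ 99% character-accuracy target is edit distance between the corrected text and a ground-truth transcription of sampled pages. A self-contained sketch:

```python
def char_accuracy(reference, ocr_output):
    """Character accuracy = 1 - (Levenshtein distance / reference length),
    computed with a standard two-row dynamic-programming table."""
    m, n = len(reference), len(ocr_output)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == ocr_output[j - 1] else 1
            curr[j] = min(prev[j] + 1,        # deletion
                          curr[j - 1] + 1,    # insertion
                          prev[j - 1] + cost) # substitution
        prev = curr
    return 1.0 - prev[n] / m if m else 1.0

print(char_accuracy("Invoice total: $1,240.00", "Invoice total: $l,240.OO"))
```

The same measure applied at word level gives the coarser word-accuracy figure often quoted for handwriting validation.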

Capability 4

Text Classification & Sentiment Analysis

Multi-granularity classification across custom taxonomies

Document-level, paragraph-level, and sentence-level classification across sentiment polarity, topic categorization, intent detection, urgency scoring, and custom taxonomies. Support for hierarchical labels with configurable confidence thresholds and inter-annotator agreement enforcement.

TECHNICAL DETAILS
  • Granularity: document-level, paragraph-level, sentence-level, aspect-level
  • Multi-label: unlimited tags per unit with independent confidence scores
  • Intent + slot labeling: for conversational AI and chatbot training
  • Sentiment: 3-point (pos/neg/neutral), 5-point, or continuous scale (0.0–1.0)
  • Hierarchical taxonomy: parent-child class relationships with inheritance
  • Aspect-based sentiment: entity-specific sentiment within the same document
PERFORMANCE
Accuracy
Classification accuracy ≥ 95%
Agreement
κ ≥ 0.85
Throughput
1K–5K documents/annotator/day
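
The κ ≥ 0.85 agreement figure is Cohen's kappa: observed agreement between two annotators, corrected for the agreement expected by chance given each annotator's label distribution. A minimal sketch with illustrative sentiment labels:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    chance = sum(ca[k] * cb[k] for k in ca) / (n * n)
    if chance == 1.0:
        return 1.0
    return (observed - chance) / (1 - chance)

a = ["pos", "pos", "neg", "neutral", "pos", "neg"]
b = ["pos", "neg", "neg", "neutral", "pos", "neg"]
print(round(cohens_kappa(a, b), 3))  # 0.739
```

For more than two annotators or for span-valued data, Krippendorff's α (mentioned under Quality below) generalizes the same chance-corrected idea.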
Capability 5

Document Understanding & Key-Value Extraction

Layout-aware annotation for intelligent document processing

Key-value pair extraction, form field mapping, table structure annotation, section classification, and relationship mapping across PDFs, scanned images, and structured formats. We annotate the spatial and semantic structure that document AI models need to understand complex real-world documents.

TECHNICAL DETAILS
  • Key-value extraction: field name → field value with bounding region linking
  • Section classification: headers, paragraphs, lists, footnotes, signatures, stamps
  • Form field types: text, checkbox, radio, date, signature, handwritten entries
  • Table structure: cell detection, row/column headers, spanning cells, nested tables
  • Multi-page linking: cross-page references, continuation markers, page-level metadata
  • Spatial relationships: above/below/left-of/right-of/contains between regions
PERFORMANCE
Accuracy
Field extraction F1 ≥ 0.95
Throughput
50–150 documents/annotator/day
Table Accuracy
Cell detection F1 ≥ 0.92
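
The spatial relationships listed above (above/below/left-of/right-of/contains) reduce to simple comparisons on bounding-box coordinates. A sketch with an origin at the top-left of the page — the precedence order and the example boxes are illustrative assumptions:

```python
def spatial_relation(a, b):
    """Coarse spatial relation from region a to region b, each a bounding
    box (x0, y0, x1, y1) with y increasing downward. Checks containment
    first, then vertical separation, then horizontal separation."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    if ax0 <= bx0 and ay0 <= by0 and ax1 >= bx1 and ay1 >= by1:
        return "contains"
    if ay1 <= by0:
        return "above"
    if by1 <= ay0:
        return "below"
    if ax1 <= bx0:
        return "left-of"
    if bx1 <= ax0:
        return "right-of"
    return "overlaps"

label_box = (40, 100, 120, 115)   # hypothetical "Invoice No.:" region
value_box = (130, 100, 210, 115)  # hypothetical "INV-2024-…" region
print(spatial_relation(label_box, value_box))  # left-of
```

Key-to-value linking in form annotation typically pairs a label region with the nearest value region to its right or directly below, using exactly this kind of predicate.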
Capability 6

Multilingual & Cross-Lingual Annotation

30+ languages with native-speaker annotators and linguistic expertise

NER, classification, sentiment, and document parsing across 30+ languages — each with native-speaker annotators who understand linguistic nuances, cultural context, and domain-specific terminology in their language. Cross-lingual alignment annotation for multilingual model training.

TECHNICAL DETAILS
  • 30+ languages: European, Asian, Middle Eastern, African language families
  • Script-specific handling: CJK segmentation, Arabic RTL, Devanagari conjuncts
  • Code-switching annotation: mixed-language text with language-span tagging
  • Native-speaker annotators: no machine translation, no non-native bilingual substitutes
  • Cross-lingual entity alignment: same entities linked across language versions
  • Dialect awareness: regional variations, formal/informal register differences
PERFORMANCE
Languages
30+ with native speakers
Cross Lingual
Entity alignment across versions
Throughput
Varies by script complexity
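
Code-switching annotation with language-span tagging, mentioned above, comes down to segmenting mixed-language text into contiguous spans with character offsets and a language code each. A hypothetical record shape (the JSONL field names and example sentence are illustrative, not a fixed schema):

```python
# Hypothetical code-switching record: Spanish/English mixed text segmented
# into language spans with character offsets into the source string.
record = {
    "text": "Por favor revisa el pull request antes del standup.",
    "spans": [
        {"start": 0,  "end": 19, "lang": "es"},  # "Por favor revisa el"
        {"start": 20, "end": 32, "lang": "en"},  # "pull request"
        {"start": 33, "end": 42, "lang": "es"},  # "antes del"
        {"start": 43, "end": 50, "lang": "en"},  # "standup"
    ],
}

def span_texts(record):
    """Materialize (substring, lang) pairs; offsets must index the text."""
    return [(record["text"][s["start"]:s["end"]], s["lang"])
            for s in record["spans"]]

for text, lang in span_texts(record):
    print(f"{lang}: {text!r}")
```

Validating that every span's offsets slice out the intended substring is the same offset check applied to entity spans elsewhere in the pipeline.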
Domains

Domain-Specific NLP Expertise

Our annotators are trained on industry-specific terminology, taxonomies, and edge cases — not generic crowdworkers labeling random text.

Medical & Clinical NLP

Clinical note parsing, medical NER (drugs, symptoms, procedures, diagnoses), ICD-10/CPT code mapping, clinical trial document annotation, and HIPAA-compliant de-identification of protected health information.

  • ICD-10 / SNOMED CT / MeSH taxonomy
  • PHI de-identification (99.7%+ recall)
  • Clinical note structure parsing
  • Drug-drug interaction annotation

Legal & Compliance NLP

Contract clause extraction, legal entity recognition, obligation/risk identification, regulatory document parsing, case citation linking, and confidentiality-aware annotation with NDA coverage.

  • Contract clause taxonomy (50+ types)
  • Obligation vs. right classification
  • Citation extraction and linking
  • Confidentiality-compliant workflows

Financial Services NLP

Financial entity extraction, earnings call analysis, SEC/EDGAR filing parsing, KYC document processing, transaction classification, and sentiment analysis on financial news and analyst reports.

  • Financial entity taxonomy
  • SEC filing structure parsing
  • Transaction categorization
  • Market sentiment (bullish/bearish/neutral)

E-Commerce & Retail NLP

Product attribute extraction from listings, review sentiment analysis (aspect-level), catalog classification with SKU mapping, search query intent detection, and customer support ticket routing.

  • Product taxonomy alignment
  • Aspect-based sentiment per feature
  • Intent/slot for search queries
  • Multi-language product descriptions

Technical Documentation

API documentation parsing, code comment extraction, technical spec annotation, knowledge base structuring, and developer documentation classification for AI-powered developer tools.

  • Code-text boundary detection
  • API parameter extraction
  • Error message classification
  • Multi-language code samples

Content Moderation & Safety

Toxicity detection, hate speech classification, misinformation labeling, policy violation flagging, and age-appropriateness rating across user-generated content in 20+ languages.

  • 40+ harm categories
  • Severity scoring (0–5 scale)
  • Context-aware moderation
  • Multi-language coverage
FORMATS

Supported Input & Output Formats

We ingest any text format and deliver in your training pipeline's preferred schema.

Input Formats

PDF (native + scanned)
DOCX / DOC
Plain Text / TXT
HTML / XML
CSV / TSV
JSON / JSONL
Scanned Images
Handwritten Docs
Emails (EML/MSG)
Chat Logs

Output Formats

CoNLL / CoNLL-U
IOB2 / BIOES
spaCy JSON
Prodigy JSONL
BRAT Standoff
Hugging Face
RDF Triples
CSV / Parquet
Custom JSON Schema
Neo4j Import
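
Of the output formats above, IOB2 is the most compact for token-level NER: each token gets `B-TYPE` (begin), `I-TYPE` (inside), or `O` (outside). A minimal conversion sketch — note that IOB2 encodes one flat layer only, so nested or overlapping entities need multiple tag layers or a standoff format such as BRAT:

```python
def to_iob2(tokens, entities):
    """Convert token-index entity spans to IOB2 tags.
    `entities` are (start_token, end_token_exclusive, type) tuples
    over an already-tokenized, non-overlapping annotation layer."""
    tags = ["O"] * len(tokens)
    for start, end, etype in entities:
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
    return tags

tokens = ["Dr.", "Smith", "joined", "Bank", "of", "America", "in", "2019", "."]
entities = [(0, 2, "PERSON"), (3, 6, "ORG"), (7, 8, "DATE")]
print(list(zip(tokens, to_iob2(tokens, entities))))
```

BIOES adds explicit `E-` (end) and `S-` (single-token) tags to the same scheme, which some sequence models train on more cleanly.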
Quality

NLP-Specific Quality Controls

Text annotation demands linguistic precision. Here's how we maintain consistency across annotators, languages, and document types.

Span Boundary Validation

Automated checks ensure entity spans are clean — no trailing whitespace, partial words, inconsistent boundary definitions, or invalid character offsets. Rule-based validation catches 80%+ of span errors before human review.

Automated span validation catches 80%+ of boundary errors pre-review
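
A rule-based validator of this kind is straightforward to sketch. The error codes are illustrative, and the word-boundary rule assumes a space-delimited language — CJK text needs a segmenter-based check instead:

```python
def validate_span(text, start, end):
    """Rule-based span checks mirroring the validations described above:
    offsets in range, no whitespace at boundaries, no partial words.
    Returns a list of error codes (empty list = clean span)."""
    if not (0 <= start < end <= len(text)):
        return ["invalid_offsets"]
    errors = []
    span = text[start:end]
    if span != span.strip():
        errors.append("whitespace_boundary")
    if start > 0 and text[start - 1].isalnum() and text[start].isalnum():
        errors.append("partial_word_start")
    if end < len(text) and text[end - 1].isalnum() and text[end].isalnum():
        errors.append("partial_word_end")
    return errors

text = "Patient was prescribed warfarin daily."
print(validate_span(text, 23, 31))  # "warfarin" -> []
print(validate_span(text, 23, 28))  # "warfa"    -> ['partial_word_end']
```

Running such checks on submission, before any human review, is what keeps boundary errors out of the adjudication queue.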

Taxonomy Consistency Enforcement

Entity types, relation labels, and classification categories are validated against the project taxonomy on every annotation. Out-of-schema labels are blocked automatically. Schema versioning tracks changes across guideline iterations.

Zero out-of-schema labels in delivered data

Cross-Document Entity Consistency

Same entities are labeled the same way across all documents — 'JPMorgan Chase', 'JP Morgan', and 'JPMC' all resolve to the same canonical entity. We track entity-level consistency scores across annotators and batches.

Entity consistency score ≥ 0.95 across document sets
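
Canonical-entity resolution of this kind is driven by a project alias dictionary. A minimal sketch — the alias map and the casefold-plus-strip-punctuation normalization are simplifying assumptions, not a production linker:

```python
def canonicalize(mention, alias_map):
    """Resolve surface-form variants to one canonical entity via a
    normalized lookup; unknown mentions pass through unchanged."""
    key = mention.casefold().replace(".", "").replace(",", "").strip()
    return alias_map.get(key, mention)

# Hypothetical project alias dictionary.
ALIASES = {
    "jpmorgan chase": "JPMorgan Chase & Co.",
    "jp morgan": "JPMorgan Chase & Co.",
    "jpmc": "JPMorgan Chase & Co.",
}

for mention in ["JPMorgan Chase", "JP Morgan", "JPMC"]:
    print(canonicalize(mention, ALIASES))  # all resolve to the same entity
```

The consistency score cited above is then the share of mentions whose annotated canonical entity matches the dictionary resolution, tracked per annotator and per batch.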

Inter-Annotator Agreement (IAA)

Token-level agreement scores using exact-match F1, Cohen's κ, and Krippendorff's α. Computed per entity type and relation type. Disagreements trigger L3 adjudication and targeted recalibration.

Cohen's κ ≥ 0.85 enforced per entity type per batch

Domain Vocabulary Validation

Custom dictionaries for medical, legal, financial, and technical terminology. Annotators are tested on domain-specific gold sets. Vocabulary updates are distributed to all annotators within 24 hours of approval.

Domain vocabulary test score ≥ 90% for production access

Automated Linguistic Checks

Rule-based validation for language-specific issues: tokenization boundary errors (CJK), script consistency (mixed scripts flagged), encoding issues (UTF-8 validation), and sentence boundary detection accuracy.

Language-specific validation rules per supported language
COMPARISON

UTL NLP Annotation vs. Typical Providers

Capability                                     | UTL Data Engine | Typical Providers
Nested & overlapping entity support            | Yes             | Flat entities only
Cross-document coreference resolution          | Yes             | No
Relation extraction for knowledge graphs       | Yes             | No
30+ languages with native-speaker annotators   | Yes             | 10–15 languages
Domain taxonomy integration (ICD-10, SNOMED)   | Yes             | Generic schemas
Per-entity-type IAA tracking (κ)               | Yes             | Aggregate only
Automated span boundary validation             | Yes             | Manual QA only
Cross-lingual entity alignment                 | Yes             | No
Layout-aware document parsing                  | Yes             | Text-only
HIPAA-compliant de-identification              | Yes             | Basic redaction
“We needed custom NER across 50+ medical entity types with nested span support and cross-document coreference. UTL's team understood the clinical domain from day one, delivered 98.5% F1 on our validation set, and maintained κ ≥ 0.87 across 15 annotators. Their automated span validation alone saved us 30% in review time.”
Data Science Director
Enterprise Health-Tech Platform
FAQS

NLP & Document AI Questions

Can you handle nested and overlapping entities?

Yes. We support complex NER schemas with nested entities (e.g., 'Bank of America' as ORG with the nested LOC 'America'), overlapping spans, discontinuous entities, and cross-sentence coreference chains. Our annotation tooling and QA validation are built specifically for these complex span types.

How do you handle domain-specific terminology?

We train annotators on your domain's specific taxonomy, terminology, and edge cases. This includes 20–40 hours of domain training, gold set calibration with domain-specific examples, and ongoing terminology dictionary updates. For medical NLP, we integrate ICD-10, SNOMED CT, and MeSH coding standards directly into the annotation workflow.

Can you process handwritten or degraded documents?

We provide OCR correction and validation for handwritten documents, including historical archives and degraded scans. Our annotators correct OCR errors at character level, validate field extractions, and annotate document structure. For severely degraded documents, we apply multi-pass review with confidence scoring.

Can your annotations feed a knowledge graph?

Yes. Our relation extraction annotation produces subject-predicate-object triples with evidence spans, confidence scores, and cross-document linking. Output formats include RDF triples, Neo4j import format, and custom graph schemas. We've built knowledge graphs with 100K+ entities and 500K+ relations for enterprise clients.

How do you maintain quality across 30+ languages?

Each language has native-speaker annotators with linguistic training, separate gold sets and calibration processes, and language-specific validation rules (CJK tokenization, Arabic RTL handling, etc.). We measure IAA independently per language and never mix language teams.

How quickly can a project launch?

Scoping + schema design: 3–5 days. Annotator training + calibration: 5–7 days. Pilot (1K–5K documents): 5–10 days. First delivered batch by Day 20. For domain-specific projects (medical, legal), add 3–5 days for domain training. We maintain pre-qualified teams across major domains for faster ramp-up.

Need Text & Document Annotation?

Let's discuss your text and document annotation pipeline — from task design to quality-assured delivery. We'll scope a pilot within 48 hours.