Here is an **all-in-one Python script** that will:

1. read a scanned image (JPG/PNG),
2. run OCR (Tesseract via `pytesseract`),
3. run **NER via Ollama** (a local model, e.g. `llama3`/`llama3.1`/`mistral`),
4. emit clean **JSON** output.

It includes light **pre-processing** (grayscale, denoise, threshold, light auto-deskew) to make OCR more stable on scans.

---

### Prerequisites

* **Tesseract OCR** installed:

  * Ubuntu/Debian:

    ```bash
    sudo apt-get update
    sudo apt-get install -y tesseract-ocr tesseract-ocr-ind tesseract-ocr-eng
    ```
* **Ollama** running locally (default `http://localhost:11434`) with the model already pulled:

  ```bash
  ollama pull llama3.1
  ```
* Python deps:

  ```bash
  pip install pillow pytesseract opencv-python requests python-dotenv
  ```

Optional: create a `.env` file to override the endpoint/model:

```
OLLAMA_BASE=http://localhost:11434
OLLAMA_MODEL=llama3.1
OCR_LANG=ind+eng
```
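
A quick way to sanity-check both dependencies before running the pipeline. This is a minimal sketch (`check_setup.py` is a hypothetical helper name); `pytesseract.get_tesseract_version()` and Ollama's `/api/tags` endpoint for listing pulled models are both standard:

```python
# check_setup.py - verify Tesseract and Ollama are reachable
import pytesseract
import requests

# Tesseract: prints the installed version, or raises if the binary is missing
print("Tesseract:", pytesseract.get_tesseract_version())

# Ollama: list locally pulled models via the /api/tags endpoint
resp = requests.get("http://localhost:11434/api/tags", timeout=10)
resp.raise_for_status()
print("Ollama models:", [m["name"] for m in resp.json().get("models", [])])
```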

---

### Script: `ocr_ner_ollama.py`

> Run it:
> `python ocr_ner_ollama.py input1.jpg input2.png --lang ind+eng --model llama3.1 --out result.json`

```python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import argparse
import json
import os
import re
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional, Tuple

import cv2
import numpy as np
import pytesseract
import requests
from PIL import Image
from dotenv import load_dotenv

# =========================
# Data structures
# =========================

@dataclass
class OCRWord:
    text: str
    conf: float
    left: int
    top: int
    width: int
    height: int
    line_num: int
    par_num: int

@dataclass
class OCROutput:
    text: str
    words: List[OCRWord]
    image_size: Tuple[int, int]  # (width, height)

@dataclass
class Entity:
    text: str
    label: str
    start: int
    end: int
    confidence: Optional[float] = None  # model may not provide; keep optional

@dataclass
class NEROutput:
    entities: List[Entity]
    notes: str = ""


# =========================
# Image preprocessing
# =========================

def auto_deskew(image: np.ndarray) -> np.ndarray:
    """
    Estimate skew via minAreaRect on edge pixels and rotate to correct.
    A light heuristic; returns the input unchanged if no edges are found.
    """
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Canny edges for orientation
    edges = cv2.Canny(gray, 50, 150)
    coords = np.column_stack(np.where(edges > 0))
    if coords.size == 0:
        return image
    rect = cv2.minAreaRect(coords.astype(np.float32))
    angle = rect[-1]
    # Normalize to the smallest correcting rotation. Note that the angle
    # convention of cv2.minAreaRect changed in OpenCV 4.5:
    # [-90, 0) before, (0, 90] after.
    if angle > 45:
        angle -= 90
    elif angle < -45:
        angle += 90
    angle = -angle

    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC,
                             borderMode=cv2.BORDER_REPLICATE)
    return rotated


def preprocess_for_ocr(image_path: str) -> np.ndarray:
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(f"Cannot open image: {image_path}")

    # Upscale small scans to help Tesseract
    scale = 1.5 if max(img.shape[:2]) < 1500 else 1.0
    if scale != 1.0:
        img = cv2.resize(img, None, fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)

    # Deskew
    img = auto_deskew(img)

    # Grayscale + denoise + contrast (CLAHE) + adaptive threshold
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray = cv2.fastNlMeansDenoising(gray, h=10)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    gray = clahe.apply(gray)
    thr = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                cv2.THRESH_BINARY, 31, 10)

    # Morphological opening to remove small noise
    kernel = np.ones((1, 1), np.uint8)
    thr = cv2.morphologyEx(thr, cv2.MORPH_OPEN, kernel)
    return thr


# =========================
# OCR via Tesseract
# =========================

def run_ocr(processed_img: np.ndarray, lang: str = "ind+eng") -> OCROutput:
    pil = Image.fromarray(processed_img)
    # Use detailed data for bounding boxes
    data = pytesseract.image_to_data(pil, lang=lang, output_type=pytesseract.Output.DICT)
    words: List[OCRWord] = []
    text_parts = []

    n = len(data["text"])
    for i in range(n):
        txt = data["text"][i].strip()
        if not txt:
            continue
        # conf may be an int or a string depending on the pytesseract
        # version; non-word boxes carry -1
        conf = float(data["conf"][i])
        left, top, w, h = data["left"][i], data["top"][i], data["width"][i], data["height"][i]
        line_num = int(data.get("line_num", [0] * n)[i])
        par_num = int(data.get("par_num", [0] * n)[i])
        words.append(OCRWord(txt, conf, left, top, w, h, line_num, par_num))
        text_parts.append(txt)

    full_text = " ".join(text_parts)
    h, w = processed_img.shape[:2]
    return OCROutput(text=full_text, words=words, image_size=(w, h))


# =========================
# NER via Ollama
# =========================

DEFAULT_ENTITY_SCHEMA = [
    "PERSON", "ORG", "LOC", "GPE", "DATE", "TIME", "MONEY",
    "PERCENT", "EMAIL", "PHONE", "URL", "EVENT", "LAW", "PRODUCT", "NORP"
]

SYSTEM_PROMPT = (
    "You are an information extraction engine. "
    "Extract named entities from the provided text and return ONLY valid JSON. "
    "Use the requested schema. Do not include explanations."
)

def build_ner_prompt(doc_text: str, labels: List[str]) -> str:
    labels_str = ", ".join(labels)
    # Keep the prompt concise; the document may be long
    return (
        f"Text (Indonesian/English mix possible):\n\"\"\"\n{doc_text}\n\"\"\"\n\n"
        f"Extract entities with labels in this set: {labels_str}.\n"
        f"Rules:\n"
        f"- Output JSON ONLY with keys: entities, notes.\n"
        f"- Each entity: {{\"text\",\"label\",\"start\",\"end\",\"confidence\"}}.\n"
        f"- Use character offsets on the given Text (0-based, inclusive start, exclusive end).\n"
        f"- If unsure, set confidence conservatively (0.0–1.0). If not provided by reasoning, use 0.5.\n"
        f"- Keep notes short (e.g., detection caveats).\n"
    )

def call_ollama_ner(text: str,
                    model: str = "llama3.1",
                    base_url: str = "http://localhost:11434",
                    labels: Optional[List[str]] = None,
                    max_chars: int = 8000) -> NEROutput:
    if labels is None:
        labels = DEFAULT_ENTITY_SCHEMA

    # Truncate overly long text to fit the model context
    doc = text[:max_chars]

    payload = {
        "model": model,
        "options": {
            "temperature": 0.1
        },
        # If your Ollama supports JSON mode, include format:"json"
        "format": "json",
        "system": SYSTEM_PROMPT,
        "prompt": build_ner_prompt(doc, labels),
        "stream": False
    }

    resp = requests.post(f"{base_url}/api/generate", json=payload, timeout=120)
    resp.raise_for_status()

    # Ollama returns {"response": "...json..."}; parse that JSON string
    content = resp.json().get("response", "").strip()

    # Be robust to accidental leading/trailing text around the JSON object
    first_brace = content.find("{")
    last_brace = content.rfind("}")
    if first_brace == -1 or last_brace == -1:
        raise ValueError(f"Ollama did not return JSON: {content[:200]}")

    json_str = content[first_brace:last_brace + 1]
    parsed = json.loads(json_str)

    ents: List[Entity] = []
    for e in parsed.get("entities", []):
        ents.append(
            Entity(
                text=e.get("text", ""),
                label=e.get("label", ""),
                start=int(e.get("start", -1)),
                end=int(e.get("end", -1)),
                confidence=float(e["confidence"]) if e.get("confidence") is not None else 0.5
            )
        )
    notes = parsed.get("notes", "")
    return NEROutput(entities=ents, notes=notes)


# =========================
# Utility: map entities to approximate boxes (optional)
# =========================

def map_entities_to_boxes(ocr: OCROutput, entities: List[Entity]) -> List[Dict[str, Any]]:
    """
    Approximate a bounding box per entity by matching entity text to the
    concatenated OCR text. This is heuristic: we align by char offsets,
    then collect the words overlapping that span.
    """
    # Build cumulative char spans for each word in the concatenated text
    words = ocr.words
    joined = " ".join([w.text for w in words])
    # Precompute the position of each word in 'joined'
    idx = 0
    word_spans = []
    for w in words:
        # find w.text starting at idx or later
        m = re.search(r'\b' + re.escape(w.text) + r'\b', joined[idx:])
        if not m:
            # fallback: raw find
            m2 = joined.find(w.text, idx)
            if m2 == -1:
                # skip if the word cannot be mapped
                word_spans.append((None, None))
                continue
            start = m2
            end = m2 + len(w.text)
        else:
            start = idx + m.start()
            end = idx + m.end()
        word_spans.append((start, end))
        idx = end + 1  # account for the joining space

    mapped = []
    for ent in entities:
        if ent.start < 0 or ent.end <= ent.start:
            mapped.append({"text": ent.text, "label": ent.label, "bbox": None, "page": 1, "confidence": ent.confidence})
            continue
        # collect words whose spans overlap [start, end)
        boxes = []
        for w, ws in zip(words, word_spans):
            if ws[0] is None:
                continue
            s, e = ws
            if not (e <= ent.start or s >= ent.end):  # overlap
                boxes.append((w.left, w.top, w.width, w.height))
        if not boxes:
            mapped.append({"text": ent.text, "label": ent.label, "bbox": None, "page": 1, "confidence": ent.confidence})
        else:
            # merge into one bbox
            xs = [b[0] for b in boxes]
            ys = [b[1] for b in boxes]
            xe = [b[0] + b[2] for b in boxes]
            ye = [b[1] + b[3] for b in boxes]
            bbox = [int(min(xs)), int(min(ys)), int(max(xe) - min(xs)), int(max(ye) - min(ys))]
            mapped.append({"text": ent.text, "label": ent.label, "bbox": bbox, "page": 1, "confidence": ent.confidence})
    return mapped


# =========================
# Main
# =========================

def main():
    load_dotenv()

    parser = argparse.ArgumentParser(description="OCR (Tesseract) + NER (Ollama) pipeline with JSON output.")
    parser.add_argument("images", nargs="+", help="Path(s) to input JPG/PNG scans.")
    parser.add_argument("--lang", default=os.getenv("OCR_LANG", "ind+eng"), help="Tesseract lang (e.g., ind, eng, ind+eng).")
    parser.add_argument("--model", default=os.getenv("OLLAMA_MODEL", "llama3.1"), help="Ollama model name.")
    parser.add_argument("--ollama", default=os.getenv("OLLAMA_BASE", "http://localhost:11434"), help="Ollama base URL.")
    parser.add_argument("--labels", default=",".join(DEFAULT_ENTITY_SCHEMA),
                        help="Comma-separated labels to extract (override default).")
    parser.add_argument("--out", default="ocr_ner_output.json", help="Output JSON file.")
    parser.add_argument("--max-chars", type=int, default=8000, help="Max characters from OCR text to send to NER.")
    parser.add_argument("--map-bbox", action="store_true", help="Approximate entity bounding boxes from OCR words.")
    args = parser.parse_args()

    results = []
    labels = [s.strip() for s in args.labels.split(",") if s.strip()]

    for path in args.images:
        processed = preprocess_for_ocr(path)
        ocr = run_ocr(processed, lang=args.lang)

        try:
            ner = call_ollama_ner(ocr.text, model=args.model, base_url=args.ollama,
                                  labels=labels, max_chars=args.max_chars)
        except Exception as e:
            # If NER fails, still return the OCR result
            ner = NEROutput(entities=[], notes=f"NER error: {e}")

        # Average only the valid confidences (Tesseract marks non-words with -1)
        confs = [w.conf for w in ocr.words if w.conf >= 0]
        item = {
            "file": os.path.basename(path),
            "image_size": {"width": ocr.image_size[0], "height": ocr.image_size[1]},
            "ocr": {
                "text": ocr.text,
                "num_words": len(ocr.words),
                "avg_conf": float(np.mean(confs)) if confs else None,
            },
            "ner": {
                "entities": [asdict(e) for e in ner.entities],
                "notes": ner.notes
            }
        }

        if args.map_bbox and ner.entities:
            mapped = map_entities_to_boxes(ocr, ner.entities)
            item["ner"]["entities_with_bbox"] = mapped

        results.append(item)

    with open(args.out, "w", encoding="utf-8") as f:
        json.dump({"results": results}, f, ensure_ascii=False, indent=2)

    print(json.dumps({"results": results}, ensure_ascii=False, indent=2))


if __name__ == "__main__":
    main()
```

---

### Usage Examples

```bash
# 1) Single file, default language ind+eng, model llama3.1
python ocr_ner_ollama.py scan_ktp.jpg

# 2) Multiple files, with bbox mapping, custom output file
python ocr_ner_ollama.py doc1.png doc2.jpg --map-bbox --out hasil.json

# 3) Different model + custom endpoint
python ocr_ner_ollama.py nota.png --model mistral --ollama http://127.0.0.1:11434
```
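
To eyeball the heuristic boxes produced by `--map-bbox` (example 2), here is a minimal sketch (`draw_bboxes.py` is a hypothetical helper). It assumes the scans sit in the working directory, since the JSON stores only basenames, and it draws on the output of `preprocess_for_ocr` because the boxes are in the preprocessed image's coordinates:

```python
# draw_bboxes.py - overlay entity boxes from hasil.json onto the processed scan
import json
import cv2
from ocr_ner_ollama import preprocess_for_ocr

with open("hasil.json", encoding="utf-8") as f:
    results = json.load(f)["results"]

for item in results:
    img = preprocess_for_ocr(item["file"])       # same preprocessing as the pipeline
    img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)  # so the boxes can be colored
    for ent in item["ner"].get("entities_with_bbox", []):
        if ent["bbox"] is None:
            continue
        x, y, w, h = ent["bbox"]
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
        cv2.putText(img, ent["label"], (x, max(y - 5, 0)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
    cv2.imwrite(f"annotated_{item['file']}", img)
```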

### Sample JSON Output (abridged)

```json
{
  "results": [
    {
      "file": "nota.png",
      "image_size": { "width": 1754, "height": 1240 },
      "ocr": {
        "text": "TOKO MAJU JAYA ... Total: Rp 125.000 ...",
        "num_words": 47,
        "avg_conf": 86.2
      },
      "ner": {
        "entities": [
          { "text": "TOKO MAJU JAYA", "label": "ORG", "start": 0, "end": 14, "confidence": 0.86 },
          { "text": "Rp 125.000", "label": "MONEY", "start": 35, "end": 45, "confidence": 0.78 }
        ],
        "notes": "Currency inferred from 'Rp' token."
      }
    }
  ]
}
```
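
Downstream consumers only need `json.load`. For example, grouping the extracted entities by label per file (a minimal sketch; `summarize_entities.py` is a hypothetical name, reading the `result.json` from the run command above):

```python
# summarize_entities.py - group extracted entities by label per file
import json
from collections import defaultdict

with open("result.json", encoding="utf-8") as f:
    results = json.load(f)["results"]

for item in results:
    by_label = defaultdict(list)
    for ent in item["ner"]["entities"]:
        by_label[ent["label"]].append(ent["text"])
    print(item["file"])
    for label, texts in sorted(by_label.items()):
        print(f"  {label}: {', '.join(texts)}")
```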

---

### Notes & Tips

* If the OCR output is messy, try:

  * Re-scanning at ≥300 DPI.
  * Varying the language: `--lang ind`, `--lang ind+eng`.
  * Switching the thresholding to Otsu: `cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)`.
* For **long documents**, limit `--max-chars` so the prompt does not grow too large, or chunk the text as sketched below this list.
* You can adjust the label schema via `--labels`, e.g. `--labels "PERSON,ORG,LOC,DATE,EMAIL,PHONE"`.
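
A possible chunked variant for long documents, as referenced above: split the OCR text into overlapping windows, run `call_ollama_ner` on each, and shift the returned offsets back to document coordinates. This is a sketch, not part of the script; `ner_in_chunks` and its parameters are hypothetical:

```python
# Hypothetical helper: chunked NER over long OCR text using call_ollama_ner.
from typing import List
from ocr_ner_ollama import Entity, call_ollama_ner

def ner_in_chunks(text: str, chunk_size: int = 6000, overlap: int = 200) -> List[Entity]:
    entities: List[Entity] = []
    step = chunk_size - overlap
    for offset in range(0, len(text), step):
        chunk = text[offset:offset + chunk_size]
        out = call_ollama_ner(chunk, max_chars=chunk_size)
        for e in out.entities:
            if e.start < 0:
                continue
            # shift chunk-local offsets back to document coordinates
            entities.append(Entity(e.text, e.label, e.start + offset, e.end + offset, e.confidence))
    # naive dedup of entities re-detected inside the overlap region
    seen = set()
    unique = []
    for e in entities:
        key = (e.start, e.end, e.label)
        if key not in seen:
            seen.add(key)
            unique.append(e)
    return unique
```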

Need a version that also saves **per-word results (bbox)** to JSON, or direct integration with **BPMN/PM4Py**? Let me know and I can tweak it to fit your pipeline.