Best Comic Panel Detection

mosesb/best-comic-panel-detection

published Jun 2025 · updated Jun 2025

Best Comic Panel Detection is a detection model that identifies and localizes individual panels in comic book pages using YOLOv12x.

status: coming soon
API providers: 0
downloads / mo: 4.6K
license: apache-2.0

specs

Task: Object Detection (Comic Panel Detection)
Architecture: YOLOv12x (extra-large variant)
Licenses: MIT
Training Data: Custom Roboflow dataset (Custom-Workflow-3-Object-Detection-1)

about this model

mosesb/best-comic-panel-detection is a YOLOv12x object detection model fine-tuned to detect and localize individual panels on comic book pages. It outputs bounding boxes for each panel, enabling automated comic digitization, content extraction, and layout analysis.
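
As a rough sketch of how the bounding-box output might be consumed, the snippet below loads a checkpoint through the Ultralytics `YOLO` interface and sorts the detected panels into a simple left-to-right, top-to-bottom order. Loading through the `ultralytics` package is an assumption, since the card only names the architecture. The weights path, the `Panel` type, the `reading_order` helper, and its `row_tolerance` value are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Panel:
    """One detected panel as pixel coordinates plus a confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float


def detect_panels(weights_path: str, image_path: str, min_conf: float = 0.25) -> list[Panel]:
    """Run the detector on one page image and return its boxes.

    Assumes the checkpoint loads with the `ultralytics` package. The import is
    local so the pure-Python helper below works without it installed.
    """
    from ultralytics import YOLO

    model = YOLO(weights_path)  # hypothetical path to a locally downloaded .pt file
    result = model.predict(image_path, conf=min_conf)[0]
    boxes = result.boxes.xyxy.tolist()   # [[x1, y1, x2, y2], ...]
    scores = result.boxes.conf.tolist()  # [confidence, ...]
    return [Panel(*box, confidence=score) for box, score in zip(boxes, scores)]


def reading_order(panels: list[Panel], row_tolerance: float = 40.0) -> list[Panel]:
    """Order panels row by row, left to right within each row.

    Panels whose top edge lies within `row_tolerance` pixels of the first
    panel in the current row are treated as belonging to that row.
    """
    rows: list[list[Panel]] = []
    for panel in sorted(panels, key=lambda p: p.y1):
        if rows and abs(panel.y1 - rows[-1][0].y1) <= row_tolerance:
            rows[-1].append(panel)
        else:
            rows.append([panel])
    return [p for row in rows for p in sorted(row, key=lambda p: p.x1)]
```

The ordering rule is deliberately naive: it assumes a left-to-right page and roughly aligned rows, so right-to-left comics or staggered layouts would need a different heuristic.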

Architecture and Training

The model uses the extra-large YOLOv12x architecture, fine-tuned via transfer learning from a COCO-pretrained checkpoint. Training was performed on a custom Roboflow dataset ("Custom-Workflow-3-Object-Detection-1") with a single class: "Comic Panel". Hyperparameters included 640x640 image size, batch size 16, AdamW optimizer (lr=0.002), and up to 200 epochs with early stopping patience of 100.
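
To make the interaction between the epoch limit and the patience setting concrete, here is a minimal sketch. The dictionary restates the hyperparameters above using Ultralytics training argument names, which is an assumption about the tooling. `stopping_epoch` is a hypothetical helper that implements a simplified patience rule, not the trainer's exact logic.

```python
# Hyperparameters as stated in the model card. The key names follow the
# Ultralytics training arguments, which is an assumption about the tooling.
TRAIN_ARGS = {
    "imgsz": 640,
    "batch": 16,
    "optimizer": "AdamW",
    "lr0": 0.002,
    "epochs": 200,
    "patience": 100,
}


def stopping_epoch(fitness_per_epoch, max_epochs, patience):
    """Return the 1-based epoch at which training would end.

    Simplified patience rule: stop once `patience` epochs have passed since
    the best fitness value was recorded, or when `max_epochs` is reached.
    """
    best = float("-inf")
    best_epoch = 0
    for epoch, fitness in enumerate(fitness_per_epoch[:max_epochs], start=1):
        if fitness > best:
            best, best_epoch = fitness, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return min(len(fitness_per_epoch), max_epochs)
```

With a patience of 100 and a cap of 200 epochs, early stopping under this rule can only trigger if the best validation score occurs in the first 100 epochs; otherwise training runs to the cap.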

Performance

On the validation set, the model achieves near-perfect detection metrics:

Metric      Value   Description
mAP50       0.991   Mean average precision at IoU threshold 0.50
mAP50-95    0.985   Mean average precision averaged over IoU thresholds 0.50–0.95
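
Both metrics rest on intersection over union (IoU) between a predicted box and a labeled one. The sketch below shows the IoU calculation and the ten thresholds (0.50 to 0.95 in steps of 0.05) that mAP50-95 averages over. The function names are illustrative, and the code does not reproduce the full average-precision computation.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds averaged by mAP50-95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(predicted, ground_truth):
    """Count the thresholds at which this prediction counts as a match."""
    overlap = iou(predicted, ground_truth)
    return sum(overlap >= t for t in IOU_THRESHOLDS)
```

A prediction with an IoU of 0.88 against its labeled panel counts as correct for mAP50 but fails the 0.90 and 0.95 thresholds, which is why mAP50-95 is the stricter of the two scores.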

The model correctly identifies panels across various sizes and layouts, as shown in validation predictions.

[Figure: Validation predictions showing detected comic panel bounding boxes]

Training and Evaluation Visualizations

Training and validation metrics over epochs are available, along with a confusion matrix and additional performance curves (F1, precision-recall, precision, recall) in the finetuning directory.

[Figure: Training and validation metrics curves]
[Figure: Confusion matrix for comic panel detection]

Limitations

The model detects rectangular bounding boxes and may underperform on highly irregular or overlapping panel shapes.

FAQ

What is this model best for?

It is designed to detect and draw bounding boxes around individual panels in comic book pages, enabling structured digital reading, content extraction, and layout analysis.

How accurate is the model?

It achieves an mAP50 of 0.991 and an mAP50-95 of 0.985 on its validation set. The small gap between the two scores indicates that predicted boxes align closely with the labeled panels even at strict IoU thresholds.

What architecture does it use?

It uses the YOLOv12x (extra-large) object detection architecture, fine-tuned from a COCO pre-trained checkpoint.

What license governs use of this model?

The model and associated training code are licensed under the MIT License, allowing free use, modification, and distribution.

How can I call this model via the gigarouter API?

The hosted API is not yet live. Once it is, use the gigarouter OpenAI-compatible endpoint with your API key: send an image URL or a base64-encoded image, and the response returns bounding box coordinates and confidence scores.
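
For the base64 option, one common way to package an image is as a data URL. The sketch below covers only that encoding step. The `image_to_data_url` helper is hypothetical, and whether the endpoint accepts a data URL or a bare base64 string is an assumption, since the request schema is not documented here.

```python
import base64
import mimetypes


def image_to_data_url(image_bytes: bytes, filename: str) -> str:
    """Encode raw image bytes as a base64 data URL.

    The MIME type is guessed from the file name. Whether the API expects a
    data URL or a bare base64 string is an assumption to verify.
    """
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"not a recognised image file name: {filename}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

In practice the bytes would come from reading the page image in binary mode, for example `open("page.png", "rb").read()`.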

not yet live

We're benchmarking and onboarding Best Comic Panel Detection as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
