Best Comic Panel Detection
mosesb/best-comic-panel-detection
published Jun 2025 · updated Jun 2025
Best Comic Panel Detection is a detection model that identifies and localizes individual panels in comic book pages using YOLOv12x.
specs
| Task | Object Detection (Comic Panel Detection) |
| Architecture | YOLOv12x (extra-large variant) |
| Licenses | MIT |
| Training Data | Custom Roboflow dataset (Custom-Workflow-3-Object-Detection-1) |
about this model
mosesb/best-comic-panel-detection is a YOLOv12x object detection model fine-tuned to detect and localize individual panels on comic book pages. It outputs bounding boxes for each panel, enabling automated comic digitization, content extraction, and layout analysis.
Architecture and Training
The model uses the extra-large YOLOv12x architecture, fine-tuned via transfer learning from a COCO-pretrained checkpoint. Training was performed on a custom Roboflow dataset ("Custom-Workflow-3-Object-Detection-1") with a single class: "Comic Panel". Hyperparameters included 640x640 image size, batch size 16, AdamW optimizer (lr=0.002), and up to 200 epochs with early stopping patience of 100.
Performance
On the validation set, the model achieves near-perfect detection metrics:
| Metric | Value | Description |
|---|---|---|
| mAP50 | 0.991 | Mean average precision at IoU threshold 0.50 |
| mAP50-95 | 0.985 | Mean average precision averaged over IoU thresholds 0.50–0.95 |
The model correctly identifies panels across various sizes and layouts, as shown in validation predictions.
Training and Evaluation Visualizations
Training and validation metrics over epochs are available, along with a confusion matrix and additional performance curves (F1, precision-recall, precision, recall) in the finetuning directory.
Limitations
The model detects rectangular bounding boxes and may underperform on highly irregular or overlapping panel shapes.
best for
- ·Digitizing comic books into panel-by-panel format
- ·Extracting text or characters from individual panels
- ·Analyzing comic book layouts and artistic styles
FAQ
It is designed to detect and draw bounding boxes around individual panels in comic book pages, enabling structured digital reading, content extraction, and layout analysis.
It achieves a mAP50 of 0.991 and mAP50-95 of 0.985 on its validation set, indicating near-perfect precision and recall for comic panel detection.
It uses the YOLOv12x (extra-large) object detection architecture, fine-tuned from a COCO pre-trained checkpoint.
The model and associated training code are licensed under the MIT License, allowing free use, modification, and distribution.
Use the gigarouter OpenAI-compatible endpoint with your API key, sending an image URL or base64-encoded image, and receive bounding box coordinates and confidence scores in the response.
We're benchmarking and onboarding Best Comic Panel Detection as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.