Finetuning for Improved Small Icon Detection in OmniParser #3

abrichr · 2024-11-02T00:42:09Z

Objective:
- Implement fine-tuning for OmniParser’s YOLO model to enhance detection accuracy on small icons and UI elements.
Context:
- The current YOLO model misses small or densely packed icons due to its detection sensitivity thresholds.
Proposed Solution:
- Data Collection: Assemble a labeled dataset of small icons/UI elements, including bounding boxes in YOLO format.
- Training Configuration: Use YOLO-specific parameters, adjusting image size (e.g., 640x640) and hyperparameters to improve small object sensitivity.
- Integration Steps:
  - Modify get_yolo_model to support loading the fine-tuned model.
  - Update config to reference the fine-tuned model.
  - Provide a train_yolo function to manage the fine-tuning process.
- Testing: Evaluate detection accuracy on new test images containing small icons/UI elements, adjusting BOX_THRESHOLD as needed.
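
A minimal sketch of the data and integration steps above, not a definitive implementation. It assumes OmniParser loads its detector through the `ultralytics` package, so a fine-tuned checkpoint can be loaded the same way as the base weights. The `to_yolo_label` helper, the default hyperparameters, and the file paths in the usage note are hypothetical.

```python
def to_yolo_label(class_id, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line.

    YOLO labels are "class cx cy w h", with all four values normalized to [0, 1].
    """
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def get_yolo_model(model_path):
    """Load YOLO weights from model_path (base or fine-tuned checkpoint)."""
    # Assumption: ultralytics is installed. It is imported lazily so that
    # to_yolo_label can be used without it.
    from ultralytics import YOLO
    return YOLO(model_path)


def train_yolo(base_weights, data_yaml, epochs=50, imgsz=640, batch=16):
    """Fine-tune base_weights on the dataset described by data_yaml.

    The default values are placeholders, not tuned recommendations.
    """
    model = get_yolo_model(base_weights)
    model.train(data=data_yaml, epochs=epochs, imgsz=imgsz, batch=batch)
    return model
```

For a 16x16 icon at pixels (100, 200) to (116, 216) in a 640x640 screenshot, `to_yolo_label(0, (100, 200, 116, 216), 640, 640)` returns `0 0.168750 0.325000 0.025000 0.025000`. Fine-tuning would then look like `train_yolo("weights/icon_detect/best.pt", "small_icons.yaml")`, where both paths are hypothetical, and the config would reference the resulting checkpoint.
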
Expected Outcome:
- More accurate small icon detection, fewer missed icons in dense layouts, and reduced reliance on preprocessing.
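
To make "fewer missed icons" measurable during testing, one option is to compute recall on labeled test images at several BOX_THRESHOLD values. The sketch below assumes BOX_THRESHOLD acts as a minimum confidence score for keeping a detection. The function names, the IoU cutoff of 0.5, and the sample boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def recall_at_threshold(detections, ground_truth, box_threshold, iou_min=0.5):
    """Fraction of ground-truth boxes overlapped by at least one kept detection.

    detections is a list of (box, score) pairs. This is a simplified check:
    it does not enforce one-to-one matching between detections and labels.
    """
    if not ground_truth:
        return 0.0
    kept = [box for box, score in detections if score >= box_threshold]
    found = sum(
        1 for gt in ground_truth if any(iou(gt, box) >= iou_min for box in kept)
    )
    return found / len(ground_truth)


# Two small icons; the second is detected only with low confidence.
ground_truth = [(10, 10, 26, 26), (40, 10, 56, 26)]
detections = [((10, 10, 26, 26), 0.90), ((41, 11, 57, 27), 0.04)]

for threshold in (0.05, 0.01):
    print(threshold, recall_at_threshold(detections, ground_truth, threshold))
```

With this sample data, recall is 0.5 at a threshold of 0.05 and 1.0 at 0.01, which illustrates how a low-confidence small icon is dropped by a stricter threshold. Lowering the threshold typically also admits more false positives, so precision should be tracked alongside recall.
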


abrichr mentioned this issue Nov 3, 2024

Finetuning OmniParser #3 OpenAdaptAI/OpenAdapter#3

