zensvi.cv.ObjectDetector

class zensvi.cv.ObjectDetector(config_path: str = os.path.join(current_dir, 'config/GroundingDINO_SwinT_OGC.py'), weights_path: str = str(models_dir / 'groundingdino' / 'groundingdino_swint_ogc.pth'), text_prompt: str = 'tree . building .', box_threshold: float = 0.35, text_threshold: float = 0.25, verbosity: int = 1, device=None)

Class for detecting objects in images using the GroundingDINO model.

This class provides functionality to detect objects in images using the GroundingDINO model. It can process single images or directories of images, annotate them with bounding boxes and labels, and save detection summaries in various formats.

Parameters:
  • config_path (str, optional) – Path to GroundingDINO config file. Defaults to included config.

  • weights_path (str, optional) – Path to model weights file. Defaults to included weights.

  • text_prompt (str, optional) – Text prompt for object detection. Defaults to “tree . building .”.

  • box_threshold (float, optional) – Confidence threshold for box detection. Defaults to 0.35.

  • text_threshold (float, optional) – Confidence threshold for text. Defaults to 0.25.

  • verbosity (int, optional) – Level of verbosity for progress bars. Defaults to 1. 0 = no progress bars, 1 = outer loops only, 2 = all loops.

  • device (str, optional) – Device to use for inference: “cpu”, “cuda”, or “mps”. Defaults to None.
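For example, a detector can be instantiated with the bundled config and weights, overriding only the prompt and thresholds (the values shown below are the documented defaults):

    from zensvi.cv import ObjectDetector

    # Uses the included GroundingDINO config and weights; only the
    # prompt and thresholds are passed explicitly here.
    detector = ObjectDetector(
        text_prompt="tree . building .",
        box_threshold=0.35,
        text_threshold=0.25,
        verbosity=1,
    )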

Attributes:

model

The loaded GroundingDINO model.

text_prompt

Text prompt used for detection.

Type: str

box_threshold

Box confidence threshold.

Type: float

text_threshold

Text confidence threshold.

Type: float

model_lock

Lock for thread-safe model inference.

Type: threading.Lock

verbosity

Level of verbosity for progress reporting.

Type: int

device

The device used for inference. Options: “cpu”, “cuda”, or “mps”.

detect_objects(dir_input: str | pathlib.Path, dir_image_output: str | pathlib.Path | None = None, dir_summary_output: str | pathlib.Path | None = None, save_format: str = 'json', max_workers: int = 4, verbosity: int | None = None, group_by_object: bool = False)

Detect objects in images and save results.

Processes images from the input file or directory and saves annotated images and/or detection summaries. Only unprocessed images (those without an existing annotated version) are handled.

Parameters:
  • dir_input (Union[str, Path]) – Input image file or directory path.

  • dir_image_output (Union[str, Path, None], optional) – Directory to save annotated images. If None, no images are saved and dir_summary_output must be provided.

  • dir_summary_output (Union[str, Path, None], optional) – Directory to save detection summaries. If None, no summary data is saved and dir_image_output must be provided.

  • save_format (str, optional) – Format for saving summaries (“json”, “csv”, or “json csv”). Defaults to “json”.

  • max_workers (int, optional) – Maximum number of parallel workers. Defaults to 4.

  • verbosity (int, optional) – Level of verbosity for progress bars. If None, uses the instance’s verbosity level. 0 = no progress bars, 1 = outer loops only, 2 = all loops.

  • group_by_object (bool, optional) – If True, groups detections by object type per image and counts occurrences. If False, returns detailed detection data. Defaults to False.

Raises:
  • ValueError – If dir_input is neither a file nor directory.

  • ValueError – If neither dir_image_output nor dir_summary_output is provided.
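A typical call might look as follows; the paths are hypothetical and should point to your own data:

    # Hypothetical paths -- replace with your own input/output locations.
    detector.detect_objects(
        dir_input="data/street_view_images",
        dir_image_output="output/annotated",
        dir_summary_output="output/summaries",
        save_format="json csv",   # write both JSON and CSV summaries
        max_workers=4,
        group_by_object=True,     # per-image object counts instead of raw detections
    )

Because at least one output directory is required, omitting both dir_image_output and dir_summary_output raises a ValueError, as noted above.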