
Run baseline captioning against one of the datasets identified in Unified-IO #7

@danielnapierski

Description


The unified-io isi saga-cluster demo does more than baseline captioning; it also performs object detection.

The task here would be to write a script (or otherwise implement a feature), included in the Docker build, that allows the container to run captioning only. The output would need to be recorded/logged.

Object Detection branch:

https://github.com/isi-vista/unified-io-inference/blob/object-detection/Dockerfile

https://github.com/isi-vista/unified-io-inference/blob/object-detection/run.py

The Object Detection branch has a Dockerfile that defines run.py as the entry point. To run captioning only, a caption.py script could be written and passed as an argument to the docker run command so that it executes instead of the default entrypoint (run.py). caption.py would load the model and send only the captioning prompt for each image listed in a file.
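A minimal sketch of what caption.py could look like. Note the model-loading and caption-generation calls here are hypothetical stand-ins (the real script would reuse the model setup from run.py), and the prompt string is an assumption, not taken from the demo code:

```python
"""caption.py: run only the captioning prompt over a list of images."""
import json
import logging
import sys

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("caption")

# Assumed prompt text; the real prompt would come from the Unified-IO demo code.
CAPTION_PROMPT = "What does the image describe ?"

def load_model():
    """Hypothetical stand-in for loading the Unified-IO model as run.py does."""
    def generate_caption(image_path, prompt):
        # Stub: the real model would return a generated caption here.
        return "caption for " + image_path
    return generate_caption

def caption_images(image_list_path, output_path):
    """Caption every image listed (one path per line) and record the results."""
    model = load_model()
    with open(image_list_path) as f:
        image_paths = [line.strip() for line in f if line.strip()]
    results = {}
    for path in image_paths:
        caption = model(path, CAPTION_PROMPT)
        log.info("%s -> %s", path, caption)  # log each result as required
        results[path] = caption
    with open(output_path, "w") as out:
        json.dump(results, out, indent=2)  # persist the recorded output
    return results

if __name__ == "__main__" and len(sys.argv) > 2:
    caption_images(sys.argv[1], sys.argv[2])
```

The script takes the image-list path and an output path on the command line, so the same image inside the container works for both the default entrypoint and this captioning-only mode.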

I explored the visual_genome dataset and found that its associated Python tools worked in a Python 3.6 environment but not in the more current 3.9. This presented a challenge for inclusion in the existing Docker image, which has only a single environment. It can be addressed by adding another environment to the image.

I explored using the vizwiz tools and started a script to read the JSON caption annotations, but have not yet completed it.
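The annotation-reading step could be sketched as below. This assumes the VizWiz caption annotations follow the COCO-style layout (top-level "images" and "annotations" lists); the field names reflect that assumption:

```python
import json
from collections import defaultdict

def load_caption_annotations(path):
    """Group ground-truth captions by image file name.

    Assumes a COCO-style annotation file: "images" entries carry
    "id" and "file_name", and "annotations" entries carry
    "image_id" and "caption".
    """
    with open(path) as f:
        data = json.load(f)
    # Map numeric image ids to file names.
    id_to_name = {img["id"]: img["file_name"] for img in data["images"]}
    # Collect every caption written for each image.
    captions = defaultdict(list)
    for ann in data["annotations"]:
        captions[id_to_name[ann["image_id"]]].append(ann["caption"])
    return dict(captions)
```

Keying the result by file name (rather than image id) makes it easy to line the reference captions up against the per-image output that caption.py would log.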
