A Gradio UI that uses the Hugging Face Transformers library for the image captioning (image-to-text) task.
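
As a rough illustration of how such an app could be wired together, here is a minimal sketch. It is not taken from this project: the checkpoint name `Salesforce/blip-image-captioning-base` and the helper names `extract_caption`, `load_captioner`, `caption_image`, and `build_demo` are assumptions chosen for the example. It relies on the Transformers `image-to-text` pipeline, which returns a list of dicts with a `generated_text` key, and on Gradio's `Interface`, `Image`, and `Textbox` components.

```python
from functools import lru_cache

# Assumed checkpoint for illustration; any image-to-text model should work.
DEFAULT_MODEL = "Salesforce/blip-image-captioning-base"


def extract_caption(outputs):
    """Return the caption string from an image-to-text pipeline result."""
    # The pipeline yields a list of dicts, each with a "generated_text" key.
    if not outputs:
        return ""
    return outputs[0].get("generated_text", "").strip()


@lru_cache(maxsize=1)
def load_captioner(model_name=DEFAULT_MODEL):
    """Create the captioning pipeline once and reuse it across requests."""
    # Imported lazily so the model is only downloaded on first use.
    from transformers import pipeline

    return pipeline("image-to-text", model=model_name)


def caption_image(image):
    """Caption a PIL image; return an empty string if no image was given."""
    if image is None:
        return ""
    return extract_caption(load_captioner()(image))


def build_demo():
    """Build a Gradio interface: image in, caption text out."""
    import gradio as gr

    return gr.Interface(
        fn=caption_image,
        inputs=gr.Image(type="pil", label="Image"),
        outputs=gr.Textbox(label="Caption"),
        title="Image Captioning",
    )

# To start the app locally: build_demo().launch()
```

Caching the pipeline avoids reloading the model on every request, and passing `type="pil"` to `gr.Image` hands the pipeline a PIL image, which it accepts directly.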