PIL Image conversion issues with Gemini API Parts #5033

xuefei-wang · 2025-01-14T00:15:53Z

What happened?

I ran into a type compatibility issue when passing images to the Gemini API. The images are converted to PIL Images but never to Part objects, so the Gemini client rejects them with a TypeError.

I created this conversion function (see below) that works for my use case and added it to autogen/oai/gemini.py. Just wanted to post it in case anyone needs it.

What did you expect to happen?

Error message:

TypeError: Parameter to MergeFrom() must be instance of same class: expected Part got PIL.PngImagePlugin.PngImageFile.

How can we reproduce it (as minimally and precisely as possible)?

from dotenv import load_dotenv

load_dotenv()

import os
from autogen import UserProxyAgent
from autogen.agentchat.contrib.multimodal_conversable_agent import (
    MultimodalConversableAgent,
)


visual_critic_agent = MultimodalConversableAgent(
    "visual_critic_agent",
    llm_config={
        "config_list": [
            {
                "model": "gemini-1.5-flash",
                "api_key": os.environ["GEMINI_API_KEY"],
                "api_type": "google",
            }
        ],
        "cache_seed": None,
    },
)

user_agent = UserProxyAgent(
    "user_agent", human_input_mode="ALWAYS", max_consecutive_auto_reply=0
)


user_agent.initiate_chat(
    visual_critic_agent,
    message="""Please tell me what is in this image?
    <img https://goldenmeadowsretrievers.com/wp-content/uploads/2023/08/golden-retriever-dog-with-newborn-golden-retriever.jpg>
""",
)

AutoGen version

0.4.1

Which package was this bug in

Core

Model used

gemini

Python version

No response

Operating system

No response

Any additional info you think would be helpful for fixing this bug

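from io import BytesIO

from PIL import Image
# Assumed import path for the Gemini protobuf types; adjust to match autogen/oai/gemini.py.
from google.ai.generativelanguage import Blob, Part


def _pil_to_part(image: Image.Image) -> Part:
    # Fall back to PNG when the image has no recorded format (e.g. it was created in memory).
    fmt = image.format or "PNG"
    byte_arr = BytesIO()
    image.save(byte_arr, format=fmt)

    # Wrap the encoded bytes in a Blob so the Gemini client accepts them as inline data.
    blob = Blob(mime_type=f"image/{fmt.lower()}", data=byte_arr.getvalue())
    return Part(inline_data=blob)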


def _convert_pil_images_in_parts(curr_parts):
    """
    Converts any PIL Images in a list of parts to Part objects while preserving other parts.
    
    Args:
        curr_parts: List of mixed content (PIL Images and Parts)
        
    Returns:
        List where all PIL Images have been converted to Parts
    """
    updated_parts = []
    for part in curr_parts:
        if isinstance(part, Image.Image):
            updated_parts.append(_pil_to_part(part))
        else:
            updated_parts.append(part)
    return updated_parts
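
One edge case in _pil_to_part: the MIME type is built directly from image.format, so a PIL format such as MPO would produce image/mpo, which the API may not accept. A minimal sketch of one way to guard against that follows. The helper name pick_format_and_mime and the allowlist are hypothetical and not part of autogen or the Gemini SDK; the set of accepted MIME types is an assumption to verify against the Gemini documentation.

```python
# Hypothetical helper, not part of autogen or the Gemini SDK.
# ASSUMPTION: this allowlist is illustrative only; check the Gemini
# documentation for the MIME types your model actually accepts.
_ASSUMED_SUPPORTED = {
    "PNG": "image/png",
    "JPEG": "image/jpeg",
    "WEBP": "image/webp",
}


def pick_format_and_mime(pil_format):
    """Return (save_format, mime_type) for a PIL format name such as 'JPEG', or None."""
    key = (pil_format or "").upper()
    if key in _ASSUMED_SUPPORTED:
        return key, _ASSUMED_SUPPORTED[key]
    # Unknown or missing format: re-encode as PNG rather than guess a MIME type.
    return "PNG", "image/png"
```

Inside _pil_to_part, the returned pair could replace the two inline format expressions: pass the first element to image.save(..., format=...) and the second to Blob(mime_type=...). Note that re-encoding to PNG can fail for some image modes, so this is a starting point rather than a complete fix.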


ekzhu · 2025-01-14T01:44:57Z

Thanks for the issue.

I believe this has already been fixed in 0.4.1. The code you are showing uses the 0.2 API.

Would you like to submit a fix to the 0.2 package?

Make sure you are using autogen-agentchat and autogen-ext. See the README.

github-actions bot added the needs-triage label Jan 14, 2025

ekzhu added the awaiting-op-response Issue or pr has been triaged or responded to and is now awaiting a reply from the original poster label Jan 14, 2025
