PhotoPrism Pro: Evaluate LLMs and Prompts

Using Open WebUI to evaluate large language models (LLMs) freely available via Ollama

Introduction

Classic AI models for automatic image recognition provide general categories such as "person" or "building," but often fail to identify the industry-specific terms that are crucial for commercial use. This is where modern models that understand both language and images, commonly referred to as large language models (LLMs), come in. With Ollama, these models can be run locally, so no sensitive data ends up in external clouds. Open WebUI provides a user-friendly interface for trying out different models with custom prompts. This allows images to be labeled with tailored, industry-specific terms, such as "drinking water tank," "construction site pipe laying," or "employee training."

Why this works:

The models are trained to understand image content and translate it into text.
With a precisely formulated prompt, we can guide the models to choose the right technical terms.
Different models have different strengths. By testing several candidates, you can find the one that best suits your needs.

What should you pay attention to?

Language and precision: Instructions often work more reliably when written in English, even if the labels are requested in another language such as German.
Prompt length: Shorter, clear instructions usually lead to faster and more consistent results.
Consistency: Defining a fixed catalog of permitted labels reduces the risk of outliers.
Iterative approach: It is normal for the initial results to be less than optimal. Experience shows that gradual testing and refinement lead to a stable solution.
Resources: Since the models are executed locally, the speed depends on the available hardware.

Below you will find step-by-step instructions for installing and using Ollama with Open WebUI. This provides you with an intuitive test environment for creating keywords locally. Your images will not be shared with external cloud services and there are no usage-based costs.

Installation with Docker Compose

Both Ollama and Open WebUI are available as freely downloadable Docker images.

Which models can be used?

In principle, all Ollama vision models are suitable for describing images. However, since they differ significantly in reliability, size/speed, vocabulary, and supported output formats, not all models are suitable for practical use. For beginners, we recommend gemma3 from Google (USA) and qwen2.5vl from Alibaba (China). They are relatively compact, have been trained with high-quality data, and generally work very reliably.

Testing and Improving Prompts

Launch Open WebUI and select a model in the upper left corner.
Upload an image.
Enter the prompt and analyze the result.

We recommend testing each prompt variant with different models and images.

Change the prompt and test the new version in a new chat.
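
Once a prompt looks promising in Open WebUI, the same comparison across models can be scripted. The following is a minimal sketch, not part of the original instructions: it bypasses Open WebUI and sends the image straight to Ollama's REST API. It assumes Ollama is reachable on its default port 11434 and that the named models have already been pulled. The function names and the file name in the usage note are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: Ollama listens on its default port on the same machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, image_bytes):
    """Build the JSON payload for one image and one prompt."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "format": "json",   # ask the model to answer with valid JSON
        "stream": False,    # return a single response object
    }


def ask_model(model, prompt, image_bytes, url=OLLAMA_URL, timeout=300):
    """Send the request and return the raw text generated by the model."""
    body = json.dumps(build_request(model, prompt, image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["response"]


def compare_models(models, prompt, image_path):
    """Run the same prompt and image against several models."""
    with open(image_path, "rb") as handle:
        image_bytes = handle.read()
    return {model: ask_model(model, prompt, image_bytes) for model in models}
```

A call such as compare_models(["gemma3", "qwen2.5vl"], prompt, "site-photo.jpg") would then return one raw answer per model, which makes it easier to compare results side by side for the same image.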

Prompt Examples

Example without specifying labels:

ROLE You label photos for a {{CUSTOMER_CONTEXT}}. Prefer specific, industry terms over generic ones.

TASK Select the BEST {{LABEL_LANGUAGE}} labels for the image.

OUTPUT (JSON ONLY)

{"labels":[...]}

RULES

5–15 labels.
{{LABEL_LANGUAGE}} only; singular; lowercase; keep correct diacritics/accents.
No duplicates or explanations.
If no label fits, return {"labels": []}.

Example values

{{LABEL_LANGUAGE}} → fr-FR
{{CUSTOMER_CONTEXT}} → French automotive company
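
To try the same prompt with several customers or languages, the placeholders can be filled in programmatically instead of by hand. This is a small sketch under the assumption that placeholders always have the form {{UPPER_CASE_NAME}}; render_prompt and PROMPT_TEMPLATE are hypothetical names used only for illustration.

```python
import re

# The first example prompt from above, with its placeholders left in place.
PROMPT_TEMPLATE = """ROLE You label photos for a {{CUSTOMER_CONTEXT}}. Prefer specific, industry terms over generic ones.

TASK Select the BEST {{LABEL_LANGUAGE}} labels for the image.

OUTPUT (JSON ONLY)

{"labels":[...]}

RULES

5–15 labels.
{{LABEL_LANGUAGE}} only; singular; lowercase; keep correct diacritics/accents.
No duplicates or explanations.
If no label fits, return {"labels": []}.
"""


def render_prompt(template, values):
    """Replace every {{NAME}} placeholder; fail loudly if a value is missing."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for placeholder {name}")
        return values[name]

    # Only double-braced upper-case names match, so the JSON braces stay intact.
    return re.sub(r"\{\{([A-Z_]+)\}\}", substitute, template)


prompt = render_prompt(
    PROMPT_TEMPLATE,
    {
        "LABEL_LANGUAGE": "fr-FR",
        # No leading article here: the template already says "for a ...".
        "CUSTOMER_CONTEXT": "French automotive company",
    },
)
print(prompt)
```

Raising an error for a missing value is a deliberate choice: a prompt that still contains an unreplaced placeholder tends to produce confusing results that are hard to trace back.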

Example with a defined list of labels:

ROLE You label photos for a {{CUSTOMER_CONTEXT}}. Prefer specific, industry terms over generic ones.

TASK Select the BEST {{LABEL_LANGUAGE}} labels for the image ONLY from the allowed list below.

Allowed labels ({{LABEL_LANGUAGE}}): {{ALLOWED_LABELS_LIST}}

OUTPUT (JSON ONLY)

{"labels":[...]}

RULES

5–15 labels.
ONLY from the allowed list.
Use correct accents/diacritics.
No duplicates or explanations.
If no label fits, return {"labels": []}.

Example values

{{LABEL_LANGUAGE}} → de-DE
{{CUSTOMER_CONTEXT}} → German construction company (Bauunternehmen)
{{ALLOWED_LABELS_LIST}} → baustelle, aushub, schalung, beton, bewehrung, rohrleitung, graben, vermessung, kran, baubüro, arbeitssicherheit, beschilderung, abnahme, entwässerung, erschließung, asphaltierung, brücke, gründung, qualitätskontrolle
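
When comparing models, it helps to check each answer against the prompt's rules automatically. The sketch below is one possible way to do that, assuming the model's answer is available as a plain string; check_labels is a hypothetical helper, and the allowed labels are copied from the example values above.

```python
import json

# Allowed labels from the example values above.
ALLOWED_LABELS = {
    "baustelle", "aushub", "schalung", "beton", "bewehrung", "rohrleitung",
    "graben", "vermessung", "kran", "baubüro", "arbeitssicherheit",
    "beschilderung", "abnahme", "entwässerung", "erschließung",
    "asphaltierung", "brücke", "gründung", "qualitätskontrolle",
}


def check_labels(raw, allowed, min_labels=5, max_labels=15):
    """Return a list of rule violations; an empty list means the answer passes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]

    labels = data.get("labels") if isinstance(data, dict) else None
    if not isinstance(labels, list) or not all(isinstance(x, str) for x in labels):
        return ['expected {"labels": [...]} with string entries']

    problems = []
    # An empty list is explicitly permitted when no label fits.
    if labels and not (min_labels <= len(labels) <= max_labels):
        problems.append(f"expected {min_labels}-{max_labels} labels, got {len(labels)}")
    if len(set(labels)) != len(labels):
        problems.append("duplicate labels")
    unknown = sorted(set(labels) - allowed)
    if unknown:
        problems.append("not in allowed list: " + ", ".join(unknown))
    return problems


good = '{"labels": ["baustelle", "kran", "beton", "graben", "aushub"]}'
bad = '{"labels": ["kran", "kran", "bagger"]}'
print(check_labels(good, ALLOWED_LABELS))
print(check_labels(bad, ALLOWED_LABELS))
```

Counting how often each model produces an empty problem list over a set of test images gives a simple, repeatable measure of how reliably it follows the prompt.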
