From 676a3f19d8255c03bdc27ad502e69e3a08fae9f6 Mon Sep 17 00:00:00 2001
From: Al-Tahir_Roman
Date: Tue, 21 Apr 2026 20:38:44 +0000
Subject: [PATCH] Update README.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 README.md | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index 6afc938..256dec3 100644
--- a/README.md
+++ b/README.md
@@ -40,16 +40,17 @@
 - Does not require a reference image (reference-free).
 - A widely accepted standard in the community.
 - **Source**: [Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020) (Radford et al.)
+- [clip-score library used for evaluation](https://github.com/Taited/clip-score/)
 
 ### Implementation details
 
-```python
-# Pseudocode for computing CLIP-score
-from transformers import CLIPProcessor, CLIPModel
+```bash
+# Computing CLIP-score with the clip-score CLI
+pip install transformers==4.25.1
+pip install torch
+pip install clip-score
 
-model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
-processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
+python -m clip_score .\images .\texts
 
-inputs = processor(text=prompt, images=image, return_tensors="pt", padding=True)
-outputs = model(**inputs)
-score = outputs.logits_per_image.softmax(dim=1).item()  # or plain cosine similarity
\ No newline at end of file
+# Sample output: CLIP Score: 0.3308749198913574
+```
\ No newline at end of file
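For reference, the quantity behind the "CLIP Score" line the CLI prints can be sketched as a plain cosine similarity between an image embedding and a text embedding. This is a minimal illustration on pre-extracted embedding vectors, not the clip-score package's actual code: the function name is mine, the assumption that the reported score is an unscaled cosine similarity is mine, and the real tool first extracts both embeddings with a CLIP model (as the pseudocode this patch removes did).

```python
import math

def clip_cosine_score(image_embed, text_embed):
    """Cosine similarity between one image embedding and one text embedding.

    Illustrative only: assumes the score reported by the clip-score CLI is
    an unscaled cosine similarity over CLIP embeddings.
    """
    dot = sum(i * t for i, t in zip(image_embed, text_embed))
    norm_img = math.sqrt(sum(i * i for i in image_embed))
    norm_txt = math.sqrt(sum(t * t for t in text_embed))
    return dot / (norm_img * norm_txt)

# Parallel vectors score 1.0; orthogonal vectors score 0.0.
print(clip_cosine_score([3.0, 4.0], [6.0, 8.0]))  # 1.0
```

Real CLIP embeddings of a matching image-text pair land well below 1.0, which is why a score around 0.33, as in the sample output above, is typical for this metric.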