{"componentChunkName":"component---src-templates-acg-portal-new-template-tsx","path":"/tmpbaqyho","result":{"data":{"markdownRemark":{"html":"<h2 id=\"简介\"><a href=\"#%E7%AE%80%E4%BB%8B\" aria-label=\"简介 permalink\" class=\"anchor\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Introduction</h2>\n<p><code>ImageViTEmbedding</code> is an image embedding extraction component built on the Daft + Ray distributed framework. It uses ViT-family models (such as DINOv2) to encode images into fixed-dimension vector representations, suitable for downstream tasks such as image retrieval, classification, and clustering.</p>\n<h2 id=\"功能\"><a href=\"#%E5%8A%9F%E8%83%BD\" aria-label=\"功能 permalink\" class=\"anchor\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Features</h2>\n<ul>\n<li>Extracts ViT embedding vectors in batches from image URLs</li>\n<li>Uses Ray Actors for GPU-accelerated inference</li>\n<li>Supports multiple ViT models (such as <code>facebook/dinov2-large</code>)</li>\n<li>Supports FP16 inference to reduce GPU memory usage</li>\n</ul>\n<blockquote>\n<p>This operator depends on the corresponding model weight files. The platform provides the required weights in the public dataset <code>aihc_daft_public_model</code>; mount this dataset when creating the development workload to use the operator.</p>\n</blockquote>\n<h2 id=\"参数\"><a href=\"#%E5%8F%82%E6%95%B0\" aria-label=\"参数 permalink\" class=\"anchor\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" 
d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Parameters</h2>\n<table>\n<thead>\n<tr>\n<th>Parameter</th>\n<th>Type</th>\n<th>Default</th>\n<th>Description</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>image_src_type</td>\n<td>str</td>\n<td>image_url</td>\n<td>Format of the input images. Supported: BOS/HTTP address (image_url), base64 encoding (image_base64), binary stream (image_binary). Options: [\"image_url\", \"image_base64\", \"image_binary\"]. Default: \"image_url\".</td>\n</tr>\n<tr>\n<td>dtype</td>\n<td>str</td>\n<td>float32</td>\n<td>Model inference precision. bfloat16: balances precision and speed (faster on TPU); float16: faster inference; float32: highest precision but the largest GPU memory usage. Options: [\"bfloat16\", \"float16\", \"float32\"]. Default: \"float16\".</td>\n</tr>\n<tr>\n<td>batch_size</td>\n<td>int</td>\n<td>32</td>\n<td>Batch size. Default: 32.</td>\n</tr>\n<tr>\n<td>model_path</td>\n<td>str</td>\n<td>/opt/aihc/models</td>\n<td>Path where the model files are stored. Default: \"/opt/aihc/models\".</td>\n</tr>\n<tr>\n<td>model_name</td>\n<td>str</td>\n<td>facebook/dinov2-large</td>\n<td>Name of the image embedding model to use. Options: [\"google/vit-base-patch16-224-in21k\",\"google/vit-large-patch16-224-in21k\",\"facebook/dinov2-base\",\"facebook/dinov2-large\"]. Default: \"facebook/dinov2-large\".</td>\n</tr>\n<tr>\n<td>use_cls_token_embedding</td>\n<td>bool</td>\n<td>true</td>\n<td>Whether to use the CLS token feature. Default: True.</td>\n</tr>\n<tr>\n<td>rank</td>\n<td>int</td>\n<td>0</td>\n<td>Index of the GPU device to use (effective in multi-GPU environments). For example, 0 is the first GPU and 1 is the second. Default: 0.</td>\n</tr>\n</tbody>\n</table>\n<h2 id=\"输入\"><a href=\"#%E8%BE%93%E5%85%A5\" aria-label=\"输入 permalink\" class=\"anchor\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 
1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Input</h2>\n<table>\n<thead>\n<tr>\n<th>Input column</th>\n<th>Description</th>\n<th>Example</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>images</td>\n<td>An array of image data; each element may be an image URL, a Base64-encoded string, or binary data</td>\n<td>Image URL</td>\n</tr>\n</tbody>\n</table>\n<p>The input is passed as a Daft DataFrame; the column name is <code>image</code>.</p>\n<h2 id=\"输出\"><a href=\"#%E8%BE%93%E5%87%BA\" aria-label=\"输出 permalink\" class=\"anchor\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Output</h2>\n<p>An array of feature vectors. Each element is a nested array of floats whose dimension is determined by the model output.</p>\n<h2 id=\"使用示例\"><a href=\"#%E4%BD%BF%E7%94%A8%E7%A4%BA%E4%BE%8B\" aria-label=\"使用示例 permalink\" class=\"anchor\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Usage Example</h2>\n\n    <div class=\"code-block-wrapper\">\n        <div class=\"code-block\">\n            <div class=\"code-block-header\">\n                <span class=\"code-block-name\">Plain Text</span>\n                <button class=\"code-copy-btn\" 
data-tooltip-text=\"\">\n                    <svg xmlns=\"http://www.w3.org/2000/svg\" width=\"16\" height=\"16\" viewBox=\"0 0 16 16\" fill=\"none\"> <path fill-rule=\"evenodd\" clip-rule=\"evenodd\" d=\"M5.57894 3.45614C5.57894 3.38832 5.63392 3.33333 5.70175 3.33333H12.5439C12.6117 3.33333 12.6667 3.38832 12.6667 3.45614V10.2982C12.6667 10.3661 12.6117 10.4211 12.5439 10.4211H11.7544V5.70175C11.7544 4.89754 11.1025 4.24561 10.2982 4.24561H5.57894V3.45614ZM4.24561 4.24561V3.45614C4.24561 2.65194 4.89754 2 5.70175 2H12.5439C13.3481 2 14 2.65194 14 3.45614V10.2982C14 11.1025 13.3481 11.7544 12.5439 11.7544H11.7544V12.5439C11.7544 13.3481 11.1025 14 10.2982 14H3.45614C2.65194 14 2 13.3481 2 12.5439V5.70175C2 4.89754 2.65194 4.24561 3.45614 4.24561H4.24561ZM3.33333 5.70175C3.33333 5.63392 3.38832 5.57894 3.45614 5.57894H10.2982C10.3661 5.57894 10.4211 5.63392 10.4211 5.70175V12.5439C10.4211 12.6117 10.3661 12.6667 10.2982 12.6667H3.45614C3.38832 12.6667 3.33333 12.6117 3.33333 12.5439V5.70175Z\" fill=\"currentColor\"></path> </svg>\n                    复制\n                </button>\n            </div>\n            <div class=\"code-block-content\">\n                <pre class=\"language-text\"><code><span class=\"line-number\">1</span>from __future__ import annotations\n<span class=\"line-number\">2</span>\n<span class=\"line-number\">3</span>import os\n<span class=\"line-number\">4</span>import daft\n<span class=\"line-number\">5</span>from daft import col\n<span class=\"line-number\">6</span>from daft.aihc.common.udf import aihc_udf\n<span class=\"line-number\">7</span>from daft.aihc.functions.image.embedding.image_vit_embedding import ImageViTEmbedding\n<span class=\"line-number\">8</span>\n<span class=\"line-number\">9</span>if __name__ == &quot;__main__&quot;:\n<span class=\"line-number\">10</span>    if os.getenv(&quot;DAFT_RUNNER&quot;, &quot;native&quot;) == &quot;ray&quot;:\n<span class=\"line-number\">11</span>        import ray\n<span 
class=\"line-number\">12</span>        ray.init(dashboard_host=&quot;0.0.0.0&quot;, ignore_reinit_error=True)\n<span class=\"line-number\">13</span>        daft.set_runner_ray()\n<span class=\"line-number\">14</span>    daft.set_execution_config(actor_udf_ready_timeout=6000, min_cpu_per_task=0)\n<span class=\"line-number\">15</span>\n<span class=\"line-number\">16</span>    image_src_type = &quot;image_url&quot;\n<span class=\"line-number\">17</span>    batch_size = 64\n<span class=\"line-number\">18</span>    model_path = os.getenv(&quot;MODEL_PATH&quot;, &quot;/opt/aihc/models&quot;)\n<span class=\"line-number\">19</span>    model_name = &quot;facebook/dinov2-large&quot;\n<span class=\"line-number\">20</span>    dtype = &quot;float16&quot;\n<span class=\"line-number\">21</span>    use_cls_token_embedding = True\n<span class=\"line-number\">22</span>    rank = 0\n<span class=\"line-number\">23</span>    num_gpus = 1\n<span class=\"line-number\">24</span>\n<span class=\"line-number\">25</span>    samples = {\n<span class=\"line-number\">26</span>        &quot;image&quot;: [\n<span class=\"line-number\">27</span>            &quot;https://{bucket}.bj.bcebos.com/image.png&quot;,\n<span class=\"line-number\">28</span>        ]\n<span class=\"line-number\">29</span>    }\n<span class=\"line-number\">30</span>\n<span class=\"line-number\">31</span>    ds = daft.from_pydict(samples)\n<span class=\"line-number\">32</span>    ds = ds.with_column(\n<span class=\"line-number\">33</span>        &quot;embedding&quot;,\n<span class=\"line-number\">34</span>        aihc_udf(\n<span class=\"line-number\">35</span>            ImageViTEmbedding,\n<span class=\"line-number\">36</span>            construct_args={\n<span class=\"line-number\">37</span>                &quot;image_src_type&quot;: image_src_type,\n<span class=\"line-number\">38</span>                &quot;batch_size&quot;: batch_size,\n<span class=\"line-number\">39</span>                &quot;model_path&quot;: 
model_path,\n<span class=\"line-number\">40</span>                &quot;model_name&quot;: model_name,\n<span class=\"line-number\">41</span>                &quot;dtype&quot;: dtype,\n<span class=\"line-number\">42</span>                &quot;use_cls_token_embedding&quot;: use_cls_token_embedding,\n<span class=\"line-number\">43</span>                &quot;rank&quot;: rank,\n<span class=\"line-number\">44</span>            },\n<span class=\"line-number\">45</span>            num_gpus=num_gpus,\n<span class=\"line-number\">46</span>            batch_size=1,\n<span class=\"line-number\">47</span>        )(col(&quot;image&quot;)),\n<span class=\"line-number\">48</span>    )\n<span class=\"line-number\">49</span>\n<span class=\"line-number\">50</span>    ds.show()\n<span class=\"line-number\">51</span>#╭────────────────────────────────┬────────────────────────────────╮                                                                                                                                            \n<span class=\"line-number\">52</span>#│ image                          ┆ embedding                      │\n<span class=\"line-number\">53</span>#│ ---                            ┆ ---                            │\n<span class=\"line-number\">54</span>#│ String                         ┆ List[Float32]                  │\n<span class=\"line-number\">55</span>#╞════════════════════════════════╪════════════════════════════════╡\n<span class=\"line-number\">56</span>#│ https://{bucket}.bj.bcebos.com/┆ [0.00059747696, -0.02935791, … │\n<span class=\"line-number\">57</span>#╰────────────────────────────────┴────────────────────────────────╯</code></pre>\n            </div>\n        </div>\n    </div>\n  ","fields":{"slug":"tmpbaqyho","title":"图像 Embedding（ViT 
系列模型）","date":"2026-06-04","extractedHeadings":[]},"headings":[{"value":"简介","depth":2},{"value":"功能","depth":2},{"value":"参数","depth":2},{"value":"输入","depth":2},{"value":"输出","depth":2},{"value":"使用示例","depth":2}]}},"pageContext":{"isCreatedByStatefulCreatePages":false,"slug":"tmpbaqyho","prev":{"id":"Qmo9zz82x","name":"图像重采样处理器","path":"Qmo9zz82x","filePath":"操作指南/AI数据处理/算子列表/图片/图像重采样处理器.md","seo":null,"parentIds":["ilib2qygp","Ymo88m8hi","Imob3m6so","6mob45l3j"],"parents":[{"id":"ilib2qygp","documentId":"bfa43a8b-968a-41a1-8c9d-906507eeaed9","name":"操作指南","repoName":"AIHC","filePath":"操作指南","disabled":false,"path":"ilib2qygp","lastMergeTime":null,"isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":null},{"id":"Ymo88m8hi","documentId":"c8cb5e38-f8c5-40f4-a424-b0c7895f0c0a","name":"AI数据处理","repoName":"AIHC","filePath":"操作指南/AI数据处理","disabled":false,"path":"Ymo88m8hi","lastMergeTime":"2026-04-21 14:23:10","isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":""},{"id":"Imob3m6so","documentId":"fe548e34-6659-4ff5-86f6-eee2c43aec90","name":"算子列表","repoName":"AIHC","filePath":"操作指南/AI数据处理/算子列表","disabled":false,"path":"Imob3m6so","lastMergeTime":null,"isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":""},{"id":"6mob45l3j","documentId":"f121147a-d538-4da6-b212-1afecb2ecd42","name":"图片","repoName":"AIHC","filePath":"操作指南/AI数据处理/算子列表/图片","disabled":false,"path":"6mob45l3j","lastMergeTime":null,"isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":""}]},"next":{"id":"1moa00wo1","name":"音频","path":"1moa00wo1","filePath":"操作指南/AI数据处理/算子列表/音频/音频格式标准化处理器.md","seo":null,"parentIds":["ilib2qygp","Ymo88m8hi","Imob3m6so","zmob46e1c"],"parents":[{"id":"ilib2qygp","documentId":"bfa43a8b-968a-41a1-8c9d-906507eeaed9","name":"操作指南","repoName":"AIHC","filePath":"操作指南","dis
abled":false,"path":"ilib2qygp","lastMergeTime":null,"isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":null},{"id":"Ymo88m8hi","documentId":"c8cb5e38-f8c5-40f4-a424-b0c7895f0c0a","name":"AI数据处理","repoName":"AIHC","filePath":"操作指南/AI数据处理","disabled":false,"path":"Ymo88m8hi","lastMergeTime":"2026-04-21 14:23:10","isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":""},{"id":"Imob3m6so","documentId":"fe548e34-6659-4ff5-86f6-eee2c43aec90","name":"算子列表","repoName":"AIHC","filePath":"操作指南/AI数据处理/算子列表","disabled":false,"path":"Imob3m6so","lastMergeTime":null,"isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":""},{"id":"zmob46e1c","documentId":"886f1d97-777e-4f0c-81ae-e69bf16653f0","name":"音频","repoName":"AIHC","filePath":"操作指南/AI数据处理/算子列表/音频","disabled":false,"path":"zmob46e1c","lastMergeTime":null,"isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":""}]},"parents":[{"id":"ilib2qygp","documentId":"bfa43a8b-968a-41a1-8c9d-906507eeaed9","name":"操作指南","repoName":"AIHC","filePath":"操作指南","disabled":false,"path":"ilib2qygp","lastMergeTime":null,"isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":null},{"id":"Ymo88m8hi","documentId":"c8cb5e38-f8c5-40f4-a424-b0c7895f0c0a","name":"AI数据处理","repoName":"AIHC","filePath":"操作指南/AI数据处理","disabled":false,"path":"Ymo88m8hi","lastMergeTime":"2026-04-21 
14:23:10","isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":""},{"id":"Imob3m6so","documentId":"fe548e34-6659-4ff5-86f6-eee2c43aec90","name":"算子列表","repoName":"AIHC","filePath":"操作指南/AI数据处理/算子列表","disabled":false,"path":"Imob3m6so","lastMergeTime":null,"isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":""},{"id":"6mob45l3j","documentId":"f121147a-d538-4da6-b212-1afecb2ecd42","name":"图片","repoName":"AIHC","filePath":"操作指南/AI数据处理/算子列表/图片","disabled":false,"path":"6mob45l3j","lastMergeTime":null,"isApiDoc":null,"httpMethod":null,"seo":null,"sourceOrgName":null,"sourceRepoName":null,"sourceDocumentId":""}],"specificSeo":null}}}