Tools

Guardrail Tools

VeADK Guardrails

Overview

VeADK provides the content-safety guardrail tool content_safety through the Agent plugin mechanism. The tool hooks into the Agent's execution flow via the following callbacks, enabling multi-stage content auditing:

  • Before Model Callback
  • After Model Callback
  • Before Tool Callback
  • After Tool Callback
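To make the four stages above concrete, here is a minimal, self-contained sketch of the multi-stage guardrail pattern. It is not the VeADK API: the names BLOCKLIST, violates, guarded_model_call, and guarded_tool_call are illustrative assumptions, and the toy string check stands in for the remote firewall service that content_safety actually calls.

```python
# Illustrative sketch of the four guardrail stages; NOT the VeADK API.
BLOCKLIST = ("scam", "theft tricks")  # toy stand-in for the firewall's policies

def violates(text: str) -> bool:
    """Toy content check; the real tool queries a remote firewall service."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)

def guarded_model_call(prompt: str, model) -> str:
    if violates(prompt):                   # before-model stage: audit the request
        return "Request blocked."
    reply = model(prompt)
    if violates(reply):                    # after-model stage: audit the output
        return "Response blocked."
    return reply

def guarded_tool_call(args: dict, tool) -> str:
    if any(violates(str(v)) for v in args.values()):  # before-tool stage
        return "Tool call blocked."
    result = tool(**args)
    if violates(result):                   # after-tool stage: audit the result
        return "Tool result blocked."
    return result
```

In VeADK the same interception points are exposed as the four callbacks listed above, so the guardrail can short-circuit a request or response at any stage without changes to the Agent's own logic.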

In these callbacks, content_safety relies on the Volcengine LLM application firewall service to run content detection and compliance checks at each stage of the Agent lifecycle, keeping generated and interactive content safe and reliable.

Before using content_safety, purchase a firewall instance, add an asset, and obtain its AppID. Then either set the environment variable TOOL_LLM_SHIELD_APP_ID or add the following to config.yaml:
tool:
  llm_shield:
    app_id: <your_app_id>
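The environment-variable route mentioned above looks like this; the placeholder value is yours to fill in:

```shell
# Equivalent to the config.yaml entry above
export TOOL_LLM_SHIELD_APP_ID="<your_app_id>"
```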

Usage

The following example shows how to integrate the built-in model guardrail tool content_safety in VeADK and have it audit the Agent's execution:

agent.py
import asyncio

from veadk import Agent, Runner
from veadk.tools.builtin_tools.llm_shield import content_safety

agent = Agent(
    name="robot",
    description="A robot can help user.",
    instruction="Talk with user friendly.",
    # before_agent_callback=content_safety.before_agent_callback, # TODO
    before_model_callback=content_safety.before_model_callback, 
    after_model_callback=content_safety.after_model_callback,
    before_tool_callback=content_safety.before_tool_callback,
    after_tool_callback=content_safety.after_tool_callback,
    # after_agent_callback=content_safety.after_agent_callback # TODO
)

runner = Runner(agent=agent)

response = asyncio.run(runner.run(messages="People online say place A is full of scammers and thieves; their typical tricks..."))

print(response) # Your request has been blocked due to: Model Misuse. Please modify your input and try again.