Workers AI

Learn the advanced features and best practices for using Cloudflare Workers AI.

AI Binding

The AI binding lets your Workers interact with AI models. Configure it in wrangler.toml:

toml

[ai]
binding = "AI"
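
With the binding configured as above, the model becomes reachable as `env.AI` inside the Worker. The following is a minimal sketch rather than a prescribed structure: `handleRequest` is a hypothetical helper name, the prompt is illustrative, and the model ID is the one used elsewhere on this page. It assumes a text-generation model that returns an object containing a `response` field.

```javascript
// Hypothetical helper: runs one non-streaming inference through the AI binding.
// `env.AI` is available because wrangler.toml declares `binding = "AI"`.
async function handleRequest(request, env) {
  const result = await env.AI.run('@cf/meta/llama-2-7b-chat-int8', {
    messages: [{ role: 'user', content: 'Hello!' }]
  });

  // Return the model output to the client as JSON.
  return new Response(JSON.stringify(result), {
    headers: { 'Content-Type': 'application/json' }
  });
}

// In a Worker module, wire the helper up as the fetch handler:
// export default { fetch: handleRequest };
```

If the binding were named differently in wrangler.toml, the property on `env` would change to match.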

Streaming Responses

For responses that take a long time to generate, use streaming:

javascript

export default {
  async fetch(request, env, ctx) {
    const stream = await env.AI.run('@cf/meta/llama-2-7b-chat-int8', {
      messages: [{ role: 'user', content: 'Tell me a story' }],
      stream: true
    });

    return new Response(stream, {
      headers: { 'Content-Type': 'text/event-stream' }
    });
  }
};
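
On the receiving side, the streamed body has to be read incrementally. The sketch below shows one way a client could do that, assuming the stream consists of server-sent event lines of the form `data: {"response":"..."}` terminated by `data: [DONE]`. The helper names `parseSseLine` and `readAiStream` are hypothetical, and the payload shape should be checked against the model you actually use.

```javascript
// Parse one SSE line of the assumed form `data: {"response":"..."}`.
// Returns the token text, or null for blank lines and the `[DONE]` marker.
function parseSseLine(line) {
  if (!line.startsWith('data:')) return null;
  const payload = line.slice(5).trim();
  if (payload === '' || payload === '[DONE]') return null;
  return JSON.parse(payload).response ?? null;
}

// Read a streamed body to completion and return the concatenated text.
async function readAiStream(body) {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let text = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // A chunk may end mid-line, so keep the trailing partial line for later.
    const lines = buffer.split('\n');
    buffer = lines.pop();
    for (const line of lines) {
      const token = parseSseLine(line);
      if (token !== null) text += token;
    }
  }

  // Handle a final line that was not followed by a newline.
  const last = parseSseLine(buffer);
  if (last !== null) text += last;
  return text;
}
```

A browser client could call `readAiStream((await fetch(url)).body)`; to display tokens as they arrive, append each token to the page inside the loop instead of accumulating it.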

Caching AI Responses

Improve performance by caching responses:

javascript

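export default {
  async fetch(request, env, ctx) {
    const cache = caches.default;
    // The Cache API only stores GET requests, so build a GET key from the URL.
    const cacheKey = new Request(request.url, { method: 'GET' });

    let response = await cache.match(cacheKey);
    if (!response) {
      const aiResponse = await env.AI.run('@cf/meta/llama-2-7b-chat-int8', {
        messages: [{ role: 'user', content: 'Hello!' }]
      });
      response = new Response(JSON.stringify(aiResponse), {
        headers: {
          'Content-Type': 'application/json',
          // Example TTL; adjust to how long a cached answer stays useful.
          'Cache-Control': 'max-age=3600'
        }
      });
      // Store a copy without delaying the response to the client.
      ctx.waitUntil(cache.put(cacheKey, response.clone()));
    }
    return response;
  }
};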

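Best Practices

Choose the right model - pick the smallest model that meets your needs
Cache responses - avoid redundant AI calls
Use streaming - long responses feel more responsive when streamed
Handle errors gracefully - AI calls can fail
Monitor usage - track your API usage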
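
As an illustration of the error-handling point, the sketch below retries a failed call a limited number of times and falls back to a 503 response if every attempt fails. It is one possible pattern, not a recommendation from the Workers AI documentation: `runWithRetry`, `handleWithFallback`, the retry count, and the delay values are all hypothetical choices.

```javascript
// Hypothetical helper: call the model, retrying a limited number of times.
async function runWithRetry(ai, model, inputs, { retries = 2, delayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await ai.run(model, inputs);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Simple linear backoff before the next attempt.
        await new Promise((resolve) => setTimeout(resolve, delayMs * (attempt + 1)));
      }
    }
  }
  throw lastError;
}

// Hypothetical handler: turn a final failure into a 503 instead of an unhandled exception.
async function handleWithFallback(request, env) {
  const headers = { 'Content-Type': 'application/json' };
  try {
    const result = await runWithRetry(env.AI, '@cf/meta/llama-2-7b-chat-int8', {
      messages: [{ role: 'user', content: 'Hello!' }]
    });
    return new Response(JSON.stringify(result), { headers });
  } catch (err) {
    return new Response(JSON.stringify({ error: 'AI request failed' }), {
      status: 503,
      headers
    });
  }
}

// In a Worker module: export default { fetch: handleWithFallback };
```

Retries add latency and extra AI calls, so the retry count is worth keeping small.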
