
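1. Preface

In the era of artificial general intelligence (AGI), large models such as ChatGPT, Gemini, and Claude keep getting stronger, and user expectations are rising with them. The models' ambitions are no longer limited to being a know-it-all with an encyclopedia in hand; they want to take the next step and become something that is not only all-knowing but also all-capable.

A model that already knows everything and also wants to do everything needs to be given hands (by building APIs) and then set free (by granting it sufficient permissions). That is how we arrived at the Agent era.

An Agent is a large model with hands that can do things for you: a capable assistant and a more concrete, powerful tool. Under the hood it is essentially function calling. As I see it, the development went through these stages: [Function Calling] -> [MCP] -> [Agent]. Function Calling is a dedicated tool tied to a specific model; MCP standardizes it, making it more general and portable; an Agent is a packaged collection of MCPs. There are roughly three routes. First, model companies build the Agent into the model, so the model itself can use tools directly. Second, model companies add interfaces to the model and plug in tools they develop themselves to perform actions. Third, third-party developers call the model API and integrate it with their own tools to build an Agent.

These models are now powerful enough that if you give them just a few simple tools, such as reading files, listing directories, and running terminal commands, they can start writing code on their own. What used to require a lot of tedious design now only requires providing capabilities; the model reasons its way through the task by itself. Grant it permissions, don't limit its tokens, and even a handful of small tools can unleash enormous power, implement a lot of functionality, and produce some jaw-dropping moments.

2. Introduction

Many people think that building an agent is hard and complicated and that there is an aura of mystery around it. Until today I thought so too, and it held me back. But Thorsten Ball, a core engineer at Sourcegraph, the company that built Amp, the only coding agent ranked in the S tier alongside Claude Code, says:

Building a fully functional agent that can edit code is not hard. You can build a small yet impressive agent in under 400 lines of code.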

[Figure: coding agent ranking]

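Thorsten published a well-received post on his company's website that explains in detail how to build a Claude-based agent from scratch in Go. This post is a repost and translation of that article.

Original: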

How to Build an Agent: https://ampcode.com/how-to-build-an-agent

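The translation follows:

3. Translation

How to Build an Agent

or: The Emperor Has No Clothes

Thorsten Ball, April 15, 2025

It's not that hard to build a fully functioning, code-editing agent.

It seems like it would be. When you look at an agent editing files, running commands, recovering from errors, and retrying different strategies, it seems like there has to be a secret behind it.

There isn't. It's an LLM, a loop, and enough tokens. It's what we've been saying on the podcast from the start. The rest, the stuff that makes Amp so addictive and impressive? Sweat and hard work.

But building a small and yet highly effective agent doesn't even require that. You can do it in less than 400 lines of code, most of which is boilerplate.

I'm going to show you how, right now. We're going to write some code together and go from zero lines of code to "oh wow, this is... a game changer."

I urge you to follow along. No, really. You might think you can just read this and that you don't have to type out the code, but it's less than 400 lines of code. I need you to feel how little code it is, and I want you to see this with your own eyes, in your own terminal, in your own folders.

Here's what we need:

  • Go
  • An Anthropic API key, set as the environment variable ANTHROPIC_API_KEY

Let's Get to Work

Let's dive right in and set up a new Go project with four simple commands: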

mkdir code-editing-agent
cd code-editing-agent
go mod init agent
touch main.go

Now, let's open main.go and, as a first step, put in the skeleton code we need:

package main

import (
    "bufio"
    "context"
    "fmt"
    "os"
    
    "github.com/anthropics/anthropic-sdk-go"
)

func main() {
    client := anthropic.NewClient()
    scanner := bufio.NewScanner(os.Stdin)
    
    getUserMessage := func() (string, bool) {
        if !scanner.Scan() {
            return "", false
        }
        return scanner.Text(), true
    }
    
    agent := NewAgent(&client, getUserMessage)
    err := agent.Run(context.TODO())
    if err != nil {
        fmt.Printf("Error: %s\n", err.Error())
    }
}

func NewAgent(client *anthropic.Client, getUserMessage func() (string, bool)) *Agent {
    return &Agent{
        client:         client,
        getUserMessage: getUserMessage,
    }
}

type Agent struct {
    client         *anthropic.Client
    getUserMessage func() (string, bool)
}

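Yes, this doesn't compile yet. But what we've got here is an Agent that has access to an anthropic.Client (which, by default, looks for ANTHROPIC_API_KEY) and that can get a user message by reading from stdin on the terminal.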
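The (string, bool) shape of getUserMessage can be tried in isolation. The following is a minimal sketch that assumes an in-memory reader instead of os.Stdin; the names newMessageReader and collect are hypothetical and not part of the tutorial.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// newMessageReader (hypothetical name) returns a closure with the same
// (string, bool) shape as getUserMessage: the bool turns false once the
// input is exhausted.
func newMessageReader(r io.Reader) func() (string, bool) {
	scanner := bufio.NewScanner(r)
	return func() (string, bool) {
		if !scanner.Scan() {
			return "", false
		}
		return scanner.Text(), true
	}
}

// collect (hypothetical helper) drains the reader the way a chat loop would:
// keep asking for the next message until the bool says stop.
func collect(input string) []string {
	next := newMessageReader(strings.NewReader(input))
	var lines []string
	for {
		line, ok := next()
		if !ok {
			return lines
		}
		lines = append(lines, line)
	}
}

func main() {
	for _, line := range collect("Hey!\nHow are you?\n") {
		fmt.Println("user said:", line)
	}
}
```

Returning a bool instead of an error keeps the caller's loop simple: end of input and a read failure both mean "stop asking".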

Now let's add the missing Run() method:

// main.go

func (a *Agent) Run(ctx context.Context) error {
    conversation := []anthropic.MessageParam{}
    fmt.Println("Chat with Claude (use 'ctrl-c' to quit)")
    
    for {
        fmt.Print("\u001b[94mYou\u001b[0m: ")
        userInput, ok := a.getUserMessage()
        if !ok {
            break
        }
        
        userMessage := anthropic.NewUserMessage(anthropic.NewTextBlock(userInput))
        conversation = append(conversation, userMessage)
        
        message, err := a.runInference(ctx, conversation)
        if err != nil {
            return err
        }
        
        conversation = append(conversation, message.ToParam())
        
        for _, content := range message.Content {
            switch content.Type {
            case "text":
                fmt.Printf("\u001b[93mClaude\u001b[0m: %s\n", content.Text)
            }
        }
    }
    
    return nil
}

func (a *Agent) runInference(ctx context.Context, conversation []anthropic.MessageParam) (*anthropic.Message, error) {
    message, err := a.client.Messages.New(ctx, anthropic.MessageNewParams{
        Model:    anthropic.ModelClaude3_7SonnetLatest,
        MaxTokens: int64(1024),
        Messages:  conversation,
    })
    return message, err
}

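It's not much, is it? 90 lines, and the most important thing in them is this loop in Run() that lets us talk to Claude. But that's already the heartbeat of this program.

And for a heartbeat, it's pretty straightforward: we first print a prompt, ask the user to type something, add it to the conversation, send it to Claude, add Claude's response to the conversation, print the response, and around we go.

This is every AI chat application you've ever used, except it's in the terminal.

Let's run it: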

export ANTHROPIC_API_KEY="this is the last time i will tell you to set this"

# Download the dependencies
go mod tidy
# Run it
go run main.go

Then you can talk to Claude, like this:

$ go run main.go
 
Chat with Claude (use 'ctrl-c' to quit)
 
You: Hey! I'm Thorsten! How are you?
Claude: Hi Thorsten! I'm doing well, thanks for asking. It's nice to meet you. How are you doing today? Is there something I can help you with?
You: Can you come up with any horse-related nicknames that make fun of my first name?
Claude: I can try to come up with some playful horse-related nicknames based on "Thorsten":

* Thorough-bred Thorsten
* Trotsten
* Thorsten the Stallion
* Neighsten
* Horseten
* Thorsten the Galloper
* Saddle-sten
* Thorsten of the Mane
* Canter-sten
* Thorsteed

These are meant to be lighthearted - I hope they give you a chuckle!
You: 

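Notice how we kept the same conversation going over multiple turns. It remembers my name from the first message. The conversation grows longer with every turn, and we send the whole conversation every time. The server, Anthropic's server, is stateless. It only sees what's in the conversation slice. It's up to us to maintain that.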
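The statelessness described above can be illustrated without any network call. This is a sketch under stated assumptions: message is a simplified stand-in for anthropic.MessageParam, and fakeModel is a hypothetical, deterministic substitute for the real server that only looks at the slice it is handed.

```go
package main

import (
	"fmt"
	"strings"
)

// message is a simplified stand-in for anthropic.MessageParam.
type message struct {
	Role string
	Text string
}

// fakeModel (hypothetical) behaves like a stateless server: it keeps nothing
// between calls and can only use what is in the conversation it receives.
func fakeModel(conversation []message) message {
	name := ""
	for _, m := range conversation {
		if m.Role == "user" && strings.HasPrefix(m.Text, "My name is ") {
			name = strings.TrimPrefix(m.Text, "My name is ")
		}
	}
	if name == "" {
		return message{Role: "assistant", Text: "I don't know your name."}
	}
	return message{Role: "assistant", Text: "Your name is " + name + "."}
}

func main() {
	conversation := []message{{Role: "user", Text: "My name is Thorsten"}}
	conversation = append(conversation, fakeModel(conversation))
	conversation = append(conversation, message{Role: "user", Text: "What is my name?"})

	// Sending the full history: the first turn is still visible.
	fmt.Println(fakeModel(conversation).Text)
	// Sending only the latest message: nothing is left to remember the name by.
	fmt.Println(fakeModel(conversation[len(conversation)-1:]).Text)
}
```

The only "memory" in the exchange is the slice the client keeps appending to, which is why Run passes the whole conversation on every call.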

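Okay, let's move on, because the nicknames are terrible and this is not an agent yet. What's an agent? Here's my definition: an LLM with access to tools, giving it the ability to modify something outside the context window.

A First Tool

An LLM with access to tools? What's a tool? The basic idea is this: you send a prompt to the model that says it should reply in a certain way if it wants to use "a tool". Then you, as the receiver of that message, "use the tool" by executing it and replying with the result. That's it. Everything else we'll see is just abstraction on top of it.

Imagine you're talking to a friend and you tell them: "In the following conversation, wink if you want me to raise my arm." A weird thing to say, but an easy concept to grasp.

We can already try it without changing any of our code.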

$ go run main.go
 
Chat with Claude (use 'ctrl-c' to quit)
 
You: You are a weather expert. When I ask you about the weather in a given location, I want you to reply with `get_weather(<location_name>)`. I will then tell you what the weather in that location is. Understood?
Claude: I understand your instructions. When you ask me about the weather in a specific location, I will reply with `get_weather(<location_name>)`. Then you'll provide me with the weather information for that location, and I can discuss it further with you. I'm ready whenever you'd like to ask about the weather somewhere.
You: Hey, what's the weather in Munich?
Claude: get_weather(Munich)

We told Claude to wink with get_weather when it wants to know about the weather. The next step is to raise our arm and reply with the "result of the tool":

You: hot and humid, 28 degrees celcius
Claude: Thank you for providing that information. The current weather in Munich is hot and humid at 28 degrees Celsius (about 82 degrees Fahrenheit). These are fairly warm conditions for Munich, especially with the added humidity which can make it feel even warmer than the temperature indicates.

If you're in Munich or planning to visit, it would be good to stay hydrated, wear light clothing, and perhaps plan indoor activities during the hottest parts of the day. The combination of heat and humidity can be quite uncomfortable, particularly if you're not accustomed to such weather.

Would you like to know about the weather in any other location?

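That worked very well on the first try, didn't it?

These models are trained and fine-tuned to use "tools" and they're very eager to do so. By now, in 2025, they kind of "know" that they don't know everything and can use tools to get more information. (Of course that's not precisely what's going on, but it's a good enough explanation for now.)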
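The "wink" in the transcript above was handled by a human reading the reply. As a rough sketch of how a program could notice the same wink, the code below parses replies of the form name(argument) with a regular expression. The reply format is just the ad-hoc convention from the chat, and parseToolCall and getWeather are hypothetical names with canned data.

```go
package main

import (
	"fmt"
	"regexp"
)

// toolCall matches a reply that consists only of name(argument),
// for example get_weather(Munich).
var toolCall = regexp.MustCompile(`^\s*(\w+)\((.*)\)\s*$`)

// parseToolCall (hypothetical) reports whether the reply is a "wink" and,
// if so, which tool was named and with what argument.
func parseToolCall(reply string) (name, arg string, ok bool) {
	m := toolCall.FindStringSubmatch(reply)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

// getWeather (hypothetical) stands in for a real lookup and returns canned data.
func getWeather(location string) string {
	return "hot and humid, 28 degrees Celsius in " + location
}

func main() {
	for _, reply := range []string{"get_weather(Munich)", "Nice to meet you!"} {
		name, arg, ok := parseToolCall(reply)
		if ok && name == "get_weather" {
			fmt.Println("tool result:", getWeather(arg))
			continue
		}
		fmt.Println("plain text:", reply)
	}
}
```

Parsing free text like this is fragile, which is one reason to prefer the structured tool definitions that the provider APIs offer.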

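To summarize, all there is to tools and tool use are two things:

  1. You tell the model what tools are available
  2. When the model wants to execute a tool, it tells you; you execute the tool and send back the response

To make (1) easier, the big model providers have built-in APIs to send tool definitions along.

Okay, now let's build our first tool: read_file.

The read_file Tool

In order to define the read_file tool, we're going to use the types that the Anthropic SDK suggests, but keep in mind: under the hood, this will all end up as strings that are sent to the model. It's all "wink if you want me to use read_file".

Each tool we're going to add will require the following:

  • A name
  • A description that tells the model what the tool does, when to use it, when not to use it, what it returns, and so on
  • An input schema that describes, as a JSON schema, what inputs this tool expects and in which form
  • A function that actually executes the tool with the input the model sends to us and returns the result

So let's add that to our code: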

// main.go

type ToolDefinition struct {
    Name        string                           `json:"name"`
    Description string                           `json:"description"`
    InputSchema anthropic.ToolInputSchemaParam  `json:"input_schema"`
    Function    func(input json.RawMessage) (string, error)
}

Now we give our Agent tool definitions:

// main.go

// `tools` is added here:
type Agent struct {
    client         *anthropic.Client
    getUserMessage func() (string, bool)
    tools          []ToolDefinition
}

// And here:
func NewAgent(
    client *anthropic.Client,
    getUserMessage func() (string, bool),
    tools []ToolDefinition,
) *Agent {
    return &Agent{
        client:         client,
        getUserMessage: getUserMessage,
        tools:          tools,
    }
}

// And here:
func main() {
    // [... previous code ...]
    tools := []ToolDefinition{}
    agent := NewAgent(&client, getUserMessage, tools)
    // [... previous code ...]
}

And send them along to the model in runInference:

// main.go

func (a *Agent) runInference(ctx context.Context, conversation []anthropic.MessageParam) (*anthropic.Message, error) {
    anthropicTools := []anthropic.ToolUnionParam{}
    for _, tool := range a.tools {
        anthropicTools = append(anthropicTools, anthropic.ToolUnionParam{
            OfTool: &anthropic.ToolParam{
                Name:        tool.Name,
                Description: anthropic.String(tool.Description),
                InputSchema: tool.InputSchema,
            },
        })
    }
    
    message, err := a.client.Messages.New(ctx, anthropic.MessageNewParams{
        Model:     anthropic.ModelClaude3_7SonnetLatest,
        MaxTokens: int64(1024),
        Messages:  conversation,
        Tools:     anthropicTools,
    })
    return message, err
}

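There's some type juggling going on and I'm not too good at Go-with-generics yet, so I'm not going to try to explain anthropic.String and ToolUnionParam to you. But, really, I swear, it's very simple:

We send along our tool definitions; on the server, Anthropic then wraps these definitions in this system prompt (which isn't much) and adds it to our conversation, and the model then replies in a specific way if it wants to use that tool.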
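To make "sending along tool definitions" more tangible, here is a sketch of what one definition looks like once serialized. It assumes only the name, description, and input_schema field names that appear in the ToolDefinition struct tags; wireTool, readFileTool, and encodeTools are hypothetical stand-ins, not SDK types, and the real request body contains more than this.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wireTool is a simplified stand-in for a tool definition, not the SDK type.
type wireTool struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	InputSchema map[string]any `json:"input_schema"`
}

// readFileTool (hypothetical) builds a definition for a read_file tool with
// a single string parameter called path.
func readFileTool() wireTool {
	return wireTool{
		Name:        "read_file",
		Description: "Read the contents of a given relative file path.",
		InputSchema: map[string]any{
			"type": "object",
			"properties": map[string]any{
				"path": map[string]any{
					"type":        "string",
					"description": "The relative path of a file in the working directory.",
				},
			},
		},
	}
}

// encodeTools renders the tool list as indented JSON text.
func encodeTools(tools []wireTool) (string, error) {
	b, err := json.MarshalIndent(tools, "", "  ")
	return string(b), err
}

func main() {
	out, err := encodeTools([]wireTool{readFileTool()})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```

Seen this way, a tool definition is nothing more than a name, some prose, and a schema: text the model reads, exactly like the "wink" instruction typed by hand earlier.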

Alright, so tool definitions are being sent along, but we haven't defined a tool yet. Let's do that and define read_file:

// main.go

var ReadFileDefinition = ToolDefinition{
    Name:        "read_file",
    Description: "Read the contents of a given relative file path. Use this when you want to see what's inside a file. Do not use this with directory names.",
    InputSchema: ReadFileInputSchema,
    Function:    ReadFile,
}

type ReadFileInput struct {
    Path string `json:"path" jsonschema_description:"The relative path of a file in the working directory."`
}

var ReadFileInputSchema = GenerateSchema[ReadFileInput]()

func ReadFile(input json.RawMessage) (string, error) {
    readFileInput := ReadFileInput{}
    err := json.Unmarshal(input, &readFileInput)
    if err != nil {
        return "", err // report malformed input to the model instead of crashing
    }
    
    content, err := os.ReadFile(readFileInput.Path)
    if err != nil {
        return "", err
    }
    
    return string(content), nil
}

func GenerateSchema[T any]() anthropic.ToolInputSchemaParam {
    reflector := jsonschema.Reflector{
        AllowAdditionalProperties: false,
        DoNotReference:           true,
    }
    var v T
    schema := reflector.Reflect(v)
    return anthropic.ToolInputSchemaParam{
        Properties: schema.Properties,
    }
}

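That's not much, is it? It's a single function, ReadFile, and two descriptions the model will see: our Description that describes the tool itself ("Read the contents of a given relative file path. ...") and a description of the single input parameter this tool has ("The relative path of a ...").

The ReadFileInputSchema and GenerateSchema stuff? We need that so that we can generate a JSON schema for our tool definition, which we then send to the model. To do that, we use the jsonschema package, which we need to import and download: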

// main.go

package main

import (
    "bufio"
    "context"
    // Add this:
    "encoding/json"
    "fmt"
    "os"
    
    "github.com/anthropics/anthropic-sdk-go"
    // Add this:
    "github.com/invopop/jsonschema"
)

Then run the following:

go mod tidy
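To get a feel for what a schema generator does with ReadFileInput, the sketch below derives a tiny schema from the struct tags using only the standard library. It is not how the jsonschema package is implemented: sketchSchema is a hypothetical, heavily reduced stand-in that handles flat structs with string fields only and panics on anything that is not a struct.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// ReadFileInput is copied from the tutorial.
type ReadFileInput struct {
	Path string `json:"path" jsonschema_description:"The relative path of a file in the working directory."`
}

// sketchSchema (hypothetical) reads the json and jsonschema_description tags
// of each string field and builds a minimal object schema from them.
func sketchSchema(v any) map[string]any {
	props := map[string]any{}
	t := reflect.TypeOf(v)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.Type.Kind() != reflect.String {
			continue // this sketch only understands string fields
		}
		// Take the part of the json tag before any ",omitempty" option.
		name := strings.Split(f.Tag.Get("json"), ",")[0]
		if name == "" {
			name = f.Name
		}
		props[name] = map[string]any{
			"type":        "string",
			"description": f.Tag.Get("jsonschema_description"),
		}
	}
	return map[string]any{"type": "object", "properties": props}
}

func main() {
	b, err := json.MarshalIndent(sketchSchema(ReadFileInput{}), "", "  ")
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(b))
}
```

The point is that the parameter description the model reads comes straight from the struct tag, so editing the tag is editing the prompt.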

Then, in the main function, we need to make sure that we use the definition:

func main() {
    // [... previous code ...]
    tools := []ToolDefinition{ReadFileDefinition}
    // [... previous code ...]
}

It's time to try it!

$ go run main.go
 
Chat with Claude (use 'ctrl-c' to quit)
 
You: what's in main.go?
Claude: I'll help you check what's in the main.go file. Let me read it for you.
You: 

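Wait, what? Ha, it wants to use the tool! Obviously the output will be slightly different for you, but it certainly sounds like Claude knows that it can read files, right?

The problem is that we don't listen! When Claude winks, we ignore it. We need to fix that.

Here, let me show you how to do that in a single, quick, surprisingly-agile-for-my-age move by replacing our Agent's Run method with this: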

// main.go

func (a *Agent) Run(ctx context.Context) error {
    conversation := []anthropic.MessageParam{}
    fmt.Println("Chat with Claude (use 'ctrl-c' to quit)")
    
    readUserInput := true
    
    for {
        if readUserInput {
            fmt.Print("\u001b[94mYou\u001b[0m: ")
            userInput, ok := a.getUserMessage()
            if !ok {
                break
            }
            userMessage := anthropic.NewUserMessage(anthropic.NewTextBlock(userInput))
            conversation = append(conversation, userMessage)
        }
        
        message, err := a.runInference(ctx, conversation)
        if err != nil {
            return err
        }
        
        conversation = append(conversation, message.ToParam())
        
        toolResults := []anthropic.ContentBlockParamUnion{}
        for _, content := range message.Content {
            switch content.Type {
            case "text":
                fmt.Printf("\u001b[93mClaude\u001b[0m: %s\n", content.Text)
            case "tool_use":
                result := a.executeTool(content.ID, content.Name, content.Input)
                toolResults = append(toolResults, result)
            }
        }
        
        if len(toolResults) == 0 {
            readUserInput = true
            continue
        }
        
        readUserInput = false
        conversation = append(conversation, anthropic.NewUserMessage(toolResults...))
    }
    
    return nil
}

func (a *Agent) executeTool(id, name string, input json.RawMessage) anthropic.ContentBlockParamUnion {
    var toolDef ToolDefinition
    var found bool
    for _, tool := range a.tools {
        if tool.Name == name {
            toolDef = tool
            found = true
            break
        }
    }
    
    if !found {
        return anthropic.NewToolResultBlock(id, "tool not found", true)
    }
    
    fmt.Printf("\u001b[92mtool\u001b[0m: %s(%s)\n", name, input)
    response, err := toolDef.Function(input)
    if err != nil {
        return anthropic.NewToolResultBlock(id, err.Error(), true)
    }
    
    return anthropic.NewToolResultBlock(id, response, false)
}

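Squint and you'll see that it's 90% boilerplate and 10% that matters: when we get a message back from Claude, we check whether Claude asked us to execute a tool by looking for content.Type == "tool_use"; if so, we hand over to executeTool, look up the tool by name in our local registry, unmarshal the input, execute it, and return the result. If it's an error, we flip a boolean. That's it.

(Yes, there is a loop in a loop, but it doesn't matter.)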
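The 10% that matters can be exercised without the SDK or a network connection. The sketch below makes several assumptions: block and result are simplified stand-ins for the SDK's content and tool-result types, the registry is a map rather than the tutorial's slice, and demoTools with its echo tool is purely hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// block is a simplified stand-in for one content block of a model reply.
type block struct {
	Type  string // "text" or "tool_use"
	Text  string
	ID    string
	Name  string
	Input json.RawMessage
}

// result is a simplified stand-in for a tool result sent back to the model.
type result struct {
	ToolUseID string
	Content   string
	IsError   bool
}

type tool func(input json.RawMessage) (string, error)

// dispatch looks the tool up by name, runs it, and flags failures.
func dispatch(tools map[string]tool, b block) result {
	fn, ok := tools[b.Name]
	if !ok {
		return result{ToolUseID: b.ID, Content: "tool not found", IsError: true}
	}
	out, err := fn(b.Input)
	if err != nil {
		return result{ToolUseID: b.ID, Content: err.Error(), IsError: true}
	}
	return result{ToolUseID: b.ID, Content: out}
}

// runTurn returns one result per tool_use block. An empty slice means the
// reply was plain text and the loop should go back to asking the user.
func runTurn(tools map[string]tool, reply []block) []result {
	var results []result
	for _, b := range reply {
		if b.Type == "tool_use" {
			results = append(results, dispatch(tools, b))
		}
	}
	return results
}

// demoTools (hypothetical) registers a single echo tool.
func demoTools() map[string]tool {
	return map[string]tool{
		"echo": func(input json.RawMessage) (string, error) {
			var in struct {
				Text string `json:"text"`
			}
			if err := json.Unmarshal(input, &in); err != nil {
				return "", err
			}
			return in.Text, nil
		},
	}
}

func main() {
	reply := []block{
		{Type: "text", Text: "Let me check."},
		{Type: "tool_use", ID: "t1", Name: "echo", Input: json.RawMessage(`{"text":"hi"}`)},
		{Type: "tool_use", ID: "t2", Name: "missing"},
	}
	for _, r := range runTurn(demoTools(), reply) {
		fmt.Printf("%s error=%v: %s\n", r.ToolUseID, r.IsError, r.Content)
	}
}
```

Errors are returned to the model as flagged results rather than aborting the loop, which gives the model a chance to correct its own call on the next turn.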

We execute the tool, send the result back up to Claude, and ask again for Claude's response. Truly: that's it. Let me show you.

Set the stage by running this:

echo 'what animal is the most disagreeable because it always says neigh?' >> secret-file.txt

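That creates a secret-file.txt in our directory, containing a mysterious riddle.

In that very same directory, let's run our new tool-using agent and ask it to look at the file: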

$ go run main.go
 
Chat with Claude (use 'ctrl-c' to quit)
 
You: Claude, buddy, help me solve the riddle in the secret-file.txt file
Claude: I'll help you solve the riddle in the secret-file.txt file. Let me first read the contents of this file to see what the riddle is.
tool: read_file({"path":"secret-file.txt"})
Claude: Great! I've found the riddle.

The answer to the riddle is: **A horse**

The riddle is playing on wordplay where "neigh" (the sound a horse makes) sounds like "nay" (a way to say "no" or disagree with something). So horses are described as disagreeable because they always say "neigh/nay" - implying they're always disagreeing.
You: 

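Let's take a deep breath and say it together. Ready? Here we go: oh my god. You just give it a tool and it... uses it when it thinks it'll help solve the task. Remember: we didn't say anything about "if a user asks you about a file, read the file". We also didn't say "if something looks like a filename, figure out how to read it". No, none of that. We say "help me solve the thing in this file" and Claude realizes that it can read the file to answer that, and off it goes.

Of course, we can be specific and really nudge it towards a tool, but it basically does it all on its own: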

$ go run main.go
 
Chat with Claude (use 'ctrl-c' to quit)
 
You: What's going on in main.go? Be brief!
Claude: I'll check the main.go file to see what's going on.
tool: read_file({"path":"main.go"})
Claude: Based on my review, main.go implements a Claude AI assistant agent that can use tools. Here's a brief summary:

- It creates a command-line chat interface with Claude using the Anthropic Go SDK
- Implements a tool system with a single "read_file" tool for reading file contents
- Manages the conversation flow with Claude, handling both regular text responses and tool use requests
- When Claude requests to use a tool, the agent executes it and returns the results
- The program loops continuously to maintain a conversation until the user exits

The code essentially builds a simple CLI-based chatbot that can interact with the file system when requested by Claude.
You: 

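Spot on. Okay, now that we know how to make Claude use tools, let's add a few more.

The list_files Tool

If you're anything like me, the first thing you do when you log into a new computer is to get your bearings by running ls, listing the files.

Let's give Claude the same ability, a tool to list files. Here's the complete implementation of the list_files tool: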

// main.go

var ListFilesDefinition = ToolDefinition{
    Name:        "list_files",
    Description: "List files and directories at a given path. If no path is provided, lists files in the current directory.",
    InputSchema: ListFilesInputSchema,
    Function:    ListFiles,
}

type ListFilesInput struct {
    Path string `json:"path,omitempty" jsonschema_description:"Optional relative path to list files from. Defaults to current directory if not provided."`
}

var ListFilesInputSchema = GenerateSchema[ListFilesInput]()

func ListFiles(input json.RawMessage) (string, error) {
    listFilesInput := ListFilesInput{}
    err := json.Unmarshal(input, &listFilesInput)
    if err != nil {
        return "", err // report malformed input to the model instead of crashing
    }
    
    dir := "."
    if listFilesInput.Path != "" {
        dir = listFilesInput.Path
    }
    
    var files []string
    err = filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err
        }
        relPath, err := filepath.Rel(dir, path)
        if err != nil {
            return err
        }
        if relPath != "." {
            if info.IsDir() {
                files = append(files, relPath+"/")
            } else {
                files = append(files, relPath)
            }
        }
        return nil
    })
    
    if err != nil {
        return "", err
    }
    
    result, err := json.Marshal(files)
    if err != nil {
        return "", err
    }
    
    return string(result), nil
}

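Nothing fancy here: list_files returns the list of files and directories under the given path (the current folder by default), including everything in subdirectories, since filepath.Walk descends recursively. There are a thousand optimizations we could (and probably should) make if this were a serious effort, but since I just want to show you what's in the wizard's hat, this is fine.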
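As one example of the kind of optimization hinted at above, the sketch below skips selected directories such as .git, so their contents never reach the model. This is an assumption about what would be useful, not something the tutorial does; listFilesSkipping and demoTree are hypothetical names, and the sketch uses filepath.WalkDir rather than the tutorial's filepath.Walk.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// listFilesSkipping (hypothetical) lists entries below dir like the
// tutorial's ListFiles, but neither lists nor descends into directories
// whose base name is in skip.
func listFilesSkipping(dir string, skip map[string]bool) ([]string, error) {
	var files []string
	err := filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(dir, path)
		if err != nil {
			return err
		}
		if rel == "." {
			return nil
		}
		if d.IsDir() {
			if skip[d.Name()] {
				return filepath.SkipDir // do not walk into this directory
			}
			files = append(files, rel+"/")
			return nil
		}
		files = append(files, rel)
		return nil
	})
	return files, err
}

// demoTree (hypothetical) creates a temporary directory containing
// .git/objects/ and main.go, plus a function that removes it again.
func demoTree() (dir string, cleanup func(), err error) {
	dir, err = os.MkdirTemp("", "list-files-demo")
	if err != nil {
		return "", nil, err
	}
	cleanup = func() { os.RemoveAll(dir) }
	if err = os.MkdirAll(filepath.Join(dir, ".git", "objects"), 0o755); err != nil {
		cleanup()
		return "", nil, err
	}
	if err = os.WriteFile(filepath.Join(dir, "main.go"), []byte("package main\n"), 0o644); err != nil {
		cleanup()
		return "", nil, err
	}
	return dir, cleanup, nil
}

func main() {
	dir, cleanup, err := demoTree()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer cleanup()
	files, err := listFilesSkipping(dir, map[string]bool{".git": true})
	fmt.Println(files, err)
}
```

Every entry returned by a tool costs tokens on each subsequent request, so trimming noisy directories is a cheap way to keep the conversation small.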

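One thing to note: we return a list of strings and we denote directories with a trailing slash. That's not required, it's just something I decided to do. There's no fixed format. Anything goes as long as Claude can make sense of it, and whether it can is something you need to figure out by experimentation. You could also prepend each directory with "directory: " or return a Markdown document with two headers: "directories" and "files". There's a ton of options, and which one you choose depends on what Claude can make the most sense of, how many tokens it requires, how fast it is to generate and read, and so on.

Here, we just want to create a small list_files tool, and the easiest option wins.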
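The three output formats mentioned above can be put side by side. This is only an illustration of the options, with a hypothetical entry type and hypothetical function names; which format a model handles best is, as the text says, something to find out by experiment.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// entry is a hypothetical directory entry used only for this illustration.
type entry struct {
	Name  string
	IsDir bool
}

// asJSONList uses the tutorial's convention: a JSON array of names,
// with a trailing slash marking directories.
func asJSONList(entries []entry) (string, error) {
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		if e.IsDir {
			names = append(names, e.Name+"/")
		} else {
			names = append(names, e.Name)
		}
	}
	b, err := json.Marshal(names)
	return string(b), err
}

// asPrefixed writes one entry per line and prefixes directories
// with "directory: ".
func asPrefixed(entries []entry) string {
	var sb strings.Builder
	for _, e := range entries {
		if e.IsDir {
			sb.WriteString("directory: ")
		}
		sb.WriteString(e.Name + "\n")
	}
	return sb.String()
}

// asMarkdown groups entries under two headers, "directories" and "files".
func asMarkdown(entries []entry) string {
	var dirs, files []string
	for _, e := range entries {
		if e.IsDir {
			dirs = append(dirs, "- "+e.Name)
		} else {
			files = append(files, "- "+e.Name)
		}
	}
	return "# directories\n" + strings.Join(dirs, "\n") +
		"\n\n# files\n" + strings.Join(files, "\n") + "\n"
}

func main() {
	entries := []entry{{Name: ".git", IsDir: true}, {Name: "main.go"}}
	list, err := asJSONList(entries)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(list)
	fmt.Print(asPrefixed(entries))
	fmt.Print(asMarkdown(entries))
}
```

The JSON list is the most compact of the three, which matters because the tool result is resent with every later turn of the conversation.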

Of course, we also need to tell Claude about list_files:

// main.go

func main() {
    // [... previous code ...]
    tools := []ToolDefinition{ReadFileDefinition, ListFilesDefinition}
    // [... previous code ...]
}

And that's it. Let's ask Claude what it can see in this directory.

$ go run main.go
 
Chat with Claude (use 'ctrl-c' to quit)
 
You: what do you see in this directory?
Claude: I'll help you see what's in the current directory. Let me list the files and directories for you.
tool: list_files({})
Claude: I can see several files and directories in the current directory:

1. Git-related files and directories:
   - `.git/` directory with various Git configuration and tracking files
   - `.gitignore` file

2. Go project files:
   - `main.go` - the main Go source code file
   - `go.mod` - Go module definition file
   - `go.sum` - Go module checksum file

3. Documentation:
   - `blogpost.md` - a markdown file, likely containing blog content

4. Environment configuration:
   - `.envrc` - likely an environment configuration file

Would you like me to examine the contents of any specific file from this list?
You: 

It works! It can read the directory.

But here's the thing: Claude knows how to combine these tools. We just need to prompt it in a way that provokes it:

$ go run main.go
 
Chat with Claude (use 'ctrl-c' to quit)
 
You: Tell me about all the Go files in here. Be brief!
Claude: I'll examine the Go files in the current directory for you. Let me check what files are available first.
tool: list_files({})
Claude: I found a Go file in the current directory. Let me examine it:
tool: read_file({"path":"main.go"})
tool: read_file({"path":"go.mod"})
Claude: Here's a brief overview of the Go files in this project:

**main.go**:
- Implements a Claude AI agent that can interact with the local filesystem
- Uses the Anthropic SDK for Go to communicate with Claude AI models
- Provides two tools: `read_file` and `list_files` that allow Claude to read file contents and list directory contents
- Manages a conversation flow between the user and Claude, handling tool execution when Claude needs to access files

**go.mod**:
- Defines the module dependencies
- Main dependencies are the Anthropic SDK for Go and a JSON schema library
- Uses Go 1.24.1

This appears to be a simple terminal-based application that allows users to chat with Claude while giving the AI access to read local files and list directories.
You: 

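First it used list_files and then it called read_file twice, with the Go-related files that I asked it about.

Just... just like we would, right? I mean, what would you do if I asked you what version of Go we use in this project? Here's what Claude does for me: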

$ go run main.go
 
Chat with Claude (use 'ctrl-c' to quit)
 
You: What go version are we using in this project?
Claude: I'll help you find the Go version being used in this project. Let me check the files to locate where the Go version might be specified.
tool: list_files({})
Claude: I can see there's a Go project here with several important files like `go.mod`, `go.sum`, and `main.go`. Let me check the `go.mod` file first, as it typically contains the Go version information.
tool: read_file({"path":"go.mod"})
Claude: According to the `go.mod` file, this project is using Go version 1.24.1.

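Claude looks at the directory, looks at go.mod, and has the answer.

We're at around 190 lines of code now. Let that sink in. Once you have, let's add another tool.

The edit_file Tool

The last tool we're going to add is edit_file, a tool that lets Claude edit files.

"Oh my", you're thinking now, "this is the crux, this is where he pulls the rabbit out of the hat." Well, let's see, shall we?

First, let's add a definition for our new edit_file tool: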

// main.go

var EditFileDefinition = ToolDefinition{
    Name: "edit_file",
    Description: `Make edits to a text file.
Replaces 'old_str' with 'new_str' in the given file. 'old_str' and 'new_str' MUST be different from each other.
If the file specified with path doesn't exist, it will be created.
`,
    InputSchema: EditFileInputSchema,
    Function:    EditFile,
}

type EditFileInput struct {
    Path   string `json:"path" jsonschema_description:"The path to the file"`
    OldStr string `json:"old_str" jsonschema_description:"Text to search for - must match exactly and must only have one match exactly"`
    NewStr string `json:"new_str" jsonschema_description:"Text to replace old_str with"`
}

var EditFileInputSchema = GenerateSchema[EditFileInput]()

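That's right, I know what you're thinking again: "string replacement to edit files?" Claude 3.7 loves replacing strings (experimentation is how you find out what they love or don't), so we're going to implement edit_file by telling Claude it can edit files by replacing existing text with new text.

And here's the implementation of the EditFile function in Go: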

func EditFile(input json.RawMessage) (string, error) {
    editFileInput := EditFileInput{}
    err := json.Unmarshal(input, &editFileInput)
    if err != nil {
        return "", err
    }
    
    if editFileInput.Path == "" || editFileInput.OldStr == editFileInput.NewStr {
        return "", fmt.Errorf("invalid input parameters")
    }
    
    content, err := os.ReadFile(editFileInput.Path)
    if err != nil {
        if os.IsNotExist(err) && editFileInput.OldStr == "" {
            return createNewFile(editFileInput.Path, editFileInput.NewStr)
        }
        return "", err
    }
    
    oldContent := string(content)
    newContent := strings.Replace(oldContent, editFileInput.OldStr, editFileInput.NewStr, -1)
    
    if oldContent == newContent && editFileInput.OldStr != "" {
        return "", fmt.Errorf("old_str not found in file")
    }
    
    err = os.WriteFile(editFileInput.Path, []byte(newContent), 0644)
    if err != nil {
        return "", err
    }
    
    return "OK", nil
}

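It checks the input parameters, reads the file (or creates it if it doesn't exist), and replaces OldStr with NewStr. Then it writes the content back to disk and returns "OK".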
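Note that the schema description asks for old_str to match exactly once, while strings.Replace with -1 replaces every occurrence. The following is a sketch of a stricter variant that enforces the single-match rule; replaceOnce is a hypothetical helper, not the author's implementation. It rejects an empty old_str, so the file-creation case would stay with the caller.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// replaceOnce (hypothetical) replaces oldStr with newStr only if oldStr
// occurs exactly once in content, counting non-overlapping occurrences.
func replaceOnce(content, oldStr, newStr string) (string, error) {
	if oldStr == "" {
		return "", errors.New("old_str must not be empty")
	}
	switch n := strings.Count(content, oldStr); n {
	case 0:
		return "", errors.New("old_str not found in file")
	case 1:
		return strings.Replace(content, oldStr, newStr, 1), nil
	default:
		return "", fmt.Errorf("old_str matches %d times, expected exactly 1", n)
	}
}

func main() {
	out, err := replaceOnce("fizzBuzz(100);", "100", "15")
	fmt.Println(out, err)

	// Ambiguous edit: the error tells the model to send more context.
	_, err = replaceOnce("x = 1\ny = 1\n", "1", "2")
	fmt.Println(err)
}
```

Returning a descriptive error for ambiguous matches gives the model something to act on: it can retry with a longer, unique old_str instead of silently changing several places at once.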

What's still missing is createNewFile, which is just a tiny helper function that would be 70% shorter if this wasn't Go:

func createNewFile(filePath, content string) (string, error) {
    dir := path.Dir(filePath)
    if dir != "." {
        err := os.MkdirAll(dir, 0755)
        if err != nil {
            return "", fmt.Errorf("failed to create directory: %w", err)
        }
    }
    
    err := os.WriteFile(filePath, []byte(content), 0644)
    if err != nil {
        return "", fmt.Errorf("failed to create file: %w", err)
    }
    
    return fmt.Sprintf("Successfully created file %s", filePath), nil
}

Last step: adding it to the list of tools that we send to Claude.

// main.go

func main() {
    // [... previous code ...]
    tools := []ToolDefinition{ReadFileDefinition, ListFilesDefinition, EditFileDefinition}
    // [... previous code ...]
}

And... we're ready, but are you? Are you ready to let it rip?

Thought so. Let's do this. Let's tell Claude to create a new FizzBuzz function in JavaScript.

$ go run main.go
 
Chat with Claude (use 'ctrl-c' to quit)
 
You: hey claude, create fizzbuzz.js that I can run with Nodejs and that has fizzbuzz in it and executes it
Claude: I'll create a fizzbuzz.js file that you can run with Node.js. First, let me check if the file already exists:
tool: list_files({})
Claude: I don't see a fizzbuzz.js file, so I'll create one with a classic FizzBuzz implementation. This program will print numbers from 1 to 100, but for multiples of 3 it will print "Fizz", for multiples of 5 it will print "Buzz", and for numbers that are multiples of both 3 and 5 it will print "FizzBuzz".
tool: edit_file({"path":"fizzbuzz.js","old_str":"","new_str":"/**\n * FizzBuzz implementation in JavaScript\n * Prints numbers from 1 to 100, but:\n * - For multiples of 3, print \"Fizz\" instead of the number\n * - For multiples of 5, print \"Buzz\" instead of the number\n * - For multiples of both 3 and 5, print \"FizzBuzz\"\n */\n\nfunction fizzBuzz(n) {\n    for (let i = 1; i <= n; i++) {\n        let output = '';\n        \n        if (i % 3 === 0) output += 'Fizz';\n        if (i % 5 === 0) output += 'Buzz';\n        \n        console.log(output || i);\n    }\n}\n\n// Execute FizzBuzz for numbers 1 to 100\nconsole.log('Running FizzBuzz:');\nfizzBuzz(100);"}
Claude: I've created a fizzbuzz.js file that you can run with Node.js. The program:

1. Defines a function called `fizzBuzz` that takes a parameter `n` for the upper limit
2. Loops from 1 to n (set to 100 in the execution)
3. For each number:
   - If it's divisible by 3, it adds "Fizz" to the output
   - If it's divisible by 5, it adds "Buzz" to the output
   - If it's divisible by both, it adds both resulting in "FizzBuzz"
   - If it's not divisible by either 3 or 5, it outputs the number itself
4. The script automatically executes the function for numbers 1 to 100

You can run this script using Node.js with the following command in your terminal:

node fizzbuzz.js

This will output the FizzBuzz sequence from 1 to 100 to your console.

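Right?! It's impressive, isn't it? And that's the most basic implementation of edit_file, and of an agent in general, that you can probably come up with.

But did it work? Yes, it did: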

$ node fizzbuzz.js
Running FizzBuzz:
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
16
[...]

Amazing. But hey, let's have it actually edit a file and not just create one.

Here's what Claude does when I ask it to "Please edit fizzbuzz.js so that it only prints until 15":

$ go run main.go
 
Chat with Claude (use 'ctrl-c' to quit)
 
You: Please edit fizzbuzz.js so that it only prints until 15
Claude: I'll edit the fizzbuzz.js file to print only until 15 instead of its current limit. First, let me check the current content of the file to understand what changes are needed.
tool: read_file({"path":"fizzbuzz.js"})
Claude: I can see that the file currently runs FizzBuzz up to 100. I'll modify it to only run up to 15 instead by changing the parameter in the fizzBuzz function call.
tool: edit_file({"path":"fizzbuzz.js","old_str":"// Execute FizzBuzz for numbers 1 to 100\nconsole.log('Running FizzBuzz:');\nfizzBuzz(100);","new_str":"// Execute FizzBuzz for numbers 1 to 15\nconsole.log('Running FizzBuzz:');\nfizzBuzz(15);"})
Claude: I also need to update the comment at the top of the file to reflect this change:
tool: edit_file({"path":"fizzbuzz.js","old_str":"/**\n * FizzBuzz implementation in JavaScript\n * Prints numbers from 1 to 100, but:","new_str":"/**\n * FizzBuzz implementation in JavaScript\n * Prints numbers from 1 to 15, but:"})
Claude: The changes have been successfully made to the fizzbuzz.js file. The program will now:
1. Only print FizzBuzz from 1 to 15 (instead of 1 to 100)
2. The documentation has been updated to reflect this change

The function itself remains the same, as it takes a parameter to determine the upper limit, and we've simply changed that parameter from 100 to 15.

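It reads the file, edits the file to change how long it runs, and then it also edits the file to update the comment at the top.

And it still works: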

$ node fizzbuzz.js
Running FizzBuzz:
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz

Okay, let's do one more and ask it to do the following:

"Create a congrats.js script that rot13-decodes the following string 'Pbatenghyngvbaf ba ohvyqvat n pbqr-rqvgvat ntrag!' and prints it"

Maybe a tall order. Let's see:

$ go run main.go
 
Chat with Claude (use 'ctrl-c' to quit)
 
You: Create a congrats.js script that rot13-decodes the following string 'Pbatenghyngvbaf ba ohvyqvat n pbqr-rqvgvat ntrag!' and prints it
Claude: I'll create a congrats.js script that decodes the ROT13-encoded string and prints it. ROT13 is a simple letter substitution cipher that replaces a letter with the 13th letter after it in the alphabet.

Let me create this file for you:
tool: edit_file({"path":"congrats.js","old_str":"","new_str":"/**\n * A script to decode a ROT13-encoded string and print the result\n */\n\nfunction rot13Decode(encodedStr) {\n  return encodedStr.replace(/[a-zA-Z]/g, function(char) {\n    // Get the character code\n    const charCode = char.charCodeAt(0);\n    \n    // Handle uppercase letters\n    if (charCode >= 65 && charCode <= 90) {\n      return String.fromCharCode(((charCode - 65 + 13) % 26) + 65);\n    }\n    \n    // Handle lowercase letters\n    if (charCode >= 97 && charCode <= 122) {\n      return String.fromCharCode(((charCode - 97 + 13) % 26) + 97);\n    }\n    \n    // Return non-alphabetic characters as is\n    return char;\n  });\n}\n\nconst encodedMessage = 'Pbatenghyngvbaf ba ohvyqvat n pbqr-rqvgvat ntrag!';\nconst decodedMessage = rot13Decode(encodedMessage);\n\nconsole.log(decodedMessage);"})
Claude: I've created a JavaScript file called congrats.js that:

1. Defines a `rot13Decode` function that takes an encoded string and applies the ROT13 cipher to decode it
2. Takes the provided string 'Pbatenghyngvbaf ba ohvyqvat n pbqr-rqvgvat ntrag!'
3. Decodes it using the function
4. Prints the decoded message to the console

You can run this script using Node.js by typing `node congrats.js` in your terminal. The script will decode and display the hidden congratulatory message!

Does it work? Let's try it:

$ node congrats.js
Congratulations on building a code-editing agent!

It does!
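For readers who want to cross-check the decoded message without Node.js, here is a small ROT13 sketch in Go, the language of the rest of the tutorial. It is an independent illustration with a hypothetical function name, not part of the original post.

```go
package main

import (
	"fmt"
	"strings"
)

// rot13 rotates ASCII letters by 13 places and leaves everything else alone.
// Applying it twice returns the original string.
func rot13(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z':
			return 'a' + (r-'a'+13)%26
		case r >= 'A' && r <= 'Z':
			return 'A' + (r-'A'+13)%26
		}
		return r
	}, s)
}

func main() {
	fmt.Println(rot13("Pbatenghyngvbaf ba ohvyqvat n pbqr-rqvgvat ntrag!"))
}
```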

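Isn't this amazing?

If you're anything like all the engineers I've talked to in the past few months, chances are that, while reading this, you have been waiting for the rabbit to be pulled out of the hat, for me to say "well, in reality it's much, much harder than this." But it's not.

This is essentially all there is to the inner loop of a code-editing agent. Sure, integrating it into your editor, tweaking the system prompt, giving it the right feedback at the right time, a nice UI around it, better tooling around the tools, support for multiple agents, and so on: we've built all of that in Amp, but it didn't require moments of genius. All that was required was practical engineering and hard work.

These models are incredibly powerful now. 300 lines of code and three tools, and now you're able to talk to an alien intelligence that edits your code. If you think "well, but we didn't really...", go and try it! Go and see how far you can get with this. I bet it's a lot farther than you think.

That's why we think everything's changing.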


4. Code Resulting from the Tutorial (for Reference)

// main.go

package main

import (
    "bufio"
    "context"
    "encoding/json"
    "fmt"
    "os"
    "path"
    "path/filepath"
    "strings"

    "github.com/anthropics/anthropic-sdk-go"
    "github.com/invopop/jsonschema"
)

// ToolDefinition describes a tool: its name, description, input schema, and implementation.
type ToolDefinition struct {
    Name        string                         `json:"name"`
    Description string                         `json:"description"`
    InputSchema anthropic.ToolInputSchemaParam `json:"input_schema"`
    Function    func(input json.RawMessage) (string, error)
}

// Agent holds the API client, the user-input callback, and the available tools.
type Agent struct {
    client         *anthropic.Client
    getUserMessage func() (string, bool)
    tools          []ToolDefinition
}

// NewAgent creates a new Agent instance.
func NewAgent(
    client *anthropic.Client,
    getUserMessage func() (string, bool),
    tools []ToolDefinition,
) *Agent {
    return &Agent{
        client:         client,
        getUserMessage: getUserMessage,
        tools:          tools,
    }
}

// GenerateSchema generates the JSON schema for the input type T.
func GenerateSchema[T any]() anthropic.ToolInputSchemaParam {
    reflector := jsonschema.Reflector{
        AllowAdditionalProperties: false,
        DoNotReference:            true,
    }
    var v T

    schema := reflector.Reflect(v)

    return anthropic.ToolInputSchemaParam{
        Properties: schema.Properties,
    }
}

// runInference sends the conversation and the tool definitions to the model.
func (a *Agent) runInference(ctx context.Context, conversation []anthropic.MessageParam) (*anthropic.Message, error) {
    anthropicTools := []anthropic.ToolUnionParam{}
    for _, tool := range a.tools {
        anthropicTools = append(anthropicTools, anthropic.ToolUnionParam{
            OfTool: &anthropic.ToolParam{
                Name:        tool.Name,
                Description: anthropic.String(tool.Description),
                InputSchema: tool.InputSchema,
            },
        })
    }

    message, err := a.client.Messages.New(ctx, anthropic.MessageNewParams{
        Model:     anthropic.ModelClaude3_7SonnetLatest,
        MaxTokens: int64(1024),
        Messages:  conversation,
        Tools:     anthropicTools,
    })
    return message, err
}

// executeTool looks up the named tool and runs it with the given input.
func (a *Agent) executeTool(id, name string, input json.RawMessage) anthropic.ContentBlockParamUnion {
    var toolDef ToolDefinition
    var found bool
    for _, tool := range a.tools {
        if tool.Name == name {
            toolDef = tool
            found = true
            break
        }
    }
    if !found {
        return anthropic.NewToolResultBlock(id, "tool not found", true)
    }

    fmt.Printf("\u001b[92mtool\u001b[0m: %s(%s)\n", name, input)
    response, err := toolDef.Function(input)
    if err != nil {
        return anthropic.NewToolResultBlock(id, err.Error(), true)
    }
    return anthropic.NewToolResultBlock(id, response, false)
}

// Run is the agent's main loop.
func (a *Agent) Run(ctx context.Context) error {
    conversation := []anthropic.MessageParam{}

    fmt.Println("Chat with Claude (use 'ctrl-c' to quit)")

    readUserInput := true
    for {
        if readUserInput {
            fmt.Print("\u001b[94mYou\u001b[0m: ")
            userInput, ok := a.getUserMessage()
            if !ok {
                break
            }

            userMessage := anthropic.NewUserMessage(anthropic.NewTextBlock(userInput))
            conversation = append(conversation, userMessage)
        }

        message, err := a.runInference(ctx, conversation)
        if err != nil {
            return err
        }
        conversation = append(conversation, message.ToParam())

        toolResults := []anthropic.ContentBlockParamUnion{}
        for _, content := range message.Content {
            switch content.Type {
            case "text":
                fmt.Printf("\u001b[93mClaude\u001b[0m: %s\n", content.Text)
            case "tool_use":
                result := a.executeTool(content.ID, content.Name, content.Input)
                toolResults = append(toolResults, result)
            }
        }
        if len(toolResults) == 0 {
            readUserInput = true
            continue
        }
        readUserInput = false
        conversation = append(conversation, anthropic.NewUserMessage(toolResults...))
    }

    return nil
}

// ReadFileInput is the input for the read_file tool.
type ReadFileInput struct {
    Path string `json:"path" jsonschema_description:"The relative path of a file in the working directory."`
}

// ReadFileInputSchema is the input schema for read_file.
var ReadFileInputSchema = GenerateSchema[ReadFileInput]()

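// ReadFile returns the contents of the file at the given path.
func ReadFile(input json.RawMessage) (string, error) {
    readFileInput := ReadFileInput{}
    err := json.Unmarshal(input, &readFileInput)
    if err != nil {
        // Report malformed input back to the model instead of crashing the agent.
        return "", err
    }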

    content, err := os.ReadFile(readFileInput.Path)
    if err != nil {
        return "", err
    }
    return string(content), nil
}

// ReadFileDefinition is the tool definition for read_file.
var ReadFileDefinition = ToolDefinition{
    Name:        "read_file",
    Description: "Read the contents of a given relative file path. Use this when you want to see what's inside a file. Do not use this with directory names.",
    InputSchema: ReadFileInputSchema,
    Function:    ReadFile,
}

// ListFilesInput is the input for the list_files tool.
type ListFilesInput struct {
    Path string `json:"path,omitempty" jsonschema_description:"Optional relative path to list files from. Defaults to current directory if not provided."`
}

// ListFilesInputSchema is the input schema for list_files.
var ListFilesInputSchema = GenerateSchema[ListFilesInput]()

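// ListFiles recursively lists the files and directories under the given path.
func ListFiles(input json.RawMessage) (string, error) {
    listFilesInput := ListFilesInput{}
    err := json.Unmarshal(input, &listFilesInput)
    if err != nil {
        // Report malformed input back to the model instead of crashing the agent.
        return "", err
    }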

    dir := "."
    if listFilesInput.Path != "" {
        dir = listFilesInput.Path
    }

    var files []string
    err = filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err
        }

        relPath, err := filepath.Rel(dir, path)
        if err != nil {
            return err
        }

        if relPath != "." {
            if info.IsDir() {
                files = append(files, relPath+"/")
            } else {
                files = append(files, relPath)
            }
        }
        return nil
    })

    if err != nil {
        return "", err
    }

    result, err := json.Marshal(files)
    if err != nil {
        return "", err
    }

    return string(result), nil
}

// ListFilesDefinition is the tool definition for list_files.
var ListFilesDefinition = ToolDefinition{
    Name:        "list_files",
    Description: "List files and directories at a given path. If no path is provided, lists files in the current directory.",
    InputSchema: ListFilesInputSchema,
    Function:    ListFiles,
}

// EditFileInput is the input for the edit_file tool.
type EditFileInput struct {
    Path   string `json:"path" jsonschema_description:"The path to the file"`
    OldStr string `json:"old_str" jsonschema_description:"Text to search for - must match exactly and must only have one match exactly"`
    NewStr string `json:"new_str" jsonschema_description:"Text to replace old_str with"`
}

// EditFileInputSchema is the input schema for edit_file.
var EditFileInputSchema = GenerateSchema[EditFileInput]()

// createNewFile creates a file, along with any missing parent directories.
func createNewFile(filePath, content string) (string, error) {
    dir := path.Dir(filePath)
    if dir != "." {
        err := os.MkdirAll(dir, 0755)
        if err != nil {
            return "", fmt.Errorf("failed to create directory: %w", err)
        }
    }

    err := os.WriteFile(filePath, []byte(content), 0644)
    if err != nil {
        return "", fmt.Errorf("failed to create file: %w", err)
    }

    return fmt.Sprintf("Successfully created file %s", filePath), nil
}

// EditFile replaces old_str with new_str in the file at the given path.
func EditFile(input json.RawMessage) (string, error) {
    editFileInput := EditFileInput{}
    err := json.Unmarshal(input, &editFileInput)
    if err != nil {
        return "", err
    }

    if editFileInput.Path == "" || editFileInput.OldStr == editFileInput.NewStr {
        return "", fmt.Errorf("invalid input parameters")
    }

    content, err := os.ReadFile(editFileInput.Path)
    if err != nil {
        if os.IsNotExist(err) && editFileInput.OldStr == "" {
            return createNewFile(editFileInput.Path, editFileInput.NewStr)
        }
        return "", err
    }

    oldContent := string(content)
    newContent := strings.Replace(oldContent, editFileInput.OldStr, editFileInput.NewStr, -1)

    if oldContent == newContent && editFileInput.OldStr != "" {
        return "", fmt.Errorf("old_str not found in file")
    }

    err = os.WriteFile(editFileInput.Path, []byte(newContent), 0644)
    if err != nil {
        return "", err
    }

    return "OK", nil
}

// EditFileDefinition is the tool definition for edit_file.
var EditFileDefinition = ToolDefinition{
    Name: "edit_file",
    Description: `Make edits to a text file.
    
    Replace 'old_str' with 'new_str' in the given file. 'old_str' and 'new_str' MUST be different from each other.
    
    If the file specified with path doesn't exist, it will be created.`,
    InputSchema: EditFileInputSchema,
    Function:    EditFile,
}

// main wires up the client, the tools, and the agent.
func main() {
    client := anthropic.NewClient()

    scanner := bufio.NewScanner(os.Stdin)
    getUserMessage := func() (string, bool) {
        if !scanner.Scan() {
            return "", false
        }
        return scanner.Text(), true
    }

    tools := []ToolDefinition{ReadFileDefinition, ListFilesDefinition, EditFileDefinition}
    agent := NewAgent(&client, getUserMessage, tools)
    err := agent.Run(context.TODO())
    if err != nil {
        fmt.Printf("Error: %s\n", err.Error())
    }
}
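One detail worth noticing in the listing above: the old_str schema description says the text "must only have one match exactly", while EditFile calls strings.Replace with -1, which replaces every occurrence. If you want the code to enforce the description, a minimal sketch could look like the following. replaceExactlyOnce is a hypothetical helper, not part of the tutorial, and it counts non-overlapping matches only.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceExactlyOnce is a hypothetical helper, not part of the tutorial code.
// It replaces oldStr with newStr only when oldStr occurs exactly once in
// content (counting non-overlapping matches), and returns an error otherwise.
func replaceExactlyOnce(content, oldStr, newStr string) (string, error) {
	if oldStr == "" {
		return "", fmt.Errorf("old_str must not be empty")
	}
	switch n := strings.Count(content, oldStr); n {
	case 0:
		return "", fmt.Errorf("old_str not found in file")
	case 1:
		return strings.Replace(content, oldStr, newStr, 1), nil
	default:
		return "", fmt.Errorf("old_str matches %d times, expected exactly one", n)
	}
}
```

Returning a descriptive error rather than silently replacing all matches gives the model a chance to retry with a longer, unambiguous old_str.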

5. Code I have tested and confirmed working

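I don't have a Claude API key, so I couldn't test the code above. I do have a Gemini API key (available for free through Google AI Studio), so I ported the code from anthropic-sdk-go to the Gemini Go SDK (google.golang.org/genai). The resulting code is below for reference; I have run it myself and it works.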
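One difference the port has to bridge: in the Gemini listing, tool arguments arrive as a generic map (call.Args), while the tool functions still accept json.RawMessage and decode it into a typed struct. The stdlib-only sketch below shows that round trip in isolation. decodeArgs is a hypothetical helper introduced for illustration; the listing itself does the same thing inline with json.Marshal and json.Unmarshal.

```go
package main

import "encoding/json"

// decodeArgs is a hypothetical helper: it re-encodes a generic argument map
// as JSON and decodes the result into a typed struct of type T.
func decodeArgs[T any](args map[string]any) (T, error) {
	var out T
	raw, err := json.Marshal(args)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(raw, &out)
	return out, err
}

// ReadFileInput mirrors the input struct used by the read_file tool.
type ReadFileInput struct {
	Path string `json:"path"`
}
```

A call such as decodeArgs[ReadFileInput](map[string]any{"path": "main.go"}) yields a struct whose Path is "main.go"; a value of the wrong type, for example a number for path, comes back as an error that can be passed to the model.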

The full steps are:

  • Open a terminal and create the project
mkdir gemini-agent
cd gemini-agent
go mod init agent
touch main.go
  • Fill main.go with the following code
// main.go

package main

import (
    "bufio"
    "context"
    "encoding/json"
    "fmt"
    "os"
    "path"
    "path/filepath"
    "strings"

    "google.golang.org/genai"
)

// ToolDefinition describes a tool the model can call.
type ToolDefinition struct {
    Name        string        `json:"name"`
    Description string        `json:"description"`
    Parameters  *genai.Schema `json:"parameters"`
    Function    func(input json.RawMessage) (string, error)
}

// Agent holds the client, the user-input reader, and the available tools.
type Agent struct {
    client         *genai.Client
    getUserMessage func() (string, bool)
    tools          []ToolDefinition
}

// NewAgent creates a new Agent.
func NewAgent(
    client *genai.Client,
    getUserMessage func() (string, bool),
    tools []ToolDefinition,
) *Agent {
    return &Agent{
        client:         client,
        getUserMessage: getUserMessage,
        tools:          tools,
    }
}

// runInference sends the conversation and the tool declarations to the model.
func (a *Agent) runInference(ctx context.Context, conversation []*genai.Content) (*genai.GenerateContentResponse, error) {
    var geminiTools []*genai.Tool

    if len(a.tools) > 0 {
        var functionDeclarations []*genai.FunctionDeclaration

        for _, tool := range a.tools {
            functionDeclarations = append(functionDeclarations, &genai.FunctionDeclaration{
                Name:        tool.Name,
                Description: tool.Description,
                Parameters:  tool.Parameters,
            })
        }

        geminiTools = []*genai.Tool{
            {
                FunctionDeclarations: functionDeclarations,
            },
        }
    }

    response, err := a.client.Models.GenerateContent(ctx, "gemini-2.5-pro", conversation, &genai.GenerateContentConfig{
        Tools: geminiTools,
    })

    return response, err
}

// executeTool looks up the requested tool and runs it with the call's arguments.
func (a *Agent) executeTool(call *genai.FunctionCall) *genai.Part {
    var toolDef ToolDefinition
    var found bool

    for _, tool := range a.tools {
        if tool.Name == call.Name {
            toolDef = tool
            found = true
            break
        }
    }

    if !found {
        return genai.NewPartFromFunctionResponse(call.Name, map[string]any{
            "error": "tool not found",
        })
    }

    fmt.Printf("\u001b[92mtool\u001b[0m: %s(%s)\n", call.Name, mustMarshal(call.Args))

    // Convert the args to json.RawMessage.
    argsBytes, err := json.Marshal(call.Args)
    if err != nil {
        return genai.NewPartFromFunctionResponse(call.Name, map[string]any{
            "error": err.Error(),
        })
    }

    response, err := toolDef.Function(json.RawMessage(argsBytes))
    if err != nil {
        return genai.NewPartFromFunctionResponse(call.Name, map[string]any{
            "error": err.Error(),
        })
    }

    return genai.NewPartFromFunctionResponse(call.Name, map[string]any{
        "result": response,
    })
}

// Run is the agent's main loop.
func (a *Agent) Run(ctx context.Context) error {
    var conversation []*genai.Content

    fmt.Println("Chat with Gemini (use 'ctrl-c' to quit)")

    readUserInput := true
    for {
        if readUserInput {
            fmt.Print("\u001b[94mYou\u001b[0m: ")
            userInput, ok := a.getUserMessage()
            if !ok {
                break
            }

            userMessage := &genai.Content{
                Parts: []*genai.Part{genai.NewPartFromText(userInput)},
                Role:  genai.RoleUser,
            }
            conversation = append(conversation, userMessage)
        }

        response, err := a.runInference(ctx, conversation)
        if err != nil {
            return err
        }

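        // Convert the response into conversation format. If no usable candidate
        // came back, return control to the user rather than retrying forever.
        if len(response.Candidates) == 0 || response.Candidates[0].Content == nil {
            readUserInput = true
            continue
        }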

        candidate := response.Candidates[0]
        conversation = append(conversation, &genai.Content{
            Parts: candidate.Content.Parts,
            Role:  genai.RoleModel,
        })

        var toolResponseParts []*genai.Part
        hasToolCalls := false

        for _, part := range candidate.Content.Parts {
            if part.FunctionCall != nil {
                hasToolCalls = true
                result := a.executeTool(part.FunctionCall)
                toolResponseParts = append(toolResponseParts, result)
            } else if part.Text != "" {
                fmt.Printf("\u001b[93mGemini\u001b[0m: %s\n", part.Text)
            }
        }

        if hasToolCalls {
            // Append the tool responses to the conversation.
            toolResponseMessage := &genai.Content{
                Parts: toolResponseParts,
                Role:  genai.RoleUser,
            }
            conversation = append(conversation, toolResponseMessage)
            readUserInput = false
        } else {
            readUserInput = true
        }
    }

    return nil
}

// ReadFileInput is the input for the read_file tool.
type ReadFileInput struct {
    Path string `json:"path"`
}

// ReadFile returns the contents of the file at the given path.
func ReadFile(input json.RawMessage) (string, error) {
    var readFileInput ReadFileInput
    err := json.Unmarshal(input, &readFileInput)
    if err != nil {
        return "", err
    }

    content, err := os.ReadFile(readFileInput.Path)
    if err != nil {
        return "", err
    }
    return string(content), nil
}

// ReadFileDefinition is the tool definition for read_file.
var ReadFileDefinition = ToolDefinition{
    Name:        "read_file",
    Description: "Read the contents of a given relative file path. Use this when you want to see what's inside a file. Do not use this with directory names.",
    Parameters: &genai.Schema{
        Type: genai.TypeObject,
        Properties: map[string]*genai.Schema{
            "path": {
                Type:        genai.TypeString,
                Description: "The relative path of a file in the working directory.",
            },
        },
        Required: []string{"path"},
    },
    Function: ReadFile,
}

// ListFilesInput is the input for the list_files tool.
type ListFilesInput struct {
    Path string `json:"path,omitempty"`
}

// ListFiles recursively lists the files and directories under the given path.
func ListFiles(input json.RawMessage) (string, error) {
    var listFilesInput ListFilesInput
    err := json.Unmarshal(input, &listFilesInput)
    if err != nil {
        return "", err
    }

    dir := "."
    if listFilesInput.Path != "" {
        dir = listFilesInput.Path
    }

    var files []string
    err = filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err
        }

        relPath, err := filepath.Rel(dir, path)
        if err != nil {
            return err
        }

        if relPath != "." {
            if info.IsDir() {
                files = append(files, relPath+"/")
            } else {
                files = append(files, relPath)
            }
        }
        return nil
    })

    if err != nil {
        return "", err
    }

    result, err := json.Marshal(files)
    if err != nil {
        return "", err
    }

    return string(result), nil
}

// ListFilesDefinition is the tool definition for list_files.
var ListFilesDefinition = ToolDefinition{
    Name:        "list_files",
    Description: "List files and directories at a given path. If no path is provided, lists files in the current directory.",
    Parameters: &genai.Schema{
        Type: genai.TypeObject,
        Properties: map[string]*genai.Schema{
            "path": {
                Type:        genai.TypeString,
                Description: "Optional relative path to list files from. Defaults to current directory if not provided.",
            },
        },
    },
    Function: ListFiles,
}

// EditFileInput is the input for the edit_file tool.
type EditFileInput struct {
    Path   string `json:"path"`
    OldStr string `json:"old_str"`
    NewStr string `json:"new_str"`
}

// createNewFile creates a file, along with any missing parent directories.
func createNewFile(filePath, content string) (string, error) {
    dir := path.Dir(filePath)
    if dir != "." {
        err := os.MkdirAll(dir, 0755)
        if err != nil {
            return "", fmt.Errorf("failed to create directory: %w", err)
        }
    }

    err := os.WriteFile(filePath, []byte(content), 0644)
    if err != nil {
        return "", fmt.Errorf("failed to create file: %w", err)
    }

    return fmt.Sprintf("Successfully created file %s", filePath), nil
}

// EditFile replaces old_str with new_str in the file at the given path.
func EditFile(input json.RawMessage) (string, error) {
    var editFileInput EditFileInput
    err := json.Unmarshal(input, &editFileInput)
    if err != nil {
        return "", err
    }

    if editFileInput.Path == "" || editFileInput.OldStr == editFileInput.NewStr {
        return "", fmt.Errorf("invalid input parameters")
    }

    content, err := os.ReadFile(editFileInput.Path)
    if err != nil {
        if os.IsNotExist(err) && editFileInput.OldStr == "" {
            return createNewFile(editFileInput.Path, editFileInput.NewStr)
        }
        return "", err
    }

    oldContent := string(content)
    newContent := strings.Replace(oldContent, editFileInput.OldStr, editFileInput.NewStr, -1)

    if oldContent == newContent && editFileInput.OldStr != "" {
        return "", fmt.Errorf("old_str not found in file")
    }

    err = os.WriteFile(editFileInput.Path, []byte(newContent), 0644)
    if err != nil {
        return "", err
    }

    return "OK", nil
}

// EditFileDefinition is the tool definition for edit_file.
var EditFileDefinition = ToolDefinition{
    Name: "edit_file",
    Description: `Make edits to a text file.
    
    Replace 'old_str' with 'new_str' in the given file. 'old_str' and 'new_str' MUST be different from each other.
    
    If the file specified with path doesn't exist, it will be created.`,
    Parameters: &genai.Schema{
        Type: genai.TypeObject,
        Properties: map[string]*genai.Schema{
            "path": {
                Type:        genai.TypeString,
                Description: "The path to the file",
            },
            "old_str": {
                Type:        genai.TypeString,
                Description: "Text to search for - must match exactly and must only have one match exactly",
            },
            "new_str": {
                Type:        genai.TypeString,
                Description: "Text to replace old_str with",
            },
        },
        Required: []string{"path", "old_str", "new_str"},
    },
    Function: EditFile,
}

// mustMarshal is a helper that renders a value as a JSON string for logging.
func mustMarshal(v interface{}) string {
    data, err := json.Marshal(v)
    if err != nil {
        return fmt.Sprintf("failed to marshal: %v", err)
    }
    return string(data)
}

// main wires up the client, the tools, and the agent.
func main() {
    ctx := context.Background()

    // Initialize the Gemini client with a free API key from AI Studio.
    client, err := genai.NewClient(ctx, &genai.ClientConfig{
        APIKey:  os.Getenv("GEMINI_API_KEY"), // read the API key from the environment
        Backend: genai.BackendGeminiAPI,
    })
    if err != nil {
        fmt.Printf("Failed to create client: %v\n", err)
        return
    }

    scanner := bufio.NewScanner(os.Stdin)
    getUserMessage := func() (string, bool) {
        if !scanner.Scan() {
            return "", false
        }
        return scanner.Text(), true
    }

    tools := []ToolDefinition{ReadFileDefinition, ListFilesDefinition, EditFileDefinition}
    agent := NewAgent(client, getUserMessage, tools)
    err = agent.Run(ctx)
    if err != nil {
        fmt.Printf("Error: %s\n", err.Error())
    }
}
  • Set the API key
printf "\nexport GEMINI_API_KEY=your-api-key-from-google-ai-studio\n" >> ~/.zshrc
source ~/.zshrc

Replace your-api-key-from-google-ai-studio with your own API key. I'm on a Mac with Zsh; adjust accordingly for other setups.

  • Download the dependencies and run
go mod tidy
go run main.go

You can then start giving the agent tasks.

My custom agent shot

To quit, press Control + C.
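The setup above depends on GEMINI_API_KEY being exported in the shell that runs the program. If the variable is missing, main still passes an empty key to the client. As an optional hardening step, you could check for the key up front; the sketch below assumes a hypothetical helper named apiKeyFrom, with the environment lookup injected (normally os.Getenv) so the check can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"strings"
)

// apiKeyFrom is a hypothetical helper, not part of the listing above.
// getenv is injected (normally os.Getenv) so the check is easy to test.
// It returns the trimmed key, or an error when the variable is unset or blank.
func apiKeyFrom(getenv func(string) string) (string, error) {
	key := strings.TrimSpace(getenv("GEMINI_API_KEY"))
	if key == "" {
		return "", fmt.Errorf("GEMINI_API_KEY is not set; export it in your shell profile first")
	}
	return key, nil
}
```

In main, this would be called as apiKeyFrom(os.Getenv) before creating the client, printing the error and returning if the key is missing.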

6. Recommended reading