Understanding MCP in One Article - Development and Tuning

Implementing a Simple Server

Install uv by running the following command in PowerShell:

irm https://astral.sh/uv/install.ps1 | iex

Create the project directory:

uv init calculator_server

Change into the folder:

cd calculator_server

Create a virtual environment:

uv venv

Activate the virtual environment:

.venv\Scripts\activate

Install the dependencies:

uv add mcp[cli] httpx

Writing the Code

Create a calculator_server.py file:

from mcp.server.fastmcp import FastMCP
from typing import Union

mcp = FastMCP()

@mcp.tool()
def add(x: float, y: float) -> float:
    """Perform addition and return the sum of two numbers."""
    return x + y

@mcp.tool()
def subtract(x: float, y: float) -> float:
    """Perform subtraction and return the first number minus the second."""
    return x - y

@mcp.tool()
def multiply(x: float, y: float) -> float:
    """Perform multiplication and return the product of two numbers."""
    return x * y

@mcp.tool()
def divide(x: float, y: float) -> Union[float, str]:
    """Perform division; return an error message when the divisor is zero."""
    if y == 0:
        return "Error: the divisor cannot be zero"
    return x / y

if __name__ == "__main__":
    # Serve the tools over standard input/output
    mcp.run(transport='stdio')

Running the Project

Open the project in VS Code, press Ctrl+` (backtick) to open the terminal, and run the following commands:

Activate the environment:

.venv\Scripts\activate

Run the project:

mcp dev calculator_server.py

Alternatively, run it directly in Cherry Studio:

Cherry Studio download link
After installing, double-click to open it and go to Settings.

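`MCP` Concepts

What is MCP (Model Context Protocol)? The official explanation is: MCP is an open protocol that standardizes how applications provide context to LLMs. You can think of MCP as a USB-C port for AI applications.

What we usually implement is the MCP server; for the client we typically use Cherry Studio, Claude Desktop, an IDE, and so on. An MCP server is simply a set of tools that we define with `@mcp.tool()`, which is comparable to `@app.route` in Flask.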
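To make the Flask comparison concrete, here is a minimal sketch of how a decorator in the style of `@mcp.tool()` can register functions by name so that a request can later look them up. `ToolRegistry` and its `call` method are hypothetical names written for this illustration; this is not FastMCP's actual implementation.

```python
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical stand-in for FastMCP, for illustration only."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable] = {}

    def tool(self) -> Callable[[Callable], Callable]:
        """Return a decorator that records the function under its own name."""
        def decorator(func: Callable) -> Callable:
            self.tools[func.__name__] = func
            return func  # the function itself is left unchanged
        return decorator

    def call(self, name: str, **kwargs):
        """Look a tool up by name and call it, as a client request would."""
        return self.tools[name](**kwargs)


registry = ToolRegistry()


@registry.tool()
def add(x: float, y: float) -> float:
    """Return the sum of two numbers."""
    return x + y


print(sorted(registry.tools))          # ['add']
print(registry.call("add", x=1, y=2))  # 3
```

Just as Flask maps a URL to a view function, the registry maps a tool name to a Python function; the decorator only records the mapping.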

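General Architecture

Official explanation: at its core, MCP follows a client-server architecture.

All we need to know is that a server can work with both local resources and remote resources, and that remote resources are reached through web APIs.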
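The difference between the two kinds of resource can be sketched with two plain functions. Both `read_local_note` and `fetch_remote_json` are hypothetical helpers invented for this example; in a server like the one above, each could be exposed by decorating it with `@mcp.tool()`. The sketch uses only the standard library.

```python
import json
import tempfile
import urllib.request
from pathlib import Path


def read_local_note(path: str) -> str:
    """Local resource: read a file on the machine the server runs on."""
    return Path(path).read_text(encoding="utf-8")


def fetch_remote_json(url: str, timeout: float = 10.0) -> dict:
    """Remote resource: call a web API over HTTP and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Demonstrate the local case with a temporary file. The remote case
    # needs a real URL, so it is not called here.
    with tempfile.TemporaryDirectory() as tmp:
        note = Path(tmp) / "note.txt"
        note.write_text("hello from disk", encoding="utf-8")
        print(read_local_note(str(note)))
```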

Official diagram

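How AI Uses MCP

Client: sends the user's question to the AI together with the tools the MCP server provides.
AI: replies, stating which tools should be called.
Client: receives the AI's reply and checks whether it contains `tool_use`. If it does, a tool call is being requested, so the client calls the tool named in the reply and finally renders the reply together with the tool's result in the user interface.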
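The three steps above can be sketched in a few lines. Everything here is a simplified assumption: `fake_model` stands in for the real AI and always requests the `add` tool, `TOOLS` stands in for the MCP server, and `find_tool_use` and `answer` are hypothetical helper names.

```python
import json
import re
from typing import Callable, Dict, Optional, Tuple

# Tools the (hypothetical) server exposes.
TOOLS: Dict[str, Callable[..., float]] = {
    "add": lambda x, y: x + y,
}


def fake_model(question: str, tool_names: list) -> str:
    """Stand-in for the AI: always asks for the add tool."""
    return (
        "I will use the add tool.\n"
        "<tool_use><name>add</name>"
        '<arguments>{"x": 1, "y": 2}</arguments></tool_use>'
    )


def find_tool_use(reply: str) -> Optional[Tuple[str, dict]]:
    """Return (tool name, arguments) if the reply contains a tool_use block."""
    match = re.search(
        r"<tool_use>\s*<name>(.*?)</name>\s*"
        r"<arguments>(.*?)</arguments>\s*</tool_use>",
        reply,
        re.DOTALL,
    )
    if match is None:
        return None
    return match.group(1).strip(), json.loads(match.group(2))


def answer(question: str) -> str:
    # Step 1: send the question together with the available tools.
    reply = fake_model(question, list(TOOLS))
    # Step 2: the reply says which tool to call, if any.
    request = find_tool_use(reply)
    if request is None:
        return reply
    # Step 3: call the tool and show the reply plus the tool result.
    name, args = request
    result = TOOLS[name](**args)
    return f"{reply}\n[tool {name} returned {result}]"


print(answer("What is 1 + 2?"))
```

The final line of the printed output is `[tool add returned 3]`, appended after the model's own reply.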

A Simple Client Implementation

import asyncio
import hashlib
import json
import re
import xml.etree.ElementTree as ET
from typing import Optional
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
import requests

class MCPClient:
    def generate_tool_hash(self, tool_name: str, tool_args: dict) -> str:
        """Generate an MD5 hash that identifies a tool."""
        # Combine the tool name and its arguments into a single string
        tool_str = f"{tool_name}:{str(tool_args)}"
        # Hash the combined string with MD5
        return hashlib.md5(tool_str.encode()).hexdigest()

    def parse_tool_use(self, response_text: str) -> tuple[Optional[str], Optional[dict]]:
        """Parse the tool_use XML and return the tool hash and its arguments."""
        try:
            # Extract the contents of the <tool_use> tag with a regular expression
            tool_use_match = re.search(r'<tool_use>(.*?)</tool_use>', response_text, re.DOTALL)
            if not tool_use_match:
                return None, None

            tool_use_xml = f'<tool_use>{tool_use_match.group(1)}</tool_use>'
            root = ET.fromstring(tool_use_xml)

            tool_hash = root.find('name').text
            # The prompt asks for JSON arguments, so parse them as JSON instead of calling eval()
            arguments = json.loads(root.find('arguments').text)
            return tool_hash, arguments
        except Exception as e:
            print(f"Failed to parse tool_use: {e}")
            return None, None

    def get_tool_name_by_hash(self, tool_hash: str, tools) -> str:
        """Look up the tool name that corresponds to a hash."""
        for tool in tools:
            current_hash = self.generate_tool_hash(tool.name, {})
            if current_hash == tool_hash:
                return tool.name
        return None

    async def connect_to_server(self, server_script_path: str):
        """Connect to an MCP server and call a tool.

        Args:
            server_script_path: path to the server script (.py or .js)
        """
        is_python = server_script_path.endswith('.py')
        is_js = server_script_path.endswith('.js')
        if not (is_python or is_js):
            raise ValueError("The server script must be a .py or .js file")

        command = "python" if is_python else "node"
        server_params = StdioServerParameters(
            command=command,
            args=[server_script_path],
            env=None
        )

        async with stdio_client(server_params) as stdio_transport:
            self.stdio, self.write = stdio_transport
            async with ClientSession(self.stdio, self.write) as self.session:
                await self.session.initialize()

                # Fetch the list of available tools
                response = await self.session.list_tools()
                # Build the tool descriptions in XML format
                tools_xml = "<tools>\n\n"
                for tool in response.tools:
                    # Advertise each tool under the MD5 hash of its name
                    tool_hash = self.generate_tool_hash(tool.name, {})
                    tools_xml += f"<tool>\n"
                    tools_xml += f"  <name>{tool_hash}</name>\n"
                    tools_xml += f"  <description>{tool.description}</description>\n"
                    tools_xml += f"  <arguments>\n    {tool.inputSchema}\n  </arguments>\n"
                    tools_xml += f"</tool>\n\n\n"
                tools_xml += "</tools>"
                # print(tools_xml)

                prompts_part1 = """In this environment you have access to a set of tools you can use to answer the user\'s question. You can use one tool per message, and will receive the result of that tool use in the user\'s response. You use tools step-by-step to accomplish a given task, with each tool use informed by the result of the previous tool use.\n\n## Tool Use Formatting\n\nTool use is formatted using XML-style tags. The tool name is enclosed in opening and closing tags, and each parameter is similarly enclosed within its own set of tags. Here\'s the structure:\n\n<tool_use>\n  <name>{tool_name}</name>\n  <arguments>{json_arguments}</arguments>\n</tool_use>\n\nThe tool name should be the exact name of the tool you are using, and the arguments should be a JSON object containing the parameters required by that tool. For example:\n<tool_use>\n  <name>python_interpreter</name>\n  <arguments>{"code": "5 + 3 + 1294.678"}</arguments>\n</tool_use>\n\nThe user will respond with the result of the tool use, which should be formatted as follows:\n\n<tool_use_result>\n  <name>{tool_name}</name>\n  <result>{result}</result>\n</tool_use_result>\n\nThe result should be a string, which can represent a file or any other output type. 
You can use this result as input for the next action.\nFor example, if the result of the tool use is an image file, you can use it in the next action like this:\n\n<tool_use>\n  <name>image_transformer</name>\n  <arguments>{"image": "image_1.jpg"}</arguments>\n</tool_use>\n\nAlways adhere to this format for the tool use to ensure proper parsing and execution.\n\n## Tool Use Examples\n\nHere are a few examples using notional tools:\n---\nUser: Generate an image of the oldest person in this document.\n\nAssistant: I can use the document_qa tool to find out who the oldest person is in the document.\n<tool_use>\n  <name>document_qa</name>\n  <arguments>{"document": "document.pdf", "question": "Who is the oldest person mentioned?"}</arguments>\n</tool_use>\n\nUser: <tool_use_result>\n  <name>document_qa</name>\n  <result>John Doe, a 55 year old lumberjack living in Newfoundland.</result>\n</tool_use_result>\n\nAssistant: I can use the image_generator tool to create a portrait of John Doe.\n<tool_use>\n  <name>image_generator</name>\n  <arguments>{"prompt": "A portrait of John Doe, a 55-year-old man living in Canada."}</arguments>\n</tool_use>\n\nUser: <tool_use_result>\n  <name>image_generator</name>\n  <result>image.png</result>\n</tool_use_result>\n\nAssistant: the image is generated as image.png\n\n---\nUser: "What is the result of the following operation: 5 + 3 + 1294.678?"\n\nAssistant: I can use the python_interpreter tool to calculate the result of the operation.\n<tool_use>\n  <name>python_interpreter</name>\n  <arguments>{"code": "5 + 3 + 1294.678"}</arguments>\n</tool_use>\n\nUser: <tool_use_result>\n  <name>python_interpreter</name>\n  <result>1302.678</result>\n</tool_use_result>\n\nAssistant: The result of the operation is 1302.678.\n\n---\nUser: "Which city has the highest population , Guangzhou or Shanghai?"\n\nAssistant: I can use the search tool to find the population of Guangzhou.\n<tool_use>\n  <name>search</name>\n  <arguments>{"query": "Population 
Guangzhou"}</arguments>\n</tool_use>\n\nUser: <tool_use_result>\n  <name>search</name>\n  <result>Guangzhou has a population of 15 million inhabitants as of 2021.</result>\n</tool_use_result>\n\nAssistant: I can use the search tool to find the population of Shanghai.\n<tool_use>\n  <name>search</name>\n  <arguments>{"query": "Population Shanghai"}</arguments>\n</tool_use>\n\nUser: <tool_use_result>\n  <name>search</name>\n  <result>26 million (2019)</result>\n</tool_use_result>\nAssistant: The population of Shanghai is 26 million, while Guangzhou has a population of 15 million. Therefore, Shanghai has the highest population.\n\n\n## Tool Use Available Tools\nAbove example were using notional tools that might not exist for you. You only have access to these tools:\n"""

                prompts_part2 = """\n\n## Tool Use Rules\nHere are the rules you should always follow to solve your task:\n1. Always use the right arguments for the tools. Never use variable names as the action arguments, use the value instead.\n2. Call a tool only when needed: do not call the search agent if you do not need information, try to solve the task yourself.\n3. If no tool call is needed, just answer the question directly.\n4. Never re-do a tool call that you previously did with the exact same parameters.\n5. For tool use, MAKE SURE use XML tag format as shown in the examples above. Do not use any other format.\n\n# User Instructions\n\n\nNow Begin! If you solve the task correctly, you will receive a reward of $1,000,000."""

                prompts = prompts_part1 + tools_xml + prompts_part2
                # print(prompts)

                headers = {
                    'accept': '*/*',
                    'accept-language': 'zh-CN',
                    # Already added when you pass json=
                    # 'content-type': 'application/json',
                    'priority': 'u=1, i',
                    'sec-ch-ua': '"Not/A)Brand";v="8", "Chromium";v="126"',
                    'sec-ch-ua-mobile': '?0',
                    'sec-ch-ua-platform': '"Windows"',
                    'sec-fetch-dest': 'empty',
                    'sec-fetch-mode': 'cors',
                    'sec-fetch-site': 'cross-site',
                    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) CherryStudio/1.2.7 Chrome/126.0.6478.234 Electron/31.7.6 Safari/537.36',
                    'x-goog-api-client': 'google-genai-sdk/0.8.0 gl-node/web',
                    'x-goog-api-key': 'AIzaSyCD6Clgaaoq_aHyZEbBtrTFjcqm1wV45UA',
                }


                json_data = {
                    'contents': [
                        {
                            'parts': [
                                {
                                    'text': input("Enter your question: "),  # read the question from the console
                                },
                            ],
                            'role': 'user',
                        },
                    ],
                    'systemInstruction': {
                        'parts': [
                            {
                                'text': f'{prompts}',
                            },
                        ],
                        'role': 'user',
                    },
                    'safetySettings': [
                        {
                            'category': 'HARM_CATEGORY_HATE_SPEECH',
                            'threshold': 'BLOCK_NONE',
                        },
                        {
                            'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT',
                            'threshold': 'BLOCK_NONE',
                        },
                        {
                            'category': 'HARM_CATEGORY_HARASSMENT',
                            'threshold': 'BLOCK_NONE',
                        },
                        {
                            'category': 'HARM_CATEGORY_DANGEROUS_CONTENT',
                            'threshold': 'BLOCK_NONE',
                        },
                        {
                            'category': 'HARM_CATEGORY_CIVIC_INTEGRITY',
                            'threshold': 'BLOCK_NONE',
                        },
                    ],
                    'tools': [],
                    'generationConfig': {},
                }

                chat_response = requests.post('https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-8b:streamGenerateContent', headers=headers, json=json_data)

                # Uncomment to inspect the status code and raw body of chat_response
                # print(chat_response.status_code)
                # print(chat_response.text)

                # Print each chunk as it is read and collect the full reply
                ai_response = ""
                json_response = chat_response.json()
                for item in json_response:
                    if "candidates" in item:
                        text = item["candidates"][0]["content"]["parts"][0]["text"]
                        print(text, end='')
                        ai_response += text

                # response is a ListToolsResult object; read its tools attribute directly
                tools = response.tools

                # Parse tool_use and execute the tool call
                tool_hash, tool_args = self.parse_tool_use(ai_response)
                # Compare against None so that a tool with no arguments ({}) is still called
                if tool_hash and tool_args is not None:
                    tool_name = self.get_tool_name_by_hash(tool_hash, tools)
                    if tool_name:
                        print(f"\nCalling tool: {tool_name} arguments: {tool_args}")
                        result = await self.session.call_tool(tool_name, tool_args)

                        if result and result.content:
                            for item in result.content:
                                if item.type == 'text':
                                    print(f"Result: {item.text}")
                                else:
                                    print("The tool result is not in the expected format")
                        else:
                            print("The tool call returned nothing")
                    else:
                        print(f"No tool found for hash {tool_hash}")
                else:
                    print("No valid tool_use tag found")

async def main():
    if len(sys.argv) < 2:
        print("Usage: python call_tool.py <path to server script>")
        sys.exit(1)

    server_script_path = sys.argv[1]
    client = MCPClient()
    await client.connect_to_server(server_script_path)

if __name__ == "__main__":
    import sys
    asyncio.run(main())
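The client above advertises each tool to the model under the MD5 hash of its name and later maps the hash back by recomputing every hash in `get_tool_name_by_hash`. As a sketch of the same round trip, the mapping can also be built once as a dictionary. `tool_hash` and `build_lookup` are hypothetical helper names, and the tool names are taken from the calculator server.

```python
import hashlib
from typing import Dict, List


def tool_hash(tool_name: str) -> str:
    """Mirror generate_tool_hash(name, {}): the MD5 of '<name>:{}'."""
    return hashlib.md5(f"{tool_name}:{{}}".encode()).hexdigest()


def build_lookup(tool_names: List[str]) -> Dict[str, str]:
    """Map each hash back to its tool name, computed once."""
    return {tool_hash(name): name for name in tool_names}


names = ["add", "subtract", "multiply", "divide"]
lookup = build_lookup(names)

advertised = tool_hash("divide")  # what the model sees as the tool name
print(len(advertised))            # 32 (hex characters)
print(lookup.get(advertised))     # divide
print(lookup.get("not-a-hash"))   # None
```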

The x-goog-api-key was shared by forum member yiming.
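The client treats the response body as a JSON list of chunks and reads `candidates[0].content.parts[0].text` from each one. The sketch below shows that joining step in isolation with a few defensive checks. `join_chunks` is a hypothetical helper, and the sample data, including the metadata-only chunk, is made up to match the shape the client code expects.

```python
from typing import Any, Dict, List


def join_chunks(chunks: List[Dict[str, Any]]) -> str:
    """Concatenate the text of the first part of the first candidate in each chunk."""
    pieces = []
    for chunk in chunks:
        candidates = chunk.get("candidates")
        if not candidates:
            continue  # skip chunks that carry no candidate text
        parts = candidates[0].get("content", {}).get("parts", [])
        if parts and "text" in parts[0]:
            pieces.append(parts[0]["text"])
    return "".join(pieces)


# Hypothetical sample shaped like the list the client iterates over.
sample = [
    {"candidates": [{"content": {"parts": [{"text": "Hello, "}]}}]},
    {"usageMetadata": {"totalTokenCount": 3}},
    {"candidates": [{"content": {"parts": [{"text": "world"}]}}]},
]

print(join_chunks(sample))  # Hello, world
```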


Understanding MCP in One Article - Development and Tuning - LINUX DO
