
A Survey of Open-Source DeepResearch Implementations

· 6718 words · 14 minutes
DeepResearch DeepSearch Agent LLM Dify LangChain HuggingFace Zilliz Agents LLM-Applications
Author: Weaxs

Introduction

Deep Research is a research agent based on the o3 model that OpenAI launched on February 2.

OpenAI says the user only needs to provide a prompt, and the agent will produce a research report the way a research analyst would.

If the user's research requirements are not specific enough, the model asks the user to clarify the details; if the question is clear enough, the agent starts its online analysis.

That said, the Deep Research concept is not new: Google Gemini mentioned it back in December, and JinaAI started trying to implement it in January. Perplexity followed with its own Deep Research agent, and LangChain, Dify, HuggingFace, and others have open-sourced their own approaches as well.

In this post we look at these open-source projects to see how it is done in practice.

Deep Research

Let's start with a high-level view of the flow and how, from the engineering side, LLMs drive it.

First, note that at least two agents are needed:

  1. Research agent: decides whether to research another topic and, if so, gives the concrete research direction
  2. Report agent: summarizes everything the researcher has gathered and produces the final research report

With these two agents, the overall Deep Research flow is:

  1. The user asks a question
  2. The research agent proposes a concrete topic
  3. Relevant information is gathered via search engines, vector-store retrieval, and similar means
  4. The research agent is asked again whether there is a next topic; if so, go back to step 2, otherwise continue
  5. All the research data is handed to the report agent, which produces the final report

Overall this resembles a Reason + Act (ReAct) loop, extended with model reflection, requirement clarification, and similar refinements.

JinaAI DeepSearch Framework

In stricter taxonomies, however, the flow above only qualifies as Deep Search. A true Deep Research adds a planning agent at the very start, which generates a chain of thought (CoT) for solving the problem; every step of that chain then goes through the same loop as above, much like a report outline and its chapters. It also needs reflection, so that while each chapter is generated the whole report stays close to the topic. Finally, the complete per-step reports are collected and merged into one very long final report.

JinaAI DeepResearch Framework

Dify's implementation

Overview

Dify implements this through its Workflow feature; for details see the article DeepResearch: Building a Research Automation App with Dify.

Search goes through Tavily. The model is called in roughly two places:

  1. A researcher built on gpt-4o: decides the topic direction and whether to stop
  2. A reporter built on deepseek-r1: summarizes the search results and produces the final report

DeepResearch Dify Workflow

Research with gpt-4o

The research agent is an LLM node inside the Workflow's iteration block, using gpt-4o. Each request asks gpt-4o to return nextSearchTopic and shouldContinue. The prompt:

You are a research agent investigating the following topic.
What have you found? What questions remain unanswered? What specific aspects should be investigated next?

## Output
- Do not output topics that are exactly the same as already searched topics.
- If further information search is needed, set nextSearchTopic.
- If sufficient information has been obtained, set shouldContinue to false.
- Please output in json format

```json
nextSearchTopic: str | None
shouldContinue: bool
```

Dify LLM Node GPT-4o
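Consuming the structured output above is straightforward; a minimal sketch, assuming the model reliably returns the JSON schema shown (parse_research_decision is a hypothetical helper, not part of Dify):

```python
import json


def parse_research_decision(raw: str):
    """Return (next_search_topic, should_continue) from the researcher's JSON reply."""
    data = json.loads(raw)
    return data.get("nextSearchTopic"), bool(data.get("shouldContinue"))
```

In practice the loop continues while should_continue is true and feeds next_search_topic into the search node.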

Report writing with deepseek-r1

The report stage uses a reasoning model; this implementation simply uses deepseek-r1 for the job.

Let's look at the persona prompt in detail. The user prompt only passes the user's query and the findings returned by the search API; the key part is the system persona.

Dify LLM Node Deepseek-R1

LangChain's implementation

Flow overview

What LangChain offers here is mainly source code: a complete planning, research, and report-writing pipeline.

langchain-ai/open_deep_research


LangChain DeepResearch

Execution steps

LangChain splits the models into two broad roles:

  • planner_model: a planning model, usually a reasoning model, used ① at the very start to plan the sections and topics to generate, and ② on each recursive call to decide whether to recurse again or move to the next section
  • writer_model: a writing model, usually a non-reasoning model, used ① to generate the query list before retrieval, and ② to write the report text for each section of the topic

Next, the overall orchestration logic: it adds a lot of detail to the generic flow described earlier and is well worth a close read.

From the code below, the overall execution flow is roughly:

  1. The user enters a query
  2. generate_report_plan: produces the report plan, mainly the topics to research and the report's table of contents. It first uses writer_model to generate a query list for the topic, runs those queries against a search engine, and finally has planner_model derive the report's structure, sections, and sub-topics from the results.
  3. human_feedback: asks the user to review the plan; if the plan is rejected, go back to step 2 and replan, otherwise continue.
  4. build_section_with_web_research: builds a writing task for each section in the plan. Each section's execution is defined in section_builder:
    1. generate_queries: generates a query list from the topic and the current section, calling writer_model
    2. search_web: runs the query list from the previous step against a search engine; LangChain supports tavily, perplexity, exa, arxiv, pubmed, and linkup
    3. write_section: calls writer_model to write up this section, then has planner_model perform reflection, i.e. judge whether the section's output passes. On failure, go back to sub-step 1 and rewrite the current section; on success, move on to the next section.
  5. gather_completed_sections: collects the output of all finished sections into the final context
  6. initiate_final_section_writing: for the sections that step 2 marked as needing no research, dispatches write_final_sections to generate their content
  7. write_final_sections: uses writer_model and all previously researched sections to write the no-research sections, in principle including the report's conclusion
  8. compile_final_report: stitches the researched sections and the sections from step 7 (conclusion included) into the final report
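The per-section loop (generate_queries → search_web → write_section plus reflection) can be sketched in plain Python; all model and search calls below are hypothetical stubs, not open_deep_research's real API:

```python
def write_section(topic, context):
    # Hypothetical writer_model stub: join retrieved context under a heading.
    return f"## {topic}\n" + "\n".join(context)


def reflect(section):
    # Hypothetical planner_model stub: pass once any source material is present.
    if "source" in section:
        return "pass", []
    return "fail", ["follow-up query"]


def build_section(topic, search, max_rounds=3):
    """Search, write, and reflect until the section passes or rounds run out."""
    context, queries = [], [topic]
    section = ""
    for _ in range(max_rounds):
        context += [search(q) for q in queries]  # search_web
        section = write_section(topic, context)  # write_section
        grade, queries = reflect(section)        # reflection by planner_model
        if grade == "pass":
            break
    return section
```

The failure branch feeds the follow-up queries back into search, which is exactly the loop the write_section node drives.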
# Report section sub-graph -- 

# Add nodes 
section_builder = StateGraph(SectionState, output=SectionOutputState)
section_builder.add_node("generate_queries", generate_queries)
section_builder.add_node("search_web", search_web)
section_builder.add_node("write_section", write_section)

# Add edges
section_builder.add_edge(START, "generate_queries")
section_builder.add_edge("generate_queries", "search_web")
section_builder.add_edge("search_web", "write_section")

# Outer graph -- 

# Add nodes
builder = StateGraph(ReportState, input=ReportStateInput, output=ReportStateOutput, config_schema=Configuration)
builder.add_node("generate_report_plan", generate_report_plan)
builder.add_node("human_feedback", human_feedback)
builder.add_node("build_section_with_web_research", section_builder.compile())
builder.add_node("gather_completed_sections", gather_completed_sections)
builder.add_node("write_final_sections", write_final_sections)
builder.add_node("compile_final_report", compile_final_report)

# Add edges
builder.add_edge(START, "generate_report_plan")
builder.add_edge("generate_report_plan", "human_feedback")
builder.add_edge("build_section_with_web_research", "gather_completed_sections")
builder.add_conditional_edges("gather_completed_sections", initiate_final_section_writing, ["write_final_sections"])
builder.add_edge("write_final_sections", "compile_final_report")
builder.add_edge("compile_final_report", END)

graph = builder.compile()

Writing agent

Query generation

As mentioned in the steps above, these query lists help the search engine retrieve. They are used in step 2 and in the first sub-step of each section, with slightly different prompts.

First, the queries generated in step 2 to retrieve data relevant to the report plan. The system persona prompt and user prompt are:

### system 
You are performing research for a report. 

<Report topic>
{topic}
</Report topic>

<Report organization>
{report_organization}
</Report organization>

<Task>
Your goal is to generate {number_of_queries} web search queries that will help gather information for planning the report sections. 

The queries should:

1. Be related to the Report topic
2. Help satisfy the requirements specified in the report organization

Make the queries specific enough to find high-quality, relevant sources while covering the breadth needed for the report structure.
</Task>

### user
Generate search queries that will help with planning the sections of the report.

Second, when a concrete section is processed, material for that section has to be retrieved, so a query list for the search engine is generated again. The system persona prompt and user prompt are:

### system
You are an expert technical writer crafting targeted web search queries that will gather comprehensive information for writing a technical report section.

<Report topic>
{topic}
</Report topic>

<Section topic>
{section_topic}
</Section topic>

<Task>
Your goal is to generate {number_of_queries} search queries that will help gather comprehensive information above the section topic. 

The queries should:

1. Be related to the topic 
2. Examine different aspects of the topic

Make the queries specific enough to find high-quality, relevant sources.
</Task>

### user
Generate search queries on the provided topic.
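These prompts are plain templates with {topic}, {section_topic}, and {number_of_queries} placeholders, so filling them is just str.format; the template below is an abbreviated stand-in for the full prompt above, not a verbatim copy:

```python
# Abbreviated version of the section-query prompt; "..." elides the full text.
SECTION_QUERY_PROMPT = """You are an expert technical writer crafting targeted web search queries...

<Report topic>
{topic}
</Report topic>

<Section topic>
{section_topic}
</Section topic>

Your goal is to generate {number_of_queries} search queries."""

prompt = SECTION_QUERY_PROMPT.format(
    topic="Open-source DeepResearch implementations",
    section_topic="LangChain open_deep_research",
    number_of_queries=3,
)
```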

Writing sections that need research

This prompt generates the report for each section that requires research. The inputs are mainly the report's overall topic, the section name, the section's sub-topic, and so on; beyond that, it puts some constraints on the length, style, etc. of the generated text.

### system
You are an expert technical writer crafting one section of a technical report.

<Report topic>
{topic}
</Report topic>

<Section name>
{section_name}
</Section name>

<Section topic>
{section_topic}
</Section topic>

<Existing section content (if populated)>
{section_content}
</Existing section content>

<Source material>
{context}
</Source material>

<Guidelines for writing>
1. If the existing section content is not populated, write a new section from scratch.
2. If the existing section content is populated, write a new section that synthesizes the existing section content with the Source material.
</Guidelines for writing>

<Length and style>
- Strict 150-200 word limit
- No marketing language
- Technical focus
- Write in simple, clear language
- Start with your most important insight in **bold**
- Use short paragraphs (2-3 sentences max)
- Use ## for section title (Markdown format)
- Only use ONE structural element IF it helps clarify your point:
  * Either a focused table comparing 2-3 key items (using Markdown table syntax)
  * Or a short list (3-5 items) using proper Markdown list syntax:
    - Use `*` or `-` for unordered lists
    - Use `1.` for ordered lists
    - Ensure proper indentation and spacing
- End with ### Sources that references the below source material formatted as:
  * List each source with title, date, and URL
  * Format: `- Title : URL`
</Length and style>

<Quality checks>
- Exactly 150-200 words (excluding title and sources)
- Careful use of only ONE structural element (table or list) and only if it helps clarify your point
- One specific example / case study
- Starts with bold insight
- No preamble prior to creating the section content
- Sources cited at end
</Quality checks>

### user
Generate a report section based on the provided sources.
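The quality checks demand 150-200 words excluding the title and sources. A check like that is easy to enforce programmatically; a minimal sketch, assuming the Markdown conventions the prompt prescribes (this helper is my own, not part of the repo):

```python
def section_word_count(section: str) -> int:
    """Word count excluding the '## ' title line and everything from '### Sources' on."""
    words = 0
    for line in section.splitlines():
        if line.startswith("### Sources"):
            break  # sources are excluded from the count
        if line.startswith("## "):
            continue  # title line is excluded too
        words += len(line.split())
    return words
```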

Writing sections that need no research

After all sections that need research are finished, this prompt generates the content for the sections that need none, in principle including the report's conclusion.

### system
You are an expert technical writer crafting a section that synthesizes information from the rest of the report.

<Report topic>
{topic}
</Report topic>

<Section name>
{section_name}
</Section name>

<Section topic> 
{section_topic}
</Section topic>

<Available report content>
{context}
</Available report content>

<Task>
1. Section-Specific Approach:

For Introduction:
- Use # for report title (Markdown format)
- 50-100 word limit
- Write in simple and clear language
- Focus on the core motivation for the report in 1-2 paragraphs
- Use a clear narrative arc to introduce the report
- Include NO structural elements (no lists or tables)
- No sources section needed

For Conclusion/Summary:
- Use ## for section title (Markdown format)
- 100-150 word limit
- For comparative reports:
    * Must include a focused comparison table using Markdown table syntax
    * Table should distill insights from the report
    * Keep table entries clear and concise
- For non-comparative reports: 
    * Only use ONE structural element IF it helps distill the points made in the report:
    * Either a focused table comparing items present in the report (using Markdown table syntax)
    * Or a short list using proper Markdown list syntax:
      - Use `*` or `-` for unordered lists
      - Use `1.` for ordered lists
      - Ensure proper indentation and spacing
- End with specific next steps or implications
- No sources section needed

3. Writing Approach:
- Use concrete details over general statements
- Make every word count
- Focus on your single most important point
</Task>

<Quality Checks>
- For introduction: 50-100 word limit, # for report title, no structural elements, no sources section
- For conclusion: 100-150 word limit, ## for section title, only ONE structural element at most, no sources section
- Markdown format
- Do not include word count or any preamble in your response
</Quality Checks>

### user
Generate a report section based on the provided sources.

Planning agent

Generating the report plan

The planning here refers to the planner_model from step 2 above.

From the execution steps above we know that planner_model takes the retrieved content and generates the report's overall sections, sub-topics, table of contents, and so on.

Now the prompts. The concrete inputs to the system persona prompt are:

  • topic: the topic supplied by the user
  • report_organization: the desired report organization, optional
  • context: the context retrieved from the search engine
  • feedback: the user's concrete feedback from step 3 after rejecting a plan; absent on the first generation

I want a plan for a report that is concise and focused.

<Report topic>
The topic of the report is:
{topic}
</Report topic>

<Report organization>
The report should follow this organization: 
{report_organization}
</Report organization>

<Context>
Here is context to use to plan the sections of the report: 
{context}
</Context>

<Task>
Generate a list of sections for the report. Your plan should be tight and focused with NO overlapping sections or unnecessary filler. 

For example, a good report structure might look like:
1/ intro
2/ overview of topic A
3/ overview of topic B
4/ comparison between A and B
5/ conclusion

Each section should have the fields:

- Name - Name for this section of the report.
- Description - Brief overview of the main topics covered in this section.
- Research - Whether to perform web research for this section of the report.
- Content - The content of the section, which you will leave blank for now.

Integration guidelines:
- Include examples and implementation details within main topic sections, not as separate sections
- Ensure each section has a distinct purpose with no content overlap
- Combine related concepts rather than separating them

Before submitting, review your structure to ensure it has no redundant sections and follows a logical flow.
</Task>

<Feedback>
Here is feedback on the report structure from review (if any):
{feedback}
</Feedback>

The output is specified in the user prompt: each section's name, description, plan, whether research is needed, and its content.

Generate the sections of the report. Your response must include a 'sections' field containing a list of sections. 
Each section must have: name, description, plan, research, and content fields.
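The sections field the user prompt asks for maps naturally onto a small schema. open_deep_research models this with Pydantic; here is a rough stdlib-dataclass sketch with made-up example sections:

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    description: str
    plan: str
    research: bool      # whether web research is needed for this section
    content: str = ""   # left blank at planning time

# Hypothetical plan, purely for illustration.
plan = [
    Section("Introduction", "Why DeepResearch matters", "narrative intro", research=False),
    Section("LangChain implementation", "open_deep_research walkthrough", "code review", research=True),
    Section("Conclusion", "Compare the implementations", "summary table", research=False),
]
research_sections = [s for s in plan if s.research]
final_sections = [s for s in plan if not s.research]
```

The research flag is what later routes a section to build_section_with_web_research or to write_final_sections.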

Reflection on section content

This is the core of each section_builder: after the writing agent finishes a section, it decides whether the section needs a rewrite, the reflection step in the flow diagram. The prompt here includes the freshly generated section content.

The model returns whether the section passes; if it fails, it must also return a query list for the rewrite, giving the regeneration a rough direction. The prompt:

### system
Review a report section relative to the specified topic:

<Report topic>
{topic}
</Report topic>

<section topic>
{section_topic}
</section topic>

<section content>
{section}
</section content>

<task>
Evaluate whether the section content adequately addresses the section topic.

If the section content does not adequately address the section topic, generate {number_of_follow_up_queries} follow-up search queries to gather missing information.
</task>

<format>
    grade: Literal["pass","fail"] = Field(
        description="Evaluation result indicating whether the response meets requirements ('pass') or needs revision ('fail')."
    )
    follow_up_queries: List[SearchQuery] = Field(
        description="List of follow-up search queries.",
    )
</format>

### user
Grade the report and consider follow-up questions for missing information.
If the grade is 'pass', return empty strings for all follow-up queries.
If the grade is 'fail', provide specific search queries to gather missing information.
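The format above is a Pydantic-style model, and routing on it is simple. A hedged sketch (Feedback and route are illustrative names, not necessarily the repo's):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Feedback:
    grade: str                                        # "pass" or "fail"
    follow_up_queries: List[str] = field(default_factory=list)


def route(fb: Feedback):
    """Decide the next node after reflection: move on, or search again."""
    if fb.grade == "pass":
        return "next_section", []
    return "search_web", fb.follow_up_queries
```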

HuggingFace smolagents' implementation

Overview

HuggingFace ships a basic DeepResearch implementation as an example in its smolagents library.

The example is built on the classic ReAct framework (Reason + Act): give the agent a toolset and let it decide, based on the question, which tools to invoke. Two agents are wrapped here:

  • text_webbrowser_agent: a ToolCallingAgent(MultiStepAgent) that triggers browser tools via function calls: Google search, visiting a URL, page up/down, in-page search, archive search, plus a tool for extracting text from files
  • manager_agent: a CodeAgent(MultiStepAgent) instance wrapping the CodeAct execution framework smolagents already provides; it mainly receives text_webbrowser_agent as a parameter, so one agent drives the other as a tool

huggingface/smolagents

🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.


HuggingFace DeepResearch

MultiStepAgent

HuggingFace's version is built on MultiStepAgent. CodeAgent(MultiStepAgent) and ToolCallingAgent(MultiStepAgent) are both subclasses of MultiStepAgent, and their overall flow is largely the same.

So let's first look at MultiStepAgent's execution flow:

The call path is: MultiStepAgent.run → MultiStepAgent._run → MultiStepAgent._execute_step → MultiStepAgent.planning_step → CodeAgent/ToolCallingAgent.step → MultiStepAgent._run → … and so on until final_answer is found or the step count reaches max_steps

# https://github.com/huggingface/smolagents/blob/main/src/smolagents/agents.py
class MultiStepAgent:   
    
    def run(
        self,
        task: str,
        stream: bool = False,
        reset: bool = True,
        images: Optional[List[str]] = None,
        additional_args: Optional[Dict] = None,
        max_steps: Optional[int] = None,
    ):
        """
        Run the agent for the given task.
        
        Args:
            task (`str`): Task to perform.
            stream (`bool`): Whether to run in a streaming way.
            reset (`bool`): Whether to reset the conversation or keep it going from previous run.
            images (`list[str]`, *optional*): Paths to image(s).
            additional_args (`dict`, *optional*): Any other variables that you want to pass to the agent run, for instance images or dataframes. Give them clear names!
            max_steps (`int`, *optional*): Maximum number of steps the agent can take to solve the task. if not provided, will use the agent's default value.
        """
        ...

        if stream:
            # The steps are returned as they are executed through a generator to iterate on.
            return self._run(task=self.task, max_steps=max_steps, images=images)
        # Outputs are returned only at the end. We only look at the last step.
        return deque(self._run(task=self.task, max_steps=max_steps, images=images), maxlen=1)[0]
    
    
    def _run(
        self, task: str, max_steps: int, images: List[str] | None = None
    ) -> Generator[ActionStep | AgentType, None, None]:
		    
        final_answer = None
        self.step_number = 1
        while final_answer is None and self.step_number <= max_steps:
            step_start_time = time.time()
            memory_step = self._create_memory_step(step_start_time, images)
            try:
                final_answer = self._execute_step(task, memory_step)
            except AgentError as e:
                memory_step.error = e
            finally:
                self._finalize_step(memory_step, step_start_time)
                yield memory_step
                self.step_number += 1

        if final_answer is None and self.step_number == max_steps + 1:
            final_answer = self._handle_max_steps_reached(task, images, step_start_time)
            yield memory_step
        yield handle_agent_output_types(final_answer)
        
    def _execute_step(self, task: str, memory_step: ActionStep) -> Union[None, Any]:
        if self.planning_interval is not None and self.step_number % self.planning_interval == 1:
            self.planning_step(task, is_first_step=(self.step_number == 1), step=self.step_number)
        self.logger.log_rule(f"Step {self.step_number}", level=LogLevel.INFO)
        final_answer = self.step(memory_step)
        if final_answer is not None and self.final_answer_checks:
            self._validate_final_answer(final_answer)
        return final_answer
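One detail worth noting in run(): deque(self._run(...), maxlen=1)[0] is a compact way to drain a generator and keep only its last yielded item, the final answer. A tiny illustration:

```python
from collections import deque


def steps():
    # Stand-in for the generator returned by _run(): yields each step, then the answer.
    yield "step 1"
    yield "step 2"
    yield "final answer"

# A deque with maxlen=1 discards everything but the last item as it fills.
last = deque(steps(), maxlen=1)[0]
```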

CodeAgent + ToolCallingAgent

Having walked through MultiStepAgent, let's see how this DeepResearch example creates its CodeAgent instance.

It sets up text_webbrowser_agent and manager_agent, with max_steps set to 20 for text_webbrowser_agent and 12 for manager_agent.

Essentially it wraps a CodeAgent around o1, and that CodeAgent in turn wraps a ToolCallingAgent, so one agent drives the other.

The ToolCallingAgent mainly wraps browser, search, and file-parsing tools.

Both agents run on the Reason + Act loop.

# https://github.com/huggingface/smolagents/blob/main/examples/open_deep_research/run_gaia.py
    model_params = {
        "model_id": model_id,
        "custom_role_conversions": custom_role_conversions,
        "max_completion_tokens": 8192,
    }
    if model_id == "o1":
        model_params["reasoning_effort"] = "high"
    model = LiteLLMModel(**model_params)
    
    # web tools
    WEB_TOOLS = [
        SearchInformationTool(browser),
        VisitTool(browser),
        PageUpTool(browser),
        PageDownTool(browser),
        FinderTool(browser),
        FindNextTool(browser),
        ArchiveSearchTool(browser),
        # inspect file as text tool 
        TextInspectorTool(model, text_limit),
    ]
    # tool calling agent for web tools
    text_webbrowser_agent = ToolCallingAgent(
        model=model,
        tools=WEB_TOOLS,
        max_steps=20,
        verbosity_level=2,
        planning_interval=4,
        name="search_agent",
        description="""A team member that will search the internet to answer your question.
    Ask him for all your questions that require browsing the web.
    Provide him as much context as possible, in particular if you need to search on a specific timeframe!
    And don't hesitate to provide him with a complex search task, like finding a difference between two webpages.
    Your request must be a real sentence, not a google search! Like "Find me this information (...)" rather than a few keywords.
    """,
        provide_run_summary=True,
    )
    text_webbrowser_agent.prompt_templates["managed_agent"]["task"] += """You can navigate to .txt online files.
    If a non-html page is in another format, especially .pdf or a Youtube video, use tool 'inspect_file_as_text' to inspect it.
    Additionally, if after some searching you find out that you need more information to answer the question, you can use `final_answer` with your request for clarification as argument to request for more information."""

    # ReAct code agent instance
    manager_agent = CodeAgent(
        model=model,
        tools=[visualizer, ti_tool],
        max_steps=12,
        verbosity_level=2,
        additional_authorized_imports=AUTHORIZED_IMPORTS,
        planning_interval=4,
        managed_agents=[text_webbrowser_agent],
    )

Finally the prompt, which is comparatively plain:

You have one question to answer. It is paramount that you provide a correct answer.
Give it all you can: I know for a fact that you have access to all the relevant tools to solve it and find the correct answer (the answer does exist). Failure or 'I cannot answer' or 'None found' will not be tolerated, success will be rewarded.
Run verification steps if that's needed, you must make sure you find the correct answer!
Here is the task:
{question}

To solve the task above, you will have to use these attached files:
{files}

Zilliz Cloud's implementation

Overview

What Zilliz implements is a DeepSearch, and search goes against a knowledge base instead of a search engine.

The flow diagram below is fairly clear, so I won't restate it here. Three points are worth noting:

  • Search is replaced by approximate-nearest-neighbor queries against a vector store
  • Reflection examines the content retrieved from the vector store, not content generated by the model
  • While the query list runs against the vector store, each retrieved chunk is checked for relevance to the original query and the query list; the code calls this Rerank, though it is not reranking in the usual sense

zilliztech/deep-searcher

Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.


Zilliz Cloud DeepSearch

Generating the query list

Query generation here only produces the first round of queries; the follow-up query lists after a failed round come from the reflection step.

So the input is simple: just the original query, original_query:

To answer this question more comprehensively, please break down the original question into up to four sub-questions. Return as list of str.
If this is a very simple question and no decomposition is necessary, then keep the only one original question in the python code list.

Original Question: {original_query}

<EXAMPLE>
Example input:
"Explain deep learning"

Example output:
[
    "What is deep learning?",
    "What is the difference between deep learning and machine learning?",
    "What is the history of deep learning?"
]
</EXAMPLE>

Provide your response in a python code list of str format:
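Since the prompt asks for the output as a Python list of str, the reply has to be parsed back into a list. A defensive sketch using ast.literal_eval (the fallback behavior is my assumption, not necessarily what deep-searcher does):

```python
import ast


def parse_sub_queries(raw: str, original_query: str):
    """Parse the model's 'python list of str' reply; fall back to the original query."""
    try:
        parsed = ast.literal_eval(raw.strip())
        if isinstance(parsed, list) and all(isinstance(q, str) for q in parsed):
            return parsed
    except (ValueError, SyntaxError):
        pass  # reply was not a valid Python literal
    return [original_query]
```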

The Rerank check

This step mainly judges how relevant the retrieved chunks are to the queries. The query input is the original query plus the query list, and retrieved_chunk is a single vector-search result.

Based on the query questions and the retrieved chunk, to determine whether the chunk is helpful in answering any of the query question, you can only return "YES" or "NO", without any other information.

Query Questions: {query}
Retrieved Chunk: {retrieved_chunk}

Is the chunk helpful in answering the any of the questions?
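The YES/NO protocol above makes the rerank step a simple filter. A minimal sketch where judge stands in for the LLM call:

```python
def rerank_filter(chunks, judge):
    """Keep chunks whose YES/NO judgment starts with YES; `judge` wraps the LLM call."""
    return [c for c in chunks if judge(c).strip().upper().startswith("YES")]
```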

Reflection

Unlike the implementations above, this step judges whether the content retrieved from the knowledge base is adequate.

If it is, the final report is generated from that content; if not, a new query list is generated and the vector store is queried again.

The inputs are the original query question, the query list mini_questions, and the retrieved chunks mini_chunk_str.

As for the output: on success the model returns an empty list; otherwise it returns the next round's query list for the following retrieval.

Determine whether additional search queries are needed based on the original query, previous sub queries, and all retrieved document chunks. If further research is required, provide a Python list of up to 3 search queries. If no further research is required, return an empty list.

If the original query is to write a report, then you prefer to generate some further queries, instead return an empty list.

Original Query: {question}

Previous Sub Queries: {mini_questions}

Related Chunks: 
{mini_chunk_str}

Respond exclusively in valid List of str format without any other text.
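The reflection contract (an empty list means stop, anything else is the next round of queries) yields a compact loop. A sketch with hypothetical stubs for the three model/search calls:

```python
def deep_search_loop(question, generate_queries, search, reflect, max_rounds=3):
    """Retrieve, reflect, and re-query until reflection returns an empty list."""
    queries = generate_queries(question)
    chunks = []
    for _ in range(max_rounds):
        chunks += [search(q) for q in queries]      # vector-store retrieval
        queries = reflect(question, queries, chunks)  # empty list means "enough"
        if not queries:
            break
    return chunks
```

The collected chunks then feed the final summarization prompt below.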

Producing the conclusion

The final answer is generated from the original query question, the query list mini_questions, and the vector-store results mini_chunk_str:

You are a AI content analysis expert, good at summarizing content. Please summarize a specific and detailed answer or report based on the previous queries and the retrieved document chunks.

Original Query: {question}

Previous Sub Queries: {mini_questions}

Related Chunks: 
{mini_chunk_str}

Summary

Looking across these implementations, they are broadly similar: the core is the same template of generating a plan or query list plus reflection.

  • Dify builds on its existing visual Workflow features and amounts to a small DeepSearch: simple, usable, with reflection, but without a planning step, and the final report is just one model's closing summary
  • LangChain's implementation is, in my view, the most complete: it plans, generates chapter by chapter, and reflects after each section. It is a genuinely full DeepResearch that can produce a very long report; the trade-off is time and cost
  • HuggingFace's smolagents example builds on MultiStepAgent: it has planning steps and executes them one by one, but lacks reflection
  • Zilliz Cloud implements a DeepSearch, a much simpler flow than DeepResearch, with reflection and retrieval switched to vector search, but without chapter-by-chapter generation

Overall I find LangChain's implementation the most comprehensive, while Zilliz Cloud's and HuggingFace's are also quite interesting.

Finally, there are implementations I did not examine in detail, such as JinaAI's and deep-research; take a look if you are interested.

References

Introducing deep research

Open-source DeepResearch – Freeing our search agents

DeepResearch: Building a Research Automation App with Dify - Dify Blog

A Practical Guide to Implementing DeepSearch/DeepResearch

Try Deep Research and our new experimental model in Gemini, your AI assistant

smolagents/examples/open_deep_research at main · huggingface/smolagents

langchain-ai/open_deep_research

huggingface/smolagents

🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.

zilliztech/deep-searcher

Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.

dzhng/deep-research

An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the simplest implementation of a deep research agent - e.g. an agent that can refine its research direction overtime and deep dive into a topic.

jina-ai/node-DeepResearch

Keep searching, reading webpages, reasoning until it finds the answer (or exceeding the token budget)

