Description
Summary
I'd like to suggest exploring the Google Cloud Gemini API as an alternative to the current Gemini API implementation for the deep research functionality. The Google Cloud Gemini API's Deep Research mode appears to offer enhanced capabilities including:
- Ability to execute more research steps
- Detailed citation information for each reference source
- Potentially more comprehensive research workflows
https://docs.cloud.google.com/gemini/enterprise/docs/research-assistant#deep-research-rest
Current Implementation
The project currently uses the @google/genai SDK with the deep-research-pro-preview-12-2025 model for deep research operations 1 2 . The extension provides tools for starting research sessions, checking status, and generating reports 3 .
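As a rough illustration of how a client might drive the start/status tools described above, the sketch below polls a status function until the interaction reports `completed`. The `waitForCompletion` helper, its options, and the `GetStatus` type are hypothetical names used only for illustration. Only the `completed` status value and the `id`, `status`, and `outputs` fields appear in the cited source.

```typescript
// Minimal polling sketch. `waitForCompletion` and `GetStatus` are hypothetical
// helpers, not part of the extension or the @google/genai SDK.
interface InteractionStatus {
  id: string;
  status: string; // the cited code only checks for 'completed'
  outputs?: unknown;
}

type GetStatus = (id: string) => Promise<InteractionStatus>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function waitForCompletion(
  getStatus: GetStatus,
  id: string,
  options: { intervalMs?: number; maxAttempts?: number } = {},
): Promise<InteractionStatus> {
  const { intervalMs = 5000, maxAttempts = 120 } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const interaction = await getStatus(id);
    if (interaction.status === 'completed') {
      return interaction;
    }
    // Wait before the next check, but not after the final attempt.
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Interaction ${id} did not complete after ${maxAttempts} checks`);
}
```

A failure status would need its own exit branch. The cited source does not show which status values are reported for failures, so the sketch leaves that out.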
Suggested Enhancement
Consider investigating the Google Cloud Gemini API interface, which may provide:
- Enhanced Deep Research Mode: The ability to execute more comprehensive research steps compared to the current implementation
- Detailed Citation Tracking: Specific information for each original citation, which would improve the research quality and verifiability
- Potential Performance Benefits: Different execution characteristics that might be beneficial for complex research workflows
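To make the citation point concrete, here is a small sketch of how per-source citation details could be turned into a Markdown reference list in a report. The `SourceCitation` shape and the `renderCitations` function are assumptions made for illustration. They do not reflect the actual response schema of either API, which would need to be checked against the documentation.

```typescript
// Hypothetical citation shape: NOT the real response schema of any Gemini API.
interface SourceCitation {
  title: string;
  url: string;
  snippet?: string; // optional quoted passage from the source
}

// Render a numbered Markdown reference list, keeping the first entry per URL.
function renderCitations(citations: SourceCitation[]): string {
  const seenUrls = new Set<string>();
  const unique = citations.filter((citation) => {
    if (seenUrls.has(citation.url)) {
      return false;
    }
    seenUrls.add(citation.url);
    return true;
  });
  return unique
    .map((citation, index) => {
      const entry = `${index + 1}. [${citation.title}](${citation.url})`;
      return citation.snippet ? `${entry}: "${citation.snippet}"` : entry;
    })
    .join('\n');
}
```

Deduplicating by URL is one possible design choice. A real implementation might instead keep every citation and link each one to the passage it supports.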
Implementation Considerations
If exploring this enhancement, you might want to:
- Compare the API capabilities between the current @google/genai SDK and Google Cloud Gemini API
- Evaluate the differences in Deep Research functionality and citation handling
- Consider maintaining compatibility with the existing MCP tool structure 4
- Assess any configuration or authentication changes required
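One way to keep the existing MCP tool structure intact while evaluating a second API is to put both behind a shared interface. The sketch below is only an assumption about how that could look. `ResearchBackend`, `selectBackend`, and the `DEEP_RESEARCH_BACKEND` environment variable are hypothetical. The method names mirror the `startResearch` and `getStatus` calls that the cited `src/index.ts` makes on `researchManager`.

```typescript
// Hypothetical abstraction; method names mirror the calls the cited
// src/index.ts makes on researchManager (startResearch, getStatus).
interface StartResearchParams {
  input: string;
  model?: string;
  fileSearchStoreNames?: string[];
}

interface ResearchInteraction {
  id: string;
  status: string;
  outputs?: unknown;
}

interface ResearchBackend {
  startResearch(params: StartResearchParams): Promise<ResearchInteraction>;
  getStatus(id: string): Promise<ResearchInteraction>;
}

type BackendName = 'genai' | 'cloud';

// DEEP_RESEARCH_BACKEND is an assumed variable name, not an existing setting.
function selectBackend(
  env: Record<string, string | undefined>,
  backends: Record<BackendName, ResearchBackend>,
): ResearchBackend {
  const requested = env['DEEP_RESEARCH_BACKEND'] ?? 'genai';
  if (requested === 'genai' || requested === 'cloud') {
    return backends[requested];
  }
  throw new Error(`Unknown research backend: ${requested}`);
}
```

With something like this in place, the tool handlers could keep calling `startResearch` and `getStatus` unchanged, and any authentication differences would stay inside each backend implementation.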
Potential Benefits
This could enhance the project's value by providing users with more detailed research outputs and better source attribution, which aligns with the project's goal of automating complex research workflows and generating comprehensive reports 5 .
Would be happy to provide more details about the Google Cloud Gemini API capabilities if helpful for evaluation.
Notes
This issue is based on the user's observation about Google Cloud Gemini API's Deep Research mode capabilities. The current project uses the standard Gemini API through the @google/genai SDK, and the suggestion is to evaluate whether the Google Cloud variant offers superior functionality for deep research workflows, particularly around citation detail and research step execution.
Wiki pages you might want to explore:
- User Guide (allenhutchison/gemini-cli-deep-research)
- File Search Operations (allenhutchison/gemini-cli-deep-research)
Citations
File: src/index.ts (L4-4)
import { GoogleGenAI } from '@google/genai';
File: src/index.ts (L256-327)
server.registerTool(
  'research_start',
  {
    description: 'Starts a new Deep Research interaction in the background.',
    inputSchema: z.object({
      input: z.string().describe('The research query or instructions'),
      report_format: z.string().optional().describe('The desired format of the report (e.g., "Executive Brief", "Technical Deep Dive", "Comprehensive Research Report")'),
      model: z.string().optional().default('deep-research-pro-preview-12-2025').describe('The agent to use (default: deep-research-pro-preview-12-2025)'),
      fileSearchStoreNames: z.array(z.string()).optional().describe('Optional list of file search store names for grounding'),
    }).shape,
  },
  async ({ input, report_format, model, fileSearchStoreNames }) => {
    let finalInput = input;
    if (report_format) {
      finalInput = `[Report Format: ${report_format}]\n\n${input}`;
    }
    const interaction = await researchManager.startResearch({
      input: finalInput,
      model,
      fileSearchStoreNames,
    });
    if (interaction.id) {
      WorkspaceConfigManager.addResearchId(interaction.id);
    }
    return {
      content: [{
        type: 'text',
        text: `Research started. ID: ${interaction.id}\nStatus: ${interaction.status}\nUse research_status to check progress.`
      }]
    };
  }
);
server.registerTool(
  'research_status',
  {
    description: 'Checks the status and retrieves outputs of a Deep Research interaction.',
    inputSchema: z.object({
      id: z.string().describe('The interaction ID'),
    }).shape,
  },
  async ({ id }) => {
    const interaction = await researchManager.getStatus(id);
    return { content: [{ type: 'text', text: JSON.stringify(interaction, null, 2) }] };
  }
);
server.registerTool(
  'research_save_report',
  {
    description: 'Generates a Markdown report from a completed research interaction and saves it to a file.',
    inputSchema: z.object({
      id: z.string().describe('The interaction ID'),
      filePath: z.string().describe('The local file path to save the report (e.g., report.md)'),
    }).shape,
  },
  async ({ id, filePath }) => {
    const interaction = await researchManager.getStatus(id);
    if (interaction.status !== 'completed') {
      return { isError: true, content: [{ type: 'text', text: `Interaction ${id} is not completed. Current status: ${interaction.status}` }] };
    }
    if (!interaction.outputs) {
      return { isError: true, content: [{ type: 'text', text: 'No outputs found for this interaction.' }] };
    }
    const markdown = reportGenerator.generateMarkdown(interaction.outputs);
    fs.writeFileSync(filePath, markdown);
    return { content: [{ type: 'text', text: `Report saved to ${filePath}` }] };
  }
);
File: deep-research-GEMINI.md (L22-26)
### Deep Research
- `research_start`: Start a long-running background research task. You can ground it in your uploaded files by providing `fileSearchStoreNames`. Use `report_format` to specify the desired output structure (e.g., "Executive Brief", "Technical Deep Dive", "Comprehensive Research Report").
- `research_status`: Check if the research is done and retrieve the results.
- `research_save_report`: Once completed, save the findings as a professional Markdown report.
File: GEMINI.md (L5-11)
**gemini-deep-research** is an extension for the Gemini CLI that enables deep research capabilities. It is designed to automate complex research workflows, synthesize information from various sources, and generate detailed reports.
The extension leverages the Model Context Protocol (MCP) to integrate seamlessly with the Gemini CLI environment. It allows users to:
- Execute multi-step research plans.
- Integrate with file search databases for grounded research.
- Generate comprehensive reports with citations.