Add timeout for content extraction in fetch tool #285102

dmitrivMS · 2025-12-26T12:13:50Z

Copilot

Pull request overview

This PR adds a timeout mechanism to prevent content extraction from hanging indefinitely in the web page loader used by the fetch tool. The change addresses issue #280736 by wrapping the content extraction logic (both accessibility tree and DOM extraction) in a raceTimeout call with a 2-second timeout.

Key changes:

- Adds the `EXTRACT_CONTENT_TIMEOUT` constant (2 seconds) to limit content extraction duration
- Wraps the content extraction logic in `raceTimeout` to handle hanging extractions
- Adds a test case to verify timeout behavior when accessibility tree extraction hangs
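
For readers without the diff at hand, the wrapping pattern can be sketched in isolation. This is a minimal sketch, not the PR's code: `raceWithTimeout` is a hypothetical stand-in for a `raceTimeout`-style helper, `extractContent` and the `Extractor` type are illustrative names, and the `MIN_CONTENT_LENGTH` value is assumed. Only the 2-second timeout comes from the description above.

```typescript
// Hypothetical stand-in for a raceTimeout-style helper (not VS Code's implementation).
// Resolves with the promise's value, or with undefined once timeoutMs elapses.
function raceWithTimeout<T>(promise: Promise<T>, timeoutMs: number, onTimeout?: () => void): Promise<T | undefined> {
	return new Promise<T | undefined>((resolve, reject) => {
		const timer = setTimeout(() => {
			onTimeout?.();
			resolve(undefined);
		}, timeoutMs);
		promise.then(
			value => { clearTimeout(timer); resolve(value); },
			error => { clearTimeout(timer); reject(error); }
		);
	});
}

const EXTRACT_CONTENT_TIMEOUT = 2000; // 2 seconds, per the PR description
const MIN_CONTENT_LENGTH = 100; // assumed value, for illustration only

// Illustrative type: each extractor yields text, or undefined when nothing was found.
type Extractor = () => Promise<string | undefined>;

// Runs accessibility tree extraction, then the DOM fallback, all under one timeout.
async function extractContent(extractAxTree: Extractor, extractDom: Extractor, timeoutMs: number = EXTRACT_CONTENT_TIMEOUT): Promise<string> {
	let result = '';
	await raceWithTimeout((async () => {
		result = await extractAxTree() ?? '';
		if (result.length < MIN_CONTENT_LENGTH) {
			const domContent = await extractDom() ?? '';
			result = domContent.length > result.length ? domContent : result;
		}
	})(), timeoutMs);
	// If the timeout won the race, result holds whatever was assigned before it fired.
	return result;
}
```

Because a single timeout covers both phases, a hang in either the accessibility tree call or the DOM fallback ends the wait after the same deadline.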

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
src/vs/platform/webContentExtractor/electron-main/webPageLoader.ts	Introduces the `EXTRACT_CONTENT_TIMEOUT` constant and wraps the content extraction logic in `raceTimeout` to prevent indefinite hangs during accessibility tree or DOM extraction
src/vs/platform/webContentExtractor/test/electron-main/webPageLoader.test.ts	Adds test case to verify that timeout errors are properly returned when accessibility tree extraction hangs indefinitely

Copilot · 2025-12-26T12:23:47Z

src/vs/platform/webContentExtractor/test/electron-main/webPageLoader.test.ts

+		});
+
+		const loadPromise = loader.load();
+		window.webContents.emit('did-start-loading');


The test emits only 'did-start-loading', not 'did-finish-load'. This works because the page load timeout eventually triggers content extraction, but emitting 'did-finish-load' as well, as other tests in this file do, would be more realistic and explicit. It would also make the test's intent clearer and ensure it exercises the timeout during extraction rather than relying on the page load timeout.

Suggested change

-		window.webContents.emit('did-start-loading');
+		window.webContents.emit('did-start-loading');
+		window.webContents.emit('did-finish-load');

Copilot · 2025-12-26T12:23:48Z

src/vs/platform/webContentExtractor/test/electron-main/webPageLoader.test.ts

+	test('returns error when accessibility tree extraction hangs', () => runWithFakedTimers({ useFakeTimers: true }, async () => {
+		const uri = URI.parse('https://example.com/page');
+		const loader = createWebPageLoader(uri);
+
+		window.webContents.debugger.sendCommand.callsFake((command: string) => {
+			switch (command) {
+				case 'Network.enable':
+					return Promise.resolve();
+				case 'Accessibility.getFullAXTree':
+					// Return a promise that never resolves to simulate hanging
+					return new Promise(() => { });
+				default:
+					assert.fail(`Unexpected command: ${command}`);
+			}
+		});
+
+		const loadPromise = loader.load();
+		window.webContents.emit('did-start-loading');
+		const result = await loadPromise;
+
+		assert.strictEqual(result.status, 'error');
+		if (result.status === 'error') {
+			assert.ok(result.error.includes('Failed to extract meaningful content'));
+		}
+		// Verify executeJavaScript was NOT called for DOM extraction
+		assert.ok(!window.webContents.executeJavaScript.called);
+	}));


Consider adding a test case where accessibility tree extraction returns insufficient content quickly, but then DOM extraction hangs. This would provide more comprehensive coverage of the timeout behavior during the fallback extraction phase. Currently, the test only covers the case where accessibility tree extraction itself hangs.
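
The scenario described in this comment can be modeled outside the test harness. The sketch below is a simplified stand-in, not the loader itself: `raceWithTimeout`, `extractWithFallback`, `domHangScenario`, and the `MIN_CONTENT_LENGTH` value are all hypothetical. In the real test file the same setup would presumably go through the `sendCommand` and `executeJavaScript` stubs with faked timers. What status the loader reports on this path is not shown in the excerpts here, so the sketch only checks the extraction flow.

```typescript
// Hypothetical stand-in for a raceTimeout-style helper (not VS Code's implementation).
function raceWithTimeout<T>(promise: Promise<T>, timeoutMs: number): Promise<T | undefined> {
	return new Promise<T | undefined>((resolve, reject) => {
		const timer = setTimeout(() => resolve(undefined), timeoutMs);
		promise.then(
			value => { clearTimeout(timer); resolve(value); },
			error => { clearTimeout(timer); reject(error); }
		);
	});
}

const MIN_CONTENT_LENGTH = 100; // assumed value, for illustration only

type Extractor = () => Promise<string | undefined>;

// Simplified model of the extraction flow under review.
async function extractWithFallback(extractAxTree: Extractor, extractDom: Extractor, timeoutMs: number): Promise<string> {
	let result = '';
	await raceWithTimeout((async () => {
		result = await extractAxTree() ?? '';
		if (result.length < MIN_CONTENT_LENGTH) {
			const domContent = await extractDom() ?? '';
			result = domContent.length > result.length ? domContent : result;
		}
	})(), timeoutMs);
	return result;
}

// Scenario: the accessibility tree returns too little content quickly, then DOM extraction hangs.
async function domHangScenario(): Promise<{ content: string; domCalls: number }> {
	let domCalls = 0;
	const content = await extractWithFallback(
		async () => 'short',
		() => {
			domCalls++;
			// Never resolves, simulating a hanging DOM extraction
			return new Promise<string>(() => { });
		},
		20 // short timeout so the sketch finishes quickly
	);
	return { content, domCalls };
}
```

In this model the call returns once the timeout elapses, the DOM extractor has been invoked exactly once, and the insufficient accessibility tree content is what remains in `result`.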

TylerLeonhardt · 2025-12-26T17:32:32Z

src/vs/platform/webContentExtractor/electron-main/webPageLoader.ts

+			await raceTimeout((async () => {
+				result = await this.extractAccessibilityTreeContent() ?? '';
+				if (result.length < WebPageLoader.MIN_CONTENT_LENGTH) {
+					this.trace(`Accessibility tree extraction yielded insufficient content, trying main DOM element extraction`);
+					const domContent = await this.extractMainDomElementContent() ?? '';
+					result = domContent.length > result.length ? domContent : result;
+				}
+			})(), WebPageLoader.EXTRACT_CONTENT_TIMEOUT);


Can we add some cancellation? Like we shouldn't do the second async call if we've timed out.
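
One possible shape for this, sketched under assumptions: a flag set when the timeout fires and checked before the fallback call. `raceWithTimeout` is a hypothetical stand-in for a `raceTimeout`-style helper with an on-timeout callback, and `extractWithCancellation` and the `MIN_CONTENT_LENGTH` value are illustrative. The codebase's own cancellation primitives may be a better fit than a bare flag.

```typescript
// Hypothetical stand-in for a raceTimeout-style helper (not VS Code's implementation).
function raceWithTimeout<T>(promise: Promise<T>, timeoutMs: number, onTimeout?: () => void): Promise<T | undefined> {
	return new Promise<T | undefined>((resolve, reject) => {
		const timer = setTimeout(() => {
			onTimeout?.();
			resolve(undefined);
		}, timeoutMs);
		promise.then(
			value => { clearTimeout(timer); resolve(value); },
			error => { clearTimeout(timer); reject(error); }
		);
	});
}

const MIN_CONTENT_LENGTH = 100; // assumed value, for illustration only

type Extractor = () => Promise<string | undefined>;

async function extractWithCancellation(extractAxTree: Extractor, extractDom: Extractor, timeoutMs: number): Promise<string> {
	let result = '';
	let timedOut = false;
	await raceWithTimeout((async () => {
		const axContent = await extractAxTree() ?? '';
		if (timedOut) {
			return; // the caller has already moved on; skip the fallback entirely
		}
		result = axContent;
		if (result.length < MIN_CONTENT_LENGTH) {
			const domContent = await extractDom() ?? '';
			if (timedOut) {
				return; // do not touch result after the timeout
			}
			result = domContent.length > result.length ? domContent : result;
		}
	})(), timeoutMs, () => { timedOut = true; });
	return result;
}
```

The flag does not abort an extraction call that is already in flight. It only prevents starting the second call and writing to `result` after the deadline.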

Add timeout for content extraction in fetch tool

4cda34a

Copilot AI review requested due to automatic review settings December 26, 2025 12:13

dmitrivMS added the fetch-tool label (Issues concerning the fetch internal tool and web extracting service) Dec 26, 2025

dmitrivMS self-assigned this Dec 26, 2025

dmitrivMS requested a review from TylerLeonhardt December 26, 2025 12:13

dmitrivMS enabled auto-merge December 26, 2025 12:13

Copilot started reviewing on behalf of dmitrivMS December 26, 2025 12:14

vs-code-engineering bot added this to the December / January 2026 milestone Dec 26, 2025

Copilot AI reviewed Dec 26, 2025

TylerLeonhardt reviewed Dec 26, 2025

Merge branch 'main' into dev/dmitriv/fetch-tool-hang

568f6fa
