Tiny WebCrawler for Laravel using Playwright.
Version 2 has been reworked as a simple package that depends on Playwright. It only implements minimal functionality, since you can use playwright-php/playwright directly.
In addition, version 2.2 now supports the Vercel agent-browser.
- PHP >= 8.3
- Laravel >= 11.x
composer require revolution/salvager
Install Playwright browsers:
vendor/bin/playwright-install --browsers
Or install Playwright browsers with OS dependencies:
vendor/bin/playwright-install --with-deps
Install agent-browser and Chromium globally and run it as a Laravel Process.
npm install -g agent-browser
agent-browser install
If you want to use a custom Chromium binary, you can specify it in the .env file.
# .env
SALVAGER_AGENT_BROWSER_PATH=/path/to/agent-browser
SALVAGER_AGENT_BROWSER_EXECUTABLE_PATH=/path/to/chromium
SALVAGER_AGENT_BROWSER_OPTIONS=
The browser will be terminated when you exit Salvager::browse(), so please obtain any necessary data within the Salvager::browse() closure. The Page object cannot be used outside of Salvager::browse().
use Revolution\Salvager\Facades\Salvager;
use Playwright\Page\Page;
class SalvagerController
{
    public function __invoke()
    {
        Salvager::browse(function (Page $page) use (&$url, &$text) {
            $page->goto('https://example.com/');
            $page->screenshot(config('salvager.screenshots').'example.png');
            $url = $page->url();
            $text = $page->locator('p')->first()->innerText();
        });
        dump($url);
        dump($text);
    }
}
If you want more control, just launch the browser with Salvager::launch().
use Playwright\Browser\BrowserContextInterface;
use Revolution\Salvager\Facades\Salvager;
/** @var BrowserContextInterface $browser */
$browser = Salvager::launch();
$page = $browser->newPage();
$page->goto('https://example.com/');
// Do something...
// Don't forget to close the browser
$browser->close();
use Revolution\Salvager\AgentBrowser;
use Revolution\Salvager\Facades\Salvager;
Salvager::agent(function (AgentBrowser $agent) use (&$url, &$text, &$html) {
    $agent->userAgent('Chromium');
    $agent->open('https://example.com/');
    $agent->screenshot(config('salvager.screenshots').'agent-test.png');
    $url = $agent->url();
    $text = $agent->text('xpath=//p[1]', '--json');
    $html = $agent->html('css=html');
    // Run any agent-browser command
    $result = $agent->run(command: '', args: '', options: '');
    $agent->close();
});
Since text() and html() use Playwright's page.locator(), using a CSS selector will result in an error if multiple elements are found. If you want to specify one of multiple elements, use XPath.
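To illustrate the XPath advice, here is a minimal sketch of a selector builder. nthXPath() is a hypothetical helper written for this example and is not part of Salvager or agent-browser. It relies on one XPath detail: //p[1] matches every p that is the first p child of its parent, so it can still match several elements, while (//p)[1] applies the index to the whole result set and matches at most one.

```php
<?php

declare(strict_types=1);

/**
 * Build an XPath selector for the n-th element (1-based) in document order.
 * Hypothetical helper for illustration only; not part of the Salvager API.
 */
function nthXPath(string $tag, int $n): string
{
    if ($n < 1) {
        throw new InvalidArgumentException('XPath positions are 1-based.');
    }

    // The parentheses make [n] index the whole node set, so at most one node matches.
    return sprintf('xpath=(//%s)[%d]', $tag, $n);
}

echo nthXPath('p', 2), PHP_EOL; // prints "xpath=(//p)[2]"
```

Inside the Salvager::agent() closure, the result could then be passed as the selector, for example $agent->text(nthXPath('p', 2)), assuming agent-browser accepts the same xpath= prefix used in the example above.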
MIT