Crawler options 0.7.13+
📝 Name: crawlerOptions
· 🖥️ Option: --crawler-options
Additional options for configurable crawlers.
INFO
These options only apply to configurable crawlers. If the configured crawler does not implement the required interface, a warning is shown.
Example
Pass crawler options in the expected input format.
IMPORTANT
When passing crawler options as a command parameter or an environment variable, make sure to pass them as a JSON-encoded string.
./cache-warmup.phar --crawler-options '{"concurrency": 3, "request_options": {"delay": 3000}}'
{
    "crawlerOptions": {
        "concurrency": 3,
        "request_options": {
            "delay": 3000
        }
    }
}
use EliasHaeussler\CacheWarmup;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
    $config->setCrawlerOption('concurrency', 3);
    $config->setCrawlerOption('request_options', [
        'delay' => 3000,
    ]);
    return $config;
};
crawlerOptions:
  concurrency: 3
  request_options:
    delay: 3000
CACHE_WARMUP_CRAWLER_OPTIONS='{"concurrency": 3, "request_options": {"delay": 3000}}'
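The JSON-encoded string required for the command parameter and the environment variable can also be generated instead of written by hand, which avoids quoting mistakes. The following is a minimal sketch using only PHP core functions. The variable names are illustrative and not part of the library.

```php
<?php

declare(strict_types=1);

// Crawler options as a plain PHP array (same values as in the example above).
$crawlerOptions = [
    'concurrency' => 3,
    'request_options' => [
        'delay' => 3000,
    ],
];

// Encode to JSON; JSON_THROW_ON_ERROR raises an exception instead of returning false.
$json = json_encode($crawlerOptions, JSON_THROW_ON_ERROR);

// escapeshellarg() quotes the string so the shell passes it through as one argument.
echo './cache-warmup.phar --crawler-options ' . escapeshellarg($json) . PHP_EOL;
```

The same encoded string can be used as the value of CACHE_WARMUP_CRAWLER_OPTIONS.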
Option Reference
Both default crawlers are implemented as configurable crawlers:
EliasHaeussler\CacheWarmup\Crawler\ConcurrentCrawler
EliasHaeussler\CacheWarmup\Crawler\OutputtingCrawler
The following configuration options are currently available for both crawlers:
client_config
1.2.0+
🎨 Type: array<string, mixed>
· 🐝 Default: []
Optional configuration used when instantiating a new Guzzle client.
INFO
This crawler option can only be configured with a PHP configuration file.
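use EliasHaeussler\CacheWarmup;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use Psr\Http\Message\RequestInterface;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
    // Illustrative middleware (not part of the library): adds a header to every request.
    // Replace it with your own Guzzle middleware.
    $customMiddleware = Middleware::mapRequest(
        static fn (RequestInterface $request) => $request->withHeader('X-Foo', 'bar'),
    );
    $stack = HandlerStack::create();
    $stack->push($customMiddleware);
    $config->setCrawlerOption('client_config', [
        'handler' => $stack,
    ]);
    return $config;
};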
concurrency
0.7.13+
🎨 Type: integer
· 🐝 Default: 3
Define how many URLs are crawled concurrently.
INFO
Internally, Guzzle's Pool feature is used to send multiple requests concurrently as asynchronous requests. The library's RequestPoolFactory shows how this is implemented.
./cache-warmup.phar --crawler-options '{"concurrency": 5}'
{
    "crawlerOptions": {
        "concurrency": 5
    }
}
use EliasHaeussler\CacheWarmup;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
    $config->setCrawlerOption('concurrency', 5);
    return $config;
};
crawlerOptions:
  concurrency: 5
CACHE_WARMUP_CRAWLER_OPTIONS='{"concurrency": 5}'
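To illustrate what the concurrency value controls, here is a minimal sketch of Guzzle's Pool on its own. It is not the library's RequestPoolFactory implementation. It assumes Guzzle is installed via Composer next to the script; the URLs, counters, and mocked responses are hypothetical and only serve to make the sketch run without network access.

```php
<?php

declare(strict_types=1);

// Assumption: Guzzle was installed with Composer and vendor/ sits next to this script.
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\Psr7\Response;
use Psr\Http\Message\ResponseInterface;

// Mocked responses stand in for real HTTP traffic so the sketch runs offline.
$mock = new MockHandler([
    new Response(200),
    new Response(200),
    new Response(404),
]);
$client = new Client(['handler' => HandlerStack::create($mock)]);

// Hypothetical URLs to warm up.
$urls = [
    'https://www.example.com/',
    'https://www.example.com/foo',
    'https://www.example.com/missing',
];

// Lazily yield one request per URL; the pool pulls from this generator as slots free up.
$requests = static function (array $urls): Generator {
    foreach ($urls as $url) {
        yield new Request('HEAD', $url);
    }
};

$successful = 0;
$failed = 0;

$pool = new Pool($client, $requests($urls), [
    // At most two requests are in flight at the same time.
    'concurrency' => 2,
    'fulfilled' => static function (ResponseInterface $response, int $index) use (&$successful): void {
        ++$successful;
    },
    'rejected' => static function (Throwable $reason, int $index) use (&$failed): void {
        ++$failed;
    },
]);

// Start the transfers and block until every request has completed.
$pool->promise()->wait();

printf("%d successful, %d failed\n", $successful, $failed);
```

A higher concurrency value shortens the total run time for large URL sets but puts more simultaneous load on the target server.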
request_headers
0.7.13+
🎨 Type: array<string, mixed>
· 🐝 Default: ['User-Agent' => '<default user-agent>']
A list of HTTP headers to send with each cache warmup request.
INFO
The default User-Agent is built in ConcurrentCrawlerTrait::getRequestHeaders().
./cache-warmup.phar --crawler-options '{"request_headers": {"X-Foo": "bar", "User-Agent": "Foo-Crawler/1.0"}}'
{
    "crawlerOptions": {
        "request_headers": {
            "X-Foo": "bar",
            "User-Agent": "Foo-Crawler/1.0"
        }
    }
}
use EliasHaeussler\CacheWarmup;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
    $config->setCrawlerOption('request_headers', [
        'X-Foo' => 'bar',
        'User-Agent' => 'Foo-Crawler/1.0',
    ]);
    return $config;
};
crawlerOptions:
  request_headers:
    X-Foo: bar
    User-Agent: 'Foo-Crawler/1.0'
CACHE_WARMUP_CRAWLER_OPTIONS='{"request_headers": {"X-Foo": "bar", "User-Agent": "Foo-Crawler/1.0"}}'
request_method
0.7.13+
🎨 Type: string
· 🐝 Default: HEAD
The HTTP method used to perform cache warmup requests.
./cache-warmup.phar --crawler-options '{"request_method": "GET"}'
{
    "crawlerOptions": {
        "request_method": "GET"
    }
}
use EliasHaeussler\CacheWarmup;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
    $config->setCrawlerOption('request_method', 'GET');
    return $config;
};
crawlerOptions:
  request_method: GET
CACHE_WARMUP_CRAWLER_OPTIONS='{"request_method": "GET"}'
request_options
2.0+
🎨 Type: array<string, mixed>
· 🐝 Default: []
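Additional request options used for each cache warmup request. Requests are sent with Guzzle, whose request options express delay in milliseconds and timeout in seconds.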
./cache-warmup.phar --crawler-options '{"request_options": {"delay": 500, "timeout": 10}}'
{
    "crawlerOptions": {
        "request_options": {
            "delay": 500,
            "timeout": 10
        }
    }
}
use EliasHaeussler\CacheWarmup;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
    $config->setCrawlerOption('request_options', [
        'delay' => 500,
        'timeout' => 10,
    ]);
    return $config;
};
crawlerOptions:
  request_options:
    delay: 500
    timeout: 10
CACHE_WARMUP_CRAWLER_OPTIONS='{"request_options": {"delay": 500, "timeout": 10}}'