Crawler options 0.7.13+
📝 Name: crawlerOptions
· 🖥️ Option: --crawler-options
Additional options for configurable crawlers.
INFO
These options only apply to configurable crawlers. If the configured crawler does not implement the required interface, a warning is shown.
Example
Pass crawler options in the expected input format.
IMPORTANT
When passing crawler options as command parameter or environment variable, make sure to pass them as JSON-encoded string.
./cache-warmup.phar --crawler-options '{"concurrency": 3, "request_options": {"delay": 3000}}'
{
"crawlerOptions": {
"concurrency": 3,
"request_options": {
"delay": 3000
}
}
}
use EliasHaeussler\CacheWarmup;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
$config->setCrawlerOption('concurrency', 3);
$config->setCrawlerOption('request_options', [
'delay' => 3000,
]);
return $config;
};
crawlerOptions:
concurrency: 3
request_options:
delay: 3000
CACHE_WARMUP_CRAWLER_OPTIONS='{"concurrency": 3, "request_options": {"delay": 3000}}'
Option Reference
Both default crawlers are implemented as configurable crawlers:
The following configuration options are currently available for both crawlers:
concurrency
0.7.13+
🎨 Type: integer
· 🐝 Default: 3
Define how many URLs are crawled concurrently.
INFO
Internally, Guzzle's Pool feature is used to send multiple requests concurrently using asynchronous requests. You may also have a look at how this is implemented in the library's RequestPoolFactory
.
./cache-warmup.phar --crawler-options '{"concurrency": 5}'
{
"crawlerOptions": {
"concurrency": 5
}
}
use EliasHaeussler\CacheWarmup;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
$config->setCrawlerOption('concurrency', 5);
return $config;
};
crawlerOptions:
concurrency: 5
CACHE_WARMUP_CRAWLER_OPTIONS='{"concurrency": 5}'
request_headers
0.7.13+
🎨 Type: array<string, mixed>
· 🐝 Default: ['User-Agent' => '<default user-agent>']
A list of HTTP headers to send with each cache warmup request.
INFO
The default User-Agent is built in ConcurrentCrawlerTrait::getRequestHeaders()
.
./cache-warmup.phar --crawler-options '{"request_headers": {"X-Foo": "bar", "User-Agent": "Foo-Crawler/1.0"}}'
{
"crawlerOptions": {
"request_headers": {
"X-Foo": "bar",
"User-Agent": "Foo-Crawler/1.0"
}
}
}
use EliasHaeussler\CacheWarmup;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
$config->setCrawlerOption('request_headers', [
'X-Foo' => 'bar',
'User-Agent' => 'Foo-Crawler/1.0',
]);
return $config;
};
crawlerOptions:
request_headers:
X-Foo: bar
User-Agent: 'Foo-Crawler/1.0'
CACHE_WARMUP_CRAWLER_OPTIONS='{"request_headers": {"X-Foo": "bar", "User-Agent": "Foo-Crawler/1.0"}}'
request_method
0.7.13+
🎨 Type: string
· 🐝 Default: HEAD
The HTTP method used to perform cache warmup requests.
./cache-warmup.phar --crawler-options '{"request_method": "GET"}'
{
"crawlerOptions": {
"request_method": "GET"
}
}
use EliasHaeussler\CacheWarmup;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
$config->setCrawlerOption('request_method', 'GET');
return $config;
};
crawlerOptions:
request_method: GET
CACHE_WARMUP_CRAWLER_OPTIONS='{"request_method": "GET"}'
request_options
2.0+
🎨 Type: array<string, mixed>
· 🐝 Default: []
Additional request options used for each cache warmup request.
./cache-warmup.phar --crawler-options '{"request_options": {"delay": 500, "timeout": 10}}'
{
"crawlerOptions": {
"request_options": {
"delay": 500,
"timeout": 10
}
}
}
use EliasHaeussler\CacheWarmup;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
$config->setCrawlerOption('request_options', [
'delay' => 500,
'timeout' => 10,
]);
return $config;
};
crawlerOptions:
request_options:
delay: 500
timeout: 10
CACHE_WARMUP_CRAWLER_OPTIONS='{"request_options": {"delay": 500, "timeout": 10}}'
write_response_body
4.0+
🎨 Type: bool
· 🐝 Default: false
Define whether or not to write response body of crawled URLs to the corresponding response object.
WARNING
Enabling this option may significantly increase memory consumption during cache warmup.
./cache-warmup.phar --crawler-options '{"write_response_body": true}'
{
"crawlerOptions": {
"write_response_body": true
}
}
use EliasHaeussler\CacheWarmup;
return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
$config->setCrawlerOption('write_response_body', true);
return $config;
};
crawlerOptions:
write_response_body: true
CACHE_WARMUP_CRAWLER_OPTIONS='{"write_response_body": true}'