# Frequently Asked Questions (FAQ)
## Are sitemap indexes supported?
Yes, the XML parser detects whether it is parsing a single XML sitemap or a sitemap index. XML sitemaps referenced in a sitemap index are followed until a configured limit, if any, is reached.
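For reference, a sitemap index follows the standard format defined at sitemaps.org and merely references other XML sitemaps; the URLs below are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Each referenced sitemap is followed until a configured limit is reached. -->
  <sitemap>
    <loc>https://www.example.org/sitemap-pages.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://www.example.org/sitemap-news.xml</loc>
  </sitemap>
</sitemapindex>
```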
## I can't see any valuable output during cache warmup. How can I debug the process?
Several debugging and logging tools are available to increase the verbosity of the cache warmup process. Take a look at the `logFile`, `logLevel` and `progress` configuration options. You can also increase output verbosity by using the `-v` command option.
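For example, logging and verbosity can be combined in a single run. This is a minimal sketch; the `--log-file` and `--log-level` flags are assumed to be the command-line counterparts of the `logFile` and `logLevel` configuration options, and the sitemap URL is a placeholder:

```bash
# Write a debug log to a file and print verbose output to the console.
# --log-file and --log-level are assumed CLI counterparts of logFile/logLevel.
./cache-warmup.phar "https://www.example.org/sitemap.xml" \
  --log-file cache-warmup.log \
  --log-level debug \
  -v
```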
## Can I limit the number of concurrently warmed URLs?
When using the default crawlers, you can configure the concurrency value using the `concurrency` crawler option.
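As a sketch, the option can be passed as JSON on the command line, assuming a `--crawler-options` flag analogous to the `--client-options` flag shown further below:

```bash
# Warm up at most 3 URLs concurrently; the --crawler-options flag is assumed.
./cache-warmup.phar "https://www.example.org/sitemap.xml" --crawler-options '{"concurrency": 3}'
```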
## Is it possible to crawl URLs with GET instead of HEAD?
Yes, this can be configured by using the `request_method` crawler option in combination with one of the default crawlers.
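A minimal sketch, again assuming a `--crawler-options` flag analogous to `--client-options`:

```bash
# Crawl URLs with GET requests instead of the default HEAD requests.
./cache-warmup.phar "https://www.example.org/sitemap.xml" --crawler-options '{"request_method": "GET"}'
```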
## How can I configure basic auth credentials?
This is possible by using the `clientOptions` configuration option in combination with one of the default crawlers and the default parser. Pass your basic auth credentials with the `auth` request option, for example:
```bash
./cache-warmup.phar --client-options '{"auth": ["username", "password"]}'
```

```json
{
    "clientOptions": {
        "auth": ["username", "password"]
    }
}
```

```php
<?php

use EliasHaeussler\CacheWarmup;
use GuzzleHttp\RequestOptions;

return static function (CacheWarmup\Config\CacheWarmupConfig $config) {
    $config->setClientOption(RequestOptions::AUTH, ['username', 'password']);

    return $config;
};
```

```yaml
clientOptions:
  auth: ['username', 'password']
```

```bash
CACHE_WARMUP_CLIENT_OPTIONS='{"auth": ["username", "password"]}'
```
## Can I use a custom User-Agent header instead of the default one?
Yes, a custom `User-Agent` header can be configured by using the `request_headers` crawler option in combination with one of the default crawlers. In addition, it can be configured by using the `request_headers` parser option in combination with the default parser.
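As a sketch, both options can be set from the command line; the `--crawler-options` and `--parser-options` flags are assumed to mirror the `--client-options` flag, and the header value is a placeholder:

```bash
# Send a custom User-Agent header when crawling URLs and when fetching sitemaps.
# The --crawler-options and --parser-options flags are assumed.
./cache-warmup.phar "https://www.example.org/sitemap.xml" \
  --crawler-options '{"request_headers": {"User-Agent": "MyCrawler/1.0"}}' \
  --parser-options '{"request_headers": {"User-Agent": "MyCrawler/1.0"}}'
```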
## How can I reduce memory consumption and CPU load?
When crawling large sitemaps, memory consumption and CPU load may increase rapidly. The following measures can reduce consumption and save resources:
- Avoid the `progress` option together with the `-v`/`--verbose` option. At best, do not use `--progress` at all.
- Make sure the crawler option `write_response_body` is set to `false` (the default), as shown in the sketch after this list.
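A minimal sketch combining both measures, assuming a `--crawler-options` flag analogous to `--client-options`; setting `write_response_body` explicitly is only a safeguard, since `false` is already the default:

```bash
# No --progress and no -v/--verbose; response bodies are explicitly discarded.
./cache-warmup.phar "https://www.example.org/sitemap.xml" \
  --crawler-options '{"write_response_body": false}'
```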
## What does "default crawlers" actually mean?
The library ships with two crawlers. Depending on the provided configuration options, one of them is used for cache warmup, unless you configure a custom crawler by using the `crawler` configuration option. Read more at Default crawlers.
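For completeness, a custom crawler could be selected on the command line; the `--crawler` flag is assumed to accept a fully qualified class name, and `Vendor\Crawler\MyCrawler` is purely hypothetical:

```bash
# Use a hypothetical custom crawler class instead of one of the default crawlers.
./cache-warmup.phar "https://www.example.org/sitemap.xml" --crawler 'Vendor\Crawler\MyCrawler'
```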