When reposting articles from this site, please credit the source: 月影鹏鹏 [http://Jacky.Aiwaly.com]
Permalink: http://jk.aiwaly.com/wp/varnish%e4%b8%ad%e8%8b%b1%e6%96%87%e6%89%8b%e5%86%8c.html
.Setting up a backend server
backend www {
.host = "www.example.com";
.port = "http";
}
The backend object can later be used to select a backend at request time:
if (req.http.host ~ "^example.com$") {
set req.backend = www;
}
////////////////////////////////////////////////////////////////////////////////////
.Load balancing across multiple backends
Can Varnish do load balancing?
Yes, Varnish allows backends to be grouped in a director, which directs requests to its members in a pre-defined
fashion. Here is an example of a round robin director:
Define a group of backends (a director) that uses round-robin selection:
director www-director round-robin {
{ .backend = www; }
{ .backend = { .host = "www2.example.com"; .port = "http"; } }
}
When this group is used, requests are distributed over its backends in round-robin order:
sub vcl_recv {
if (req.http.host ~ "^(www.)?example.com$") {
set req.backend = www-director;
}
}
2)
backend b3 {
.host = "b3";
.port = "83";
}
director b2 random {
{ .backend = {
.host = "b1";
.port = "81";
}
.weight = 7;
}
{ .backend = b3;
.weight = 2;
}
}
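The two selection policies above can be modelled outside Varnish. The following Python sketch is only an illustration, not Varnish's implementation: it assumes that a round-robin director cycles through its members in order and that a random director picks each member with probability proportional to its .weight. The function names are hypothetical.

```python
import itertools
import random

def round_robin(backends):
    """Return an endless iterator over the backends in rotating order."""
    return itertools.cycle(backends)

def weighted_pick(weights, rng):
    """Pick one backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Round-robin over the two members of www-director.
rr = round_robin(["www", "www2"])
first_four = [next(rr) for _ in range(4)]

# Random director b2: b1 has weight 7, b3 has weight 2.
rng = random.Random(0)  # fixed seed so the run is repeatable
picks = [weighted_pick({"b1": 7, "b3": 2}, rng) for _ in range(9000)]
share_b1 = picks.count("b1") / len(picks)
print(first_four, round(share_b1, 2))
```

With weights 7 and 2, b1 should receive roughly 7/9 of the requests under this assumption.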
///////////////////////////////////////////////////////////////////////
.Retrying on another backend when one backend cannot deliver
Retrying with another backend if one backend reports a non-200 response.
1)
sub vcl_recv {
if (req.restarts == 0) {
set req.backend = b1;
} else {
set req.backend = b2;
}
}
sub vcl_fetch {
if (obj.status != 200) {
restart;
}
}
2)
backend b1 {
.host = "fs.freebsd.dk";
.port = "82";
}
backend b2 {
.host = "fs.freebsd.dk";
.port = "81";
}
backend b3 {
.host = "fs.freebsd.dk";
.port = "80";
}
sub vcl_recv {
if (req.restarts == 0) {
set req.backend = b1;
} else if (req.restarts == 1) {
set req.backend = b2;
} else {
set req.backend = b3;
}
}
sub vcl_fetch {
## If the request to the backend returns a code other than 200, restart the loop
## If the number of restarts reaches the value of the parameter max_restarts,
## the request will be error’ed. max_restarts defaults to 4. This prevents
## an eternal loop in the event that, e.g., the object does not exist at all.
if (obj.status != 200 && obj.status != 403 && obj.status != 404) {
restart;
}
}
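The restart logic of the second example can be traced with a small Python model. This is a sketch under stated assumptions, not Varnish code: it assumes the request is given up once the restart counter has reached max_restarts, and that giving up is reported as a 503. The function names and the status table are hypothetical.

```python
MAX_RESTARTS = 4  # default of the max_restarts parameter, per the comment above

def pick_backend(restarts):
    """Mirror vcl_recv: b1 first, b2 on the first restart, b3 afterwards."""
    if restarts == 0:
        return "b1"
    if restarts == 1:
        return "b2"
    return "b3"

def handle(status_by_backend):
    """Return (backend, status) of the accepted response, or (None, 503)."""
    restarts = 0
    while True:
        backend = pick_backend(restarts)
        status = status_by_backend[backend]
        if status in (200, 403, 404):  # mirror vcl_fetch: accepted as-is
            return backend, status
        if restarts >= MAX_RESTARTS:   # assumption: no further restart allowed
            return None, 503           # assumption: reported as an error
        restarts += 1

print(handle({"b1": 500, "b2": 502, "b3": 200}))
```

Accepting 403 and 404 is what keeps a missing object from cycling through every backend until the restart limit is hit.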
////////////////////////////////////////////////////////////////////////
.Blocking spiders
Preventing search engines from populating the cache with old documents
This can be done by checking the user-agent header in the HTTP request.
sub vcl_miss {
if (req.http.user-agent ~ "spider") {
error 503 "Not presently in cache";
}
}
////////////////////////////////////////////////////////////////////////
.Common commands
Varnish has a set of command line tools and utilities to monitor and administer Varnish. These are:
* varnishncsa: Displays the varnishd shared memory logs in Apache / NCSA combined log format
* varnishlog: Reads and presents varnishd shared memory logs.
* varnishstat: Displays statistics from a running varnishd instance.
* varnishadm: Sends a command to the running varnishd instance.
* varnishhist: Reads varnishd shared memory logs and presents a continuously updated histogram showing the
distribution of the last N requests by their processing.
* varnishtop: Reads varnishd shared memory logs and presents a continuously updated list of the most commonly
occurring log entries.
* varnishreplay: Parses varnish logs and attempts to reproduce the traffic.
//////////////////////////////////////////////////////////////////////////
.pipe or pass?
Should I use pipe or pass in my VCL code? What is the difference?
When varnish does a pass it acts like a normal HTTP proxy. It reads the request and pushes it onto the backend. The
next HTTP request can then be handled like any other.
pipe is only used when Varnish for some reason can’t handle the pass. pipe reads the request, pushes it on to the
backend, and after that _only_ pushes bytes back and forth, with no other actions taken.
Since most HTTP clients pipeline several requests into one connection, this might give you an undesirable result,
as every subsequent request will reuse the existing pipe.
Varnish versions prior to 2.0 do not support handling a request body in pass mode, so in those releases pipe is
required for correct handling.
In 2.0 and later, pass will handle the request body correctly.
////////////////////////////////////////////////////////////////////////////
.Refreshing or clearing the cache
How can I force a refresh on an object cached by varnish?
Refreshing is often called purging a document. You can purge at least 2 different ways in Varnish:
1. From the command line you can write:
url.purge ^/$
to purge your / document. As you might see, url.purge takes a regular expression as its argument. Hence the ^ and $
at the front and end. If the ^ is omitted, all the documents ending in a / in the cache would be deleted.
So to delete all the documents in the cache, write:
url.purge .*
at the command line.
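The effect of the anchors can be checked outside Varnish. The Python sketch below only illustrates the regular expressions, assuming an unanchored search over each cached URL; the URL list and the function name are made up for the example.

```python
import re

# Hypothetical cache contents, for illustration only.
cached_urls = ["/", "/news/", "/news/item.html", "/img/logo.png"]

def purged(pattern, urls):
    """Return the URLs that an unanchored regex search would match."""
    rx = re.compile(pattern)
    return [u for u in urls if rx.search(u)]

only_root = purged(r"^/$", cached_urls)       # just the / document
ending_in_slash = purged(r"/$", cached_urls)  # ^ omitted: anything ending in /
everything = purged(r".*", cached_urls)       # the whole cache
print(only_root, ending_in_slash, len(everything))
```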
////////////////////////////////////////////////////////////////////////////
.Debugging a single client
How can I debug the requests of a single client?
The “varnishlog” utility may produce a horrendous amount of output. Being able to debug only our own traffic can be useful.
The ReqStart token will include the client IP address. To see log entries matching this, type:
$ varnishlog -c -o ReqStart 192.0.2.123
To see the backend requests generated by a client IP address, we can match on the TxHeader token, since the IP
address of the client is included in the X-Forwarded-For header in the request sent to the backend.
At the shell command line, type:
$ varnishlog -b -o TxHeader 192.0.2.123
/////////////////////////////////////////////////////////////////////////////
.Rewriting URLs
How can I rewrite URLs before they are sent to the backend?
You can use the “regsub()” function to do this. Here’s an example for Zope, to rewrite URLs for the
virtualhostmonster:
if (req.http.host ~ "^(www.)?example.com") {
set req.url = regsub(req.url, "^", "/VirtualHostBase/http/example.com:80/Sites/example.com/VirtualHostRoot");
}
//////////////////////////////////////////////////////////////////////////////
.Handling multiple hostnames
I have a site with many hostnames, how do I keep them from multiplying the cache?
You can do this by normalizing the “Host” header for all your hostnames. Here’s a VCL example:
if (req.http.host ~ "^(www.)?example.com") {
set req.http.host = "example.com";
}
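Which Host values the expression actually catches can be tried out in Python. This is a rough model only: it assumes Python's re module behaves like Varnish's regex engine for this simple pattern, and the last hostname in the list is a hypothetical one chosen to show what the missing trailing $ lets through.

```python
import re

def normalize_host(host):
    """Model of the VCL rule above: map every matching Host to example.com."""
    if re.search(r"^(www.)?example.com", host):
        return "example.com"
    return host

hosts = [
    "example.com",
    "www.example.com",
    "static.example.com",   # no match: stays a separate cache entry
    "example.com.invalid",  # matches, because the pattern has no trailing $
]
normalized = [normalize_host(h) for h in hosts]
print(normalized)
```

If only the two intended names should be folded together, anchoring the pattern with $ and escaping the dots would be the stricter choice.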
///////////////////////////////////////////////////////////////////////////////
.Logging the client IP on the backend
How can I log the client IP address on the backend?
All I see is the IP address of the varnish server. How can I log the client IP address?
We will need to add the IP address to a header used for the backend request, and configure the backend to log the
content of this header instead of the address of the connecting client (which is the varnish server).
Varnish configuration:
sub vcl_recv {
# Add a unique header containing the client address
remove req.http.X-Forwarded-For;
set req.http.X-Forwarded-For = client.ip;
# [...]
}
For the apache configuration, we copy the “combined” log format to a new one we call “varnishcombined”, for
instance, and change the client IP field to use the content of the variable we set in the varnish configuration:
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" varnishcombined
And so, in our virtualhost, you need to specify this format instead of “combined” (or “common”, or whatever else you
use)
<VirtualHost *:80>
ServerName www.example.com
# [...]
CustomLog /var/log/apache2/www.example.com/access.log varnishcombined
# [...]
</VirtualHost>
The mod_extract_forwarded Apache module might also be useful.
////////////////////////////////////////////////////////////////////////////////////
.Adding HTTP headers
How do I add an HTTP header?
To add an HTTP header, unless you want to add something about the client/request, it is best done in vcl_fetch, as
this means it will only be processed when the object is fetched:
sub vcl_fetch {
# Add a unique header containing the cache server's IP address:
remove obj.http.X-Varnish-IP;
set obj.http.X-Varnish-IP = server.ip;
# Another header:
set obj.http.Foo = "bar";
}
/////////////////////////////////////////////////////////////////////////////////////
.Altering the request going to the backend
How do I alter the request going to the backend?
You can use the bereq object for altering requests going to the backend, but from my experience you can only ’set’
values on it. So, if you need to change the requested URL, this doesn’t work:
sub vcl_miss {
set bereq.url = regsub(bereq.url,"stream/","/");
fetch;
}
Because you cannot read from bereq.url (in the value part of the assignment). You will get:
mgt_run_cc(): failed to load compiled VCL program:
./vcl.1P9zoqAU.o: undefined symbol: VRT_r_bereq_url
VCL compilation failed
Instead, you have to use req.url:
sub vcl_miss {
set bereq.url = regsub(req.url,"stream/","/");
fetch;
}
///////////////////////////////////////////////////////////////////////////////////////
.Forcing the backend to send Vary headers
How do I force the backend to send Vary headers?
We have anecdotal evidence of non-RFC2616 compliant backends, which support content negotiation, but which do not
emit a Vary header, unless the request contains Accept headers.
It may be appropriate to send no-op Accept headers to trick the backend into sending us the Vary header.
The following should be sufficient for most cases:
Accept: */*
Accept-Language: *
Accept-Charset: *
Accept-Encoding: identity
Note that Accept-Encoding can not be set to *, as the backend might then send back a compressed response which the
client would be unable to process.
This can of course be implemented in VCL.
////////////////////////////////////////////////////////////////////////////////////////
.Customizing error messages
How can I customize the error messages that Varnish returns?
A custom error page can be generated by adding a vcl_error to your configuration file. The default error page looks
like this:
sub vcl_error {
set obj.http.Content-Type = "text/html; charset=utf-8";
synthetic {"
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>"} obj.status " " obj.response {"</title>
</head>
<body>
<h1>Error "} obj.status " " obj.response {"</h1>
<p>"} obj.response {"</p>
<h3>Guru Meditation:</h3>
<p>XID: "} req.xid {"</p>
<address><a href="http://www.varnish-cache.org/">Varnish</a></address>
</body>
</html>
"};
deliver;
}
///////////////////////////////////////////////////////////////////////////////////////////
.Ignoring URL query parameters
How do I instruct varnish to ignore the query parameters and only cache one instance of an object?
This can be achieved by removing the query parameters using a regexp:
sub vcl_recv {
set req.url = regsub(req.url, "\?.*", "");
}
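The substitution can be tried on a few sample URLs in Python. The sketch assumes that regsub() replaces only the first match, which re.sub(..., count=1) mirrors; the sample URLs and the function name are invented for the example.

```python
import re

def strip_query(url):
    """Drop everything from the first '?' onward (first match only)."""
    return re.sub(r"\?.*", "", url, count=1)

variants = ["/page", "/page?utm=1", "/page?a=1&b=2", "/page?"]
cache_keys = {strip_query(u) for u in variants}
print(cache_keys)
```

All four variants collapse to a single URL, so only one object would be cached for them. Note that the backend then never sees the query string either.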
///////////////////////////////////////////////////////////////////////////////////////////
.Varnish settings
To see further description of these settings, also check param.show -l in the Varnish management interface.
-p thread_pool_max=4000 (default 1000)
This number should be as low as possible, but with an upwards margin. Do not set it much higher than you need,
that only leads to thread pile-ups. The “correct” number is something like the 90th percentile of the number of concurrent
requests when running your peak load. Since that is an incredibly tricky number to measure, I suggest you set it 10%
over the highest number of threads you see during normal operation.
-p thread_pools=4 (default 1)
To reduce lock contention, you might want to increase this number a little. But just a little.
-p listen_depth=4096 (default 1024)
You may want to increase this, but there is little advantage to increasing it too high. Set it to your peak
connection/second rate, so that you get a buffer of a full second if the acceptor gets busy. More than that is not
going to do anything good.
============================================
Running with many objects
If you have many objects (more than 100000), you may need to set the following command line options:
-p lru_interval=3600 (default: 2 seconds)
If your cache servers cache most/all objects for a longer time, it makes sense to increase the period before an
object is moved to the LRU list. This reduces the amount of lock operations necessary for LRU list access.
-h classic,500009 (default: 16383)
To keep hash lookups fast, you should not have more than 10 objects per hash bucket. If you have 3 million
objects, the number of buckets should be at least 300000. The number should be a prime number. You can generate one on
http://www.prime-numbers.org/.
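A suitable bucket count can also be computed locally instead of looked up. The Python sketch below is one way to do it, assuming the rule of thumb above (at most 10 objects per bucket, prime bucket count); the function names are hypothetical.

```python
def is_prime(n):
    """Trial division; fast enough for numbers of this size."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def bucket_count(num_objects, per_bucket=10):
    """Smallest prime that keeps the average load at or below per_bucket."""
    n = max(2, -(-num_objects // per_bucket))  # ceiling division
    while not is_prime(n):
        n += 1
    return n

buckets = bucket_count(3_000_000)
print(f"-h classic,{buckets}")
```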
-p obj_workspace=4096 (default: 8192)
For every object, this amount of memory is allocated for HTTP protocol header information. Try to decrease this
setting, it will decrease the need for VM space to fit all your objects. Be aware that Varnish currently crashes if
an object is too big for this limit (see #214)
-s malloc,50G
Try running with malloc storage if you experience VM hangs. You do this instead of setting up data files, and
might have to increase the amount of swap space. You can set a limit for how much to allocate, which should
be smaller than the available swap space on the machine. There is a possible benefit in not having any swap space on
the OS/system disk.
======================================================
VCL Setting
Enable grace period (Varnish serves stale, but cacheable, objects while retrieving the object from the backend)
in vcl_recv:
set req.grace = 30s;
in vcl_fetch:
set obj.grace = 30s;
//////////////////////////////////////////////////////////////////////////////////////////////////
.FreeBSD
* If using FreeBSD 7.0 or newer, try using SCHED_ULE instead of SCHED_4BSD in your kernel config.
* Turn off soft-updates on the filesystems where you keep your Varnish data files. It will not help Varnish.
* sysctl.conf settings (see tuning(7) manpage and
http://www.freebsd.org/doc/en/books/handbook/configtuning-kernel-limits.html):
kern.ipc.nmbclusters=65536
kern.ipc.somaxconn=16384
kern.maxfiles=131072
kern.maxfilesperproc=104856
kern.threads.max_threads_per_proc=4096
* loader.conf settings:
kern.ipc.maxsockets="131072"
kern.ipc.maxpipekva="104857600" (only if you get the “kern.ipc.maxpipekva exceeded” messages in your logs; Varnish does not use pipes for worker pool synchronization any more)
* If you run 32-bit FreeBSD, you will need to set kern.maxdsiz (maximum data size per process, in bytes)
in loader.conf to a larger number if you want to cache more than 512 MB (the default setting) of objects.
* If you use the malloc storage type, and your system hangs with “swap zone exhausted, increase kern.maxswzone”
on the console, try increasing kern.maxswzone (default is 32 MB in FreeBSD 7.0) in loader.conf.
///////////////////////////////////////////////////////////////////////////////////////////////////
.Linux
Edit /etc/sysctl.conf
These are numbers from a highly loaded Varnish serving about 4000-8000 req/s
(details: http://projects.linpro.no/pipermail/varnish-misc/2008-April/001769.html)
net.ipv4.ip_local_port_range = 1024 65535
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_wmem=4096 65536 16777216
net.ipv4.tcp_fin_timeout = 3
net.ipv4.tcp_tw_recycle = 1
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_no_metrics_save=1
net.core.somaxconn = 262144
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
///////////////////////////////////////////////////////////////////////////////////////////////////
.All UNIX platforms
* Set the mount options noatime and nodiratime on the filesystems where you keep your Varnish data files. There
is no point in keeping track of how often they are accessed; it wastes cycles and causes unnecessary disk activity.
//////////////////////////////////////////////////////////////////////////////////////////////////
.Caching pages with cookies
Caching, even when cookies are present
Please note that this might quite easily end up serving content meant for one user to another, with all the chaos
which can follow.
By default (and design) Varnish does not cache content with cookies in it. This is the recommended behaviour, so
please only use the following recipe if you are sure you want to cache even with cookies and changing the web
application is not possible.
Adding the cookie to the hash
This causes a lookup for a given object to include the Cookie. This will give you a per-user cache, so fairly low
cache hit ratio and requires your system to not change the cookie on each page hit.
sub vcl_hash {
set req.hash += req.http.cookie;
}
////////////////////////////////////////////////////////////////////////////////////////
.Removing cookies for jpg and other static files
Removing Set-Cookie from the backend (for a particular path)
In this case, we remove both the Cookie header and the Set-Cookie header for objects under a predefined path. This
is quite common for images and similar static content.
sub vcl_recv {
if (req.url ~ "^/images") {
unset req.http.cookie;
}
}
sub vcl_fetch {
if (req.url ~ "^/images") {
unset obj.http.set-cookie;
}
}
Caching based on file extensions
Here we throw away the cookie the client supplied by forcing a lookup. The default VCL code is _not_ run after
vcl_recv.
sub vcl_recv {
if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
lookup;
}
}
# strip the cookie before the image is inserted into cache.
sub vcl_fetch {
if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
unset obj.http.set-cookie;
}
}
/////////////////////////////////////////////////////////////////////////////////////////
.Cache lifetime on Varnish versus on the client
How to cache things longer on Varnish than on the client
RFC2616 spends quite a lot of time explaining what the expiration rules are for normal client-side caches.
The explanation is not the best I have seen, and Varnish is not a client side cache anyway, so this is my attempt to
set the record straight, or at least firmly crooked on the subject in a Varnish context.
At the sound of the tone…
In an ideal world, all computers would have clocks that show the correct time.
If they did, the Expires: header could be used to say when a given web-object should be thrown away, and my
explanation would be done now.
Getting computer clocks in sync is a lot harder than it sounds and despite the valiant efforts of Prof. Dave Mills
and his NTP gang, this is far from the situation on the internet.
The main obstacle is that people just do not care enough to do it, and the secondary obstacles are complicated
rules for timekeeping, which involve not only time zones and daylight savings time but also leap seconds.
Fortunately, once upon a time it was predicted that some basic web-clients would not have a clock at all, and
therefore the RFC2616 standard offers a way to control lifetime in relative terms (“throw away after 600 seconds”)
instead of absolute terms (“throw away at 10:35:00 20-01-2008 UTC”).
RFC2616 specifies an algorithm in section 13.2.4 which combines the absolute information and the relative
information, and then picks the earlier of the two resulting deadlines.
The Varnish complication
Varnish does not fit the model in RFC2616 for the simple reason that varnish is not a client side cache, but a part
of the web-server.
Where a client cache must be defensive about everything, to not get in the way or change semantics for the
client/server relationship, varnish is the server in the relationship and may be responsible for implementing
content policies etc.
At the most basic level, how long varnish and the client can cache a given object may differ.
A website may very likely want varnish to cache an object forever, trusting the backend server to explicitly purge
it, should it be updated.
But that does not mean that we want the clients to cache the object forever.
Because the backend’s purge requests can not reach the clients, it is necessary to have the client check back after a
reasonable amount of time, to see if the object has changed.
How it works
Varnish acts like an RFC2616 client side cache by default, with the footnote that if no cacheability information is
available, we use a default Time To Live (TTL) from the parameter “default_ttl”.
This means that Varnish will respect the s-maxage or max-age Cache-Control fields and will respect Expires headers.
Varnish leaves Expires: and Cache-Control: headers intact, and sets the Age: header to the number of seconds the
object has been cached; therefore, any RFC2616 client will do the right thing by default.
How it should work
It is very likely that you want to have Varnish cache objects longer than the clients do, and this is where RFC2616
comes up short: it offers no way to communicate the two different lifetimes from the backend.
The solution is to have the backend emit the objects with the desired headers for client use, and then set the
obj.ttl in the VCL code to the longer duration.
But this is not quite enough to get the desired effect.
The Expires header from the backend must be removed, as it would pertain only to the direct client connection case;
it could in theory be replaced, by Varnish, with a new header.
According to RFC2616, just issuing a max-age to the client should be just as precise as generating an Expires
header, and it has the advantage of not expecting the client’s clock to be correct, so unless somebody informs me
otherwise, my recommendation is to not bother with Expires.
Besides, we do not have a convenient way to generate this timestamp in Varnish presently.
The Age: header generated by Varnish must also be neutered, otherwise it would grow well beyond the max-age sent to
the client.
A solution in VCL could look like this:
sub vcl_fetch {
if (obj.cacheable) {
/* Remove Expires from backend, it's not long enough */
unset obj.http.expires;
/* Set the clients TTL on this object */
set obj.http.cache-control = "max-age=900";
/* Set how long Varnish will keep it */
set obj.ttl = 1w;
/* marker for vcl_deliver to reset Age: */
set obj.http.magicmarker = "1";
}
}
sub vcl_deliver {
if (resp.http.magicmarker) {
/* Remove the magic marker */
unset resp.http.magicmarker;
/* By definition we have a fresh object */
set resp.http.age = "0";
}
}
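Why the Age: header has to be reset can be seen from a little arithmetic. The Python sketch below is a simplified model, assuming a client that treats a response as reusable for max-age minus Age seconds; the function name and the three-hour figure are made up for the example.

```python
def remaining_client_lifetime(max_age, age):
    """Seconds a client may still reuse a response, given max-age and Age."""
    return max(0, max_age - age)

MAX_AGE = 900          # client lifetime sent in Cache-Control above
IN_VARNISH = 3 * 3600  # hypothetical: object has been in Varnish for 3 hours

# Age left as Varnish computed it: the response is stale on arrival.
untouched = remaining_client_lifetime(MAX_AGE, IN_VARNISH)
# Age reset to 0 in vcl_deliver: the client gets the full 900 seconds.
reset = remaining_client_lifetime(MAX_AGE, 0)
print(untouched, reset)
```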
/////////////////////////////////////////////////////////////////////////////////////////////
.Removing some, but not all, cookies
In some cases, you might want to remove only a few, selected cookies, for example if you use Google Analytics.
Currently, this has to be done using regular expressions as follows:
sub vcl_recv {
# Is it the first one?
set req.http.cookie = regsub(req.http.cookie, "foo=[^;]+(; )?", "");
# Or perhaps one in the middle or the last one?
set req.http.cookie = regsub(req.http.cookie, "(; )?foo=[^;]+", "");
if (req.http.cookie ~ "^ *$") {
remove req.http.cookie;
}
}
Replace foo with the name of the cookie you wish to remove.
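To see what the two substitutions do to typical Cookie headers, they can be replayed in Python. This is a sketch, assuming regsub() replaces only the first match and that Python's re module behaves like Varnish's regex engine for these patterns; the function name and the sample cookies are invented.

```python
import re

def strip_cookie(cookie, name="foo"):
    """Apply the two regsub() steps above; None means the header is removed."""
    escaped = re.escape(name)
    cookie = re.sub(escaped + r"=[^;]+(; )?", "", cookie, count=1)
    cookie = re.sub(r"(; )?" + escaped + r"=[^;]+", "", cookie, count=1)
    return None if re.search(r"^ *$", cookie) else cookie

first = strip_cookie("foo=1; bar=2")
middle = strip_cookie("a=1; foo=2; b=3")
last = strip_cookie("bar=2; foo=1")
only = strip_cookie("foo=1")
print(repr(first), repr(middle), repr(last), repr(only))
```

In this model the first substitution already matches foo wherever it appears, so when foo is the last cookie a trailing "; " is left behind. That is usually harmless, but worth knowing if the header is used in the hash.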
//////////////////////////////////////////////////////////////////////////////////////////////
.Enable force-refresh from clients (when the client forces a browser refresh, e.g. with Ctrl+F5, and sends a no-cache request)
When receiving a “force-refresh” request from a client, this configuration will fetch the requested element from the
backend, update the cache and deliver it to the client.
sub vcl_recv {
# Force lookup if the request is a no-cache request from the client
if (req.http.Cache-Control ~ "no-cache") {
purge_url(req.url);
}
}
////////////////////////////////////////////////////////////////////////////////////////////////
.Redirecting URLs
If you, for some reason, don’t want to redirect on the backend, but prefer to do it in VCL, you can do it using one
of the following recipes:
Redirect if the user agent matches a regex
sub vcl_recv {
if (req.http.user-agent ~ "iP(hone|od)") {
error 750 "Moved Temporarily";
}
}
sub vcl_error {
if (obj.status == 750) {
set obj.http.Location = "http://www.example.com/iphoneversion/";
set obj.status = 302;
deliver;
}
}
Redirect if the user agent matches a regex (multiple sites)
sub vcl_recv {
if (req.http.user-agent ~ "lwp") {
if (req.http.host ~ "example.com") {
error 750 "example.com";
} else {
error 750 "localhost";
}
}
}
sub vcl_error {
if (obj.status == 750) {
if (obj.response ~ "example.com") {
set obj.http.Location = "http://www.example.com/customversion";
} elsif (obj.response ~ "localhost") {
set obj.http.Location = "http://localhost/customversion";
}
set obj.status = 302;
deliver;
}
}
///////////////////////////////////////////////////////////////////////////////
.Ignoring cache headers
Ignore cache headers from the backend
Some backends send headers that tell varnish not to cache elements. Header examples are:
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
To override these headers and still put the element into cache for 2 minutes, the following configuration may be
used:
sub vcl_fetch {
if(obj.ttl < 120s){
set obj.ttl = 120s;
}
}