Monday, August 29, 2016

Varnish Statistics One-liners

Varnish Command Line
 
Here are some useful examples of varnishtop and varnishlog at work.

Displays a continuously updated list of the most frequently requested URLs:
varnishtop -i RxURL
varnishlog -c | grep 'RxURL'

Top requests to your backend. This shows only the "TxURL", the URL being retrieved from your backend.
varnishtop -i TxURL
varnishlog -b | grep 'TxURL'

See which cookie values are most commonly sent to Varnish.
varnishtop -i RxHeader -I Cookie
varnishlog -c | grep 'Cookie: '

Which host is being requested the most. Only really useful when you're serving multiple hosts through Varnish.
varnishtop -i RxHeader -I '^Host:'
varnishlog -i RxHeader | grep 'Host: '

See the most popular Accept-Encoding headers clients are sending you.
varnishtop -i RxHeader -I '^Accept-Encoding'

See which user agents are the most common among clients
varnishtop -i RxHeader -C -I ^User-Agent

See what user agents are commonly accessing the backend servers; compare with the previous one to find clients that commonly cause misses.
varnishtop -i TxHeader -C -I ^User-Agent

See which Accept-Charset values are used by clients
varnishtop -i RxHeader -I '^Accept-Charset'

List all details about backend requests resulting in a 500 or 404 status:
varnishlog -b -m "RxStatus:500"
varnishlog -b -m "RxStatus:404"
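Note that the tag names above are for Varnish 3. In Varnish 4 and later the log tags were renamed and varnishlog gained a query language, so the rough equivalents look like this (a sketch; check the man pages for your installed version):

```shell
varnishtop -i ReqURL                    # most requested URLs (was RxURL)
varnishtop -i BereqURL                  # top backend fetches (was TxURL)
varnishtop -I ReqHeader:User-Agent      # most common client User-Agent headers
varnishlog -b -q 'BerespStatus == 500'  # backend responses with a 500 status
```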

References
  • http://book.varnish-software.com/3.0/Getting_started.html
  • https://www.varnish-cache.org/docs/3.0/reference/varnishlog.html
  • https://www.varnish-cache.org/docs/3.0/reference/varnishtop.html
  • https://www.varnish-cache.org/docs/3.0/tutorial/increasing_your_hitrate.html
  • https://www.varnish-cache.org/docs/3.0/tutorial/logging.html#tutorial-logging
  • https://ma.ttias.be/useful-varnish-3-0-commands-one-liners-with-varnishtop-and-varnishlog/
  • http://stackoverflow.com/questions/13247707/how-to-read-output-of-varnishtop
  • http://www.eldefors.com/varnish-command-line-tools/
  • https://www.varnish-cache.org/trac/wiki/

Wednesday, August 24, 2016

Request Entity Too Large Solution

I stumbled across this "Request Entity Too Large" issue when a script on a server tried to upload large files. This applies to Apache/NGINX/IIS servers running PHP.


The solution seems to lie in changing the following PHP variables.

memory_limit = 128M
post_max_size = 96M
upload_max_filesize = 64M

Test, then increase the above values gradually in 16M or 32M increments, to ensure your web hosting can handle them.

If you are on NGINX and see the error '413 Request Entity Too Large', you will also need to adjust client_max_body_size in the http block of the config file:

http {
    #...
        client_max_body_size 128m;
    #...
}
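After changing client_max_body_size, a config test and reload apply it without downtime (the reload command assumes a systemd-based server):

```shell
nginx -t                    # validate the configuration
systemctl reload nginx      # apply without dropping connections
```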


Note 'post_max_size' sets the maximum size of POST data allowed. This setting also affects file uploads. To upload large files, this value must be larger than 'upload_max_filesize'. Generally speaking, 'memory_limit' should be larger than 'post_max_size'.

You might also want to increase 'max_execution_time'. Note however that your server configuration, i.e. timeouts, might still interrupt executions.

Thursday, August 18, 2016

Varnish and SEO

Varnish is a great caching solution but it can do more. With Search Engine Optimisation it is always recommended that you have one base URL, often referred to as a canonical URL: either the www.mysample.com or mysample.com domain. Some website owners even have multiple domains. The snippets below show how you can redirect www/non-www to non-www/www and/or multiple domains to a canonical URL.

Varnish 3
In sub vcl_recv, add the following close to the top:
    if (req.http.host == "www.mysample.com" || req.http.host == "my-sample.com" || req.http.host == "www.my-sample.com") {
        set req.http.host = "mysample.com";
        error 750 "http://" + req.http.host + req.url;
    }
In sub vcl_error, add:
   if (obj.status == 750) {
        set obj.http.Location = obj.response;
        set obj.status = 301;
        return(deliver);
    }
Varnish 4
In sub vcl_recv, add:
    if (req.http.host ~ "^www.mysample.com") {
        return (synth (750, ""));
    }
In sub vcl_synth, add:
    if (resp.status == 750) {
        set resp.status = 301;
        set resp.http.Location = "http://mysample.com" + req.url;
        return(deliver);
    }
Actually I like the implementation in Varnish 4 better, as you can make all the related changes in one place instead of at two locations. This also helps your memory usage, as only a single object is stored in cache instead of one for www.mysample.com/index.html and another for mysample.com/index.html.
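Either way, it is worth syntax-checking the edited VCL before reloading; varnishd -C compiles it to C and reports any errors (the file path below is an example):

```shell
varnishd -C -f /etc/varnish/default.vcl
```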

Hope this helps someone

Source
How to redirect non-www URLs to www in Varnish

Monday, January 25, 2016

SSH and Multiplexing

Recently I worked on a project that required the transfer of data between servers using RSYNC at regular intervals. One of the things I noticed was that after a while the number of open SSH processes started to increase, as each new RSYNC session would open its own connection. I started wondering if those SSH connections could be reused.

Then I discovered there was a way to configure SSH to reuse the open TCP connection rather than setting up a new one. This comes with a big advantage, as the overhead of creating a new TCP connection is eliminated. This results in faster connection times and transfer of data.

To configure this, you will need to edit the SSH config file for the respective user account (~/.ssh/config):
Host x.x.x.x
  ControlMaster auto
  ControlPath ~/.ssh/sockets/%r@%h-%p
  ControlPersist 1800 
What do these options mean?

  • Host x.x.x.x # the IP/domain name of the server you are connecting to. You can opt to use the wildcard * but I prefer being specific, which also offers some security
  • ControlMaster auto # the default is no, so you have to specify one of "yes", "auto", "ask" or "autoask"
  • ControlPath ~/.ssh/sockets/%r@%h-%p # make sure the path exists; create the folder and secure it. %h - target hostname, %r - remote username, %p - port. Other variables include %L, %l, %n & %u
  • ControlPersist 1800 # how long the master connection remains open after the last session closes, before timing out due to inactivity. Defaults to seconds but can also be expressed as 30m or 1h

You can have multiple blocks
Host 1.1.1.1
  ControlMaster auto
  ControlPath ~/.ssh/sockets/%r@%h-%p
  ControlPersist 1800 
Host 2.2.2.2
  ControlMaster auto
  ControlPath ~/.ssh/sockets/%r@%h-%p
  ControlPersist 1800 
All commands that use SSH, e.g. RSYNC, SCP, SFTP, will benefit from multiplexing. It should be noted that multiplexing allows a maximum of 10 open sessions per connection by default. You can increase this number by changing the value of MaxSessions. If you have a large number of connections you might also need to change MaxStartups.
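Two housekeeping notes, sketched below: the ControlPath directory must exist before the first connection, and ssh's -O option lets you inspect or close a master connection (the host is an example):

```shell
mkdir -p ~/.ssh/sockets && chmod 700 ~/.ssh/sockets   # create and secure the socket folder
ssh -O check user@x.x.x.x                             # is a master connection up?
ssh -O exit user@x.x.x.x                              # close the master connection
```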


Monday, January 18, 2016

MySQL Replication and slave_net_timeout

In an article written in 2009, Jeremy Zawodny took on the MySQL default for slave_net_timeout among other settings. Daniel Schneller, author of the MySQL Admin Cookbook, had written about his experience in 2006.

What is slave_net_timeout?

It is the time in seconds for the slave to wait for more data from the master before considering the connection broken, after which it will abort the read and attempt to reconnect. The retry interval is determined by the MASTER_CONNECT_RETRY option of the CHANGE MASTER statement, while the maximum number of re-connection attempts is set by the master-retry-count variable. The first reconnect attempt takes place immediately.

For installations before MySQL 5.7 the default value for slave_net_timeout is 3600 seconds (1 hour), but as of MySQL 5.7 it is 60 seconds. MariaDB still has 3600 seconds as the default.

The problem is summed up in the following statement.
When the network connection between a master and slave database is interrupted in a way that neither side can detect (like a firewall or routing change), you must wait until slave_net_timeout seconds have passed before the slave realizes that something is wrong. It will then try to reconnect to the master and pick up where it left off. This is bad: with slave_net_timeout at its old default of 3600 seconds, that wait could be a full hour.
I discovered this issue when I noticed the slave was not up to date but there were no errors. However if the slave was stopped (STOP SLAVE) and started (START SLAVE), suddenly replication started again and everything was up to date.

Before the restart the following would be noted: the Read_Master_Log_Pos and Exec_Master_Log_Pos values did not match the log position (SHOW MASTER STATUS) on the master server. On the slave, the Slave_IO_State was "Waiting for master to send event", and the Slave_IO_Running and Slave_SQL_Running values were both "Yes". The Master_Log_File and Relay_Master_Log_File matched. In essence, don't trust SHOW SLAVE STATUS alone.

The solution is to lower slave_net_timeout to a more reasonable value, 60 to 300 seconds.
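A sketch of the change, assuming a value of 60 seconds; the file location varies by distribution:

```ini
# my.cnf / my.ini, on the slave - persists across restarts
[mysqld]
slave_net_timeout = 60
```

The same value can be applied at runtime with SET GLOBAL slave_net_timeout = 60; though it may only take effect for the replication connection after a STOP SLAVE; START SLAVE;.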

Other values worth looking at include:
  • skip-name-resolve - enable it and use IPs only in GRANT statements
  • connect_timeout - set higher than the default
  • max_connect_errors - set to a high value
  • interactive_timeout - set to 300 seconds, well below the default of 28800 seconds
  • wait_timeout - set to 300 seconds, well below the default of 28800 seconds
The last two have been of concern to Eliot Kristan since 2006.

As usual make one configuration change (or related changes only)  at a time in order to monitor and evaluate the effectiveness of the change.

References
  1. Fixing Poor MySQL Default Configuration Values
  2. Eliot Kristan - MySQL wait_timeout default is set too high!
  3. Some Reasonable Defaults for MySQL Settings 
  4. MySQL replication timeout trap
  5. MariaDB- Replication and Binary Log Server System Variables
  6. MySQL -  Replication Slave Options and Variables
  7. MySQL lowering wait_timeout value to lower number of open connections
  8. Changing MASTER_CONNECT_RETRY - Anything pit falls to keep in mind? 
  9. MySQL replication hung after slave goes offline and comes back online again

Monday, January 11, 2016

Removing Directories older than x days

I have a setup where directories are created daily for the given date in the format yyyymmdd, and content is placed there for temporary access.

After a while this content becomes stale and can be deleted.

The holding folder is
/home/user/holding/

and the folders created are
/home/user/holding/20151221
/home/user/holding/20151222
/home/user/holding/20151223
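For context, a daily cron job along these lines can create the dated folders (the base path is illustrative; this sketch defaults to a temporary directory so it is safe to run):

```shell
# Create today's holding directory, named yyyymmdd
base=${base:-$(mktemp -d)}        # in production: base=/home/user/holding
today=$(date +%Y%m%d)
mkdir -p "$base/$today"
ls "$base"
```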

I want to delete directories older than x days

The following commands will delete files only and not the directories.
/usr/bin/find /home/user/holding/ -type f -mtime +5 | /usr/bin/xargs rm
/usr/bin/find /home/user/holding/ -type f -mtime +5  -exec rm -rf {} \;
To delete the directories along with their files, modify the above and change rm to rm -rf
/usr/bin/find /home/user/holding/* -mtime +5 | /usr/bin/xargs rm -rf
Note the asterisk and the removal of  "-type f"

Testing

Checking with -mmin +1, i.e. files/directories modified over 1 minute ago. Always test what you are going to delete before proceeding.

/usr/bin/find /home/user/holding/* -mmin +1
/home/user/holding/20151221
/home/user/holding/20151222
/home/user/holding/20151223

/usr/bin/find /home/user/holding/ -mmin +1
/home/user/holding/
/home/user/holding/20151221
/home/user/holding/20151222
/home/user/holding/20151223
Therefore, to delete the folders/directories in /home/user/holding/ and not /home/user/holding/ itself, use /usr/bin/find /home/user/holding/*
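The whole flow can be rehearsed safely in a throwaway directory first (GNU touch -d is assumed for back-dating; the folder names are examples). Note too that the xargs form breaks on paths containing spaces, another reason to dry-run first:

```shell
# Rehearse the cleanup in a temporary holding area
tmp=$(mktemp -d)
mkdir -p "$tmp/holding/20151221" "$tmp/holding/20160110"
touch -d "10 days ago" "$tmp/holding/20151221"   # back-date: stale folder
touch "$tmp/holding/20160110"                    # fresh folder
find "$tmp/holding"/* -mtime +5                  # dry run: lists the stale folder only
find "$tmp/holding"/* -mtime +5 | xargs rm -rf   # real run
ls "$tmp/holding"
```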

Complete
To use this with cron to delete folders older than x days, in this case 5, use the following.
0 6 * * *  /usr/bin/find /home/user/holding/* -mtime +5 |  /usr/bin/xargs rm -rf  > /dev/null

Note: be very careful with rm -rf. It can wipe out your entire system if used incorrectly. Always test what you want to delete before attempting to do so.

Audio Noise Removal

Many times in record clips we end up with unwanted background noise. Recently faced with the situation, I  needed to clean up some audio and...