Copyright (c) 2003-2008 ViSolve
Table of Contents
1. What is Vicompress?
2. Compiling and Installing Vicompress
3. Running Vicompress
4. Configuring Vicompress
5. Compression
6. Caching
7. Load Balancing and Failover
8. Log Statistics
9. Log Files
1. What is Vicompress?
Vicompress is an HTTP web accelerator. It speeds up download response
times by caching frequently requested pages, and by compressing text
pages for smaller downloads. Vicompress can be used in two different
setups, one for Internet Service Providers (ISPs), and one for individual
websites:
[Figure: Setup for ISPs]
[Figure: Setup for website]
What features does Vicompress support?
- In-Memory compression
Vicompress supports in-memory compression of text pages, such
as HTML, JavaScript, stylesheets, PDF documents, and Microsoft Word documents.
Image files and other file types are not compressed. Because text pages
are compressed, the download time will be faster for clients, especially
over slow connections.
- In-Memory caching
Vicompress supports in-memory caching of static data, such as
images, stylesheets, and HTML files. If a web page is in the
cache, Vicompress will respond to the client directly, rather than
contacting the web server. For ISPs, this results in a
faster response to clients. For single websites, this reduces the
load on the backend web server.
- Load Balancing and Failover
For websites, Vicompress supports load-balancing over multiple backend
web servers. If a backend web server goes down, Vicompress will internally
mark the server as down and avoid sending requests to that server.
Vicompress will then periodically check the down servers, to see if they
have come back up.
- Sticky Sessions
Vicompress supports sticky sessions, where all traffic from the same
client browser goes to the same backend web server.
- DNS lookup caching
When requesting a webpage, the DNS lookup time (looking up the IP address
of the website's hostname) can often be slow, sometimes up to 20 or 30 seconds.
Vicompress will cache the DNS lookups, thereby improving response time.
- Log files
Vicompress logs every HTTP request to a log file. It supports the
Apache Combined Log Format, as well as the Squid Access Log format.
- Log statistics
Vicompress provides tools to generate log statistics in HTML format.
The HTML reports include statistics such as bandwidth, compression,
and caching for a given period (hour, day, month).
- Scales to multiple processors
Vicompress uses multiple threads for the CPU intensive gzip compression.
Vicompress will automatically determine how many processors your system has,
and will spawn one compression thread per processor.
What features does Vicompress not support?
- IPv6 Addresses
Vicompress does not support IPv6 addresses, only IPv4 addresses.
2. Compiling and Installing Vicompress
Requirements
- A POSIX compatible Unix system. However, only Linux 2.4, Linux 2.6, and
Cygwin have been tested.
- The GNU gcc compiler. Other ANSI-C compilers may work, but will probably
require Makefile modifications to compile.
- The pthread and zlib libraries.
- GNU make.
Compiling from the source
Extract the vicompress source code, and change into the src directory.
# gunzip vicompress-1.0.x.tar.gz
# tar -xvf vicompress-1.0.x.tar
# cd vicompress-1.0.x/src/
Run the configure script, passing the directory to install vicompress into.
The default directory is /usr/local/vicompress.
# ./configure /usr/local/vicompress
If you're using a C compiler other than gcc, you will need to edit the
compiler flags in the Makefile. The default flags are:
CC=gcc
LIBS= -lpthread -lz
CFLAGS= -O1 -Wall
LDFLAGS=
Compile and install the source code.
# make
# make install
The make install command will copy the Vicompress runtime files into the
install directory (/usr/local/vicompress).
The following files will be installed:
Vicompress Files
File | Description
/usr/local/vicompress/LICENSE | The license
/usr/local/vicompress/README.html | The HTML documentation
/usr/local/vicompress/bin/tune_kernel.sh | Script to tune Linux kernel parameters for performance
/usr/local/vicompress/bin/update_log_stats | Program to generate/update log statistics every hour
/usr/local/vicompress/bin/update_log_stats.sh | Script to cleanly start/stop the update_log_stats program
/usr/local/vicompress/bin/vicompress | Main vicompress server
/usr/local/vicompress/bin/vicompress.sh | Script to cleanly start/stop the vicompress server
/usr/local/vicompress/etc/vicompress.conf | Configuration file
/usr/local/vicompress/log/ | Directory where log files are stored
/usr/local/vicompress/logstats/ | Directory where HTML log statistics are written
/usr/local/vicompress/logstats/template.html | Template HTML file used when generating HTML statistics reports
/usr/local/vicompress/logstats/verticalbarN.png | Image files used in HTML statistics reports
/usr/local/vicompress/logstats/visolvelogo.png | Image file used in HTML statistics reports

Installing from an RPM
ViSolve also distributes Vicompress as a pre-compiled binary RPM
(Red Hat Package Manager) file for Linux x86 based systems. You can install
Vicompress from the binary RPM by running the rpm install command:
# rpm -i Vicompress-1.0.x-1.i386.rpm

To remove the Vicompress package, use the rpm erase command (rpm -e). All the
Vicompress files will be removed, except for log files.
3. Running Vicompress

Vicompress can be run in one of two setups:
- For ISPs: a forward HTTP proxy that forwards requests to the server
given in the URL.
- For websites: a reverse HTTP proxy that forwards requests to a set
of backend web servers.
If you are running Vicompress in the ISP setup, you can start the
server with the default settings. However, for the website setup,
you must add the IP address and port of every backend web server
to the configuration file /usr/local/vicompress/etc/vicompress.conf,
one per line, in the form:
webserver <IP address> <port>
For example:
webserver 192.168.10.2 80
webserver 192.168.10.3 80
See the Configuring Vicompress section for more details
about the configuration file.
To start and stop the server, use the script:
/usr/local/vicompress/bin/vicompress.sh
It can take one of three possible arguments:
start | start the vicompress server
stop | stop a running vicompress server process
status | print whether or not vicompress is running
To start Vicompress, simply use the "start" argument:
# cd /usr/local/vicompress/bin
# ./vicompress.sh start
The server is automatically run in the background. The default listening
port for Vicompress is 80.
4. Configuring Vicompress
Vicompress uses the configuration file
/usr/local/vicompress/etc/vicompress.conf
One option is specified per line. Blank lines are ignored. Lines beginning
with a hash (#) are ignored.
Each option is explained in detail below as follows:
- Option name and parameters
- Example parameters
- Description
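The file format described above (one option per line, blank lines and # comments ignored) can be read with a few lines of code. The following Python sketch is illustrative only and is not part of Vicompress: the function name parse_config is hypothetical, and tolerating leading whitespace before an option or a # is an assumption.

```python
def parse_config(text):
    """Parse vicompress.conf-style text into a list of (option, [arguments])."""
    options = []
    for raw_line in text.splitlines():
        line = raw_line.strip()  # assumption: surrounding whitespace is ignored
        # Blank lines and lines beginning with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        name, *args = line.split()
        options.append((name, args))
    return options


sample = """
# backend web servers
webserver 192.168.10.2 80
webserver 192.168.10.3 80

listenport 80
"""

for name, args in parse_config(sample):
    print(name, args)
```

Options such as webserver may appear several times, which is why the sketch returns a list of pairs rather than a dictionary.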
webserver <IP address> <port>
webserver 192.168.10.2 80
Specify a backend web server to forward requests to. For ISPs, no webserver
entries should be specified. In this scenario, Vicompress will act as a
forward HTTP proxy, and will connect to the origin server specified in the HTTP
request.
For websites, one or more webserver entries should be specified. Multiple
webserver entries may be specified, each on a separate line. In this
scenario, Vicompress will act as a reverse HTTP proxy, and will distribute the
requests among the backend webserver entries specified. See the
Load Balancing and Failover section for more details.

listenip <IP address>
listenip 192.168.0.1
Specify the IP address for Vicompress to listen on. The default value is
all interfaces, 0.0.0.0.

listenport <port>
listenport 80
Specify the port for Vicompress to listen on. The default port is 80, the
standard HTTP port. Only servers started by root can bind to
ports less than 1024.

outgoingip <IP address>
outgoingip 192.168.0.1
Specify the IP address for Vicompress to bind to when making outgoing connections.
The default value is any interface, 0.0.0.0.

enable_sessions <yes|no>
enable_sessions yes
Enable or disable sticky sessions. The default value is yes. This option
is only used when two or more webservers are specified in the Vicompress
configuration. When enabled, Vicompress will use HTTP cookies to ensure
that a client is sent to the same backend web server for the duration of
its session. An HTTP session lasts until the client browser is closed.
See the Load Balancing and Failover section for more details.

enable_compression <yes|no>
enable_compression yes
Enable or disable gzip compression. The default value is yes. When enabled,
Vicompress will gzip HTML and text pages before sending the response to
the client.

enable_caching <yes|no>
enable_caching yes
Enable or disable caching of pages. The default value is yes. When enabled,
Vicompress will cache static pages and images in memory.

cache_memory <size in megabytes>
cache_memory 50
Specify the size of the in-memory cache, in megabytes. The default value
is 50. Note that Vicompress will also use around 30 MB for basic operation
and compression. The total memory (cache_memory + 30 MB) should not exceed
the amount of RAM available. If cache_memory is set to 0, caching
is disabled.

max_cacheditem_size <size in kilobytes>
max_cacheditem_size 512
Web pages larger than this size will not be cached. In order to have a high
hit rate, Vicompress should cache many small pages
rather than a few large pages. The default value is 512 kilobytes.

cache_expires <hours>
cache_expires 240
When a web page is cached, it remains in the cache based on its age
(the time since it was last modified).
The expiration time is set to half of the item's age. For example,
a page that is 4 days old will be cached for 2 days.
This option places an upper limit on the expiration time.
The default value is 240 hours (10 days), so no page stays in the cache
for more than 10 days.

enable_dns_caching <yes|no>
enable_dns_caching yes
Enable or disable caching of DNS lookups. The default value is yes.
When enabled, Vicompress will store the hostname-to-IP address
mappings in memory.

dns_expires <hours>
dns_expires 48
Specify the amount of time a cached DNS mapping is valid. The default
is 48 hours.

user <username>
user nobody
The user to run the server as. Vicompress will switch to this user after
binding to the listening port (usually port 80). Vicompress is generally
started as root, since only root can bind to ports less than 1024.
However, it is unsafe to run a server program as root: a buffer overflow error
could give clients root access to the server machine. Therefore, Vicompress
will switch to a non-root user after binding to the listening port. That user
is specified by the "user" option given above.
The default value is the user who started the server.

hostheader <hostname>
hostheader mydomain.com
This option only applies when accelerating a single webserver (when the
webserver option is enabled). Specify the hostname to use in the
HTTP Host header, when sending the HTTP request to the webserver. By
default, Vicompress will just send the same HTTP Host header it receives
from the client browser.

accesslog <path to logfile>
accesslog /usr/local/vicompress/log/accesslog
Specify the file path where the access log should be stored. The file must be
writable by the username given in the "user" option. If the accesslog is not
specified, no access log file will be created or written to.

errorlog <path to logfile>
errorlog /usr/local/vicompress/log/errorlog
Specify the file path where the error log should be stored. The file must be
writable by the username given in the "user" option. If the errorlog is not
specified, no error log file will be created or written to.

rotatesize <size in megabytes>
rotatesize 1024
Rotate the log files when they reach the specified size, in megabytes.
The default value is 1024. When rotation occurs, the current log file at
<accesslog> is moved to <accesslog>.1 and a blank log file is created at
<accesslog>. See the Log Files section for further details
about log file rotation.

logformat <apache|squid>
logformat squid
Specify the format of the accesslog file. The supported formats are the
Apache Combined Log Format, and the Squid Access Log Format.
The default value is the Squid format. See the
Log Files and Log Statistics sections for further details.

5. Compression
Vicompress can compress text pages, such as HTML, JavaScript,
stylesheets, PDF documents, and Microsoft Word documents,
before sending them back to the client.
This results in faster download times, especially over slow modem connections.
Both static and dynamic pages (such as output from PHP or CGI scripts) can be
compressed. Images and other binary file types are not compressed.
Vicompress checks the HTTP header Accept-Encoding to determine whether the
client's browser supports gzip encoding.
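To illustrate the Accept-Encoding check, here is a minimal Python sketch. It is not Vicompress's own code: the function names are hypothetical, and treating "gzip;q=0" as a refusal follows general HTTP semantics rather than anything stated in this document.

```python
import gzip


def accepts_gzip(accept_encoding):
    """Return True if an Accept-Encoding header value allows gzip."""
    for item in accept_encoding.split(","):
        coding, _, params = item.strip().partition(";")
        if coding.strip().lower() != "gzip":
            continue
        name, _, value = params.partition("=")
        if name.strip().lower() == "q":
            try:
                return float(value) > 0  # q=0 means "not acceptable"
            except ValueError:
                return True
        return True
    return False


def maybe_compress(body, accept_encoding):
    """Return (payload, content_encoding) for a text response body."""
    if accepts_gzip(accept_encoding):
        return gzip.compress(body), "gzip"
    return body, None


page = b"<html>" + b"hello world " * 200 + b"</html>"
payload, encoding = maybe_compress(page, "gzip, deflate")
print(encoding, len(page), "->", len(payload))
```

A client that does not list gzip receives the original body unchanged.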
Compression related options:
enable_compression
6. Caching
Vicompress can cache data in memory, such as HTML pages and images.
When a browser requests an item found in the cache, Vicompress
will send the response directly, rather than contacting the destination
webserver. For ISPs, this results in a faster response time for clients.
For single websites, this reduces load on the backend webserver.
Vicompress will not cache web pages that are generated dynamically,
such as through ASP, PHP, or CGI scripts. Vicompress uses the HTTP
headers Last-Modified and Content-Length to determine
whether a response is dynamically generated or not. In addition,
Vicompress will not cache pages that are password protected (pages
that require the HTTP header Authorization).
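The rules above can be sketched as a small predicate. This is a hedged illustration, not the actual Vicompress logic: the function name is hypothetical, and it assumes that a reply carrying both Last-Modified and Content-Length counts as static, which the text implies but does not spell out.

```python
def is_cacheable(request_headers, response_headers):
    """Guess whether a reply is a static, unprotected page worth caching."""
    # HTTP header names are case-insensitive, so compare in lower case.
    request = {name.lower() for name in request_headers}
    response = {name.lower() for name in response_headers}
    # Password-protected pages (requests carrying Authorization) are not cached.
    if "authorization" in request:
        return False
    # Assumption: dynamic replies lack Last-Modified and/or Content-Length.
    return "last-modified" in response and "content-length" in response


static_reply = {
    "Last-Modified": "Tue, 10 May 2005 17:18:41 GMT",
    "Content-Length": "1852",
}
print(is_cacheable({"Host": "example.com"}, static_reply))
print(is_cacheable({"Host": "example.com"}, {"Content-Type": "text/html"}))
```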
Items will remain in the cache based on their age (the
Last-Modified header).
The expiration time is set to half of the item's age. For example, a web
page that was last modified 8 days ago will remain in the Vicompress
cache for 4 days. In addition, the
cache_expires option sets an upper limit on the time an item can remain
in the cache.
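The expiration rule can be written out as a short calculation. The Python sketch below is illustrative: the function name is hypothetical, and clamping a Last-Modified date in the future to an age of zero is an assumption.

```python
from datetime import datetime, timedelta


def cache_lifetime(last_modified, now, cache_expires_hours=240):
    """Return how long an item may stay cached: half its age, capped."""
    age = max(now - last_modified, timedelta(0))  # assumption: no negative age
    limit = timedelta(hours=cache_expires_hours)  # the cache_expires option
    return min(age / 2, limit)


now = datetime(2005, 5, 10, 12, 0, 0)
print(cache_lifetime(now - timedelta(days=8), now))   # half of 8 days
print(cache_lifetime(now - timedelta(days=60), now))  # capped at 240 hours
```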
When the in-memory cache becomes full, items that have not been
recently accessed are removed to make room for new items in the cache.
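Removing items that have not been recently accessed is commonly implemented as least-recently-used (LRU) eviction. The document does not say exactly which policy Vicompress uses, so the following Python sketch is an assumption that shows the general idea; the class and its methods are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Byte-limited cache that evicts the least recently used items first."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.items = OrderedDict()  # url -> body, least recently used first

    def get(self, url):
        body = self.items.get(url)
        if body is not None:
            self.items.move_to_end(url)  # mark as most recently used
        return body

    def put(self, url, body):
        if len(body) > self.capacity_bytes:
            return  # larger than the whole cache, never stored
        if url in self.items:
            self.used_bytes -= len(self.items.pop(url))
        # Evict old items until the new one fits.
        while self.used_bytes + len(body) > self.capacity_bytes:
            _, evicted = self.items.popitem(last=False)
            self.used_bytes -= len(evicted)
        self.items[url] = body
        self.used_bytes += len(body)


cache = LRUCache(capacity_bytes=10)
cache.put("/a", b"aaaa")
cache.put("/b", b"bbbb")
cache.get("/a")            # "/a" is now the most recently used item
cache.put("/c", b"cccc")   # does not fit, so "/b" is evicted
print(list(cache.items))
```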
Users can view the list of URLs in the memory cache by logging into the
Vicompress machine and requesting the following special URL from Vicompress:
http://<hostname>/_viewcache_
Vicompress will return a plain text list of the URLs in the cache, one
per line. Vicompress returns the list only to HTTP clients on the
same machine as Vicompress. Outside clients cannot access the cached URL
list.
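As a rough sketch, the cache listing could be retrieved and split into URLs from a script running on the Vicompress machine. The function names, the use of localhost, and the default port 80 are assumptions for illustration; only the /_viewcache_ path comes from this document.

```python
from urllib.request import urlopen


def parse_cache_listing(text):
    """Split the plain text reply into a list of cached URLs, one per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def fetch_cached_urls(host="localhost", port=80, timeout=5):
    """Request the special _viewcache_ URL from a local Vicompress server."""
    with urlopen(f"http://{host}:{port}/_viewcache_", timeout=timeout) as reply:
        return parse_cache_listing(reply.read().decode("utf-8", errors="replace"))


# Parsing demonstrated on a made-up reply, so no running server is needed.
example_reply = "http://example.com/\nhttp://example.com/logo.png\n"
print(parse_cache_listing(example_reply))
```

Calling fetch_cached_urls() only works on the machine where Vicompress is running, because outside clients are refused.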
Caching related options:
enable_caching
cache_memory
max_cacheditem_size
cache_expires
7. Load Balancing and Failover
Load Balancing
When one or more webserver entries are specified,
Vicompress will act as a reverse HTTP proxy, and will distribute requests
among the backend webservers. Vicompress uses a simple round-robin
algorithm for load distribution.
Failover
If Vicompress fails to connect to a backend web server, that web server
is marked as down, and will be skipped for future requests.
Clients that had previous sessions with that web server will be forwarded
to a new backend web server. Vicompress will try to re-connect to a down
web server every 3 minutes. If the connection succeeds, the web server is
marked as up again. If all backend web servers are down when a request
arrives, Vicompress will simply choose among the down web servers, in
round-robin fashion.
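The load balancing and failover behavior described above can be sketched as follows. This Python class is a hypothetical illustration, not Vicompress's implementation: the names are invented, and the caller is assumed to report failed connections and to re-test down servers.

```python
class Balancer:
    """Round-robin selection that skips servers marked as down."""

    RETRY_SECONDS = 180  # down servers are re-checked every 3 minutes

    def __init__(self, servers):
        self.servers = list(servers)
        self.down_since = {}  # server -> time it was marked down
        self.next_index = 0

    def mark_down(self, server, now):
        self.down_since[server] = now

    def mark_up(self, server):
        self.down_since.pop(server, None)

    def due_for_retry(self, now):
        """Return the down servers whose retry interval has elapsed."""
        return [s for s, t in self.down_since.items()
                if now - t >= self.RETRY_SECONDS]

    def pick(self):
        """Return the next server; if all are down, rotate among them anyway."""
        count = len(self.servers)
        all_down = all(s in self.down_since for s in self.servers)
        for offset in range(count):
            index = (self.next_index + offset) % count
            server = self.servers[index]
            if all_down or server not in self.down_since:
                self.next_index = (index + 1) % count
                return server
        return None  # no servers configured


balancer = Balancer(["192.168.10.2:80", "192.168.10.3:80"])
print(balancer.pick(), balancer.pick(), balancer.pick())
```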
Sessions
Many web applications keep session information for each client, such as
shopping cart items. Session information may be stored on a central database,
or may be stored locally on individual web servers. If your website stores
session information on individual web servers, then a client's requests
cannot be distributed across multiple web servers. To force a client to
use the same backend web server throughout a session, Vicompress provides
the enable_sessions option. When enabled,
Vicompress will send the client a cookie to indicate which backend web server
to use:
Set-Cookie: vicompressid=1
For the duration of the session, the client browser will send the vicompress
Cookie for every request:
Cookie: vicompressid=1
Vicompress will forward the requests to the backend webserver specified by
the cookie. When the client browser is closed, the browser discards the
vicompress Cookie, and the session is ended. Note that if sessions are
enabled, client connections may not be evenly distributed across the
multiple backend web servers.
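A minimal sketch of the cookie handling is shown below. It assumes that the vicompressid value is a zero-based index into the list of webserver entries, which this document does not confirm; the function names are hypothetical.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "vicompressid"


def backend_from_cookie(cookie_header, server_count):
    """Return the backend id named in a Cookie header value, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(COOKIE_NAME)
    if morsel is None or not morsel.value.isdigit():
        return None
    backend_id = int(morsel.value)
    # Assumption: ids are zero-based indexes into the webserver list.
    return backend_id if backend_id < server_count else None


def session_cookie_header(backend_id):
    """Build the header that pins a new client to one backend."""
    return f"Set-Cookie: {COOKIE_NAME}={backend_id}"


print(session_cookie_header(1))
print(backend_from_cookie("lang=en; vicompressid=1", server_count=2))
```

When the function returns None (no cookie, or an id that no longer exists), the request would fall back to normal round-robin selection and receive a fresh cookie.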
Load Balancing related options:
webserver
enable_sessions
8. Log Statistics
Vicompress includes a tool to generate statistics about bandwidth, caching,
and compression.
To generate the log statistics, use the script:
/usr/local/vicompress/bin/update_log_stats.sh
This script takes one of three possible arguments:
start <dir> | Run a daemon to generate/update the log statistics every hour. Store the log statistics in the given directory <dir>. If the <dir> argument is not given, the default directory is /usr/local/vicompress/logstats.
stop | Stop the update_log_stats.sh program.
status | Print whether or not the update_log_stats.sh program is running.
To generate the log statistics, run the following command:
# cd /usr/local/vicompress/bin
# ./update_log_stats.sh start /usr/local/vicompress/logstats
The update_log_stats program will run in the background. Every hour,
it will parse the accesslog and write an
HTML statistics report to
/usr/local/vicompress/logstats/YYYYMMstats.html
where YYYY is the year, and MM is the month. For example:
/usr/local/vicompress/logstats/200501stats.html (Jan 2005)
/usr/local/vicompress/logstats/200502stats.html (Feb 2005)
/usr/local/vicompress/logstats/200503stats.html (Mar 2005)
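The naming scheme can be reproduced in a script, for example to locate the report for a given date. This Python sketch is illustrative and the function name is hypothetical.

```python
import posixpath
from datetime import date


def stats_report_path(day, stats_dir="/usr/local/vicompress/logstats"):
    """Return the monthly report path (YYYYMMstats.html) covering a date."""
    return posixpath.join(stats_dir, day.strftime("%Y%m") + "stats.html")


print(stats_report_path(date(2005, 1, 15)))
```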
This report will show the bandwidth saved with caching and compression for
- The entire month
- Each day of the month
- Each day of the week
- Each hour of the day
In addition, an HTML date index file will be created containing
hyperlinks to all the monthly statistics.
/usr/local/vicompress/logstats/statsindex.html
9. Log Files
Vicompress produces two log files: an access log and error log.
Access Log
The accesslog stores information about each client request on a single line.
It is used for gathering website statistics. The path of the accesslog is
determined by the configuration option accesslog:
accesslog /usr/local/vicompress/log/accesslog
The log format is determined by the logformat option:
logformat <apache|squid>
The Apache Combined Log Format is described at
http://httpd.apache.org/docs/logs.html.
A summary of the format is given below. Note that Vicompress
uses the "ident" field (2nd field) to store cache and compression information.
Field | Description
clientip | The IP address of the client.
hit and compression | Either "hit" or "miss", followed by the content length before compression.
username | The username sent for authentication, or "-" if not given.
date | The date of the response [day/month/year:hour:min:sec +/-timezone]
firstline | The first line of the HTTP request (method url version).
replycode | The server HTTP reply status code.
contentlength | The length of the server reply body, in bytes.
referer | The URL which referred the user to this website.
useragent | The platform and version of the client browser.
Here is a sample Apache Log entry:
15.13.130.10 miss15923 - [21/Aug/2003:17:26:45 -0700] "GET /index.html HTTP/1.0" 200 1852 "http://www.google.com/" "Mozilla 4.0 (IE 6.0 compatible)"
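A log entry in this format can be split into its fields with a regular expression. The Python sketch below is based only on the sample entry above; the pattern and function name are hypothetical, and it assumes that "hit" or "miss" is followed directly by the uncompressed length with no separator.

```python
import re

APACHE_ENTRY = re.compile(
    r'(?P<clientip>\S+) (?P<cache>hit|miss)(?P<uncompressed>\d+) '
    r'(?P<username>\S+) \[(?P<date>[^\]]+)\] "(?P<firstline>[^"]*)" '
    r'(?P<replycode>\d{3}) (?P<contentlength>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<useragent>[^"]*)"'
)


def parse_apache_entry(line):
    """Return the fields of one access log line as a dict, or None."""
    match = APACHE_ENTRY.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    for key in ("uncompressed", "replycode", "contentlength"):
        entry[key] = int(entry[key])
    return entry


sample = ('15.13.130.10 miss15923 - [21/Aug/2003:17:26:45 -0700] '
          '"GET /index.html HTTP/1.0" 200 1852 "http://www.google.com/" '
          '"Mozilla 4.0 (IE 6.0 compatible)"')
entry = parse_apache_entry(sample)
saved_bytes = entry["uncompressed"] - entry["contentlength"]
print(entry["cache"], entry["firstline"], saved_bytes)
```

The difference between the two lengths is the number of bytes saved by compression for that request.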
The Squid Access Log Format is described at
http://www.squid-cache.org/Doc/FAQ/FAQ-6.html.
The Squid format contains information similar to the Apache format.
The Squid format contains additional information about cache hits, but does not
store the Referer or User-Agent headers. Note that Vicompress
uses the "ident" field (9th field) to store compression statistics.
Field | Description
date | The date of the response, the number of seconds since 1970.
duration | The duration of the response, in milliseconds.
client | The IP address of the client.
hit status | TCP_HIT if the request is a cache hit, else TCP_MISS.
replycode | The server HTTP reply status code.
contentlength | The length of the server reply body, in bytes.
method | The HTTP request method (GET, POST, etc).
URL | The requested URL.
compression | The content length of the reply before compression.
peerstatus | NONE if the request is a cache hit, else DIRECT.
peerhost | The IP address of the backend web server, or "-" if a cache hit.
contenttype | The content type of the HTTP reply, or "-" if not given.
Here is a sample Squid log entry:
1112387949.000 529 15.13.130.249 TCP_MISS/200 1031 GET http://www.amazon.com/somefile.jpg 8523 DIRECT/15.0.110.12 text/html
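The Squid-format entry is whitespace separated, with two fields that join a pair of values using a slash. The following Python sketch splits the sample entry above into the fields listed in the table; the function name and dictionary keys are hypothetical, and lines with missing or non-numeric fields are not handled.

```python
def parse_squid_entry(line):
    """Return the fields of one Squid-format log line as a dict, or None."""
    fields = line.split()
    if len(fields) != 10:
        return None
    hit_status, _, replycode = fields[3].partition("/")
    peerstatus, _, peerhost = fields[8].partition("/")
    return {
        "date": float(fields[0]),          # seconds since 1970
        "duration_ms": int(fields[1]),
        "client": fields[2],
        "hit": hit_status == "TCP_HIT",
        "replycode": int(replycode),
        "contentlength": int(fields[4]),
        "method": fields[5],
        "url": fields[6],
        "uncompressed": int(fields[7]),    # length before compression
        "peerstatus": peerstatus,
        "peerhost": peerhost,
        "contenttype": fields[9],
    }


sample = ("1112387949.000 529 15.13.130.249 TCP_MISS/200 1031 GET "
          "http://www.amazon.com/somefile.jpg 8523 DIRECT/15.0.110.12 text/html")
entry = parse_squid_entry(sample)
print(entry["hit"], entry["contentlength"], entry["uncompressed"])
```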
Log Rotation
The accesslog file can grow quickly under heavy load. Therefore, Vicompress
will automatically rotate log files once they reach a certain size.
This size is given by the configuration option:
rotatesize <size in megabytes>
The default size is 1024 megabytes, or about 1 gigabyte.
When rotation occurs, Vicompress will execute the following:
mv accesslog.8 accesslog.9
mv accesslog.7 accesslog.8
mv accesslog.6 accesslog.7
mv accesslog.5 accesslog.6
mv accesslog.4 accesslog.5
mv accesslog.3 accesslog.4
mv accesslog.2 accesslog.3
mv accesslog.1 accesslog.2
mv accesslog accesslog.1
The current logfile (accesslog) is moved to accesslog.1. The previous
logfile at accesslog.1 is moved to accesslog.2, and so forth. The last
logfile at accesslog.9 is deleted.
For errorlog rotation, the same commands occur, except that the errorlog
file is moved, instead of the accesslog file.
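The rotation steps above can be expressed as a short routine. This Python sketch mirrors the listed mv commands and is not the code Vicompress runs; the function name and the keep parameter are hypothetical. The demonstration uses a temporary directory so that no real log files are touched.

```python
import os
import tempfile


def rotate(logfile, keep=9):
    """Shift logfile -> logfile.1 -> ... -> logfile.<keep>, dropping the oldest."""
    oldest = f"{logfile}.{keep}"
    if os.path.exists(oldest):
        os.remove(oldest)  # the last logfile is deleted
    for n in range(keep - 1, 0, -1):
        source = f"{logfile}.{n}"
        if os.path.exists(source):
            os.rename(source, f"{logfile}.{n + 1}")
    if os.path.exists(logfile):
        os.rename(logfile, f"{logfile}.1")
    open(logfile, "w").close()  # start a new, empty current log


with tempfile.TemporaryDirectory() as tmp:
    log = os.path.join(tmp, "accesslog")
    for name, text in (("accesslog", "current"), ("accesslog.1", "previous")):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write(text)
    rotate(log)
    names = sorted(os.listdir(tmp))
    with open(log + ".1") as handle:
        rotated_text = handle.read()
    current_size = os.path.getsize(log)

print(names, rotated_text, current_size)
```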
Error Log
The error log stores error and debugging messages from Vicompress.
The path of the errorlog is determined by the configuration option errorlog:
errorlog /usr/local/vicompress/log/errorlog
By default, Vicompress will write only basic startup and shutdown messages
to the error log, prefixed by the current date. The messages are shown below:
Vicompress started
Vicompress shutting down
Maximum concurrent clients is <number>
Cache size is <number> MB
Users can enable additional debug messages at runtime to troubleshoot
any problems with Vicompress. Debugging is toggled (enabled/disabled)
by running the command below. Note that debugging is initially
disabled when Vicompress is started.
# cd /usr/local/vicompress/bin
# ./vicompress.sh debug
Each debug message in the errorlog contains the date, a message, and
the IP address:port of the client connection.
Here is a sample entry:
[Tue May 10 17:18:41 2005] [127.0.0.1:52689] New client accepted.
Below is the complete list of debugging messages:
New client accepted.
Read http request from client: status=<status message>, url=<urlpath> <HTTP request>
Reading server reply from <IP address>:<port>
Read http reply from server <IP address>:<port>: status=<status message> <HTTP reply>
Writing server reply
Writing and caching server reply
Writing and compressing server reply
Writing, caching, and compressing server reply
Writing error reply to client
Writing cache url list reply to client
Writing cached reply to client
Writing direct reply to client
Wrote server reply: status=<status message>
Wrote cached reply: status=<status message>
Wrote direct reply: status=<status message>
Wrote error reply: status=<status message>
Wrote cache url list reply: status=<status message>
Keeping client connection alive
Closing connection
For the <HTTP request> and <HTTP reply>, Vicompress
will print out the full HTTP request and reply headers.
For <status message>, Vicompress will print out one of
the messages below:
Success
Bad HTTP Request from client
Bad HTTP Reply from server
Client closed prematurely
Server closed prematurely
Connect failed
Error reading from client
Error writing to client
Error reading from server
Error writing to server
DNS lookup failed