dc.description.abstract | We present a method to extract from web cache logs a time series, the Number of
Active Requests (NAR), that serves as a transport-level measurement of internet traffic.
This series also reflects the performance or Quality of Service of a web cache. Using
time series modelling, we interpret the properties of this kind of internet traffic and
its effect on the performance perceived by the cache user.
Our preliminary analysis of NAR suggests that the series is a long-memory,
self-similar process but is not heavy-tailed. Following a more in-depth
analysis, we propose a three-stage modelling process for the time series: (i)
a power transformation to normalise the data, (ii) a polynomial fit to approximate
the general trend and (iii) a modelling of the residuals from the polynomial fit. We
analyse the polynomial and show that the residual dataset may be modelled as a
FARIMA(p, d, q) process.
Finally, we use Canonical Variate Analysis to determine the most significant defining
properties of our measurements and draw conclusions to categorise the differences
in traffic properties between the various caches studied. We show that the short-memory
parameters of the FARIMA fit distinguish the caches most strongly. We compare the
differences revealed between the caches studied and draw conclusions from them.
Several programs were written in the Perl and S programming languages for this
analysis, including totalqd.pl for NAR calculation, fullanalysis for general
statistical analysis of the data, and armamodel for FARIMA modelling. | en_US |
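
As an illustration of the NAR series described in the abstract, the following is a minimal Python sketch and not the authors' totalqd.pl. It assumes that each log record supplies a completion timestamp in seconds and an elapsed time in milliseconds (the layout used, for example, by Squid's native access.log), and that a request counts as active over the half-open interval from its start to its completion. The function name, the record layout and the one-second sampling step are hypothetical choices made for this example.

```python
import math


def number_of_active_requests(records, step=1.0):
    """Sample the number of in-progress requests every `step` seconds.

    `records` is an iterable of (end_time_s, elapsed_ms) pairs. This layout
    is an assumption for illustration; the abstract does not specify it.
    A request is treated as active for start <= t < end.
    """
    events = []
    for end_s, elapsed_ms in records:
        start_s = end_s - elapsed_ms / 1000.0
        events.append((start_s, 1))   # request begins
        events.append((end_s, -1))    # request completes
    if not events:
        return []

    events.sort()
    t0 = math.floor(events[0][0])     # first sampling instant
    last = events[-1][0]              # time of the final completion

    series = []
    active = 0
    i = 0
    k = 0
    while True:
        t = t0 + k * step
        if t > last:
            break
        # Apply every start/completion event that has occurred by time t.
        while i < len(events) and events[i][0] <= t:
            active += events[i][1]
            i += 1
        series.append((t, active))
        k += 1
    return series


# Hypothetical records: (completion time in s, elapsed time in ms).
example = [(10.0, 3000), (11.5, 1000), (12.0, 2000)]
series = number_of_active_requests(example)
```

The resulting list of (time, count) pairs is the kind of series to which the power transformation, polynomial fit and residual modelling named in the abstract would then be applied.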