(In chronological order, grouped by topic)
Singular Value Perturbation and Deep Network Optimization
Rudolf H. Riedi, Randall Balestriero and Richard G. Baraniuk
Constructive Approximation (2023) 57:807–852
https://doi.org/10.1007/s00365-022-09601-5
We develop new theoretical results on matrix perturbation to shed light on the impact
of architecture on the performance of a deep network. In particular, we explain analytically
what deep learning practitioners have long observed empirically: the parameters
of some deep architectures (e.g., residual networks, ResNets, and Dense networks,
DenseNets) are easier to optimize than others (e.g., convolutional networks, ConvNets).
Building on our earlier work connecting deep networks with continuous
piecewise-affine splines, we develop an exact local linear representation of a deep
network layer for a family of modern deep networks that includes ConvNets at one
end of a spectrum and ResNets, DenseNets, and other networks with skip connections
at the other. For regression and classification tasks that optimize the squared-error
loss, we show that the optimization loss surface of a modern deep network is piece-wise
quadratic in the parameters, with local shape governed by the singular values of
a matrix that is a function of the local linear representation. We develop new perturbation
results for how the singular values of matrices of this sort behave as we add a
fraction of the identity and multiply by certain diagonal matrices. A direct application
of our perturbation results explains analytically why a network with skip connections
(such as a ResNet or DenseNet) is easier to optimize than a ConvNet: thanks to its
more stable singular values and smaller condition number, the local loss surface of
such a network is less erratic, less eccentric, and features local minima that are more
accommodating to gradient-based optimization. Our results also shed new light on the
impact of different nonlinear activation functions on a deep network’s singular values,
regardless of its architecture.
A strong Tauberian theorem for characteristic functions
R. Ried and P. Gonçalves
Appl. Comput. Harmon. Anal.
Letter to the Editor, no 39, p 552-557, 2015
Using wavelet analysis we show
that if the characteristic function of a random variable X
can be approximated at 0 by some polynomial of even degree 2p
then the moment of order 2p of X exists. This strengthens a
Tauberian-type result by Ramachandran and implies that the
characteristic function is actually 2p times differentiable
at 0.
This fact also provides the theoretical basis for a
wavelet based non-parametric estimator of the tail index of a distribution.
Cascades Infiniment Divisibles
R. Riedi
Gazette des Mathématiciens (Soc. Math. France)
numéro spécial "Benoît Mandelbrot, Père de la géométrie fractale"
2013, no 136, p 119-128
L'objectif de cette contribution est de relier les
méthodes de travail particulières et le génie innovateur
de Benoît Mandelbrot.
Le contexte choisi est le développement récent
des cascades infiniment divisibles
qui a été fortement influencé par Mandelbrot
et qui a aidé à construire des ponts entre
les domaines de la théorie multifractale et les
processus stochastiques.
Brownian Motion
R. H. Riedi
in
Encyclopedia of Environmetrics, A.-H. El-Shaarawi and W. Piegorsch (eds), John Wiley & Sons Ltd: Chichester, UK.
Published online 1/15/2013 at the
home page of the Encyclopedia at Wiley Online Library
This article on Brownian motion starts with a short historical survey on its discovery
and importance and continues with its most relevant properties. Mathematically detailed
formulas are given and connections between the properties are indicated,
however no derivations or proofs are provided.
The properties covered start with the mathematical modeling of
Brownian motion via the Wiener process; equivalent definitions and
constructions such as the Lévy characterization, the midpoint displacement
and the re-scaled random walks are discussed.
Basic properties given include uniqueness, symmetry,
restarted Brownian motion, and time inversion. More advanced topics treated cover
stopping times, the strong Markov property, the reflection principle, extreme values,
escape times from a strip, and last visits. Discussed are also
path oscillations, stationarity, Gaussian white noise, and Karhunen-Löwe expansions.
Briefly mentioned are the Brownian motion with drift,
the Brownian bridge, as well as intersections, recurrence and transience of
Brownian motion in higher dimensions.
Multifractal Analysis of Infinite Products of Stationary Jump Processes
P. Mannersalo, I. Norros and R. Riedi
Journal of Probability and Statistics
vol. 2010, Article ID 807491, 26 pages, (2010).
version with a higher level of details available as
Technical Report TR2009-02,
Department of Statistics, Rice University, (June 2009)
There has been growing interest in constructing stationary measures
with known multifractal properties. In an earlier paper, the authors
introduced the multifractal products of stochastic processes
(MPSP) and provided basic properties concerning convergence,
non-degeneracy and scaling of moments. This paper considers a
subclass of MPSP, termed
multifractal jump process (MJP),
which is determined by jump processes with i.i.d. exponentially
distributed interjump times. In this paper, the information
dimension and a multifractal spectrum of the MJP are computed. As a
side result it is shown that the random partitions imprinted
naturally by the Poisson arrivals are sufficient to determine the
spectrum in this case.
Infinitely Divisible Shot-Noise: Modeling Fluctuations in
Networking, Finance and Turbulence
R. Riedi and D. Gershman,
ICNF 07,
Proc.\ International Conference on Noise and Fluctuations, Tokyo,
Japan, September 2007
Shot-Noise Cascades
R. Riedi,
Fractal Geometry &
Stochastics 4,
Greifswald, Germany, September 2008.
Infinitely Divisible Shot-Noise:
Cascades of Non-Compact Pulses
R. Riedi
Technical Report TR2010-02,
Department of Statistics, Rice University,
(March 2009)
Both conference papers provide an overview over recent advances in the study of
Cascades of Pulses, a generalization of
the infinitely divisible cascades. Special cases of these cascades of pulses
include the shot-noise cascades.
To the best of our knowledge, this is the first analysis of cascades with
pulses that are not compactly supported. Non-compact support allows for multiplying
natural pulses such as exponentials, Gaussians, or mother wavelets.
The pulses having potentially infinite
support precludes the use of certain standard techniques for establishing its
multifractal properties. As a first step towards overcoming this hurdle, the paper
establishes path regularity and scaling properties of the limiting
cascade. Conditions and
formulas take simple forms in the shot-noise cases.
The technical report provides technical details and proofs.
Measurement-Based
Analysis, Modeling, and Synthesis of the Internet Delay Space
Bo Zhang, T. S. Eugene Ng, Animesh Nandi, Rudolf Riedi, Peter Druschel,
Guohui Wang
Transactions on Networking
18(1), 229-242, 2010
presented at
Internet Measurement Conference IMC2006, Rio de Janeiro, Brazil, October
2006.
The term 'Internet delay space' refers to the all-pairs set of static
round-trip propagation delays among edge networks in the Internet.
Understanding its characteristics is important for the design of
global-scale distributed systems. For instance, self-organization
algorithms used in overlay networks are sensitive to the triangle
inequality violations and the growth properties within the Internet delay
space. Since designers of distributed systems often rely on simulation and
emulation to study design alternatives, having a realistic model of the
Internet delay space is important.
Our analysis shows that existing delay space models do not adequately
capture important properties of the Internet delay space. In this paper, we
analyze measured delays among thousands of Internet edge networks and
identify key properties that are important for distributed system design.
Furthermore, we derive a simple model of the Internet delay space based on
our analytical findings. This model preserves the relevant metrics far
better than existing models, allows for a compact representation, and can
be used to synthesize delay data for simulations and emulations at a scale
where direct measurement and storage is impractical.
A Cloud Data Center Optimization Approach Using Dynamic Data Interchanges
Rappos Efstratios, Robert Stephan , Riedi Rudolf
IEEE 2nd International Conference on Cloud Networking (CloudNet),
San Francisco, CA, November 2013, p 175-179.
Distributed data center architectures have been
recently developed for a more efficient and economical storage of
data. In many models of distributed storage, the aim is to store
the data in such a way so that the storage costs are minimized
and increased redundancy requirements are maintained. However,
many approaches do not fully consider issues relating to
delivering the data to the end user and the associated costs that
this creates. We present an integer programming optimization
model for determining the optimal allocation of data components
among a network of Cloud data servers in such a way that the
total costs of additional storage, estimated data retrieval costs and
network delay penalties is minimized. The method is suitable for
periodic dynamic reconfiguration of the Cloud data servers, so
that the when localized data request spikes occur the data can
be moved to a closer or cheaper data server for cost reduction
and increased efficiency.
Opportunities
and Limits of Remote Timing Attacks
S. Crosby, R. Riedi and D. Wallach
ACM Transactions on Information and Systems Security (TISSEC)
12(3), 485--507, (January 2009).
Many algorithms can take a variable amount of time to complete
depending on the data being processed. These timing differences can
sometimes disclose confidential information. Indeed, researchers have
been able to reconstruct an RSA private key purely by
querying an SSL web server and timing the results. Our work analyzes
the limits of attacks based on accurately measuring network
response times and jitter over a local network and across the Internet.
We present the design of filters based on statistical methodologies
to significantly reduce the effects
of jitter, allowing an attacker to measure events with
15-100 micro-sec accuracy across the Internet, and as good as 100ns over a
local network. Notably, security-related algorithms on web servers and
other network servers need to be carefully engineered to avoid timing channel
leaks at the accuracy demonstrated in this paper.
Safari:
A Self-Organizing, Hierarchical Architecture for Scalable Ad Hoc Networking.
S. Du, A. Khan, S. PalChaudhuri, A. Post, A. Saha, P. Druschel, D. Johnson and R. Riedi
Ad Hoc Networks
6(4), 485--507, (2008).
As wireless devices become more pervasive, mobile ad hoc networks are gaining importance, motivating the development of highly scalable ad hoc networking techniques. In this paper, we give an overview of the Safari architecture for highly scalable ad hoc network routing, and we present the design and evaluation of a specific realization of the Safari
architecture, which we call Masai. We focus in this work on the scalability of learning and maintaining the routing state necessary for a large ad hoc network. The Safari architecture provides scalable ad hoc network routing, the seamless integration of infrastructure networks when and where they are available, and the support of self-organizing, decentralized network applications.
Safari’s architecture is based on (1) a self-organizing network hierarchy that recursively groups participating nodes into an adaptive, locality-based hierarchy of cells; (2) a routing protocol that uses a hybrid of proactive and reactive routing information in the cells and scales to much larger numbers of nodes than previous ad hoc network routing protocols; and (3) a distributed hash table grounded in the network hierarchy, which supports decentralized network services on top of Safari.
We evaluate the Masai realization of the Safari architecture through analysis and simulations, under varying network sizes, fraction of mobile nodes, and offered traffic loads. Compared to both the DSR and the L+ routing protocols, our results show that the Masai realization of the Safari architecture is significantly more scalable, with much higher packet delivery ratio and lower overhead.
Keywords: Ad hoc network routing; Peer-to-peer networking; Landmark routing; Hierarchical routing; Reactive routing; Proactive routing; Hybrid routing; Safari; Masai
On the Impact of Realism of Mobility Models for Wireless Networks
H. Flores, R. Riedi, S. Eidenbenz and N. Hengartner
IEEE Globecom 2008,
Global Communications Conference,
New Orleans, LA, December 2008.
We present PedSims, a suite of mobility models
spanning the entire range from the classic Random Waypoint
(RWP), to correlated movement, to trace-level-quality social-
activity-based mobility. Instrumenting PedSims at various levels
of structural complexity we assess and quantify the impact
of real world mobility patterns, such as common interests or
synchronous behavior, on ad hoc network performance. Thus,
PedSims provides not only a flexible platform for mobility
simulation but also a means to strike a trade-off between a
tolerable error and computational complexity. Notably, as our
study of DSR, AODV and SAFARI finds, neglecting to
capture mobility patterns may result in inversion of the ranking
of routing protocols as well as in over- and under-estimation
of network performance. Further contributions include a bias
correction for the sampling of node-to-node connection time.
Bounds
on the Benefit of Network Coding:
Throughput and Energy Saving in Wireless Networks
A. Keshavarz-Haddad, R. Riedi
IEEE INFOCOM 2008,
27th Ann. Conf. Computer Communications,
Phoenix, Arizona, April 2008.
Bounds
on the Benefit of Network Coding for Wireless Multicast and Unicast
A. Keshavarz-Haddad and R. Riedi
IEEE Transactions on Mobile Computing
vol 13(1), 102-115, Jan 2014.
In this work we establish fundamental limitations to
the benefit of network coding in terms of energy and throughput
in multihop wireless networks. Thereby we adopt two well
accepted scenarios in the field: single multicast session and
multiple unicast sessions. Most of our results apply to arbitrary
wireless network and are, in particular, not asymptotic in kind.
In terms of throughput and energy saving we prove that the
gain of network coding of a single multicast session is at most a
constant factor. Also, we present a lower bound on the expected
number of transmissions of multiple unicast sessions under an
arbitrary network coding. We identify scenarios for which the
network coding gain for energy saving becomes surprisingly close
to 1, in some cases even exactly 1, corresponding to no benefit
at all. Interestingly, we prove that the gain of network coding
in terms of transport capacity is bounded by a constant factor pi
in any arbitrary wireless network and for all traditional channel
models. This shows that the traditional bounds on the transport
capacity do not change more than constant factor pi if
we employ network coding. As a corollary, we find that the gain
of network coding on the throughput of large scale homogeneous
wireless networks is asymptotically bounded by a constant. Note
that our result is more general than the previous work [5] and
it is obtained by a different technique. In conclusion, we show
that in contrast to wired networks, the network coding gain in
wireless networks is constraint by fundamental limitations.
Multicast Capacity
of Large Homogeneous Multihop Wireless Networks
A. Keshavarz-Haddad, R. Riedi
6th Intern. Symp. on Modeling and
Optimization in Wireless Networks,
WiOPT’08, Berlin, Germany, April 2008.
Most existing work on multicast capacity of large
homogeneous networks is based on a simple model for wireless
channel, namely the Protocol Model. In this paper,
we exploit a local capacity tool called arena which we introduced
recently in order to render multicast accessible to analysis also
under more realistic, and notably less pessimistic channel models.
Through the present study we find three regimes of the multicast
capacity for a homogeneous network depending on the ratio
of terminals among the nodes of the network. We note that
the upper bounds we establish under the more realistic channel
assumptions are only root of log(n) larger than the existing bounds.
Further, we propose a multicast routing and time scheduling
scheme to achieve the computed asymptotic bound over all
channel models except the simple Protocol Model. To this end,
we employ percolation theory among other analytical tools.
Finally, we compute the multicast capacity of large mobile
wireless networks. Comparing the result to the static case reveals
that mobility increases the multicast capacity. However, the
mobility gain decreases when increasing the number of terminals
in a fixed size mobile network.
Arena Function: A Framework for Computing Capacity Bounds in Wireless Networks
A. Keshavarz-Haddad and R. Riedi
IEEE Transactions on Wireless Communications
no 17, issue 2, p 1134-1146, 2018
Bounds for the Capacity of Wireless Multihop Networks
Imposed by Topology and Demand
A. Keshavarz-Haddad and R. Riedi
MobiHoc 07,
8th ACM International Symposium on Mobile Ad Hoc Networking and Computing, September 2007
Bounds on the capacity of wireless networks often
rely on simplifying assumptions and are given in terms of
coarse network parameters such as the number of nodes. While
useful due to their simplicity such bounds can significantly
overestimate the achievable capacity in real world situations,
ignoring actual network topology and traffic patterns.
The results
of this paper improve such analytical results on network capacity
in several ways. At the heart of our methodology lies the concept
of transmission arenas which indicate the presence of active
transmissions near any given location in the network. This novel
space-based approach is well suited to untangle the interactions
of simultaneous transmissions. Avoiding a graph-based model of
the network it opens new avenues of studying capacities.
For
homogeneous networks we recover classical bounds. However,
our methodology applies to arbitrary networks and can, thus,
inform placing and activating of nodes also in the presence of
clustering. Our method works with all classical channel models
and dimensions. It provides bounds on the transport capacity
which involve only high level knowledge of node locations, such
as the length of Euclidean Minimum Spanning Tree. As an
additional novelty we establish bounds on wireless unicast and
multicast capacities.
Broadcast
Flooding Revisited: Survivability and Latency
P. Mannersalo, A. Keshavarz-Haddad, and R. Riedi
IEEE INFOCOM 2007,
Anchorage, Alaska, USA, May 2007.
This paper addresses the dynamics of broadcast flooding in random
wireless ad hoc networks. In particular, we study the subset of
nodes covered by a flood as well as timing issues related to the
first (latency) and the last time (duration of back-chatter) at
which a broadcast is received by a fixed node. Notably, this
analysis takes into account the MAC-layer as well as background
traffic which both are often neglected in related studies.
Assuming a protocol model for the transmission channel which
accounts for carrier sensing and interference, we find bounds for
the probability of survival of the flood and for its coverage
probabilities. Moreover, under certain conditions on the parameters,
we establish asymptotical linear bounds on the latency as the
distance from the origin of the flood increases and show that the
duration of the back-chatter is stochastically bounded. The
analytical results are compared to simulation.
Towards obtaining our results using models from percolation theory
proved to be crucial.
Broadcast
Capacity in Multihop Wireless Networks
A. Keshavarz-Haddad, V. Ribeiro, R. Riedi,
MobiCom’06, Los Angeles, CA, September 2006
On
the Broadcast Capacity of Multihop Wireless Networks: Interplay of Power,
Density and Interference
A. Keshavarz-Haddad and R. Riedi,
SECON 2007, San Diego, CA, June 2007.
In these papers we study the broadcast capacity of multihop wireless networks
which we define as the maximum rate at which broadcast packets can be
generated in the network such that all nodes receive the packets
successfully in a limited time.
In the MobiCom paper, we employ the Protocol Model for successful
packet reception and provide novel upper and lower bounds for the broadcast
capacity for arbitrary connected networks. In a homogeneous dense network
these bounds simplify to O(W/max(1,D^d)) where W is the wireless channel
capacity, D the interference parameter, and d the number of dimensions of
space in which the network lies. Interestingly, we show that the broadcast
capacity does not change by more than a constant factor when we vary the
number of nodes, the radio range, the area of the network, and even the
node mobility. To address the achievability of capacity, we demonstrate
that any broadcast scheme based on a backbone of size proportional to the
Minimum Connected Dominating Set guarantees a throughput within a constant
factor of the broadcast capacity. Finally, we demonstrate that broadcast
capacity, in stark contrast to unicast capacity, does not depend on the
choice of source nodes or the dimension of the network.
In the SECON paper, we
employ the Physical Model and teh Generalized Physical
Model for the channel. Under the Physical Model, we
find that the broadcast capacity is within a constant factor of the
channel capacity for a wide class of network topologies. Under the
Generalized Physical Model, on the other hand, the network
configuration is divided into three regimes depending on how the
power is tuned in relation to network density and size and in which
the broadcast capacity is asymptotically either zero, constant or
unbounded. As we show, the broadcast capacity is limited by distant
nodes in the first regime and by interference in the second regime.
In the second regime, which covers a wide class of networks, the
broadcast capacity is within a constant factor of the bandwidth.
DRB and DCCB:
Efficient and Robust Dynamic Broadcast for Ad Hoc and Sensor Networks
A. Keshavarz-Haddad, V. Ribeiro and R. Riedi
SECON 2007, San Diego, CA, June 2007.
Deterministic, timer-based broadcast schemes not only
guarantee full reachability over an idealistic lossless
MAC layer, they also stand out for their robustness
against node failure as well as more general
changes in the network topology. This paper
proposes the first broadcast schemes in this class
which provably perform within a factor of the optimal
efficiency (in terms of number of rebroadcasts).
To the best of our knowledge no other deterministic
timer-based scheme possesses this property. NS-2 simulations employing
the 802.11b MAC protocol confirm our analysis. The factor can be
estimated to be quite small.
Novel to the proposed schemes is also their hybrid backbone
consisting of a given, static Dominating Set (DS) and a
dynamically computed set of connecting nodes. As an additional
contribution, this paper studies the trade-off of timer settings
(and thus latency) against the number of rebroadcasts, as well as
the robustness of the proposed algorithms. To the best of our
knowledge, no prior study of these issues in the context of broadcast exists.
Describing
MANETS: Principal Component Analysis of Sparse Mobility Traces
N. Hengartner, H. Flores, S. Eidenbenz and R. Riedi
Intern. Workshop on Performance Evaluation of Wireless Ubiquitous Networks,
Malaga, Spain, October 2006.
Data collected in realistic mobility traces for mobile ad hoc
networks(MANETS) is intrinsically high dimensional. Principal Component
Analysis (PCA) is a good tool for reducing the data dimension by extracting
important features of the data. We propose a method for computing principal
components using iterative regression for matrices with missing values with
an application to node degree time series. We expand this method to handle
an additional dimension of information for a defined neighborhood ancestry
of node degree, exposing patterns when they exist. We test our methodology
on node degree data from a simulated university campus model (Pedsims) and
real campus data. Results indicate that in both cases, the student's major
field of study along with class schedule are strong factors to
differentiate mobile node degree time series. The ability to detect
differences is a powerful tool for application specific network management,
allowing for: optimal placement of routers, design of specialized protocols
for various user populations and lending insight to gauging the
energy/bandwidth needs of mobile devices.
Color-Based
Broadcasting for Ad Hoc Networks
4th Intern. Symp. on Modeling and
Optimization in WirelessNetworks,
WiOPT’06, Boston, MA, April 2006.
A. Keshavarz-Haddad, V. Ribeiro, R. Riedi
This paper develops a novel color-based broadcast scheme for wireless ad
hoc networks where each forwarding of the broadcast message is assigned a
color from a given pool of colors. A node only forwards the message if it
can assign it a color from the pool which it has not already overheard
after a random time. In the closely related counter-based broadcast scheme
a node simply counts the number of broadcasts not the colors overheard. The
forwarding nodes form a so-called backbone, which is determined by the
random timers and, thus, is random itself. Notably, any counter-generated
backbone could result from pruning a color-generated backbone; the typical
color-generated backbone, however, exhibits a connectivity graph richer
than the counter-based ones. As a particular advantage, the colors reveal
simple geometric properties of the backbones which we exploit to prove that
the size of both, color- and counter-generated backbones are within a small
constant factor of the optimum. We also propose two techniques, boosting
and edge-growing, that improve the performance of color- and counter-based
broadcast in terms of reachability and number of rebroadcasts. Experiments
reveal that the powerful boosting method is considerably more effective
with the color-based schemes.
On
Non-Scale-Invariant Infinitely Divisible Cascades
IEEE Trans IT, 51 (3), pp 1063--1083, (March 2005)
[Short versions:
Warped
infinitely divisible cascades: beyond power laws
(Traitement du Signal 22(1),
2005)
Compound Poisson Cascades
(Proc. Colloque "Autosimilarite et Applications" Clermont-Ferrant, France,
May 2002;
appeared in Annales Mathematiques Blaise
Pascal.)
Scale invariant Infinitely
Divisible Cascades [PS]<>
(Proceedings of PSIP'2003 - 2nd Internat. Symp. on
Physics in Signal and Image Processing, Grenoble, France, January 2003)
On non scale invariant
Infinitely Divisible Cascades (in pdf)
(19th GRETSI Symposium on Signal
and Image Processing Paris, France, September 2003) ]
P. Chainais, R. Riedi and P. Abry,
Multiplicative processes and multifractals proved useful in various
applications ranging from hydrodynamic turbulence to computer network
traffic, to name but two. Placing multifractal analysis in the more general
framework of infinitely divisible laws, we design processes which possess
at the same time stationary increments as well as multifractal and more
general infinitely divisible scaling over a continuous range of scales. The
construction is based on a Poissonian geometry to allow for continuous
multiplication. As they possess compound Poissonian statistics we term the
resulting processes compound Poisson cascades. We explain how to
tune their correlation structure, as well as their scaling properties, and
hint at how to go beyond scaling in form of pure power laws towards more
general infinitely divisible scaling. Further, we point out that these
cascades represent but the most simple and most intuitive case out of an
entire array of infinitely divisible cascades allowing to construct general
infinitely divisible processes with interesting scaling properties.
Keywords: Process synthesis, Infinitely divisible cascade,
multifractal process, compound Poisson distribution, Brownian motion,
multifractal time, random walk.
Diverging moments and
parameter estimation
Paulo Goncalves and R. H. Riedi
J. American Statistical Association, 100
(472), 1382-1393, (December 2005). [PS]
Heavy tailed distributions enjoy increased popularity and become more
readily applicable as the arsenal of analytical and numerical
tools grows. They play key roles in modeling approaches in networking,
finance, hydrology to name but a few.
The tail parameter is of central importance as it governs both the
existence of moments of positive order and the thickness of the tails of
the distribution. Some of the best known tail estimators such as
Koutrouvelis and Hill are either parametric or show lack in robustness or
accuracy. This paper develops a shift and scale invariant, non-parametric
estimator for both, upper and lower bounds for orders with finite moments.
The estimator builds on the equivalence between tail behavior and the
regularity of the characteristic function at the origin and achieves its
goal by deriving a simplified wavelet analysis which is particularly suited
to characteristic functions.
Keywords: Diverging moments, heavy tail distributions,
characteristic functions, wavelet transform, multifractal analysis.
A strong Tauberian theorem for characteristic functions
R. Riedi and P.Gonçalves
Appl. Comput. Harmon. Anal.
Letter to the Editor, no 39, p 552-557, 2015
Using wavelet analysis we show
that if the characteristic function of a random variable X
can be approximated at 0 by some polynomial of even degree 2p
then the moment of order 2p of X exists. This strengthens a
Tauberian-type result by Ramachandran and implies that the
characteristic function is actually 2p times differentiable
at 0.
This fact also provides the theoretical basis for a
wavelet based non-parametric estimator of the tail index of a distribution.
Spatio-Temporal
Available Bandwidth Estimation with STAB
V. J. Ribeiro, R. H. Riedi and R. G. Baraniuk
IEEE Internet Computing Magazine, 8 (5), pp 34-41.
Spatio-Temporal
Available Bandwidth Estimation with STAB
V. J. Ribeiro, R. H. Riedi and R. G. Baraniuk
Proceedings
SIGMETRICS/Performance'04, New York, NY, June 2004.
STAB stands for Spatio-temporal Available bandwidth and is a new
edge-based probing tool to locate tight links on an end-to-end
network path in space and over time. Tight links are those links
with less available bandwidth than all links preceding them on the
path. STAB uses special chirp probing trains that allow accurate
estimation of available bandwidth using few packets. Tight link
localization helps network operations and troubleshooting,
provides insight into the causes of network congestion, as well as
aids network-aware applications. We review the current
state-of-the-art available bandwidth tools, describe the working
of STAB, and validate it through Internet experiments and
simulations.
Multiscale
Queuing Analysis
V. J. Ribeiro, R. H. Riedi and R. G. Baraniuk
IEEE Trans. on Networking, scheduled to appear October 2006.
This paper introduces a new multiscale framework for estimating the
tail probability of a queue fed by an arbitrary traffic process.
Using traffic statistics at a small number of time scales, our
analysis extends the theoretical concept of the critical time scale
and provides practical approximations for the tail queue probability.
These approximations are non-asymptotic; that is they apply to any
finite queue threshold. While our approach applies to any traffic
process, it is particularly apt for long-range-dependent (LRD)
traffic. For LRD fractional Brownian motion, we
prove that a sparse exponential spacing of time scales yields optimal
performance. Simulations with LRD traffic models and real Internet
traces demonstrate the accuracy of the approach. Finally, simulations
reveal that the marginals of traffic at multiple time scales have a
strong influence on queuing that is not captured well by its global
second-order correlation in non-Gaussian scenarios.
Fractals in Networking:
Modeling and Inference
R. Riedi, A. Keshavarz-Haddad, Shriram Sarvotham and Richard G. Baraniuk
Proceedings of Fractals 2004, Conference on ``Fractals and Complexity in
Nature'', Vancouver, Canada, April~2004
The use of fractal models in computer networking has a strong tradition.
Most prominently, fractional Brownian motion provides a parsimonious
abstraction of aggregate traffic at time scales of roughly a second and
beyond, which is found useful for design and management and which explains
the occurrence of detrimental traffic bursts in terms of user behavior. At
time scales small enough to be relevant for control, queueing and
multiplexing, on the other side, multifractal cascades appear to be more
accurate than self-similar models. cause two models and novel queueing
approaches In measured traffic traces, alpha connection can easily subsume
half of the bandwidth. Yet, they appear to offer workload at a constant but
high rate to the queue. Thus, the On-Off burst model represents the current
state of the internet more accurately than the self-similar burst model,
where alpha connections are modelled as depositing their entire load
instantaneously into the queue. Furthermore, since no increased loss is
evident during the presence of alpha connections we must conclude that the
condition $\rho
Network and User
Driven On-Off Source Model for Network Traffic
Shriram Sarvotham, Rudolf H. Riedi, and Richard G. Baraniuk
Special Issue of the Computer Network Journal on "Long-range
Dependent Traffic", Computer Networks, 48 (2005), p 335-350
We shed light on the effect of network resources on user behavior through a
physically motivated model for network traffic. The classical on-off model
has been successful in capturing the correlation of traffic, in particular
the long-range dependence, allowing to conclude that at timescales beyond
round trip time the transport protocol mechanisms do not matter. However,
the on-off model fails to capture small scale burstiness of the network
traffic of relevance to transfer protocols and congestion control. Through
observations at the connection-level we conclude that small rate sessions
can be characterized by independent duration and rate, while large rate
sessions show independent filesize and rate. In other words, patience is
the limiting factor of the small bandwidth user, while users with large
bandwidth freely choose their files. Incorporating this insight into the
on-off model we arrive at a more realistic and accurate model which
incorporates in a parsimonious way different user behavior as affected by
the network resources and provides a deeper understanding of burstiness and
long range dependence in network traffic.
Optimal sampling strategies for multiscale
stochastic processes
IMS Lecture Notes–Monograph Series, 2nd Lehmann Symposium –
Optimality 49, 266–290, (2006)
Sampling Strategies for Multiscale
Models with Application to Network Traffic Estimation
Proceedings Workshop on Statistical Signal Processing SSP03, St. Louis,
MO, Sept 2003
Optimal
Sampling Strategies for Tree-based Time Series
TR2004-05, Rice University, Dept. of Statistics, August 2004
to be submitted to Proceedings Lehmann Symposium on Optimality
Vinay J. Ribeiro, Rudolf H. Riedi and Richard G. Baraniuk
In this paper, we determine which non-random sampling of fixed size gives
the best linear predictor of the sum of a finite spatial population. We
employ different multiscale superpopulation models and use the minimum
mean-squared error as our optimality criterion. In multiscale
superpopulation tree models, the leaves represent the units of the
population, interior nodes represent partial sums of the population, and
the root node represents the total sum of the population.We prove that the
optimal sampling pattern varies dramatically with the correlation structure
of the tree nodes. While uniform sampling is optimal for trees with
“positive correlation progression”, it provides the worst
possible sampling with “negative correlation progression.” As
an analysis tool, we introduce and study a class of independent innovations
trees that are of interest in their own right. We derive a fast
water-filling algorithm to determine the optimal sampling of the leaves to
estimate the root of an independent innovations tree.
pathChirp: Efficient Available Bandwidth
Estimation for Network Paths
Vinay J. Ribeiro, Rudolf H. Riedi, Jiri Navratil, Les Cottrell, and Richard
G. Baraniuk
Proceedings Workshop on Passive and Active Measurement PAM2003
Best Student Paper Award
This paper presents pathChirp, a new active probing tool for
estimating the available bandwidth on a communication network path. Based
on the concept of ``self-induced congestion,'' pathChirp features an
exponential flight pattern of probes we call a chirp. Packet chips
offer several significant advantages over current probing schemes based on
packet pairs or packet trains. By rapidly increasing the probing rate
within each chirp, pathChirp obtains a rich set of information from which
to dynamically estimate the available bandwidth. Since it uses only packet
interarrival times for estimation, pathChirp does not require synchronous
nor highly stable clocks at the sender and receiver. We test pathChirp with
simulations and Internet experiments and find that it provides good
estimates of the available bandwidth while using only a fraction of the
number of probe bytes that current state-of-the-art techniques use.
Keywords: probing, networks, bandwidth, available bandwidth, chirp,
end-to-end, inference
The Multiscale Nature of Network
Traffic: Discovery, Analysis, and Modelling
Patrice Abry, Richard Baraniuk, Patrick Flandrin, Rudolf Riedi, Darryl Veitch
IEEE Signal Processing Magazine vol 19, no 3, pp 28--46 (May 2002).
The complexity and richness of telecommunications traffic is such that one
may despair to find any regularity or explanatory principles. Nonetheless,
the discovery of scaling behavior in tele-traffic has provided hope that
parsimonious models can be found. The statistics of scaling behavior
present many challenges, especially in non-stationary environments. In this
paper, we overview the state of the art in this area, focusing on the
capabilities of the wavelet transform as a key tool for unravelling the
mysteries of traffic statistics and dynamics.
Keywords: Computer network traffic, Tele-traffic, Wavelets, Scaling,
Self-similarity, Long-Range Dependence, Fractals, Multifractals, Cascade
Processes, Multiplicative Processes, Infinitely Divisible Cascades,
Fractional Brownian Motion.
Long-Range
Dependence: Now you see it now you don't! [PDF]
T. Karagiannis, M. Faloutsos and R. H. Riedi
Proceedings Global Internet, Taiwan, November 2002
Over the last few years, the network community has started to rely heavily
on the use of novel concepts such as self-similarity and Long-Range
Dependence (LRD).
Despite their wide use, there is still much confusion regarding the
identification of such phenomena in real network traffic data. In this
paper, we show that estimating Long Range Dependence is not
straightforward: there is no systematic or definitive methodology. There
exist several estimating methodologies, but they can give misleading and
conflicting estimates. More specifically, we arrive at several conclusions
that could provide guidelines for a systematic approach to LRD. First,
long-range dependence may exist even, if the estimators have different
estimates in value.
Second, long-range dependence is unlikely to exist, if there are several
estimators that do not ``converge'' statistically to a value. Third, we
show that periodicity can obscure the analysis of a signal giving partial
evidence of long range dependence. Fourth, the Whittle estimator is the
most accurate in finding the exact value when LRD exists, but it can be
fooled easily by periodicity.
As a case-study, we analyze real round-trip time data. We find and remove a
periodic component from the signal, before we can identify long-range
dependence in the remaining signal.
Keywords: Long Range Dependence, parameter estimation,
round-trip-time on the Internet.
Network Traffic Modeling using
Connection-Level Information
Xin Wang, Shriram Sarvotham, Rudolf H. Riedi, and Richard G. Baraniuk
Proceedings SPIE ITCom, Boston, MA, August 2002
Network Traffic Analysis and
Modeling at the Connection Level
S. Sarvotham, R. Riedi, and R. Baraniuk
Proceedings IEEE/ACM SIGCOMM Internet Measurement Workshop 2001, San
Francisco, CA.
Related conference papers:
Additive and Multiplicative
Mixture trees for Network Traffic Modeling
S. Sarvotham, X. Wang, R. Riedi, and R. Baraniuk
Proceedings ICASSP Orlando, FL, (May 2002).
Connection-Level Modeling of
Network Traffic
X. Wang, S. Sarvotham, R. Riedi, and R. Baraniuk
Proceedings DIMACS Workshop on Internet and WWW Measurement, Mapping and
Modeling, 2002, Rutgers, NJ, (February 2002).
Technical Report:
Technical Report, ECE Dept.,
Rice University, June 30, 2001
Network traffic analysis and modeling needs to reflect long-range-dependent
(LRD) correlations and non-Gaussian marginal distributions, properties
which affect performance, control, design and traffic management. The
on/off model owes its success to its physical relevance, relating LRD to
customer behavior, but it falls short in explaining convincingly the bursty
behavior present from small time scale up to TCP round trip time and
larger.
Importantly, in a model based on superposition of equal on/off sources,
traffic bursts arise from many connections being active simultaneously. To
the contrary, the careful analysis on the connection-level in this paper
reveals that traffic bursts typically arise from a few high-volume
connections (alpha traffic) that dominate all others. More
importantly, we find that alpha traffic is caused uniquely by large files
transmissions over high bandwidth links. Finally, we report a strong
correlation between short round-trip times and large bandwidth, suggesting
that actual geographical location and topology of a network has a strong
influence on the burstiness of data traffic.
These observations lead naturally to our alpha-beta traffic model, where
heavy-tailed connection durations give rise to LRD and, in conjunction with
the heterogeneity of the network resources produce burstiness. Towards a
better understanding of this phenomenon, we propose a novel tree-based
model which is flexible enough to accommodate Gaussian as well as bursty
behavior on different scales in a parsimonious way. Implication to network
traffic simulation as well as performance related issues are discussed. We
present a fast scheme to separate the alpha and beta components of traffic
using wavelet denoising.
Multifractal products of stochastic processes:
construction and some basic properties
[Preliminary results (COST
257 (1999) p31)]
P. Mannersalo, I. Norros and R. Riedi
Advances in Applied Probability, 34 (4), Dec 2002, pp 888-903.
In various fields, such as teletraffic and economics, measured times series
have been reported to adhere to multifractal scaling. Classical cascading
measures possess multifractal scaling, but their increments form a
non-stationary process. To overcome this problem we introduce a
construction of random multifractal measures based on iterative
multiplication of stationary stochastic processes, a special form of
T-martingales. We study L2-convergence, non-degeneracy and continuity of
the limit process. Establishing a power law for its moments we obtain a
formula for the multifractal spectrum and hint at how to prove the full
formalism.
Zooming Statistics: Inference across scales
J. Hannig, J. S. Marron and R. H. Riedi
J. Korean Statistical Society 30(2) (June 2001), pp.
327--345
New statistical methods are needed to analyze data in a multi-scale way.
Some multi-scale extensions of standard methods, including novel
visualization using dynamic graphics are proposed. These tools are used to
explore non-standard structure in internet traffic data.
Wavelets and Multifractals for network traffic
modeling and inference
V. J. Ribeiro, R. H. Riedi and R. G. Baraniuk
Proceedings ICASSP Salt Lake City, Utah, (May 2001).
This paper reviews the multifractal wavelet model (MWM) and its
applications to network traffic modeling and inference. The discovery of
the fractal nature of traffic has made new models and analysis tools for
traffic essential, since classical Poisson and Markov models do not capture
important fractal properties like multiscale variability and burstiness
that deleteriously affect performance.
Set in the framework of multiplicative cascades, the MWM provides a link to
multifractal analysis, a natural tool to characterize burstiness. The
simple structure of the MWM enables fast O(N) synthesis of traffic for
simulations and a tractable queuing analysis, thus rendering it suitable
for real networking applications including end-to-end path modeling.
Analyzing Robot Behavior in E-Business Sites
Virgílio Almeida, Daniel Menascé, Rudolf Riedi, Flávia Ribeiro, Rodrigo
Fonseca, Wagner Meira Jr.,
IEEE Internet Computing submitted February 2001
Conference paper:
Analyzing Robot Behavior and
their Impact on Caching Virgílio Almeida, Daniel Menascé, Rudolf Riedi,
Flávia Ribeiro, Rodrigo Fonseca, Wagner Meira Jr.,
Workshop on Web Caching and Content Delivery June 2001
Analyzing Robot Behavior in
E-Business Sites
Daniel Menascé, Virgílio Almeida, Rudolf Riedi, Flávia Ribeiro, Rodrigo
Fonseca, Wagner Meira Jr.,
Proc. ACM SigMetrics'01 (June 2001)
Understanding the nature and characteristics of e-business workloads is an
essential step to improve quality of service. Using a multi-layer
hierarchical workload model, this paper presents a characterization of the
workload generated by autonomous agents and robots. The characterization of
the robot workload focuses on the statistical properties of the arrival
process and on the robot behavior graph model. A multi-scale time analysis
is used to understand the impact of robot workloads on the performance of
e-business sites. Based on the workload characterization, we develop an
analytical model for the interaction between robots and e-business sites.
Using the model, we show the resource utilization associated with robots'
requests. Multiple time-scale analysis point out that server overload may
occur at fine time scales, even though average utilizations at larger
time-scales do not show evidences of overloading. These results indicate
not only the need for a better understanding of the behavior of robots, but
also help to quantify the impact of robots on the quality of service
provided by a site.
A Hierarchical and Multiscale
Analysis of E-Business Workloads
Daniel Menascé, Virgílio Almeida, Rudolf Riedi, Flávia Ribeiro, Rodrigo
Fonseca, Wagner Meira Jr.,
Performance Evaluation 54(1), Sept 2003, pp 33--57.
Understanding the nature and characteristics of E-business workloads is a
crucial step to improve the quality of service offered to customers in
electronic business environments. Using a multi-layer hierarchical model,
this paper presents a detailed multiscale characterization of the workload
of two actual E-business sites: an online bookstore and an electronic
auction site. Our analysis of the workloads showed that the session length,
measured in number of requests to execute E-business functions, is
heavy-tailed, especially for sites subject to requests generated by robots.
An overwhelming majority of the sessions consists of only a handful
requests. This seems to suggest that most customers are human (as opposed
to robots). A significant fraction of the functions requested by customers
were found to be product selection functions as opposed to product
ordering. An analysis of the popularity of search terms revealed that it
follows a Zipf distribution. However, Zipf's law as applied to E-business
is time scale dependent due to the shift in popularity of search terms. We
also found that requests to execute frequent E-business functions exhibit a
similar pattern of behavior as observed for the total number of HTTP
requests. Finally, our analysis demonstrated that there is a strong
correlation in the arrival process at the HTTP request level. These
correlations are particularly stronger at intermediate time scales of a few
minutes.
Keywords: E-business, WWW, workload characterization, performance
modeling, heavy-tailed distribution
On the multiplicative
structure of network traffic
Rudolf H. Riedi
Proceedings IMA Conference on Mathematics in Signal Processing, Warwick,
(December 2000)
This is an overview of multiplicative analysis and modelling of network
traffic, emphasizing the fact that multifractal structure is particularly
present at small scales, and with an outlook to novel stationary
multiplicative processes.
Multifractal Cross-Traffic Estimation
V. Ribeiro, M. Coates, R. Riedi, S. Sarvotham, B. Hendricks and R. Baraniuk,
Proceedings ITC Specialist Seminar on IP Traffic Measurement, Modeling and
Management, September 2000, Monterey, CA
In this paper we develop a novel model-based technique, the Delphi
algorithm, for inferring the instantaneous volume of competing
cross-traffic across an end-to-end path. By using only end-to-end
measurements, Delphi avoids the need for data collection within the
Internet. Unique to the algorithm is an efficient exponentially spaced
probing packet train and a parsimonious multifractal parametric model for
the cross-traffic that captures its multiscale statistical properties
(including long-range dependence) and queuing behavior. The algorithm is
adaptive; it requires no a priori traffic statistics and effectively tracks
changes in network conditions. ns (network simulator) experiments reveal
that Delphi gives accurate cross-traffic estimates for higher link
utilization levels while at lower utilizations it over-estimates the
cross-traffic. Also, when Delphi's single bottleneck assumption does not
hold it over-estimates the cross-traffic.
In Search of Invariants for E-Business
Workloads
Daniel Menascé, Flávia Ribeiro, Virgílio Almeida, Rodrigo Fonseca, Rudolf
Riedi, Wagner Meira Jr.,
Proceedings EC'00, Inst. Math. Appl., October 2000, Minneapolis,
MN.
Understanding the nature and characteristics of e-business workloads is a
crucial step to improve the quality of service offered to customers in
electronic business environments. However, the variety and complexity of
the interactions between customers and sites make the characterization of
e-business workloads a challenging problem. Using a multi-layer
hierarchical model, this paper presents a detailed characterization of the
workload of two actual e-business sites: an online bookstore and an
electronic auction site. Through the characterization process, we found the
presence of autonomous agents, or robots, in the workload and used the
hierarchical structure to determine their characteristics. We also found
that search terms follow a Zipf distribution.
Keywords: e-commerce, WWW, workload characterization, performance
modeling, heavy-tailed distribution
Multiplicative Multiscale Image
Decompositions: Analysis and Modeling
Justin K. Romberg, Rolf H. Riedi, Hyeokho Choi, and Richard G. Baraniuk
Proceedings of the SPIE's 45th Annual Meeting, Internat. Symp. on Optical
Science and Technology, August 2000, San Diego, CA.
Multiscale processing, in particular using the wavelet transform, has
emerged as an incredibly effective paradigm for signal processing and
analysis. In this paper, we discuss a close relative of the Haar wavelet
transform, the multiscale multiplicative decomposition. While the
Haar transform captures the differences between signal approximations at
different scales, the multiplicative decomposition captures their ratio.
The multiplicative decomposition has many of the properties that have made
wavelets so successful. Most notably, the multipliers are a sparse
representation for smooth signals, they have a dependency structure similar
to wavelet coefficients, and they can be calculated efficiently.
The multiplicative decomposition is also a more natural signal
representation than the wavelet transform for some problems. For example,
it is extremely easy to incorporate positivity constraints into multiplier
domain processing. In addition, there is a close relationship between the
multiplicative decomposition and the Poisson process; a fact that has been
exploited in the field of photon-limited imaging. In this paper, we will
show that the multiplicative decomposition is also closely tied with the
Kullback-Leibler distance between two signals. This allows us to derive an
n-term KL approximation scheme using the multiplicative decomposition.
Multiscale Image Segmentation
using Joint Texture and Shape analysis
Ramesh Neelamani, Justin Romberg, Hyeokho Choi, Rudolf Riedi, and Richard
Baraniuk
Proceedings of the SPIE's 45th Annual Meeting, Internat. Symp. on Optical
Science and Technology, August 2000, San Diego, CA.
We develop a general framework to simultaneously exploit texture and shape
characterization in multiscale image segmentation. By posing multiscale
segmentation as a model selection problem, we invoke the powerful framework
offered by minimum description length (MDL). This framework dictates that
multiscale segmentation comprises multiscale texture characterization and
multiscale shape coding. Analysis of current multiscale maximum a
posteriori (MAP) segmentation algorithms reveals that these algorithms
implicitly use a shape coder with the aim to estimate the optimal MDL
solution, but find only an approximate solution.
Towards achieving better segmentation estimates, we first propose a shape
coding algorithm based on zero-trees which is well-suited to represent
images with large homogeneous regions. For this coder, we design an
efficient tree-based algorithm using dynamic programming that attains the
optimal MDL segmentation estimate. To incorporate arbitrary shape coding
techniques into segmentation, we design an iterative algorithm that uses
dynamic programming for each iteration. Though the iterative algorithm is
not guaranteed to attain exactly optimal estimates, it more effectively
captures the prior set by the shape coder. Experiments demonstrate that the
proposed algorithms yield excellent segmentation results on both synthetic
and real world data examples.
Multifractal Processes
R. H. Riedi,
in Long range dependence : theory and applications,
eds. Doukhan, Oppenheim and Taqqu, (Birkh\"auser 2002) ISBN: 0817641688, pp
625-715.
12 page summary
Multifractal theory up to date has been concerned mostly with random and
deterministic singular measures, with the notable exception of fractional
Brownian motion and Lévy motion. Real world problems involved with the
estimation of the singularity structure of both, measures and processes,
has revealed the need to broaden the known `multifractal formalism' to
include more sophisticated tools such as wavelets. Moreover, the pool of
models available at present shows a gap between `classical' multifractal
measures, i.e.\ cascades in all variations with rich scaling properties,
and stochastic processes with nice statistical properties such as
stationarity of increments, Gaussian marginals, and long-range dependence
but with degenerate scaling characteristics.
This paper has two objectives, then. For one it develops the multifractal
formalism in a context suitable for functions and processes. Second, it
introduces truly multifractal processes, building a bridge between
multifractal cascades and self-similar processes.
Keywords: Multifractal analysis, self-similar processes, fractional
Brownian motion, Lévy flights, stable motion, wavelets, long-range
dependence, multifractal subordinator.
Long-Range
Dependence and Data Network Traffic
W. Willinger, V. Paxson, R. H. Riedi and M. S. Taqqu,
in Long range dependence : theory and applications,
eds. Doukhan, Oppenheim and Taqqu (Birkh\"auser 2002) ISBN: 0817641688.
This is an overview of a relatively recent application of long-range
dependence (LRD) to the area of communication networks, in particular
to problems concerned with the dynamic nature of packet flows in high-speed
data networks such as the Internet. We demonstrate that this new
application area offers unique opportunities for significantly advancing
our understanding of LRD and related phenomena. These advances are made
possible by moving beyond the conventional approaches associated with the
wide-spread ``black-box'' perspective of traditional time series analysis
and exploiting instead the physical mechanisms that exist in the networking
context and that are intimately tied to the observed characteristics of
measured network traffic. In order to describe this complexity we provide a
basic understanding of the design, architecture and operations of data
networks, including a description of the TCP/IP protocols used in today's
Internet. LRD is observed in the large scale behavior of the data traffic
and we provide a physical explanation for its presence. LRD tends to be
caused by user and application characteristics and has little to do with
the network itself. The network affects mostly small time scales, and this
is why a rudimentary understanding of the main protocols is important. We
illustrate why multifractals may be relevant for describing some aspects of
the highly irregular traffic behavior over small time scales. We
distinguish between a time-domain and wavelet-domain approach to analyzing
the small time scale dynamics and discuss why the time-domain appears to be
better suited for studying the performance (e.g., a queueing analysis)
while the wavelet-domain approach appears to be better suited for
identifying particular features in measured traffic with a dominant
influence on particular time scales (e.g., relatively regular traffic
patterns over certain time scales which have a direct networking
interpretation in terms of ``round trip'' time behavior).
Keywords: Long-range dependence, network traffic, self-similar
processes, fractional Brownian motion, multifractal analysis,
cascades.
Network
Traffic Modeling Using a Multifractal Wavelet Model
R. H. Riedi, V. J. Ribeiro, M. S. Crouse and R. G. Baraniuk
Proceedings European Congress of Mathematics, Barcelona 2000.
In this paper, we develop a simple and powerful multiscale model for
synthesizing nonGaussian, long-range dependent (LRD) network traffic.
Although wavelets effectively decorrelate LRD data, wavelet-based models
have generally been restricted by a Gaussianity assumption that can be
unrealistic for traffic. Using a multiplicative superstructure on top of
the Haar wavelet transform, we exploit the decorrelating properties of
wavelets while simultaneously capturing the positivity and ``spikiness'' of
nonGaussian traffic. This leads to a swift O(N) algorithm for fitting and
synthesizing N-point data sets. The resulting model belongs to the class of
multifractal cascades, a set of processes with rich statistical properties.
We elucidate our model's ability to capture the covariance structure of
real data and then fit it to real traffic traces. Queueing experiments
demonstrate the accuracy of the model for matching real data.
Multiscale Queuing Analysis of
Long-Range-Dependent Network Traffic
V. J. Ribeiro, R. H. Riedi, M. S. Crouse and R. G. Baraniuk
IEEE Trans. on Networking, submitted
see also: Technical Report 99-08, ECE Dept., Rice University;
as well as: Proceedings of IEEE INFOCOM 2000, Tel
Aviv, Israel, March 2000.
This paper develops a novel approach to queuing analysis tailor-made for
multiscale long-range-dependent (LRD) traffic models. We review two such
traffic models, the wavelet-domain independent Gaussian model (WIG) and the
multifractal wavelet model (MWM). The WIG model is a recent generalization
of the ubiquitous fractional Brownian motion process. Both models are based
on a multiscale binary tree structure that captures the correlation
structure of traffic and hence its LRD. Due to its additive nature, the WIG
is inherently Gaussian, while the multiplicative MWM is non-Gaussian. The
MWM is set within the framework of multifractals, which provide natural
tools to measure the multiscale statistical properties of traffic loads, in
particular their burstiness.
Our queuing analysis leverages the tree structure of the models and
provides a simple closed-form approximation to the tail queue probability
for any given queue size. This makes the WIG and MWM suitable for numerous
practical applications, including congestion control, admission control,
and cross-traffic estimation. The queuing analysis reveals that the
marginal distribution and, in particular, the large values of traffic at
different time scales strongly affect queuing. This implies that merely
modeling the traffic variance at multiple time scales, or equivalently, the
second-order correlation structure, can be insufficient for capturing the
queuing behavior of real traffic. We confirm these analytical findings by
comparing the queuing behavior of WIG and MWM traffic through
simulations.
Toward an Improved
Understanding of Network Traffic Dynamics [PDF]
R. H. Riedi and W. Willinger
in: Self-similar Network Traffic and Performance Evaluation
eds. Park and Willinger, (Wiley 2000), chapter 20 , pp 507-530.
Since the discovery of long range dependence in Ethernet LAN traces there
has been significant progress in developing appropriate mathematical and
statistical techniques that provide a physical-based, networking-related
understanding of the observed fractal-like or self-similar scaling behavior
of measured data traffic over time scales ranging from hundreds of
milliseconds to seconds and beyond. These developments have helped
immensely in demystifying fractal-based traffic modeling and have given
rise to new insights and physical understanding of the effects of
large-time scaling properties in measured network traffic on the design,
management and performance of high-speed networks.
However, to provide a complete description of data network traffic, the
same kind of understanding is necessary with respect to the dynamic nature
of traffic over small time scales, from a few hundreds of milliseconds
downwards. Because of the predominant protocols and end-to-end congestion
control mechanisms that determine the flow of packets, studying the
fine-time scale behavior or local characteristics of data traffic is
intimately related to understanding the complex interactions that exist in
data networks.
In this chapter, we first summarize the results that provide a unifying and
consistent picture of the large-time scaling behavior of data traffic. We
then report on recent progress in studying the small-time scaling behavior
in data network traffic and outline a number of challenging open problems
that stand in the way of providing an understanding of the local traffic
characteristics that is as plausible, intuitive, appealing and relevant as
the one that has been found for the global or large-time scaling properties
of data traffic.
Attracteurs, orbites et
ergodicité
C. Tricot and R. Riedi Ann. Math. Blaise Pascal 6 (1999), 55-72.
L'attracteur d'un systeme de fonctions itéré est le support d'une mesure
dite invariante, ou auto-similaire: c'est le point fixe de
l'opérateur de Markov dans l'espace métrique des mesures de probabilité a
support compact. Un algorithme aléatoire permet la contruction de
l'attracteur, qui est presque surement l'ensemble des points limites d'une
orbite. Un théoreme ergodique permet de prouver que la fréquence de visite
d'un ensemble par cette orbite est presque surement égale a la mesure
invariante de cet ensemble. Nous donnons ici une démonstration simple de ce
résultat connu.
Abstract in English:
The attractor of an iterated function system (IFS) is the support of a
so-called invariant or self-similar measure: this measure is,
more precisely, the unique fixed point of the Markov (or Hutchinson)
operator which acts in the metric space of all compactly supported
probability measures and is associated with the IFS. A simple iterative
algorithm allows to produce a random orbit the limit points of which are
almost surely the attractor of the IFS. This fact is an essential
ingredient in fractal image compression. Moreover, an ergodic theorem
states that the frequency of visits of a set by such a random orbit is
almost surely equal to the probability assigned to that set by the
invariant measure. Thus, the random orbit actually samples the self-similar
measure almost surely. We give here a simple proof of this known
result.
Wavelet
Analysis of Fractional Brownian Motion in Multifractal Time
P. Gonçalvès and R. H. Riedi
Proceedings of the 17th Colloquium GRETSI, Vannes, France, Sept
1999.
We study fractional Brownian motions in multifractal time, a model
for multifractal processes proposed recently in the context of economics.
Our interest focuses on the statistical properties of the wavelet
decomposition of these processes, such as residual correlations (LRD) and
stationarity, which are instrumental towards computing the statistics of
wavelet-based estimators of the multifractal spectrum.
Simulation of
Non-Gaussian Long-Range-Dependent Traffic using Wavelets
V. J. Ribeiro, R. H. Riedi, M. S. Crouse and R. G. Baraniuk
Proc. ACM SigMetrics'99 (May 1999), 1-12
In this paper, we develop a simple and powerful multiscale model for the
synthesis of nonGaussian, long-range dependent (LRD) network traffic.
Although wavelets effectively decorrelate LRD data, wavelet-based models
have generally been restricted by a Gaussianity assumption that can be
unrealistic for traffic. Using a multiplicative superstructure on top of
the Haar wavelet transform, we exploit the decorrelating properties of
wavelets while simultaneously capturing the positivity and ``spikiness'' of
nonGaussian traffic. This leads to a swift O(N) algorithm for fitting and
synthesizing N-point data sets. The resulting model belongs to the class of
multifractal cascades, a set of processes with rich statistical properties.
We elucidate our model's ability to capture the covariance structure of
real data and then fit it to real traffic traces. Queueing experiments
demonstrate the accuracy of the model for matching real data. Our results
indicate that the nonGaussian nature of traffic has a significant effect on
queuing.
A Multifractal Wavelet Model with Application to
Network Traffic
R. H. Riedi, M. S. Crouse, V. J. Ribeiro, and R. G. Baraniuk
IEEE Special Issue on Information Theory, 45. (April 1999),
992-1018.
In this paper, we develop a new multiscale modeling framework for
characterizing positive-valued data with long-range-dependent correlations
(1/f noise). Using the Haar wavelet transform and a special multiplicative
structure on the wavelet and scaling coefficients to ensure positive
results, the model provides a rapid O(N) cascade algorithm for synthesizing
N-point data sets. We study both the second-order and multifractal
properties of the model, the latter after a tutorial overview of
multifractal analysis. We derive a scheme for matching the model to real
data observations and, to demonstrate its effectiveness, apply the model to
network traffic synthesis. The flexibility and accuracy of the model and
fitting procedure result in a close fit to the real data statistics
(variance-time plots and moment scaling) and queuing behavior. Although for
illustrative purposes we focus on applications in network traffic modeling,
the multifractal wavelet model could be useful in a number of other areas
involving positive data, including image processing, finance, and
geophysics.
Keywords: Multifractals, long-range dependence, positive 1/f noise,
wavelets, network traffic
Simple Statistical
Analysis of Wavelet-based Multifractal Spectrum Estimation,
P. Gonçalvès, R. H. Riedi and R. G. Baraniuk
Proceedings of the 32nd Conference on `Signals, Systems and Computers',
Asilomar, Nov 1998
The multifractal spectrum characterizes the scaling and singularity
structures of signals and proves useful in numerous applications, from
network traffic analysis to turbulence. Of great concern is the estimation
of the spectrum from a finite data record. In this paper, we derive
asymptotic expressions for the bias and variance of a wavelet-based
estimator for a fractional Brownian motion (fBm) process. Numerous
numerical simulations demonstrate the accuracy and utility of our
results.
Multifractal Properties of TCP
Traffic: a Numerical Study,
R. H. Riedi and J. Lévy Véhel.
INRIA research report 3129, March 1997.
The apparent `fractal' properties of TCP data traffic have recently
attracted considerable interest. Most prominently, fractional Brownian
motion (FBM) has been used to model the long range dependence of traffic
traces through self-similarity. Traffic being by nature a process of
positive increments, though, a multifractal approach appears more natural.
In this study, various traces of TCP traffic have been analyzed from both
points of view. Though evidence for statistical self-similarity is present
in certain `aspects' of the traffic, the multifractal scaling
behavior is much more convincing. Furthermore, crucial LAN specific
characteristics of data traffic are revealed by the multifractal analysis
(MA) only. TCP traffic at Berkeley and at CNET, e.g., looks entirely
different from a multifractal point of view while showing about the same
self-similarity parameter H. As a further example, MA suggests that Lévy
stable motion is in certain situations a more accurate model than FBM. In
conclusion, to consider traffic traces as multifractal random measures
rather than as (monofractal) self-similar processes is not only more
natural but has also various numerical advantages. A novel approach to
queueing supports this conclusion.
Fractional Brownian motion and
data traffic modeling: The other end of the spectrum,
J. Lévy Véhel and R. H. Riedi
in: Fractals in Engineering 97
, Springer 1997.
We analyze the fractal behavior of the high frequency part of the Fourier
spectrum of fBm using multifractal analysis and show that it is not
consistent with what is measured on real traffic traces. We propose two
extensions of fBm which come closer to actual traffic traces multifractal
properties.
Keywords: fractional Brownian motion, Multifractal analysis, TCP
Exceptions to the Multifractal
Formalism for Discontinuous Measures,
R. H. Riedi and B. B Mandelbrot,
Math. Proc. Cambr. Phil. Soc.123 (1998), 133--157.
In an earlier paper (Adv. Appl. Math. 18 (1997), 50--58) the
authors introduced the inverse measure m' of a given measure m on [0,1] and
presented the inversion formula f(a) = a f(1/a) which was argued to
link the respective multifractal spectra of m and m'. A second paper
(Adv. Appl. Math. 19 (1997), 332--354) established the
formula under the assumption that m and m' are continuous measures.
Here, we investigate the general case which reveals telling details of
interest to the full understanding of multifractals. Subjecting
self-similar measures to the operation m->m' creates a new class of
discontinuous multifractals. Calculating explicitly we find that the
inversion formula holds only for the `fine' multifractal spectra and not
for the `coarse' ones. As a consequence, the multifractal formalism fails
for this class of measures. A natural explanation is found when drawing
parallels to equilibrium measures of dynamical systems. In the context of
our work it becomes natural to consider the degenerate Hölder exponents 0
and infinity.
Inversion formula for
Continuous Multifractals,
R. H. Riedi and B. B Mandelbrot,
Adv. Appl. Math.19 (1997), 332--354.
In a previous paper (Adv. Appl. Math. 18 (1997), 50--58) the
authors introduced the inverse measure m' of a probability measure m on
[0,1]. It was argued that the respective multifractal spectra are linked by
the inversion formula f'(a) = a f(1/a). Here, the statements of Part
I are put in more mathematical terms and proofs are given for the inversion
formula in the case of continuous measures. Thereby, f may stand for the
Hausdorff spectrum, the pacing spectrum, or the coarse grained spectrum.
With a closer look at the special case of self-similar measures we offer a
motivation of the inversion formula as well as a discussion of possible
generalizations. Doing so we find a natural extension of the scope of the
notion `self-similar' and a failure of the usual multifractal
formalism.
Inverse Measures, the Inversion
formula and Discontinuous Multifractals,
B. B. Mandelbrot and R. H. Riedi,
Adv. Appl. Math. 18 (1997), 50--58.
The present paper is part I of a series of three closely related papers in
which the inverse measure m' of a given measure m on [0,1] is introduced.
In the first case discussed in detail, both these measures are multifractal
in the usual sense, that is, both are linearly self-similar and continuous
but not differentiable and both are non--zero for every interval of [0,1].
Under these assumptions the Hölder multifractal spectra of the two measures
are shown to be linked by the inversion formula f'(a) = a f(1/a) .
The inversion formula is then subjected to several diverse variations,
which reveal telling details of interest to the full understanding of
multifractals. The inverse of the uniform measure on a Cantor dust leads us
to argue that this inversion formula applies to the Hausdorff spectrum even
if the measures m and m' are not continuous while it may fail for the
spectrum obtained by the Legendre path. This phenomenon goes along with a
loss of concavity in the spectrum. Moreover, with the examples discussed it
becomes natural to include the degenerate Hölder exponents 0 and infty in
the Hölder spectra.
This present paper is the first of three closely related papers on inverse
measures, introducing the new notion in a language adopted for the
physicist. Parts II and III make rigorous what is argued with intuitive
arguments here. Part II extends the common scope of the notion of
self-similar measures. With this broader class of invariant measures part
III shows that the multifractal formalism may fail.
Multifractals and Wavelets: A
potential tool in Geophysics
R. H. Riedi,
SEG, New Orleans 1998
The study of fractal quantities and structures exhibiting highly erratic
features on all scales has proved to be of outstanding significance in
various disciplines. While scaling phenomena are pervasive in natural and
man-made constructs, such objects are less fractal than multifractal. In
most simple terms this means that moments of different orders scale
differently with increasing resolution.
This paper should be understood as a tutorial in multifractals and their
analysis via wavelets, in view of possible applications in geophysics. It
is elaborated how a description of the well log measurement through
wavelets provides a new way of modeling reflection of waves in a material
which is dependent on frequency. The wavelet analysis has the potential to
provide an explanation for the inconsistencies that are observed when
comparing subsurface models that have been constructed from measurements
with different resolutions, such as surface seismic, vertical seismic
profiles and well logs.
Conditional and Relative
Multifractal Spectra.
R. H. Riedi and I. Scheuring,
Fractals. An Interdisciplinary Journal.5 (1997), 153--168.
In the study of the involved geometry of singular distributions the use of
fractal and multifractal analysis has shown results of outstanding
significance. So far, the investigation has focused on structures produced
by one single mechanism which were analyzed with respect to the ordinary
metric or volume. Most prominent examples include self-similar measures and
attractors of dynamical systems. In certain cases, the multifractal
spectrum is known explicitly, providing a characterization in terms of the
geometrical properties of the singularities of a distribution.
Unfortunately, strikingly different measures may possess identical spectra.
To overcome this drawback we propose two novel methods, the conditional and
the relative multifractal spectrum, which allow for a direct comparison of
two distributions. These notions measure the extent to which the
singularities of two distributions `correlate'. Being based on multifractal
concepts, however, they go beyond calculating correlations. As a
particularly useful tool we develop the multifractal formalism and
establish some basic properties of the new notions. With the simple example
of Binomial multifractals we demonstrate how in the novel approach a
distribution mimics a metric different from the usual one. Finally, the
provided applications to real data show how to interpret the spectra in
terms of mutual influence of dense and sparse parts of the
distributions.
Numerical Estimates of
Generalized Dimensions D_q for Negative q
R. Pastor-Satorras and R. H. Riedi,
J. Phys. A29 (1996) L391-L398.
Usual fixed-size box-counting algorithms are inefficient for computing
generalized fractal dimensions in the range of q<0. In this Letter we
describe a new numerical algorithm specifically devised to estimate
generalized dimensions for large negative q, providing evidence of its
better performance. We compute the complete spectrum of the Hénon
attractor, and interpret our results in terms of a ``phase transition''
between different multiplicative laws.
Multifractal Formalism
for Infinite Multinomial Measures
R. H. Riedi and B. B Mandelbrot, Adv. Appl. Math. 16 (1995)
132--150.
There are strong reasons to believe that the multifractal spectrum of DLA
shows anomalies which have been termed left sided. In order to show that
this is compatible with strictly multiplicative structures Mandelbrot et
al. introduced a one parameter family of multifractal measures invariant
under infinitely many linear maps on the real line. Under the assumption
that the usual multifractal formalism holds, the authors showed that the
multifractal spectrum of these measure is indeed left sided, i.e. they
possess arbitrarily large Hölder exponents and the spectrum is increasing
over the whole range of these values. Here, it is shown that the
multifractal formalism for self-similar measures does indeed hold also in
the infinite case, in particular that the singularity exponents D(q)
satisfy the usual equation of self-similar measures and that the
multifractal spectrum f(a) is the Legendre transform of (q-1)D(q).
An Improved Multifractal
Formalism and Self-Similar Measures
R. H. Riedi,
J. Math. Anal. Appl. 189 (1995) 462-490.
To characterize the geometry of a measure, its so-called generalized
dimensions D(q) have been introduced recently. The mathematically precise
definition given by Falconer turns out to be unsatisfactory for reasons of
convergence as well as of undesired sensitivity to the particular choice of
coordinates in the negative q range. A new definition is introduced, which
is based on box-counting too, but which carries relevant information also
for negative q. In particular, rigorous proofs are provided for the
Legendre connection between generalized dimensions and the so-called
multifractal spectrum and for the implicit formula giving the generalized
dimensions of self-similar measures, which was until now known only for
positive q.
Explicit Lower Bounds of the Hausdorff
Dimension of Certain Self-Affine Sets [pdf]
R. H. Riedi,
Fractals in the Natural and Applied Sciences pp 313--324,
IFIP Transactions, M. Novak ed., North-Holland, Amsterdam 1994.
A lower bound of the Hausdorff dimension of certain self-affine sets is
given. Moreover, this and other known bounds such as the box dimension are
expressed in terms of solutions of simple equations involving the singular
values of the affinities.
An Improved
Multifractal Formalism and Self-affine Measures
R. H. Riedi,
Summary of Ph.D. thesis ETH Zurich, Switzerland, 1993
This document is a six page summary of my Ph.D. thesis in which
multifractal formalism based on counting on coarse levels (as opposed to a
dimensional approach) is developed (see also J. Math. Anal. Appl.
16 (1995) 132--150). This formalism is then applied to self-affine
measures discovering phase transitions which are not present with
self-similar measures.
An introduction to
multifractals
R. H. Riedi,
Rice University, (Version Oct 26, 1999)
This is an easy read introduction to multifractals. We start with a
thorough study of the Binomial measure from a multifractal point of view,
introducing the main multifractal tools. We then continue by showing how to
generate more general multiplicative measures and close by presenting an
extensive set of examples on which we elaborate how to `read' a
multifractal spectrum.