1 month, 3 weeks

Amortized Analysis




 


In computer science, amortized analysis is a method for analyzing a given algorithm's complexity, or how much of a resource, especially time or memory, it takes to execute. The motivation for amortized analysis is that looking at the worst-case run time can be too pessimistic. Instead, amortized analysis averages the running times of operations in a sequence over that sequence. As a conclusion: “Amortized analysis is a useful tool that complements other techniques such as worst-case and average-case analysis.“

Amortized analysis initially emerged from a method called aggregate analysis, which is now subsumed by amortized analysis. The technique was first formally introduced by Robert Tarjan in his 1985 paper Amortized Computational Complexity, which addressed the need for a more useful form of analysis than the common probabilistic methods used. 

Amortized analysis is applied on data structures that support many operations. The sequence of operations and the multiplicity of each operation is application specific or the associated algorithm specific. Classical asymptotic analysis gives worst case analysis of each operation without taking the effect of one operation on the other, whereas amortized analysis focuses on a sequence of operations, an interplay between operations, and thus yielding an analysis which is precise and depicts a micro-level analysis.

To calculate the cost of an opertion or the amortized cost of an operation, we take the average over all operations. In particular, worst case time of each operation is taken into account to calculate the average cost in the worst case. Some of the highlights of amortized analysis include; 

• Amortized Analysis is applied to algorithms where an occasional operation is very slow, but most of the other operations are faster.

• In Amortized Analysis, we analyze a sequence of operations and guarantee a worst case average time which is lower than the worst case time of a particular expensive operation.

• Amortized analysis is an upper bound: it is the average performance of each operation in the worst case. Amortized analysis is concerned with the over all cost of a sequence of operations. It does not say anything about the cost of a specific operation in that sequence.

• Amortized analysis can be understood to take advantage of the fact that some expensive operations may pay for future operations by somehow limiting the number or cost of expensive operations that can happen in the near future.  

• Amortized analysis may consist of a collection of cheap, less expensive and expensive operations, however, amortized analysis (due to its averaging argument) will show that average cost of an operation is cheap. 

• This is different from average case analysis, wherein averaging argument is given over all inputs for a specific operation. Inputs are modeled using a suitable probability distribution. In amortized analysis, no probability is involved.


The Basics: the three approaches to amortized analysis

It is important to note that these approaches are for analysis purpose only. The underlying algorithm design is unaltered and the purpose of these micro-level analysis is to get a good insight into the operations being performed.

• Aggregate Analysis (brute force): 
Aggregate analysis is a simple method that involves computing the total cost T(n) for a sequence of n operations, then dividing T(n) by the number n of operations to obtain the amortized cost or the average cost in the worst case. For all operations the same amortized cost T(n)/n is assigned, even if they are of different types. The other two methods may allow for assigning different amortized costs to different types of operations in the same sequence.


• Accounting Method (banker’s method):
The accounting method is a form of aggregate analysis which assigns to each operation an amortized cost which may differ from its actual cost. Early operations have an amortized cost higher than their actual cost, which accumulates a saved "credit" that pays for later operations having an amortized cost lower than their actual cost. Because the credit begins at zero, the actual cost of a sequence of operations equals the amortized cost minus the accumulated credit. Because the credit is required to be non-negative, the amortized cost is an upper bound on the actual cost. Usually, many short-running operations accumulate such credit in small increments, while rare long-running operations decrease it drastically.

Assign (potentially) different charges to each operation.

・Di = data structure after ith operation.

・ci = actual cost of ith operation.

・ĉi = amortized cost of ith operation = amount we charge operation i.

・When ĉi > ci, we store credits in data structure Di to pay for future ops; when ĉi < ci, we consume credits in data structure Di.

・Initial data structure D0 starts with 0 credits. 


• Potential Function Method (physicist’s method):
The analysis is done by focusing on structural parameters such as the number of elements, the height, the number of property violations of a data structure. For an operation, after performing the operation, the change in a structural parameter is captured as a function which is stored at a data structure. The function that captures the change is known as a potential function. As part of the analysis, we work with non-negative potential functions. If the
change in potential is positive, then that operation is over charged and similar to accounting method, the excess potential will be stored at the data structure. If the change in potential is negative, then that operation is under charged which would be compensated by excess potential available at the data
structure.

Φ(Di) maps each data structure Di to a real number s.t.:
・Φ(D0) = 0.
・Φ(Di) ≥ 0 for each data structure Di. 

Actual and amortized costs.
・ci = actual cost of ith operation.
・ĉi = ci + Φ(Di) – Φ(Di–1) = amortized cost of ith operation.

 


Case Studies

Stack
Our stack implementation consists of three operations, namely push(), pop() and multi-pop(). The operation multi-pop(k) fetches top k elements of the stack if stack contains at least k elements. If stack contains less than k elements, then output the entire stack. Let us analyze stack from the perspective of amortized analysis by assuming there are n of the above operations which are performed in some order. In classical world, the actual cost in worst case for push and pop is O(1) and multi-pop is O(n). We now show that the amortized cost of all three operations is O(1).

Aggregate Analysis:
Let us assume that in a sequence of n operations, there are l multi-pop operations and the rest are push and pop operations. Clearly, l multi-pop operations perform at most n pops and therefore, the cost is O(n). The cost for the other n − l operations is O(n). The amortized cost is O(n) + O(n) divided by n which is O(1). Since aggregate analysis assigns the same cost to all operations, push, pop and multi-pop incur O(1) amortized cost.

Accounting Method:
As part of accounting method, we need to assign credits to objects of stack S. Whenever we push an element x into S, we charge ’2’ credits. One credit will be used for pushing the element which is the actual cost for pushing x into S and the other credit is stored at x which would be used later. For pop operation, we charge nothing, i.e., pop operation is free. Although, the actual cost is ’1’, in our analysis we charge ’0’. The excess credit ’1’ available at the element will be used when we perform a pop operation. Since pop is peformed on a non-empty stack, the account will always be at credit. Similarly, we charge nothing for multi-pop as sufficient credits are available for all k elements at elements themselves. Thus, the amortised cost of push is
2 = O(1), pop is 0 = O(1) and multi-pop is 0 = O(1).

Potential Function Method: 
The first thing in this method is to define a potential function capturing some structural parameter. A natural structural parameter of interest for stack is the number of elements. Keeping the number of elements as the potential function, we now analyze all three operations.

1. Push
cˆpush = cpush + φi − φi−1
=1 + x + 1 − x, where x denotes the number of elements in S before push operation.
=2
2. Pop
cˆpush = cpush + φi − φi−1
=1 + x − 1 − x
=0
3. Multi-pop(k)
cˆpush = cpush + φi − φi−1
=k + n − k − n, where n = |S|.
=0

Thus, the amortized cost of all three operations is O(1). Note the similarity between this method and the accounting method. The potential function, in particular, the change in potential helps us to fix the excess value to be stored at each element of S. Also, note that potential functions such as 2 · |S| or c · |S|, where c is a constant will also work fine and still yields O(1) amortized cost. The excess potential will be simply stored at the data structure as credits. Similarly, in accounting method, if we charge, say 4 credits for push, then excess credits will be stored at elements. Suppose, we introduce another operation, namely, multi-push(k) which pushes k elements into S. Let us analyze the cost of multi-push in all three techniques.

• Aggregate Analysis: In any sequence of n operations consisting of push, pop, multi-pop and multi-push, we may find l1 multi-pop, l2 multi-push and the rest are push and pop operations. In the worst case scenario, l2 = O(1) and each multi-push pushes O(n) elements. For example, three multi-push with
each push inserts n/3 elements. Therefore, the total cost is O(n) and the amortized cost is O(n)/O(1) which is O(n). Thus, for all operations, the amortized cost is O(n).

• Accounting Method: As before, we use ’2’ credits for push and ’0’ for pop. With this credit scheme, the cost of push, pop and multi-pop is O(1) amortized, whereas, multi-push is 2 · O(n)/O(1) which is O(n). Therefore, the amortized cost of multi-push is O(n).

• Potential Function Method: For multi-push(k);
cˆpush = cpush + φi − φi−1
=k + n + k − n, where n = |S|.
=2 · k = O(n).


Amortized analysis: applications
・Splay trees.
・Hash Tables
・Disjoint Sets
・Dynamic table.
・Fibonacci heaps.
・Garbage collection.
・Move-to-front list updating.
・Push–relabel algorithm for max flow.
・Path compression for disjoint-set union.
・Structural modifications to red–black trees.
・Security, databases, distributed computing, ...
・Online algorithms commonly use amortized analysis.


Responses(0)