Shortest path algorithms, Java collections framework and AVL tree ...

November 25, 2016 | Author: Anonymous | Category: Java
Share Embed


Short Description

Floyd–Warshall algorithm is also known as Floyd's algorithm. 5. 4. their efficiency can be different based on the type...

Description

AVL TREE Self balancing binary trees any node“a self-balancing (or height-balanced) binary search tree is any nodebased binary based binary search tree that automatically keeps its height (maximal number of levels below the root) small in the face of arbitrary item insertions and deletions” Since we know that on binary search trees (BST), most of the operations take time in accordance with the height of the tree, hence we strive to keep its height minimal with respect to the number of elements entered. And operations hence take less time to perform. S o we don’t have to worry about the performance overhead in search, however insertion takes more time, since we know that insertion is less performed operation than search, so this approach is acceptable in such a scenario.  Avl tree. Introduction:

AVL tree is a self-balancing binary seach tree [BST], which was introduced in1962 in a research paper (“An algorithm a lgorithm for the organization of information”) by Soviet mathematicians, namely: 

Georgy Adelson-Velsky



Evgenii Mikhailovich Landis

Time complexity

Worst case time complexity for search, delete and insert is  O (log n). where n = the total number of nodes in the tree. However rebalancing after insertion and deletion may take some s ome extra time which has not been accounted for in this time complexity analysis. Total time taken for adjustment and rotation is O(1).

Working of AVL tree:

AVL tree is used in applications in which search is the most performed operation. after each insertion or deletion, it is checked from the current place of insertion to the root. if the braches are sagging more than one level from the level of other branches, then the rotation is done. Balance factor:

Balance factor of an AVL tree is calculated by following formula at each node: Height of left subtree – Height of right subtree If the balancing factor is less than -2 or more than 2 then rotation is done in case of mildly balanced tree in order to reduce the imbalance in the tree, however in strictly balanced trees, height of both subtrees should be equal.

Rotation mechanism

Left rotation

Consider the tree shown in the diagram on the right, we can see that it is unbalanced on node 4 according to the rule of balance factor, what we will have to rotate the neighboring nodes so balance could be achieved, and AVL tree will remain in its healthy state, and does not degenerate into an unbalanced tree.

Now, since we know that the AVL tree still work on the principle that: left child < parent < right child so after the rotation, we can see (in the picture on the right) that node 5 has taken the place of parent

instead of node 4 which has moved to the status of left child. And node 6 has been moved to the place of right child. Right rotation:

Now as indicated in the picture that an element is added below the leaf X, so we know that the rotation is compulsory since the tree will be unbalanced. hence to balance the tree, we have to move node m to the right hand side f the node ‘n’, such that node Z will become the child of node ‘m’. and since node ‘X’ is already at height+1, it will be appended directly below the root, so as to balance the height on both left and right sub trees.

Node ‘Y’ has remained a child of node ‘m’. In the scheme of rotation, we can see that it has been strived to preserve the parent child relation, in order to simplify the process.

 Advantages of using AVL tree: 1. Time complexity for search is O(log N) since AVL trees are balanced at all times. 2. Operations of insertion and deletion also have a time complexity of O(log n) 3. Speed of insertion is not so much impeded by the balancing operation since it only adds a constant time factor to the operation. Disadvantages of using AVL tree: 1. AVL tree is difficult to realize in form of code; more space is required for balancing factor. 2. AVL tree serves to enable faster operations but rebalancing is a costly operation in terms of time.

References Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Second Edition. Addison-Wesley, 1998. Wikipedia article: Self balancing binary search tree http://en.wikipedia.org/wiki/Self-balancing_binary_search_tree Wikipedia article: AVL tree http://en.wikipedia.org/wiki/AVL_tree Lectures from University of Washington website

courses.cs.washington.edu/courses/cse373/04wi/slides/lecture08.ppt Lectures from University of Wisconsin website http://pages.cs.wisc.edu/~ealexand/cs367/NOTES/AVLTrees/index.html

Java Collection Framework What is a collection? Collection is an object that combines/collects multiple elements into a single unit.

What is a collections framework? A collections framework is an architecture which is a combination of different programming ingredients made for the sole purpose of using collections, representing them and manipulating them. A collection framework comprises of these three things: 1. Algorithms: Algorithms are the step by step approach to achieve the purpose of representing data as we see in a structure theoretically. 2. Implementations: implementations are simply the implementation of the algorithms which enable the user to use the collection in their own way to satisfy the needs of a program. 3. Interfaces: Interfaces are the abstract classes through which we can access the implementations of the algorithms. Collections can be used to store, access, manipulate, and communicate aggregate data or perform different operations on it. Collections represent data items that belong to a similar group of elements, like: 

A dictionary (collection of valid words in a given language)



A deck of cards (collection of valid and playable cards representing a card system) etc.

 Advantages of using JCF (Java Collection Framework) Following some of the advantages that are offered by the used of JCF:













Reduction in effort required in programming:  by providing ready made and reusable structures to use in the programs, so the programmer does not have to write those algorithms him/herself. Increase in performance: by adding in productivity via better implementations of data structures that are to be used, also that various interfaces offer different implementations to suit the need of the specific  programs. Provision of compatibility and interoperability between unrelated APIs:  because using java as a common language these APIs can pass the data amongst themselves. Reduces the learning cost: by requiring us to learn different collection APIs. Reduces the effort required to design and implement APIs:  by not requiring you to produce ad hoc collections APIs. Encourages collection reuse: via implementing different algorithms, and interfaces with which to use and manipulate them to owns benefit.

 Architecture and interfaces: Following are some core collection interfaces:

Collections Collections are the root structure of collection framework. It represents a combination of objects known as its members. Some types of collections also allow similar elements in a set, while others don’t. ordered and unordered are another variety present in the realm of collections. Java platform doesn’t have any implementations as it is for the collections, but set and list are the subinterfaces that apply the thought behind the collections. Add, clear, clearAll, retainAll are some of individual and bulk operations that the collection class exhibits.

Set Set is a collection of unique elements i.e. it cannot contain duplicate elements. This interface is modeled after the representation of sets as in mathematics. ContainsAll, removeAll, retainAll, addAll are some of the bulk operations that can be performed on sets.

List List is an ordered set of elements. It can have similar elements more than once, i.e. it can have or can not have unique elements. Every element has an index number and members can be manipulated via these index numbers. Sort, Shuffle, rotate, binarySearch, indexOfSublist are the examples of the operations that can be performed on the lists.

Queue This collection is very common. Most common use for this data structure is to hold elements before they are to be entered in for processing. A queue comprises of additional methods which are insertion, extraction etc. queues implement FIFO (first in first out) sequence of the element entry and removal of elements, but this not a hard and fast rule. Priority queues are the example of such queues which defy FIFO structure. Insert, remove and examine are some of the examples of operations that can be performed on the queues.

Dequeues Dequeues are double ended queues, they can be operated on from both ends of the pipeline, they implement both FIFO (Fist in first out) and LIFO (Last in first out). addFirst, addLast, peekFirst, peekLast are some of the examples of the operations that can be performed on Dequeues.

Map Map is basically an object which points to values through keys. Duplicate keys cannot be entered because that would lead to confusing pointing and retrieval of information. Java implements maps in three basic forms i.e. Treemaps, linkedHashMap, HashMaps.

Sorted Set Sorted sets are different from sets in manner that its elements are arranged in sorted way. Advantage of sorting is taken using different new operations that are not present in the Set interface. These find application in dictionaries, roll numbers of students and other different areas of applications.

Sorted Map Sorted maps are different from the Maps in the way that their mappings keys are maintained in ascending order. Otherwise they are same.

Legacy classes: Legacy classes have been retrofitted into the collections framework, they can be used but they are not as efficient as newer data structures. however, the list of present classes have been mentioned below. Hashtable: newer alternative would be Hashmap. Enumeration: collection and iterators should be used instead, but Collection.Enumeration library can be used if needed. Vectors: Arraylist can be used to replace this data structure. Stack: linkedlists can be used in place of stacks. Bitset: Arraylist of Boolean type variables can be used to replace the BitSet Data structure, but obvious disadvantage would be wastage of space, because one bit would take 8 bits to represent an array.

References Object Oriented Software Construction (Lecture notes) University of Auckland https://www.cs.auckland.ac.nz/~hongyul/tut/3_Collections Chapter no. 22 Java collections framework Liang, Introduction to Java programming, Seventh edition, Pearson education

Java collections, software engineering notes, University of Washington http://www.cs.washington.edu/education/courses/403/03wi Wikipedia article: Java collections framework http://en.wikipedia.org/wiki/Java_collections_framework Tutorial: Introduction to Java collections (Sun corporation) https://docs.oracle.com/javase/tutorial/collections/index.html Java collections main page https://docs.oracle.com/javase/7/docs/technotes/guides/collections /index.html Java collections framework: overview https://docs.oracle.com/javase/7/docs/technotes/guides/collections /overview.html JCF Lesson: Interfaces (The Java collections > Collections) https://docs.oracle.com/javase/tutorial/collections/interfaces/

Shortest path algorithm Graph theory and directed graphs: In order to understand the shortest path algorithms we will have to understand what digraphs are. Graph theory is a branch of mathematics, more specifically of discrete mathematics. Basically the shortest paths using different algorithms are applied to digraphs which are a part of the graph theory. A directed graph / digraph is a set of nodes connected by edges / paths which either have or don’t have a direction associated with them. Graphs are expressed in the notation G=(V,A) where:  

V denotes the nodes or in other words, vertices. A is a set of ordered pairs of vertices which are also called edges in a graph, they can be directed or non-directed.

 Analogy of digraphs to road maps An analogy can be given from the map of a city where the shortest path between various destinations can be found out by traversing the path between the destinations. Nodes of the graphs correspond to the intersections on the road map, and edges are road segments, weights can be assigned to the segments according to the length of the road segments.

Shortest path problem: Shortest path problems are defined in Wikipedia as: “In graph theory, the shortest path problem is the problem of finding a path between two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is minimized.”

Categorization of Shortest path problems Shortest path algorithms can be categorized as follows: 1. 2. 3. 4.

Single pair shortest path problem single source shortest path problem single destination shortest path problem all pairs shortest path problem

Shortest path algorithms: Following are different shortest path algorithms which can be used in different scenarios and settings, their efficiency can be different based on the type of problem given:      

Dijkstra's algorithm Bellman–Ford algorithm A* search algorithm Floyd–Warshall algorithm Johnson's algorithm Viterbi algorithm

Some of the algorithms will be explained here briefly:

Floyd-Warshall ’s shortest path algorithm: The Floyd–Warshall algorithm was published by Robert Floyd in 1962. However, it is similar to the algorithms previously published by Bernard Roy in 1959. Floyd–Warshall algorithm is also known as Floyd's algorithm, Roy–Warshall algorithm, Roy–Floyd algorithm. it is a graph analysis algorithm for finding the shortest path in a weighted digraph with positive or negative edge weights, but comprising of negative cycles. A single execution will find the lengths of the shortest paths between all paired nodes. Time complexity in the worst case scenario is O (|V|3)

 Algorithm: 1. it compares all possible paths through the graph between each paired node. 2. It does this with the help of Θ(|V |3) comparisons in a graph. 3. there may be up to Ω(|V |2) edges in the graph, and every combination of edges is tested. 4. It does so by step by step improving the estimate on the shortest path between two vertices. 5. until the estimate is optimum the loop is continued.

Dijkstra’s shortest path algorithm: Dijkstra's algorithm was published by Edsger Dijkstra in 1959. It is a graph search algorithm that solves the single-source shortest path problem. It is used in graphs that cannot use negative edge cost. It produces a shortest path tree. This algorithm is often used in routing along with other graph algorithms.

A source node is given in the graph in which the shortest path is to be found, the algorithm finds the path with lowest edge cost which obviously is the shortest path from that node to every other node in the graph (single source - multiple destination problem). It can also be used to find the shortest path in single source – single destination problems by stopping the algorithm once the algorithm has reached the destination and shortest path is achieved. Same road map analogy can be applied here. For example, if we want to find the shortest distance from our current location to all other destinations or road intersections in the city, then Dijkstra’s algorithm can be used, or in the second case as we have discussed, we will have to stop midway in order to find the shortest path from current location to a specific location. But the algorithm will also find the shortest paths for all the destinations in between of that specific path.

 Algorithm: Source node is called initial node. Let the distance of node Y be the distance from the initial node to Y. Following steps will be followed to find the shortest path from initial node to node Y. 1. An initial value is set to all nodes. This is 0 for initial node and infinity for all other nodes. 2. All nodes are marked unvisited. Initial node is set as current. A set called “unvisited set” is created comprising of all the unvisited nodes. 3. Consider all of its unvisited neighbors of the current node and calculate their tentative distances. Compare the newly calculated distance to the current assigned value and assign the smaller one, if the previous value was greater. 4. When the distance is measured for all the neighbors from the current node, set the current node as visited and eliminate it from the unvisited set. 5. If the destination node is set to visited (in single source – single destination problem) or if the smallest tentative distance from current node to the nodes in unvisited set is infinity (in single source – multiple destination proble, when the unvisited nodes are disjointed from the current node) then the algorithm has finished. 6. Select the unvisited node that is marked with the smallest tentative distance, and set it as the new "current node" then go back to step 3.

Comparison between Floyd- Warshall and Dijkstra’s algorithm: Floyd-Warshall Algorithm:

1. Finds a shortest-path for all node-pairs (x, y). 2. We can have one or more links of negative cost, c(x, y)
View more...

Comments

Copyright © 2017 DATENPDF Inc.