In this article following Data Structures are discussed.
5. Binary Tree
6. Binary Search Tree
7. Binary Heap
9. Hashing
Binary Tree
Unlike Arrays, Linked Lists, Stack and queues, which are linear data structures, trees are hierarchical data structures.A binary tree is a tree data structure in which each node has at most two children, which are referred to as the left child and the right child. It is implemented mainly using Links.
Binary Tree Representation: A tree is represented by a pointer to the topmost node in tree. If the tree is empty, then value of root is NULL. A Binary Tree node contains following parts.
1. Data
2. Pointer to left child
3. Pointer to right child
A Binary Tree can be traversed in two ways:
Depth First Traversal: Inorder (Left-Root-Right), Preorder (Root-Left-Right) and Postoder (Left-Right-Root)
Breadth First Traversal: Level Order Traversal
Binary Tree Properties:
The maximum number of nodes at level ‘l’ = 2l-1.
Maximum number of nodes = 2h – 1.
Here h is height of a tree. Height is considered
as is maximum number of nodes on root to leaf path
Minimum possible height = ceil(Log2(n+1))
In Binary tree, number of leaf nodes is always one
more than nodes with two children.
Time Complexity of Tree Traversal: O(n)
Examples : One reason to use binary tree or tree in general is for the things that form a hierarchy. They are useful in File structures where each file is located in a particular directory and there is a specific hierarchy associated with files and directories. Another example where Trees are useful is storing heirarchical objects like JavaScript Document Object Model considers HTML page as a tree with nesting of tags as parent child relations.
Binary Search Tree
In Binary Search Tree is a Binary Tree with following additional properties:1. The left subtree of a node contains only nodes with keys less than the node’s key.2. The right subtree of a node contains only nodes with keys greater than the node’s key.3. The left and right subtree each must also be a binary search tree.
Time Complexities:
Search : O(h)
Insertion : O(h)
Deletion : O(h)
Extra Space : O(n) for pointers
h: Height of BST
n: Number of nodes in BST
If Binary Search Tree is Height Balanced,
then h = O(Log n)
Self-Balancing BSTs such as AVL Tree, Red-Black
Tree and Splay Tree make sure that height of BST
remains O(Log n)
BST provide moderate access/search (quicker than Linked List and slower than arrays).
BST provide moderate insertion/deletion (quicker than Arrays and slower than Linked Lists).
Examples : Its main use is in search application where data is constantly entering/leaving and data needs to printed in sorted order. For example in implementation in E- commerce websites where a new product is added or product goes out of stock and all products are lised in sorted order.
Binary Heap
A Binary Heap is a Binary Tree with following properties.1) It’s a complete tree (All levels are completely filled except possibly the last level and the last level has all keys as left as possible). This property of Binary Heap makes them suitable to be stored in an array.2) A Binary Heap is either Min Heap or Max Heap. In a Min Binary Heap, the key at root must be minimum among all keys present in Binary Heap. The same property must be recursively true for all nodes in Binary Tree. Max Binary Heap is similar to Min Heap. It is mainly implemented using array.
Get Minimum in Min Heap: O(1) [Or Get Max in Max Heap]
Extract Minimum Min Heap: O(Log n) [Or Extract Max in Max Heap]
Decrease Key in Min Heap: O(Log n) [Or Extract Max in Max Heap]
Insert: O(Log n)
Delete: O(Log n)
Example : Used in implementing efficient priority-queues, which in turn are used for scheduling processes in operating systems. Priority Queues are also used in Dijstra’s and Prim’s graph algorithms.
Order statistics: The Heap data structure can be used to efficiently find the k’th smallest (or largest) element in an array.
Heap is a special data structure and it cannot be used for searching of a particular element.
Hashing
Hash Function: A function that converts a given big input key to a small practical integer value. The mapped integer value is used as an index in hash table. A good hash function should have following properties1) Efficiently computable.2) Should uniformly distribute the keys (Each table position equally likely for each key)
Hash Table: An array that stores pointers to records corresponding to a given phone number. An entry in hash table is NIL if no existing phone number has hash function value equal to the index for the entry.
Collision Handling: Since a hash function gets us a small number for a key which is a big integer or string, there is possibility that two keys result in same value. The situation where a newly inserted key maps to an already occupied slot in hash table is called collision and must be handled using some collision handling technique. Following are the ways to handle collisions:
Chaining:The idea is to make each cell of hash table point to a linked list of records that have same hash function value. Chaining is simple, but requires additional memory outside the table.
Open Addressing: In open addressing, all elements are stored in the hash table itself. Each table entry contains either a record or NIL. When searching for an element, we one by one examine table slots until the desired element is found or it is clear that the element is not in the table.
Space : O(n)
Search : O(1) [Average] O(n) [Worst case]
Insertion : O(1) [Average] O(n) [Worst Case]
Deletion : O(1) [Average] O(n) [Worst Case]
Hashing seems better than BST for all the operations. But in hashing, elements are unordered and in BST elements are stored in an ordered manner. Also BST is easy to implement but hash functions can sometimes be very complex to generate. In BST, we can also efficiently find floor and ceil of values.
Example : Hashing can be used to remove duplicates from a set of elements. Can also be used find frequency of all items. For example, in web browsers, we can check visited urls using hashing. In firewalls, we can use hashing to detect spam. We need to hash IP addresses. Hashing can be used in any situation where want search() insert() and delete() in O(1) time.