operation            ArrayList               LinkedList
add(x)               O(1) amortized          O(1) (provided a tail pointer)
get(i)               O(1) random access      O(n)
removeFirst()        O(n) shift to left      O(1)
  (= remove(0))
removeLast()         O(1) (numElements--)    O(1) (doubly linked list)
  (= remove(size()-1))
add(i,x):            O(n)                    O(n)
  1. find i          O(1)                    O(n)
  2. add             O(n)                    O(1)

--------------------------------------------------------------------------------------------------
Implementing a priority queue

1. using an unsorted array/array list
   add(x)  --> add x to the end of the array            O(1) amortized time
   peek()  --> scan the array to find the min           O(n) time
   poll()  --> find the min, then remove it             O(n) + O(n) = O(n)

2. unsorted array, maintaining the index of the min
   add(x)  --> add x to the end of the array            O(1) amortized time
               if x < min, set min = index(x)           O(1) time
   peek()  --> return array[min]                        O(1) time
   poll()  --> find the min                             O(1) time
               remove(min)                              O(n) time
               find the next min                        O(n) time

3. sorted array (descending order)
   add(x)  --> add x into the sorted order
               (run the inner loop of insertion sort)   O(n) time
   peek()  --> return array[size-1]                     O(1) time
   poll()  --> size--;                                  O(1) time

What we want, ideally, is adding and polling to take O(log n) time. Use a binary minimum heap.

A binary tree is a structure where every node has at most 2 children and there is a unique path from any node to the root.
children = the nodes underneath a node
parent = the node above
root = the topmost node
A complete binary tree is a binary tree where every row is completely filled (from left to right) except possibly the last row.
The binary minimum heap property says that for every node x in the tree, x <= its children (equivalently, x >= its parent).
A binary min heap is a complete binary tree that obeys the binary min heap property.

ex (stored in an array, written as value(index)):
3(0) 5(1) 7(2) 6(3) 8(4) 7(5) 8(6) 6(7) 10(8) 11(9) 14(10) 11(11) 8(12) 8(13) 8(14)

peek(): O(1) time
    return the root
add(x): O(h), where h = height of the tree
    put x at the end of the array (leftmost open spot in the tree)
    while x is not at the root and x < parent(x)
        swap(x, parent(x))
        x = parent(x)
poll(): O(h)
    swap the root with the last element
    delete the last element
    find the smallest of root, left(root), and right(root)   // provided left(root) and right(root) exist
    if smallest = root then stop
    else swap(root, smallest) and repeat this process with smallest as the new root

left(x) = 2x + 1
right(x) = 2x + 2
parent(x) = floor((x-1)/2)

The running times for poll() and add() are O(h). But what is h in terms of n? For a full tree with h levels,
n = 1 + 2 + 4 + 8 + ... + 2^(h-1)
Multiply by the common ratio (in this case, 2):
2n = 2 + 4 + 8 + ... + 2^(h-1) + 2^h
2n - n = n = 2^h - 1  -->  n + 1 = 2^h  -->  h = log_2(n+1)

------------------------------------------------
Heaps for sorting

heapsort(arr, n)
1. take arr and make it into a max heap using the build-heap algorithm
   while n > 1 do
2.     swap arr[0] with arr[n-1]    // put the max at the end
3.     n--                          // ignore the max from now on
4.     fix the heap (sift the new root down)

ex: 5 12 3 8 1 20 5 4
1. turn the array into a max heap:
   20 12 5 8 1 3 5 4
2. repeatedly pop the heap; the final array is sorted:
   1 3 4 5 5 8 12 20
total time O(n log n)

Sorting algorithms: heapsort, merge sort, bubble sort, selection sort, insertion sort, quicksort

bubble sort
3 5 1 8 5 4 12 20
3 1 5 5 4 8 12 20
1 3 5 4 5 8 12 20
1 3 4 5 5 8 12 20
1 3 4 5 5 8 12 20

bubble sort obeys the following invariant: after p passes, the last p numbers are in sorted order.
best case: already sorted, O(n)
worst case: reverse sorted,
(n-1) + (n-2) + (n-3) + ... + 1 = 1 + 2 + 3 + ... + (n-1) = n(n-1)/2 = O(n^2)
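The notes trace bubble sort but don't include code for it; here is a minimal Java sketch consistent with the trace and invariant above (the method name and the early-exit swapped flag are my additions, not from the notes).

void bubbleSort(int[] arr, int n) {
    for(int pass = 1; pass < n; pass++) {
        boolean swapped = false;               // early exit gives the O(n) best case
        for(int i = 0; i < n - pass; i++) {    // last (pass-1) slots are already in place
            if(arr[i] > arr[i+1]) {
                int tmp = arr[i];
                arr[i] = arr[i+1];
                arr[i+1] = tmp;
                swapped = true;
            }
        }
        if(!swapped) break;                    // already sorted
    }
}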
------------------------------------------------------
insertion sort

   sorted       |    ???
0 _________ i-1 | i _________

 5 || 12 3 8 1 20 5 4
 5 12 || 3 8 1 20 5 4
 3 5 12 || 8 1 20 5 4
 3 5 8 12 || 1 20 5 4
 1 3 5 8 12 20 || 5 4
 1 3 5 5 8 12 20 || 4
 1 3 4 5 5 8 12 20

void insertionSort(int[] arr, int n) {
    for(int i = 1; i < n; i++) {
        int key = arr[i];            // next element to insert into the sorted prefix
        int loc = i - 1;
        while(loc >= 0 && arr[loc] > key) {
            arr[loc+1] = arr[loc];   // shift larger elements to the right
            loc--;
        }
        arr[loc+1] = key;
    }
}

best case: already sorted, O(n), because the while loop never runs. Also O(n) whenever every number is a constant distance from where it belongs.
worst case: reverse sorted, O(n^2)

----------------------------------------------------
merging two sorted runs:
2 6 7 10    1 3 9 12
1 2 3 6 7 9 10 12

void merge(int[] arr, int start, int mid, int end) {
    // arr[start..mid-1] is sorted and so is arr[mid..end-1]
    int left = start;
    int right = mid;
    int i = 0;
    int[] temp = new int[end - start];
    while(left < mid && right < end) {
        if(arr[left] <= arr[right]) temp[i++] = arr[left++];
        else temp[i++] = arr[right++];
    }
    while(left < mid) temp[i++] = arr[left++];     // copy whatever remains
    while(right < end) temp[i++] = arr[right++];
    for(i = 0; i < temp.length; i++) arr[start + i] = temp[i];
}

Merge sort recursively sorts arr[start..mid-1] and arr[mid..end-1], then calls merge. There are log_2 n levels of recursion (assume n = 2^k --> k = log_2 n), and each level does O(n) total merging work:
total = n(1) + n log_2 n = O(n log n)

-----------------------------------------------------------------------
Quicksort

Partitioning: Given an array A[start..end], how can we prepare the array so that, when we are done, we only need to sort the two sides?
Goal: Using some element inside the array called the pivot, move all elements <= pivot to the left and all elements >= pivot to the right. Once this is accomplished, all we need to do is sort the left and right parts.

One possible method: keep the pivot P at arr[start] and maintain three regions,

  P |  <= pivot  |  >= pivot  |  ??? (unprocessed)
start            b            i                   end

where b (the boundary) marks the end of the <= pivot region and i is the next element to examine.

int partition(int[] arr, int start, int end) {
    int r = start + random.nextInt(end - start);   // random pivot index in [start, end)
    swap(arr, start, r);
    int pivot = arr[start];
    int boundary = start;
    for(int i = start + 1; i < end; i++) {
        if(arr[i] <= pivot) {
            boundary++;
            swap(arr, boundary, i);
        }
    }
    swap(arr, start, boundary);   // put the pivot between the two regions
    return boundary;
}

------------------------------------------------------------------
ex: does an array A contain a duplicate element?

Set<Integer> set = new HashSet<>();
for(int x : A) set.add(x);
return A.length != set.size();

or, with an early exit:

for(int x : A) {
    if(set.contains(x)) return true;
    else set.add(x);
}
return false;

ex: do two elements of A sum to 23?    A = 5 12 3 8 1 20 5 4

for(int x : A) {
    if(set.contains(23 - x)) return true;
    else set.add(x);
}
return false;

------------------------------------------------------------------
A set is a data structure that models a mathematical set, i.e. a collection where all we are interested in is whether some element is in the collection or not. It's useful in programming since it allows us to remember things we've already seen, like above.

Useful operations:
1. S.contains(x) : is x an element of S?
2. S.add(x) : S = S U {x}
3. S.remove(x) : S = S \ {x}
4. S.addAll(T) : S = S U T
5. S.removeAll(T) : S = S \ T
6. S.retainAll(T) : S = S intersect T

Implementations:
1. TreeSet: the data are stored in a tree using a natural ordering (or a Comparator ordering)
   Pros:
   1. Iterating through the keys in a TreeSet yields a sorted order of the keys.
   2. Because of the underlying structure, we can query the set about max/min keys and the predecessor/successor of a given key in O(log n) time.
   Cons:
   1. Keys must have an ordering.
   2. Most operations take O(log n) worst-case time, which is generally slower than O(1) amortized expected time.
2. HashSet: the data are stored in a hash table based on prehash and hash function mappings.
   Pros:
   1. Very fast (O(1) amortized time in expectation) insertions/deletions/searches.
   2. Can store any key imaginable (since all Objects in Java have hash codes = prehash).
   Cons:
   1. Iteration produces haphazard results.
   2. Cannot efficiently answer predecessor/successor queries.
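A small runnable sketch contrasting the two implementations (the class name SetDemo is mine; the values reuse the example array from earlier in the notes): iterating a TreeSet visits keys in sorted order, while a HashSet makes no ordering promise.

import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class SetDemo {
    public static void main(String[] args) {
        Set<Integer> tree = new TreeSet<>();
        Set<Integer> hash = new HashSet<>();
        for(int x : new int[]{5, 12, 3, 8, 1, 20, 5, 4}) {   // duplicate 5 is stored only once
            tree.add(x);
            hash.add(x);
        }
        System.out.println(tree);   // sorted order: [1, 3, 4, 5, 8, 12, 20]
        System.out.println(hash);   // some haphazard order
    }
}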
------------------------------------------------------------------------
Maps
A map is a data structure that allows us to associate one type of data (called the key) with another (called the value). Searches are optimized based on the key. Structurally, maps are identical to sets, but store pairs of elements (organized by the key) instead of just keys like a set.
Implementations in Java are TreeMap and HashMap, and they are subject to the same pros and cons as TreeSets and HashSets.

Important methods:
M.containsKey(k) : is k a key in M?
M.get(k) : returns the value associated with k if M.containsKey(k), and null otherwise.
    (Recall: if you want to use get() to check membership of k in the key set of M, you cannot rely on auto-unboxing, since the result can be null.)
M.put(k, v) : adds the pair (k, v) to the map.
M.remove(k) : deletes the pair associated with k in the map.
M.keySet() : returns a Set over all keys in M. Return type is Set<K>. This allows iteration over M.
M.entrySet() : returns a Set over all pairs in M. Return type is Set<Map.Entry<K,V>>.

---------------------------------------------------------
TreeSets are implemented using binary search trees.
A binary search tree is a binary tree in which every node obeys the binary search tree property: for all nodes n, if x is a node in n's left subtree, then x.key <= n.key, and if x is a node in n's right subtree, then x.key >= n.key.

ex (written level by level):
20
18 25
15 19 23 29
17 22 24 26 35
27
(17 is the right child of 15; 27 is the right child of 26)

preorder:  20 18 15 17 19 25 23 22 24 29 26 27 35
inorder:   15 17 18 19 20 22 23 24 25 26 27 29 35
postorder: 17 15 19 18 22 24 23 27 26 35 29 25 20

class TreeNode {
    int key;
    TreeNode left;
    TreeNode right;
}

public static TreeNode search(TreeNode root, int key) {
    if(root == null) return null;
    if(root.key == key) return root;
    if(key < root.key) return search(root.left, key);
    else return search(root.right, key);
}

public static TreeNode insert(TreeNode root, int target) {
    if(root == null) {
        root = new TreeNode();
        root.key = target;
        root.left = null;
        root.right = null;
    } else if(target < root.key) {
        root.left = insert(root.left, target);
    } else {
        root.right = insert(root.right, target);
    }
    return root;
}

preorder = root first
public static void printNLR(TreeNode root) {
    if(root == null) return;
    System.out.println(root.key);
    printNLR(root.left);
    printNLR(root.right);
}

inorder = look at the root in the middle
public static void printLNR(TreeNode root) {
    if(root == null) return;
    printLNR(root.left);
    System.out.println(root.key);
    printLNR(root.right);
}

postorder = look at the root at the end
public static void printLRN(TreeNode root) {
    if(root == null) return;
    printLRN(root.left);
    printLRN(root.right);
    System.out.println(root.key);
}

public static int sum(TreeNode root) {
    if(root == null) return 0;
    return root.key + sum(root.left) + sum(root.right);
}

// toString() body for a collection (produces e.g. "1 2 3 4 5 6 "):
StringBuilder sb = new StringBuilder();
for(E e : this) sb.append(e + " ");
return sb.toString();

// does the tree rooted at root contain the given value?
if(root == null) return false;
if(value == root.value) return true;
return containsValue(root.left, value) || containsValue(root.right, value);

// does the stack contain target? (restore the stack before returning)
boolean ans = false;
ArrayStack<Integer> saved = new ArrayStack<>();
while(!stack.isEmpty()) {
    if(stack.peek() == target) {
        ans = true;
        break;
    } else {
        saved.push(stack.pop());
    }
}
while(!saved.isEmpty()) stack.push(saved.pop());
return ans;

---------------------------------------------------------------------------
Hash tables

By way of introduction, suppose that I wanted to implement a set, and I knew that the keys could only be in the set {0, ..., 100}. How could I do this really efficiently?
Ans: a direct access table. A direct access table is an array T[0..n] of boolean values such that T[i] = true if i is in the set and T[i] = false if i is not.
constructor(): set all T[i] to false
add(x): T[x] = true
contains(x): return T[x]
remove(x): T[x] = false
Is this generally realistic? For small n, sure. For large n, definitely not.
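A minimal Java sketch of the direct-access-table idea for keys in {0, ..., 100}; the class name DirectAccessSet is hypothetical, not from the notes.

public class DirectAccessSet {
    private final boolean[] t = new boolean[101];   // t[i] == true iff i is in the set

    public void add(int x)         { t[x] = true;  }
    public void remove(int x)      { t[x] = false; }
    public boolean contains(int x) { return t[x];  }
}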
How can we leverage this idea? Create a smaller table. To make our keys fit into the table, compute key % table size. This is called the division method.

ex (table size 20): add(14), add(63), add(150), add(75), add(10)

 0 | Hello
 1 |
 2 |
 3 | 63
 4 |
 5 |
 6 |
 7 |
 8 |
 9 |
10 | 150 -> 10
11 |
12 |
13 |
14 | 14
15 | 75
16 |
17 |
18 |
19 |

But there is a major problem: collisions! (e.g. 150 and 10 both map to slot 10)
Resolutions: chaining, probing

public class OurClass {
    private char ch;

    OurClass(char ch) {
        this.ch = ch;
    }

    public int hashCode() {
        return (ch - 'A') + 10;
    }

    public boolean equals(Object o) {
        if(!(o instanceof OurClass)) return false;
        OurClass rhs = (OurClass) o;
        return ch == rhs.ch;
    }
}

OurClass o1 = new OurClass('A');
OurClass o2 = new OurClass('Z');

---------------------------------------------
An undirected graph G is an ordered pair (V, E), written G = (V, E), where
V = a set of vertices (nodes)
E = a set of edges (arcs), where an edge is a set of 2 vertices.

For the undirected graph:
V = {A, B, C, D, E}
E = {{A,B}, {A,C}, {B,C}, {B,D}, {B,E}, {C,E}, {D,E}}

A directed graph G is an ordered pair (V, E), where
V = a set of vertices (nodes)
E = a set of edges (arcs), where an edge is an ordered pair of vertices.

For the directed graph:
V = {A, B, C, D, E, F}
E = {(A,B), (A,D), (B,C), (C,D), (F,A), (E,F), (F,E)}

Abuse of notation: In reality, in an undirected graph edges are unordered pairs, i.e. doubleton sets, so the correct notation is as above. However, many textbooks and academic papers use the parenthesis notation for both, even though it is not technically correct. Many such examples abound (e.g. O(1), f(x,y), etc.).

Definition: We say that a vertex v is adjacent to a vertex u if there is an edge from u to v. I.e. if (u,v) is an edge, we say that v is adjacent to u.

Definition: The adjacency matrix of a graph G = (V,E) is a 2D array Adj[|V|, |V|] where Adj[i,j] = 1 if (i,j) is an edge and 0 if not.

undirected graph:
   A B C D E
A  0 1 1 0 0
B  1 0 1 1 1
C  1 1 0 0 1
D  0 1 0 0 1
E  0 1 1 1 0

directed graph:
   A B C D E F
A  0 1 0 1 0 0
B  0 0 1 0 0 0
C  0 0 0 1 0 0
D  0 0 0 0 0 0
E  0 0 0 0 0 1
F  1 0 0 0 1 0

adjacency lists:
undirected:
A = B -> C
B = A -> C -> D -> E
C = A -> B -> E
D = B -> E
E = C -> B -> D

directed:
A = B -> D
B = C
C = D
D = null
E = F
F = A -> E
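As a minimal sketch of how the directed-graph adjacency lists above could be stored in Java (a Map from each vertex to its list of neighbors; the class and variable names are mine, not from the notes):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AdjListDemo {
    public static void main(String[] args) {
        // edges of the directed graph from the notes
        char[][] edges = {{'A','B'}, {'A','D'}, {'B','C'}, {'C','D'}, {'F','A'}, {'E','F'}, {'F','E'}};
        Map<Character, List<Character>> adj = new HashMap<>();
        for(char v = 'A'; v <= 'F'; v++) adj.put(v, new ArrayList<>());
        for(char[] e : edges) adj.get(e[0]).add(e[1]);   // for an undirected graph, also add e[0] to adj.get(e[1])
        System.out.println(adj);   // e.g. {A=[B, D], B=[C], C=[D], D=[], E=[F], F=[A, E]} (HashMap order not guaranteed)
    }
}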