일 | 월 | 화 | 수 | 목 | 금 | 토 |
---|---|---|---|---|---|---|
1 | 2 | |||||
3 | 4 | 5 | 6 | 7 | 8 | 9 |
10 | 11 | 12 | 13 | 14 | 15 | 16 |
17 | 18 | 19 | 20 | 21 | 22 | 23 |
24 | 25 | 26 | 27 | 28 | 29 | 30 |
31 |
- data scientist
- Newyork
- Linked List
- data
- Heap
- BST
- 빅데이터 커리어 가이드북
- hash
- 빅데이터
- 뉴욕 화장실
- 화장실 지도
- Data Analyst
- 데이터 엔지니어
- Data Structure
- 데이터 분석가
- dataStructure
- 빅데이터 지식
- HEAPS
- algorithm
- Computer Organization
- Preparing for the Google Cloud Professional Data Engineer Exam
- exam
- Computer Science
- Study
- priority queue
- Restroom
- Algorithms
- Binary Tree
- binary search tree
- Data Engineer
- Today
- Total
Jaegool_'s log
IT 279: Algorithms and Data Structure EXAM2 prep 본문
Contents
Sorting
Insertion Sort
Selection Sort
Merge Sort
Quick Sort
Heap Sort
Shell Sort
Disjoint Set
- A data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets
Term: Forest
Union
Find
Union by Size + Path Compression
Union by Rank/Height
Graph
Topological Sort: ordering the nodes in a directed acyclic graph
Shortest path: Dijkstra's Algorithm, Breadth-First Search
Network Flow problems
Minimum Spanning Tree: Kruskal's Algorithm, Prim's Algorithm
--------------------------------
Sorting
(code, stability, complexity)
Insertion Sort
template <typename Comparable>
void insertionSort( vector<Comparable> & a )
{
for( int p = 1; p < a.size(); ++p) // start with second element
{
Comparable temp = std::move( a[ p ] );
int j;
for( j = p; j > 0 && tmp < a[ j - 1 ]; --j )
a[ j ] = a[ j - 1];
a[ j ] = std::move( tmp );
}
}
time: O(n) - best, O(n^2) - average & worst(reverse order)
space: O(1)
stable
1. compare the second element and the previous item and then sort
2. get the next element and put it in the right place
Selection Sort
1. find the minimum element
2. swap with the first item
3. keep doing the process except for the sorted part
Merge Sort
1. divide half and make them as two parts
2. recursively make half and sort.
Quick Sort
sort using pivot(3 median - sort first, middle, last elements and swap middle one and last -1 element)
/**
* Quicksort algorithm (driver).
*/
void quicksort(vector<Comparable> & a)
{
quicksort(a, 0, a.size() - 1);
}
void quicksort(vector<Comparable> & a, int left, int right)
{
if(left + 10 <= right)
{
const Comparable & pivot = median3(a, left, right);
// Begin partitioning
int i = left, j = right - 1;
for( ; ; )
{
while(a[++i] < pivot){}
while(a[++j] > pivot){}
if (i < j)
std::swap(a[i], a[j]);
else
break;
}
std::swap(a[i], a[right-1]); //restore pivot
quicksort(a, left, i-1);
quicksort(a, i+1, right);
}
else
insertionSort(a, left, right);
}
const Comparable & median3(vector<Comparable> & a, int left, int right)
{
int center = (left + right) / 2;
if(a[center] < a[left])
std::swap(a[center], a[left]);
if(a[right] < a[left])
std::swap(a[center], a[left]);
if(a[right] < a[center])
std::swap(a[right], a[center]);
std::swap(a[center], a[right-1]);
return a[right-1];
}
Heap Sort
Step 1: O(n): linear time required to build heap algorithm - heapify(Maxheap)
Step 2: n-1 deleteMax operations (swap root and the end of the node then percolate down to keep the heap)
deleteMax is O(log n)
Therefore, O(n) + O(n log n) => O(n log n). - all cases
Space cost of n + 1
Higher overhead than some other O(n log n) sorts.
Not a stable sort
Shell Sort
template <typename Comparable>
void shellsort( vector<Comparable> & a )
{
for( int gap = a.size() / 2; gap > 0; gap /= 2 )
for( int i = gap; i < a.size(); ++i )
{
Comparable temp = std::move( a[ i ] );
ing j = i;
for( ; j >= gap && tmp < a[ j - gap ]; j -= gap)
a[ j ] = std::move( a[ j - gap ] );
a[ j ] = std::move( tmp );
}
}
time complexity: O(n^2) - worst case, better case O(n log n) ~ O(log n^3/2)
space complexity: O(1)
Not stable
based on the Insertion sort.
Disjoint Set
- A data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets
Term: Forest
Union
Find
Union by Size + Path Compression
Union by Rank/Height
Graph
Topological Sort: ordering the nodes in a directed acyclic graph (needs: DAG, In-degree Array, queue / stack)
1. put 0 In-degree vertices in the queue
2. remove the first element in the queue and sort
3. update the In-degree array
edge: dependency
Shortest path
Breadth-First Search(BFS):
BFS is used for traversing or searching tree or graph data structures.
It starts at the tree root and explores the neighbor nodes first, before moving to the next level of nodes.
Unweighted Graphs: BFS does not take into account the weight of the edges connecting nodes.
Shortest Path: In an unweighted graph, BFS can be used to find the shortest path between two nodes because it explores the closest nodes first.
Implementation: Queue, (From, to, cost(total cost))
Complexity: time complexity is O(V+E), where V is the number of vertices and E is the number of edges.
Usage: for finding the shortest path, testing bipartiteness, or finding connected components in an undirected graph.
Queue (From, to, total cost)
Array (path and total cost)
Dijkstra's Algorithm:
for finding shortest paths between nodes in a graph, ex) road networks.
Weighted Graphs: It works for weighted graphs with non-negative weights.
This is because Dijkstra's assumes that once a node has been visited and the shortest path to that node has been determined, the node can be "closed" and does not need to be visited again.
Shortest Path: it can produce a shortest path tree to find the shortest paths from a starting node to all other nodes.
Implementation: priority queue(min-heap), (Vertex, Prev, cost, found(to use count))
Complexity: O((V+E) log V), which is due to the fact that every edge is looked at once, and nodes are processed in the priority queue.
Usage: routing, navigation systems, network routing protocols, shortest path calculation on weighted graphs.
Key Differences: BFS is for unweighted graphs, while Dijkstra's can handle weighted graphs with non-negative weights. BFS uses a queue, and Dijkstra's uses a priority queue.
Minimum Spanning Tree
We only look at the edge weight.
Prim's Algorithm(for dense graph):
Process: Begins with a single starting vertex and grows the spanning tree one edge at a time. At every step, it adds the cheapest possible connection from the tree to another vertex.
Data Structure: priority queue to select the next minimum weight edge.
The running time is O(|V|2) without heaps, which is optimal for dense graphs, and O(|E| log |V|) using binary heaps, which is good for sparse graphs.
It focuses on the individual edges while Dijkstra's focuses on the path.
empty priority queue meaning: we don't have a connected graph
Kruskal's Algorithm(for sparse graph(fewer edges)):
Process: Sorts all the edges of the graph by ascending edge weight and then adds them one by one to the spanning tree. If adding an edge would create a cycle, then that edge is skipped.
Data Structure: Uses a disjoint-set data structure to keep track of the set of trees in the forest during the algorithm's execution.
Suitability: Typically more efficient in sparse graphs because it deals with edges and adds the next lightest edge that doesn't produce a cycle.
To summarize, Kruskal's algorithm is often more efficient for sparse graphs because the graph's sparseness makes sorting the edges relatively quick and the disjoint-set data structure effectively maintains the forest of trees that eventually form the MST. Prim's algorithm is better suited for dense graphs because it efficiently connects vertices to the growing MST and can take advantage of the decreased overhead compared to managing a larger edge set in dense graphs.
크루스칼 알고리즘은 희소 그래프(sparse graph)에 효과적입니다. 그래프의 희소성 때문에 간선을 정렬하는 것이 비교적 빠르며, 분리 집합(disjoint-set) 자료 구조를 사용하여 최소 신장 트리(MST)를 형성하는 트리의 숲을 효과적으로 관리할 수 있기 때문입니다. 반면에, 프림 알고리즘은 밀집 그래프(dense graph)에 적합합니다. 그 이유는 프림 알고리즘이 점점 성장하는 MST에 정점을 효율적으로 연결할 수 있고, 많은 간선을 관리하는 데 드는 부담이 적기 때문입니다.
Network Flow: A weighted directed graph where each edge has a capacity and each edge receives a flow. The amount of flow on an edge cannot exceed the capacity of the edge.
Source(s): The node in the flow network from which the flow originates.
Sink(t): The node in the flow network where the flow is consumed.
Capacity Constraints: edge with non-negative capacity, and the flow <= capacity.
Path: edges from the source to the sink
the total flow coming in must equal the total flow going out. both condition cycle acyclic works.
'Courses' 카테고리의 다른 글
IT279: Algorithm and Data Structures Recap 2/3 (0) | 2023.12.11 |
---|---|
IT279: Algorithm and Data Structures Recap 1/3 (1) | 2023.12.10 |
IT 279: Algorithms and Data Structures, Week 6 Heap (1) | 2023.10.05 |
IT 279: Algorithms and Data Structures, Week 5 Hash Tables (1) | 2023.10.05 |
IT 279: Algorithms and Data Structures, Week 4 Search Trees (0) | 2023.10.04 |