Here I want to share some of the fun things that I've found.

[C++] std::vector<bool>

  1. sizeof(bool) is NOT NECESSARILY 1 but it is implementation-defined.

  2. But since a byte is the smallest addressable unit of memory in many computer architectures, for the computer architectures in widespread use today, it takes up practically one byte.

  3. Nevertheless, std::vector<bool> is specialized so that each bool takes 1 bit. Thus std::vector<bool> v; bool *pb =&v[0]; won't compile!

Alternatives to this can be to use (1) std::deque<bool> if you don't need an array, or (2) an alternative vector container that doesn't specialize bool, such as the one in the Boost Container

March 2, 2022

Binary Search with a Predicate

Assume we have a bounded set of integers D in [L, R], and a predicate P s.t. P(x) == true if and only if x >= K for some K in D.

We can then use binary search to find K = min { x in D : P(x) is true }

[Algorithm 1]
while(l<=r){

m = (l+r)/2;

if(P(m)){ r = m-1; } // m large enough

else{ l = m+1; } // m too small

}

return l;


This might look obvious at the first glance, but it is not. For instance, when P is true for x <= K and we want to find K = max { x in D : P(x) is true } then we need to round m towards r, so we choose m = (l+r+1)/2;. Also the choice of <= instead of < is important here; consider L==0, R==2, and K==1. Finally, we choose to return l because r is eventually K-1 all the time.

I first saw this on leetcode problem about splitting an array with minimum largest partial sum (i.e., finding min_{P}(max {S(I) : I in P}), where P is a partition of an array into intervals and S(I) is a sum of all elements in an interval I.)

March 31, 2022

A Simple Partition Algorithm

Let C be a container and left, right, pivot be iterators of C. We partition the range [left, right) with a pivot value p = *pivot and a predicate Pred, so that after partitioning we get an iterator it such that Pred(e) == true for all e in [left, it), and P(e) == false for all e in [it, right)

[Algorithm 2]

iterator partition(iterator left, iterator right, iterator pivot){

iterator last_elt = right; --last_elt; // iterator to the last element (right == past-the-end of the range)

swap(*pivot, *last_elt); // move the pivot to the end

iterator store_idx = left; // start filling from the left

for(iterator it = left; it != right; ++it){ // for each element in the range

if(Pred(it)){ // If P is true,

swap(*(store_idx++), *it); // swap and move forward

}

}

swap(*store_idx, *last_elt); // bring the pivot value to the proper place

return store_idx; // return the iterator to the pivot value

}

Obviously, the time complexity of the algorithm is O(N), where N is the number of elements in the range [left, right).

This is called a Lumoto partition scheme, and can be used for implementing the quickselect algorithm.

April 9, 2022


Morris Traversal Algorithm

Let T be a binary tree, where each node has pointers to its child nodes, but no parent nodes. Morris Traversal is an algorithm that traverses T with O(1) additional memory, by modifying T during the traversal. To be more precise, it uses the fact that leaf nodes have null children, and use those places to save parent nodes. Thus, although the tree remains the same after completing the traversal, it cannot be used when nodes of T are not allowed to be modified.

If curr is a node and L is its left subtree with root l (i.e., l=curr->left), let Lr be the rightmost node of L (i.e., Lr = l->right->right->...->right). Note that Lr->right == nullptr. We save n as a right child of Lr (i.e., Lr->right = curr) [A]. Intuitively, it makes sense because during in-order traversal, we traverse L, then from the rightmost child of L (Lr), the next node should be l's parent node curr.

Therefore, with this strategy, we only need to go ->right after we visit a node in order [B].

When we finished searching the left subtree, its rightmost child's right child will be its parent node, and we need to detect it [C].

Say, we need to do some task f at the current node curr. In fact, Morris traversal can apply two different tasks f1 at nodes without left child, and f2 at nodes with left child.

Then the implementation of the algorithm is as follows:


// Recall: L = left subtree of curr, l = root of L, and Lr = rightmost child of L.


while(curr){ // Morris Traversal

if(!curr->left){ // no unvisited nodes on the left (curr is the minimum of remaining nodes)

f1(curr); // f1 for every node without left child

curr = curr->right; // [B] only needs to go ->right after visiting a node

}else{

tmp = curr->left; // tmp = l = root of L

while(tmp->right && tmp->right != curr){ // [A] find the rightmost child (Lr) of L

tmp = tmp->right;

}

if(tmp->right == curr){ // [C] this means we finished searching the left subtree L,

// and came back to curr. It's time to search the right subtree

tmp->right = nullptr; // undo the modification (unlink curr from Lr)

f2(curr); // f2 for every node with left child

curr = curr->right; // [B] only needs to go ->right after visiting a node


}else{ // [A], save the parent node, and search the left subtree

tmp->right = curr;

curr = curr->left;

}

}

}

It is clear that the algorithm only requires O(1) extra memory. How about the time complexity? Notice that each edge in the tree is visited at most three times.

For instance, in the following example,

a

/ \

b c

/ \ / \

d e f g

/ \ / \ / \ / \

* * * * * * * * // * is nullptr

edge b-d :

  1. while finding d, the rightmost child of the left subtree of b [A],

  2. after visiting d, while checking if b is a parent node with a left child [C].

edge b-e :

  1. while finding e, the rightmost child of the left subtree of a [A],

  2. after visiting b, to visit e [B],

  3. after visiting e, while checking if a is a parent node with a left child [C].

edge e-* :

  1. while checking if e has left child.


Since there are 2n edges in the binary tree with n nodes, the time complexity of the algorithm is O(n).


April 22, 2022

[C++] Structured Binding Declaration

Structured binding declaration is supported since C++ 17. That is, one can do the following:

pair<int, int> pr{ 1, 2 };

auto [k, v] = pr; // works for tuples, too


int a[3] = { 3, 4, 5 }; // but doesn't work for vectors

auto& [a1, a2, a3] = a; // Note the use of & here

If this doesn't compile in your Visual Studio, check the project setting (Right click the project - Properties - Configuration Properties - General - C++ Language Standard.)

This is very useful when you need a range-based for loop with (unordered) maps:

map<KeyType, ValueType> m;

for (const auto& [key, val] : m) {

cout << "key: " << key << ", value = " << val << ".\n";

}

For more detail, refer to this page.

May 15, 2022

Fenwick Tree

TBU. See https://en.wikipedia.org/wiki/Fenwick_tree

Oct 2022