Florian Kurpicz
Presentation at ALENEX 2018

Simple, Fast and Lightweight Parallel Wavelet Tree Construction

Today, I presented our results on wavelet tree and wavelet matrix construction, which are based on our novel approach: bottom-up construction. This approach reduces the number of text accesses during the computation, because all the auxiliary information needed can be computed from a single scan. The slides are heavily animated and are available as a handout. You can find the paper here (DOI).

Slides in Handout Mode

Slide image alenex_2018_slides_handout-01.jpg
Slide image alenex_2018_slides_handout-02.jpg
Slide image alenex_2018_slides_handout-03.jpg
Slide image alenex_2018_slides_handout-04.jpg
Slide image alenex_2018_slides_handout-05.jpg
Slide image alenex_2018_slides_handout-06.jpg
Slide image alenex_2018_slides_handout-07.jpg
Slide image alenex_2018_slides_handout-08.jpg
Slide image alenex_2018_slides_handout-09.jpg
Slide image alenex_2018_slides_handout-10.jpg
Slide image alenex_2018_slides_handout-11.jpg
Slide image alenex_2018_slides_handout-12.jpg
Slide image alenex_2018_slides_handout-13.jpg
Slide image alenex_2018_slides_handout-14.jpg
Slide image alenex_2018_slides_handout-15.jpg
Slide image alenex_2018_slides_handout-16.jpg
Slide image alenex_2018_slides_handout-17.jpg
Slide image alenex_2018_slides_handout-18.jpg
Slide image alenex_2018_slides_handout-19.jpg
Slide image alenex_2018_slides_handout-20.jpg

Abstract

The wavelet tree (Grossi et al. [SODA, 2003]) and wavelet matrix (Claude et al. [Inf. Syst., 47:15–32, 2015]) are compact indices for texts over an alphabet [0, σ) that support rank, select and access queries in O(lg σ) time. We first present new practical sequential and parallel algorithms for wavelet tree construction. Their unifying characteristic is that they construct the wavelet tree bottom-up, i.e., they compute the last level first. We also show that this bottom-up construction can easily be adapted to wavelet matrices. In practice, our best sequential algorithm is up to twice as fast as the currently fastest sequential wavelet tree construction algorithm (Shun [DCC, 2015]), simultaneously saving a factor of 2 in space. This scales up to 32 cores, where we are about as fast as the currently fastest parallel wavelet tree construction algorithm (Labeit et al. [DCC, 2016]), but still use only about 75% of the space. An additional theoretical result shows how to adapt any wavelet tree construction algorithm to the wavelet matrix in the same (asymptotic) time, using only little extra space.
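
To make the bottom-up idea more concrete, here is a minimal sequential sketch in Python. It is only an illustration under my own assumptions, not the implementation from the paper: the function name `build_wavelet_tree_bottom_up`, the level-wise list-of-lists representation of the bit vectors, and the use of plain Python lists instead of packed bit vectors are all hypothetical choices. The sketch counts the characters in a single scan (filling the first level along the way), then walks from the last level up to the second, deriving each level's histogram from the one below it by adding neighbouring entries.

```python
def build_wavelet_tree_bottom_up(text, sigma):
    """Return the level-wise bit vectors of a wavelet tree for `text`.

    `text` is a sequence of integers in [0, sigma). Names and the data
    layout are illustrative assumptions, not the paper's implementation.
    """
    levels = max(1, (sigma - 1).bit_length())
    n = len(text)
    bit_vectors = [[0] * n for _ in range(levels)]

    # Single scan: character histogram and the first level (most
    # significant bit of every character, in text order).
    hist = [0] * (1 << levels)
    for i, c in enumerate(text):
        hist[c] += 1
        bit_vectors[0][i] = (c >> (levels - 1)) & 1

    # Bottom-up: last level first, then the levels above it.
    for level in range(levels - 1, 0, -1):
        width = 1 << level
        # Histogram of bit prefixes of length `level`, obtained from the
        # finer histogram without looking at the text again.
        hist = [hist[2 * p] + hist[2 * p + 1] for p in range(width)]

        # Exclusive prefix sum: start position of every prefix's interval.
        borders = [0] * width
        for p in range(1, width):
            borders[p] = borders[p - 1] + hist[p - 1]

        # Place the bit for this level at the next free slot of the
        # interval that belongs to the character's prefix.
        shift = levels - level
        for c in text:
            prefix = c >> shift
            pos = borders[prefix]
            borders[prefix] += 1
            bit_vectors[level][pos] = (c >> (shift - 1)) & 1

    return bit_vectors


if __name__ == "__main__":
    for level, bits in enumerate(build_wavelet_tree_bottom_up([0, 1, 3, 7, 1, 5, 4, 2, 6, 3], 8)):
        print(level, bits)
```

Note that this sketch still reads the text once per level to place the bits; what the bottom-up order saves is the recounting, since every histogram above the last level comes from the one below it.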