Skip to main content Link Search Menu Expand Document (external link)

Slides presented at 2016 Workshop über Algorithmen und Komplexität (Theorietag): "On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching"

DOI arXiv slides pdf

Abstract

We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with pp processors. Given a static text of length nn, we first show how to compute the suffix array interval of a given pattern of length mm in O(mp+lgp+lglgplglgn)O(\frac{m}{p} + \lg p + \lg\lg p \cdot \lg\lg n) time for pmp \leq m. For approximate pattern matching with kk differences or mismatches, we show how to compute all occurrences of a given pattern in O(mkσkpmax{k,lglgn}+(1+m/p)lgplglgn+occ)O(\frac{m^k\sigma^k}{p} \max\{k, \lg\lg n\} + (1+m/p)\lg p \cdot \lg\lg n + occ) time, where σ\sigma is the size of the alphabet and pσkmkp \leq \sigma^k m^k. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: Given the suffix array intervals for two patterns PP and PP', we present a data structure for computing the interval of PPPP' in O(lglgn)O(\lg\lg n) sequential time, or in O(1+lgplgn)O(1 + \lg_p\lg n) parallel time. All our data structures are of size O(n)O(n) bits (in addition to the suffix array).

Preview

Preview of the slides.

Download