Package: AhoCorasickTrie 0.1.3

AhoCorasickTrie: Fast Searching for Multiple Keywords in Multiple Texts

Aho-Corasick is an optimal algorithm for finding many keywords in a text. It can locate all matches in a text in O(N+M) time, plus time proportional to the number of matches reported; i.e., the time needed scales linearly with the combined length of the keywords (N) and the size of the text (M). Compare this to the naive approach, which takes O(N*M) time to loop through each keyword and scan for it in the text. This implementation builds the trie (the generic name of the data structure) and runs the search in a single function call. To search multiple texts with the same trie, pass a list or vector of texts; the function returns a list of matches for each text. By default, all 128 ASCII characters are allowed in both the keywords and the text. A more efficient trie is possible if the alphabet size can be reduced. For example, DNA sequences use at most 19 distinct characters and usually only 4; protein sequences use at most 26 distinct characters and usually only 20. UTF-8 (Unicode) matching is not currently supported.
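As a rough illustration of the description above, the sketch below contrasts a naive per-keyword scan written in base R with a call to the package. The keywords, texts, and the helper `naive_search` are hypothetical and exist only for this example. The argument order (keywords first, then texts) and the `alphabet = "nucleicacid"` value are assumptions based on the package manual and should be checked with `?AhoCorasickSearch`; the package calls run only if the package is installed.

```r
# Hypothetical example data (not taken from the package documentation).
keywords <- c("Abra", "cadabra", "the", "Magic")
texts <- c(first  = "Is Abracadabra the Magic Word?",
           second = "No keyword here")

# Naive baseline in base R: one pass over the text per keyword.
# Note: gregexpr() reports only non-overlapping occurrences of each keyword.
naive_search <- function(keywords, text) {
  hits <- lapply(keywords, function(k) {
    pos <- gregexpr(k, text, fixed = TRUE)[[1]]
    if (pos[1] == -1) return(NULL)  # keyword absent from this text
    data.frame(keyword = k, offset = as.integer(pos), stringsAsFactors = FALSE)
  })
  # Combine the per-keyword hits; the result is NULL when nothing matched.
  do.call(rbind, Filter(Negate(is.null), hits))
}

print(naive_search(keywords, texts[["first"]]))

# Package call, guarded so the script still runs without the package.
# Assumption: keywords come first, then a character vector of texts.
if (requireNamespace("AhoCorasickTrie", quietly = TRUE)) {
  matches <- AhoCorasickTrie::AhoCorasickSearch(keywords, texts)
  str(matches)  # inspect the returned list structure

  # Assumption: a reduced alphabet is selected via the `alphabet` argument.
  dna <- c(read1 = "ACGTTGCAACGT")
  str(AhoCorasickTrie::AhoCorasickSearch(c("ACGT", "TGCA"), dna,
                                         alphabet = "nucleicacid"))
}
```

On the first text, the base-R baseline reports "Abra" at offset 4, "cadabra" at 8, "the" at 16, and "Magic" at 20 (1-based positions). The baseline rescans the text once per keyword, which is the cost the trie-based search avoids.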

Authors: Matt Chambers [aut, cre], Tomas Petricek [aut, cph], Vanderbilt University [cph]

AhoCorasickTrie_0.1.3.tar.gz
AhoCorasickTrie_0.1.3.zip (r-4.5), AhoCorasickTrie_0.1.3.zip (r-4.4), AhoCorasickTrie_0.1.3.zip (r-4.3)
AhoCorasickTrie_0.1.3.tgz (r-4.5-x86_64), AhoCorasickTrie_0.1.3.tgz (r-4.5-arm64), AhoCorasickTrie_0.1.3.tgz (r-4.4-x86_64), AhoCorasickTrie_0.1.3.tgz (r-4.4-arm64), AhoCorasickTrie_0.1.3.tgz (r-4.3-x86_64), AhoCorasickTrie_0.1.3.tgz (r-4.3-arm64)
AhoCorasickTrie_0.1.3.tar.gz (r-4.5-noble), AhoCorasickTrie_0.1.3.tar.gz (r-4.4-noble)
AhoCorasickTrie_0.1.3.tgz (r-4.4-emscripten), AhoCorasickTrie_0.1.3.tgz (r-4.3-emscripten)
AhoCorasickTrie.pdf | AhoCorasickTrie.html
AhoCorasickTrie/json (API)

# Install 'AhoCorasickTrie' in R:
install.packages('AhoCorasickTrie', repos = c('https://chambm.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/chambm/ahocorasicktrie/issues

Uses libs:
  • c++: GNU Standard C++ Library v3

On CRAN:

cpp

4.65 score, 10 stars, 2 packages, 15 scripts, 283 downloads, 2 exports, 1 dependency

Last updated 16 days ago from: b3a09ae28e. Checks: 11 OK. Indexed: yes.

Target               Result   Latest binary
Doc / Vignettes      OK       Feb 05 2025
R-4.5-win-x86_64     OK       Feb 05 2025
R-4.5-mac-x86_64     OK       Feb 05 2025
R-4.5-mac-aarch64    OK       Feb 05 2025
R-4.5-linux-x86_64   OK       Feb 05 2025
R-4.4-win-x86_64     OK       Feb 05 2025
R-4.4-mac-x86_64     OK       Feb 05 2025
R-4.4-mac-aarch64    OK       Feb 05 2025
R-4.3-win-x86_64     OK       Feb 05 2025
R-4.3-mac-x86_64     OK       Feb 05 2025
R-4.3-mac-aarch64    OK       Feb 05 2025

Exports: AhoCorasickSearch, AhoCorasickSearchList

Dependencies: Rcpp