Package: audubon 0.5.2

audubon: Japanese Text Processing Tools

A collection of Japanese text processing tools for filling Japanese iteration marks, converting between Japanese character types, segmenting text by phrase, and normalizing text according to the rules used for the 'Sudachi' morphological analyzer and for 'NEologd' (Neologism dictionary for 'MeCab'). These features are specific to Japanese and are not implemented in 'ICU' (International Components for Unicode).

Authors: Akiru Kato [cre, aut], Koki Takahashi [cph], Shuhei Iitsuka [cph], Taku Kudo [cph]

audubon_0.5.2.tar.gz
audubon_0.5.2.zip (r-4.5), audubon_0.5.2.zip (r-4.4), audubon_0.5.2.zip (r-4.3)
audubon_0.5.2.tgz (r-4.4-any), audubon_0.5.2.tgz (r-4.3-any)
audubon_0.5.2.tar.gz (r-4.5-noble), audubon_0.5.2.tar.gz (r-4.4-noble)
audubon_0.5.2.tgz (r-4.4-emscripten), audubon_0.5.2.tgz (r-4.3-emscripten)
audubon.pdf | audubon.html
audubon/json (API)
NEWS

# Install 'audubon' in R:
install.packages('audubon', repos = c('https://paithiov909.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/paithiov909/audubon/issues

Pkgdown site: https://paithiov909.github.io

Datasets:
  • hiroba - Whole tokens of 'Porano no Hiroba' written by Miyazawa Kenji from Aozora Bunko
  • polano - Whole text of 'Porano no Hiroba' written by Miyazawa Kenji from Aozora Bunko

On CRAN:

japanese, javascript

5.50 score, 10 stars, 1 package, 3 scripts, 677 downloads, 20 exports, 38 dependencies

Last updated 2 months ago from: 5f69e56564. Checks: 3 OK, 4 NOTE. Indexed: yes.

Target            Result   Latest binary
Doc / Vignettes   OK       Dec 26 2024
R-4.5-win         OK       Dec 26 2024
R-4.5-linux       OK       Dec 26 2024
R-4.4-win         NOTE     Dec 26 2024
R-4.4-mac         NOTE     Dec 26 2024
R-4.3-win         NOTE     Dec 26 2024
R-4.3-mac         NOTE     Dec 26 2024

Exports: bind_lr, bind_tf_idf2, collapse_tokens, get_dict_features, lex_density, mute_tokens, ngram_tokenizer, pack, prettify, read_rewrite_def, strj_fill_iter_marks, strj_hiraganize, strj_katakanize, strj_normalize, strj_rewrite_as_def, strj_romanize, strj_segment, strj_tinyseg, strj_tokenize, strj_transcribe_num

Dependencies: bit, bit64, cachem, cli, clipr, cpp11, crayon, curl, dplyr, fansi, fastmap, generics, glue, hms, jsonlite, lattice, lifecycle, magrittr, Matrix, memoise, pillar, pkgconfig, prettyunits, progress, purrr, R6, Rcpp, readr, rlang, stringi, tibble, tidyselect, tzdb, utf8, V8, vctrs, vroom, withr