NEWS
audubon 0.6.3 (2026-04-22)
- Modified some examples and tests to address additional issues on CRAN.
- Japanese locale-dependent examples are no longer run on CRAN.
- Simplified the locale handling for phrase-based wrapping.
audubon 0.6.2 (2026-01-09)
- Modified some examples to address additional issues on CRAN. There are no user-facing changes.
audubon 0.6.1 (2025-12-21)
New features
- Added label_wrap_jp() and label_wrap_jp_gen() for Japanese word wrapping in ggplot2 labellers.
- Added label_date_jp() and label_date_jp_gen() for Japanese calendar date labels in ggplot2.
- Added strj_parse_date() to parse Japanese calendar date strings into POSIXct values.
Changes
- Removed mecab and sudachipy engines and related arguments from strj_tokenize().
- Removed functions overlapping with those provided by the gibasa package. Users requiring morphological analysis or related features should use gibasa.
Other
- Performed internal refactoring and maintenance improvements.
audubon 0.5.2 (2024-04-27)
- Corrected probabilistic IDF calculation by global_idf3.
- Refactored bind_tf_idf2.
- Changed behavior when norm=TRUE. Cosine normalization is now performed on tf_idf values as in the RMeCab package.
- Added tf="itf" and idf="df" options.
- Refactored pack for performance.
audubon 0.5.1 (2023-05-02)
- Refactored tokenize_mecab and tokenize_sudachipy.
audubon 0.5.0 (2023-03-04)
- Added the bind_lr function, which calculates the 'LR' value of bigrams.
- pack now always returns a tibble, not a data.frame.
audubon 0.4.0 (2022-12-15)
- Added some new functions.
- bind_tf_idf2 calculates the term frequency, inverse document frequency, and tf-idf of a tidy text dataset and binds them to it.
- collapse_tokens, mute_tokens, and lexical_density can be used for handling a tidy text dataset of tokens.
- strj_tokenize now preserves the original order of text names.
- prettify now accepts a delim argument.
audubon 0.3.0 (2022-07-22)
- Updated strj_fill_iter_mark function.
- strj_fill_iter_mark now replaces a sequence of iteration marks recursively.
- Updated strj_tokenize function.
- strj_tokenize now accepts an engine argument to switch the tokenizer used for splitting text into tokens.
audubon 0.2.0 (2022-05-24)
- Updated ngram_tokenizer function.
- Added a wrapper function for 'TinySegmenter', written by Taku Kudo.
audubon 0.1.2 (2022-04-02)
- Updated pack function.
- Switched the argument order of the pack function. pack now accepts pull as its second argument and n as its third argument.
- pull can now accept a symbol.
audubon 0.1.1 (2022-02-14)
audubon 0.1.0
- Relicensed as Apache License, Version 2.0.
- Added a NEWS.md file to track changes to the package.