Package: ldccr 2025.02.02
ldccr: Utilities for Various Japanese Corpora
The goal of the ldccr package is to make Japanese language resources easy to use. It provides parsers for several free or openly licensed Japanese corpora, and a downloader for zipped text files published on Aozora Bunko.
Authors:
ldccr_2025.02.02.tar.gz
ldccr_2025.02.02.zip (r-4.5), ldccr_2025.02.02.zip (r-4.4), ldccr_2025.02.02.zip (r-4.3)
ldccr_2025.02.02.tgz (r-4.5-x86_64), ldccr_2025.02.02.tgz (r-4.5-arm64), ldccr_2025.02.02.tgz (r-4.4-x86_64), ldccr_2025.02.02.tgz (r-4.4-arm64), ldccr_2025.02.02.tgz (r-4.3-x86_64), ldccr_2025.02.02.tgz (r-4.3-arm64)
ldccr_2025.02.02.tar.gz (r-4.5-noble), ldccr_2025.02.02.tar.gz (r-4.4-noble)
ldccr_2025.02.02.tgz (r-4.4-emscripten), ldccr_2025.02.02.tgz (r-4.3-emscripten)
ldccr.pdf | ldccr.html
ldccr/json (API)
# Install 'ldccr' in R:
install.packages('ldccr', repos = c('https://paithiov909.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/paithiov909/ldccr/issues
- AozoraBunkoSnapshot - Meta data of text files published on Aozora Bunko
- NekoText - Whole text of ‘Wagahai Wa Neko Dearu’ written by Natsume Souseki from Aozora Bunko
Last updated 20 days ago from: 82bcbc7bfe. Checks: 5 OK, 6 NOTE. Indexed: yes.
Target | Result | Latest binary |
---|---|---|
Doc / Vignettes | OK | Feb 02 2025 |
R-4.5-win-x86_64 | OK | Feb 02 2025 |
R-4.5-mac-x86_64 | OK | Feb 02 2025 |
R-4.5-mac-aarch64 | OK | Feb 02 2025 |
R-4.5-linux-x86_64 | OK | Feb 02 2025 |
R-4.4-win-x86_64 | NOTE | Feb 02 2025 |
R-4.4-mac-x86_64 | NOTE | Feb 02 2025 |
R-4.4-mac-aarch64 | NOTE | Feb 02 2025 |
R-4.3-win-x86_64 | NOTE | Feb 02 2025 |
R-4.3-mac-x86_64 | NOTE | Feb 02 2025 |
R-4.3-mac-aarch64 | NOTE | Feb 02 2025 |
Exports: clean_emoji, clean_url, download_unidic, is_within_era, jrte_rte_files, ldnws_categories, parse_jrte_reasoning, parse_to_jdate, read_aozora, read_ja_text8, read_jrte, read_ldnws, sqids, unidic_availables, unsqids
Dependencies: bit, bit64, cachem, cli, clipr, cpp11, crayon, dplyr, fansi, fastmap, generics, glue, hms, lifecycle, magrittr, memoise, pillar, pkgconfig, prettyunits, progress, purrr, R6, Rcpp, RcppSimdJson, readr, rlang, stringi, tibble, tidyselect, tzdb, utf8, vctrs, vroom, withr, yesno
Readme and manuals
Help Manual
Help page | Topics |
---|---|
Meta data of text files published on Aozora Bunko | AozoraBunkoSnapshot |
Data for Textual Entailment | jrte_rte_files |
List of categories of the Livedoor News Corpus | ldnws_categories |
Whole text of ‘Wagahai Wa Neko Dearu’ written by Natsume Souseki from Aozora Bunko | NekoText |
Parse reasoning column of 'rte.*.tsv' | parse_jrte_reasoning |
Download text file from Aozora Bunko | read_aozora |
Read the ja.text8 corpus | read_ja_text8 |
Read the JRTE Corpus | read_jrte |
Read the Livedoor News Corpus | read_ldnws |
Generate random-looking IDs from integer ranks | sqids unsqids |
Download and unzip 'UniDic' | download_unidic unidic_availables |
Utility functions | clean_emoji clean_url is_within_era parse_to_jdate utils |