Bilingual (English and Portuguese) documentation in an R package

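I am writing a package to make it easier to import Brazilian socioeconomic microdata sets (Census, PNAD, etc.). I foresee two distinct groups of users for the package: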

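Is it possible to write the package so that its documentation is "bilingual" (English and Portuguese), with the language shown to each user depending on their country/language settings?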

Also, is this feasible within the roxygen2 documentation framework?

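I realize there is a tradeoff between making the package more user-friendly by being bilingual and the added complexity and maintenance burden. General comments on this tradeoff, based on previous experience, are also welcome.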

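EDIT: Following the suggestion in the comments, I cross-posted this question to the r-package-devel mailing list. See the thread HERE and follow the replies at the bottom. Duncan Murdoch posted an interesting answer that covers some of what @Brandon's answer (below) covers, and also includes two other suggestions that I found useful: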

According to rOpenSci, there is no standard mechanism for translating package documentation into non-English languages. They describe the typical process of internationalization/localization as follows:

To create non-English documentation requires manual creation of supplemental .Rd files or package vignettes.

Packages supplying non-English documentation should include a Language field in the DESCRIPTION file.
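As a rough illustration of how supplemental .Rd files might be wired up, here is a minimal sketch. Everything in it is an assumption made for the example: the helper names `doc_language()` and `doc_topic()`, the topic `read_census`, and the convention of giving Portuguese help pages a `_pt` suffix. None of this is something R or roxygen2 provides out of the box.

```r
# Hypothetical helper: guess a two-letter documentation language from the
# session's environment. Falls back to English when nothing matches.
doc_language <- function(locale = Sys.getenv("LANGUAGE", Sys.getenv("LANG")),
                         supported = c("en", "pt")) {
  code <- tolower(substr(locale, 1, 2))
  if (code %in% supported) code else "en"
}

# Hypothetical convention: Portuguese help pages live in extra .Rd files whose
# topic names carry a "_pt" suffix, e.g. read_census and read_census_pt.
doc_topic <- function(topic, lang = doc_language()) {
  if (lang == "en") topic else paste0(topic, "_", lang)
}

doc_topic("read_census", lang = doc_language("pt_BR.UTF-8"))
# "read_census_pt"
```

The price of this kind of scheme is that every help page exists twice and has to be kept in sync by hand, which is exactly the maintenance tradeoff raised in the question.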

There is also some more information about the Language field:

A ‘Language’ field can be used to indicate if the package documentation is not in English: this should be a comma-separated list of standard (not private use or grandfathered) IETF language tags as currently defined by RFC 5646 (https://www.rfc-editor.org/rfc/rfc5646, see also https://en.wikipedia.org/wiki/IETF_language_tag), i.e., use language subtags which in essence are 2-letter ISO 639-1 (https://en.wikipedia.org/wiki/ISO_639-1) or 3-letter ISO 639-3 (https://en.wikipedia.org/wiki/ISO_639-3) language codes.
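To make the Language field concrete, here is a small sketch that writes a made-up DESCRIPTION to a temporary file and reads the field back. The package name `brmicrodata` and all field values are invented for the example; `en` and `pt-BR` are the IETF tags for English and Brazilian Portuguese.

```r
# Write a minimal, made-up DESCRIPTION to a temporary file and read it back.
desc <- cbind(
  Package  = "brmicrodata",   # hypothetical package name
  Title    = "Import Brazilian Microdata",
  Version  = "0.1.0",
  Language = "en, pt-BR"      # comma-separated IETF language tags
)
path <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc, file = path)

# read.dcf() returns a character matrix with one column per requested field.
lang_field <- read.dcf(path, fields = "Language")[1, "Language"]
langs <- strsplit(lang_field, ",\\s*")[[1]]
langs
# "en" "pt-BR"
```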

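If your package contains non-ASCII text, and particularly if it is intended to be used in more than one locale, care is needed. It is possible to mark the encoding used in the DESCRIPTION file and in .Rd files.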

Regarding encoding...

First, consider carefully if you really need non-ASCII text. Many users of R will only be able to view correctly text in their native language group (e.g. Western European, Eastern European, Simplified Chinese) and ASCII. Other characters may not be rendered at all, rendered incorrectly, or cause your R code to give an error. For .Rd documentation, marking the encoding and including ASCII transliterations is likely to do a reasonable job. The set of characters which is commonly supported is wider than it used to be around 2000, but non-Latin alphabets (Greek, Russian, Georgian, …) are still often problematic and those with double-width characters (Chinese, Japanese, Korean) often need specialist fonts to render correctly.
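For Portuguese text in R code, one way to keep source files pure ASCII is to write accented characters as `\u` escapes. The sketch below shows that, plus a small check for non-ASCII content. The helper `has_non_ascii()` is a name made up for this example, and the transliteration result is deliberately not shown because it depends on the platform's iconv implementation.

```r
# "\u00e3" is the escape for a-tilde, so this source line is pure ASCII.
city <- "S\u00e3o Paulo"

# Hypothetical helper: TRUE for strings containing any non-ASCII character.
has_non_ascii <- function(x) {
  vapply(x, function(s) any(utf8ToInt(enc2utf8(s)) > 127L), logical(1),
         USE.NAMES = FALSE)
}

has_non_ascii(c(city, "Sao Paulo"))
# TRUE FALSE

# ASCII transliteration; the exact output varies between platforms.
iconv(city, from = "UTF-8", to = "ASCII//TRANSLIT")
```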

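On a related note, R does, however, provide support for "errors and warnings" in different languages: "There are mechanisms to translate the R- and C-level error and warning messages. They are only available if R is compiled with NLS support (which is requested by configure option --enable-nls, the default)."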
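A minimal sketch of what a translatable message could look like on the R side, using base R's `gettextf()`. The function `check_input_file()` and the domain `R-brmicrodata` are assumptions made for the example; an actual Portuguese message would also require a translation catalog shipped with the package, which is not shown here.

```r
# Hypothetical import helper that reports a missing file.
# "R-brmicrodata" is an assumed translation domain (R-<package name>).
check_input_file <- function(path) {
  if (!file.exists(path)) {
    stop(gettextf("file '%s' does not exist", path, domain = "R-brmicrodata"),
         call. = FALSE)
  }
  invisible(path)
}

# With no translation catalog installed, the English text is used as-is.
msg <- tryCatch(check_input_file("no/such/file.txt"),
                error = function(e) conditionMessage(e))
msg
# "file 'no/such/file.txt' does not exist"
```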

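Apart from bilingual documentation, please allow me the following comment: given your two "target" groups, it is safe to assume that some of your users will run a non-English OS (typically Windows in Portuguese). When importing time series data (or indeed any date entries), you may get different "results" (i.e. incorrect date entries) on English and non-English machines, because of the different "date" formats. I have some experience with these issues (I often work with Czech-based OSs) and, apart from ad hoc coding, I have found no simple solution. (Feel free to delete this if you find it off-topic.)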
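One partial mitigation, offered only as a sketch: when the raw files store dates as numbers, parsing them with an explicit numeric format avoids any dependence on the machine's locale. The helper name `parse_br_date()` and the day/month/year layout are assumptions for the example; this does not help with dates that contain month names.

```r
# Hypothetical helper for day/month/year strings such as "15/03/2010".
# Purely numeric format codes (%d, %m, %Y) do not depend on the locale,
# unlike %b or %B, which match month names in the current LC_TIME locale.
parse_br_date <- function(x) {
  as.Date(x, format = "%d/%m/%Y")
}

d <- parse_br_date(c("15/03/2010", "01/12/1999"))
format(d, "%Y-%m-%d")
# "2010-03-15" "1999-12-01"
```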