Passing a pointer into methods in C++ resulting in weird output

Question

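I am trying to convert a string to upper case by passing a pointer to a toUpper method I wrote. The logic seems fine, but I get weird output like ìëïà. Any ideas where I went wrong?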

#include <iostream>
#include <string.h> 

using namespace std;

void toUpper(char *);

int main()
{
    char name[80];
    char *namePtr = name;

    cout << "Enter a name :";
    cin >> name;

    toUpper(namePtr);
    cout << "The string in Upper Case is: " << name << endl;


}

void toUpper(char *p)
{

    int asciiValue; 

    // Loop through each char in the string
    for(int i = 0 ; i < strlen(p); i++)
    {
        
        asciiValue = (int) p[i];

        if(asciiValue >= 97 && asciiValue <= 122)
        {
            asciiValue = asciiValue + 32;
            p[i] = asciiValue;
        }
    }

}

Answer 1

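Your problem boils down to wrong magic numbers, which are almost impossible to spot even on close inspection, precisely because they are magic numbers!

Instead, I would use character literals to make things obvious: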

if(asciiValue >= 'a' && asciiValue <= 'z')
{
    asciiValue = asciiValue + ('a' - 'A');
    p[i] = asciiValue;
}

Now it should be obvious that you are adding the wrong value! It should be:

asciiValue = asciiValue + ('A' - 'a');
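
Putting that fix back into the question's function gives roughly the following sketch. It still assumes an ASCII-compatible encoding where 'a'..'z' are contiguous; hoisting the std::strlen call out of the loop is a small extra change that is not part of the original code.

```cpp
#include <cstring>

// Corrected version of the question's toUpper, using character literals.
// Assumption: ASCII-compatible encoding, so 'A' - 'a' evaluates to -32.
void toUpper(char *p)
{
    const std::size_t len = std::strlen(p);   // measure the string once

    for (std::size_t i = 0; i < len; ++i)
    {
        if (p[i] >= 'a' && p[i] <= 'z')
        {
            // Moving from lower case to upper case means going DOWN in code value.
            p[i] = static_cast<char>(p[i] + ('A' - 'a'));
        }
    }
}
```

Characters outside 'a'..'z' (digits, punctuation, letters that are already upper case) are left untouched.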

Answer 2

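Your code is not portable

This is non-portable code. It is only guaranteed to work with ASCII encoding. Nevertheless, here is the corresponding non-portable solution: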

       asciiValue = asciiValue - 32;   // - just move in the other direction

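How to do it better?

Here are some issues with your current code:

- The difference between upper-case and lower-case letters is not always 32. In EBCDIC, for example, the difference for ordinary letters is +64 rather than -32.
- The boundaries of the lower-case range may differ as well.
- For languages that use a non-ASCII locale, special characters may lie outside the range of the plain letters. See for example ISO-8859-1, which also has lower-case letters in the range 224 to 255, with one exception (247, the division sign).
- In some encodings, different groups of lower-case letters even follow different rules. Take ISO 8859-3: the difference between 'Ŭ' and 'ŭ' is -32, but the difference between 'Ż' and 'ż' is -16.
- Finally, char is not guaranteed to be unsigned. If you combine ISO-8859-1 encoding with a compiler that treats char as signed, your whole comparison logic may fail completely.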

So a safer approach is to use islower() and toupper(), which take the locale into account.

As a side benefit, this also lets you use the templated versions of these functions or the wide-character versions, which eases migration to fully Unicode-compatible code.
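
To illustrate the signedness point from the list above, here is a minimal sketch. The helper names codeOf and upperChar are made up for this example, and it assumes 8-bit chars. On a compiler where char is signed, the byte 0xE9 ('é' in ISO-8859-1) converts to a negative int, so it can never satisfy a range test such as >= 224. Going through unsigned char first avoids that, and it is also what the <cctype> functions expect.

```cpp
#include <cctype>

// Hypothetical helper: the numeric code of a char, always read as unsigned
// (0..255 with 8-bit chars), whether or not plain char is signed.
int codeOf(char c)
{
    return static_cast<unsigned char>(c);
}

// Hypothetical helper: upper-cases a single char with std::toupper.
// The cast keeps the argument non-negative, as <cctype> requires.
char upperChar(char c)
{
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}
```

Which non-ASCII characters std::toupper actually converts depends on the active C locale; in the default "C" locale only 'a'..'z' change.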

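Why not use a real string?

If someone types a name of 80 or more characters, your code risks a buffer overflow. You would need to make sure that cin does not read more characters than the array can hold. But rather than explaining how to do that, I suggest using the safer std::string instead: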

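#include <cctype>
#include <iostream>
#include <string>

using namespace std;

void toUpper(string &s)
{
    for (auto &p : s)           // Loop through each char in the string
    {
        // The <cctype> functions need a non-negative value, hence unsigned char
        const auto c = static_cast<unsigned char>(p);
        if (islower(c))
            p = static_cast<char>(toupper(c));
    }
}

int main()
{
    string name;                // grows as needed, so no fixed 80-char limit
    cout << "Enter a name :";
    cin >> name;
    toUpper(name);
    cout << "The string in Upper Case is: " << name << endl;
}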


Answer 3

asciiValue = asciiValue - 32;

Use minus instead of plus. For example, the ASCII value of 'a' is 97.

97 - 32 is 65, which is the ASCII value of upper-case 'A'.

Answer 4

It seems that in this if statement

    if(asciiValue >= 97 && asciiValue <= 122)
    {
        asciiValue = asciiValue + 32;
        p[i] = asciiValue;
    }

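you are checking whether the current symbol is a lower-case ASCII symbol.

But the codes of lower-case ASCII symbols are higher than the codes of upper-case ASCII symbols.

So instead of adding the magic number 32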

        asciiValue = asciiValue + 32

you have to subtract it

        asciiValue = asciiValue - 32

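For example, the lower-case ASCII symbol 'a' has code 97, while the upper-case symbol 'A' has code 65.

In any case, your approach with magic numbers is bad because, for example, it does not work with the EBCDIC representation of symbols.

Calling strlen in the loop condition is also inefficient, since the string may be rescanned on every iteration.

And it would be much better if the function returned a pointer to the converted string.

The function can be declared and implemented the following way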

#include <cctype>

//...

char * toUpper( char *s )
{
    for ( char *p = s; *p; ++p )
    {
        if ( std::islower( static_cast<unsigned char>( *p ) ) )
        {
            *p = std::toupper( static_cast<unsigned char>( *p ) );
        }
    }

    return s;
}
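
As a small illustration of why returning the pointer is convenient, the call can then be nested inside a larger expression. The sketch below repeats the function so that it compiles on its own; equalsUpper is a hypothetical helper invented for this example.

```cpp
#include <cctype>
#include <cstring>

// Same function as above, repeated so this sketch is self-contained.
char *toUpper(char *s)
{
    for (char *p = s; *p; ++p)
    {
        *p = static_cast<char>(std::toupper(static_cast<unsigned char>(*p)));
    }
    return s;
}

// Hypothetical helper: upper-cases s in place and compares it with expected.
// The nested call works only because toUpper returns its argument.
bool equalsUpper(char *s, const char *expected)
{
    return std::strcmp(toUpper(s), expected) == 0;
}
```

The same property allows writing std::cout << toUpper(name) in a single statement.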

Answer 5

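A more portable C++ solution is to use std::transform. For example, to convert a string to lower case:

// Requires <algorithm>, <cctype>, <iostream> and <string>
std::string shouting = "AM I SHOUTING";
std::transform(shouting.begin(), shouting.end(), shouting.begin(),
               [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
std::cout << shouting << "\n";

This solution does not depend on ASCII encoding and works for any character set that std::tolower handles. For the upper-case conversion the question asks about, substitute std::toupper.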
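
Since the question is about upper case, the same idea with std::toupper might look like the sketch below. The function name toUpperCopy is an assumption made for this example; it takes the string by value and returns the converted copy, leaving the caller's string unchanged.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns an upper-cased copy of s.
// The lambda takes unsigned char so std::toupper never sees a negative value.
std::string toUpperCopy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return s;
}
```

Wrapping std::toupper in a lambda also avoids the overload ambiguity that can occur when the bare function name is passed to std::transform while <locale> is included.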
