Compare UTF-8 characters

Issue

Here is a parsing function:

double transform_units(double val, const char* s)
{
    cout << s[0];
    if (s[0] == 'm') return val * 1e-3;
    else if (s[0] == 'µ') return val * 1e-6;
    else if (s[0] == 'n') return val * 1e-9;
    else if (s[0] == 'p') return val * 1e-12;
    else return val;
}

In the line with ‘µ’ I’m getting the warning:

warning: multi-character character constant [-Wmultichar]

and the character ‘µ’ is not being caught.

How to compare multibyte characters?

Edit: a dirty workaround is to check whether s[0] is less than zero. As Giacomo mentioned, the character is encoded as 0xCE 0xBC; both of these bytes are greater than 127, so they are negative when char is signed. It works for me.

Solution

How to compare multibyte characters?

You can compare a Unicode code point that is encoded as multiple bytes (more generally, multiple code units) by comparing multiple bytes. s[0] is a single char, which is one byte in size, and so cannot hold a multi-byte sequence by itself.
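To make the size mismatch concrete, here is a small sketch. It assumes the input is UTF-8 and spells the micro sign (U+00B5) as explicit bytes so that the result does not depend on the source file's encoding. The names kMicro, micro_length and has_micro_lead_byte are made up for this illustration.

```cpp
#include <cassert>
#include <cstring>

// UTF-8 encoding of U+00B5 MICRO SIGN, written as explicit bytes.
static const char kMicro[] = "\xC2\xB5";

// Two bytes for the character plus the terminating NUL.
static_assert(sizeof(kMicro) == 3, "the micro sign occupies two chars in UTF-8");

// Number of char code units in the encoded character.
std::size_t micro_length()
{
    return std::strlen(kMicro);
}

// Looks at s[0] only. This matches the first byte of the micro sign, but
// also the first byte of every other character whose encoding starts with
// 0xC2, so a single char cannot identify the character on its own.
bool has_micro_lead_byte(const char* s)
{
    assert(s != nullptr);
    return static_cast<unsigned char>(s[0]) == 0xC2;
}
```

The cast to unsigned char avoids relying on whether plain char is signed on the target platform.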

This may work, provided the source file and the input string use the same encoding (for example UTF-8): std::strncmp(s, "µ", std::strlen("µ")) == 0.
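As a rough sketch of how that comparison could fit into the function from the question: the version below assumes the input is UTF-8 and writes the prefixes as explicit byte escapes instead of a literal µ. It accepts both the micro sign U+00B5 (bytes 0xC2 0xB5) and the Greek small letter mu U+03BC (bytes 0xCE 0xBC, the ones mentioned in the question's edit), since the two look alike and either may show up in input. The starts_with helper is a hypothetical name introduced here, not a standard function.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true if s begins with the bytes of prefix.
// strncmp stops at the first difference or at a NUL, so a short s is safe.
static bool starts_with(const char* s, const char* prefix)
{
    return std::strncmp(s, prefix, std::strlen(prefix)) == 0;
}

double transform_units(double val, const char* s)
{
    assert(s != nullptr);

    // Assumption: s is UTF-8. Both spellings of "mu" are accepted.
    static const char micro_sign[] = "\xC2\xB5"; // U+00B5 MICRO SIGN
    static const char greek_mu[]   = "\xCE\xBC"; // U+03BC GREEK SMALL LETTER MU

    if (s[0] == 'm') return val * 1e-3;
    if (starts_with(s, micro_sign) || starts_with(s, greek_mu)) return val * 1e-6;
    if (s[0] == 'n') return val * 1e-9;
    if (s[0] == 'p') return val * 1e-12;
    return val; // no recognised prefix
}
```

For example, transform_units(1.0, "\xC2\xB5s") takes the 1e-6 branch, while transform_units(1.0, "s") returns the value unchanged. The single-char comparisons are kept for the ASCII prefixes because those really do fit in one byte.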

Answered By – eerorika

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
