Detecting Chinese characters in a PHP string [duplicate]

This question already has answers here:
Php - regular expression to check if the string has chinese chars
(4 answers)
Closed 8 years ago.
I am trying to detect Chinese characters in a string in PHP. Currently I am trying this:
$bio = '全部都好有用架 無用的我都一早打入冷宮唔見哂(… 有意/想睇圖都歡迎留whatsapp or line: chibimasakishop 可綠線/將軍澳線交收';
if (preg_match('/[\x{4e00}-\x{9fa5}]+.*\-/u', $bio) === 1) {
var_dump('contains a chinese character');
}
Why isn't this working?

Try this:
if(preg_match("/\p{Han}+/u", $bio))
{
var_dump('contains a chinese character');
}
Reference: Php check if the string has Chinese chars
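As a minimal sketch, the check above can be wrapped in a small helper. The function name contains_han is hypothetical and not part of PHP. Note that \p{Han} matches the whole Han script, so it also matches Japanese kanji and Korean hanja, not only Chinese text.

```php
<?php
// Hypothetical helper: true if $text contains at least one Han character.
function contains_han(string $text): bool
{
    // \p{Han} matches any character in the Han script; the u modifier makes
    // PCRE treat both pattern and subject as UTF-8.
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/\p{Han}/u', $text) === 1;
}

var_dump(contains_han('可綠線/將軍澳線交收')); // bool(true)
var_dump(contains_han('whatsapp or line'));    // bool(false)
```

Comparing against 1 explicitly keeps an error result (false, for example on invalid UTF-8 input) from being mistaken for "no match".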

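In your case, I think the trailing "\-" is the problem: it requires a literal hyphen somewhere after the Chinese characters, and your string does not contain one. I think
/[\x{4e00}-\x{9fa5}]+.*/u
should work.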
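To see the effect of the trailing "\-", here is a short comparison of the two patterns. The sample string is only the tail of the question's $bio, used as an assumption to keep the example short.

```php
<?php
$withHyphen    = '/[\x{4e00}-\x{9fa5}]+.*\-/u'; // pattern from the question
$withoutHyphen = '/[\x{4e00}-\x{9fa5}]+.*/u';   // pattern from this answer

$sample = '可綠線/將軍澳線交收'; // tail of the question's $bio; contains no "-"

var_dump(preg_match($withHyphen, $sample));       // int(0): no hyphen to match
var_dump(preg_match($withHyphen, $sample . '-')); // int(1): hyphen now present
var_dump(preg_match($withoutHyphen, $sample));    // int(1)
```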

Related

How can I prevent unicode characters like this Ả̴̢̦̙̬̲̯̖̲̟̟̬̲̻̣̩͕͍̦͍̮̠̤͇̿́̾͋́̾̎̔̐̓̾̐̉͒̅͛̈́̀̇͋͋̔̕͘͝͝͝ on my site and why do they exist? [duplicate]

This question already has answers here:
How does Zalgo text work?
(2 answers)
What's up with these Unicode combining characters and how can we filter them?
(4 answers)
Closed 3 years ago.
How to prevent characters like this one on my website:
Ả̴̢̦̙̬̲̯̖̲̟̟̬̲̻̣̩͕͍̦͍̮̠̤͇̿́̾͋́̾̎̔̐̓̾̐̉͒̅͛̈́̀̇͋͋̔̕͘͝͝͝
They are really annoying. Ḧ̶̡̡̢͙͚̝̖͙͓̝̘̯̜̗͙̩͎̻̥̩͈͈͈̘̰͇̞͇͇̦̼̺̙̲͔́̿͌̀̅͊̌́͂̋̃̽̔̀̇̎̈̆́̽̇͂͘͘͜͝͝A̸̡̧̲̦͕̦̦̘̫͍̺͙̫͉̠͆̈́̅̚ͅͅḦ̴̪̱̠̦̜̩͒̃͌̎̇͌̒̍̒̇̾̀͑̂̆̉̓͌͘̚̚̕͜͝ͅA̶̻͐̔̍̃͆̆̓̿͋͊̽͝
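Replace all the Unicode characters outside of your desired range(s). The pattern below has no u modifier, so it works on bytes: it strips ASCII control characters and every non-ASCII byte, which removes the combining marks but also any legitimate accented or non-Latin text.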
$annoying_string = 'Ả̴̢̦̙̬̲̯̖̲̟̟̬̲̻̣̩͕͍̦͍̮̠̤͇̿́̾͋́̾̎̔̐̓̾̐̉͒̅͛̈́̀̇͋͋̔̕͘͝͝͝Ả̴̢̦̙̬̲̯̖̲̟̟̬̲̻̣̩͕͍̦͍̮̠̤͇̿́̾͋́̾̎̔̐̓̾̐̉͒̅͛̈́̀̇͋͋̔̕͘͝͝͝Ả̴̢̦̙̬̲̯̖̲̟̟̬̲̻̣̩͕͍̦͍̮̠̿́̾͋́̾̎̔̐̓̾̐̉͒̅͛̈́̀̇͋͋̔̕͘͝͝͝foobar̤͇';
$cleaned_string = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $annoying_string);
echo $cleaned_string; // AAAfoobar
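A narrower alternative, sketched under the assumption that you only want to drop combining marks and keep other non-ASCII text: match the Unicode "Mark" category instead of raw bytes. The helper name strip_combining_marks is hypothetical.

```php
<?php
// Hypothetical helper: removes Unicode combining marks and keeps everything else.
function strip_combining_marks(string $text): string
{
    // \p{M} matches any combining mark; the u modifier makes PCRE read UTF-8.
    $result = preg_replace('/\p{M}+/u', '', $text);

    // preg_replace() returns null on error (for example invalid UTF-8);
    // fall back to the unmodified input in that case.
    return $result ?? $text;
}

// "A" followed by three combining marks, then "foobar" followed by one more.
echo strip_combining_marks("A\u{0334}\u{0322}\u{0309}foobar\u{0324}"), "\n"; // Afoobar
```

This also strips marks from legitimately decomposed text, such as "e" followed by U+0301, and from scripts that rely on combining marks. Normalizing to NFC first with the intl extension's Normalizer class is one way to preserve common accented letters.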

Decode hex string to latin characters [duplicate]

This question already has answers here:
How to convert HTML entities like – to their character equivalents?
(5 answers)
Closed 7 years ago.
I have a string:
tomas
The decoded string is tomas.
I found that this is a hex encoding: http://www.codetable.net/unicodecharacters
Does PHP have a function to decode hex to Latin characters?
Those are not hex characters, but HTML entities.
PHP can decode them with html_entity_decode().
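A minimal sketch of that call. The entity string here is an assumption, since the question's original input is not preserved above; it uses numeric character references that spell "tomas".

```php
<?php
// Assumed input: decimal numeric character references spelling "tomas".
$encoded = '&#116;&#111;&#109;&#97;&#115;';
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');
var_dump($decoded); // string(5) "tomas"

// Hexadecimal references (&#x..;) are decoded the same way.
$hex = '&#x74;&#x6f;&#x6d;&#x61;&#x73;';
var_dump(html_entity_decode($hex, ENT_QUOTES | ENT_HTML5, 'UTF-8')); // string(5) "tomas"
```

Passing the flags and the charset explicitly keeps the result independent of the PHP version's defaults.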

PHP: Decoding JSON to Chinese and smileys does not work? [duplicate]

This question already has answers here:
Unicode character in PHP string
(8 answers)
Closed 8 years ago.
I am trying to decode JSON to Chinese:
json_decode('"\ud83d\ude18\ud83d\ude18\ud83d\ude18\ud83d\ude18\u597d\u5bb6\u4f19\ud83d\ude0d\ud83d\ude0d\ud83d\ude0d"');
This does not work.
It works for the Chinese characters but not for the smileys.
Can you please give me any ideas?
That is not a valid JSON string: JSON strings must be enclosed in double quotes.
Edit: I took the failing example above, wrapped the escape sequences in double quotes, and it decoded:
var_dump(json_decode('"\ud83c\udf83\ud83c\udf83\ud83c\udf83"'));
string(12) "🎃🎃🎃"
(I don't know what the glyphs should look like, and I don't even know if I have the right fonts installed, but the string decoded.)
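The same check can be run on the string from the question. This is a sketch that assumes PHP 7 or later for the "\u{...}" code point escapes used in the comparison. Each emoji is written in the JSON as a surrogate pair (two \uXXXX escapes), which json_decode() combines into a single 4-byte UTF-8 character.

```php
<?php
// Input copied from the question: surrogate pairs plus three Han characters.
$json = '"\ud83d\ude18\ud83d\ude18\ud83d\ude18\ud83d\ude18\u597d\u5bb6\u4f19\ud83d\ude0d\ud83d\ude0d\ud83d\ude0d"';
$decoded = json_decode($json);

// The same text written with PHP code point escapes, for comparison.
$expected = str_repeat("\u{1F618}", 4)
    . "\u{597D}\u{5BB6}\u{4F19}"
    . str_repeat("\u{1F60D}", 3);

var_dump($decoded === $expected); // bool(true)
var_dump(strlen($decoded));       // int(37): 7 emoji x 4 bytes + 3 Han characters x 3 bytes
```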

Convert Unicode escape sequence to UTF-8 [duplicate]

This question already has answers here:
Unicode character in PHP string
(8 answers)
Closed 2 years ago.
I'm trying to convert some characters with PHP before inserting a JSON object with this kind of data into a MySQL database:
\u00c9
that means: É
I tried this but it didn't work:
echo utf8_encode(print_r('\u00c9'));
I've read that it's in Unicode, but I can't find a way to print it before inserting it. Any ideas?
Take a look at this answer. TL;DR:
echo json_decode('"\u00c9"');
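For context, the attempt in the question cannot work: print_r() prints its argument and returns true, and utf8_encode() converts ISO-8859-1 bytes to UTF-8 without interpreting escape sequences. A minimal sketch of the json_decode() approach as a reusable helper follows. The name decode_unicode_escapes is hypothetical, and the sketch assumes the raw input contains no unescaped double quotes or stray backslashes.

```php
<?php
// Hypothetical helper: decodes \uXXXX escapes by wrapping the raw text in
// double quotes so that json_decode() sees a JSON string literal.
function decode_unicode_escapes(string $raw): ?string
{
    $decoded = json_decode('"' . $raw . '"');

    // json_decode() returns null when the wrapped text is not valid JSON.
    return is_string($decoded) ? $decoded : null;
}

echo decode_unicode_escapes('\u00c9'), "\n"; // É
```

Returning null on failure lets the caller decide how to handle input that is not a valid JSON string body.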

PHP remove characters with hyphens above them like (é) [duplicate]

This question already has answers here:
Change foreign characters to their roman equivalent
(7 answers)
Closed 8 years ago.
Is there a way to do this in PHP? For example, converting the word Québec:
$str = 'Québec';
echo convert($str);
result:
Quebec
Those are called diacritics (or accents). The process of converting some text from one script to another (e.g. one with diacritics to one without) is called transliteration, and PHP's intl extension provides a function for performing it.
You probably want something like:
transliterator_transliterate('Any-Latin; Latin-ASCII', $your_input);
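A minimal sketch of the convert() function from the question built on that call. It assumes the intl extension is installed, and returning the input unchanged when it is not is an illustrative choice rather than a recommendation.

```php
<?php
// Sketch of the question's convert(): strips diacritics via ICU transliteration.
function convert(string $str): string
{
    // transliterator_transliterate() is provided by the intl extension.
    if (!function_exists('transliterator_transliterate')) {
        return $str; // assumption: fall back to the unmodified input
    }

    $result = transliterator_transliterate('Any-Latin; Latin-ASCII', $str);

    // The function returns false on failure.
    return $result === false ? $str : $result;
}

echo convert('Québec'), "\n"; // Quebec (with intl available)
```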
