Writing source code in PHP without special characters [duplicate] - php

This question already has answers here:
Unicode character in PHP string
(8 answers)
Closed 4 years ago.
Is there a way to print special characters in PHP using source code that contains only ASCII characters?
For example, in JavaScript we can use \u00e1 in the middle of a string.
In Java we can use \u2202, for example.
What is the equivalent in PHP?
I don't want to include special characters in my source code.

I found three ways to do this.
PHP documentation: http://php.net/manual/en/language.types.string.php#language.types.string.syntax.double
A good explanation in Portuguese: https://pt.stackoverflow.com/questions/293500/escrevendo-c%C3%B3digo-em-php-sem-caracteres-especiais
Syntax added in PHP 7:
\u{[0-9A-Fa-f]+}
The sequence of characters matching the regular expression is a Unicode code point, which is output to the string as that code point's UTF-8 representation.
Examples:
<?php
echo "\u{00e1}\n";
echo "\u{2202}\n";
echo "\u{aa}\n";
echo "\u{0000aa}\n";
echo "\u{9999}\n";
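Syntax for PHP 7 and older PHP versions:
\x[0-9A-Fa-f]{1,2}
The sequence of characters matching the regular expression is a single byte in hexadecimal notation, so a multi-byte UTF-8 character needs one escape per byte.
Examples:
<?php
// U+00E1 written as its two UTF-8 bytes
echo "\xc3\xa1\n";
// the same character via the PHP 7 code point escape, for comparison
echo "\u{00e1}\n";
Using integer-to-byte conversion functions: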
<?php
printf('%c%c', 0xC3, 0xA1);
echo chr(0xC3) . chr(0xA1);
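As a quick check that the three approaches above are interchangeable, the sketch below collects each spelling of U+00E1 and prints its bytes in hex. The function name aAcuteVariants is made up for this illustration, and the first entry assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the original answer): every entry is
// written with ASCII-only source and should yield the same two bytes.
function aAcuteVariants(): array
{
    return [
        'code point escape (PHP 7+)' => "\u{00e1}",
        'hex byte escapes'           => "\xc3\xa1",
        'chr() concatenation'        => chr(0xC3) . chr(0xA1),
        'sprintf with %c'            => sprintf('%c%c', 0xC3, 0xA1),
    ];
}

foreach (aAcuteVariants() as $label => $value) {
    // bin2hex shows the raw bytes regardless of terminal encoding
    echo $label, ': ', bin2hex($value), "\n";
}
```

If every line shows the same hex string, the variants are byte-for-byte identical, so the choice between them is mostly a matter of PHP version and readability.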
printf() Extended Unicode Characters?
http://phptester.net/

Related

preg_replace UTF-8 doesn't work [duplicate]

This question already has answers here:
Matching Unicode letter characters in PCRE/PHP
(5 answers)
Closed 4 years ago.
I've got the following code which works fine on my offline test version but it fails on the online server.
$names = "dimitris giannIs micHalis";
echo preg_replace("/s\b/", "w", mb_convert_case($names, MB_CASE_TITLE, "UTF-8"));
The result I get is Dimitriw Gianniw Michaliw.
However, my real data contains UTF-8 (non-English) words rather than English ones. If I run the example above as it is (in English), it works fine, so I'm guessing I'm doing something wrong with UTF-8.
Typically (but see the note below the Edit), you need to use the u modifier on your regex to make it work with UTF-8 characters. e.g.
$words = "qθαεqθε γραεcισ cονσεcτε";
echo preg_replace("/ε\b/u", "α", mb_convert_case($words, MB_CASE_TITLE, "UTF-8"));
Output:
Qθαεqθα Γραεcισ Cονσεcτα
This example on rextester demonstrates the use of the u modifier (note that rextester doesn't support mb_convert_case but that doesn't really affect the result).
Edit
As was pointed out by @CasimiretHippolyte, it is possible to compile the PCRE extension (used by PHP for regex) to handle Unicode characters by default with the --enable-unicode-properties option. This may explain the difference between the results on the offline test version and the online server.
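A minimal sketch of the u modifier in isolation, without mb_convert_case, might look like the following. The function name replaceFinalEpsilon is hypothetical, and the sketch assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper: replace an epsilon at the end of a word with alpha.
function replaceFinalEpsilon(string $text): string
{
    // With /u, pattern and subject are treated as UTF-8, so \b sees
    // Greek letters as word characters instead of individual bytes.
    return preg_replace('/ε\b/u', 'α', $text);
}

echo replaceFinalEpsilon('qθαεqθε γραεcισ cονσεcτε'), "\n";
```

Only an epsilon directly followed by a word boundary is replaced; one in the middle of a word is left alone.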

htmlspecialchars() x htmlentities() [duplicate]

This question already has answers here:
htmlentities() vs. htmlspecialchars()
(12 answers)
Closed 8 years ago.
I have read their documentation, but I still don't get when to use each of them or how they differ.
Let's consider the situation of having a general string in a variable and needing to echo it inside HTML code. If it has any HTML markup in it, I want it converted to HTML entities (< replaced by &lt;, & replaced by &amp;). If it has UTF special chars that aren't available in HTML code, they should be replaced by their HTML numeric reference (• replaced by &#8226;).
What's the best function for that?
A harder requirement: unprintable chars, like \n, char(10), char(13), etc., should be replaced by their numeric code when the string is printed inside <pre> or a textarea, so that the string can be dumped.
htmlentities is a workaround for not having set the character type of the document properly. htmlspecialchars is the correct function to use for merely writing text into an HTML document.
As to your second question, I think you're looking for addcslashes.
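The two suggestions in this answer could be combined roughly as follows. The function names are invented for this sketch, and the character range passed to addcslashes (bytes 0 through 31) is an assumption about which characters count as unprintable.

```php
<?php
// Hypothetical helper: escape text for output inside an HTML document.
function escapeForHtml(string $text): string
{
    // ENT_QUOTES also escapes single quotes; UTF-8 is assumed as charset.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: make control characters visible before escaping.
function dumpForHtml(string $text): string
{
    // "\0..\37" covers bytes 0-31; addcslashes writes them C-style
    // (for example a newline becomes a backslash followed by "n").
    return escapeForHtml(addcslashes($text, "\0..\37"));
}

echo escapeForHtml('<b>Tom & "Jerry"</b>'), "\n";
echo dumpForHtml("line one\nline two <end>"), "\n";
```

Note that addcslashes produces C-style escapes rather than numeric HTML references, so this only approximates what the question asks for.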

preg_match with cyrillic text [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to match Cyrillic characters with a regular expression
I have a simple php script which uses preg_match to compare a string against some cyrillic text inside a variable (e.g. $var = 'страница').
However, when I put the Cyrillic text into the variable, it shows up as ???????? in my code.
$var1 = '/?????????????/';
I get the following warning when I run the script:
preg_match(): Compilation failed: nothing to repeat at offset 0
Can anyone suggest a solution?
thanks very much.
Change the encoding of your scripts, or of all project source files, to UTF-8 (for example in your IDE's settings).
Use the u modifier for Unicode:
preg_match('/abcdef/u',$some_string)
It may also be caused by a mismatched code page: check which encoding your interpreter uses and which encoding the database connection uses (if there is one).
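Assuming the script file itself is saved as UTF-8, matching Cyrillic text with the u modifier could be sketched like this. Both function names are hypothetical.

```php
<?php
// Hypothetical helper: does $haystack contain $word? Both are UTF-8.
function containsWord(string $haystack, string $word): bool
{
    // preg_quote guards against regex metacharacters in $word.
    $pattern = '/' . preg_quote($word, '/') . '/u';
    return preg_match($pattern, $haystack) === 1;
}

// Hypothetical helper: is $text made up of Cyrillic letters only?
function isCyrillicOnly(string $text): bool
{
    // \p{Cyrillic} is a Unicode script property, used here with /u.
    return preg_match('/^\p{Cyrillic}+$/u', $text) === 1;
}

var_dump(containsWord('это страница', 'страница'));
var_dump(isCyrillicOnly('страница'));
var_dump(isCyrillicOnly('page'));
```

If the file were saved in an encoding that cannot represent Cyrillic, the literals would already be question marks before PHP ever sees them, which matches the "nothing to repeat" error in the question.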

PHP - convert unicode to character [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
How to get the character from unicode value in PHP?
PHP: Convert unicode codepoint to UTF-8
How can I convert a Unicode escape such as %u05E1 to a normal character in PHP?
The chr function does not cover it, and I am looking for something similar.
"%uXXXX" is a non-standard scheme for URL-encoding Unicode characters. Apparently it was proposed but never really used. As such, there's hardly any standard function that can decode it into an actual UTF-8 sequence.
It's not too difficult to do it yourself though:
$string = '%u05E1%u05E2';
$string = preg_replace('/%u([0-9A-F]+)/', '&#x$1;', $string);
echo html_entity_decode($string, ENT_COMPAT, 'UTF-8');
This converts the %uXXXX notation to HTML entity notation &#xXXXX;, which can be decoded to actual UTF-8 by html_entity_decode. The above outputs the characters "סע" in UTF-8 encoding.
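An alternative to the entity round-trip above is to treat each run of %uXXXX escapes as UTF-16 code units and convert them with iconv. This is only a sketch: the function name decodePercentU is made up, and it assumes every escape has exactly four hex digits and that the iconv extension is available.

```php
<?php
// Hypothetical helper: decode %uXXXX escapes into UTF-8.
function decodePercentU(string $text): string
{
    return preg_replace_callback(
        '/(?:%u[0-9A-Fa-f]{4})+/',
        function (array $match): string {
            // Drop the "%u" prefixes, leaving only the hex digits.
            $hex = str_replace('%u', '', $match[0]);
            // pack('H*') turns the digits into big-endian 16-bit units.
            $utf8 = iconv('UTF-16BE', 'UTF-8', pack('H*', $hex));
            // Keep the original text if the conversion fails.
            return $utf8 === false ? $match[0] : $utf8;
        },
        $text
    );
}

echo bin2hex(decodePercentU('%u05E1%u05E2')), "\n";
```

Converting a whole run at once means a surrogate pair written as two consecutive escapes is decoded as one character rather than two broken halves.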
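Use hexdec to convert it to its decimal representation first. Note that chr only produces a single byte (its argument is taken modulo 256), so for a code point above 255 use mb_chr (PHP 7.2+, mbstring extension) instead:
echo mb_chr(hexdec("05E1"), 'UTF-8');
var_dump(hexdec("%u05E1") == hexdec("05E1")); // true: hexdec ignores non-hex characters (deprecated since PHP 7.4)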

Hex to Unicode in PHP ( \u014D to ō) [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
How to decode Unicode escape sequences like “\u00ed” to proper UTF-8 encoded characters?
How can I convert \u014D to ō in PHP?
Thank You
It's not immediately clear what you mean when you say "to ō". If you're asking how to convert it into a different encoding, then a general approach is to use the iconv function. 014D is the UCS-2 (Unicode) code for your desired character, so if you have a string containing the two bytes 0x01 0x4D you could use
iconv('UCS-2BE', 'UTF-8', $s)
to convert from UCS-2 to UTF-8 (the BE suffix states the byte order explicitly; plain 'UCS-2' is platform-dependent). The same applies if you want to convert to a different encoding, although you need to be aware that not all encodings include the character you are using. You'll see from the iconv documentation that the //TRANSLIT option may help in that case.
Note that iconv takes a byte sequence, so if you actually have a string containing a backslash, then a u, then a 0, and so on, you'll need to convert that into the byte sequence first.
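If you have the escape sequence as literal characters in the string, you could use a messy eval statement (eval, not exec, which runs a shell command).
$string = '\\u014D';
eval("\$string = \"$string\";");
This way the string is re-parsed as a double-quoted PHP literal, so escape sequences in it are interpreted when the string is parsed. Be aware that PHP only recognizes the braced form \u{014D} (PHP 7+); a bare \u014D is left as-is, so the sequence has to be rewritten with braces first.
Of course, you should never use eval unless absolutely necessary, and never on untrusted input.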
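A way to avoid evaluating code altogether is to lean on json_decode, since JSON strings use the same \uXXXX notation. The sketch below is an assumption-laden illustration rather than the answer's own method: the function name decodeUnicodeEscapes is hypothetical, and only runs of four-digit \uXXXX escapes are handled.

```php
<?php
// Hypothetical helper: turn literal \uXXXX escapes into UTF-8.
function decodeUnicodeEscapes(string $text): string
{
    return preg_replace_callback(
        // Four backslashes in PHP source match one literal backslash.
        '/(?:\\\\u[0-9A-Fa-f]{4})+/',
        function (array $match): string {
            // Wrap the escapes in quotes to form a JSON string literal.
            $decoded = json_decode('"' . $match[0] . '"');
            // Fall back to the original text if decoding fails.
            return is_string($decoded) ? $decoded : $match[0];
        },
        $text
    );
}

echo decodeUnicodeEscapes('T\u014Dky\u014D'), "\n";
```

Because only the matched escapes are passed to json_decode, quotes or other characters elsewhere in the input cannot break the JSON parsing.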
