Displaying Japanese characters with PHP - php

I have plain Japanese text stored in a MySQL table with the utf8mb_general_ci collation. I can fetch a row and display it as a single string.
But what I need is to take a single character from the string and use it in a query to find other matches (other words that contain that specific character).
The problem is that when I loop over the string, all I get is ? marks.
I have read that I have to use UTF-8 everywhere, and I believe I do.
So, what are the steps, from zero, to make sure I can fetch a Japanese string, split it into separate characters, and have queries understand the input (not just a ? mark)?
Here is some basic example code using the same data that I fetch from my DB; it shows the same problem.
<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body>
<?php
$word = "東京のビルの中";
echo $word;
echo strlen($word);
echo "<br>";
$chars = str_split($word);
foreach($chars as $single) {
echo $single . "<br>";
}
?>
</body>
</html>

This answer works fine in your case as well. Just add the function to your code and then call:
$chars = mb_str_split($word);
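The function from the linked answer is not reproduced on this page, so here is a minimal sketch of what such a helper might look like. The name `split_utf8_chars` is hypothetical; it relies only on `preg_split` with the `/u` modifier. On PHP 7.4 and later the built-in `mb_str_split` (mbstring extension) should do the same job without a custom function.

```php
<?php
// Hypothetical helper (not from the original answer): split a UTF-8 string
// into characters instead of bytes, which is what str_split() does.
function split_utf8_chars(string $text): array
{
    // The /u modifier makes PCRE treat the subject as UTF-8, so the empty
    // pattern matches between code points rather than between bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on invalid UTF-8; fall back to an empty list.
    return $chars === false ? [] : $chars;
}

$word = "東京のビルの中";
foreach (split_utf8_chars($word) as $single) {
    echo $single, "\n";
}
```

Each element is then a complete character that can be bound as a parameter in a query, provided the database connection charset is also UTF-8.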

Related

array_map produces weird unicode character

I'm a newbie in PHP, so sorry for any funny mistakes :(
I have a problem when I try to get some Unicode characters (Korean, actually) from a database into a JavaScript array. I think that after calling array_map("utf8_encode", $row); the field I need contains weird characters. This is the file that does the work:
<?php
include 'ChromePhp.php';
ChromePhp::log('Hello console!');
$mysql = new mysqli('localhost','root','vertrigo','demo', 3306);
$mysql->set_charset("utf8");
$result = $mysql->query("select * from countries");
$rows = array();
while($row = $result->fetch_array(MYSQL_ASSOC)) {
$rows[] = array_map("utf8_encode", $row);
ChromePhp::log($row); // fine, readable characters
}
ChromePhp::log($rows); // weird characters
echo json_encode($rows);
$result->close();
$mysql->close();
?>
I also set the charset of the main page and of the scripts to utf8 like this:
<meta http-equiv="Content-Type" content="text/html; charset=utf8" />
and
<script src="lib/jquery/jquery-1.11.1.min.js" charset="utf8"></script>
<!-- Include all compiled plugins (below), or include individual files as needed -->
<script src="lib/bootstrap/js/bootstrap.min.js" charset="utf8"></script>
<script src="lib/magicsuggest/magicsuggest.js" charset="utf8"></script>
<script src="js/script.js" charset="utf8"></script>
The original code's here. I just added a test record with this SQL command:
INSERT INTO `demo`.`countries` (`idCountry`, `countryCode`, `countryName`, `population`, `capital`, `continentName`) VALUES (NULL, 'KO', 'KOR', '134', '서울', 'Asia');
According to the documentation for utf8_encode:
utf8_encode — Encodes an ISO-8859-1 string to UTF-8
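Since you're dealing with Korean characters, the strings cannot be ISO-8859-1 encoded, because ISO-8859-1 has no Korean characters. With the connection charset set to utf8, the rows should already arrive as UTF-8, so utf8_encode encodes them a second time.
Depending on the database settings, there is no need to convert the strings at all.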
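A small sketch of the double-encoding effect, assuming the mbstring extension is available. `mb_convert_encoding` from ISO-8859-1 is used here as a stand-in for `utf8_encode`, and `rows_to_json` is a hypothetical helper name, not part of the question's code.

```php
<?php
// Hypothetical helper: encode rows that are already UTF-8 without converting them.
function rows_to_json(array $rows): string
{
    // JSON_UNESCAPED_UNICODE keeps the Korean text readable instead of \uXXXX escapes.
    return json_encode($rows, JSON_UNESCAPED_UNICODE);
}

$capital = "서울"; // already UTF-8, as a utf8 connection would return it

// Treating every byte as an ISO-8859-1 character (what utf8_encode does)
// turns each of the 6 bytes into a 2-byte sequence: the "weird characters".
$double = mb_convert_encoding($capital, 'UTF-8', 'ISO-8859-1');

echo strlen($capital), "\n"; // byte length of the original string
echo strlen($double), "\n";  // byte length after the unnecessary conversion

echo rows_to_json([['capital' => $capital]]), "\n";
```

In the question's loop this would correspond to appending `$row` directly instead of the `array_map("utf8_encode", $row)` result.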

Is it possible to change original html text in php?

I am trying to make a "manner friendly" website. We use different declensions depending on gender and other factors. For example:
You did = robili
It did = robilo
She did = robila
Linguistically this is a very simplified (and unlucky) example! I would like to change the HTML text in a PHP file where appropriate. For example:
<?php
// something
?>
html text of the page and somewhere is the word "robil"
<div>we tried to robil^i|o|a^</div>
<?php /* something */ ?>
Now I would like to find every token of the form ^characters|characters|characters^ and replace it with one of its internal values according to "gender".
This is easy in JavaScript on the client side, but the visitor would see all this weird "tokenizing" before JavaScript replaces it.
I do not know an elegant solution for doing this on the server.
Or do you have a better idea?
Thanks for any advice.
You can add these scripts before and after the HTML:
<?php
// start output buffering
ob_start();
?>
<html>
<body>
html text of the page and somewhere is the word "robil"
<div>we tried to robil^i|o|a^, but also vital^si|sa|ste^, borko^mal|mala|malo^ </div>
</body>
</html>
<?php
$use = 1; // indicate which declination to use (0,1 or 2)
// get buffered html
$html = ob_get_contents();
ob_end_clean();
// match anything between '^' that's not a control chr or '^', min 3 and max 20 chrs.
if (preg_match_all('/\^[^[:cntrl:]\^]{3,20}\^/',$html,$matches))
{
// replace all
foreach (array_unique($matches[0]) as $match)
{
$choices = explode('|',trim($match,'^'));
$html = str_replace($match,$choices[$use],$html);
}
}
echo $html;
This returns:
html text of the page and somewhere is the word "robil" we tried to
robilo, but also vitalsa, borkomala
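The same replacement can also be sketched with `preg_replace_callback`, which handles each token in one pass. This is an alternative to the loop above, not the answerer's code. The function name `pick_declension` is hypothetical, and falling back to the first choice when the index is out of range is an assumption.

```php
<?php
// Hypothetical helper: replace every ^a|b|c^ token with the choice at index $use.
function pick_declension(string $html, int $use): string
{
    // A token is '^', two or more '|'-separated choices, then '^'.
    // Requiring at least one '|' avoids matching text that merely sits
    // between the closing '^' of one token and the opening '^' of the next.
    $pattern = '/\^([^\^|[:cntrl:]]+(?:\|[^\^|[:cntrl:]]+)+)\^/';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($use): string {
            $choices = explode('|', $m[1]);
            // Assumption: use the first choice if the requested index is missing.
            return $choices[$use] ?? $choices[0];
        },
        $html
    );
}

echo pick_declension('we tried to robil^i|o|a^, but also vital^si|sa|ste^', 1), "\n";
```

It can be applied to the buffered output in the same place as the loop, for example `echo pick_declension($html, $use);`.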

PHP: Get encoded html entities

I'm trying to get the HTML entities of a UTF-8 string.
Example: example.com/search?q=مرحبا
<?php
echo htmlentities($_GET['q']);
?>
I got:
مرحبا0مرحبا
It's UTF-8 text, not HTML entities.
What I need is:
&#1605;&#1585;&#1581;&#1576;&#1575;
I have tried urldecode and htmlentities functions!
Add this code to the start of your file:
header('Content-Type: text/html; charset=utf-8');
The browser needs to know it is UTF-8. This tag can also go in the head section for completeness:
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
I think you can solve it by taking each character in the string and getting its code point value.
From Mark Baker's answer and vartec's answer you can get:
<?php
$chrArray = preg_split('//u',$_GET['q'], -1, PREG_SPLIT_NO_EMPTY);
$htmlEntities = "";
foreach ($chrArray as $chr) {
$htmlEntities .= '&#'._uniord($chr).';';
}
echo $htmlEntities;
?>
I have not tested it.
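The `_uniord` helper used above comes from one of the linked answers and is not defined on this page. A self-contained sketch of the same idea, assuming PHP 7.2 or later with the mbstring extension so that `mb_ord` is available, might look like this (the function name `to_numeric_entities` is hypothetical):

```php
<?php
// Hypothetical helper: turn every character of a UTF-8 string into a
// decimal numeric character reference such as &#1605;.
function to_numeric_entities(string $text): string
{
    // Split on code point boundaries; invalid UTF-8 yields an empty list.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $out = '';
    foreach ($chars as $chr) {
        // mb_ord() returns the Unicode code point of the character.
        $out .= '&#' . mb_ord($chr, 'UTF-8') . ';';
    }
    return $out;
}

echo to_numeric_entities('مرحبا'), "\n";
```

Note that this encodes every character, including plain ASCII, just like the loop above.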

mPDF: & character is not encoded, what can I do?

mPDF does not convert the '&' character, and everything that follows it is lost: the PDF is generated, but none of the content after the '&' is printed. This is my code:
<table>
<tr>
<p>test example & test example</p>
</tr>
</table>
I use this PHP code to create the PDF output:
<?php
$divcontent = $_POST['divcontent'];
$html='<html><head></head><body style="background-color:#FFFFFF;height:100%; width:100%;">';
$html.= $divcontent;
$html.='</body></html>';
//==============================================================
include(dirname(__FILE__)."/../../libs/MPDF57/mpdf.php");
#$mpdf=new mPDF('c');
#$mpdf->SetDisplayMode('fullpage');
#$stylesheet = file_get_contents(realpath(dirname(__FILE__)."/../..")."/css/style.css");
$mpdf->setFooter('{PAGENO}');
#$mpdf->WriteHTML($stylesheet,1);
#$mpdf->WriteHTML($html);
$rand = rand();
#$mpdf->Output(realpath(dirname(__FILE__)."/../..")."/file.pdf",'F');
?>
Try this:
$mpdf->charset_in='utf-8';
You should probably add this to the head section of the HTML:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Make sure that your server (Apache or Nginx) also uses UTF-8 as its default charset.
After days I found the problem. The ajax call was not encoding the character. Now I apply encodeURIComponent(divcontent) before passing the result to the PHP function, and I no longer use html_entity_decode; mPDF then prints special characters such as "&".
Use htmlspecialchars:
<table>
<tr>
<p>test example <?php echo htmlspecialchars('&') ?> test example</p>
</tr>
</table>
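As a sketch of the htmlspecialchars suggestion applied to dynamic text rather than a literal '&': the helper name `escape_for_html` is hypothetical, and it is meant for plain text fragments only, since it would also escape any markup passed to it.

```php
<?php
// Hypothetical helper: escape plain text before placing it inside HTML
// that is later handed to a renderer such as mPDF.
function escape_for_html(string $text): string
{
    // ENT_QUOTES also escapes single quotes; ENT_SUBSTITUTE replaces
    // invalid UTF-8 sequences instead of returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo '<p>' . escape_for_html('test example & test example') . '</p>', "\n";
```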
mPDF 8.0.4 was embedding specific characters in my application's PDFs as blank rectangles, while all other characters rendered properly.
My issue was with the parentheses and euro symbols in the Segoe UI TTF font.
Adding $mpdf->useSubstitutions = true; fixed the problem.

Encoding a PHP variable with quotes and line breaks to be passed to a Javascript function (and then reverse the encoding)

Say you have a PHP variable called $description with the following value (that contains quotes and line breaks):
Tromp L'oeil Sheath Dress
You will certainly "trick the eye" of many in this gorgeous illusion. Add it to your fall wardrobe before it disappears.
You want to pass the contents of this variable into a Javascript function that writes that value into an INPUT of type text.
How would you do this? I tried this:
$description = htmlspecialchars ( $product->description, ENT_QUOTES );
However, I get a JS error. I also tried this:
$description = rawurlencode ( $product->description );
This encodes the value like so:
Michael%20Kors%0A%0ATromp%20L%27oeil%20Sheath%20Dress%0A%0AYou%20will%20certainly%20%22trick%20the%20ey%22%20of%20many%20in%20this%20gorgeous%20illusion.%20Add%20it%20to%20your%20fall%20wardrobe%20before%20it%20disappears.%0A%0AAvailable%20in%20Black%2FNude
This value can be passed as a JS variable, but I don't know of a JS function that will cleanly reverse a PHP rawurlencode.
Is there a matching pair of functions that I could use to encode a variable in PHP to allow it to be passed into a JS function -- and then reverse the encoding in JS so that I get the original value of the PHP variable?
EDIT: To clarify the question and reply to comments, here is some test code:
<?php
$str =<<<EOT
Tromp L'oeil Sheath Dress
You will certainly "trick the eye" of many in this gorgeous illusion. Add it to your fall wardrobe before it disappears.
EOT;
echo 'here is the string: <pre>' . $str . '</pre>';
?>
<script type="text/javascript">
<?php
// this does not work with JS as i get an unterminated string literal if i just use addslashes in the following commented-out line
// echo 'alert(\'' . addslashes($str) . '\');';
// this works with JS (the alert activates) but how do i un-rawurlencode in JS?
// echo 'alert(\'' . rawurlencode($str) . '\');';
// this does not work with JS, because of the line breaks
echo 'alert(\'' . htmlspecialchars ($str, ENT_QUOTES) . '\');';
?>
</script>
The simplest would be to use json_encode().
I ran into problems using some of the answers proposed here, including issues with line breaks and decoding certain HTML entities like /. I ended up using rawurlencode (in PHP) and decodeURIComponent (in Javascript) as matching functions to encode/decode the string so it could be passed as a JS variable. Here is working code for anybody else running into this problem.
<?php
$str =<<<EOT
Tromp L'oeil Sheath Dress
You will certainly "trick the eye" of many in this gorgeous illusion. Add it to your fall wardrobe before it disappears.
Available in Black/Nude
EOT;
echo 'here is the string: <pre>' . $str . '</pre>';
?>
<p>below is the variable doc.write'd after being rawurlencod'ed in PHP then decodeURIComponent'ed in JS:</p>
<script type="text/javascript">
<?php
echo 'document.write(decodeURIComponent("'. rawurlencode($str).'"));';
?>
</script>
You can use json_encode if available. It encodes the string according to the JSON data format that is a subset of JavaScript; so any JSON is also valid JavaScript.
<script type="text/javascript">
<?php
echo 'alert('. json_encode($str).');';
?>
</script>
Otherwise try PHP’s rawurlencode and decode it with JavaScript’s decodeURIComponent (decodeURI would leave reserved characters such as %2F encoded):
<script type="text/javascript">
<?php
echo 'alert(decodeURIComponent("'. rawurlencode($str).'"));';
?>
</script>
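Building on the json_encode suggestion above, here is a hedged sketch that adds the `JSON_HEX_*` flags so the generated literal cannot close the surrounding script tag or break out of an HTML attribute. The helper name `js_literal` is hypothetical.

```php
<?php
// Hypothetical helper: produce a JavaScript string literal that is safe to
// print inside a <script> block.
function js_literal(string $value): string
{
    // The JSON_HEX_* flags escape <, >, &, ' and " as \uXXXX sequences.
    $json = json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP
    );
    if ($json === false) {
        // For example, the input was not valid UTF-8.
        throw new InvalidArgumentException(json_last_error_msg());
    }
    return $json;
}

$str = "Tromp L'oeil Sheath Dress\nYou will certainly \"trick the eye\" of many.";
echo '<script type="text/javascript">alert(' . js_literal($str) . ');</script>', "\n";
```

No decoding step is needed on the JavaScript side, because the browser parses the literal back into the original string.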
JSON is the solution.
See the sample code below, which uses two pages to demonstrate.
First page, json.php:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Untitled Document</title>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
</head>
<body>
<script type="text/javascript">
$(document).ready(function() {
// This is more like it!
$('#submit').live('click', function() {
var id=$("#id").attr("value");
$.getJSON("json-call.php", {id:id}, function callback(data) {
$("#list").html("var1:"+data['var1']+"<br/>"+"var2:"+data['var2']+"<br />id:"+data['id']);
});
});
});
</script>
<input id="id" type="text" value="test value" />
<input type="button" id="submit" value="submit" />
<div id="list"></div>
</body>
</html>
Second page, json-call.php:
<?php
$var1 = 'your name';
$var2 = 'your address';
$id = $_REQUEST['id'];
print(json_encode(array ('var1' => $var1, 'var2' => $var2, 'id'=>$id)));
And the results:
var1:your name
var2:your address
id:test value
Not sure whether json_decode does everything you need. htmlspecialchars() and htmlspecialchars_decode() should do the trick for everything but the line breaks. The line breaks are kind of a pain, since the combination of linebreaks and carriage returns will depend on the browser, but I think something like this should work:
$value = "your string with quotes and newlines in it.";
//takes care of quotes (ENT_QUOTES also converts single quotes)
$js_value = htmlspecialchars($value, ENT_QUOTES);
//first line replaces an ASCII newline with a JavaScript newline
$js_value = str_replace("\n",'\n',$js_value);
//second line replaces an ASCII carriage return with nothing, so you don't get duplicates
$js_value = str_replace("\r",'',$js_value);
//reverse to convert it back to PHP
$php_value = str_replace('\n',"\r\n",$js_value);
$php_value = htmlspecialchars_decode($php_value, ENT_QUOTES);
Maybe not the most elegant solution, but that's not really my specialty. ;) Also, keep in mind that newline characters will just end up like spaces in an <input type="text"> field.
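Here is a little something I have made:
function safefor_js($str) {
// escape backslashes first, then quotes and line breaks, for use inside a JS string literal
return str_replace(array('\\', "'", '"', "\n", "\r"), array('\\\\', '\x27', '\x22', '\n', '\r'), $str);
}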
