PHP encoding utf-8 - php

I have an HTML file whose tags (title, a, name, ...) I want to parse and save to an Excel file.
The text in the HTML tags is Persian, and it is not saved correctly in the Excel file.
html file:
<html dir="rtl"><head><meta http-equiv="Content-Language" content="fa">
<meta http-equiv="Content-Type" content="text/html; charset=windows-1256">
<title>ÈÇäß ÕæÊ.ÂãæÒÔ.ÇÏÈíÇÊ.ÚæÇãá Ýí ÇáäÍæ.ÇÓÊÇÏ ãÏÑÓ ÇÝÛÇäí.ÌáÓå 1</title>
<meta name="generator" M.H.SAFARZADE TEHRANI E_BANK_JAME_EMAMALI>
For example, I want to save the title, but it is not saved correctly. The true value of the title tag is:
تدريس استاد مدرس افغاني
but what gets saved is: ÈÇäß ÕæÊ.ÂãæÒÔ.ÇÏÈíÇÊ.ÚæÇãá Ýí ÇáäÍæ.ÇÓÊÇÏ ãÏÑÓ ÇÝÛÇäí.ÌáÓå 1
In my PHP file I do this:
mb_internal_encoding('UTF-8');
mb_http_output('UTF-8');
mb_http_input('UTF-8');
mb_language('uni');
mb_regex_encoding('UTF-8');
ob_start('mb_output_handler');
header('Content-Type: text/html; charset=utf-8');
and:
$str = mb_convert_encoding($title, "UTF-8");
I can't find an answer to my question on the web.
Please help me.
Thank you.
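One possible cause, going by the markup above: the page declares charset=windows-1256, while mb_convert_encoding($title, "UTF-8") is given no source encoding, so it falls back to the internal encoding (set to UTF-8 here) and the windows-1256 bytes are never really converted. The sketch below passes the source encoding explicitly. It uses iconv() on the assumption that the local iconv build recognizes the WINDOWS-1256 name (this varies by platform); the helper name titleToUtf8 is hypothetical.

```php
<?php
// Hypothetical helper: convert text read from a windows-1256 page to UTF-8.
// Assumes the iconv extension is loaded and supports 'WINDOWS-1256'.
function titleToUtf8(string $rawTitle): string
{
    $converted = iconv('WINDOWS-1256', 'UTF-8', $rawTitle);
    if ($converted === false) {
        throw new RuntimeException('Conversion from WINDOWS-1256 failed');
    }
    return $converted;
}

// "\xC8\xC7\xE4\xDF" are the windows-1256 bytes that show up as "ÈÇäß"
// when they are misread as a Latin encoding.
$title = titleToUtf8("\xC8\xC7\xE4\xDF");
```

The converted string can then be handed to whatever writes the Excel file, provided that writer expects UTF-8.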

Related

Why does PHP header(charset) work while HTML <meta charset> doesn't?

This one may be easy, but it seems to be a problem for my server (or for me).
I have this piece of code in index.php:
<?php
header('Content-Type: text/plain; charset="UTF-8"');
// Some code for generating data to be displayed
foreach ($ObjectArray as $SingleObject) {
print_r($SingleObject->getAllProperties());
}
And it does this:
But I don't want to use header('Content-Type: text/plain; charset="UTF-8"'); - I'd rather include HTML code from my header.htm:
<html>
<head>
<meta charset="UTF-8">
<title>Test Cards</title>
<link rel="stylesheet" type="text/css" href="style.css">
<script src="jquery-3.1.0.min.js"></script>
<link rel="stylesheet" href="jquery-ui.css">
<script src="jquery-ui.js"></script>
</head>
with my index.php like this:
<?php
include 'view/header.htm';
echo '<body>';
// Some code for generating data to be displayed
foreach ($ObjectArray as $SingleObject) {
    print_r($SingleObject->getAllProperties());
}
// Close the document once, after the loop has finished.
echo '</body>';
echo '</html>';
Unfortunately, this doesn't work well. The charset is still recognized as UTF-8, but the result is far from what I expected:
Please tell me what is happening and how to handle this kind of problem. Is it a matter of combining HTML and PHP (does plain PHP output get some fancy styling when no HTML is present?), or is there a mistake in my code?
Thanks in advance :)
The formatting is preserved in the first case because the content type is text/plain, so the browser shows the output verbatim. In the second case it is HTML (text/html), where runs of whitespace and line breaks collapse into single spaces.
You can wrap the output in <pre></pre> tags to preserve formatting when returning HTML.
<?php
include 'view/header.htm';
echo '<body>';
echo '<pre>';
// ...
// your foreach here
// ...
echo '</pre>';
echo '</body>';
echo '</html>';
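As a minimal sketch of the same idea, the dump can be captured with print_r($value, true) and escaped before it goes inside <pre>, so that a "<" or "&" in a property value is not parsed as markup. The helper name dumpAsPre and the sample array are hypothetical.

```php
<?php
// Hypothetical helper: render a print_r() dump so HTML keeps its line breaks.
function dumpAsPre($value): string
{
    // print_r(..., true) returns the dump instead of printing it.
    $dump = print_r($value, true);
    // Escape the dump so "<" or "&" inside values is not treated as markup.
    return '<pre>' . htmlspecialchars($dump, ENT_QUOTES, 'UTF-8') . '</pre>';
}

$html = dumpAsPre(['name' => 'Card <1>', 'cost' => 3]);
```

Inside the foreach this would be called as echo dumpAsPre($SingleObject->getAllProperties());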

How to output Chinese in HTML file

I have a form that inserts some Chinese words into a database, and that works fine. The table charset is UTF8. The problem appears when I select this data and send it by mail as an HTML attachment.
Then the Chinese text doesn't display properly. How can I fix the charset before sending the data by mail? Should I use some headers, and will that work?
My code looks like this:
// $attachedBodyContent is data from the database that contains some Chinese words
Mail::send(
    "emails.applicationTemplate",
    $data,
    function($message) use ($data, $template, $subject, $attachedBodyContent) {
        $message->to($data['email'], $data['name'])
            ->from($template['from'], $template['from_name'])
            ->subject($subject)
            ->attachData($attachedBodyContent, 'YourApplicationData.html');
    }
);
When you generate the .html attachment, you should include this in its <head>:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
In this case you can use the following code to merge your content with the <head>:
<?php
$header = '<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>';
$footer = '</body>
</html>';
$allContent = $header.$attachedBodyContent.$footer;
?>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
This should do it. For further information, check the link:
http://www.inventpartners.com/chinese-chars
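The meta tag only helps if the attached bytes really are UTF-8. A minimal sketch that checks this before wrapping the fragment, assuming the mbstring extension is available; the helper name wrapAsUtf8Document is hypothetical.

```php
<?php
// Hypothetical helper: wrap an HTML fragment in a document that declares UTF-8.
// Assumes the mbstring extension is available for the validity check.
function wrapAsUtf8Document(string $fragment): string
{
    if (!mb_check_encoding($fragment, 'UTF-8')) {
        throw new InvalidArgumentException('Fragment is not valid UTF-8');
    }
    return "<!DOCTYPE html>\n<html>\n<head>\n"
        . "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">\n"
        . "</head>\n<body>" . $fragment . "</body>\n</html>";
}

// "\u{4E2D}\u{6587}" is the two-character string "中文".
$allContent = wrapAsUtf8Document("<p>\u{4E2D}\u{6587}</p>");
```

The result would be passed to attachData() in place of the bare $attachedBodyContent.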

PHP: Get encoded html entities

I'm trying to get the HTML entities of a UTF-8 string.
Example: example.com/search?q=مرحبا
<?php
echo htmlentities($_GET['q']);
?>
I got:
مرحبا0مرحبا
This is plain UTF-8 text, not HTML entities.
What I need is:
&#1605;&#1585;&#1581;&#1576;&#1575;
I have tried the urldecode and htmlentities functions.
Add this code to the start of your file:
header('Content-Type: text/html; charset=utf-8');
The browser needs to know the page is UTF-8. The equivalent tag can also go in the head section for completeness.
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
I think you can solve it by splitting the string into characters and getting the code point of each one.
Combining Mark Baker's answer and vartec's answer, you get:
<?php
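// Split the UTF-8 string into individual characters.
$chrArray = preg_split('//u', $_GET['q'], -1, PREG_SPLIT_NO_EMPTY);
$htmlEntities = "";
foreach ($chrArray as $chr) {
    // _uniord() is the helper from the answers referenced above, not a PHP built-in.
    $htmlEntities .= '&#'._uniord($chr).';';
}
echo $htmlEntities;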
?>
I have not tested it.
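htmlentities() only replaces characters that have a named entity, which is why the Arabic letters pass through unchanged. A minimal self-contained sketch of the per-character approach, swapping the _uniord() helper for the built-in mb_ord() (PHP 7.2+ with mbstring, an assumption about the environment); the function name toNumericEntities is hypothetical.

```php
<?php
// Hypothetical helper: turn each character of a UTF-8 string into a decimal
// numeric entity. mb_ord() requires PHP 7.2+ with the mbstring extension.
function toNumericEntities(string $text): string
{
    // The /u modifier makes preg_split() work on characters, not bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    $entities = '';
    foreach ($chars as $chr) {
        $entities .= '&#' . mb_ord($chr, 'UTF-8') . ';';
    }
    return $entities;
}

// The escapes below spell the query string from the question.
$encoded = toNumericEntities("\u{0645}\u{0631}\u{062D}\u{0628}\u{0627}");
```

Note that this encodes every character, ASCII included, just as the loop above does.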

Simple RSS encoding issue

Consider the following PHP code for getting RSS news on a site I'm developing:
<?php
$url = "http://dariknews.bg/rss.php";
$xml = simplexml_load_file($url);
$feed_title = $xml->channel->title;
$feed_description = $xml->channel->description;
$feed_link = $xml->channel->link;
$item = $xml->channel->item;
function getTheData($item){
    for ($i = 0; $i < 4; $i++) {
        $article_title = $item[$i]->title;
        $article_description = $item[$i]->description;
        $article_link = $item[$i]->link;
        echo "<p><h3>". $article_title. "</h3></p><small>".$article_description."</small><p>";
    }
}
?>
The data accumulated by this function should be presented in the following HTML format:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251"/>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Новини от Дарик</title>
</head>
<body>
<?php getTheData($item);?>
</body>
</html>
As you can see, I added both windows-1251 (Cyrillic) and UTF-8 encoding declarations, but the RSS feed is unreadable unless I change the browser encoding to UTF-8. The default encoding in my case is Cyrillic, but the feed comes out unreadable. Any help making this RSS feed readable in Cyrillic (it's from Bulgaria) will be greatly appreciated.
I've just tested your code and the Bulgarian characters displayed fine when I removed the charset=windows-1251 meta tag and just left the UTF-8 one. Want to try that and see if it works?
Also, you might want to change your <html> tag to reflect the fact that your page is in Bulgarian like this: <html xmlns="http://www.w3.org/1999/xhtml" lang="bg" xml:lang="bg">
Or maybe you need to force the web server to send the content as UTF-8 by sending a Content-Type header:
<?php
header("Content-Type: text/html; charset=UTF-8");
?>
Just be sure to include this before ANY other content (even whitespace) is sent to the browser. If you don't you'll get the PHP "headers already sent" error.
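SimpleXML returns strings as UTF-8 regardless of the feed's own encoding, which is why declaring UTF-8 works. If the page really has to be served as windows-1251 instead, each string would need converting before output. A minimal sketch of that alternative, assuming mbstring is available; the helper name toCp1251 is hypothetical.

```php
<?php
// Hypothetical helper: convert a UTF-8 string (as returned by SimpleXML)
// to windows-1251 for a page that declares charset=windows-1251.
function toCp1251(string $utf8Text): string
{
    return mb_convert_encoding($utf8Text, 'Windows-1251', 'UTF-8');
}

// The escapes below spell "Новини", the first word of the page title.
$bytes = toCp1251("\u{041D}\u{043E}\u{0432}\u{0438}\u{043D}\u{0438}");
```

With this approach only the windows-1251 meta tag should remain in the page, since the two declarations contradict each other.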
Maybe you should take a look at htmlentities.
It can convert some characters to HTML entities.
$titleEncoded = htmlentities($article_title, ENT_XHTML, 'cp1251');

Converting russian characters from upper case to lower case in php

I'm trying to change the case of Russian characters from upper to lower.
function toLower($string) {
echo strtr($string,'ЁЙЦУКЕНГШЩЗХЪФЫВАПРОЛДЖЭЯЧСМИТЬБЮ','ёйцукенгшщзхъфывапролджэячсмитьбю');
};
This is the function I used and the output looks something like this
ЁЙ## ёѹ##`
Can anybody help me with this?
Thanks in advance
$result = mb_strtolower($orig, 'UTF-8');
(assuming the data is in utf-8)
Specify the charset within the HTML and use mb_strtolower() to convert case:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 TRANSITIONAL//EN">
<html>
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<title></title>
</head>
<body>
<?php
$string = 'ЦУКЕНГШЩЗХЪФЫВАПРОЛДЖЭЯЧСМИТЬБЮ';
echo mb_strtolower($string, 'UTF-8');
?>
</body>
</html>
With the meta-tag it looks like this:
цукенгшщзхъфывапролджэячсмитьбю
Without the meta-tag it looks like this
цукенгшщзхъфывапролджÑÑчÑмитьбю
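The original strtr() call fails because, with two string arguments, it translates byte by byte, and every Cyrillic letter takes two bytes in UTF-8, so the byte mappings collide and produce invalid sequences. A minimal sketch of the function rewritten around mb_strtolower(), assuming the input is UTF-8 and mbstring is available; the name toLowerUtf8 is hypothetical.

```php
<?php
// Hypothetical rewrite of toLower(): return the result and let mbstring
// handle multibyte characters. Assumes the input is UTF-8.
function toLowerUtf8(string $text): string
{
    return mb_strtolower($text, 'UTF-8');
}

// The escapes below spell "ЁЙЦ", the first three letters from the question.
$lower = toLowerUtf8("\u{0401}\u{0419}\u{0426}");
```

Returning the string instead of echoing it inside the function keeps the helper usable in places other than direct output.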
