PHP query not using UTF-8 charset

I am getting my urls and titles from a post's content, but the titles no longer seem to be UTF-8 and include some funky characters such as "Â" when I echo the result. Any idea why the correct charset isn't being used? My headers do use the right metadata.
I tried some of the solutions on here, but none seems to work so I thought I'd add my code below - just in case I'm missing something.
$servername = "localhost";
$database = "xxxx";
$username = "xxxxx";
$password = "xxxx";
$conn = mysqli_connect($servername, $username, $password, $database);
$post_id = 228;
$content_post = get_post($post_id);
$content = $content_post->post_content;
$doc = new DOMDocument();
$doc->loadHTML('<?xml encoding="utf-8" ?>' . $content);
$links = $doc->getElementsByTagName('a');
$counter = 0;
foreach ($links as $link){
$href = $link->getAttribute('href');
$avoid = array('.jpg', '.png', '.gif', '.jpeg');
if ($href == str_replace($avoid, '', $href)) {
$title = $link->nodeValue;
$title = html_entity_decode($title, ENT_NOQUOTES, 'UTF-8');
$sql = "INSERT INTO wp_urls_download (title, url) VALUES ('$title', '$href')";
if (mysqli_query($conn, $sql)) {
$counter++;
echo "Entry" . $counter . ": $title" . "<br>";
} else {
echo "Error: " . $sql . "<br>" . mysqli_error($conn);
}
}
}
Updated Echo string - changed this after I initially uploaded the code. I have already tried the solutions in the other posts and was not successful.

Did you try to set the utf8 charset on the connection?
$conn->set_charset('utf8');
For more information: http://php.net/manual/en/mysqli.set-charset.php

It seems that you have "double-encoding". What you expected was
Transverse Abdominis (TVA)
But what you have for the space before the parenthesis is a special space that probably came from Microsoft Word, then got converted to utf8 twice. In hex: A0 -> c2a0 -> c382c2a0.
Yes, the link to "utf8 all the way through" would ultimately provide the fix, but I think you need more help.
The A0 byte was converted from latin1 to utf8, and then the resulting bytes were treated as if they were latin1 and converted a second time.
The connection must declare the client's encoding via $mysqli_obj->set_charset('utf8') (or similar).
Then the column in the table should be CHARACTER SET utf8mb4 (or utf8). Verify with SHOW CREATE TABLE. (It is probably latin1 currently.)
HTML should start with <meta charset=UTF-8>.
See also: Trouble with UTF-8 characters; what I see is not what I stored
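The byte sequence described in this answer can be reproduced in a few lines. This is a minimal sketch, assuming the mbstring extension is available; the helper name latin1_to_utf8 is made up for illustration and does not appear in the original code.

```php
<?php
// Hypothetical helper: treat the input bytes as latin1 and encode them as UTF-8.
function latin1_to_utf8(string $bytes): string
{
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

$nbsp  = "\xA0";                 // non-breaking space as a single latin1 byte
$once  = latin1_to_utf8($nbsp);  // correct UTF-8 encoding of that character
$twice = latin1_to_utf8($once);  // same conversion wrongly applied a second time

echo bin2hex($nbsp), " -> ", bin2hex($once), " -> ", bin2hex($twice), "\n";
// a0 -> c2a0 -> c382c2a0

// Undoing one round of conversion recovers the correct UTF-8 bytes.
$repaired = mb_convert_encoding($twice, 'ISO-8859-1', 'UTF-8');
echo bin2hex($repaired), "\n";   // c2a0
```

The c382 pair is what a UTF-8 page renders as "Â", which matches the stray character in the question. Repairing data this way is only a stopgap; the lasting fix is still to declare the connection charset so the second conversion never happens.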

Related

Issues with Cyrillic character set in PHP (Black Diamonds & question marks)

I am having issues with retrieving and processing data that is in Russian using the cyrillic character set.
I get the data in a text file from an FTP server with the code below and it displays every character with the black diamonds with question marks inside.
If I view it directly by accessing the FTP address with the browser, it displays correctly.
I have tried changing this line:
to
and
and while I get different results, none show the same as when accessing the file directly by the browser.
I'm not sure how to get the code to display the same as the browser when I view it directly
This would be an example of how I view the text file directly which displays correctly : ftp://username:password@ftp.mysite.com/test.txt
This is the code I am using which displays the black diamonds with question marks (or other incorrect characters, depending on the charset mentioned above).
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<?php
$username = "username";
$password = "password";
$server = "ftp.mysite.com"; // ftp_connect() expects a host name without the ftp:// prefix
$remoteFile = "test.txt";
$conn = ftp_connect($server);
if (@ftp_login($conn, $username, $password)) {
echo "";
}
else {
echo "";
}
ob_start();
ftp_get($conn, 'php://output', $remoteFile, FTP_ASCII);
$data = ob_get_contents();
ob_end_clean();
ftp_close($conn);
echo $data;
?>
</html>
I managed to resolve this by using mb_convert_encoding by adding the following line :
$new_data = mb_convert_encoding($data, "utf-8", "Windows-1251");
with the resulting code as :
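<html>
<?php
$username = "username";
$password = "password";
$server = "ftp.mysite.com"; // ftp_connect() expects a host name without the ftp:// prefix
$remoteFile = "test.txt";
$conn = ftp_connect($server);
if (@ftp_login($conn, $username, $password)) {
echo "";
}
else {
echo "";
}
// Capture the downloaded file in an output buffer instead of printing it directly
ob_start();
ftp_get($conn, 'php://output', $remoteFile, FTP_ASCII);
$data = ob_get_contents();
ob_end_clean();
ftp_close($conn);
// The file is stored as Windows-1251; convert it to UTF-8 before output
$new_data = mb_convert_encoding($data, "utf-8", "Windows-1251");
echo $new_data; // echo the converted string, not the raw $data
?>
</html>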
Hope this helps someone...
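The conversion step can be tried without an FTP server. The following is a rough sketch, assuming mbstring is available; the helper name to_utf8 and the sample bytes are made up for illustration, and Windows-1251 is simply the encoding that worked in this answer.

```php
<?php
// Hypothetical helper: return UTF-8 text, converting only when the input
// is not already valid UTF-8. The fallback source encoding is an assumption.
function to_utf8(string $data, string $assumedSource = 'Windows-1251'): string
{
    if (mb_check_encoding($data, 'UTF-8')) {
        return $data; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($data, 'UTF-8', $assumedSource);
}

// The Russian word for "hello" as raw Windows-1251 bytes,
// standing in for the contents of the downloaded file.
$raw  = "\xCF\xF0\xE8\xE2\xE5\xF2";
$text = to_utf8($raw);
echo $text, "\n";
```

The validity check is a heuristic: a short Windows-1251 string can happen to be valid UTF-8, so when the source encoding is known for certain it is safer to convert unconditionally.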

PHP doesn't display the right result [duplicate]

This question already has answers here: UTF-8 all the way through
My problem is that my query result is unreadable for humans.
I get this result in browser:
ID: 5 Reg.date: 2016-02-29 18:57:52 C�si
And the name should be 'Cósi'.
The PHP file is saved as UTF-8, and the database uses utf8-hungarian-ci.
So I run the query, put the results into a $user array, and echo the first 3 items like this: echo "ID: " . $user["userID"] . "Reg.date: " . $user["regdate"] . $user["name"];
I tried header('Content-type: text/html; charset=UTF-8'); but that does not work either.
I have an InnoDB database managed through phpMyAdmin, but the server is my father's computer running XAMPP. Should I look for the problem there?
Here is the whole PHP:
$con = mysqli_connect("127.0.0.1", "kxxx", "csxxx", "bxxx");
$password = "asd";
$username = "carrie@gmail.com";
$statement = mysqli_prepare($con, "SELECT * FROM user WHERE email = ? AND password = ?");
mysqli_stmt_bind_param($statement, "ss", $username, $password);
mysqli_stmt_execute($statement);
mysqli_stmt_store_result($statement);
mysqli_stmt_bind_result($statement, $userID, $reg_date, $name, $email, $password, $phonenumber);
$user = array();
while(mysqli_stmt_fetch($statement)){
$user["userID"] = $userID;
$user["regdate"] = $reg_date;
$user["name"] = $name; // which should be "Cósi"
$user["email"] = $email;
$user["password"] = $password;
$user["phonenumber"] = $phonenumber;
}
echo "ID: " . $user["userID"] . "Reg. dátum: " . $user["regdate"] . $user["name"];
echo json_encode($user, JSON_UNESCAPED_UNICODE);
The json_encode output is missing entirely: even for an empty array I would expect [] to be displayed, but nothing appears, only the first 3 items of the array from the echo.
So I want the C�si to be Cósi. What should I do? I tried changing the meta tag to charset='utf-8', running mysql_query("SET NAMES 'utf8'"), and sending the header I mentioned above. All the columns, all the tables, and the database itself use utf8-hungarian-ci. So maybe the server configuration is bad?
When I insert into the database from PHP, 'Cósi' is displayed correctly in the database; everything is saved correctly.
Thank you guys, I resolved it with:
mysqli_set_charset($con, 'utf8mb4');
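The missing json_encode output fits the same cause. As a small sketch (the sample value is an assumption: "Cósi" with ó as the single byte 0xF3, which is what a latin1 connection would return), json_encode returns false when a string is not valid UTF-8, and echoing false prints nothing:

```php
<?php
// "Cósi" with ó as the single byte 0xF3, which is not valid UTF-8.
$user = ['name' => "C\xF3si"];

$json  = json_encode($user, JSON_UNESCAPED_UNICODE);
$error = json_last_error();
var_dump($json);                      // bool(false)
var_dump($error === JSON_ERROR_UTF8); // bool(true)

// Once the value is real UTF-8, as it is after mysqli_set_charset(),
// the same call succeeds. The conversion here only simulates that.
$user['name'] = mb_convert_encoding($user['name'], 'UTF-8', 'ISO-8859-1');
$fixed = json_encode($user, JSON_UNESCAPED_UNICODE);
echo $fixed, "\n";                    // {"name":"Cósi"}
```

Checking json_last_error() or json_last_error_msg() after a failed call is a quick way to confirm that the data, not the page header, is the problem.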
You need the proper html tags:
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>Your Page Title</title>
</head>
<body>
YOUR CONTENT HERE
</body>
</html>
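Check mb_convert_encoding in PHP. Note that utf8-hungarian-ci is a collation, not an encoding name, so it cannot be passed as the source encoding; pass the encoding the bytes are actually in (ISO-8859-2 below is only an assumed example):
$str = mb_convert_encoding($str, "UTF-8", "ISO-8859-2");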

HTML tags saved to database as HTML entities

I am using this code to update values in a database:
function text_save( $page_name, $page_title, $page_text ) {
$servername = "127.0.0.1";
$username = "abcd";
$password = "password";
$dbname = "thedatabase";
// Create connection
$conn = new mysqli( $servername, $username, $password, $dbname );
// Check connection
if ( $conn->connect_error ) {
die( "Connection failed: " . $conn->connect_error );
}
$page_content = $page_text;
$sql = "UPDATE text SET page_title=?, page_text=? WHERE text_name=?";
// prepare and bind
$stmt = $conn->prepare( $sql );
// check whether the prepare() succeeded
if ( $stmt === false ) {
trigger_error( $conn->error, E_USER_ERROR );
}
$stmt->bind_param( 'sss', $page_title, $page_content, $page_name );
$stmt->execute();
print '<p>Text saved</p>';
$stmt->close();
$conn->close();
}
The variable $page_text is set from a tinymce text area on a form submission and is HTML content.
If I var_dump the value of $page_text, it comes out as:
string(23) "<p>test</p>"
When the $page_content data is saved to the database, it is saved as:
&lt;p&gt;test&lt;/p&gt;
However, if I manually set the value of $page_content to
<p>test</p>,
e.g. $page_content = "<p>test</p>"; instead of
$page_content = $page_text;
it is saved to the database as:
<p>test</p>
I need the HTML to be saved to the database without being converted to HTML entities, i.e. as <p>test</p>, not &lt;p&gt;test&lt;/p&gt;.
This is what I have tried so far:
Setting the page to utf8 - <meta charset="utf-8" />,
setting the form to utf8 - accept-charset="UTF-8",
setting the connection to utf8 - mysqli_set_charset($conn,"utf8");
setting the tinymce init with - entity_encoding : "raw"
What am I doing wrong here, and why does the HTML string save correctly when I manually set the variable rather than using the form variable value (which seems to be identical)?
Thanks very much!
You might be looking for the html_entity_decode function,
http://php.net/manual/en/function.html-entity-decode.php
$page_content = html_entity_decode($page_text);
Be careful about injections and such.
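A short sketch of what seems to be happening, assuming the form really submits entity-encoded markup (the literal below is a guess, chosen because its length matches the string(23) reported by var_dump):

```php
<?php
// Assumed form value: the markup arrives already entity-encoded.
// A browser renders this as <p>test</p>, which hides the difference.
$page_text = '&lt;p&gt;test&lt;/p&gt;';
echo strlen($page_text), "\n";     // 23

// Decoding turns the entities back into real tags before saving.
$page_content = html_entity_decode($page_text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo strlen($page_content), "\n";  // 11
echo $page_content, "\n";          // <p>test</p>
```

A hand-typed "<p>test</p>" is 11 bytes long, so a reported length of 23 is the giveaway. Viewing the page source, or wrapping the dumped value in htmlspecialchars(), shows the real bytes.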
I think the problem is in the editor.
Can you print the value of $page_content before inserting it into the database and show us?
Also, see this post:
HTML Tags stripped using tinyMCE

Special characters with Ajax and MySQL

I have a MySQL database encoded with the default characterset UTF8. I have also a PHP code encoded with the same charset meta charset="UTF-8".
My connection to the database is configured to use UTF8 too
new PDO("mysql:host=" .$host. ";dbname=".$database,$username,$password,
array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));
But I have a problem when I use Ajax to get the content of a textbox and insert it into the database.
If I do not use special characters it works fine but when I use a quote or something everything stops working.
I tried to use UTF8_encode and UTF8_decode but nothing changed
EDIT
PHP
...
<meta charset="UTF-8">
...
<textarea class="commentBox" id="<?php echo $id_case;?>"></textarea>
<button class="saveComment" id="<?php echo $id_case;?>"> Save comment </button>
//id_case is different for each textarea
Javascript
$('.saveComment').click(function()
{
var idComment = this.id;
var content = $('#'+idComment+'.commentBox').val();
add_comment(idComment, content);
});
function add_comment(case_id, content)
{
$.post("../functions/ajax/add_comment.php",
{
id_case: case_id,
content: content
},
function(data,status)
{
alert("It worked !");
console.log("Function add_comment : "+status);
});
}
add_comment.php
<?php
if(isset($_POST['id_case'], $_POST['content']))
{
$case = $_POST['id_case'];
$content = $_POST['content'];
}
else
{
echo "Error during sending data to [add_comment.php]";
}
if($db != null)
{
try
{
$sql = ("UPDATE cases SET progress_remarks = '$content' WHERE id_cases = $case");
$result = $db->exec($sql);
echo $content;
}
catch(PDOException $e)
{
echo $sql . "<br>" . $e->getMessage();
}
}
else echo "Erreur interne (fill_progress.php)";
?>
My database connection is done somewhere else but looks like this
$this->con = new PDO("mysql:host=" .$host. ";dbname=".$database,$username,$password,
array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));
"when I use a quote or something everything stops working."
It is unclear whether the problem is (1) just an escaping issue, or (2) also a utf8 issue.
Since PDO has a builtin way to take care of escaping, use it:
$stmt = $db->prepare("UPDATE cases SET progress_remarks = ? WHERE id_cases = ?");
$stmt->execute([$content, $case]);
This is probably the preferred way to set the charset with PDO for MySQL: $db = new PDO('mysql:host=host;dbname=db;charset=utf8mb4', $user, $pwd);
meta charset="UTF-8" refers to the tag in HTML, not PHP or MySQL.
SHOW CREATE TABLE -- Is the column in question declared CHARACTER SET utf8?
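To see why a single quote is enough to break the original statement, here is a small sketch with no database involved. The sample comment text and case id are made up for illustration.

```php
<?php
// Hypothetical input: a comment containing an apostrophe, and a case id.
$content = "It's done";
$case    = 7;

// Interpolated, as in add_comment.php: the apostrophe ends the SQL string early.
$interpolated = "UPDATE cases SET progress_remarks = '$content' WHERE id_cases = $case";
echo $interpolated, "\n";
// UPDATE cases SET progress_remarks = 'It's done' WHERE id_cases = 7

// Parameterized: the statement text never contains user data,
// so quotes in the value cannot change its structure.
$parameterized = "UPDATE cases SET progress_remarks = ? WHERE id_cases = ?";
$params = [$content, $case];
// With a real connection: $db->prepare($parameterized)->execute($params);
```

The interpolated statement ends up with three quote characters, so MySQL sees an unterminated string. That is a syntax problem, separate from any charset issue, and it is also an SQL injection risk.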

Indicate encoding of XML file using objDOM ->load()

I am trying to read an XML file and then insert the obtained values into a database. The entire process works great, as long as there are no special characters in the XML. The XML is formatted as:
<link>
<name>Cech</name>
<club>Chelsea</club>
</link>
If the name tag encloses a name like Suárez, I get the error: Input is not proper UTF-8, indicate encoding ! Bytes: 0xE1 0x72 0x65 0x7A in file:///C:/wamp/www/ADB/links.xml, line: 1857 in C:\wamp\www\ADB\phptry.php on line 14, where line 1857 contains the name Suárez. I tried including <?xml version="1.0" encoding="UTF-8"?>
at the beginning of the file and using utf8_encode(file_get_contents('links.xml')), but it doesn't work. Any suggestions? This is my PHP code, which works when there are no special characters:
<?php
$dbhost = 'localhost';
$dbuser = 'root';
$dbpass = '';
$conn = mysql_connect($dbhost, $dbuser, $dbpass);
if(! $conn )
{
die('Could not connect: ' . mysql_error());
}
$objDOM = new DOMDocument();
//$content = utf8_encode(file_get_contents('links.xml'));
$objDOM->load('links.xml'); //make sure path is correct
$note = $objDOM->getElementsByTagName("link");
// for each note tag, parse the document and get values for
// tasks and details tag.
foreach( $note as $value )
{
$player = $value->getElementsByTagName("name");
$player_name = $player->item(0)->nodeValue;
$playername = addslashes($player_name);
$club = $value->getElementsByTagName("club");
$club_name = $club->item(0)->nodeValue;
// $points = $value->getElementsByTagName("points");
// $point_value = $points->item(0)->nodeValue;
$sql = "INSERT INTO pilayers (name,club) VALUES('$playername','$club_name')";
mysql_select_db('players');
$retval = mysql_query( $sql, $conn );
if(! $retval )
{
die('Could not enter data: ' . mysql_error());
}
echo "Entered data successfully\n";
}
mysql_close($conn);
?>
The error says that the XML file is not encoded in UTF-8. You declared the encoding in the XML declaration, but that does not mean that your editor really saved the file as UTF-8.
How to change the encoding depends on your editor/ide.
Eclipse: Edit -> Set Encoding
PHPStorm: File -> File Encoding
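If re-saving the file is not an option, the bytes can be converted in PHP before parsing. This is a sketch under assumptions: the sample XML string stands in for links.xml, and ISO-8859-1 is a guess based on the 0xE1 byte in the error message. The converted string has to be parsed with loadXML(), because load('links.xml') would read the unconverted file again.

```php
<?php
// Hypothetical stand-in for links.xml: "Suárez" saved as latin1,
// so á is the single byte 0xE1 reported in the error message.
$latin1Xml = "<link><name>Su\xE1rez</name></link>";

libxml_use_internal_errors(true); // collect parser errors instead of emitting warnings

// Convert the raw bytes to UTF-8 first, then parse the converted string.
// ISO-8859-1 is an assumption; use the encoding the file was really saved in.
$utf8Xml = mb_convert_encoding($latin1Xml, 'UTF-8', 'ISO-8859-1');

$objDOM = new DOMDocument();
$loaded = $objDOM->loadXML($utf8Xml);
$name   = $objDOM->getElementsByTagName('name')->item(0)->nodeValue;
echo $name, "\n";
```

Another option is to leave the bytes alone and declare the real encoding in the XML declaration, for example encoding="ISO-8859-1"; DOMDocument then returns node values as UTF-8 either way.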
