MySQL or PHP is appending a Â whenever the £ is used - php

The answers provided have all been great. I mentioned in the comments on Alnitak's answer that I would need to take a look at my CSV generation script because, for whatever reason, it wasn't outputting UTF-8.
As was correctly pointed out, it WAS outputting UTF-8; the problem was with Ye Olde Microsoft Excel, which wasn't picking up the encoding the way I would have liked.
My existing CSV generation looked something like:
// Create file and exit;
$filename = $file."_".date("Y-m-d_H-i",time());
header("Content-type: application/vnd.ms-excel");
header("Content-disposition: csv" . date("Y-m-d") . ".csv");
header( "Content-disposition: filename=".$filename.".csv");
echo $csv_output;
It now looks like:
// Create file and exit;
$filename = $file."_".date("Y-m-d_H-i",time());
header("Content-type: text/csv; charset=ISO-8859-1");
header("Content-disposition: csv" . date("Y-m-d") . ".csv");
header("Content-disposition: filename=".$filename.".csv");
echo iconv('UTF-8', 'ISO-8859-1', $csv_output);
-------------------------------------------------------
ORIGINAL QUESTION
Hi,
I've got a form which collects data. The form works OK, but I've just noticed that if someone types or uses a '£' symbol, the MySQL DB ends up with 'Â£'.
Not really sure where or how to stop this from happening, code and DB information to follow:
MySQL details
mysql> SHOW COLUMNS FROM fraud_report;
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| id | mediumint(9) | | PRI | NULL | auto_increment |
| crm_number | varchar(32) | YES | | NULL | |
| datacash_ref | varchar(32) | YES | | NULL | |
| amount | varchar(32) | YES | | NULL | |
| sales_date | varchar(32) | YES | | NULL | |
| domain | varchar(32) | YES | | NULL | |
| date_added | datetime | YES | | NULL | |
| agent_added | varchar(32) | YES | | NULL | |
+--------------+--------------+------+-----+---------+----------------+
8 rows in set (0.03 sec)
PHP Function
function processFraudForm($crm_number, $datacash_ref, $amount, $sales_date, $domain, $agent_added) {
// Insert Data to DB
$sql = "INSERT INTO fraud_report (id, crm_number, datacash_ref, amount, sales_date, domain, date_added, agent_added) VALUES (NULL, '$crm_number', '$datacash_ref', '$amount', '$sales_date', '$domain', NOW(), '$agent_added')";
$result = mysql_query($sql) or die (mysql_error());
if ($result) {
$outcome = "<div id=\"success\">Emails sent and database updated.</div>";
} else {
$outcome = "<div id=\"error\">Something went wrong!</div>";
}
return $outcome;
}
Example DB Entry
+----+------------+--------------+---------+------------+--------------------+---------------------+------------------+
| id | crm_number | datacash_ref | amount | sales_date | domain | date_added | agent_added |
+----+------------+--------------+---------+------------+--------------------+---------------------+------------------+
| 13 | 100xxxxxxx | 10000000 | Â£10.93 | 18/12/08 | blargh.com | 2008-12-22 10:53:53 | agent.name |

What you're seeing is UTF-8 encoding - it's a way of storing Unicode characters in a relatively compact format.
The pound symbol has value 0x00a3 in Unicode, but when it's written in UTF-8 that becomes 0xc2 0xa3 and that's what's stored in the database. It seems that your database table is already set to use UTF-8 encoding. This is a good thing!
If you pull the value back out from the database and display it on a UTF-8 compatible terminal (or on a web page that's declared as being UTF-8 encoded) it will look like a normal pound sign again.
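As a rough illustration of the point above, the following PHP sketch prints the bytes of the pound sign and shows what happens when those bytes are read as ISO-8859-1 instead of UTF-8. It assumes the iconv extension is available (it usually is); the variable names are only for illustration.

```php
<?php
// The pound sign, U+00A3, as a UTF-8 string (PHP 7+ Unicode escape syntax).
$pound = "\u{00A3}";

// UTF-8 stores it as the two bytes 0xC2 0xA3.
echo bin2hex($pound), "\n";   // c2a3
echo strlen($pound), "\n";    // 2 (strlen counts bytes)

// Treating those two bytes as ISO-8859-1 yields two characters, Â and £.
// Converting that misreading to UTF-8 makes it visible on a UTF-8 terminal.
$misread = iconv('ISO-8859-1', 'UTF-8', $pound);
echo $misread, "\n";          // Â£
```

The last line reproduces the symptom from the question: the data is fine, but whatever displays it is decoding UTF-8 bytes with a single-byte charset.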

Â£ is 0xC2 0xA3, which is the UTF-8 encoding of the £ symbol, so you're storing it as UTF-8 but presumably viewing it as Latin-1 or something other than UTF-8.
It's useful to know how to spot and decode UTF-8 by hand - check the Wikipedia page for info on how the encoding works:
0xC2A3 = 110 00010 10 100011
The bits after the 110 and 10 marker prefixes (00010 and 100011) are the actual "payload", which gives 00010100011, i.e. 10100011, which is 0xA3, the pound symbol.
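The same hand decoding can be sketched in PHP with bit masks. The function name decodeTwoByteUtf8 is made up for this example, and it only handles the two-byte pattern 110xxxxx 10xxxxxx described above, not UTF-8 in general.

```php
<?php
// Decode a two-byte UTF-8 sequence (110xxxxx 10xxxxxx) into its code point.
function decodeTwoByteUtf8(string $bytes): int
{
    if (strlen($bytes) !== 2) {
        throw new InvalidArgumentException('Expected exactly two bytes');
    }
    $first  = ord($bytes[0]);
    $second = ord($bytes[1]);
    // The first byte must start with 110, the second with 10.
    if (($first & 0xE0) !== 0xC0 || ($second & 0xC0) !== 0x80) {
        throw new InvalidArgumentException('Not a two-byte UTF-8 sequence');
    }
    // Keep the 5 payload bits of the first byte and the 6 of the second.
    return (($first & 0x1F) << 6) | ($second & 0x3F);
}

printf("0x%02X\n", decodeTwoByteUtf8("\xC2\xA3")); // 0xA3
```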

In PHP, another small scale solution is to do a string conversion on the returned utf8 string:
print iconv('UTF-8', 'ASCII//TRANSLIT', "Mystring â"); //"Mystring "
Or on other platforms, fire a system call to the iconv command (Linux / OS X):
http://php.net/manual/en/function.iconv.php#83238

You need to serve your HTML in utf-8 encoding (actually everyone needs to do this I think!)
Header like:
Content-Type: text/html; charset=UTF-8
Or the equivalent. Double check the details though. You should always declare the charset, as a browser can otherwise default to anything it likes.

To remove a Â use:
$column = str_replace("\xc2\xa0", '', $column);
Credits among others: How to remove all occurrences of c2a0 in a string with PHP?

Thanks a lot. I had been suspecting MySQL of corrupting the pound symbol. Now all I need to do is wrap the output in the iconv function wherever the CSV record is generated. I am happy that someone showed exactly what to do. I sincerely appreciate you displaying the previous and the new 'header' values. It was a great help to me.
-mark

Suppose you save the line "The £50,000 Development Challenge" in columns of two different data types, i.e. a "varchar" and a "text" field.
Before saving, I replaced the symbol with its HTML equivalent using the following function.
str_replace("£", "&pound;", $title);
You will find that the value stored in the text field is &pound; whereas in the varchar field it is "£".

Related

Replacing URL pattern for MySQL

I'm trying to figure out how to use a regex search on a MySQL column to update some data.
The problem is I'm trying to rename part of a URL (i.e a directory).
The table looks something like this (although it's just an example, the actual data is arbitrary):
myTable:
| user_name | URL |
| ------------- |:---------------------------------------------------------:|
| John | http://example.com/path/to/Directory/something/something |
| Jane | http://example.com/path/to/Directory/something/else |
| Jeff | http://example.com/path/to/Directory/something/imgae.jpg |
I need to replace all the URLs that have "path/to/Directory/" to "path/to/anotherDirectory/" while keeping the rest of URL intact.
So the result after the update should look like this:
| user_name | URL |
| ------------- |:----------------------------------------------------------------:|
| John | http://example.com/path/to/anotherDirectory/something/something |
| Jane | http://example.com/path/to/anotherDirectory/something/else |
| Jeff | http://example.com/path/to/anotherDirectory/something/imgae.jpg |
At the moment, the only way I could figure out how to do it is using a combination of regex queries to check for the directory, and then looping over the results and changing each URL, like this:
$changeArr = $db->query("SELECT URL FROM myTable WHERE URL REGEXP 'path/to/Directory/.+'");
$URLtoChange = "path/to/Directory/";
$replace = "path/to/anotherDirectory/";
foreach ($changeArr as $URL) {
// Build the new URL in its own variable so $replace is not overwritten between iterations
$newURL = str_replace($URLtoChange, $replace, $URL);
$db->query("UPDATE myTable SET URL = :newURL WHERE URL = :URL", array("URL"=>$URL,"newURL"=>$newURL));
}
This seems to work pretty well; however, with such a big table it can be pretty heavy on performance.
I was wondering if there's a more efficient way to do this? Perhaps with some sort of regex replace in a mySQL query.
Just use the REPLACE() function:
UPDATE myTable
SET URL = REPLACE(url, 'path/to/Directory/', 'path/to/anotherDirectory/')
WHERE URL LIKE '%path/to/Directory/%'

Error on accentuated characters with PHP and MySQL

My problem is that text written directly in PHP displays its accents correctly, but when an accented word comes from MySQL, the letters come out like this: �.
I tried setting the HTML charset to ISO-8859-1 and it fixed the MySQL letters, but broke the others. One way to fix it all is to save my .php files as ISO-8859-1, but I can't do that; I need them to be UTF-8 encoded.
What can I do?
Solution for the moment: include mysqli_set_charset($link, "utf8"); before the queries (it only needs to be done once for each connection made). I'm still looking for a conclusive solution on the server, not on the client.
EDIT:
mysql> SHOW VARIABLES LIKE 'char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
mysql> SHOW VARIABLES LIKE 'collation%';
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+----------------------+-----------------+
mysql> show variables like "character_set_database";
+------------------------+-------+
| Variable_name | Value |
+------------------------+-------+
| character_set_database | utf8 |
+------------------------+-------+
1 row in set (0.00 sec)
mysql> show variables like "collation_database";
+--------------------+-----------------+
| Variable_name | Value |
+--------------------+-----------------+
| collation_database | utf8_general_ci |
+--------------------+-----------------+
1 row in set (0.00 sec)
These are the values of my database, but I still cannot make it right.
EDIT2:
<meta charset="utf-8">
...
$con = mysqli_connect('localhost', 'root', 'root00--', 'eicomnor_db');
$query = "SELECT * FROM table";
$result = mysqli_query($con, $query);
while ($row = mysqli_fetch_assoc($result)) {
echo "<tr>";
echo "<td>" . $row['id'] . "</td>";
echo "<td>" . $row['nome'] . "</td>";
echo "</tr>";
}
mysqli_close($con);
Here's the PHP code.
First off, don't try to modify your php files in the direction of ISO-8859-1, that's going backwards, and may lead to compatibility issues with browsers on down the line. Instead, you want to be following the path to utf-8 from the bottom up.
The easiest thing to check is to make sure that you're serving your html as utf-8: AddDefaultCharset utf-8 in your apache config may help with that, and <meta charset="utf-8"> in your html header will as well.
The second thing to check is to make sure that the mysql connection & collation uses utf-8:
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html or http://docs.moodle.org/23/en/Converting_your_MySQL_database_to_UTF8
The final and most annoying step is to convert any data actually in the database to utf-8. Back up your data with a standard mysql dump first! There are a few tricks to simplify this process by creating a dump of the database as utf-8 and then putting it back into the system with the right collation, but be aware that this is a delicate process and be sure you have a solid backup to work with first! http://docs.moodle.org/23/en/Converting_your_MySQL_database_to_UTF8 is a good guide to that process.
Good luck! charset issues with old databases are often more work than they initially appear.
Have you tried iconv? As you know that the charset used on the DB is ISO-8859-1, you can convert to your charset (I'm assuming UTF-8):
// Assuming that $text is the text coming from the DB
$text = iconv("ISO-8859-1", "UTF-8", $text);
Assuming you send the output to the browser, you need to ensure that the proper charset <meta charset="utf-8" /> is set and that you don't override it in your browser settings (check that it's either "auto" or "utf-8").
Including mysqli_set_charset($link, "utf8"); before the queries (it only needs to be done once for each connection made) resolves the problem.

Unable to update/insert on mySQL DB using PHP/PDO

I'm having trouble doing updates and inserts on my database using PDO in PHP. The error code notes a success, but the expected changes aren't reflected in the database.
Here's where I set up my connection:
$dsn = "mysql:host=".DB_HOST.";dbname=".DB_NAME;
$db = new PDO($dsn, DB_USER, DB_PASS);
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->setAttribute(PDO::ATTR_AUTOCOMMIT, FALSE);
I can connect and SELECT on things just fine, but it won't update or insert at all. I've tried just using a hard-coded string, but not even that will work. Here's the code for that:
$ins = "insert into choice_history (id_choice_history,choice_num,choice_taken) values ( 0, 10, 28);";
if($stmt = $this->_db->prepare($ins))
{
$status = $stmt->execute();
return $errorcode = $stmt->errorCode();
}
I have a similar update string but they both have the same results.
Table definition (it has no constraints):
mysql> describe choice_history;
+-------------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| id_choice_history | int(11) | NO | | NULL | |
| choice_taken | int(11) | YES | | NULL | |
| choice_num | int(11) | YES | | NULL | |
+-------------------+---------+------+-----+---------+----------------+
The error code for this is always 00000 (success). I've also tried calling exec() with the same string, but no dice. I can paste that query into mysql on my server and it executes fine, but from PDO nothing happens.
At this point I'm at a loss. PDO reports success but nothing happens in the database. How can I figure out where the problem is?
You have turned auto-commit off, so you need to explicitly commit() all transactions.
Your queries are successful (thus the results you are seeing), they are just never committed.
If you don't want to have to explicitly commit every request, you need to remove this line of code:
$db->setAttribute(PDO::ATTR_AUTOCOMMIT, FALSE);
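If you would rather keep explicit control, a minimal sketch of the commit approach could look like the following. The function name insertChoice is hypothetical, and the sketch assumes $db is a PDO connection with PDO::ATTR_ERRMODE set to PDO::ERRMODE_EXCEPTION as in the question. It wraps the statement in beginTransaction() and commit(), which should behave the same whether auto-commit is on or off.

```php
<?php
// Sketch only: insert one row into choice_history and commit it explicitly.
// Assumes $db throws exceptions on failure (PDO::ERRMODE_EXCEPTION).
function insertChoice(PDO $db, int $idChoiceHistory, int $choiceNum, int $choiceTaken): void
{
    $db->beginTransaction();
    try {
        $stmt = $db->prepare(
            'INSERT INTO choice_history (id_choice_history, choice_num, choice_taken)
             VALUES (:id, :num, :taken)'
        );
        $stmt->execute([
            ':id'    => $idChoiceHistory,
            ':num'   => $choiceNum,
            ':taken' => $choiceTaken,
        ]);
        // Without this call the insert is rolled back when the connection closes.
        $db->commit();
    } catch (Throwable $e) {
        // Undo the partial work and let the caller see the original error.
        $db->rollBack();
        throw $e;
    }
}
```

With the values from the question, the call would be insertChoice($db, 0, 10, 28);.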

Mysql string check on equals is false for the same values

I have a problem with MySQL.
I have a table with information parsed from websites. A strange string interpretation appears:
The query
select id, address from pagesjaunes_test where address = substr(address,1,length(address)-1)
returns a set of values instead of none.
At the beginning I executed functions such as:
address = replace(address, '\n', '')
address = replace(address, '\t', '')
address = replace(address, '\r', '')
address = replace(address, '\r\n', '')
address = trim(address)
but the problem still persists.
Values of the field 'address' have some French chars, but the query also returned values that contain only alphanumeric English chars.
Another test: I tried to check the length of the strings and ... strlen() from PHP and LENGTH() from MySQL display different results! Sometimes the difference is 2 chars, sometimes 1 char, without a specific "rule".
Visually I can't see any spaces, tabs or anything else.
After I modified an address manually (I deleted the whole string and wrote it again), the problem was solved, but I have ~6000 values, so this is not a solution :)
What can the problem be?
I suppose the strings can contain something like an "empty char", but how do I detect and remove it?
Thanks
P.S.
The problem is not just the length. I need to join this table with another one using a condition that checks whether the values of the 'address' fields are equal. Even though the fields have the same collation and the tables have the same collation, the query reports that no addresses match.
E.g.
For query:
SELECT p.address,char_length(p.address) , r.address, char_length(r.address)
FROM `pagesjaunes_test` p
LEFT JOIN restaurants r on p.name=r.name
WHERE
p.postal_code=r.postal_code
and p.address!=r.address
and p.phone=''
and p.cuisines=''
LIMIT 10
So: p.address!=r.address
The result is:
+-------------------------------------+------------------------+--------------------------+------------------------+
| address | char_length(p.address) | address | char_length(r.address) |
+-------------------------------------+------------------------+--------------------------+------------------------+
| Dupin Marc13 quai Grands Augustins | 34 | 13 quai Grands Augustins | 24 |
| 39 r Montpensier | 16 | 39 r Montpensier | 16 |
| 8 r Lord Byron | 14 | 3 r Balzac | 10 |
| 162 r Vaugirard | 15 | 162 r Vaugirard | 15 |
| 32 r Goutte d'Or | 16 | 32 r Goutte d'Or | 16 |
| 2 r Casimir Périer | 18 | 2 r Casimir Périer | 18 |
| 20 r Saussier Leroy | 19 | 20 r Saussier Leroy | 19 |
| Senes Douglas22 r Greneta | 25 | 22 r Greneta | 12 |
| Ngov Ly Mey44 r Tolbiac | 23 | 44 r Tolbiac | 12 |
| 33 r N-D de Nazareth | 20 | 33 r N-D de Nazareth | 20 |
+-------------------------------------+------------------------+--------------------------+------------------------+
As you see, "162 r Vaugirard" and "20 r Saussier Leroy" contain only ASCII chars and have the same length, but aren't equal!
Maybe have a look at the encoding of the MySQL text fields - UTF-8 encodes non-ASCII characters with two or more bytes; only the ASCII subset gets encoded with one byte.
MySQL knows UTF-8 and counts correctly.
PHP's plain string functions such as strlen() aren't UTF-8 aware and count bytes.
So if PHP counts more than MySQL, this is probably the cause, and you could have a look at utf8_decode().
br from Salzburg!
The official documentation says:
Returns the length of the string str, measured in bytes. A multi-byte character counts as multiple bytes. This means that for a string containing five two-byte characters, LENGTH() returns 10, whereas CHAR_LENGTH() returns 5.
So, use CHAR_LENGTH instead :)
select id, address from pagesjaunes_test
where address = substr(address, 1, char_length(address) - 1)
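The same bytes-versus-characters distinction exists on the PHP side. As a small sketch, assuming the string is UTF-8 and the iconv extension is available, strlen() plays the role of LENGTH() and iconv_strlen() the role of CHAR_LENGTH(). The address is one of the rows from the question's result table.

```php
<?php
// "2 r Casimir Périer" as UTF-8; the é (U+00E9) takes two bytes.
$address = "2 r Casimir P\u{00E9}rier";

$byteCount = strlen($address);                 // counts bytes, like LENGTH()
$charCount = iconv_strlen($address, 'UTF-8');  // counts characters, like CHAR_LENGTH()

echo $byteCount, "\n"; // 19
echo $charCount, "\n"; // 18
```

If the mbstring extension is installed, mb_strlen($address, 'UTF-8') gives the same character count.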
Finally, I found the problem. After changing the collation to ascii_general_ci, all non-ASCII chars were transformed to "?". Some spaces were also replaced with "?". Checking the initial values, the MySQL function ORD() returned 160 (instead of 32) for these spaces. So,
UPDATE pagesjaunes_test SET address = TRIM(REPLACE(REPLACE(address, CHAR(160), ' '), '  ', ' '))
resolved my question.
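Character 160 is the no-break space (U+00A0), which looks identical to an ordinary space. If the cleanup has to happen in PHP before the data is stored, a sketch along these lines could do the same job. The function name normalizeSpaces is made up for this example, and it assumes the scraped text is UTF-8, where the no-break space is the byte pair 0xC2 0xA0.

```php
<?php
// Replace no-break spaces with ordinary spaces, collapse runs of spaces, and trim.
// Assumption: $text is UTF-8 encoded.
function normalizeSpaces(string $text): string
{
    $text = str_replace("\u{00A0}", ' ', $text);  // U+00A0 -> ordinary space
    $text = preg_replace('/ {2,}/', ' ', $text);  // collapse repeated spaces
    return trim($text);
}

$scraped = "162\u{00A0}r Vaugirard ";
var_dump($scraped === '162 r Vaugirard');                  // bool(false)
var_dump(normalizeSpaces($scraped) === '162 r Vaugirard'); // bool(true)
```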

binary file(pdf) from database on web page using PHP

I have uploaded a pdf file (varbinary(MAX)) in MSSQL running on my desktop via VS2010 (Ref) with just the UPDATE statement in the Ref.
The table has additional data associated with the pdf, like name, description, date.
Table in database:
Name | Description | Date | File |
DataSheet | Milling M/C | 2004-01-01 | <Binary data> |
Is it possible to display the table content in a web page as
Name | Description | Date | File |
DataSheet | Milling M/C | 2004-01-01 | (link to the pdf) |
or
Name | Description | Date | File |
DataSheet | Milling M/C | 2004-01-01 | (an icon of pdf) |
using PHP so that the user can click the link/file to view it. There will be a lot of rows; I just gave 1 row as an example.
Please let me know if I am not clear, thanks in advance.
PS: I did create some images but was not able to post them, sorry about the format
What I have done so far
I am using 'Connect without a DSN (using a connection string)' from here
$conn = new COM ("ADODB.Connection") or die("Cannot start ADO");
$connStr = "PROVIDER=SQLOLEDB;SERVER=".$myServer.";UID=".$myUser.";PWD=".$myPass.";DATABASE=".$myDB;
$conn->open($connStr);
$query = "select * from Table";
$rs = $conn->execute($query);
$num_columns = $rs->Fields->Count();
for ($i=0; $i < $num_columns; $i++) {
$fld[$i] = $rs->Fields($i);
}
while (!$rs->EOF)
{
for ($i=0; $i < $num_columns; $i++) {
echo $fld[$i]->value;
}
$rs->MoveNext();
}
The result I am getting is garbage for binary data.
DataSheet | Milling M/C | 2004-01-01 | some_garbage |
Yes, it is possible. Create a link to a separate page that accepts a record id in the query string. For instance, you can link to ViewFile.php?recordid=123.
Then, in ViewFile.php, you can get the binary data, output some headers, and output the binary (I think even echo would suffice for that).
The headers should contain at least a Content-Type header, telling the browser that the binary data is to be interpreted as application/pdf data. If it is not a pdf, you should specify appropriate headers. Lists of valid Content-Types can be found all over the internet.
You can specify a filename too. But the important thing is that you can determine what kind of data it is. If you don't know whether it is an icon or a pdf, you cannot tell the browser either. The browser will need to know (by reading the content-type header) how the data should be interpreted. It cannot guess it, even when the url would have a .pdf extension.
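A rough sketch of the header part of such a ViewFile.php follows. The helper names pdfHeaders and sendPdf are hypothetical, and fetching the binary data for the requested record id is left out because it depends on your database layer; $pdfBytes is assumed to already hold the contents of the File column.

```php
<?php
// Hypothetical helper: build the response headers for a PDF download/view.
function pdfHeaders(string $filename, int $length): array
{
    // Strip characters that would break the quoted filename parameter.
    $safeName = str_replace(['"', "\r", "\n", '\\'], '', $filename);
    return [
        'Content-Type: application/pdf',
        'Content-Disposition: inline; filename="' . $safeName . '"',
        'Content-Length: ' . $length,
    ];
}

// Hypothetical helper: send the headers, then the raw bytes.
// Nothing else may be echoed before this is called.
function sendPdf(string $pdfBytes, string $filename): void
{
    foreach (pdfHeaders($filename, strlen($pdfBytes)) as $line) {
        header($line);
    }
    echo $pdfBytes;
}
```

The listing page then only needs to print a link such as ViewFile.php?recordid=123 in the File column instead of echoing the binary value itself.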
