Why do strings behave like an array in PHP 5.3? - php

I have this code:
$tierHosts['host'] = isset($host['name']) ? $host['name'] : $host;
It works fine in PHP 5.5, but in PHP 5.3 the condition evaluates to true even when $host contains a string like pjba01. $tierHosts['host'] then ends up holding only the first letter of that string, that is, p.
What's so wrong with my code?

You can access strings like an array, and prior to PHP 5.4 non-numeric offsets like your 'name' were silently cast to 0, which means you accessed the first character of that string:
character | p | j | b | a | 0 | 1 |
-----------------------------------
index | 0 | 1 | 2 | 3 | 4 | 5 |
As of PHP 5.4 such offsets trigger a warning, and isset() on them returns false, as you can also read in the manual:
As of PHP 5.4 string offsets have to either be integers or integer-like strings, otherwise a warning will be thrown. Previously an offset like "foo" was silently cast to 0.
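One way to get the same result on both PHP versions is to check the type before checking the key. The following is a minimal sketch of that idea; the helper name resolveHostName is made up for illustration and is not part of the original code.

```php
<?php
// Hypothetical helper (not from the question): returns the host name whether
// $host is an array with a 'name' key or already a plain string.
function resolveHostName($host)
{
    // is_array() guards the key lookup, so a string such as 'pjba01' is never
    // indexed with 'name' (which PHP 5.3 silently treats as offset 0).
    if (is_array($host) && isset($host['name'])) {
        return $host['name'];
    }
    return $host;
}

$tierHosts = array();
$tierHosts['host'] = resolveHostName('pjba01');                  // string input
$fromArray         = resolveHostName(array('name' => 'pjba01')); // array input
```

Both calls yield the full string pjba01, because the string input never reaches the offset lookup.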

Related

Sphinx: PDO exception with certain characters

I'm trying to get the Sphinx search server working with PDO, but it triggers a syntax error when using the MATCH() function in specific scenarios.
Ex.:
In my code I split the search query on spaces and then join the parts with the | (OR) operator. If someone types test > 3, the argument to the match function becomes (test | > | 3). This combination triggers: Syntax error or access violation: 1064 main_idx: syntax error, unexpected '|' near ' > | 3'. I don't think it's an escaping problem, because the > character is not on the escape list, and escaping it doesn't help either. Is this a bug in the version of Sphinx I'm using? Or am I doing something wrong?
I'm using Sphinx version 2.2.11. It's actually a docker instance provided by this image: jamesrwhite/sphinx-alpine:2.2.11
The PHP version is 7.2.
This is my non-working code:
$searchQuery = "SELECT * FROM main_idx WHERE MATCH(:search)";
$dbh = new PDO('mysql:host=127.0.0.1;port=9306', 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$stmt = $dbh->prepare($searchQuery);
$stmt->bindValue('search', 'test | > | 3');
$stmt->execute();
The same code works perfectly fine if I use the MySQLi extension. It also works fine with PDO and Sphinx version 2.2.6. Something must have changed between 2.2.6 and 2.2.11. Has anyone encountered this issue?
This behaviour is caused by this bug http://sphinxsearch.com/bugs/view.php?id=2305 and this fix https://github.com/sphinxsearch/sphinx/commit/d9923f76c7724fa8d05a3d328e26a664799841b7. In the previous revision ' > | ' was supported.
We at Manticore Search (a fork of Sphinx) will check whether the fix was correct and will make a better fix if it was not. Thanks for pointing this out.
Meanwhile you can use 2.2.8 from http://sphinxsearch.com/downloads/archive/ or build manually from the latest revision which supports the syntax (https://github.com/sphinxsearch/sphinx/commit/f33fa667fbfd2031ff072354ade4b050649fbd4e)
[UPDATE]
The fix is correct. Previous versions were wrong not to report this error when the special character (>) is NOT in your charset_table. To work around this you can add > to your charset_table and then escape it in the search query, e.g.:
mysql> select * from idx_min where match('test | \\> | a');
+------+---------+----------+-------+------+
| id | doc | group_id | color | size |
+------+---------+----------+-------+------+
| 7 | dog > < | 5 | red | 3 |
+------+---------+----------+-------+------+
1 row in set (0.00 sec)
mysql> select * from idx_min where match('test | \\< | a');
+------+---------+----------+-------+------+
| id | doc | group_id | color | size |
+------+---------+----------+-------+------+
| 7 | dog > < | 5 | red | 3 |
+------+---------+----------+-------+------+
1 row in set (0.00 sec)
or
$stmt->bindValue('search', 'test | \\< | a');
in PDO.
There's still one small bug, though: if a non-special character is not in charset_table, no error is generated. E.g.
mysql> select * from idx_min where match('test | j | a');
Empty set (0.00 sec)
works fine even though j is not in charset_table. I've filed a bug in our bug tracker https://github.com/manticoresoftware/manticoresearch/issues/156
Thanks again for helping to point this out.
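If adding > to charset_table is not an option, another possibility is to drop tokens that cannot match anything before joining them with |. The sketch below is only an illustration under a stated assumption: that letters and digits are the only indexed characters. The helper name buildOrQuery is hypothetical, and the sketch does not escape query operators inside the tokens it keeps.

```php
<?php
// Hypothetical helper (not part of Sphinx or PDO): builds an OR expression
// for MATCH() from free-form user input.
// Assumption: only letters and digits are indexed (no extra charset_table
// entries), so a token without any of them can never match and is dropped.
function buildOrQuery($input)
{
    $tokens = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);
    $kept = array();
    foreach ($tokens as $token) {
        if (preg_match('/[\p{L}\p{N}]/u', $token)) {
            $kept[] = $token;
        }
    }
    return implode(' | ', $kept);
}

$search = buildOrQuery('test > 3');
// $search can then be bound as before: $stmt->bindValue('search', $search);
```

For the input test > 3 this produces test | 3, so the lone > never reaches the query parser.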
Say, for example, you want to do an exact match. I like doing my exact matching like this:
...WHERE MATCH(column) AGAINST('happy I am') AND column LIKE '%happy I am%';
That guarantees I match exactly what I want to match, whereas without the AND ... LIKE it would match happy OR I OR am.

Splitting value in MySQL

I want to update a field on a really huge (1m rows) table. I want to update it from:
+-----------------------------------------------------------+
| ref |
+-----------------------------------------------------------+
| 0001___000000000003616655___IVANTI UK___TEMPLATE MATERIAL |
+-----------------------------------------------------------+
to:
+-------------------------------+
| ref |
+-------------------------------+
| IVANTI UK___TEMPLATE MATERIAL |
+-------------------------------+
So basically it's just changing the ref (which is not fixed length) from sid___sku___mfr___pnum to mfr___pnum format.
In PHP I'd do it like so (pseudo code):
list($p['sid'], $p['sku'], $p['mfr'], $p['pnum']) = explode('___', $row['ref']);
$row['ref'] = $p['mfr'] . '___' . $p['pnum'];
I'm wondering if it's possible to do it directly in MySQL with a performant query?
select SUBSTRING_INDEX(ref,'___',-2) from test
0001___000000000003616655___IVANTI UK___TEMPLATE MATERIAL
=>
IVANTI UK___TEMPLATE MATERIAL
https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_substring-index
SUBSTRING_INDEX(str,delim,count)
Returns the substring from string str before count occurrences of the
delimiter delim. If count is positive, everything to the left of the
final delimiter (counting from the left) is returned. If count is
negative, everything to the right of the final delimiter (counting
from the right) is returned. SUBSTRING_INDEX() performs a
case-sensitive match when searching for delim.
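Since the question asks for an update rather than a select, the same expression can presumably be used on the right-hand side of an UPDATE, along the lines of UPDATE test SET ref = SUBSTRING_INDEX(ref, '___', -2) (with test standing in for the real table name). To check the result against the PHP pseudo code from the question, here is a small PHP counterpart of the negative-count case; the function name lastParts is made up for illustration.

```php
<?php
// Illustrative PHP counterpart of SUBSTRING_INDEX(str, delim, -N):
// keep the last $count delimiter-separated pieces.
// Assumes $count >= 1; if the string has fewer pieces, it is returned whole.
function lastParts($str, $delim, $count)
{
    $parts = explode($delim, $str);
    return implode($delim, array_slice($parts, -$count));
}

$ref = '0001___000000000003616655___IVANTI UK___TEMPLATE MATERIAL';
$shortRef = lastParts($ref, '___', 2);
echo $shortRef, "\n";
```

This prints IVANTI UK___TEMPLATE MATERIAL, the same value the SELECT above returns for that row.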

How to find the length of a Chinese phrase in a MySQL database with SQL?

For example, this is my table, which is called example:
--------------------------
| id | en_word | zh_word |
--------------------------
| 1 | Internet| 互联网 |
--------------------------
| 2 | Hello | 你好 |
--------------------------
and so on...
And I tried using this SQL Query:
SELECT * FROM `example` WHERE LENGTH(`zh_word`) = 3
For some reason, it doesn't return the three-character words, but instead returns a lot of single-character entries.
Why is this? Can this be fixed? I tried this out in PhpMyAdmin.
But when I did it with JavaScript:
"互联网".length == 3; // true
And it seems to work fine. So how come it doesn't work?
You should use CHAR_LENGTH instead of LENGTH.
LENGTH() returns the length of the string measured in bytes.
CHAR_LENGTH() returns the length of the string measured in characters.
LENGTH returns the length in bytes (and Chinese is multibyte).
Use CHAR_LENGTH to get the length in characters.
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_char-length
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_length
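This also explains the single-character results: assuming the column is stored as UTF-8, each of these Chinese characters takes three bytes, so LENGTH(zh_word) = 3 matches one-character words. PHP shows the same bytes-versus-characters split with strlen() and mb_strlen(). A small sketch, assuming the file is saved as UTF-8 and the mbstring extension is loaded:

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is loaded.
$word = '互联网';

$bytes = strlen($word);              // counts bytes, like MySQL LENGTH()
$chars = mb_strlen($word, 'UTF-8');  // counts characters, like CHAR_LENGTH()

echo $bytes, ' bytes, ', $chars, " characters\n";
```

This prints 9 bytes, 3 characters.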

Mysql string check on equals is false for the same values

I have a problem with MySQL.
I have a table with information parsed from websites. Some strings are interpreted strangely:
the query
select id, address from pagesjaunes_test where address = substr(address,1,length(address)-1)
returns a set of rows instead of none.
At the beginning I ran functions such as:
address = replace(address, '\n', '')
address = replace(address, '\t', '')
address = replace(address, '\r', '')
address = replace(address, '\r\n', '')
address = trim(address)
but the problem still persists.
The values of the 'address' field contain some French characters, but the query also returned values that contain only alphanumeric English characters.
Another test: I tried to check the length of the strings and ... strlen() from PHP and LENGTH() from MySQL display different results! Sometimes the difference is 2 characters, sometimes 1, without any specific "rule".
Visually I can't see any spaces, tabs, or anything else.
After I modified an address manually (I deleted the whole string and typed it again), the problem was solved, but I have ~6000 values, so this is not a solution :)
What can the problem be?
I suppose the strings may contain something like an "empty char", but how do I detect and remove it?
Thanks
P.S.
The problem is not just the length. I need to join this table with another one, using a condition that checks whether the values of the 'address' fields are equal. Even though the fields have the same collation and the tables have the same collation, the query reports that no addresses match.
E.g.
For query:
SELECT p.address,char_length(p.address) , r.address, char_length(r.address)
FROM `pagesjaunes_test` p
LEFT JOIN restaurants r on p.name=r.name
WHERE
p.postal_code=r.postal_code
and p.address!=r.address
and p.phone=''
and p.cuisines=''
LIMIT 10
So: p.address!=r.address
The result is:
+-------------------------------------+------------------------+--------------------------+------------------------+
| address | char_length(p.address) | address | char_length(r.address) |
+-------------------------------------+------------------------+--------------------------+------------------------+
| Dupin Marc13 quai Grands Augustins | 34 | 13 quai Grands Augustins | 24 |
| 39 r Montpensier | 16 | 39 r Montpensier | 16 |
| 8 r Lord Byron | 14 | 3 r Balzac | 10 |
| 162 r Vaugirard | 15 | 162 r Vaugirard | 15 |
| 32 r Goutte d'Or | 16 | 32 r Goutte d'Or | 16 |
| 2 r Casimir Périer | 18 | 2 r Casimir Périer | 18 |
| 20 r Saussier Leroy | 19 | 20 r Saussier Leroy | 19 |
| Senes Douglas22 r Greneta | 25 | 22 r Greneta | 12 |
| Ngov Ly Mey44 r Tolbiac | 23 | 44 r Tolbiac | 12 |
| 33 r N-D de Nazareth | 20 | 33 r N-D de Nazareth | 20 |
+-------------------------------------+------------------------+--------------------------+------------------------+
As you can see, "162 r Vaugirard" and "20 r Saussier Leroy" contain only ASCII characters and have the same length, but aren't equal!
Maybe have a look at the encoding of the MySQL text fields. UTF-8 encodes every non-ASCII character with two or more bytes; only the ASCII subset is encoded with one byte.
MySQL knows UTF-8 and counts correctly.
PHP's plain string functions such as strlen() aren't UTF-8 aware and count bytes.
So if PHP counts more than MySQL, this is probably the cause, and you could have a look at utf8_decode().
br from Salzburg!
The official documentation says:
Returns the length of the string str, measured in bytes. A multi-byte character counts as multiple bytes. This means that for a string containing five two-byte characters, LENGTH() returns 10, whereas CHAR_LENGTH() returns 5.
So, use CHAR_LENGTH instead :)
select id, address from pagesjaunes_test
where address = substr(address, 1, char_length(address) - 1)
Finally, I found the problem. After I changed the collation to ascii_general_ci, all non-ASCII characters were transformed to "?". Some spaces were also replaced with "?". When I checked the initial values, MySQL's ORD() function returned 160 (instead of 32) for these spaces. So,
UPDATE pagesjaunes_test SET address = TRIM(REPLACE(REPLACE(address, CHAR(160), ' '), '  ', ' '))
solved my problem.
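Character 160 is the non-breaking space, which looks identical to an ordinary space on screen. The same cleanup can be sketched on the PHP side, for example while parsing the pages. This is only an illustration: the helper name normalizeSpaces is made up, and it assumes the address reaches PHP as UTF-8, where that character is the two-byte sequence 0xC2 0xA0.

```php
<?php
// Hypothetical cleanup helper; assumes $address is a UTF-8 string, where the
// non-breaking space U+00A0 is the two-byte sequence 0xC2 0xA0.
function normalizeSpaces($address)
{
    $nbsp = "\xC2\xA0";
    $address = str_replace($nbsp, ' ', $address);      // NBSP -> ordinary space
    $address = preg_replace('/ {2,}/', ' ', $address); // collapse runs of spaces
    return trim($address);
}

$dirty = "162\xC2\xA0r Vaugirard";   // looks like "162 r Vaugirard" on screen
$clean = normalizeSpaces($dirty);

var_dump($dirty === '162 r Vaugirard'); // bool(false): hidden non-breaking space
var_dump($clean === '162 r Vaugirard'); // bool(true)
```

Here strlen($dirty) is 16 while the visible text has 15 characters, which is the kind of one-byte difference described in the question.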

Is php deg2rad() equal to mysql radians()

Are these functions the same? If not, what is an appropriate php equivalent to mysql's radians()
Judging from their documentation (deg2rad, radians), they seem to do the same thing.
And a quick verification on a simple test case:
mysql> select radians(0), radians(45), radians(90);
+------------+-------------------+-----------------+
| radians(0) | radians(45) | radians(90) |
+------------+-------------------+-----------------+
| 0 | 0.785398163397448 | 1.5707963267949 |
+------------+-------------------+-----------------+
1 row in set (0,00 sec)
And, in PHP :
var_dump(deg2rad(0), deg2rad(45), deg2rad(90));
also gives :
float 0
float 0.785398163397
float 1.57079632679
So, it seems they do the same thing; the outputs differ only in the number of digits displayed.
Consulting the documentation:
MySQL's RADIANS(x): returns the argument x, converted from degrees to radians.
PHP's deg2rad(): converts the number in degrees to the radian equivalent
...so yes, they are equivalent.
Was there something more specific you were looking for?
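Both functions are described as the usual conversion degrees * pi / 180. A quick sketch to compare PHP's deg2rad() against that formula written out by hand, using the same three angles as the test above:

```php
<?php
// Compare deg2rad() with the textbook formula degrees * pi / 180.
foreach (array(0, 45, 90) as $degrees) {
    $builtin = deg2rad($degrees);
    $manual  = $degrees * M_PI / 180;
    printf("%3d deg: deg2rad=%.15f manual=%.15f\n", $degrees, $builtin, $manual);
}
```

Any difference between the two columns should be at the level of floating-point rounding, if it shows up at all.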
