First time posting here.
I am facing a problem with inconsistent behavior between my PROD server and my local environment.
Here is some background on the situation:
In my application (backend Laravel 7, frontend regular html/javascript) I need to search for entries in a particular table based on JSON data stored in one of the columns:
Table: flights
columns: id, date, passengers, ... pilot_id, second_pilot_id, flight_data, updated_at, created_at
There are flights that are directly linked to either a pilot or a second pilot via pilot_id or second_pilot_id. That is fine so far, because I can easily query them. However, there are also flight entries where no registered user creates the entry, and the pilot is only represented by a name that is typed in. Searching these works only if the name doesn't contain special characters, in particular the German umlauts (ö, ä, ü); it also fails for other specials like â, ß, é, è, etc. But ONLY ON PROD; on local everything works even with special characters.
flight_data has the data type "JSON" in my migration files.
$table->json('flight_data') ...
Now the problem:
On my local environment I can run the following and will get results returned:
... ->where(function($q) use ($r) {
$q->whereRaw("IF(payee = 2, JSON_CONTAINS(flight_data, '{\"second_pilotname\":\"$r\"}'), JSON_CONTAINS(flight_data, '{\"pilotname\":\"$r\"}'))");
})->...
This gets me my expected example results without issues.
($r is filled with the name of a particular pilot; in my example he is called "Jöhn Düe")
If I run this on my PROD system I get no results. I tracked it down to the JSON_CONTAINS() function, which prevents the matches. I also tried playing around with "Joehn Duee", which is found correctly, so it basically comes down to the German umlauts (ö, ä, ü) not being handled correctly somehow.
I also tried some SQL statements in phpmyadmin and these are the results:
LOCAL
select id, flight_data, comments, updated_at from logbook where JSON_CONTAINS(flight_data, '{"pilotname": "Juehn Duee"}')
1 result found
select id, flight_data, comments, updated_at from logbook where JSON_CONTAINS(flight_data, '{"pilotname": "Jühn Düe"}')
1 result found
PROD
select id, flight_data, comments, updated_at from logbook where JSON_CONTAINS(flight_data, '{"pilotname": "Juehn Duee"}')
1 result found
select id, flight_data, comments, updated_at from logbook where JSON_CONTAINS(flight_data, '{"pilotname": "Jühn Düe"}')
0 result found
I also checked the raw data that is stored:
PROD:
flight_data: {"pilotname":"J\u00fchn D\u00fce"}
LOCAL:
flight_data: {"pilotname":"J\u00fchn D\u00fce"}
So the data is transformed on storage. That is OK, because it is then decoded according to UTF-8 and displayed correctly ("Jühn Düe").
The problem is, that in the backend I need to compare this data.
The differences are that on my local environment I am using MySQL 8.0 (it's a Homestead server, so select @@version; => 8.0.23-0ubuntu0.20.04.1) and on PROD (the hosted server) I am seeing "10.3.28-MariaDB-log-cll-lve".
Therefore the difference is clear: MariaDB vs. MySQL, and their handling of German umlauts.
I tried various things around changing the conversion / charset of the entries, of the database, that all didn't solve the problem. I searched for quite a while for various similar problems, but most of them resulted in having the data stored not in UTF-8 - which I checked and is the case for me here.
Even querying for the raw escaped data doesn't work somehow.
The following fails both on PROD and on LOCAL:
select id, flight_data, comments, updated_at from logbook where JSON_CONTAINS(flight_data, '{"pilotname": "J\u00fchn D\u00fce"}')
0 results found
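A likely explanation for the zero results here (assuming the default sql_mode, where backslash is an escape character in MySQL/MariaDB string literals): the SQL parser consumes the backslash in \u00fc before the JSON parser ever sees it, so the escape would need to be doubled ('J\\u00fchn'). A Python sketch of the two parsing layers, purely as an illustration:

```python
import json

typed = r'J\u00fchn D\u00fce'  # the characters typed inside the SQL quotes

# The SQL string parser treats backslash as an escape character; \u is not
# a recognized escape, so the backslash is silently dropped and the JSON
# parser receives this text instead:
after_sql_layer = typed.replace('\\', '')
print(after_sql_layer)  # Ju00fchn Du00fce  -> can never match "Jühn Düe"

# If the backslashes are doubled in the SQL literal ('J\\u00fchn ...'),
# \u00fc survives to the JSON layer, which decodes it to the real character:
print(json.loads('"' + typed + '"'))  # Jühn Düe
```

So `JSON_CONTAINS(flight_data, '{"pilotname": "J\\u00fchn D\\u00fce"}')` would be worth trying on both servers.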
Can you help me figuring out what I am missing here?
Obviously it has to do something with the database, what else can I check or do I need to change?
Thanks a lot everybody for your help!
You should use the same software in development that you use in production. The same brand and the same version. Otherwise you risk encountering these incompatible features.
MariaDB started as a fork of the MySQL project in 2010, and both have been diverging gradually since then. MySQL implements new features, and MariaDB may or may not implement similar features, either by cherry-picking code from the MySQL project or by implementing their own original code. So over time these two projects grow more and more incompatible. At this point, over 10 years after the initial fork, you should consider MariaDB to be a different software product. Don't count on any part of it remaining compatible with MySQL.
In particular, the implementation of JSON in MariaDB versus MySQL is not entirely compatible. MariaDB wrote its own original code, implementing the JSON data type as an alias for LONGTEXT. So the internal implementation is quite different.
You asked if there's something you need to change.
Since you use MariaDB in production, not MySQL, you should use MariaDB 10.3.28 in your development environment, to ensure compatibility with the database brand and version you use in production.
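The practical consequence can be sketched outside the database (Python here, purely as an illustration, not MariaDB's actual code path): a JSON-aware comparison treats \u00fc and ü as the same value, while a plain LONGTEXT-style string comparison does not:

```python
import json

stored = '{"pilotname": "J\\u00fchn D\\u00fce"}'   # escaped form on disk
needle = '{"pilotname": "Jühn Düe"}'               # literal form you search with

# Byte for byte (what a plain string comparison amounts to): no match.
print(stored == needle)                            # False

# Parsed as JSON (what a JSON-aware engine compares): the same value.
print(json.loads(stored) == json.loads(needle))    # True
```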
I think the problem is a collation issue. Some unicode collations implement character expansions, so ue = ü would be true in the German collation.
Here's a test using MySQL 5.7 which is what I have handy (I don't use MariaDB):
mysql> select 'Juehn Duee' collate utf8mb4_unicode_520_ci = 'Jühn Düe' as same;
+------+
| same |
+------+
| 0 |
+------+
mysql> select 'Juehn Duee' collate utf8mb4_german2_ci = 'Jühn Düe' as same;
+------+
| same |
+------+
| 1 |
+------+
As you can see, this has nothing to do with JSON, but it's just related to string comparisons and which collation is used.
See the explanation in https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html in the section "_general_ci Versus _unicode_ci Collations"
Thank you all for your inputs and response!
I figured out a different solution for the problem. Maybe it helps someone..
I went a step back and checked how I am storing the data. I was using json_encode() for that, which created the table contents shown above. By just saving a raw array instead, it works:
$insert->pilotname = ['pilotname' => $request->pilotname];
Somehow the way the data was stored in the first place was already the issue.
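For reference, PHP's json_encode() escapes non-ASCII characters by default and only keeps the literal characters when given the JSON_UNESCAPED_UNICODE flag. Python's json.dumps mirrors this with its ensure_ascii parameter, which makes the two storage forms easy to see side by side:

```python
import json

data = {"pilotname": "Jühn Düe"}

# Default: non-ASCII escaped, like PHP's json_encode() without flags.
print(json.dumps(data))                      # {"pilotname": "J\u00fchn D\u00fce"}

# ensure_ascii=False keeps literal UTF-8 characters, like
# PHP's json_encode($data, JSON_UNESCAPED_UNICODE).
print(json.dumps(data, ensure_ascii=False))  # {"pilotname": "Jühn Düe"}
```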
Related
I'm running into a complicated situation here, and I'm hoping for a push in the right direction.
I need to allow Basic Latin searches to bring back results with diacritics. This is further complicated by the fact that the data is stored with HTML instead of pure ASCII. I have been making some progress, but have come across two problems.
First: I'm able to do a partial conversion of the data into something marginally useful, using something like this:
$string = 'Véra';
$converted = html_entity_decode($string, ENT_COMPAT, 'UTF-8');
setlocale(LC_ALL, 'en_US.UTF8');
$translit = iconv('UTF-8', 'ASCII//TRANSLIT', $converted);
echo $translit;
This brings back this result: V'era. This is a start, but what I really need is Vera. I can do a preg_replace on the resulting string, but is there a way of just bringing it back without the apostrophe? This is only one example; there are a lot more diacritics in the database (e.g. ñ and more). I feel like this has been addressed before (e.g. iconv returns strange results), but there don't appear to be any solutions listed.
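One accent-stripping approach that avoids iconv's stray apostrophes is Unicode decomposition: normalize to NFD, then drop the combining marks (PHP's intl Normalizer class can do the same). A Python sketch of the idea:

```python
import unicodedata

def strip_diacritics(s: str) -> str:
    # Decompose accented characters into base character + combining mark,
    # then drop the combining marks.
    decomposed = unicodedata.normalize('NFD', s)
    return ''.join(c for c in decomposed if not unicodedata.combining(c))

print(strip_diacritics('Véra'))    # Vera
print(strip_diacritics('mañana'))  # manana
```

Note that this only removes combining marks; ligature-like cases such as ß need separate handling.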
Bigger Problem: I need to convert a string such as Vera and be able to bring back results with Véra as well as results of Vera. However, I believe I need to get problem 1 solved first before I can get to this point.
I'm thinking something like if ($translit) { return $string} but I'm a bit unsure of how to handle this.
All help appreciated.
Edit: I'm thinking this might be done easier directly in the database, however I'm running into issues with DQL. I know that there are ways with doing it in SQL with a stored procedure, but with limited access to the database, I'm open any suggestions for dealing with this in Doctrine
Okay, so maybe I'm making this too difficult
All I need is a way of finding entries that have been HTML encoded in the database without having to search with either the specific encoding but also without the diacritic itself. If I search for Jose, it should bring up anything in the database labeled as José
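In Python terms, the match described here amounts to decoding the entities and then stripping the marks before comparing (PHP's html_entity_decode and the intl Normalizer class play the same roles). A sketch of the normalization pipeline:

```python
import html
import unicodedata

def searchable(s: str) -> str:
    # 1. Decode HTML entities: 'Jos&eacute;' -> 'José'
    s = html.unescape(s)
    # 2. NFD-decompose and drop combining marks: 'José' -> 'Jose'
    s = unicodedata.normalize('NFD', s)
    s = ''.join(c for c in s if not unicodedata.combining(c))
    # 3. Case-fold for case-insensitive matching
    return s.casefold()

print(searchable('Jos&eacute;') == searchable('Jose'))  # True
```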
Preface: It's not quite clear whether the data to search is already in the database or whether you're just taking advantage of the fact that the database has logic for character comparisons. I'm going to assume that the data source is the DB.
The fact that you're trying to search HTML raises the question of whether you really want to search the HTML itself, or in fact want to search the user-visible text and strip the HTML tags. (What if there is a diacritic in a tag attribute? What if a word is broken by an empty <span>? Should it match? What if it was broken by a <br>?)
MySQL has the notion of both character sets (how characters are encoded) and collations (how characters are compared)
Relevant Documentation:
https://dev.mysql.com/doc/refman/5.7/en/charset-mysql.html
https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-sets.html
Assuming your mysql client/terminal is correctly set for UTF8 encoding, then the following demonstrates the effect of overriding the collation (using ß as particularly interesting example)
> SET NAMES 'utf8';
> SELECT
'ß',
'ss',
'ß' = 'ss' COLLATE utf8_unicode_ci AS ss_unicode,
'ß' = 'ss' COLLATE utf8_general_ci AS ss_general,
'ß' = 's' COLLATE utf8_general_ci AS s_general;
+----+----+------------+------------+-----------+
| ß | ss | ss_unicode | ss_general | s_general |
+----+----+------------+------------+-----------+
| ß | ss | 1 | 0 | 1 |
+----+----+------------+------------+-----------+
1 row in set (0.00 sec)
Note: general is the faster but not-strictly-correct version of the unicode collation, but even that is wrong if you speak Turkish (see: dotted uppercase İ).
I would save decoded html in the database and search on this making sure that the collation is set correctly.
Confirm that the table/column collation is correct using SHOW CREATE TABLE xxx. Change it manually (ALTER TABLE ...), or use doctrine annotations as per this answer & use doctrine migrations to update (and confirm afterwards with SHOW CREATE TABLE that your version of doctrine respects collation)
Confirm that doctrine is configured to use utf8 encoding.
If you just need to override the collation for one particular query (eg you don't have permission to change the DB structure or it will break other code):
If you need to map to a doctrine ORM object, use NativeQuery and add COLLATE overrides as per the example above.
If you just want the record ID & field then you can use a direct query bypassing the ORM with a COLLATE override
You can use the REGEXP_REPLACE function to strip the HTML entities in the database at query time. MySQL (before 8.0) has no built-in REGEXP_REPLACE function, but you can use a user-defined function (UDF), or switch to MariaDB, which is based on MySQL (migrating data to MariaDB will be easy).
Then in MariaDB you can use queries like:
SELECT * FROM `test` WHERE 'jose' = REGEXP_REPLACE(name, '(&[A-Za-z]*;)', '')
// another variant with PHP variable
SELECT `table`.name FROM `table` WHERE $search = REGEXP_REPLACE(name, '(&[A-Za-z]*;)', '')
Even phpMyAdmin supports MariaDB. I tested my query on the demo page and it worked pretty well.
Or if you want to stay on MySql, add this UDFs:
https://github.com/mysqludf/lib_mysqludf_preg
I am working on a MySQL trigger. Before anything else, I want to check whether a value exists in a json_encoded string in my config table, and only then let the trigger proceed.
my config table is handled by php scripts and looks like this:
-------------------------------------------
| config_name | config_value |
-------------------------------------------
| target_id | ["1","16","18","22","37"] |
-------------------------------------------
in PHP we can use something like:
if (in_array($value, json_decode($config_value))) ...
but my problem is with the MySQL syntax, which supports neither in_array nor json_decode.
How can I check if my value exists in 'config value' in Mysql trigger?
How to solve this problem
If you are storing JSON in MySQL, make sure that you upgrade to MySQL 5.7; then you can make use of the range of JSON functions available. In your particular case, you can do
SELECT * FROM my_table WHERE JSON_SEARCH(config_value,"one", "17") IS NOT NULL;
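For intuition, JSON_SEARCH(config_value, 'one', '17') IS NOT NULL boils down to a membership test on the decoded array. Roughly, in Python:

```python
import json

config_value = '["1","16","18","22","37"]'  # contents of the config row

# Decode the JSON array and test membership, which is what the
# JSON_SEARCH ... IS NOT NULL condition effectively checks:
ids = json.loads(config_value)
print("17" in ids)  # False
print("16" in ids)  # True
```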
What you definitely ought to be doing
You have a problem in your data. If you find that you are always searching a JSON field, what that really means is that your table should be normalized.
using this:
@Query(value = "select * from configs where json_contains(config_value, json_array(?1))", nativeQuery = true)
List<Config> list(String config);
Okay, so what you have there is a JSON_ARRAY(). To check if a particular id exists in the JSON_ARRAY(), simply put a condition, i.e. JSON_CONTAINS(config_value,"1")=1; if "1" is present in your JSON_ARRAY() it will return 1, else 0.
Also, rather than doing json_encode in your PHP code for the insert/update query, you could use "..config_value=JSON_ARRAY(" . implode(',', $arr) . ")...".
Thanks for your reply. But unfortunately the MySQL version is 5.1 and it's impossible to upgrade it. Actually, I am adding some features to a voipswitch system, so my flexibility is limited by the present features.
I am selecting multiple tariffs and I want to record inserted calls (after insert calls in the 'calls' table via my trigger) which belong to the selected tariffs in a list for some analyses.
Also, I tried PHP's serialize() function instead of json_encode() to store my tariff ids in the config table, but MySQL doesn't support unserialize either.
I am using CodeIgniter 2.1.3 and PHP 5.4.8, and two PostgreSQL servers (P1 and P2).
I am having a problem with the list_fields() function in CodeIgniter. When I retrieve fields from the P1 server, the fields are in the order in which I originally created the table. However, if I use the exact same code to retrieve fields from the P2 server, the fields come back in reverse order.
If fields from P1 is array('id', 'name', 'age'),
fields from P2 becomes array('age', 'name', 'id')
I don't think this is a CodeIgniter-specific problem, but rather a general database configuration or PHP problem, because the code is identical.
This is the code that I get fields with.
$fields = $this->db->list_fields("clients");
I have to clarify something. @muistooshort claims in a comment above:
Technically there is no defined order to the columns in a table.
@mu may be thinking of the order of rows, which is arbitrary without ORDER BY.
It is completely incorrect for the order of columns, which is well defined and stored in the column pg_attribute.attnum. It's used in many places, like INSERT without column definition list or SELECT *. It is preserved through a dump / restore cycle and has significant bearing on storage size and performance.
You cannot simply change the order of columns in PostgreSQL, because it has not been implemented, yet. It's deeply wired into the system and hard to change. There is a Postgres Wiki page and it's on the TODO list of the project:
Allow column display reordering by recording a display, storage, and
permanent id for every column?
Find out for your table:
SELECT attname, attnum
FROM pg_attribute
WHERE attrelid = 'myschema.mytable'::regclass
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0 -- no system columns
ORDER BY attnum;
It is unwise to use SELECT * in some contexts, where the columns of the underlying table may change and break your code. It is explicitly wise to use SELECT * in other contexts, where you need all columns (in default order).
As to the primary question
This should not occur. SELECT * returns columns in a well defined order in PostgreSQL. Some middleware must be messing with you.
I suspect you are used to MySQL, which allows you to reorder columns post-table creation. PostgreSQL does not let you do this, so when you:
ALTER TABLE foo ADD bar int;
It always puts the new column at the end of the table, and there is no way to change the order.
On PostgreSQL you should not assume that the order of the columns is meaningful because these can differ from server to server based on the order in which the columns were defined.
However the fact that these are reversed is odd to me.
If you want to see the expected order on the db, use:
\d foo
from psql
If these are reversed then the issue is in the db creation (this is my first impression). That's the first thing to look at. If that doesn't show the problem then there is something really odd going on with CodeIgniter.
I'm using a legacy PHP framework which automatically assembles queries for me. By some reason, it is assembling a query like this:
SELECT s.status,c.client FROM client C LEFT JOIN status S ON C.id_status=S.id_status'
This isn't a problem on my MacOS X workstation. But when I test it on my production server mysql raises the following error:
#1054 - Unknown column 's.status' in 'field list'
It is definitely a case issue on s.status. If I manually run the query changing s,c to S,C, it works perfectly.
A quick look on Google didn't solve the issue. Any ideas?
Well, it's said in the documentation:
By default, table aliases are case sensitive on Unix, but not so on
Windows or Mac OS X. The following statement would not work on Unix,
because it refers to the alias both as a and as A:
mysql> SELECT col_name FROM tbl_name AS a
-> WHERE a.col_name = 1 OR A.col_name = 2;
There are also some solutions given in this section of the documentation as well. For example, you can set lower_case_table_names to 1 on all platforms to force names to be converted to lowercase - but you have to rename all your tables to lowercase as well in that case.
Tables in MySQL are stored as files. File names are case-insensitive in MacOS X and Windows, whereas they are case-sensitive in Linux. You can use table names without regarding case in MacOS X and Windows, but not in Linux. So you should choose a consistent casing for all your table names and use it throughout your code.
I'd suggest using all lowercase or uppercase names separated with underscore like tbl_etc, MY_TBL etc so that there'd be no confusion regarding case.
Not sure how to word this, but my problem is that my fields will not update properly. I have a page set up where the user can update things like job listings, events, etc. The problem is that some of the descriptions are three or more paragraphs long, and when the form is processed it doesn't update the database correctly. This also happens when the user creates a new item.
Is there a limit on how much text can be loaded at once?
Here is the code for updating that I am using:
mysql_query("UPDATE tbl_workers_club SET eventdate='".$newDate."', theme='".$theme."', text='".$text."', contactperson='".$contactperson."', contactphone='".$phone."', dateentered='".$dateentered."' WHERE specialID='".$id."' ");
Here is the code for new items:
mysql_query("INSERT INTO tbl_workers_club (eventdate, theme, text, contactperson, contactphone, dateentered)
VALUES ('$newDate', '$theme', '$text', '$contactperson', '$contactphone', '$dateentered')");
I did not see another question like this so if you know of one let me know.
Saying what errors you're getting (or no errors) would greatly help diagnosing this problem. Is the content merely "cut off" in the database? Or is MySQL throwing errors back to PHP? However, here are some ideas...
mysql_real_escape_string() is a good place to start. Any input with a single apostrophe will break your code. Read up on cleansing user input for SQL, there are tons of resources on the web. It's absolutely not optional. Also look into parameterized queries / prepared statements, or one of the many ORM libraries/frameworks out there.
It's also possible that your "text" column is not large enough to hold those inputs, if you're only having problems for very long entries. VARCHAR(X) can only hold X characters. Use something like MEDIUMTEXT instead.
Use POST rather than GET; GET has limits on the size of values sent via the URL. POST is the conventional method when sending requests that create/alter data.
It's hard to say what the problem is without more info on what errors you're getting.
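The parameterized-query advice above looks like this in practice (sketched with Python's sqlite3 driver as a stand-in for mysqli/PDO prepared statements; table and column names are taken from the question):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE tbl_workers_club (theme TEXT, text TEXT)")

# Input containing an apostrophe would break naive string interpolation;
# placeholders let the driver handle the quoting safely.
theme = "Worker's Night"
body = "Three paragraphs...\nwith 'quotes' and line breaks."
conn.execute(
    "INSERT INTO tbl_workers_club (theme, text) VALUES (?, ?)",
    (theme, body),
)

row = conn.execute("SELECT theme, text FROM tbl_workers_club").fetchone()
print(row[0])  # Worker's Night
```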
Take a look at the MySQL max_allowed_packet setting. A communication packet is a single SQL statement sent to the MySQL server, a single row sent to the client, or a binary log event sent from a master replication server to a slave.
mysql> select * from information_schema.global_variables where variable_name like 'max_allow%';
+--------------------+----------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+--------------------+----------------+
| MAX_ALLOWED_PACKET | 1048576 |
+--------------------+----------------+
1 row in set (0.00 sec)