I am trying to UPDATE a MySQL TEXT column with a string of about 6,000 characters (roughly 6 KB), but the column is not updated with that data.
When I test the same column by updating it with around 100 characters of test data, it works.
So, is there any limit on MySQL query processing?
If so, how can I adjust it?
I'm using XAMPP on MacOSX.
Try escaping your data properly. By default, MySQL truncates data that doesn't fit rather than rejecting it, and the maximum size of a TEXT column is 64 KB.
If your data contains characters like ' and ", your query could fail. Add proper error handling to determine where and why things go wrong, and escape your data accordingly.
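As a sketch of how to surface such failures instead of silent truncation (assuming you can change session settings; the table and column names are illustrative):

```sql
-- Make over-length values raise error 1406 instead of being silently truncated
SET SESSION sql_mode = 'STRICT_ALL_TABLES';

-- This UPDATE now fails loudly if the string exceeds the column's limit
UPDATE articles SET body = '...your 6 KB of text...' WHERE id = 1;
```

With strict mode on, a failing query returns an error you can log, instead of quietly writing a shortened value.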
Related
I have a PHP web application developed with CakePHP 4 and MySQL.
One of the tables the application works with is named trg_registros_vip and contains 77 columns. One of the columns, trg_tipo_proyecto, is defined as a non-nullable VARCHAR with a maximum length of 100 characters.
The table is populated by reading an Excel file with SimpleXLSX (a library installable as shuchkin/simplexlsx via Composer). For each row in the file, starting from the second (the first holds the sheet's headers), the data in each cell, from left to right, is stored in an entity class named TRegistrosVip, in the same order as the table's columns starting from the second (the first column is an auto-incremental primary key). Then the entity is inserted into the table. The column trg_tipo_proyecto is the fifth in the table, and its data comes from the fourth column of the Excel file.
However, when I try to read the excel from the application, I get the following error message: Error: [PDOException] SQLSTATE[22001]: String data, right truncated: 1406 Data too long for column 'trg_tipo_proyecto' at row 1 in C:\xampp\htdocs\CPSC2\vendor\cakephp\cakephp\src\Database\Statement\MysqlStatement.php
The stack trace indicates that the error is generated at the very moment I try to insert a row into the table.
I searched the Excel file for any cell in the fourth column whose length is greater than 100 and found none.
I checked the table class associated with the entity class, named TRegistrosVipTable, in case of a badly defined column, and found this:
$validator->scalar('trg_tipo_proyecto')
->maxLength('trg_tipo_proyecto', 100)
->requirePresence('trg_tipo_proyecto', 'create')
->notEmptyString('trg_tipo_proyecto');
That means the column is well defined.
I checked the config/app.php file in case I was working with the wrong database, but I'm using the correct one.
Also, due to the possible presence of accented characters, I called the utf8_decode function on each value of that column before inserting it into the table.
Considering the above, what could be the cause of this error?
Thanks in advance.
SOLVED
I implemented logging in the application and printed to the debug log the value of the Excel file's fourth column for each row, after the call to utf8_decode. The error is triggered when trying to insert the word "Regularización" (without the double quotes, without leading or trailing spaces, and with the accent).
I removed the accent from that word in the Excel file and the error disappeared.
However, I cannot remove the accents from the Excel file programmatically because the client's restrictions forbid it. So I tried removing the call to utf8_decode and inserting the value directly, without any encoding or decoding. The error is no longer triggered.
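A plausible explanation (an assumption on my part, not confirmed by the stack trace): utf8_decode converts UTF-8 text to ISO-8859-1, so the resulting bytes are no longer valid in a connection that expects UTF-8, which can make MySQL misjudge the value. One way to spot this kind of mismatch on the server is to compare character length against byte length:

```sql
-- CHAR_LENGTH counts characters, LENGTH counts bytes;
-- a surprising gap between the two hints at a charset mismatch
SELECT trg_tipo_proyecto,
       CHAR_LENGTH(trg_tipo_proyecto) AS chars,
       LENGTH(trg_tipo_proyecto)      AS bytes
FROM trg_registros_vip
WHERE trg_tipo_proyecto LIKE 'Regulariza%';
```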
Thanks to all who have read and answered this question.
I have a table field of type varchar(36) and I want MySQL to generate its value dynamically, so I used this code:
$sql_code = "INSERT INTO table1 (id, text) VALUES (uuid(), 'some text');";
mysql_query($sql_code);
How can I retrieve the generated UUID immediately after inserting the record?
char(36) is better
You cannot do it in one statement. A simple workaround is to perform two separate queries:
SELECT UUID()
INSERT INTO table1 (id, text) VALUES ($uuid, 'text')
where $uuid is the value retrieved on the 1st step.
You can do everything you need to with SQL triggers. The following SQL adds a trigger on tablename.table_id to automatically create the primary key UUID when inserting, then stores the newly created ID into an SQL variable for retrieval later:
-- A BEFORE INSERT trigger is required to modify NEW values;
-- MySQL does not allow SET NEW.* inside an AFTER trigger
CREATE TRIGGER `tablename_newid`
BEFORE INSERT ON `tablename`
FOR EACH ROW
BEGIN
    IF ASCII(NEW.table_id) = 0 THEN
        SET NEW.table_id = UNHEX(REPLACE(UUID(), '-', ''));
    END IF;
END;

-- Capture the ID in AFTER INSERT so @last_uuid is only set
-- when the row insert actually succeeds
CREATE TRIGGER `tablename_newid_capture`
AFTER INSERT ON `tablename`
FOR EACH ROW
SET @last_uuid = NEW.table_id;
As a bonus, it inserts the UUID in binary form to a binary(16) field to save storage space and greatly increase query speed.
edit: the trigger should check for an existing column value before inserting its own UUID in order to mimic the ability to provide values for table primary keys in MySQL - without this, any values passed in will always be overridden by the trigger. The example has been updated to use ASCII() = 0 to check for the existence of the primary key value in the INSERT, which will detect empty string values for a binary field.
edit 2: after a comment here it was pointed out to me that setting the variable in the BEFORE INSERT trigger has the effect of setting @last_uuid even if the row insert fails. I have updated my answer to capture the variable in an AFTER INSERT trigger - whilst I feel this is a totally fine approach under general circumstances, it may have issues with row replication under clustered or replicated databases. If anyone knows, I would love to know as well!
To read the new row's insert ID back out, just run SELECT @last_uuid.
When querying and reading such binary values, the MySQL functions HEX() and UNHEX() will be very helpful, as will writing your query values in hex notation (preceded by 0x). The php-side code for your original answer, given this type of trigger applied to table1, would be:
// insert row
$sql = "INSERT INTO table1(text) VALUES ('some text')";
mysql_query($sql);
// get last inserted UUID (in hex form)
$sql = "SELECT HEX(@last_uuid)";
$result = mysql_query($sql);
$row = mysql_fetch_row($result);
$id = $row[0];
// perform a query using said ID
mysql_query("SELECT * FROM table1 WHERE id = 0x" . $id);
Following up in response to @ina's comment:
A UUID is not a string, even if MySQL chooses to represent it as such. It's binary data in its raw form, and those dashes are just MySQL's friendly way of representing it to you.
The most efficient storage for a UUID is to create it as UNHEX(REPLACE(UUID(),'-','')) - this will remove that formatting and convert it back to binary data. Those functions will make the original insertion slower, but all following comparisons you do on that key or column will be much faster on a 16-byte binary field than a 36-character string.
For one, character data requires parsing and localisation. Any strings coming in to the query engine are generally being collated automatically against the character set of the database, and some APIs (WordPress comes to mind) even run CONVERT() on all string data before querying. Binary data doesn't have this overhead. For the other, your char(36) is actually allocating 36 characters, which means (if your database is UTF-8) each character could be as long as 3 or 4 bytes depending on the version of MySQL you are using. So a char(36) can range anywhere from 36 bytes (if it consists entirely of low-ASCII characters) to 144 if it consists entirely of high-order UTF-8 characters. This is much larger than the 16 bytes we have allocated for our binary field.
Any logic performed on this data can be done with UNHEX(), but is better accomplished by simply escaping data in queries as hex, prefixed with 0x. This is just as fast as reading a string, gets converted to binary on the fly and directly assigned to the query or cell in question. Very fast.
Reading data out is slightly slower - you have to call HEX() on all binary data read out of a query to get it in a useful format if your client API doesn't deal well with binary data (PHP in particular will usually determine that binary strings === null and will break them if manipulated without first calling bin2hex(), base64_encode() or similar) - but this overhead is about as minimal as character collation and, more importantly, is only incurred on the actual cells SELECTed, not all cells involved in the internal computations of a query result.
So of course, all these small speed increases are very minimal and other areas result in small decreases - but when you add them all up binary still comes out on top, and when you consider use cases and the general 'reads > writes' principle it really shines.
... and that's why binary(16) is better than char(36).
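As a small end-to-end sketch of the approach described above (the table and column names are illustrative):

```sql
CREATE TABLE users (
    id   BINARY(16)   NOT NULL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

-- store the UUID as 16 raw bytes instead of a 36-character string
INSERT INTO users (id, name)
VALUES (UNHEX(REPLACE(UUID(), '-', '')), 'alice');

-- read it back in a human-readable hex form
SELECT HEX(id), name FROM users;
```

Note that MySQL 8.0 also ships UUID_TO_BIN() and BIN_TO_UUID(), which perform the same conversion and can optionally reorder the timestamp bytes for better index locality.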
It's pretty easy, actually.
You can pass this to MySQL and it will return the inserted id:
set #id=UUID();
insert into <table>(<col1>,<col2>) values (#id,'another value');
select #id;
Depending on how the uuid() function is implemented, this is very bad programming practice - if you try to do this with binary logging enabled (i.e. in a cluster) then the insert will most likely fail. Ivan's suggestion looks like it might solve the immediate problem - however I thought LAST_INSERT_ID() only returned the value generated for an auto-increment field - indeed that's what the manual says.
Also, what's the benefit of using a uuid()? It's computationally expensive to generate, requires a lot of storage, increases the cost of querying the data, and is not cryptographically secure. Use a sequence generator or auto-increment instead.
Regardless of whether you use a sequence generator or a uuid, if you must use this as the only unique key on the database, then you'll need to assign the value first, read it back into PHP land and embed/bind the value as a literal in the subsequent insert query.
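For comparison, the auto-increment route suggested above avoids the extra round trip entirely (the schema is illustrative):

```sql
CREATE TABLE items (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    text VARCHAR(255) NOT NULL
);

INSERT INTO items (text) VALUES ('some text');

-- LAST_INSERT_ID() is per-connection, so it is safe under concurrent inserts
SELECT LAST_INSERT_ID();
```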
Context: I'm using Doctrine to store an array as LONGTEXT in a MySQL column. I received Notice: unserialize(): Error at offset 250 of 255 bytes. I did some backtracking and realized the serialized string was truncated because it was too big for the column, but I really doubted a LONGTEXT could be the cause; this string is nowhere near 4 GB.
Someone in this question suggested taking a look at SET max_allowed_packet, but mine is 32M.
a:15:{i:0;s:7:"4144965";i:1;s:7:"4144968";i:2;s:7:"4673331";i:3;s:7:"4673539";i:4;s:7:"4673540";i:5;s:7:"4673541";i:6;s:7:"5138026";i:7;s:7:"5140255";i:8;s:7:"5140256";i:9;s:7:"5140257";i:10;s:7:"5140258";i:11;s:7:"5152925";i:12;s:7:"5152926";i:13;s:7:"51
Mysql table collation: utf8_unicode_ci
Any help would be greatly appreciated !!
Full Error
Operation failed: There was an error while applying the SQL script to the database.
ERROR 1406: 1406: Data too long for column 'numLotCommuns' at row 1
SQL Statement:
UPDATE `db`.`table` SET `numLotCommuns`='a:15:{i:0;s:7:\"4144965\";i:1;s:7:\"4144968\";i:2;s:7:\"4673331\";i:3;s:7:\"4673539\";i:4;s:7:\"4673540\";i:5;s:7:\"4673541\";i:6;s:7:\"5138026\";i:7;s:7:\"5140255\";i:8;s:7:\"5140256\";i:9;s:7:\"5140257\";i:10;s:7:\"5140258\";i:11;s:7:\"5152925\";i:12;s:7:\"5152926\";i:13;s:7:\"51}' WHERE `id`='14574'
The column was actually a TINYTEXT...
The only logical explanation I can see is that either, when I created my table with an earlier version of Doctrine, the default was TINYTEXT,
OR
I remember changing the type of the column in the Doctrine annotations, and maybe the update didn't fully convert the type correctly.
Bottom line: check your column types even though you use an ORM.
Your column must have been defined as varchar(250) or similar.
You need to convert it to LONGTEXT first.
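A minimal sketch of the fix, assuming the table and column from the error message above:

```sql
-- TINYTEXT/VARCHAR(250) caps the stored value; LONGTEXT allows up to 4 GB
ALTER TABLE `table` MODIFY `numLotCommuns` LONGTEXT;
```

After the ALTER, re-save the affected rows, since the already-truncated serialized strings cannot be recovered from the column itself.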
I have some PHP scripts that read data from CSV files and insert the data into a MySQL database. When I look at the table rows in phpMyAdmin the data seems fine, but when I open one of the rows for editing, the varchar fields are full of question marks. Also, when I search the tables for any string, the result is empty. The collation of the fields is utf8_general_ci.
Even a word like KAMZA displays as ��K�A�M�Z�A��
What's the problem?
I'm scraping data from multiple pages and inserting it into my MySQL database. There could be duplicates; I only want to store unique entries. Just in case my primary key isn't sufficient, I put in a test which is checked when I get a MySQL 1062 error* (duplicate entry on primary key**). The test checks that all of the pieces of the tuple to be inserted are identical to the stored tuple. What I found is that when I get the 1062 error, the stored tuple and the scraped tuple differ by only one element/field, a TEXT field.
First, I retrieved the already stored entry and passed them both into htmlspecialchars() to compare the output visually; they looked identical.
According to strlen(), the string retrieved from the DB was 304 characters in length but the newly scraped string was 305. similar_text() backed that up by returning 304***.
So then I looped through one string, comparing it character by character with the other string and stopping at the first mismatch. The problem was the first character: in the string coming from the DB it was N, yet both strings appear to start with I (even in their output from htmlspecialchars()). Plus the DB string was supposedly one character shorter, not longer.
I then checked the output (printing htmlspecialchars()) and strlen() again, but this time before the original string (the one that ends up in the DB) was inserted, and before the duplicate was inserted. They looked the same as before and strlen() returned 305 for both.
So this made me think there must be something happening between my PHP and my MySQL. So instead of comparing the newly scraped string to the string in the database with the same primary key (the ID), I tried to retrieve a tuple where every single field is equal to its respective part in the newly scraped section, like SELECT * FROM table WHERE value1='{$MYSQL_ESCAPED['value1']}' .... AND valueN='{$MYSQL_ESCAPED['valueN']}'; and the tuple is returned. Therefore they are identical in every way, including that problematic TEXT field.
What's going on here?
Straight away, when I see N in front of a string I think of NVARCHAR etc. from MSSQL, but as far as I know that's not part of MySQL, but...
Could it have anything to do with the fact that "Each TEXT value is stored using a two-byte length prefix that indicates the number of bytes in the value."?
Or does this just point to a character encoding problem?
Edit:
There are no multi-byte characters stored in the database.
mb_strlen() returns the same results as strlen() where mentioned above.
Using utf8_encode() or mb_convert_encoding() before inserting to the DB makes no difference; an invisible N is still prefixing the string retrieved from the DB.
Notes:
Before inserting any string into my database I pass it through mysql_real_escape_string(trim(preg_replace('/\s\s+/', ' ', $str))), which replaces double spaces with single spaces, removes leading & trailing spaces and escapes it for MySQL insertion.
The page I print the output & testing to is UTF-8.
Upon creation, my DB has its character set set to utf8, its collation to utf8_general_ci and I use the SET NAMES 'utf8' COLLATE 'utf8_general_ci'; command too, as a precaution.
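One way to rule out a connection-charset mismatch (a diagnostic sketch; run it on the same connection your script uses, right after SET NAMES):

```sql
-- These should all agree (e.g. all utf8 / utf8_general_ci) for a clean pipeline
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
```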
Foot notes:
* I force an exit from the scraping then also.
** The primary key is just a ID (VARCHAR(10)) which I scrape from the pages.
*** Number of common characters
TEXT fields are subject to character set conversion as/when MySQL sees fit. However, MySQL will not randomly add/remove data without a reason. While text fields DO store the length of the data as 2 extra bytes at the head of the on-disk data blob containing the text field data, those 2 bytes are NEVER exposed to the end user. Assuming character set settings are the same throughout the client->database->on-disk->database->client pipeline, there should never be a change in string length anywhere.
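To pin down where the stray leading character comes from, it can help to inspect the raw stored bytes on the server side (a diagnostic sketch; the table, column, and key are placeholders):

```sql
-- HEX() exposes the exact stored bytes, bypassing any client-side decoding;
-- compare the first few bytes against what you believe you inserted,
-- and compare character count vs byte count for hidden multi-byte data
SELECT HEX(LEFT(text_col, 4)) AS first_bytes,
       CHAR_LENGTH(text_col)  AS chars,
       LENGTH(text_col)       AS bytes
FROM your_table
WHERE id = 'the-problem-id';
```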