I want to update a field on a really huge (1m rows) table. I want to update it from:
+-----------------------------------------------------------+
| ref                                                       |
+-----------------------------------------------------------+
| 0001___000000000003616655___IVANTI UK___TEMPLATE MATERIAL |
+-----------------------------------------------------------+
to:
+-------------------------------+
| ref                           |
+-------------------------------+
| IVANTI UK___TEMPLATE MATERIAL |
+-------------------------------+
So basically it's just changing the ref (which is not fixed length) from the sid___sku___mfr___pnum format to the mfr___pnum format.
In PHP I'd do it like so (pseudo code):
list($p['sid'], $p['sku'], $p['mfr'], $p['pnum']) = explode('___', $row['ref']);
$row['ref'] = $p['mfr'] . '___' . $p['pnum'];
I'm wondering if it's possible to do this directly in MySQL with a performant query?
select SUBSTRING_INDEX(ref,'___',-2) from test
0001___000000000003616655___IVANTI UK___TEMPLATE MATERIAL
=>
IVANTI UK___TEMPLATE MATERIAL
https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_substring-index
SUBSTRING_INDEX(str,delim,count)
Returns the substring from string str before count occurrences of the
delimiter delim. If count is positive, everything to the left of the
final delimiter (counting from the left) is returned. If count is
negative, everything to the right of the final delimiter (counting
from the right) is returned. SUBSTRING_INDEX() performs a
case-sensitive match when searching for delim.
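To make the count semantics quoted above concrete, here is a minimal Python sketch of the same behaviour. The helper name `substring_index` is hypothetical, it assumes a non-empty delimiter, and it is only a rough model of the MySQL function, not a replacement for it.

```python
def substring_index(text, delim, count):
    """Rough Python model of MySQL's SUBSTRING_INDEX (assumes delim is non-empty)."""
    if count == 0:
        return ""
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter, counting from the left.
        return delim.join(parts[:count])
    # Everything to the right of the count-th delimiter, counting from the right.
    return delim.join(parts[count:])


ref = "0001___000000000003616655___IVANTI UK___TEMPLATE MATERIAL"
print(substring_index(ref, "___", -2))  # the mfr___pnum part
print(substring_index(ref, "___", 2))   # the sid___sku part
```

With count = -2 the last two fields are kept, which is exactly the mfr___pnum part the question asks for.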
I have the above table, tblCompInfo. The product_id value is not 100% accurate and I need to fix it. I have a total of 543847 rows, with 25 different companies and 12 different products.
Now, the URL is 100% accurate. As you can see from the image, I have highlighted the wrong values in RED and the values they should be updated to in GREEN.
TASK:
I need to update product_id by parsing the URL, extracting the INTEGER and checking it against the product table: if it is a product, assign the value, else assign 0.
SOLUTION:
I have two solutions in mind:
1. EXPORT the entire DATA to an EXCEL CSV, change it and UPLOAD it to the DATABASE, which means I will spend my entire week working in EXCEL.
2. Since I have the Laravel framework, I can write a PHP function that gets the DATA company by company and UPDATEs the table in a foreach loop with a condition.
PROBLEM:
So, to make my life easy, I wrote the PHP function with a simple solution, and it works, BUT I get a MEMORY ALLOCATION PROBLEM.
$companyID = ??;
$tblCompInfos = tblCompInfo::where('company_id', '=', $companyID)->get();
foreach($tblCompInfos as $tblCompInfo)
{
    $actual_link = $tblCompInfo->url;
    $pathlink = parse_url($actual_link, PHP_URL_PATH);
    $product_id_from_url = preg_replace("/[^0-9]/", "" , $pathlink);
    $FindIfItsInProductTable = Product::find($product_id_from_url);
    $real_product_id = $FindIfItsInProductTable == null ? 0 : $product_id_from_url;
    DB::table('tblCompInfo')->where('company_id', '=', $companyID)->where('url', '=', $tblCompInfo->url)->update(array(
        'product_id' => $real_product_id,
    ));
    echo $actual_link."-".$real_product_id."=".$tblCompInfo->product_id."<br>";
}
If it were a local server, I would update my php.ini with more memory and do the job.
However, it has to be done on the LIVE server, where I have no control over php.ini.
What can I do? How can I do this easily without running into a memory issue?
Can anyone please help?
Try this:
UPDATE [table_name] SET product_id = CONVERT(SUBSTR(url, LOCATE('products/', url)+9, LOCATE('/compare',url)-(LOCATE('products/', url)+9)),UNSIGNED INTEGER)
But this will only work if every url field has /compare right after the product ID.
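The offset arithmetic in that statement is easy to get wrong: the length passed to SUBSTR has to be the distance between the end of 'products/' and the start of '/compare'. A minimal Python sketch of the same slicing follows, with the "else assign 0" rule from the question added. The function name and the `known_ids` set (standing in for the product table) are assumptions made for illustration.

```python
def product_id_from_url(url, known_ids):
    """Return the ID between 'products/' and '/compare', or 0 if it is absent or unknown."""
    start_marker, end_marker = "products/", "/compare"
    start = url.find(start_marker)
    if start == -1:
        return 0
    start += len(start_marker)          # first character after 'products/'
    end = url.find(end_marker, start)   # where '/compare' begins
    if end == -1:
        return 0
    digits = url[start:end]
    if not (digits.isascii() and digits.isdigit()):
        return 0
    product_id = int(digits)
    # known_ids is a hypothetical stand-in for a lookup in the product table.
    return product_id if product_id in known_ids else 0


print(product_id_from_url("http://example.com/products/12/compare", {12, 15}))
```

A URL without both markers, or with an ID missing from `known_ids`, yields 0.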
If you use MariaDB, you can use REGEXP_REPLACE to make the change, like this:
UPDATE your_table
SET url = REGEXP_REPLACE(url,'[0-9]+',Product_id)
WHERE Product_id > 0;
sample
MariaDB [your_schema]> SELECT REGEXP_REPLACE('http://example.com/products/12/compare','[0-9]+','99');
+------------------------------------------------------------------------+
| REGEXP_REPLACE('http://example.com/products/12/compare','[0-9]+','99') |
+------------------------------------------------------------------------+
| http://example.com/products/99/compare                                 |
+------------------------------------------------------------------------+
1 row in set (0.00 sec)
MariaDB [your_schema]>
I have a pretty odd idea but it can work.
Look at that query :
SELECT
'http://example.com/products/12/compare' as url,
'http://example.com/products/' as check1,
'http://example.com/termsets/' as check2,
'http://example.com/products/12/compare' REGEXP 'http://example.com/products/' as regexp_check1, -- check 1
SUBSTRING('http://example.com/products/12/compare', LOCATE('http://example.com/products/','http://example.com/products/12/compare')+LENGTH('http://example.com/products/'),1 ) as test1,
SUBSTRING('http://example.com/products/12/compare', LOCATE('http://example.com/products/','http://example.com/products/12/compare')+LENGTH('http://example.com/products/'),1 ) REGEXP "^[0-9]+$" as test1_only_num,
SUBSTRING('http://example.com/products/12/compare', LOCATE('http://example.com/products/','http://example.com/products/12/compare')+LENGTH('http://example.com/products/'),2 ) as test11,
SUBSTRING('http://example.com/products/12/compare', LOCATE('http://example.com/products/','http://example.com/products/12/compare')+LENGTH('http://example.com/products/'),2 ) REGEXP "^[0-9]+$" as test11_only_num,
SUBSTRING('http://example.com/products/12/compare', LOCATE('http://example.com/products/','http://example.com/products/12/compare')+LENGTH('http://example.com/products/'),3 ) as test111,
SUBSTRING('http://example.com/products/12/compare', LOCATE('http://example.com/products/','http://example.com/products/12/compare')+LENGTH('http://example.com/products/'),3 ) REGEXP "^[0-9]+$" as test111_only_num;
Result :
+----------------------------------------+------------------------------+------------------------------+---------------+-------+----------------+--------+-----------------+---------+------------------+
| url                                    | check1                       | check2                       | regexp_check1 | test1 | test1_only_num | test11 | test11_only_num | test111 | test111_only_num |
+----------------------------------------+------------------------------+------------------------------+---------------+-------+----------------+--------+-----------------+---------+------------------+
| http://example.com/products/12/compare | http://example.com/products/ | http://example.com/termsets/ |             1 | 1     |              1 | 12     |               1 | 12/     |                0 |
+----------------------------------------+------------------------------+------------------------------+---------------+-------+----------------+--------+-----------------+---------+------------------+
url, check1 and check2 are just there to display the values I'm using. This is only the main idea; the query is of course not usable as is.
Logic with check1
You check with a REGEX whether check1 is present in your URL. If yes, regexp_check1 is 1, else it's 0.
ONLY if regexp_check1 is 1, you SUBSTRING your URL to take the part that is located AFTER the check1 string. You take the first character AFTER it (test1), then the first two characters (test11), then the first three characters (test111), and so on, up to the maximum length your ID_PRODUCT can have (6 or 7, for example).
You REGEX each SUBSTRING you isolated to check whether it is numeric only (test1 is numeric only, test11 is numeric only, test111 is not).
Then you know that the content of test11, the longest numeric-only substring, is your ID.
Then you do the same thing with check2 if regexp_check1 was 0, and with a possible check3 (which would contain http://www.comadso.dk/products/ for example), and so on for every prefix your URLs can start with.
Maybe my idea is a shitty one, but hey, if it seems dumb but works, it's not dumb!
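The prefix-then-grow logic described above can be sketched in a few lines of Python. This is only an illustration of the idea: the `PREFIXES` list mirrors check1 and check2 from the query, while the function name and the `max_len` default are assumptions.

```python
import re

# check1 and check2 from the query above; extend with further prefixes as needed.
PREFIXES = ["http://example.com/products/", "http://example.com/termsets/"]


def leading_id(url, prefixes=PREFIXES, max_len=7):
    """Return the longest all-digit run right after the first matching prefix, or None."""
    for prefix in prefixes:
        pos = url.find(prefix)
        if pos == -1:
            continue  # this check failed, try the next prefix
        tail = url[pos + len(prefix):]
        best = ""
        for n in range(1, max_len + 1):
            candidate = tail[:n]  # test1, test11, test111, ...
            if len(candidate) < n or not re.fullmatch(r"[0-9]+", candidate):
                break  # ran out of characters or hit a non-digit
            best = candidate
        return best or None
    return None


print(leading_id("http://example.com/products/12/compare"))
```

For the sample URL the loop accepts "1" and "12" and stops at "12/", so "12" is returned.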
I am running these two printf() calls and am a little confused by the output they generate.
printf("%0.4f",3467);
It outputs 3467.0000. In the first parameter, %0.4f, I understand that the 4 means 4 digits after the decimal point, but I am not sure about the 0: when I change it to 1, 2 or 3 the output stays the same. So what does it do?
printf("%1.6u\n", -32);
When I run this, I get 4294967264. What exactly does this number refer to?
Before asking, I checked the printf() documentation, which refers to sprintf() for the parameter description, but I was unable to find anything on this.
The 0 in %0.4f is the minimum length the output will have when the value is formatted. In your case you will not see any difference in the output unless you change it to 10 or above, because the formatted value 3467.0000 is already 9 characters long. If you change the 0 to 15 you will get six blanks in front of the formatted output:
printf("%15.4f", 3467);
      3467.0000
|        |    |
1       10   15
In your browser you will not see the extra blanks, but if you additionally tell it to use a dot as the fill-character you will see it:
printf("%'.15.4f", 3467);
......3467.0000
|        |    |
1       10   15
As for your second question: you are formatting a signed value as unsigned output. -32 as an unsigned 32-bit integer is FFFFFFE0. If you tell printf to output that as unsigned, you get the unsigned value of FFFFFFE0, which is 4294967264 (that is, 2^32 - 32).
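Both effects can be reproduced outside PHP. The sketch below uses Python, whose % formatting handles width and precision in a similar (not identical) way, and masks to 32 bits to mimic the unsigned reinterpretation. The variable names are only illustrative.

```python
value = 3467

plain = "%0.4f" % value             # no effective width: just the 9 characters
padded = "%15.4f" % value           # minimum width 15: six leading blanks
dotted = format(value, ".>15.4f")   # same width, padded with dots instead of blanks

# Reinterpret -32 as an unsigned 32-bit integer (two's complement).
as_unsigned = -32 & 0xFFFFFFFF

print(repr(plain))
print(repr(padded))
print(repr(dotted))
print(as_unsigned, hex(as_unsigned))
```

The width only matters once it exceeds the 9 characters of 3467.0000, and the mask shows where 4294967264 comes from.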
For example, this is my table, which is called example:
--------------------------
| id | en_word | zh_word |
--------------------------
| 1 | Internet| 互联网 |
--------------------------
| 2 | Hello | 你好 |
--------------------------
and so on...
And I tried using this SQL Query:
SELECT * FROM `example` WHERE LENGTH(`zh_word`) = 3
For some reason, it wouldn't return the three-character words, but returned a lot of single-character words instead.
Why is this? Can this be fixed? I tried this out in phpMyAdmin.
But when I did it with JavaScript:
"互联网".length == 3; // true
And it seems to work fine. So how come it doesn't work?
You should use CHAR_LENGTH instead of LENGTH.
LENGTH() returns the length of the string measured in bytes.
CHAR_LENGTH() returns the length of the string measured in characters.
LENGTH returns the length in bytes (and Chinese characters are multibyte).
Use CHAR_LENGTH to get the length in characters.
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_char-length
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_length
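The byte versus character distinction is easy to check outside the database. A small Python sketch, assuming the column is stored as UTF-8 (where each of these characters takes three bytes):

```python
word = "互联网"
single = "你"

char_count = len(word)                       # characters, like CHAR_LENGTH
byte_count = len(word.encode("utf-8"))       # bytes, like LENGTH on a UTF-8 column
single_bytes = len(single.encode("utf-8"))   # one character, three bytes

print(char_count, byte_count, single_bytes)
```

Under that assumption, LENGTH(zh_word) = 3 matches single-character words, which is what the question observed.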
I am looking up which exchange services which telephone numbers, from a table of fragmentary numbers that show which exchange services them.
So my table contains, for example:
id |exchcode |exchname |easting|northin|leadin |
-----------------------------------------------------------------
12122 |SNL/UC |SANDAL |43430 |41306 |1924240 |
12123 |SNL/UC |SANDAL |43430 |41306 |1924241 |
881 |SNL/UD |SANDAL |43430 |41306 |1924249 |
2456 |BD/BCC/1 |BRADFORD CABLE |41627 |43262 |192421 |
4313 |NEY/UB |NORMANTON |43847 |42289 |192422 |
12124 |SNL/UC |SANDAL |43430 |41306 |192425 |
9949 |OBE/UB |HORBURY OSSETT |42857 |41971 |192428 |
9987 |OBE/UB |WAKEFIELD |42857 |41971 |1924 |
(sorry, formatting a bit rubbish)
leadin is the leading part of the phone number I have to match (stored as a VARCHAR, not a number).
I am supplied with a phone number, 1924283777 (not real).
How do I query to get the best match from the above table (it should pick exchange id 9949)? Or do I deal with it in code (PHP) after I've done the query?
tl;dr: variable length for values of leadin column, want best match with a number longer than leadin.
I would think something like:
WHERE ? LIKE concat(leadin, '%') ORDER BY length(leadin) DESC LIMIT 1
(I haven't checked the function names, and I'm not certain that this will work in MySQL; I'm pretty sure it will work in one of the SQL dialects I've used.)
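The same longest-prefix rule can be checked in code, which also answers the "deal with it after the query" option. A minimal Python sketch using the (id, leadin) pairs from the question's table; the names `ROWS` and `best_match` are made up for illustration.

```python
# (id, leadin) pairs copied from the table in the question.
ROWS = [
    (12122, "1924240"),
    (12123, "1924241"),
    (881, "1924249"),
    (2456, "192421"),
    (4313, "192422"),
    (12124, "192425"),
    (9949, "192428"),
    (9987, "1924"),
]


def best_match(number, rows):
    """Return the id whose leadin is the longest prefix of number, or None."""
    matches = [(row_id, leadin) for row_id, leadin in rows if number.startswith(leadin)]
    if not matches:
        return None
    # Equivalent of ORDER BY length(leadin) DESC LIMIT 1.
    return max(matches, key=lambda row: len(row[1]))[0]


print(best_match("1924283777", ROWS))
```

Both 1924 and 192428 are prefixes of 1924283777; the longer one wins, giving id 9949 as the question expects.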