MySQL Selecting million records to generate urls - php

I'm currently fetching about 2 million records from different tables to generate URLs for a sitemap. The script eats too much memory and drives the server to 100% load.
The query:
SELECT CONCAT("/url/profile/id/",u.id,"/",nickname) AS url FROM users AS u
UNION ALL
SELECT CONCAT("url/city/", c.id, "/paramId/",p.id,"/",REPLACE(p.title, " ", "+"),"/",r.region_Name,"/",c.city_Name) AS url
FROM city c
JOIN region r ON r.id = c.id_region
JOIN country country ON country.id = c.id_country
CROSS JOIN param p
WHERE country.used = 1
AND p.active = 1
// I store the rows in an array $url_list, then process it to create the sitemap, but that takes time and too many resources.
// I tried fetching the data in batches using LIMIT 0,50000,
// but getting the max row count for paging takes time, and the code doesn't look good because I have to run two queries over a large data set:
$url_list = array();
// pseudocode: run this COUNT query and read the single "max" value into $maxrow
$maxrow = SELECT COUNT(*) AS max FROM (
SELECT CONCAT("/url/profile/id/",u.id,"/",nickname) AS url FROM users AS u
UNION ALL
SELECT CONCAT("url/city/", c.id, "/paramId/",p.id,"/",REPLACE(p.title, " ", "+"),"/",r.region_Name,"/",c.city_Name) AS url
FROM city c
JOIN region r ON r.id = c.id_region
JOIN country country ON country.id = c.id_country
CROSS JOIN param p
WHERE country.used = 1
AND p.active = 1) AS tmp
$limit = 50000; // 50,000 rows per batch
$bybatch = ceil($maxrow/$limit);
$start = 0;
for ($i = 0; $i < $bybatch; $i++) {
// run the same UNION query with the current offset and store the rows in $result
(SELECT CONCAT("/url/profile/id/",u.id,"/",nickname) AS url FROM users AS u
UNION ALL
SELECT CONCAT("url/city/", c.id, "/paramId/",p.id,"/",REPLACE(p.title, " ", "+"),"/",r.region_Name,"/",c.city_Name) AS url
FROM city c
JOIN region r ON r.id = c.id_region
JOIN country country ON country.id = c.id_country
CROSS JOIN param p
WHERE country.used = 1
AND p.active = 1 LIMIT $start,$limit);
$start += $limit;
// append the batch to $url_list; the original $url_list = array_push($result) was a bug,
// since array_push() returns an int and needs the target array as its first argument
$url_list = array_merge($url_list, $result);
}
// when finished, I use this to create the sitemap files
$linkCount = 1;
$fileNomb = 1;
$i = 0;
foreach ($url_list as $ul) {
$i += 1;
if ($linkCount == 1) {
$doc = new DOMDocument('1.0', 'utf-8');
$doc->formatOutput = true;
$root = $doc->createElementNS('http://www.sitemaps.org/schemas/sitemap/0.9', 'urlset');
$doc->appendChild($root);
}
$url= $doc->createElement("url");
$loc= $doc->createElement("loc", $ul['url']);
$url->appendChild($loc);
$priority= $doc->createElement("priority",1);
$url->appendChild($priority);
$root->appendChild($url);
$linkCount += 1;
if ($linkCount == 49999) {
$f = fopen($this->siteMapMulti . $fileNomb . '.xml', "w");
fwrite($f, $doc->saveXML());
fclose($f);
$linkCount = 1;
$fileNomb += 1;
}
}
// bug fix: write out the last partial file, otherwise the tail of $url_list is lost
if ($linkCount > 1) {
$f = fopen($this->siteMapMulti . $fileNomb . '.xml', "w");
fwrite($f, $doc->saveXML());
fclose($f);
}
Is there a better way to do this, or a way to speed up the performance?
Added:
Why is the following faster than the SQL query, yet still consuming 100% of the server's resources?
$this->db->query('SELECT c.id, c.city_name, r.region_name, cr.country_name FROM city AS c, region AS r, country AS cr WHERE r.id = c.id_region AND cr.id = c.id_country AND cr.id IN (SELECT id FROM country WHERE used = 1)');
$arrayCity = $this->db->recordsArray(MYSQL_ASSOC);
$this->db->query('SELECT id, title FROM param WHERE active = 1');
$arrayParam = $this->db->recordsArray(MYSQL_ASSOC);
foreach ($arrayCity as $city) {
foreach ($arrayParam as $param) {
$paramTitle = str_replace(' ', '+', $param['title']);
$url = 'url/city/'. $city['id'] .'/paramId/'. $param['id'] .'/'. $paramTitle .'/'. $city['region_name'] .'/'. $city['city_name'];
$this->addChild($url);
}
}

I suggest you not use UNION and instead issue two separate queries; it will speed up the queries themselves.
Also, as you mentioned above, it's a good idea to fetch the data in batches.
And finally, don't collect all the data in memory. Write it to the file immediately, inside the loop:
open the file at the beginning, write each URL entry in the loop, and close the file at the end.
- open the file for writing
- run the COUNT query on the users table
- do several SELECTs with LIMIT in a loop (as you already do)
- right there in the loop, while ($row = mysql_fetch_array()), write each row to the file
Then repeat the same algorithm for the other table.
It would be useful to implement a function for writing data to file, so you can call that function and adhere to the DRY principle.
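To make that concrete, here is a minimal sketch of the streaming approach. The two queries are the halves of the UNION from the question; the mysqli connection details are placeholders, since the question doesn't show its DB layer. Looping until a batch comes back short also removes the need for the COUNT(*) query entirely.
<?php
// minimal sketch: stream each query into sitemap files in 50k batches,
// holding at most one batch in memory and writing the XML as plain strings
$db = new mysqli('localhost', 'user', 'pass', 'mydb'); // placeholder credentials

$queries = array(
'SELECT CONCAT("/url/profile/id/",u.id,"/",nickname) AS url FROM users AS u',
'SELECT CONCAT("url/city/", c.id, "/paramId/",p.id,"/",REPLACE(p.title, " ", "+"),"/",r.region_Name,"/",c.city_Name) AS url FROM city c JOIN region r ON r.id = c.id_region JOIN country country ON country.id = c.id_country CROSS JOIN param p WHERE country.used = 1 AND p.active = 1'
);

$limit = 50000;
$fileNomb = 1;
$linkCount = 0;
$header = '<?xml version="1.0" encoding="utf-8"?>' . "\n"
. '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
$f = fopen('sitemap' . $fileNomb . '.xml', 'w');
fwrite($f, $header);

foreach ($queries as $sql) {
$start = 0;
do {
$result = $db->query($sql . " LIMIT $start, $limit") or die($db->error);
$batch = $result->num_rows;
while ($row = $result->fetch_assoc()) {
fwrite($f, '<url><loc>' . htmlspecialchars($row['url']) . '</loc><priority>1</priority></url>' . "\n");
if (++$linkCount == 49999) { // roll over to the next sitemap file
fwrite($f, "</urlset>\n");
fclose($f);
$fileNomb += 1;
$f = fopen('sitemap' . $fileNomb . '.xml', 'w');
fwrite($f, $header);
$linkCount = 0;
}
}
$result->free();
$start += $limit;
} while ($batch == $limit); // a short batch means this query is exhausted: no COUNT(*) needed
}
fwrite($f, "</urlset>\n");
fclose($f);
Note that OFFSET paging itself slows down as $start grows, because MySQL still generates and skips all the earlier rows; for the users half, paging by key instead (WHERE u.id > $lastId ORDER BY u.id LIMIT 50000) stays fast as long as id is indexed.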

Related

Getting columns from more than 2 tables and then inserting them into an array

I wrote some code to try to fetch multiple columns from different tables and join them together before inserting them into an array. It worked when I was doing it on a single table, and on two different tables, but when I tried it with three tables, suddenly I'm getting more results than I'm supposed to. A lot more. Please take a look:
include('connect.php');
$arrayX = "pieces.pieceID,playerDeck.amount,playerPieces.amount/pieces,
playerDeck,playerPieces/where playerDeck.playerName = 'playerName' and
playerPieces.playerName = 'playerName' and pieces.name =
playerDeck.name = playerPieces.name";
$arrayX = explode('/', $arrayX);
$column = $arrayX[0];
$table = $arrayX[1];
$where = $arrayX[2];
$myArray = explode(',', $column);
global $connect;
$fetch = mysqli_query($connect,"SELECT $column FROM $table $where");
$count = mysqli_num_rows($fetch);
while($row=mysqli_fetch_array($fetch,MYSQLI_NUM)){
$count --;
$arrayCount = count($myArray);
while($arrayCount > 0){
$arrayCount--;
$array[$count][$arrayCount]= $row[$arrayCount];
}
}
$count = count($array);
echo $count." rows";
Expected output:
32 rows
Actual output:
31744 rows
31,744/32 = 992. Which means I got 992 copies of each of the rows I needed. I have no idea what I did wrong to get 992 copies, nor how that is even possible. If anyone can figure out what I did wrong, please point it out. Thank you very much.
Please run the following query:
SELECT COUNT(*) FROM
pieces p INNER JOIN playerDeck pd
ON p.name = pd.name
INNER JOIN playerPieces pp
ON pd.name = pp.name
WHERE pd.playerName = 'playerName' AND
pp.playerName = 'playerName'
If you get 31744 rows, then you will know that the logic in your query is not quite what you expected.
After seeing c4pone's answer (before it was deleted), I edited the part it said was breaking the query, and then it worked:
$arrayX = "pieces.pieceID,playerDeck.amount,playerPieces.amount/pieces,playerDeck,playerPieces/where playerDeck.playerName = 'playerName' and playerPieces.playerName = 'playerName' and pieces.name = playerDeck.name and pieces.name = playerPieces.name";
(The original version broke because pieces.name = playerDeck.name = playerPieces.name parses as (pieces.name = playerDeck.name) = playerPieces.name, comparing a boolean to a string instead of joining the third table, which is exactly what produced the cartesian blow-up.)
Cartesian product?
Try writing the query with INNER JOIN, like this (the SELECT list is filled in from your column string):
SELECT p.pieceID, pd.amount, pp.amount
FROM pieces p INNER JOIN playerDeck pd
ON p.name = pd.name
INNER JOIN playerPieces pp
ON pd.name = pp.name
WHERE pd.playerName = 'playerName' AND
pp.playerName = 'playerName'
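As an aside, the counting loop in the question can be replaced by a plain fetch loop. A minimal sketch, assuming the same $connect handle and the corrected join above:
<?php
// fetch the corrected three-table join into a numerically indexed array
$sql = "SELECT p.pieceID, pd.amount, pp.amount
FROM pieces p
INNER JOIN playerDeck pd ON p.name = pd.name
INNER JOIN playerPieces pp ON pd.name = pp.name
WHERE pd.playerName = 'playerName' AND pp.playerName = 'playerName'";
$fetch = mysqli_query($connect, $sql) or die(mysqli_error($connect));
$array = array();
while ($row = mysqli_fetch_row($fetch)) {
$array[] = $row; // one entry per result row, columns in SELECT-list order
}
echo count($array) . " rows"; // should now print 32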

Performance, sql heavy join vs multiple small request

I have the following Mysql database structure
[Table - Category1]
[Table Category1 -> Category2 ] (One to N relation)
[Table - Category2]
[Table Category2 -> Item ] (One to N relation)
[Table - Item]
and I want to get everything into an array in PHP with the following structure
$arr[$i]['name'] = 'name of something in category1';
$arr[$i]['data'][$j]['name'] = 'name of something in category2';
$arr[$i]['data'][$j]['data'][$k]['name'] = 'name of something in item';
So basically I don't know if I should use one "heavy" SQL request with JOINs, like the following one, or use an iterative method.
The join request
SELECT c1.name as c1name, c2.name as c2name, i.name
FROM category1 c1
LEFT JOIN category1_to_category2 c1tc2 ON c1.id = c1tc2.id_category1
LEFT JOIN category2 c2 ON c1tc2.id_category2 = c2.id
LEFT JOIN category2_to_item c2ti ON c2.id = c2ti.id_category2
LEFT JOIN item i ON c2ti.id_item = i.id
The iterative method
$sql = 'SELECT id, name FROM category1';
$result = $mysqli->query($sql);
$arr = array();
$i = 0;
while ($arr[$i] = $result->fetch_assoc()) {
$join = $mysqli->query('SELECT c2.id, c2.name FROM category2 c2 LEFT JOIN category1_to_category2 c1tc2 ON c2.id = c1tc2.id_category2 WHERE c1tc2.id_category1 = '.$arr[$i]['id']);
$j = 0;
while ($arr[$i]['data'][$j] = $join->fetch_assoc())
/* same request as above but with items */
$i++;
}
The iterative solution will make around 10 * 20 requests, which seems like a lot to me; that's why I would choose the first solution (a single request with 4 JOINs).
However, with the single request solution, my array will look like that
$arr[0]['c1name'];
$arr[0]['c2name'];
$arr[0]['iname'];
And it will require some PHP processing to obtain the desired array, which I need in order to display tabs on an HTML page. So my question is: is it better to have one big SQL request with some PHP array manipulation, or multiple small requests without the PHP array manipulation? I know that in most cases getting all the data from SQL is the better solution, but in this case I'm not sure. By the way, my only consideration is the loading time of my web page.
Thanks in advance for your help =).
It is typically better, and your example is no exception, to have the SQL server do as much of the data formatting and iteration as possible as SQL servers are typically more efficient at the task than common programming languages.
Add to this that you are cutting down on query load of the server and you have a very good reason for using complex joins.
The only downside is that complex SQL queries can be hard to format and debug; if you are not already using a third-party SQL tool, I would recommend getting one.
To go with the answer by Wobbles (that I agree with), I would suggest that you do a single query but you store the last key for each of c1name, c2name and iname. When these change you increment the relevant array subscript and initialise the lower level ones again to build up your array.
Something like this:-
<?php
$sql = "SELECT c1.name AS c1name, c2.name AS c2name, i.name AS iname
FROM category1 c1
LEFT JOIN category1_to_category2 c1tc2 ON c1.id = c1tc2.id_category1
LEFT JOIN category2 c2 ON c1tc2.id_category2 = c2.id
LEFT JOIN category2_to_item c2ti ON c2.id = c2ti.id_category2
LEFT JOIN item i ON c2ti.id_item = i.id";
$result = $mysqli->query($sql);
$arr = array();
$i = -1; // the first row increments this to 0, so $arr is indexed from 0
$j = 0;
$k = 0;
$c1name = '';
$c2name = '';
$iname = '';
while ($row = $result->fetch_assoc())
{
switch(true)
{
case $row['c1name'] != $c1name :
$i++;
$j = 0;
$k = 0;
$arr[$i]['name'] = $row['c1name'];
$arr[$i]['data'][$j]['name'] = $row['c2name'];
$arr[$i]['data'][$j]['data'][$k]['name'] = $row['iname'];
break;
case $row['c2name'] != $c2name :
$j++;
$k = 0;
$arr[$i]['data'][$j]['name'] = $row['c2name'];
$arr[$i]['data'][$j]['data'][$k]['name'] = $row['iname'];
break;
default :
$k++;
$arr[$i]['data'][$j]['data'][$k]['name'] = $row['iname'];
break;
}
$c1name = $row['c1name'];
$c2name = $row['c2name'];
$iname = $row['iname'];
}
As an aside there is some code at work that is used to generate a menu. Just 2 levels, and it was originally coded as one query for the first level and then one query for each of the records in the first level to get all the items below it. Not complex (there are only ~16 items in the first level, and on average under 10 items below each of those). I rewrote that to a single joined query. Typical time to generate that menu dropped from 0.25 seconds down to 0.004 seconds. It is easy for the time taken sending queries to the database to rapidly become excessive.

PHP MYSQL LOOP (insert TABLE_C) FROM QUERY RESULT TABLE_A and TABLE_B

I have 3 tables: table_a (4000 rows), table_b (35000 rows), and table_c to store the result.
It takes 670 seconds to complete. Is there another way to do this? (I also tried a LEFT JOIN, but the right table returns more than one row per key, so the left rows get duplicated, and it still takes about 300 seconds to complete.)
mysqli_autocommit($con_a, false); // autocommit = 0
$c_mgp = "select * from table_a where .......";
$c_mgp_r = mysqli_query($con_a, $c_mgp) or die(mysqli_error($con_a));
$multi_sq = '';
$r = 0;
while ($c_mgp_f = mysqli_fetch_array($c_mgp_r)) {
$r++;
$mgpstat = trim($c_mgp_f['STATUS']);
$mgpval = trim($c_mgp_f['VAL']);
$sand = trim($c_mgp_f['SAND']); // fixed: the original had an unbalanced parenthesis here
// this part is the most important thing: $sand is different on each
// iteration (and always matches more than one row in table_b)
$multi_sq .= "insert into table_c (NAME,VAL,VAL_RES) values('$mgpstat','$mgpval',
(select SUM(VAL_RES) from table_b where DATE = '$date_a' and GRUP = '$grup' and ACNO = '$sand'));";
if ($r == 500) {
mysqli_multi_query($con_a, $multi_sq); // flush this batch of 500 inserts
while (mysqli_next_result($con_a)); // drain results, or the next query fails with "commands out of sync"
$r = 0;
$multi_sq = '';
}
}
// flush the remaining (< 500) inserts left over after the loop
if ($multi_sq != '') {
mysqli_multi_query($con_a, $multi_sq);
while (mysqli_next_result($con_a));
}
mysqli_commit($con_a);
many thanks for the help...
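One alternative worth sketching: MySQL can do the whole job in a single INSERT ... SELECT, pre-aggregating the table_b sums per ACNO so the join cannot duplicate rows (the problem hit with the plain LEFT JOIN above). This is only a sketch: the table_a filter is left elided exactly as in the question, and the column pairings are assumptions read off the loop above.
<?php
// sketch: replace the 4000-iteration insert loop with one INSERT ... SELECT.
// the derived table collapses table_b to one SUM per ACNO, so joining it
// to table_a cannot multiply rows, and no data travels through PHP at all.
$sql = "
INSERT INTO table_c (NAME, VAL, VAL_RES)
SELECT TRIM(a.STATUS), TRIM(a.VAL), b.sum_val
FROM table_a a
LEFT JOIN (
SELECT ACNO, SUM(VAL_RES) AS sum_val
FROM table_b
WHERE DATE = '$date_a' AND GRUP = '$grup'
GROUP BY ACNO
) b ON b.ACNO = TRIM(a.SAND)
WHERE ....... "; // same (elided) filter as the original select on table_a
mysqli_query($con_a, $sql) or die(mysqli_error($con_a));
mysqli_commit($con_a);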

Convert sql php script to coldfusion

We have a script, provided by our developers in PHP, that generates a best-sellers list from our database; however, we need it in ColdFusion.
Is there a simple way to convert it, or will it need rewriting completely?
Thanks in advance for any advice :-)
// ----------
// Get Top Selling Products (by sku)
// ----------
function CWgetBestSelling($max_products=5, $sub_ids=0)
{
$productQuery = '';
$returnQuery = '';
$idList = '0';
$itemsToAdd = '';
if (!is_numeric($idList[0])) {
$idList = '0';
}
$q_productQuery = mysql_query( "
SELECT count(*) as prod_counter,
p.product_id,
p.product_name,
p.product_preview_description,
p.product_date_modified
FROM cw_products p
INNER JOIN cw_order_skus o
INNER JOIN cw_skus s
WHERE o.ordersku_sku = s.sku_id
AND s.sku_product_id = p.product_id
AND NOT p.product_on_web = 0
AND NOT p.product_archive = 1
AND NOT s.sku_on_web = 0
GROUP BY product_id
ORDER BY prod_counter DESC
LIMIT ".$max_products
,$_ENV["request.cwapp"]["db_link"]);
$productQuery = array();
while ($qd = mysql_fetch_assoc($q_productQuery)) {
$productQuery[] = $qd;
}
// add values to list
foreach ($productQuery as $values) {
$idList = $values['product_id'] . "," . $idList;
}
// if not enough results, fill in from sub_ids
if (count($productQuery) < $max_products) {
// number needed
$itemsToAdd = $max_products - count($productQuery);
for ($i = 1; $i <= $itemsToAdd; $i++) {
if (substr_count($sub_ids, ',') >= $i) {
$idListArray = explode(',', $sub_ids);
$idList .= "," . $idListArray[$i];
}
}
$q_resultsQuery = mysql_query("
SELECT 0 as prod_counter,
p.product_id,
p.product_name,
p.product_preview_description,
p.product_date_modified
FROM cw_products p
WHERE p.product_id in(".CWqueryParam($idList).")
AND NOT p.product_on_web = 0
AND NOT p.product_archive = 1
ORDER BY product_date_modified DESC
",$_ENV["request.cwapp"]["db_link"]);
} else {
$q_resultsQuery = mysql_query("
SELECT count(*) as prod_counter,
p.product_id,
p.product_name,
p.product_preview_description,
p.product_date_modified
FROM cw_products p
INNER JOIN cw_order_skus o
INNER JOIN cw_skus s
WHERE o.ordersku_sku = s.sku_id
AND s.sku_product_id = p.product_id
AND NOT p.product_on_web = 0
AND NOT p.product_archive = 1
AND NOT s.sku_on_web = 0
GROUP BY product_id
ORDER BY prod_counter DESC, product_date_modified
",$_ENV["request.cwapp"]["db_link"]);
}
while ($qd = mysql_fetch_assoc($q_resultsQuery)) {
$returnQuery[] = $qd;
}
return $returnQuery;
}
Code conversion questions don't tend to stay open long because they're viewed as lazy, so here are some references to get you started. I'm no PHP pro, but after a quick glance at your code I think this list of links will give you a good head start.
CFFunction
CFQuery
CFArgument
valueList()
CFLoop
CFif
In the interest of not doing your work for you, and to give you the opportunity to learn, I'll provide some samples but not the entire code, so you get an idea of where your PHP fits into CF. This is the tag version, not the script version.
<cffunction name = "CWgetBestSelling" ...>
<cfargument name = "max_products" default = "5" ...>
<cfargument ...>
<cfset var local.productQuery = "">
<cfset var local.returnQuery = "">
<cfset ...>
<cfset ...>
<cfquery name = "q_productQuery" datasource = "yourDatasource">
SELECT
count(*) as prod_counter,
p.product_id,
p.product_name,
p.product_preview_description,
p.product_date_modified
FROM
cw_products p
INNER JOIN cw_order_skus o
INNER JOIN cw_skus s
WHERE
o.ordersku_sku = s.sku_id
AND s.sku_product_id = p.product_id
AND NOT p.product_on_web = 0
AND NOT p.product_archive = 1
AND NOT s.sku_on_web = 0
GROUP BY
product_id
ORDER BY
prod_counter DESC
LIMIT #arguments.max_products#
</cfquery>
...
...
...
<cfreturn yourReturnVariable>
</cffunction>

Multiple Queries: Multi Variables or One Query

I had a lot of help in writing all of this in php (mainly the first long query).
// Query the database for data
$query = "SELECT cards.card_id, concat(title, \" By Amy\") AS TitleConcat,
description, meta_description,
seo_keywords,concat(\"http://www.amyadele.com/attachments//cards/\",cards.card_id,\"/\",card_image) AS ImageConcat,price
FROM cards, card_cheapest
WHERE cards.card_id = card_cheapest.card_id
ORDER BY card_id";
$result = mysql_query($query);
// Open file for writing
$myFile = "googleproducts.txt";
$fh = fopen($myFile, 'w') or die("can't open file");
// Loop through returned data and write (append) directly to file
fprintf($fh, "%-25s %-200s %-800s %-200s %-800s %-800s\n", "id", "label","description","price","image","seo keywords");
fprintf($fh, "\n");
while ($row = mysql_fetch_assoc($result)) {
fprintf($fh, "%-25s %-200s %-800s %-200s %-800s %-800s\n", $row['card_id'], $row['TitleConcat'], $row['description'],$row['price'],$row['ImageConcat'], $row['seo_keywords']);
}
// Close out the file
fclose($fh);
echo "The file has been written sucessfully to googleproducts.txt. It will run again tomorrow at 12:00pm."
?>
However, over the last couple of days I have written a couple of other queries, which leads me to my question: would it be easier to somehow fold these other queries into the first SELECT, or to set up multiple queries and then merge their results into the rows as well? (I don't even know if that is possible.)
Query: Selecting the Min Price
select card_id, min(card_price) from card_lookup_values where card_price > 0 group by card_id;
Query: Creating URL Structure
SELECT CONCAT('http://amyadele.com/', cards.title, '/', categories.seoname, '/', cards.seoname),cards.card_id
FROM cards
LEFT JOIN card_categories
ON card_categories.card_id = cards.card_id
LEFT JOIN categories
ON card_categories.category_id = categories.category_id ORDER by card_id;
I guess my question is: is it better to set up my queries (which I already know work) as multiple variables and then somehow push them into the table format I have set up, or to combine all of these queries into one long query?
I recently just wrote
SELECT
replace(lower(concat( 'http://www.amyadele.com/', pcat.seoname,'/',cat.seoname, '/', cards.seoname, '.htm' )),' ','+') AS link,
concat(pcat.name,'>',cat.name) as category,
replace(lower(concat( 'http://www.amyadele.com/', cat.seoname, '/', cards.seoname, '.htm' )),' ','+') AS add_to_cart_link
FROM cards
INNER JOIN card_categories cc ON cards.card_id = cc.card_id AND cards.card_live = 'y' AND cards.active = 'y' AND cc.active = 'Y'
INNER JOIN categories cat ON cat.category_id = cc.category_id AND cat.active = 'Y'
INNER JOIN categories pcat ON cat.parent_category_id = pcat.category_id
INNER JOIN card_lookup_values clv on clv.card_id=cards.card_id and clv.lookup_detail_id
WHERE cat.parent_category_id <>0
ORDER BY cc.card_id
However, I am really confused about how to even add this in now.
You can use 'joins' to aggregate all of this information into a single response from the DB. I don't know your exact schema, so some of this is just a guess, but here is how I would start.
SELECT
cards.card_id,
concat(cards.title, ' By Amy') AS TitleConcat,
cards.description,
cards.meta_description,
cards.seo_keywords,
concat('http://www.amyadele.com/attachments//cards/',cards.card_id,'/',cards.card_image) AS ImageConcat,
card_cheapest.price,
card_import.author,
min(card_lookup_values.card_price) AS min_price
FROM
cards
join card_cheapest on cards.card_id = card_cheapest.card_id
left join card_import on card_import.card_id = cards.card_id
join card_lookup_values on card_lookup_values.card_id = cards.card_id
WHERE card_lookup_values.card_price > 0
GROUP BY
cards.card_id
ORDER BY
cards.card_id
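If that combined query matches your schema, its result set can drop straight into the file-writing loop from the top of the question. A short sketch, assuming the min_price alias used above and the same mysql_* connection:
<?php
// sketch: feed the combined query (stored in $query) into the existing
// googleproducts.txt writer, picking up the aggregated min_price column
$result = mysql_query($query) or die(mysql_error());
$fh = fopen("googleproducts.txt", 'w') or die("can't open file");
fprintf($fh, "%-25s %-200s %-200s\n", "id", "label", "min price");
while ($row = mysql_fetch_assoc($result)) {
fprintf($fh, "%-25s %-200s %-200s\n", $row['card_id'], $row['TitleConcat'], $row['min_price']);
}
fclose($fh);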
