I am trying to make a PHP loop work for me in MySQL. Currently all visits to a website via a specific URL parameter are logged into a table along with the date and time of the visit. I am rebuilding the logging procedure to only count the visits via one specific parameter on one day, but I'll have to convert the old data first.
So here's what I'm trying to do: The MySQL table (let's call it my_visits) has 3 columns: parameter, visit_id and time.
In my PHP code, I've created the following loop to gather the data I need (all visits made via one parameter on one day, for all parameters):
foreach (range(2008, 2014) as $year) {
$visit_data = array();
$date_ts = strtotime($year . '-01-01');
while ($date_ts <= strtotime($year . '-12-31')) {
$date = date('Y-m-d', $date_ts);
$date_ts += 86400;
// count visit data
$sql = 'SELECT parameter, COUNT(parameter) AS total ' .
'FROM my_visits ' .
'WHERE time BETWEEN ? AND ? ' . // bound below, so the parameter count matches
'GROUP BY parameter ORDER BY total DESC';
$stmt = $db->prepare($sql);
$stmt->execute(array($date . ' 00:00:00', $date . ' 23:59:59'));
while ($row = $stmt->fetch()) {
$visit_data[] = array(
'param' => $row['parameter'],
'visit_count' => $row['total'],
'date' => $date);
}
$stmt->closeCursor();
}
}
Later on, the gathered data is inserted into a new table (basically eliminating visit_id) using a multiple INSERT (thanks to SO! :)).
The above code works, but due to the size of the table (roughly 3.4 million rows) it is very slow. Using 7 * 365 SQL queries just to gather the data seems just wrong to me and I fear the impact of just running the script will slow everything down substantially.
Is there a way to make this loop work in MySQL, like an equivalent query or something (on a yearly basis perhaps)? I've already tried a solution using GROUP BY, but since this eliminates either the specific dates or the parameters, I can't get it to work.
You can GROUP further.
SELECT `parameter`, COUNT(`parameter`) AS `total`, DATE(`time`) AS `date`
FROM `my_visits`
GROUP BY `parameter`, DATE(`time`)
You can then execute it once (instead of in a loop) and use $row['date'] instead of $date.
This also means you don't have to update your code when we reach 2015 ;)
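As a rough sketch of how the PHP side could look once the loop is gone: the function below runs the grouped query a single time and builds rows in the same shape as $visit_data from the question. The function name collectDailyCounts and the added ORDER BY are assumptions for illustration, not part of the original code, and $db is assumed to be a PDO connection.

```php
<?php
// Sketch only: collectDailyCounts() is a hypothetical helper name.
// It assumes $db is a PDO connection to the database holding my_visits.
function collectDailyCounts(PDO $db): array
{
    // One query replaces the 7 * 365 per-day queries.
    // The ORDER BY is an addition, used here only to get a stable row order.
    $sql = 'SELECT `parameter`, COUNT(`parameter`) AS `total`, DATE(`time`) AS `date` '
         . 'FROM `my_visits` '
         . 'GROUP BY `parameter`, DATE(`time`) '
         . 'ORDER BY `date`, `total` DESC';

    $visit_data = [];
    foreach ($db->query($sql, PDO::FETCH_ASSOC) as $row) {
        $visit_data[] = [
            'param'       => $row['parameter'],
            'visit_count' => (int) $row['total'],
            'date'        => $row['date'], // comes from the query, not from a PHP loop
        ];
    }
    return $visit_data;
}
```

The returned array can be fed to the existing multiple INSERT unchanged, since each entry still carries param, visit_count and date.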
Related
I have an online calendar listing upcoming live music. I use a prepared statement to fetch the listings for the next eight days and display them in a table. What I need to do, before displaying the results, is count the number of unique dates ('Date') within the listings. For example, if only five days out of the next eight have events happening, I need to know that number is 5.
Using COUNT(DISTINCT) works for giving me that number, but then it only displays one row of results, so I need another solution
My code is this:
$mysqli = new mysqli("Login Stuff Here");
if ($mysqli->connect_errno){
echo "Failed to connect to MySQL: (" . $mysqli->connect_errno . ") " . $mysqli->connect_error;
}
$start = strtotime('today midnight');
$stop = strtotime('+1 week');
$start = date('Y-m-d', $start);
$stop = date('Y-m-d', $stop);
$allDates = $mysqli->prepare("SELECT
ID,
Host,
Type,
Bands,
Date,
Time,
Price,
Note,
Zip,
URL
FROM things WHERE (Date >= ? AND Date <= ?) ORDER BY Date, Time, Host");
$allDates->bind_param("ss", $start, $stop);
$allDates->execute();
$allDates->bind_result($ID, $Host, $Type, $Bands, $Date, $Time, $Price, $Note, $Zip, $URL);
while($allDates->fetch()):
// ECHO ALL THE INFO IN A NICE TABLE
endwhile;
$allDates->close();
I need to count the unique values (and maybe even retrieve them) from the 'Date' column. Right now I have it working by doing a separate query, but I'm sure there's a better way.
EDIT: Ultimately, I wound up doing a separate query, which worked out well as I was able to use it for other things as well. I found that using GROUP BY always only returned just one result per date, so it didn't work for displaying the full listings. Maybe I was going at it wrong, but I wound up being good in another way. Thanks!
You are missing group by:
GROUP BY Date
To use an aggregate function such as COUNT or SUM per group, you need to apply GROUP BY to a specific column or list of columns.
Example :
GROUP BY Date ORDER BY Date, Time, Host
Please note that every non-aggregated column in the SELECT list must also appear in the GROUP BY list.
Also,
Date >= ? AND Date <= ?
can be replaced by
Date BETWEEN ? AND ?
I'd recommend grouping by date as you fetch the results from the query.
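$result = $allDates->get_result(); // used instead of bind_result(); needs the mysqlnd driver
while ($row = $result->fetch_assoc()) {
$dates[$row['Date']][] = $row;
}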
That makes it easy to count the distinct dates
$count = count($dates);
And potentially simpler to format your output (headers/sections for each date, etc.)
foreach($dates as $date) {
foreach($date as $event) {
// ECHO ALL THE INFO IN A NICE TABLE
}
}
It does require iterating the same data twice, but for a reasonable amount of data to display on a page, that shouldn't make much difference.
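To make the grouping idea concrete without a database, here is a minimal sketch that works on plain arrays. The helper name groupByDate and the sample rows are made up for illustration; the only thing taken from the question is the 'Date' column name.

```php
<?php
// Sketch only: groupByDate() is a hypothetical helper, and $rows is invented
// sample data standing in for the rows fetched from the prepared statement.
function groupByDate(array $rows): array
{
    $dates = [];
    foreach ($rows as $row) {
        // Rows that share a date end up in the same bucket.
        $dates[$row['Date']][] = $row;
    }
    return $dates;
}

$rows = [
    ['Date' => '2024-05-01', 'Time' => '19:00', 'Host' => 'Venue A'],
    ['Date' => '2024-05-01', 'Time' => '21:00', 'Host' => 'Venue B'],
    ['Date' => '2024-05-03', 'Time' => '20:00', 'Host' => 'Venue A'],
];

$dates = groupByDate($rows);
$count = count($dates);             // number of distinct dates in the listings
$uniqueDates = array_keys($dates);  // the distinct dates themselves
```

Because the query already orders by Date, the buckets come out in date order, so the nested foreach from the answer can print them directly.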
I'm writing a time logging programme for a client who is a piano tuner, and I've written the following PHP code to give a record a status of 'to do':
$last_tuned = '2017-01-05';
$tuning_period = 3;
$month_last_tuned = date('Y-m', strtotime(date('Y-m-d', strtotime($last_tuned))));
$next_tuning = date('Y-m', strtotime($month_last_tuned.(' +'.$tuning_period.' months')));
if (time() > strtotime($next_tuning.' -1 months')) {
if (time() > strtotime($next_tuning)) {
return 'late';
} else {
return 'upcoming';
}
}
As you can see, the $last_tuned variable is of the date(YYYY-MM-DD) format. This is then converted to a (YYYY-MM) format.
Once converted, an additional number of months, identical to $tuning_period, is then added to the $month_last_tuned variable, giving us a month and year value for when we need to add a new record.
If the current time (found with time()) is greater than the $next_tuning variable - 1 month, it returns that the task is upcoming. If it's after the $next_tuning variable, it returns that the task is late.
I now have to write a MySQL query to list the items that would return as upcoming or late.
How would I write this in MySQL? I'm not very good with MySQL functions, and some help would be much appreciated.
My attempt at the logic is:
SELECT * FROM records
// The next lines are to get the most recent month_last_tuned value and add the tuning_period variable
WHERE
NOW() > (SELECT tuning_date FROM tunings ORDER BY tuning_date ASC LIMIT 1)
+
(SELECT tuning_period FROM records WHERE records.id = INITIAL CUSTOMER ID)
I know that that is completely wrong. The logic is pretty much there though.
My database schema is as follows:
I expect the rows returned from the query to be on par with the 'late' or 'upcoming' values in the PHP code above. This means that the rows returned will be within 1 month of their next tuning date (calculated from last tuning plus tuning period).
Thanks!
You'd probably be better off with using the DateTime object instead of manipulating date strings.
$last_tuned = '2017-01-05';
$tuning_period = 3; // months
$dt_last_tuned = DateTimeImmutable::createFromFormat('!Y-m-d', $last_tuned); // "!" resets the time to midnight instead of "now"
$dt_next_tuning = $dt_last_tuned->add(new DateInterval('P' . $tuning_period . 'M'));
$dt_now = new DateTimeImmutable();
$dt_tuning_upcoming = $dt_next_tuning->sub(new DateInterval('P1M'));
if( $dt_now > $dt_next_tuning) {
return 'late';
}
if( $dt_now > $dt_tuning_upcoming) {
return 'upcoming';
}
You can also use these DateTime objects in your MySQL queries, by building the query and passing through something like $dt_next_tuning->format('Y-m-d H:i:s'); as needed.
Given your table structure, however, it may be easier to just get all the relevant records and process them. It's a little difficult to tell exactly how the pieces fit together, but generally speaking MySQL shouldn't be used for "processing" stuff.
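If the records are fetched and processed in PHP as suggested, the DateTime logic above could be wrapped in a small function and applied to each row. This is only a sketch: the name tuningStatus and the explicit $now argument are assumptions added so the check can be exercised with fixed dates, and unlike the original string-based code it compares full dates rather than truncating to the month.

```php
<?php
// Sketch only: tuningStatus() is a hypothetical helper. It returns 'late',
// 'upcoming', or null when the next tuning is more than a month away.
function tuningStatus(string $lastTuned, int $periodMonths, DateTimeImmutable $now): ?string
{
    // The leading "!" resets unspecified fields, so the time is 00:00:00.
    $last = DateTimeImmutable::createFromFormat('!Y-m-d', $lastTuned);
    if ($last === false) {
        throw new InvalidArgumentException("Unparseable date: {$lastTuned}");
    }

    $next = $last->add(new DateInterval('P' . $periodMonths . 'M'));
    $upcomingFrom = $next->sub(new DateInterval('P1M'));

    if ($now > $next) {
        return 'late';
    }
    if ($now > $upcomingFrom) {
        return 'upcoming';
    }
    return null;
}

// Example with the values from the question and a fixed "current" date.
$status = tuningStatus('2017-01-05', 3, new DateTimeImmutable('2017-03-10'));
```

Filtering the fetched rows then comes down to keeping those for which the function returns a non-null status.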
I'm currently struggling with an issue that overloads my database and significantly delays all page requests.
Current scenario
- A certain Artisan Command is scheduled to run every 8 minutes
- This command has to update a whole table with more than 30000 rows
- Every row will have a new value, which means 30000 queries will have to be executed
- For about 14 seconds the server doesn't answer due to database overload (I guess)
Here's the command's handle() method:
public function handle()
{
$thingies = /* Insert big query here */
foreach ($thingies as $thing)
{
$resource = Resource::find($thing->id);
if(!$resource)
{
continue;
}
$resource->update(['column' => $thing->value]);
}
}
Is there any other approach that doesn't delay my page requests?
Your process is really inefficient and I'm not surprised it takes a long time to complete. To process 30,000 rows, you're making 60,000 queries (half to find out if the id exists, and the other half to update the row). You could be making just 1.
I have no experience with Laravel, so I'll leave it up to you to find out what functions in Laravel can be used to apply my recommendation. I just want to get you to understand the concepts.
MySQL allows you to submit a multi query: one command that executes many queries. It is drastically faster than executing individual queries in a loop. Here is an example that uses MySQLi directly (no third-party framework such as Laravel):
//the 30,000 new values and the record IDs they belong to. These values
// MUST be escaped or known to be safe
$values = [
['id'=>145, 'fieldName'=>'a'], ['id'=>2, 'fieldName'=>'b']...
];
// %s and %d will be replaced with column value and id to look for
$qry_template = "UPDATE myTable SET fieldName = '%s' WHERE id = %d";
$queries = [];//array of all queries to be run
foreach ($values as $row){ //build and add queries
$q = sprintf($qry_template,$row['fieldName'],$row['id']);
array_push($queries,$q);
}
//combine all into one query
$combined = implode("; ",$queries);
//execute all queries at once
$mysqli->multi_query($combined);
I would look into how Laravel does multi queries and start there. The last time I implemented something like this, it took about 7 milliseconds to insert 3,000 rows. So updating 30,000 will definitely not take 14 seconds.
As an added bonus, there is no need to first run a query to figure out whether the ID exists. If it doesn't, nothing will be updated.
Thanks to #cyclone comment I was able to update all the values in one single query.
It's not a perfect solution, but the query execution time now takes roughly 8 seconds and only 1 connection is required, which means the page requests are still being handled when the query is being executed.
I'm not marking this question as definitive since there might be improvements to make.
$ids = [];
$caseQuery = '';
foreach ($thingies as $thing)
{
if(strlen($caseQuery) == 0)
{
$caseQuery = '(CASE WHEN id = '. $thing->id . ' THEN \''. $thing->rank .'\' ';
}
else
{
$caseQuery .= ' WHEN id = '. $thing->id . ' THEN \''. $thing->rank .'\' ';
}
array_push($ids, $thing->id);
}
$caseQuery .= ' END)';
// Execute query
DB::update('UPDATE <table> SET <value> = '. $caseQuery . ' WHERE id IN ('. implode( ',' , $ids) .')');
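The same CASE-based update can be built with placeholders rather than by concatenating values into the SQL string. The sketch below is one possible shape, not the code that was actually deployed: buildCaseUpdate, the table name resources and the column name score are all hypothetical, and the table and column names must be trusted constants because they are interpolated directly.

```php
<?php
// Sketch only: buildCaseUpdate() is a hypothetical helper.
// $rows is a non-empty list of ['id' => int, 'value' => mixed] pairs.
// Returns [$sql, $bindings] for a single UPDATE ... CASE statement.
function buildCaseUpdate(string $table, string $column, array $rows): array
{
    if ($rows === []) {
        throw new InvalidArgumentException('At least one row is required.');
    }

    $cases = [];
    $bindings = [];
    $ids = [];
    foreach ($rows as $row) {
        $cases[] = 'WHEN ? THEN ?';       // one branch per row
        $bindings[] = (int) $row['id'];
        $bindings[] = $row['value'];
        $ids[] = (int) $row['id'];
    }

    $inList = implode(', ', array_fill(0, count($ids), '?'));
    $sql = "UPDATE {$table} SET {$column} = CASE id " . implode(' ', $cases)
         . " END WHERE id IN ({$inList})";

    // CASE bindings come first, then the ids for the IN (...) list.
    return [$sql, array_merge($bindings, $ids)];
}

// Hypothetical usage inside the command's handle() method:
// foreach (array_chunk($rows, 1000) as $chunk) {
//     [$sql, $bindings] = buildCaseUpdate('resources', 'score', $chunk);
//     DB::update($sql, $bindings);
// }
```

Chunking is suggested because each row costs three placeholders and MySQL caps a prepared statement at 65,535 of them, so 30,000 rows would not fit in one statement. Smaller statements also hold locks for a shorter time, which may help with the page delays, though that would need measuring.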
I've worked with Postgresql some, but I'm still a novice. I usually default to creating way too many queries and hacking my way through to get the result I need from a query. This time I'd like to write some more streamlined code since I'll be dealing with a large database, and the code needs to be as concise as possible.
So I have a lot of point data, and then I have many counties. I have two tables, "counties" and "ltg_data" (the many points). My goal is to read through a specified number of counties (as given in an array) and determine how many points fall in each county. My novice, repetitive and inefficient way of doing this is by writing queries like this:
$klamath_40_days = pg_query($conn, "SELECT countyname, time from counties, ltg_data where st_contains(counties.the_geom, ltg_data.ltg_geom) and countyname");
$klamath_rows = pg_num_rows($klamath_40_days);
If I run a separate query like the above for each county, it gives me nice output, but it's repetitive and inefficient. I'd much rather use a loop. And eventually I'll need to pass params into the query via the URL. When I try to run a for loop in PHP, I get errors saying "query failed: ERROR: column "jackson" does not exist", etc. Here's the loop:
$counties = array ('Jackson', 'Klamath');
foreach ($counties as $i) {
echo "$i<br>";
$jackson_24 = pg_query($conn, "SELECT countyname, time from counties, ltg_data where st_contains(counties.the_geom, ltg_data.ltg_geom) and countyname = ".$i." and time >= (NOW() - '40 DAY'::INTERVAL)");
$jackson_rows = pg_num_rows($result);
}
echo "$jackson_rows";
So then I researched the pg_query_params feature in PHP, and I thought this would help. But I run this script:
$counties = array('Jackson', 'Josephine', 'Curry', 'Siskiyou', 'Modoc', 'Coos', 'Douglas', 'Klamath', 'Lake');
$query = "SELECT countyname, time from counties, ltg_data where st_contains(counties.the_geom, ltg_data.ltg_geom) and countyname = $1 and time >= (NOW() - '40 DAY'::INTERVAL)";
$result = pg_query_params($conn, $query, $counties);
And I get this error: Query failed: ERROR: bind message supplies 9 parameters, but prepared statement "" requires 1 in
So I'm basically wondering what the best way to pass parameters (either individual from perhaps a URL passed param or multiple elements in an array) to a postgresql query is? And then I'd like to echo out the summary results in an organized manner.
Thanks for any help with this.
If you just need to know how many points fall into each county specified in an array, then you can do the following in a single call to the database:
SELECT countyname, count(*)
FROM counties
JOIN ltg_data ON ST_contains(counties.the_geom, ltg_data.ltg_geom)
WHERE countyname = ANY ($counties)
AND time >= now() - interval '40 days'
GROUP BY countyname;
This is much more efficient than making individual calls, and you return only a single instance of the county name rather than one for every record that is retrieved. If you have, say, 1,000 points in the county Klamath, you return the string "Klamath" just once instead of 1,000 times. Also, PHP doesn't have to count the length of the query result. All in all much cleaner and faster.
Note also the JOIN syntax in combination with the PostGIS function call.
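From PHP, the single grouped query can be called with the whole county list bound as one parameter. The sketch below assumes this works for the asker's schema as described; the helper names toPgTextArray and countPointsByCounty and the points alias are made up here. The list is sent as a PostgreSQL array literal and cast with $1::text[], which sidesteps the "supplies 9 parameters" error because only one parameter is bound.

```php
<?php
// Sketch only: both function names are hypothetical helpers.

// Turn a PHP list of strings into a PostgreSQL array literal such as
// {"Jackson","Klamath"}, escaping backslashes and double quotes.
function toPgTextArray(array $values): string
{
    $quoted = array_map(function ($value) {
        return '"' . str_replace(['\\', '"'], ['\\\\', '\\"'], (string) $value) . '"';
    }, $values);
    return '{' . implode(',', $quoted) . '}';
}

// Count lightning points per county for the last 40 days in one round trip.
// $conn is assumed to be an open connection from pg_connect().
function countPointsByCounty($conn, array $counties): array
{
    $query = <<<'SQL'
SELECT countyname, count(*) AS points
FROM counties
JOIN ltg_data ON ST_Contains(counties.the_geom, ltg_data.ltg_geom)
WHERE countyname = ANY ($1::text[])
AND time >= now() - interval '40 days'
GROUP BY countyname
SQL;

    $result = pg_query_params($conn, $query, array(toPgTextArray($counties)));
    if ($result === false) {
        throw new RuntimeException(pg_last_error($conn));
    }

    $counts = [];
    while ($row = pg_fetch_assoc($result)) {
        $counts[$row['countyname']] = (int) $row['points'];
    }
    return $counts;
}
```

Counties with no points in the period simply do not appear in the result, so the caller may want to default missing names to 0 when echoing the summary.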
To execute a query with a parameter in a loop for several values you can use the following pattern:
$counties = array('Jackson', 'Josephine', 'Curry');
$query = "SELECT countyname, time from counties where countyname = $1";
foreach ($counties as $county) {
$result = pg_query_params($conn, $query, array($county));
$row = pg_fetch_row($result);
echo "$row[0] $row[1] \n";
}
Note that the third parameter of pg_query_params() is an array, hence you must put array($county) even though there is only one parameter.
You can also execute one query with an array as parameter.
In this case you should use postgres syntax for an array and pass it to the query as a text variable.
$counties = "array['Jackson', 'Josephine', 'Curry']";
$query = "SELECT countyname, time from counties where countyname = any ($counties)";
echo "$query\n\n";
$result = pg_query($conn, $query);
while ($row = pg_fetch_row($result)) {
echo "$row[0] $row[1] \n";
}
I am creating an online calendar for a client using PHP/MySQL.
I initiated a <table> and <tr>, and after that have a while loop that creates a new <td> for each day, up to the max number of days in the month.
The line after the <td>, PHP searches a MySQL database for any events that occur on that day by comparing the value of $i (the counter) to the value of the formatted Unix timestamp within that row of the database. In order to increment the internal row counter ONLY when a match is made, I have made another while loop that fetches a new array for the result. It is significantly slowing down loading time.
Here's the code, shortened so you don't have to read the unnecessary stuff:
$qry = "SELECT * FROM events WHERE author=\"$author\"";
$result = mysql_query($qry) or die(mysql_error());
$row = mysql_fetch_array($result);
for ($i = 1; $i <= $max_days; $i++) {
echo "<td class=\"day\">";
$rowunixdate_number = date("j", $row['unixdate']);
if ($rowunixdate_number == $i) {
while ($rowunixdate_number == $i) {
$rowtitle = $row['title'];
echo $rowtitle;
$row = mysql_fetch_array($result);
$rowunixdate_number = date("j", $row['unixdate']);
}
}
echo "</td>";
if (newWeek($day_count)) {
echo "</tr><tr>";
}
$day_count++;
}
The slowness is most likely because you're doing 31 queries, instead of 1 query before you build the HTML table, as Nael El Shawwa pointed out -- if you're trying to get all the upcoming events for a given author for the month, you should select that in a single SQL query, and then iterate over the result set to actually generate the table. E.g.
$sql = "SELECT * FROM events WHERE author = '$author' ORDER BY unixdate ASC";
$rsEvents = mysql_query($sql);
echo("<table><tr>");
while ($Event = mysql_fetch_array($rsEvents)) {
echo("<td>[event info in $Event goes here]</td>");
}
echo("</tr></table>");
Furthermore, it's usually a bad idea to intermix SQL queries and HTML generation. Your external data should be gathered in one place, the output data generated in another. My example cuts it close, by having the SQL immediately before the HTML generation, but that's still better than having an HTML block contain SQL queries right in the middle of it.
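The separation described above (gather the data first, render afterwards) could look roughly like the following. Everything here is a sketch under assumptions: the function names eventsByDay and renderDayCells are invented, the rows are assumed to have the unixdate and title columns from the question, timestamps are interpreted in UTC via gmdate(), and the week-row handling (newWeek) is left out to keep the example short.

```php
<?php
// Sketch only: hypothetical helpers, not the original calendar code.

// Step 1: index the fetched event rows by day of the month (1..31).
function eventsByDay(array $events): array
{
    $byDay = [];
    foreach ($events as $event) {
        $day = (int) gmdate('j', $event['unixdate']); // UTC is an assumption
        $byDay[$day][] = $event['title'];
    }
    return $byDay;
}

// Step 2: render one <td> per day using only the prepared array.
function renderDayCells(array $byDay, int $maxDays): string
{
    $html = '';
    for ($i = 1; $i <= $maxDays; $i++) {
        $titles = array_map('htmlspecialchars', $byDay[$i] ?? []);
        $html .= '<td class="day">' . implode('<br>', $titles) . '</td>';
    }
    return $html;
}

// Invented sample rows standing in for the single query's result set.
$events = [
    ['unixdate' => gmmktime(20, 0, 0, 5, 3, 2010), 'title' => 'Gig A'],
    ['unixdate' => gmmktime(21, 0, 0, 5, 3, 2010), 'title' => 'Gig B'],
    ['unixdate' => gmmktime(19, 0, 0, 5, 7, 2010), 'title' => 'Gig C'],
];
$byDay = eventsByDay($events);
$cells = renderDayCells($byDay, 7);
```

With the lookup array in place, the day loop no longer has to advance a result cursor, so a day without events cannot throw the row pointer out of step.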
Have you run that query in a MySQL tool to see how long it takes?
Do you have an index on the author column?
There's nothing wrong with your PHP. I suspect the query is the problem and no index is the cause.
Aside from the comments above, also try to optimize your SQL query, since this is one of the most common sources of performance issues.
Let's say you have a news article table with Title, Date, Blurb and Content fields, and you only need to fetch the titles and display them as a list on the HTML page.
Running "SELECT * FROM TABLE"
means that you are requiring the DB server to fetch all the field data when doing the loop (including the Blurb and Content, which you are not going to use).
If you optimize that to something like
"SELECT Title, Date FROM TABLE", it would fetch only the necessary data and would be more efficient in terms of server utilization.
I hope this helps you.
Is 'author' an id? or a string? Either way an index would help you.
The query is not slow; it's the for loop that's causing the problem. It's not complete: it is missing the $i loop condition and increment. Or is this a typo?
Why don't you just order the query by the date?
SELECT * FROM events WHERE author=? ORDER BY unixdate ASC
and keep a variable holding the current date, so you can apply any logic required to group events by date in your table, e.g. giving all event rows with the same date the same color.
Assuming the date is a unix timestamp that does not account for the event's time then you can do this:
$currentDate = 0;
while($row = mysql_fetch_array($result)){
if($currentDate == $row['unixdate']){
//code to present an event that is on the same day as the previous event
}else{
//code to present an event on a date that is past the previous event
//you are sorting events by date in the query
}
//update currentDate for next iteration
$currentDate = $row['unixdate'];
}
If unixdate includes the event time, then you need to add some logic to extract just the date part of the Unix timestamp, excluding the hours and minutes.
Hope that helps