Processing 10 million datasets - PHP and SQL [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
We're using PHP 7 and have a MySQL DB running on a web server with only 128 MB RAM.
We have a problem with processing a huge number of rows.
Simple description: we have 40,000 products and want to collect data about these products to find out whether they need to be updated or not. The query that collects the specific data from another table with 10 million rows takes 1.2 seconds, because it contains some SUM functions. We need to run the query for every product individually, because the time range relevant for the SUM differs per product. Because of the sheer number of queries, the function that should iterate over all the products hits a timeout (after 5 minutes). That's why we decided to implement a cronjob, which calls the function, and the function continues with the product it stopped at the last time. We call the cronjob every 5 minutes.
But still, with our 40,000 products, it takes ~30 hours until all products are processed. Per cronjob run, our function processes about 100 products...
How is it possible to deal with such a mass of data? Is there a way to parallelize it, e.g. with pthreads, or does somebody have another idea? Could a server upgrade be a solution?
Thanks a lot!
Nadine

Parallel processing will require resources as well, so with 128 MB it will not help.
Monitor your system to see where the bottleneck is. Most probably it is memory, since it is so low. Once you find the bottleneck resource, you will have to increase it. No amount of tuning and tinkering will solve an overloaded server issue.
If you can see that it is not a server resource issue (!), the problem could be at the query level (too many joins, missing indexes, ...). And your 5-minute timeout could be increased.
But start with the server.
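Beyond fixing the server, the per-product loop itself can often be collapsed into a single grouped query, since each product's differing time range can live in the JOIN condition. A minimal sketch of the idea, with an entirely hypothetical schema (a `products` table carrying its own `sum_from`/`sum_to` range, and a `measurements` table standing in for the 10-million-row table); it uses an in-memory SQLite database via PDO so the example is self-contained, but the same SQL runs on MySQL:

```php
<?php
// Sketch only: collapses 40,000 per-product SUM queries into one
// grouped query. All table and column names (products, measurements,
// sum_from, sum_to, value) are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE products (id INTEGER PRIMARY KEY,
                                  sum_from TEXT, sum_to TEXT)');
$db->exec('CREATE TABLE measurements (product_id INTEGER,
                                      created_at TEXT, value REAL)');
// Composite index so each product only scans its own time range.
$db->exec('CREATE INDEX idx_meas_product_time
               ON measurements (product_id, created_at)');

$db->exec("INSERT INTO products VALUES
           (1, '2020-01-01', '2020-01-31'),
           (2, '2020-02-01', '2020-02-28')");
$db->exec("INSERT INTO measurements VALUES
           (1, '2020-01-10', 10), (1, '2020-01-20', 20),
           (1, '2020-03-01', 99),
           (2, '2020-02-05', 5)");

// One grouped query instead of 40,000 round trips: the JOIN condition
// restricts each product's SUM to that product's own time range.
$totals = $db->query(
    'SELECT p.id, SUM(m.value) AS total
       FROM products p
       JOIN measurements m
         ON m.product_id = p.id
        AND m.created_at BETWEEN p.sum_from AND p.sum_to
      GROUP BY p.id'
)->fetchAll(PDO::FETCH_KEY_PAIR);

// $totals maps product id => SUM within that product's window; the
// 2020-03-01 row falls outside product 1's range and is ignored.
```

On MySQL, pairing this with the composite index on `(product_id, created_at)` is usually what turns a 1.2-second per-product SUM into something the server can do in one pass.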

Related

Frequent database calculations [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I am building a browser-based PHP game.
There will be resources like metal, wood, food, etc. Players will be gaining resources all the time (the rate depends on the levels of their buildings/mines/farms).
The amount of each resource is saved in a resources table in the database.
Let's say someone gains 50,000 metal hourly.
What is the best way to save these values to the database, or to recalculate them?
It would be crazy to update the resources table every second to keep it current. How to do it best?
If you can afford a stateful design, I have found that it is usually best to keep and maintain them in session, and aggregate the changes and write them out to the database at set intervals (of say 10 minutes), or when the session ends.
High rates of update can kill database performance: this impact is multiplied when the table you're writing to has any significant indexing. Different databases can support different transaction rates, and if you have more than a couple users, once-per-second updates will just kill performance.
An alternative is to write out these updates to a local or temporary queue table, containing only an index on the autoincrement field, and to have a sweeper process blow through it periodically to add those updates to the eventual target table at low priority. This keeps the update overhead lower, and reduces contention to the critical table, but it also means that your application logic will have to read the database value, and add the "pending" changes, before it receives a usable value.
A last alternative, roughly a midpoint between the two above, is to use a queue for storing pending database changes; however, it would make it more difficult to calculate point-in-time values while unwritten changes are still sitting in the queue.
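The "aggregate in session, flush at intervals" idea from the first paragraph can be sketched roughly like this. The resources table layout, the `$session` array shape, and the 10-minute interval are all assumptions:

```php
<?php
// Sketch: accumulate resource deltas in the session and write them
// out in one burst, instead of updating the DB every second.
const FLUSH_INTERVAL = 600; // flush at most every 10 minutes

function addResource(array &$session, string $type, int $amount): void
{
    // Accumulate deltas in the session instead of hitting the DB.
    $session['pending'][$type] = ($session['pending'][$type] ?? 0) + $amount;
}

function maybeFlush(array &$session, PDO $pdo): bool
{
    $now = time();
    if ($now - ($session['last_flush'] ?? 0) < FLUSH_INTERVAL) {
        return false; // too soon; keep accumulating
    }
    // Write the aggregated deltas out in one short burst.
    $stmt = $pdo->prepare(
        'UPDATE resources SET amount = amount + :delta
          WHERE player_id = :pid AND type = :type');
    foreach ($session['pending'] ?? [] as $type => $delta) {
        $stmt->execute([':delta' => $delta,
                        ':pid'   => $session['player_id'],
                        ':type'  => $type]);
    }
    $session['pending'] = [];
    $session['last_flush'] = $now;
    return true;
}
```

You would also call the flush unconditionally when the session ends, so nothing accumulated is lost.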

PHP & MySQL: Good/efficient way of showing statistics from thousands of rows on each page load [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I know this is the wrong way to go about it, but I have a database filled with stats about vehicles that are imported from Excel files.
For each vehicle (about 100 currently, updated every three days) I have 500 to 2,000 rows of data, which are used to build graphs of fuel consumption, distance driven, etc.
This is fairly simple and takes 1 to 3 seconds to load, but I also need the total stats to compare each car against.
So if I build the graph for car id 1, I want to see the difference between its fuel consumption and the global fuel consumption (of all existing cars).
Is there a way to do this without querying not only the single car but also all the cars on every page load?
Thank you
It sounds like you need to pre-compile your stats into a summary table. Write a function that takes in 1 vehicle as a parameter, compiles all your stats, then saves them to a dedicated summary table. Then write a background script that calls that function for all vehicles one by one. You can call the background script as often as you feel the stats need to be updated, leaving the web interface free to do very little computing/io.
This type of thing has saved me quite a bit of headache over the years.
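The summary-table approach might be sketched like this; the table and column names (`vehicles`, `vehicle_stats`, `vehicle_summary`) and the chosen aggregates are assumptions, not the asker's real schema:

```php
<?php
// Sketch of the pre-compiled summary-table idea: a background script
// refreshes one small row per vehicle, and page loads only ever read
// the summary table. vehicle_summary needs a PRIMARY KEY or UNIQUE
// constraint on vehicle_id so REPLACE overwrites rather than appends.

// Recompute the summary row for a single vehicle.
function compileStats(PDO $pdo, int $vehicleId): void
{
    $pdo->prepare(
        'REPLACE INTO vehicle_summary (vehicle_id, avg_fuel, total_km)
         SELECT vehicle_id, AVG(fuel), SUM(distance)
           FROM vehicle_stats
          WHERE vehicle_id = :id
          GROUP BY vehicle_id'
    )->execute([':id' => $vehicleId]);
}

// Background script (run from cron): refresh every vehicle one by one.
function compileAll(PDO $pdo): void
{
    foreach ($pdo->query('SELECT id FROM vehicles') as $row) {
        compileStats($pdo, (int) $row['id']);
    }
}
```

The "global" comparison figures can live in the same table as a pseudo-vehicle row, or in a one-row companion table refreshed by the same cron run.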

PHP based games [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I am wondering, when you want to make a PHP-based game that requires the player to wait for something, for example: I pay 100 gold to explore, every 5 minutes I receive loot, and the exploration ends after 30 minutes. I want to know which option is best and why. Here are the options:
Keep a record of the time the exploration command was issued; then every time that specific exploring player opens the page, calculate everything, show the result, and store it in the database.
Make a cron job that, every 5 minutes, calculates the exploration of EVERY player currently exploring and updates the database.
Make a cron job every 30 minutes that calculates and updates everything for EVERY PLAYER, but also allow a SPECIFIC PLAYER to trigger an update just like option 1.
Option 3 is basically a combination of options 1 and 2. Thanks for the help. I am not sure about the performance implications, so I need input from people who already have experience with this.
These are just some personal opinions and might not be the best choice.
2) is more of a general approach for a multiplayer game with player interaction, but it puts constant strain on the server, which seems like overkill, as I seriously doubt your game will have complex interaction between players.
1) is probably the way to go, unless your calculation is very complex and takes a long time. The possible drawback is that you'll probably have trouble handling lots of simultaneous update requests, but from what you describe I don't think that will happen.
3) is hard to comment on, because I have no idea whether your calculation time depends on how much time has passed since the last update. If your calculation is time-independent, then it's a horrible method: you spend time updating data that no one might need AND you are exposed to traffic spikes as well.
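Option 1 boils down to a pure function of the stored start time, which is what makes it cheap: nothing runs until the player actually looks. A sketch, where the tick interval, total duration, and loot-per-tick figure are assumed values:

```php
<?php
// Sketch of option 1: store only when the exploration started, and
// compute the loot lazily whenever the player loads the page.
const TICK_SECONDS  = 300;  // loot granted every 5 minutes
const DURATION      = 1800; // exploration ends after 30 minutes
const LOOT_PER_TICK = 10;   // assumed loot amount per tick

// Loot accrued between the stored start time and "now".
function accruedLoot(int $startedAt, int $now): int
{
    $elapsed = min($now - $startedAt, DURATION); // nothing accrues past the end
    if ($elapsed < 0) {
        return 0;
    }
    return intdiv($elapsed, TICK_SECONDS) * LOOT_PER_TICK;
}
```

On page load you would read the start time from the database, call `accruedLoot()`, credit the difference since the last credit, and store the new state; no cron job ever touches players nobody is looking at.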

PHP Server balance / load (CPU and memory) [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers.
Closed 8 years ago.
A server has a page that calls 10 different PHP files, which in total take 10 ms (1% of CPU) to execute and 1 MB of memory.
If the website begins to get lots of traffic and this individual page (which calls those 10 PHP files and takes 10 ms, 1% of CPU) happens to get 90 hits per second, does the CPU percentage increase, or does it stay balanced at 1%? Does the memory increase as well?
What would the load (CPU and memory) look like at 100 hits? 1,000 hits? 10,000 hits? 100,000 hits?
Keeping with the above specifications.
Also, what if there were another 10 different pages, each calling 5 unique PHP files and 5 of the same PHP files as the page above? What happens to the load at 100, 1,000, 10,000 and 100,000 hits per second? Does it partially increase? Balance out?
There isn't much information online about PHP behavior under heavy load, so I'm asking to get a better understanding, of course. Thanks! :o)
Your question has no precise answer, and I cannot tell you the exact ratio by which the server's resource usage will increase. But keep these two things in mind:
The more users, the more resources are used. It doesn't matter that you are calling the same files; what matters is that you are calling them 90 times per second.
Your system's usage would definitely increase, but one thing would soften it a little, and that is caching. The system will keep these files in its cache (when they are accessed very frequently), which makes the process a bit faster.
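To make the scaling concrete, here is a back-of-envelope sketch using the numbers from the question; the key point is that CPU cost grows linearly with the hit rate rather than staying at a flat 1%:

```php
<?php
// Back-of-envelope sketch for the question's numbers. Each request
// burns a fixed slice of CPU time, so N hits/second need
// N x (CPU time per hit) CPU-seconds per wall-clock second.
$cpuPerHit = 0.010; // 10 ms of CPU per page request, per the question

function coresNeeded(float $hitsPerSecond, float $cpuPerHit): float
{
    // CPU-seconds consumed per second = fraction of one core needed.
    return $hitsPerSecond * $cpuPerHit;
}

$load = coresNeeded(90, $cpuPerHit); // 0.9, i.e. ~90% of one core, not 1%
// At 1,000 hits/s you would need ~10 cores; at 10,000 hits/s, ~100.
// Memory scales with *concurrent* requests, not total hits: with 10 ms
// per request, 90 hits/s keep only about one request (~1 MB) in flight
// on average - until the CPU saturates and requests start queueing.
```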

Is setting set_time_limit(0) a bad idea? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I am making changes to a script a freelancer made, and noticed that at the top of his function he has set the timeout limit to 0 so it won't time out. Are there any implications to doing this? Or should I properly test his script, work out how long it takes, and find out whether it will time out on the server?
Edit - adding more detail.
So it's a script that creates a temp DB table and populates it by pulling data from a number of different tables; it then outputs the result to a CSV using MySQL's OUTFILE.
From the PHP docs:
set_time_limit — Limits the maximum execution time
seconds: The maximum execution time, in seconds. If set to zero, no time limit is
imposed.
http://nl3.php.net/manual/en/function.set-time-limit.php
If you set the time limit to zero, it means the script can run forever without being stopped by the PHP interpreter. Most of the time this is a bad idea: if you have a bug in your code that results in an endless loop, the script will run forever and eat your CPU.
A lot of times I've seen this in code without a good reason, so I assume the developer couldn't think of an appropriate timeout and just set 0.
There are a few valid reasons to set a timeout of 0, for example a daemon script.
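For a one-off export like the one in the question, a middle ground is a generous but finite limit, so a buggy endless loop still gets killed eventually. The 15-minute figure here is only an assumption; measure the real runtime first and add headroom:

```php
<?php
// Sketch of an alternative to set_time_limit(0): a generous but
// finite ceiling. Measure how long the real export (temp table
// build + SELECT ... INTO OUTFILE) actually takes, then pad it.
set_time_limit(15 * 60); // 15 minutes; an assumed, not measured, value

// ... create and populate the temp table, then write the CSV ...
```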
