I have a website that lets each user create a web page (to advertise his product). Once the page is created, it will never be modified again.
Now, my question: is it better to keep the page content (only a few parts are editable) in a MySQL database and generate the page with queries every time it is accessed, or to create a static web page containing all the info and store it on the server?
If I store every page on disk, I may end up with around 200.000 files.
If I store each page in a MySQL database, I would have to run a query each time the page is requested, and with around 200.000 entries and 5-6 queries/second I think the website will be slow...
So what's better?
MySQL will be able to handle the load if you create the tables properly (normalized and indexed). But if the content of the page doesn't change after creation, it's better if you cache the page statically. You can organize the files into buckets (folders) so that one folder doesn't have too many files in it.
Remember to cache only the content areas and not the templates, unless each user has complete control over how his/her page shows up.
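The bucketing idea above could look roughly like the following. This is a minimal sketch in Python (the same logic ports to PHP); the function name `bucketed_path`, the root directory, and the choice of a SHA-1 hash prefix are all assumptions made for illustration, not something from the original answer.

```python
import hashlib
from pathlib import Path


def bucketed_path(root: str, page_id: str, levels: int = 2, width: int = 2) -> Path:
    """Map a page id to root/<aa>/<bb>/<page_id>.html using a hash prefix.

    Hypothetical helper: the hash only spreads files evenly across folders.
    """
    digest = hashlib.sha1(page_id.encode("utf-8")).hexdigest()
    # Take `levels` chunks of `width` hex characters as nested folder names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root, *parts, f"{page_id}.html")


if __name__ == "__main__":
    # Example root directory; adjust to wherever the static pages live.
    print(bucketed_path("/var/www/pages", "user-12345"))
```

With the defaults there are 256 × 256 = 65,536 leaf folders, so 200.000 pages average about 3 files per folder; with `levels=1` there are 256 folders of roughly 780 files each.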
200.000 files writable by the Apache process is not a good idea.
I recommend using a database.
Database imports/exports are easier, not to mention the difference in maintenance costs.
Databases use caching, and if nothing has changed, they will return the last result without running the query again. (Edit: this doesn't hold; thanks, JohnP.)
If you want to redesign your web pages later, you should store the pages in MySQL, because once they are static you can't really change them (unless you dig into regexp).
About the time issue: it's not an issue if you set up the indexes correctly.
If the data is small to moderate, prefer static hardcoding, i.e. putting the data in the HTML. But if it is huge, computed, or dynamic and changing, you have no option but to connect to the database.
I believe that a proper caching technique with suitable attributes (a long expiry time) would be better than either static pages or retrieving everything from MySQL every time.
Static content is usually a good thing if you have a lot of traffic, but 5-6 queries a second is not hard for the database at all, so with your current load it doesn't matter.
You can spread the static files to different directories by file name and set up rewrite rules in your web server (mod_rewrite on Apache, basic location matching with regexp on Nginx and similar on other web servers). That way you won't even have to invoke the PHP interpreter.
A database and proper caching. 200.000 pages times, what? 5KB? That's 1 GB. Easy to keep in RAM. Besides 5/6 queries per second is easy on a database. Program first, then benchmark.
// insert quip about premature optimisation
Related
I was hoping to get your input on a CMS that I am creating. As it is currently set up, on a visitor's first page load the system queries the "site" table and pulls down any site-wide data (i.e. site ID, site name, site-wide hooks, etc.). This information is stored in the PHP session, and that table does not get queried again for the remainder of the user's visit.
Does this sound acceptable? I like the idea of saving an unnecessary DB query on every page load; however, if the site has a large number of hooks, this session variable could get large (unlikely but possible).
For extra information, the system currently runs a config class that could store some site data (thus preventing even the first DB query). However, I want the plugin system to be able to easily integrate hooks into this CMS, so I decided the DB route was the way to go.
I would appreciate your input. Thanks
There's no need to overcomplicate things; K.I.S.S. will serve you well here. Start optimizing when you actually need it. You should also remember that the database will most likely cache the query and its result if it's run multiple times, so there's no guarantee that you will save any time at all.
Well, this is kind of a question about how to design a website that uses fewer resources than normal websites, and is mobile optimized as well.
Here it goes: I want to display an overview of e.g. 5 posts (from e.g. a blog). Then, if I click for example on the first post, I'd load this post in a new window. But instead of connecting to the database again and fetching this specific post by its id, I'd just look up that post (in PHP) in the array of 5 posts that I created earlier, when I fetched the website for the first time.
Would it reduce the amount of data to download? PHP works server-side as well, so that's why I'm not sure.
Ok, I'll explain again:
Method 1:
User connects to my website
5 posts are displayed and saved to an array (with all their data)
User clicks on the first Post and expects more Information about this post.
My program looks up the post in my array and displays it.
Method 2:
User connects to my website
5 posts are displayed
User clicks on the first Post and expects more Information about this post.
My program connects to MySQL again and fetches the post from the server.
First off, this sounds like a case of premature optimization. I would not start caching anything outside of the database until measurements prove that it's a wise thing to do. Caching takes your focus away from the core task at hand, and introduces complexity.
If you do want to keep DB results in memory, just using an array allocated in a PHP-processed HTTP request will not be sufficient. Once the page is processed, memory allocated at that scope is no longer available.
You could certainly put the results in SESSION scope. The advantage of saving some DB results in the SESSION is that you avoid DB round trips. Disadvantages include the increased complexity of programming the solution, use of memory in the web server for data that may never be accessed, and increased initial load on the DB to retrieve the extra pages that may or may not ever be requested by the user.
If DB performance, after measurement, really is causing you to miss your performance objectives you can use a well-proven caching system such as memcached to keep frequently accessed data in the web server's (or dedicated cache server's) memory.
Final note: You say
PHP works server-side as well
That's not accurate. PHP works server-side only.
Have you thought about putting the posts in divs and only making them visible when the user clicks somewhere? Here is how to do that.
Put some sort of cache between your code and the database.
So your code will look like
if (isPostInCache()) {
    loadPostFromCache();
} else {
    loadPostFromDatabase();
}
Go for some caching system; the web is full of them. You can use memcached or a static cache you build yourself (e.g. saving posts in txt files on the server).
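A slightly fuller version of that check-then-load pattern, as a hedged sketch in Python rather than PHP. The class `PostCache` and the loader `fake_database_lookup` are hypothetical names. The plain dict only stands in for a store that outlives a single request (memcached, files on disk); in PHP an ordinary array would be gone at the end of each request.

```python
from typing import Callable, Dict


class PostCache:
    """Cache-aside wrapper: look in the cache first, fall back to the database."""

    def __init__(self, load_from_database: Callable[[int], str]) -> None:
        self._load = load_from_database
        self._store: Dict[int, str] = {}  # stand-in for memcached or a file cache
        self.misses = 0

    def get(self, post_id: int) -> str:
        if post_id in self._store:        # cache hit: no database work
            return self._store[post_id]
        self.misses += 1                  # cache miss: go to the database once
        post = self._load(post_id)
        self._store[post_id] = post       # remember it for the next lookup
        return post


def fake_database_lookup(post_id: int) -> str:
    # Hypothetical stand-in for a real query such as SELECT ... WHERE id = ?
    return f"post body {post_id}"


if __name__ == "__main__":
    cache = PostCache(fake_database_lookup)
    cache.get(1)
    cache.get(1)
    print(cache.misses)
```

The second `get(1)` is served from the cache, so the loader runs only once.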
To me, this is a little more inefficient than making a 2nd call to the database and here is why.
The first query should only pull the fields you want, like title, author, and date. The content of the post may be a heavy query, so I'd exclude that (you can pull a teaser if you'd like).
Then, if the user wants the details of the post, I would query for the content using an indexed key column.
That way you're not pulling content for 5 posts that may never be seen.
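The two-step approach described above can be sketched like this. It uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in for MySQL; the `posts` table, its columns, and the helper names are assumptions for illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory table with seven hypothetical posts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, "
        "author TEXT, created TEXT, content TEXT)"
    )
    rows = [
        (i, f"Title {i}", "alice", f"2024-01-{i:02d}", f"Full body of post {i}")
        for i in range(1, 8)
    ]
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def list_posts(conn: sqlite3.Connection, limit: int = 5) -> list:
    # Overview query: light columns only, no post content.
    return conn.execute(
        "SELECT id, title, author, created FROM posts "
        "ORDER BY created DESC LIMIT ?",
        (limit,),
    ).fetchall()


def get_post_content(conn: sqlite3.Connection, post_id: int):
    # Detail query: one row, looked up by the indexed primary key.
    row = conn.execute(
        "SELECT content FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = make_demo_db()
    print(list_posts(db))
    print(get_post_content(db, 7))
```

The overview never touches the `content` column, and the detail lookup hits a single row through the primary key.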
If your PHP code is constantly re-connecting to the database you've configured it wrong and aren't using connection pooling properly. The execution time of a query should be a few milliseconds at most if you've got your stack properly tuned. Do not cache unless you absolutely have to.
What you're advocating here is side-stepping a serious problem. Database queries should be effortless provided your database is properly configured. Fix that issue and you won't need to go down the caching road.
Saving data from one request to the other is a broken design and if not done perfectly could lead to embarrassing data bleed situations where one user is seeing content intended for another. This is why caching is an option usually pursued after all other avenues have been exhausted.
I'm working on a blog-based website with more than 50k posts. I need suggestions to increase the website's speed.
I have two options:
1: I can fetch the post data from the MySQL database and display it using PHP
2: A static web page for each post (using a DOM parser I can update the post contents)
Which one is faster, the database or the file system? Or are there any other suggestions to speed up my website? I'm using GoDaddy shared hosting.
I would suggest:
adding pagination to the site.
fetching only what you need from the database.
running some load tests to find out where your site needs improving.
Sorry, looked up godaddy and they do not allow memcached :(
Use database and implement memcached to cache recently shown pages.
Even with 50 K posts I imagine that most fetches are for a small subset of posts for a specific time period, usually recent posts.
If this is the case a memcache solution would beat any disk based storage.
Automatically generating static pages for frequently retrieved posts is another way.
But base storage in a database is the easiest.
You can't get reliable performance on shared hosting, so just go with what's easiest to work with. Today you may get fast access to the file system, but tomorrow they relocate your app to another silo and the database becomes faster. It's a lot easier to extend a database to add new features, so I'd go with that.
But if you really care about performance you have to make tests to measure it.
You can use page caching for the whole page,
or query caching for the results of database queries.
Using the file system will only give you trouble with updates, deletes, inserts, etc.
I am designing a web application which will be doing three things:
1) store some data
2) let users view this data
3) from time to time add/remove/change some data
It looks pretty simple, but I would like to minimise server resource usage by avoiding MySQL and PHP. My main goal is to deliver an HTML file to the user: posts1.html (posts2.html, posts3.html, ..., where 1, 2, 3 are the page numbers of the data).
Normally I would create a posts.php file that queries the database, but my data changes only three to five times a day, so that would be a huge waste.
Instead, I thought about caching the data, which would save a lot of server resources, but that would still involve some PHP code.
Another idea is to write a script that regenerates all the HTML files after every change in the database and then replaces the old ones. But what if someone requests a page while it is being replaced? That may cause errors, the user may get an incomplete file, etc.
However, there is one solution - I could store created HTML files in two directories (A and B) and using .htaccess do something like this (pseudocode):
if ( (HOURS)%2 == 0 )
/postsX.html -> /A/postsX.html
else
/postsX.html -> /B/postsX.html
It would give me enough time to upgrade all files.
I would love to hear what you think about it and what you would do.
If you don't want to use a full-blown MySQL server, use SQLite. It's part of PHP and very lightweight. Then add caching where appropriate. Your other approaches sound like a waste of time to me: too much effort for too little gain. SQLite and caching are tried and tested.
Besides, you should not worry about wasting resources unless you are running short on them. Your application doesn't sound like it needs scaling at this point. So build the simplest thing that will work.
If you have to have the static pages approach, then serve the files through a symlinked folder. Create a script that generates the static pages into a new folder (either via cron or a manual trigger) and then switches the symlink from the old folder to the new folder. This way you don't have to worry about people hitting your site while it's generating content.
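The symlink switch could be sketched as follows, in Python and assuming a POSIX file system where renaming over an existing symlink is atomic. The folder names, `generate_pages`, and `publish` are hypothetical; the web server's document root would point at the `current` link.

```python
import os
import tempfile
from pathlib import Path


def generate_pages(target: Path, version: str) -> None:
    """Stand-in for the real generator: writes one page into a fresh folder."""
    target.mkdir(parents=True)
    (target / "posts1.html").write_text(f"<p>{version}</p>")


def publish(build_dir: Path, live_link: Path) -> None:
    """Point live_link at build_dir without a window where it is missing."""
    tmp_link = live_link.with_name(live_link.name + ".tmp")
    if tmp_link.is_symlink():
        tmp_link.unlink()               # clean up after an interrupted run
    os.symlink(build_dir, tmp_link, target_is_directory=True)
    os.replace(tmp_link, live_link)     # rename over the old link in one step


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    live = root / "current"
    for version in ("v1", "v2"):
        build = root / f"release-{version}"
        generate_pages(build, version)
        publish(build, live)
    print((live / "posts1.html").read_text())
```

Creating the new link under a temporary name and renaming it over the old one avoids the brief gap that deleting and recreating the link would leave. Old release folders can be removed once no request is still reading from them.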
You should use SQLite with ADODB, or any other supported database, and implement caching. See the ADODB compatibility list: http://phplens.com/lens/adodb/docs-adodb.htm#drivers. The caching feature is really powerful, and ADODB is well known and well documented.
First of all, the website I run is hosted and I don't have access to be able to install anything interesting like memcached.
I have several web pages displaying HTML tables. The data for these HTML tables are generated using expensive and complex MySQL queries. I've optimized the queries as far as I can, and put indexes in place to improve performance. The problem is if I have high traffic to my site the MySQL server gets hammered, and struggles.
Interestingly - the data within the MySQL tables doesn't change very often. In fact it changes only after a certain 'event' that takes place every few weeks.
So what I have done now is this:
Save the HTML table to a file once it is generated
When the URL is accessed, check whether the saved file exists
If the file is missing or older than 1hr, run the query and save a new file; otherwise output the saved file
This ensures that for the vast majority of requests the page loads very fast, and the data can at most be 1hr old. For my purpose this isn't too bad.
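For reference, the age-based file cache described above can be sketched like this. It is written in Python rather than PHP; `cached_html`, the `render` callback, and the file name are assumed names, and the temporary-file swap is an addition of this sketch, not something the post describes.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Callable

MAX_AGE_SECONDS = 3600  # one hour, as in the description above


def cached_html(cache_file: Path, render: Callable[[], str],
                max_age: float = MAX_AGE_SECONDS) -> str:
    """Return the cached HTML if it is fresh enough, else regenerate it."""
    try:
        age = time.time() - cache_file.stat().st_mtime
        if age < max_age:
            return cache_file.read_text(encoding="utf-8")
    except FileNotFoundError:
        pass  # no cache file yet: fall through and build it
    html = render()  # the expensive query and table rendering happen here
    tmp = cache_file.with_name(cache_file.name + ".tmp")
    tmp.write_text(html, encoding="utf-8")
    os.replace(tmp, cache_file)  # swap in one step so readers never see a partial file
    return html


if __name__ == "__main__":
    cache_path = Path(tempfile.mkdtemp()) / "table.html"
    print(cached_html(cache_path, lambda: "<table>...</table>"))
```

Writing to a temporary file and renaming it keeps a concurrent request from reading a half-written cache file.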
What I would really like is to guarantee that if any data changes in the database, the cache file is deleted. This could be done by finding all scripts that do any change queries on the table and adding code to remove the cache file, but it's flimsy as all future changes need to also take care of this mechanism.
Is there an elegant way to do this?
I don't have anything but vanilla PHP and MySQL (recent versions) - I'd like to play with memcached, but I can't.
Ok - serious answer.
If you have any sort of database abstraction layer (hopefully you will), you could maintain a field in the database for the last time anything was updated, and manage that from a single point in your abstraction layer.
e.g. (pseudocode): On any update set last_updated.value = Time.now()
Then compare this to the time of the cached file at runtime to see if you need to re-query.
If you don't have an abstraction layer, create a wrapper function to any SQL update call that does this, and always use the wrapper function for any future functionality.
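A minimal sketch of that wrapper idea, in Python with `sqlite3` standing in for MySQL. The `cache_meta` table, the `'results'` key, and the function names are assumptions for illustration; the `now` parameter exists only to make the example deterministic.

```python
import sqlite3
import time
from typing import Optional


def init_meta(conn: sqlite3.Connection) -> None:
    # One-row-per-cache bookkeeping table (hypothetical name).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache_meta "
        "(name TEXT PRIMARY KEY, last_updated REAL)"
    )


def execute_update(conn: sqlite3.Connection, sql: str, params: tuple = (),
                   now: Optional[float] = None) -> None:
    """Single choke point for writes: run the statement, then stamp the time."""
    stamp = time.time() if now is None else now
    with conn:  # both statements commit together or not at all
        conn.execute(sql, params)
        conn.execute(
            "INSERT OR REPLACE INTO cache_meta (name, last_updated) "
            "VALUES ('results', ?)",
            (stamp,),
        )


def cache_is_fresh(conn: sqlite3.Connection, cache_mtime: float) -> bool:
    """True if the cache file was written after the last recorded update."""
    row = conn.execute(
        "SELECT last_updated FROM cache_meta WHERE name = 'results'"
    ).fetchone()
    return row is None or cache_mtime > row[0]


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, score INTEGER)")
    init_meta(db)
    execute_update(db, "INSERT INTO results (score) VALUES (?)", (42,))
    print(cache_is_fresh(db, cache_mtime=0.0))
```

At request time, the cache file's modification time is passed as `cache_mtime`; if the check returns False, the page is regenerated. This only works if every write goes through `execute_update`.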
There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton
Sorry, doesn't help much, but it is sooooo true.
You have most of the ends covered, but a last_modified field and cron job might help.
There's no way of deleting files from within MySQL. Postgres would give you that facility, but MySQL can't.
You can cache your output to a string using PHP's output buffering functions. Google it and you'll find a nice collection of websites explaining how this is done.
I'm wondering, however: how do you know that the data expires after an hour? Or are you assuming the data won't change dramatically enough in 60 minutes to warrant constant page generation?