Ok, this may be a dumb question but here goes. I noticed something the other day when I was playing around with different HTML-to-PDF converters in PHP. One I tried (dompdf) took forever to run on my HTML. Eventually it ran out of memory and died, but while it was still running, none of my other PHP scripts were responsive at all. It was almost as if that one request was blocking the entire web server.
Now I'm assuming either that can't be right or I should be setting something somewhere to control that behaviour. Can someone please clue me in?
Did you have open sessions for each of the scripts? :) They might reuse the same session, and that blocks until the session is freed by the last request... so they basically wait for each other to complete (in your case, for the long-running PDF generator). This only applies if you use the same browser.
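If that's what's happening, a common workaround is to release the session lock before starting the slow work. A minimal sketch, assuming the default file-based session handler:

    <?php
    session_start();

    // Read whatever session data the script needs up front.
    $userId = isset($_SESSION['user_id']) ? $_SESSION['user_id'] : null;

    // Release the session lock so other requests from the same browser
    // aren't blocked while the slow work runs.
    session_write_close();

    // Long-running work (e.g. the dompdf conversion) goes here.
    // $_SESSION is still readable after this point, but writes won't be
    // persisted unless the session is reopened.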
Tip: not sure why you want HTML to PDF, but you may want to take a look at FOP (http://xmlgraphics.apache.org/fop/) to generate PDFs. I'm using it, and it works great... and fast :) It does have its quirks, though.
It could be that all the scripts you tried are running in the same application pool. (At least, that's what it's called in IIS.)
However, another explanation is that some browsers will queue requests over a single connection. This has caused me some confusion in the past. If your web browser is waiting for a response from yourdomain.com/script1.php and you open another window or tab to yourdomain.com/script2.php, that request won't be sent until the first request receives a reply, making it seem like your entire web server is hanging. An easy way to test whether this is what's going on is to try the two requests in two separate browsers.
It sounds like the server is simply being overwhelmed and under too much load to complete the requests. Converting an HTML file to a PDF is a pretty complex process, as the PHP script has to effectively provide the same functionality as a web browser and render the HTML using PDF drawing functions.
I would suggest you either split the HTML into separate, smaller files or run the script as a scheduled task directly through PHP, independent of the web server.
I am writing an in-browser IM client in JavaScript, for the sake of practicing and learning JavaScript and AJAX.
I need to be able to check for a change in the file size of a text file that is being used as temporary storage for 40-80 SQL entries containing messages, so that the display can be updated.
At the moment I am using a setInterval function to periodically check for a change in file size using a short PHP script, but this can cause issues: if the interval is too long, messages are delayed; if it is shorter, it means a lot of PHP scripts running in quick succession, which takes up server resources.
What is the best way to do this if the main concern is to reduce server resource usage?
(I am running my server off a rather low-spec PC I've scraped together (2 GB RAM, 2.8 GHz AMD Sempron processor).)
Preferably, I would want to do this using an AJAX event triggered by someone sending a message, i.e. when user B presses Enter and triggers the event that edits the file, that in turn triggers a function on user A's side that updates the HTML.
Any ideas? I am open to any solution to this particular problem. I gave specific examples of what I want to happen in the specific languages in order to give a better idea of what it is I am attempting to do.
If there is a way to do this that isn't JavaScript/PHP, I'd also be open to exploring that as an option.
Doing this with PHP can be a bit cumbersome. You could try something like long polling, where you keep the HTTP request open until the server has new data to send to the user. If messages are sent frequently, this might not be ideal. You might want to consider using event-driven web technologies like node.js with something like Socket.IO.
In any case, you'll likely want to maintain a connection with the server if you want to get the message in near real-time. There are ways to use WebSockets with PHP as well, but PHP isn't really the best for this because it's not designed to keep scripts running for long periods (also see What exactly entails setting up a PHP Websocket Server?).
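For the file-based setup described in the question, the server side of a long poll could watch the file directly. A rough sketch (messages.txt, the query parameter, and the 25-second budget are all placeholder choices):

    <?php
    // long-poll.php - blocks until the message file changes size, then returns.
    set_time_limit(0);
    $file = 'messages.txt';   // hypothetical message store
    $lastSize = isset($_GET['size']) ? (int)$_GET['size'] : 0;

    $deadline = time() + 25;  // stay under typical request timeouts
    while (time() < $deadline) {
        clearstatcache();     // don't trust PHP's cached stat results
        $size = filesize($file);
        if ($size !== $lastSize) {
            echo $size;       // client sees a change and re-fetches messages
            exit;
        }
        usleep(250000);       // wait a quarter of a second between checks
    }
    echo $lastSize;           // timed out with no change

The client re-issues the request as soon as one returns, passing back the last size it saw, so messages arrive almost immediately while the request rate stays low.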
Browsers & HTTP/ AJAX generally work by a "pull" model. The browser/ or AJAX sends the server a request, then the server answers a response.
There isn't generally much provision for the server to contact the browser, to "push" an event. This can however be simulated by a long-running request, to which the server writes data when the event/ or events occur.
For example, this could be a request that answers "empty" after a timeout of 10-30 seconds.. or the server returns & answers immediately, if there are event(s) in its queue.
With a Java server this is easy to do, and I've used this successfully for event notification in a major integration project a few years back.
However, I'm not sure how much ability there is in PHP (probably very near zero) to maintain overall server state, coordinate or communicate between threads/requests, or maintain event queues.
You could look into something like a Java webapp running on Tomcat. All you need is a basic web.xml and one Servlet class, and you can build just about anything from there.
I've been working with sockets, generally in PHP, for a while. Currently I have a PHP client for connecting to a chat server, which outputs every piece of data sent from the server it's connected to.
To explain that in more detail: I accomplished this using PHP's flush() function to write out each waiting buffer in a loop. The buffer reader sits inside a while loop whose condition is the status of the connection socket. But this matters less.
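Roughly, the relevant part looks like this (a simplified sketch; the host, port, and buffer size are placeholders):

    <?php
    $socket = fsockopen('chat.example.com', 6667, $errno, $errstr, 30);
    if (!$socket) {
        die("Connection failed: $errstr ($errno)");
    }

    // Stream each line from the chat server straight to the browser.
    while (!feof($socket)) {
        $line = fgets($socket, 4096);
        echo htmlspecialchars($line) . "<br />\n";
        if (ob_get_level() > 0) {
            ob_flush();   // flush PHP's output buffer if one is active
        }
        flush();          // push the output to the client immediately
    }
    fclose($socket);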
Now, to what I want to accomplish: I want to keep the socket handling on the server side, and have the data from the server output to the client via AJAX/jQuery. So far, my research has always pointed me to HTML5 WebSockets and node.js; however, I "have to" be really picky about this, because the minimum I must support is:
WinXP IE6 users (which already rules out jQuery, even)
Users without Java/Flash installed
So I have to think of the possibilities here, which is why I can't use a Flash/Java backend or a new technology like WebSockets, and I don't want to handle server stuff in the client either. I really hate being stuck with old technology, but for this it's a must.
As I was searching around, I found this question, which is quite similar to my needs:
Is PHP socket a viable option for making PHP jQuery based chat?
And to quickly review the answers: they all point in one direction, PHP multi-processing and memory consumption. I know this is a minus, but it's the best I can accept for now. Still, there will be timeout disconnects for inactive connections after a certain delay, with extension of the delay if wanted. So I'm not too keen on this one.
Secondly, the last answer points to an "Ajax Chat Application Tutorial". I gave it an overall review, but whoa: writing each line into an HTML file and re-including it every time? That is something I could do without using an extra file, but is it really necessary? Plus re-reading the file on the server side and re-importing the whole thing into the document every single time; isn't that just worse for both sides?
Either way, that's about it. I wasn't able to come to a conclusion for a while, and so here I am again. (:P) Waiting for your answers/suggestions/ideas; thanks in advance.
Regards.
There is server software available that specializes in such matters; it's called a push server/service. One example is APE (http://www.ape-project.org/); according to their website, it's compatible with all web browsers, and they even have a demo chat there. I'd suggest you go for that solution.
I've got a registration list, and I need to send out a PDF to each person on it. Each email needs to contain a PDF that has a base version on the server, with each person's name, company, etc. personalized over the top. This needs to be emailed to each person, which at the moment adds up to 2,500 people, but can easily be much higher in the future.
I've only just started working on this project, but the problem I've encountered continuously since last week is that the server doesn't seem to be able to handle it. Currently the script uses Zend, which allows it to use Zend_Pdf and Zend_Mail to create and email the PDFs. Zend_Mail connects to an SMTP server from smtp.com to do the actual emailing.
Since we have quite a few sites running on the server, we can't afford for it to go down, and when I run the job in batches it can start to go down. The best solution I have so far is running curl from my local machine against the script, which then processes one person. The curl script then calls it again, over and over, in batches. Even this runs into problems at times, and it somehow seems to hog memory even after it should be complete (I'm really not sure how).
So what I'm looking for is information on doing this: libraries, code, server setups, anything that can make this much less painful and much quicker for us to run. I've run out of ideas, and this is something I've not really had to do before (especially at a bulk level).
Thank you.
Edit:
I also forgot to mention that it uses Zend_Barcode::factory to create a barcode on the PDF.
The first step I suggest is to work out where the problem lies, if you can. Is it the PDF generation? Is it the emailing? "The server doesn't seem to be able to handle this" doesn't say what is actually failing, nor does "the server goes down"; you need to determine whether you are running out of memory, disk space, time, or something else. That will help you decide whether you need a tweak or a new approach to your generation. Since you said that even single manual invocations can fail, you should be able to narrow the problem down to exactly what causes the failure.
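A quick way to narrow it down is to instrument a single run and log how long it takes and how much memory it peaks at. A minimal sketch:

    <?php
    $start = microtime(true);

    // ... generate one PDF and send one email here ...

    error_log(sprintf(
        'one document: %.1fs elapsed, %.1f MB peak memory',
        microtime(true) - $start,
        memory_get_peak_usage(true) / 1048576
    ));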
If you are running near some resource limit (which might be the case with several sites running), you probably need to offload this capability onto another machine. Your options include:
run the same setup on a new host and adjust your applications to use the new system
run a new setup on a new host
use an external system (such as the mentioned PDFCrowd or Docmosis)
Start with the specifics of the problem. I hope that helps. Please note I work for the company that created Docmosis.
Here are some ideas:
Is there a particular reason this has to run on a web server? Why not run the framework from a different machine, but with the same settings? You might have to create a different controller to handle the command-line version of the request, but there's no fundamental reason it can't work.
If creating PDFs programmatically is giving you a headache, you can use a service instead. In the past, I've used PDFCrowd with good results, and they provide a useful PHP library. You can give them a blob of HTML, using full URLs for any stylesheets and images, and they'll create a PDF for you. The cost per document varies from 0.5 to 4.5 cents depending on your rate plan. There are other services that do the same thing.
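As a rough idea of how such a service is used (this sketch assumes the PDFCrowd PHP client as it existed at the time; check their documentation for current class and method names):

    <?php
    require 'pdfcrowd.php';                        // the vendor's client library

    $client = new Pdfcrowd('username', 'apikey');  // your account credentials
    $html   = '<html><body><h1>Hello, PDF</h1></body></html>';

    // Convert an HTML string to a PDF; convertURI() takes a URL instead.
    $pdf = $client->convertHtml($html);

    file_put_contents('output.pdf', $pdf);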
If this kind of batch job is a big deal for your company, you might consider an asynchronous job queue like beanstalk. You could queue up thousands of these, and a worker script could handle the requests at whatever pace you deem reasonable.
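For instance, with the Pheanstalk client library (the fluent calls below follow its older API and may differ between versions):

    <?php
    require_once 'pheanstalk_init.php';
    $pheanstalk = new Pheanstalk_Pheanstalk('127.0.0.1');

    // Producer: queue one job per recipient (the tube name is arbitrary).
    $pheanstalk->useTube('pdf-emails')->put(json_encode(array(
        'email' => 'person@example.com',
        'name'  => 'Jane Doe',
    )));

    // Worker (a separate long-running CLI script): take jobs one at a time.
    $job  = $pheanstalk->watch('pdf-emails')->ignore('default')->reserve();
    $data = json_decode($job->getData(), true);
    // ... build the personalised PDF and email it here ...
    $pheanstalk->delete($job);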
From my experience, there are two options:
Dynamically generate PDFs using one or more PDF libraries (which can be awfully slow).
OR
Use something like wkhtmltopdf, a simple shell utility that converts HTML to PDF using the WebKit rendering engine and Qt.
Basically, you can loop over n HTML pages and generate PDFs without the overhead of purely dynamic PDF generation!
We've used this to distribute thousands of personalised PDFs on a daily basis, as it quickly converts HTML pages to PDF. There are dependencies, but it works and is less computationally intensive than 'creating' PDFs individually.
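Calling it from PHP amounts to shelling out once per document, along these lines (the file paths are placeholders, and escaping matters if any part of the command comes from user input):

    <?php
    $input  = '/tmp/invoice-123.html';   // pre-rendered, personalised HTML
    $output = '/tmp/invoice-123.pdf';

    $cmd = sprintf('wkhtmltopdf %s %s 2>&1',
        escapeshellarg($input),
        escapeshellarg($output)
    );
    exec($cmd, $lines, $status);
    if ($status !== 0) {
        error_log("wkhtmltopdf failed: " . implode("\n", $lines));
    }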
Hope this helps.
If you are calling the script over HTTP, it will time out based on the max_execution_time specified in php.ini.
You need to write a PHP script that can be run from the command line and then schedule it via a cron job. The script can read one user at a time, put together their PDF file, and email them. After that, you might have to run some performance checks to see if the server can handle the process.
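A rough sketch of that shape (the database table and column names are invented for illustration):

    <?php
    // send-one.php - run from the command line, e.g. every minute via cron:
    // * * * * * /usr/bin/php /path/to/send-one.php
    set_time_limit(0);   // CLI runs aren't bound by max_execution_time anyway

    $db  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $row = $db->query('SELECT id, email, name FROM recipients
                       WHERE sent = 0 LIMIT 1')->fetch(PDO::FETCH_ASSOC);
    if (!$row) {
        exit;            // nothing left to send
    }

    // ... build the personalised PDF and email it to $row['email'] ...

    $db->exec('UPDATE recipients SET sent = 1 WHERE id = ' . (int)$row['id']);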
I have a series of XML files which can be retrieved, edited, and saved by a user. My intention is to allow multiple users to edit these files at the same time. Many parts of these XML files relate to content displayed in the browser UI; for example, a <name>My title</name> node is displayed and can be edited.
The technologies I'm using are Javascript, PHP, and a master XML file containing references to other XML files (both master and referenced files can be edited in the UI). The server is WebDAV enabled, and WebDAV methods are used via YUI3's io module to handle retrieval, saving, collection moving etc.
How do I go about updating UIs where these resources are being used, based on the contents of the edited and saved XML file(s)?
I know I could probably run setTimeouts and whatnot to check for updates, but it seems more intuitive to make the UI respond only when data is changed.
cheers!
The feature you're describing is similar to a technique known as server-push. What you're asking to do is a very tricky thing for a web app (especially for PHP, which is built around the idea of a request that gets served and the script terminating).
HTML5 is introducing technologies such as WebSockets for maintaining a persistent connection to a server. You could look into WebSockets as a solution, but it's a brand-new technology, and I don't think the spec is even finalized yet, so it will only be implemented in the very latest versions of browsers, if at all.
You've already mentioned AJAX polling (driven by setInterval), but you've also noticed that it's problematic. You're right, of course: it is. Local data can become stale in the interval between polls, and you'll generate a lot of traffic between the server and any open clients.
An alternative is so-called "long polling". The idea is that the client starts an AJAX session with the server. On the server, the script invoked by the client basically just sits there and waits for something to change. When it does, the server notifies the client by sending a JSON/XML/whatever response and closing the AJAX session. When the client receives the response, it processes it and initiates a new AJAX connection to wait for another server response.
This approach is almost instantaneous, because data gets pushed to the client as soon as it's available. However, it also means lots of open connections to the server and this can put the server under a lot of load. Also, PHP scripts aren't really meant to run or sleep for a long time due to the request-response model the language is built around. It is possible, but probably not advisable to follow this approach.
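Applied to your setup, the PHP side of such a long poll could watch the saved XML file's modification time. This is a sketch only; master.xml, the since parameter, and the timings are placeholders:

    <?php
    // poll.php - answers as soon as the file changes, or after a timeout.
    $file  = 'master.xml';   // the watched resource
    $since = isset($_GET['since']) ? (int)$_GET['since'] : 0;
    header('Content-Type: application/json');

    for ($i = 0; $i < 50; $i++) {               // roughly 25 seconds maximum
        clearstatcache();                       // bypass PHP's stat cache
        $mtime = filemtime($file);
        if ($mtime > $since) {
            echo json_encode(array('mtime' => $mtime));
            exit;                               // client refetches the XML
        }
        usleep(500000);                         // wait half a second
    }
    echo json_encode(array('mtime' => $since)); // no change; client polls again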
How do I implement basic "Long Polling"? has some examples of the long-polling technique.
Good luck!
My question is whether or not using multiple PHP include() calls is a bad idea. The only reason I'm asking is because I always hear that having too many stylesheets or scripts on a site creates more HTTP requests and slows page loading. I was wondering whether the same holds for PHP.
The detailed answer:
Every CSS or JS file referenced in a web page is fetched over the network by the browser, which often involves hundreds of milliseconds or more of network latency. Requests to the same server are (by convention, though not mandated) serialized one or two at a time, so these delays stack up.
PHP include files, on the other hand, are all processed on the server itself. Instead of hundreds of milliseconds, local disk access takes tens of milliseconds or less, and if the file is cached, it becomes a direct memory access, which is even faster.
If you use something like eAccelerator (http://eaccelerator.net/) or APC (http://php.net/manual/en/book.apc.php), then your PHP code will all be precompiled on the server once, and it doesn't even matter whether you're including files or dumping them all in one place.
The short answer:
Don't worry about it. 99 times out of 100 with this issue, the benefits of better code organization outweigh the performance increases.
The use of includes helps with code organization, and is no hindrance in itself. If you're loading up a bunch of things you don't need, that will slow things down, but that's another problem. Clarification: as you include pages, be aware of what you're adding to the load; don't carelessly include unneeded resources.
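For instance, a typical page might pull in shared pieces like this (the file names are just illustrative); each require is resolved on the server, not as an extra HTTP round trip from the browser:

    <?php
    require_once 'config.php';        // constants, DB credentials
    require_once 'db.php';            // database connection helper
    require_once 'helpers/html.php';  // escaping / templating helpers

    // Page-specific logic continues here...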
As already said, the use of multiple PHP includes helps keep your code organized, so it is not a bad idea. It becomes a problem when too many includes are used, because the web server has to perform an I/O operation for each include you have.
If you have a large web application, you can boost it with a PHP accelerator, which caches data and compiled code from the PHP bytecode compiler in shared memory. If you have lots of PHP includes in a specific file, they will be compiled just once; further calls to that file will hit the cache, so the PHP sources won't need to be recompiled.
It really depends on what you want to do. If you have a piece of code that is used all the time, it is really convenient to include it instead of copying and pasting it everywhere, and that will make your code clearer, not slower. But if you include all the functions or classes you have written without using them, of course that's not good practice. I would suggest using a framework (like CodeIgniter, or something else you find convenient), because it really helps sort these things out. Good luck!
The only reason I'm asking is because I always hear having too many stylesheets or scripts on a site creates more HTTP requests and slows page loading. I was wondering the same about PHP.
Note that an HTTP request is several orders of magnitude slower than a PHP include.
Several HTTP requests -> the client has to request and receive several files over the wire.
Several PHP includes -> the client has to request and receive only one file.
The includes obviously carry a server-side cost, but as for your question: don't sweat it. Only on really large-scale PHP projects will you face such problems.