Is it possible to keep a SQL connection/session "open" between PHP program iterations, so the program doesn't have to keep re-logging in?
I've written a PHP program that continually (and legally/respectfully) polls the web for statistical weather data and dumps it into a local MySQL database for analysis. Rather than having to view the data through the local database browser, I wanted to have it available as a webpage hosted by an external web host.
Not sure of the best way to approach this, I exported the local MySQL database up to my web host's server. Because the PHP program needs to loop continually (longer than the default runtime, with the HTML page also continually refreshing), I figured it would be best to keep the "engine" on my local computer, where I can leave the page looping in a browser, and have it connect to the database on my web server and dump the data there.
It worked for a few hours. But then, as I feared might happen, I lost access to my cPanel login/host. I've since confirmed through my own testing that my IP has been blocked (the hosting company is currently closed), no doubt due to the PHP program reconnecting to the online SQL database once every 10 minutes. I didn't think this behavior and amount of time between connections would be enough to warrant an IP blacklisting, but alas, it was.
Now, aside from the possibility of getting my IP whitelisted with the hosting company, is there a way to keep a MySQL session/connection alive so that a program doesn't have to keep re-logging in between iterations?
I suppose this might only be possible if I could keep the PHP program running indefinitely, perhaps after manually adjusting the max run-time limits (I don't know if there would be other external limitations, too, perhaps browser limits). I'm not sure if this is feasible, or would work.
Is there some type of low-level, system-wide "cookie" for a MySQL connection? With the PHP program finishing and closing (and then waiting for the HTML to refresh the page), I suppose the only way to avoid logging in again would be some type of cookie, or IP-address-based access (which would need server-side functionality/implementation).
I'll admit that my approach here probably isn't the most efficient/effective way to accomplish this. Thus, I'm also open to alternative approaches and suggestions that would accomplish the same end result -- a continual web-scrape loop that dumps into a database, and then have the database continually dumped to a webpage.
(I'm seeking a way to accomplish this other than asking my webhost for an IP whitelist, or merely determining their firewall's access ban rate. I'll do either of these if there's truly no feasible or better way.)
Perhaps you can try a persistent database connection.
This page explains persistent connections: http://in2.php.net/manual/en/function.mysql-pconnect.php
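A minimal sketch of what that looks like with the old mysql_* API the link describes (host, user and database names here are placeholders):

```php
<?php
// mysql_pconnect() looks for an already-open persistent link with the same
// host/user/password and reuses it instead of logging in again.
$link = mysql_pconnect('mysql.example.com', 'weather_user', 'secret');
if (!$link) {
    die('Could not connect: ' . mysql_error());
}
mysql_select_db('weather', $link);

// ... run the usual INSERT/SELECT queries here ...

// mysql_close() is ignored for persistent links; the connection stays open
// inside the web server process, waiting to be reused by the next request.
```

Note that a persistent connection only survives as long as the PHP process itself (e.g. an Apache child) stays alive, so it won't help a script that exits and is started again from scratch. With mysqli you get the same behaviour by prefixing the host with p:, and with PDO by setting PDO::ATTR_PERSISTENT => true.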
I'm working on a system that needs to get a country code based on IP address, and needs to be accessible to multiple applications of all shapes and sizes on multiple servers.
At the moment, this is obtained via a cURL request to a preexisting geo.php library, which I think resolves the country code from a .dat file downloaded from MaxMind. Apparently though this method has been running into problems under heavy loads, perhaps due to a memory leak? No one's really sure.
The powers that be have suggested that we should dispense with the cURLing and derive the country code from a local geocoding library, with the data also stored in a local file, or possibly a master file hosted on, e.g., Amazon S3. I'm feeling a bit wary of having a massive file of IP-to-country lookups stored unnecessarily in a hundred different places, of course.
One thing I've done is put the data in a MySQL database and obtained the required results by connecting to that. I don't know for sure, but it seems to me that our sites generally run swiftly and efficiently while connecting to centralised MySQL data, so wouldn't this be a good way of solving this particular problem?
My question then: what are the relative overheads of obtaining data in different ways? cURLing it in, making a request to a remote database, getting it from a local file, getting it from a file hosted somewhere else? It's difficult to work out which of these are more efficient or inefficient, and whether the relative gains in efficiency are likely to be big enough to matter...
I had a website using cURL to get the country code from MaxMind as well, for about 1.5 years, with no problems as far as I could tell. One thing that I did do, though, was set a timeout of ~1-2 seconds for the cURL request and fall back to a default country code if it didn't respond in time. We went through about 1 million queries to MaxMind, I believe, so it was certainly being used. If it didn't respond in that time, I didn't want to slow the page any further. That's the main disadvantage of using an external library: relying on their response time.
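Something along these lines, where the endpoint URL, the function name and the fallback code are placeholders rather than the actual MaxMind API:

```php
<?php
function countryCodeForIp($ip, $fallback = 'US')
{
    // Hypothetical endpoint; substitute whatever geo service you actually call.
    $ch = curl_init('http://geoip.example.com/country?ip=' . urlencode($ip));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 1);    // give up on connecting after 1 second
    curl_setopt($ch, CURLOPT_TIMEOUT, 2);           // give up on the whole request after 2 seconds
    $code = curl_exec($ch);
    curl_close($ch);

    // On a timeout or any other failure, fall back to a default country code
    // so the page is never held up by the remote service.
    return ($code === false || trim($code) === '') ? $fallback : trim($code);
}
```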
As for having it locally, the main thing to be concerned about is: will it be up to date a year from now? Obviously you can't get any more IP addresses out of the current IPv4 pool, but ISPs could potentially buy/sell/trade IPs with different countries (I don't know how it works, but I've seen plenty of IPs from different countries and they never seem to have any pattern to them lol). If that doesn't happen, disregard that part :p. The other thing about having it locally is that you could use the MySQL query cache to store the result so you don't have to worry about resources on subsequent page loads, or alternatively just do what I did and store it in a cookie and check that first before cURLing (or doing a lookup).
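The cookie part could be as small as this (the cookie name and lifetime are arbitrary choices):

```php
<?php
// Check the cookie first and only fall back to the (remote or local) lookup on a miss.
if (isset($_COOKIE['country_code'])) {
    $country = $_COOKIE['country_code'];
} else {
    $country = countryCodeForIp($_SERVER['REMOTE_ADDR']); // e.g. the helper sketched above
    setcookie('country_code', $country, time() + 30 * 86400); // remember it for 30 days
}
```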
You're framing this question the wrong way.
There are only two different methods:
a network lookup
a local resource request
And only one answer:
NEVER do any network lookups while serving a client request.
So, as long as you're accessing a local resource (okay, within the limits of the same datacenter), you're all right.
If you're requesting some distant resource, no matter whether it's cURL or a database or whatever, you're in trouble.
That rule seems obvious to me.
We have 2 applications: a web-based one (PHP) and a desktop one (VB) sharing the same database (Hostgator). Our web app has fast access to the database (it's localhost). Our desktop application suffers from slow access and frequent timeouts.
What's the best approach to this kind of issue? Should we share a database? Is there any other solution?
Thanks
Some possible solutions:
Get a faster DB server
Move your database to a server that is closer to the desktop(s)
Host your webserver/DB at the location of the desktop(s)
Have two DBs, the current one that is local to the webserver and a second one that is local to the desktop(s) and set the second up as a slave to the first. You would have to consider if the desktop(s) write to the DB in this scenario. This option is probably not a good one unless the desktop(s) are read-only and aren't worried about possibly out-of-date data. This could potentially work if the desktop(s) read a lot but write less frequently.
There is no problem with "sharing" a DB. Have you checked the server load and the connection stability?
AFAIK, I don't think sharing itself is the problem. Whether web or desktop, both access the database through the MySQL server, so sharing alone shouldn't be producing such mixed performance results.
The problem is probably not that it's shared; rather, it's probably the network that the data is going over. There are very few circumstances in which it's faster to use a network connection than localhost for accessing MySQL data, so you can't expect the same performance from both.
However, you should be able to get a fairly fast and reliable DB connection over a good network. If you're moving huge amounts of data, you may have to employ some sort of caching. But if the issues are happening even on moderately sized queries, you may have to bring that issue to your hosting company for troubleshooting. Many shared hosts are not optimized for remote DB hosting (most sites don't need/use/want it), so if they can't accommodate it, you may have to move to a host that will meet your needs.
I am designing a file download network.
The ultimate goal is to have an API that lets you upload a file directly to a storage server (no gateway or anything in between). The file is then stored and referenced in a database.
When the file is requested, a server that currently holds the file is selected from the database and an HTTP redirect is done (or the API returns the currently valid direct URL).
Background jobs take care of desired replication of the file for durability/scaling purposes.
Background jobs also move files around to ensure even workload on the servers regarding disk and bandwidth usage.
There is no RAID or anything similar at any point. Every drive is just hung into the server as JBOD. All replication is at the application level. If one server breaks down, it is just marked as broken in the database, and the background jobs take care of replicating from healthy sources until the desired redundancy is reached again.
The system also needs accurate stats for monitoring/balancing and maybe later billing.
So I thought about the following setup.
The environment is a classic Ubuntu, Apache2, PHP, MySQL LAMP stack.
A URL that hits the current storage server is generated by the API (that's no problem so far; just a classic PHP website and MySQL database).
Now it gets interesting...
The storage server runs Apache2, and a PHP script catches the request. The URL parameters (a secure token hash) are validated: IP, timestamp and filename are checked so the request is authorized. (No database connection required, just a PHP script that knows a secret token.)
The PHP script then sets the file header to use Apache2's mod_xsendfile.
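A rough sketch of that download script, purely to illustrate the flow (the token scheme, parameter names and paths are my assumptions, not a finished implementation):

```php
<?php
// Shared secret known to the API server and every storage server.
define('SECRET', 'change-me');

$file    = isset($_GET['file'])    ? $_GET['file']          : '';  // e.g. "bucketname/path/to/file.bin"
$expires = isset($_GET['expires']) ? (int) $_GET['expires'] : 0;
$token   = isset($_GET['token'])   ? $_GET['token']         : '';

// The token binds the file name, the client IP and an expiry timestamp to the secret.
$expected = hash_hmac('sha1', $file . '|' . $_SERVER['REMOTE_ADDR'] . '|' . $expires, SECRET);

// Reject expired links, wrong tokens and path traversal attempts.
if ($expires < time() || $token !== $expected || strpos($file, '..') !== false) {
    header('HTTP/1.1 403 Forbidden');
    exit;
}

// Hand the actual transfer off to Apache via mod_xsendfile.
header('X-Sendfile: /data/storage/' . $file);
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . basename($file) . '"');
```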
Apache delivers the file passed via mod_xsendfile and is configured to pipe the access log to another PHP script.
Apache runs mod_logio, and the access log is in combined I/O log format, additionally extended with the %D variable (the time taken to serve the request, in microseconds) so that transfer speeds can be calculated and network bottlenecks spotted.
The piped access log then goes to a PHP script that parses the URL (the first folder is a "bucket", just as in Google Storage or Amazon S3, and each bucket is assigned to one client, so the client is known), counts input/output traffic and increments database fields. For performance reasons I thought about having daily fields and updating them like traffic = traffic + X, creating the row if nothing was updated.
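A rough sketch of the piped-log script behind that idea (the exact log layout, the regex and the table/column names are assumptions):

```php
#!/usr/bin/php
<?php
// Receives one finished access-log line per request on STDIN (Apache piped logging).
$db = new mysqli('db.example.com', 'stats', 'secret', 'storage');

// One row per bucket per day; created on the first hit, then only incremented.
$stmt = $db->prepare(
    'INSERT INTO traffic_daily (bucket, day, bytes_in, bytes_out, requests)
     VALUES (?, CURDATE(), ?, ?, 1)
     ON DUPLICATE KEY UPDATE
       bytes_in  = bytes_in  + VALUES(bytes_in),
       bytes_out = bytes_out + VALUES(bytes_out),
       requests  = requests + 1'
);

while (($line = fgets(STDIN)) !== false) {
    // Assumed layout: combined I/O format ending in "%I %O %D",
    // i.e. ... "GET /bucket/file HTTP/1.1" status bytes "ref" "ua" in out usec
    if (!preg_match('~"[A-Z]+ /([^/" ]+)[^"]*" \d+ \S+ "[^"]*" "[^"]*" (\d+) (\d+) (\d+)\s*$~', $line, $m)) {
        continue; // unparsable line, skip it
    }
    list(, $bucket, $bytesIn, $bytesOut, $usec) = $m;
    // $usec (%D) could additionally be used to flag unusually slow transfers.

    $stmt->bind_param('sii', $bucket, $bytesIn, $bytesOut);
    $stmt->execute();
}
```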
I have to mention that these will be low-budget servers with massive storage.
You can have a closer look at the intended setup in this thread on Server Fault.
The key data is that the systems will have gigabit throughput (maxed out 24/7) and the file requests will be rather large (so no images or loads of small files that produce high load through lots of log lines and requests), maybe 500 MB on average!
The currently planned setup runs on a cheap consumer mainboard (Asus), 2 GB of DDR3 RAM and an AMD Athlon II X2 220 (2x 2.80 GHz) tray CPU.
Of course download managers and range requests will be an issue, but I think the average size of an access will be at least 50 MB or so.
So my questions are:
Do I have any severe bottlenecks in this flow? Can you spot any problems?
Am I right in assuming that mysql_affected_rows() can be read directly from the last request and does not make another request to the MySQL server?
Do you think a system with the specs given above can handle this? If not, how could I improve it? I think the first bottleneck would be the CPU, wouldn't it?
What do you think about it? Do you have any suggestions for improvement? Maybe something completely different? I thought about using Lighttpd and the mod_secdownload module. Unfortunately it can't check the IP address, and I am not as flexible with it. It would have the advantage that the download validation would not need a PHP process to fire. But as the PHP script only runs briefly and doesn't read and output the data itself, I think this is OK. Do you? I once did downloads using Lighttpd on old throwaway PCs and the performance was awesome. I also thought about using nginx, but I have no experience with that.
What do you think about the piped logging to a script that directly updates the database? Should I rather write requests to a job queue and update the database in a second process that can handle delays? Or not do it at all and parse the log files at night? My thinking is that I would like it to be as real-time as possible and not have accumulated data anywhere other than in the central database. I also don't want to keep track of jobs running on all the servers; that could be a mess to maintain. There should be a simple unit test that generates a secured link, downloads it and checks whether everything worked and the logging has taken place.
Any further suggestions? I am happy for any input you may have!
I am also planning to open source all of this. I just think there needs to be an open-source alternative to expensive storage services like Amazon S3 that is oriented toward file downloads.
I really searched a lot but didn't find anything like this out there. Of course I would reuse an existing solution, preferably open source. Do you know of anything like that?
MogileFS, http://code.google.com/p/mogilefs/ -- this is almost exactly the thing you want.
I have 2 servers. On #1, remote DB access is disabled. The database is huge (~1 GB), so it is not possible to dump it with phpMyAdmin, as it crashes and hangs the connection. I have no SSH access. I need to copy the entire DB to #2 (where I can set up virtually everything).
My idea is to use some kind of HTTP access layer over #1.
For example, a simple PHP script that accepts a query as a _GET/_POST argument and returns the result as the HTTP body.
On #2 (or my desktop) I could set up some kind of server application that would ask sequentially for every row in every table, even one at a time.
And my question is: do you know some ready-to-use app with such a flow?
BTW: #1 is PHP only, #2 can be PHP, Python etc
I can't run anything on #1; fopen, curl, sockets, system etc. are all disabled. I can only access the DB from PHP; no remote connections are allowed.
Can you connect to a remote MySQL server from PHP on Server #1?
I know you said "no remote connections allowed", but you haven't specifically mentioned this scenario.
If this is possible, you could SELECT from your old database and directly INSERT to MySQL running on Server #2.
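If it is, here is a minimal sketch of that approach, run on Server #1 (table, column and host names are placeholders; in practice you would loop over the tables and batch with LIMIT/OFFSET to stay within execution limits):

```php
<?php
// Read locally on Server #1, write straight to the MySQL instance on Server #2.
$src = new mysqli('localhost', 'old_user', 'old_pass', 'old_db');
$dst = new mysqli('server2.example.com', 'new_user', 'new_pass', 'new_db');

$rows = $src->query('SELECT id, name, payload FROM some_table');          // hypothetical table
$stmt = $dst->prepare('INSERT INTO some_table (id, name, payload) VALUES (?, ?, ?)');

while ($row = $rows->fetch_assoc()) {
    $stmt->bind_param('iss', $row['id'], $row['name'], $row['payload']);
    $stmt->execute();
}
```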
A long time ago I used Sypex Dumper for this; I just left the browser open for the night and the next morning the whole DB dump was available on FTP.
It is not clear what is available on server #1.
Assuming you can only run PHP, you still might be able to:
run scp from PHP and connect to server #2, sending the files that way
maybe use PHP to run local commands on server #1? In that case something like running rsync through PHP from server #1 could work
It sounds to me like you should contact the owner of the host and ask if you can get your data out somehow. It is a bit absurd that you should have to stream your entire database and reinsert it on the new machine; it will eat a lot of resources on the PHP server you get the data from. (And if the hosting provider is already that restrictive, you might have a limit on how many SQL operations you are allowed in a given time span as well.)
Though if you are forced to do it, you could do a SELECT * FROM each table and, for each row, convert it to a JSON object that you echo on its own line. You can store this to disk on your end and use it to do the inserts afterward.
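A bare-bones sketch of that export script on #1 (the table name is a placeholder, and the lack of any authentication is a deliberate simplification; you would protect and then delete this script):

```php
<?php
// export.php on server #1: stream one table as one JSON object per line.
$db = new mysqli('localhost', 'dbuser', 'dbpass', 'dbname');

header('Content-Type: text/plain');

// MYSQLI_USE_RESULT streams rows instead of buffering the whole result in memory.
$result = $db->query('SELECT * FROM some_table', MYSQLI_USE_RESULT);   // hypothetical table name
while ($row = $result->fetch_assoc()) {
    echo json_encode($row), "\n";   // save this output to a file on your side
}
```

On your end you would save the response to a file and re-insert it with a small loop that json_decode()s each line.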
I suspect you can access both servers using FTP.
How do you get your PHP files onto it otherwise?
Perhaps you can just copy the MySQL database files if you can access them using FTP.
This doesn't work in all cases; check out http://dev.mysql.com/doc/refman/5.0/en/copying-databases.html for more info.
Or you can try the options found on this link:
http://www.php-mysql-tutorial.com/wikis/mysql-tutorials/using-php-to-backup-mysql-databases.aspx
I don't know how your PHP is set up, though, as I can imagine it could take some time to process the entire database.
(Max execution time setting, etc.)
There is the replication option, if at least one of the hosts is reachable remotely via TCP. It would be slow to synch a virgin DB this way, but it would have the advantage of preserving all the table metadata that you would otherwise lose by doing select/insert sequences. More details here: http://dev.mysql.com/doc/refman/5.0/en/replication.html
I'm not sure how you managed to get into this situation. The hosting provider is clearly being unreasonable, and you should contact them. You should also stop using their services as soon as possible (although this sounds like what you're trying to do anyway).
Unfortunately PHPMyAdmin is not suitable for database operations which are critical for data as it's got too many bugs and limitations - you certainly shouldn't rely on it to dump or restore.
mysqldump is the only way to RELIABLY dump a database and bring it back and have a good chance for the data to be complete and correct. If you cannot run it (locally or remotely), I'm afraid all bets are off.
Are there any tricks to transferring and/or sharing the Resource Links across multiple PHP pages? I do not want the server to continue connecting to the database just to obtain another link to the same user in the same database.
Remember that the link returned from mysql_connect() is automatically closed after the script it originated in finishes executing. Is there any way to close it manually at the end of each session instead?
PHP allows persistent mysql connections, but there are drawbacks. Most importantly, idle apache children end up sitting around, holding idle database connections open. Database connections take up a decent amount of memory, so you really only want them open when they're actually being used.
If your user opens one page every minute, it's far better to have the database connection closed for the 59 seconds out of every minute you're not using it, and re-open it when needed, than to hold it open continually.
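In other words, rather than persisting anything, just open the connection lazily on first use within each request, something like this (a hypothetical helper, not an established pattern from the question):

```php
<?php
// Open the connection only when a query is actually about to run;
// it is closed automatically when the script ends.
function db()
{
    static $link = null;
    if ($link === null) {
        $link = mysqli_connect('localhost', 'user', 'pass', 'mydb');
    }
    return $link;
}

// The first call opens the connection; later calls in the same request reuse it.
$result = mysqli_query(db(), 'SELECT NOW()');
```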
Instead of persistent connections, you should probably look into connection pooling.
One of the advantages of MySQL over other heavier-weight database servers is that connections are cheap and quick to set up.
If you are concerned about large numbers of connections retrieving the same information, you may like to look at caching that information instead of, or as well as, getting it from disk. As usual, profiling the number and type of SQL calls being made will tell you a great deal more than anyone here guessing at what you should really be doing next.
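As a trivial illustration of the caching idea (the file path, TTL and serialization format are arbitrary choices):

```php
<?php
// Serve a recently computed result set from a temp file instead of
// hitting MySQL again on every page view.
function cached_query(mysqli $db, $sql, $ttl = 60)
{
    $file = sys_get_temp_dir() . '/qc_' . md5($sql) . '.cache';

    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return unserialize(file_get_contents($file));    // cache hit
    }

    $rows = array();
    $result = $db->query($sql);
    while ($row = $result->fetch_assoc()) {
        $rows[] = $row;
    }
    file_put_contents($file, serialize($rows));          // refresh the cache
    return $rows;
}
```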