I am having trouble uploading files to S3 from on one of our servers. We use S3 to store our backups and all of our servers are running Ubuntu 8.04 with PHP 5.2.4 and libcurl 7.18.0. Whenever I try to upload a file Amazon returns a RequestTimeout error. I know there is a bug in our current version of libcurl preventing uploads of over 200MB. For that reason we split our backups into smaller files.
We have servers hosted on Amazon's EC2 and servers hosted on customer's "private clouds" (a VMWare ESX box behind their company firewall). The specific server that I am having trouble with is hosted on a customer's private cloud.
We use the Amazon S3 PHP Class from http://undesigned.org.za/2007/10/22/amazon-s3-php-class. I have tried 200MB, 100MB and 50MB files, all with the same results. We use the following to upload the files:
$s3 = new S3($access_key, $secret_key, false);
$success = $s3->putObjectFile($local_path, $bucket_name,
$remote_name, S3::ACL_PRIVATE);
I have tried setting curl_setopt($curl, CURLOPT_NOPROGRESS, false); to view the progress bar while it uploads the file. The first time I ran it with this option set it worked. However, every subsequent time it has failed. It seems to upload the file at around 3Mb/s for 5-10 seconds then drops to 0. After 20 seconds sitting at 0, Amazon returns the "RequestTimeout - Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed." error.
I have tried updating the S3 class to the latest version from GitHub but it made no difference. I also found the Amazon S3 Stream Wrapper class and gave that a try using the following code:
include 'gs3.php';
define('S3_KEY', 'ACCESSKEYGOESHERE');
define('S3_PRIVATE','SECRETKEYGOESHERE');
$local = fopen('/path/to/backup_id.tar.gz.0000', 'r');
$remote = fopen('s3://bucket-name/customer/backup_id.tar.gz.0000', 'w+r');
$count = 0;
while (!feof($local))
{
$result = fwrite($remote, fread($local, (1024 * 1024)));
if ($result === false)
{
fwrite(STDOUT, $count++.': Unable to write!'."\n");
}
else
{
fwrite(STDOUT, $count++.': Wrote '.$result.' bytes'."\n");
}
}
fclose($local);
fclose($remote);
This code reads the file one MB at a time in order to stream it to S3. For a 50MB file, I get "1: Wrote 1048576 bytes" 49 times (the first number changes each time of course) but on the last iteration of the loop I get an error that says "Notice: fputs(): send of 8192 bytes failed with errno=11 Resource temporarily unavailable in /path/to/http.php on line 230".
My first thought was that this is a networking issue. We called up the customer and explained the issue and asked them to take a look at their firewall to see if they were dropping anything. According to their network administrator the traffic is flowing just fine.
I am at a loss as to what I can do next. I have been running the backups manually and using SCP to transfer them to another machine and upload them. This is obviously not ideal and any help would be greatly appreciated.
Update - 06/23/2011
I have tried many of the options below but they all provided the same result. I have found that even trying to scp a file from the server in question to another server stalls immediately and eventually times out. However, I can use scp to download that same file from another machine. This makes me even more convinced that this is a networking issue on the clients end, any further suggestions would be greatly appreciated.
This problem exists because you are trying to upload the same file again. Example:
$s3 = new S3('XXX','YYYY', false);
$s3->putObjectFile('file.jpg','bucket-name','file.jpg');
$s3->putObjectFile('file.jpg','bucket-name','newname-file.jpg');
To fix it, just copy the file and give it new name then upload it normally.
Example:
$s3 = new S3('XXX','YYYY', false);
$s3->putObjectFile('file.jpg','bucket-name','file.jpg');
now rename file.jpg to newname-file.jpg
$s3->putObjectFile('newname-file.jpg','bucket-name','newname-file.jpg');
I solved this problem in another way. My bug was, that filesize() function returns invalid cached size value. So just use clearstatcache()
I have experienced this exact same issue several times.
I have many scripts right now which are uploading files to S3 constantly.
The best solution that I can offer is to use the Zend libraries (either the stream wrapper or direct S3 API).
http://framework.zend.com/manual/en/zend.service.amazon.s3.html
Since the latest release of Zend framework, I haven't seen any issues with timeouts. But, if you find that you are still having problems, a simple tweak will do the trick.
Simply open the file Zend/Http/Client.php and modify the 'timeout' value in the $config array. At the time of writing this it existed on line 114. Before the latest release I was running at 120 seconds, but now things are running smooth with a 10 second timeout.
Hope this helps!
There are quite a bit of solutions available. I had this exact problem but I don't wanted to write a code and figure out the problem.
Initially I was searching for a possibility to mount S3 bucket in the Linux machine, found something interesting:
s3fs - http://code.google.com/p/s3fs/wiki/InstallationNotes
- this did work for me. It uses FUSE file-system + rsync to sync the files in S3. It kepes a copy of all filenames in the local system & make it look like a FILE/FOLDER.
This saves BUNCH of our time + no headache of writing a code for transferring the files.
Now, when I was trying to see if there is other options, I found a ruby script which works in CLI, can help you manage S3 account.
s3cmd - http://s3tools.org/s3cmd - this looks pretty clear.
[UPDATE]
Found one more CLI tool - s3sync
s3sync - https://forums.aws.amazon.com/thread.jspa?threadID=11975&start=0&tstart=0 - found in the Amazon AWS community.
I don't see both of them different, if you are not worried about the disk-space then I would choose a s3fs than a s3cmd. A disk makes you feel more comfortable + you can see the files in the disk.
Hope it helps.
You should take a look at the AWS PHP SDK. This is the AWS PHP library formerly known as tarzan and cloudfusion.
http://aws.amazon.com/sdkforphp/
The S3 class included with this is rock solid. We use it to upload multi GB files all of the time.
Related
I've been all over the internet looking for an answer to my problem. Here is the setup, I am running embedded Linux (created with Yocto) which is running the Lighttpd web server with PHP5. In my C++ code I have the following:
shared = shm_open(SHARED_FILE_NAME, O_RDWR | O_CREAT | O_TRUNC, 0666);
ftruncate(shared, FILE_SIZE);
map = mmap(...);
// shm_unlink() isn't called until my C++ thread ends.
Everything works well and I do not get any errors and other C++ processes and threads are also able to access the shared memory and map without any problems (I have one writer thread and all other threads and processes do a read only on the memory). The memory is used as a ring buffer where the writing thread is updating data very quickly. The problems start to occur when trying to access that same memory in PHP. In PHP I do (need read only):
<?php
$shm_key = ftok("/dev/shm/shared_file.shm", 'c');
$shm_id = shm_open($shm_key, "a", 0, 0);
...
?>
When looking at the value from ftok() it returns a non -1 number which means it did not fail. I do get a fail on the PHP's shm_open() call which reads:
Warning: shmop_open(): unable to attach or create shared memory segment in /www/pages/shared.php on line 9
I've changed the permission of the file with chmod 777 /dev/shm/shared.shm just to rule out any file permission issues. Also when I run ipcs -m I do not get any listings for shared memory segments, yet my C++ code is running just fine. I've also looked for SELinux and tried entering setenforce 0 but I get a response of -sh: setenforce: command not found so I figure this isn't an issue. I've also tried running wget <local ip address>/shared.php to see if running locally would return the correct data but when looking at the file which was returned it had the same error messages.
I am looking to be able to have a web page on my embedded system read this shared memory and stream back chunks of binary to feed a graph when a request comes in (not interested in web sockets at the time). I am able to get named pipes to work across PHP and C++ just fine but I need shared memory for this application and the shared memory access seems to be troublesome. Any help is appreciated.
I'm developing PHP functions that need to use C Shared Memory. As your code, my C functions use shm_open, mmap, etc.. and I guess to use PHP ftok(), shmop_open() to access the C's shared memory but this PHP functions don't work.
The two area are not compatible. I found different properties of the two areas in this documents http://menehune.opt.wfu.edu/Kokua/More_SGI/007-2478-008/sgi_html/ch03.html:
C (with shm_open, mmap, like the Straton source code) use “POSIX Shared Memory”
PHP (with shmop_* functions) use “System V Shared Memory”
I suggest you to try with Sync http://php.net/manual/en/book.sync.php: you need the PECL sync extension.
I am using the SabreDAV PHP library to connect to a WebDAV server and download some files but it is taking forever to download a 1MB file and I have to download up to 1GB files from that server. I looked at this link http://code.google.com/p/sabredav/wiki/WorkingWithLargeFiles but it is not helpful because it's telling me that I will get a stream when I do a GET but it is not the case.
Here is my code:
$settings = array(
'baseUri' => 'file url',
'userName' => 'user',
'password' => 'pwd'
);
$client = new \Sabre\DAV\Client($settings);
$response = $client->request('GET');
response is an array with a 'body' key that contains the content of the file. What am I doing wrong? I only need the file for read only. How can I can read through the file line by line as quick as possible?
Thanks in advance.
If its taking too long just to download a 1MB file, then I think its not SabreDAV problem but a problem with your server or network, or perhaps the remote server.
The google code link you mentioned just lists a way if you want to transfer very large files, for that you will have to use the stream and fopen way they mentioned, but I think I was able to transfer 1GB files without using that way and just normally when I last used it with OwnCloud.
If you have a VPS/Dedi server, open ssh and use wget command to test the speed and time it takes to download that remote file from WebDAV, if its same as what its taking with SabreDAV, then its a server/network problem and not SabreDAV, else, its a problem with Sabre or your code.
Sorry but I donot have any code to post to help you since the problem itself is not clear and there can be more than 10 things causing it.
PS: You need to increase php limits for execution time, max file upload and max post size too relatively
Iam trying to upload a file using php. I can upload.zip files up to 3 mb . But can't upload files >3mb. It take a lot of time to submit the html form. I have checked the upload and memory details using the following code.
$max_upload = (int)(ini_get('upload_max_filesize'));
$max_post = (int)(ini_get('post_max_size'));
$memory_limit = (int)(ini_get('memory_limit'));
$upload_mb = min($max_upload, $max_post, $memory_limit);
And it gives the out put as
max_upload=10
memory_limit=64
upload_mb=10
Please help me to find out the solution.
It could also be the webserver, see LimitRequestBody for apache or client_max_body_size for nginx
Another reason would be proxy (transparent proxy?). You can test that by asking someone else to try uploading the file
Have you checked the timeout for your scripts? By default is 30 sec... Maybe that's the limit...
since it takes a lot of time may be you are exceeding the 30 secon timeout
you can alter it by adding
like
set_time_limit(60);
from http://php.net/manual/en/function.set-time-limit.php
and
run this code
<?php
phpinfo(); ?>
Run that file to get your system settings (search for upload_max_filesize, etc);
I gave the same answer to a previous PHP large file upload question, but the answer still applies:
For large files, if you don't want to have to deal with configuring server settings (particularly if you are on shared hosting or some other hosting that doesn't give you full control over the server), one potential solution is to hand the upload off to a third party service.
For example, you could have the form do a direct post to Amazon S3 (http://s3.amazonaws.com/doc/s3-example-code/post/post_sample.html) or use a service like Filepicker.io
Full disclosure: I work at Filepicker.io, but want to help out folks who are dealing with issues doing large file uploads
What I would like to script: a PHP script to find a certain string in loads of files
Is it possible to read contents of thousands of text files from another ftp server without actually downloading those files (ftp_get) ?
If not, would downloading them ONCE -> if already exists = skip / filesize differs = redownload -> search certain string -> ...
be the easiest option?
If URL fopen wrappers are enabled, then file_get_contents can do the trick and you do not need to save the file on your server.
<?php
$find = 'mytext'; //text to find
$files = array('http://example.com/file1.txt', 'http://example.com/file2.txt'); //source files
foreach($files as $file)
{
$data = file_get_contents($file);
if(strpos($data, $find) !== FALSE)
echo "found in $file".PHP_EOL;
}
?>
[EDIT]: If Files are accessible only by FTP:
In that case, you have to use like this:
$files = array('ftp://user:pass#domain.com/path/to/file', 'ftp://user:pass#domain.com/path/to/file2');
If you are going to store the files after you download them, then you may be better served to just download or update all of the files, then search through them for the string.
The best approach depends on how you will use it.
If you are going to be deleting the files after you have searched them, then you may want to also keep track of which ones you searched, and their file date information, so that later, when you go to search again, you won't waste time searching files that haven't changed since the last time you checked them.
When you are dealing with so many files, try to cache any information that will help your program to be more efficient next time it runs.
PHP's built-in file reading functions, such as fopen()/fread()/fclose() and file_get_contents() do support FTP URLs, like this:
<?php
$data = file_get_contents('ftp://user:password#ftp.example.com/dir/file');
// The file's contents are stored in the $data variable
If you would need to get a list of the files in the directory, you might want to check out opendir(), readdir() and closedir(), which I'm pretty sure supports FTP URLs.
An example:
<?php
$dir = opendir('ftp://user:password#ftp.example.com/dir/');
if(!$dir)
die;
while(($file = readdir($dir)) !== false)
echo htmlspecialchars($file).'<br />';
closedir($dir);
If you can connect via SSH to that server, and if you can install new PECL (and PEAR) modules, then you might consider using PHP SSH2. Here's a good tutorial on how to install and use it. This is a better alternative to FTP. But if it is not possible, your only solution is file_get_content('ftp://domain/path/to/remote/file');.
** UPDATE **
Here is a PHP-only implementation of an SSH client : SSH in PHP.
With FTP you'll always have to download to check.
I do not know what kind of bandwidth you're having and how big the files are, but this might be an interesting use-case to run this from the cloud like Amazon EC2, or google-apps (if you can download the files in the timelimit).
In the EC2 case you then spin up the server for an hour to check for updates in the files and shut it down again afterwards. This will cost a couple of bucks per month and avoid you from potentially upgrading your line or hosting contract.
If this is a regular task then it might be worth using a simple queue system so you can run multiple processes at once (will hugely increase speed) This would involve two steps:
Get a list of all files on the remote server
Put the list into a queue (you can use memcached for a basic message queuing system)
Use a seperate script to get the next item from the queue.
The procesing script would contain simple functionality (in do while loop)
ftp_connect
do
item = next item from queue
$contents = file_get_contents;
preg_match(.., $contents);
while (true);
ftp close
You could then in theory fork off multiple processes through the command line without needing to worry about race conditions.
This method is probabaly best suited to crons/batch processing, however it might work in this situation too.
A very strange thing is happening. I am running a script on a new server (it works on my current server and laptop).
The strange thing is that I only get it to (sort of) work when I increase memory limit to 1024M (!). It is extracting a large zip file and going through the files, so I thought it was normal. Instead of this script terminating or ending with errors. I get an error from my browser:
The server at www.localhost.com is
taking too long to respond.
Localhost.com? The web server is just localhost:9090 and I can see Apache is still running. Maybe Apache crashes momentarily and it can't find the server? But nothing about apache crashing in the log files.
This isn't a server issue, its more to do with my PHP script and memory usage I think, so no need to move to server fault.
What could be the problem? How can I narrow do the cause, I am at loss here!
The server is a windows server running Apache 2.2 with PHP version 5.3.2. May laptop and the other working server are running version 5.3.0 and 5.3.1 for PHP.
Thanks all for any help
Ensure that,
ini_set('display_errors','On');
ini_set('error_reporting',E_ALL);
ini_set('max_execution_time', 180);
ini_set('memory_limit','1024MB' );
I'd pop this in the top of the script and see what comes out. It should show you errors and the like.
The other thing, have you checked fopen and the path of the file which it's loading?
Abs said,
check files being zipped up can be zipped by PHP (permissions
especially on a Windows OS with multi
users)
I kept getting this problem too, and none of these sites really helped until I started looking at the same thing for people using Internet Explorer. The way I fixed it is to open up the system hosts file, located at C:\Windows\System32\drivers\etc\hosts, and then uncomment out the line that mentions ::1, which is needed for IPv6. After that it worked fine.
Somehow your system's munged up and isn't treating localhost as the local 127.0.0.1 address. Is your hosts file properly configured? This is most likely why you're getting the "too long to respond" error:
marc#panic:~$ host www.localhost.com
www.localhost.com has address 64.99.64.32
marc#panic:~$ wget www.localhost.com
--2010-08-03 22:41:05-- http://www.localhost.com/
Resolving www.localhost.com... 64.99.64.32
Connecting to www.localhost.com|64.99.64.32|:80... connected.
HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers.
Retrying.
www.localhost.com is full valid hostname as far as the DNS system is concerned.
I am not a php guru by any means but are you writing the extracted files to a temporary local storage location that is within the scope of the application? Because if you are not then I think what is happening is that the application is storing the zip file and extracted files in memory and then is attempting to read them. So if it is a large zip and/or the extracted files are large that would introduce a huge amount of overhead on top of the overhead introduced by your read and processing actions.
So if you are not already I would extracted the files and write them to disk in their own folder, dispose of the zip file at this point, and then iterate over the files in your newly created directory and perform whatever actions you need to on them.