I am crawling some websites for images. However, some of these sites serve images through URLs with a .ashx extension, which prevents me from determining the sizes of the images.
I am using getimagesize():
$imgsize = getimagesize($url);
This results in the following error:
getimagesize(url): failed to open stream: No such file or directory
How can I get around this, and check the size of the image?
You can attempt to use the code from this answer, although I think it should work the way you have it already:
// From #dynamic: https://stackoverflow.com/a/10971333/697370
$cont = file_get_contents($url);   // fetch the raw image bytes
$r = imagecreatefromstring($cont); // build a GD image resource from them
echo imagesx($r); // width in pixels
echo imagesy($r); // height in pixels
Your original code may not be working if:
You don't have allow_url_fopen enabled (unlikely, since that would also prevent getimagesize() from fetching regular remote images), or
The .ashx script is correctly generating an image, but it is not sending the correct headers, which causes getimagesize() to fail.
There may be other causes, but those are the only two I can think of.
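If you are on PHP 5.4 or later, a minimal variation of the same idea (not from the linked answer, just a sketch) is to fetch the bytes yourself and let getimagesizefromstring() measure them, which sidesteps whatever headers or extension the .ashx handler uses:
// Sketch: fetch the bytes first, then measure them in memory (assumes allow_url_fopen is enabled)
$cont = file_get_contents($url);
if ($cont !== false) {
    $imgsize = getimagesizefromstring($cont); // PHP 5.4+
    if ($imgsize !== false) {
        list($width, $height) = $imgsize;
        echo "{$width}x{$height}";
    }
}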
I am trying to get the file size of an image from a remote URL, like so:
$remoteUrl = $file->guid;
//remote url example: http://myApp/wp-content/uploads/2017/05/Screen-Shot-2017-05-08-at-10.35.54.png
$fileSize = filesize($remoteUrl);
But, I get:
filesize(): stat failed for
http://myApp/wp-content/uploads/2017/05/Screen-Shot-2017-05-08-at-10.35.54.png
You can use HTTP headers to find the size of the object. The PHP function get_headers() can get them:
$headers = get_headers('http://myApp/wp-content/uploads/2017/05/Screen-Shot-2017-05-08-at-10.35.54.png', true);
echo $headers['Content-Length'];
This way you can avoid downloading the entire file. You also have access to all the other headers, such as $headers['Content-Type'], which can come in handy if you are dealing with images (see the get_headers() documentation).
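One caveat worth adding (my own note, not part of the original answer): if the URL redirects, get_headers() returns an array of values for each header, and header-name casing can vary between servers. A defensive sketch:
// Sketch: normalize header names and cope with redirects
$headers = array_change_key_case(get_headers($url, 1), CASE_LOWER);
$length = isset($headers['content-length']) ? $headers['content-length'] : null;
if (is_array($length)) {
    // The request was redirected; the last entry belongs to the final URL
    $length = end($length);
}
echo $length;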
That error usually means the supplied URL does not return a valid image. When I try to visit http://myapp/wp-content/uploads/2017/05/Screen-Shot-2017-05-08-at-10.35.54.png it does not show an image in my browser. Double check that the URL returned by $file->guid is correct.
You will need to try a different method: the http:// stream wrapper does not support stat(), which the filesize() function needs.
This question has some options you might use, such as using curl to make a HEAD request and inspecting the Content-Length header.
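A minimal sketch of that HEAD-request idea, assuming the server actually sends a Content-Length header ($remoteUrl is the variable from the question):
// Sketch: HEAD request with curl, read the reported content length
$ch = curl_init($remoteUrl);
curl_setopt($ch, CURLOPT_NOBODY, true);         // headers only, no body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // don't echo anything
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects to the real file
curl_exec($ch);
$fileSize = curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD); // -1 if the server didn't say
curl_close($ch);
echo $fileSize;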
I have a DB system built in PHP/MySQL. I'm fairly new at this. The system allows the user to upload an invoice. Others give permission to pay the invoice. The accounting person uploads the check. After the check is uploaded, it generates a PDF as a cover, then uses PDFTK (via Ben Squire's PDFTK-PHP-Library) to combine all of the files together and present the user with a single PDF to download.
Some users upload PDF files that cause PDFTK to hang indefinitely when it tries to combine them with the others (most of the time it works fine). No error is returned; it just hangs. To get back into the system, the user must clear the cache and log in again. There are no error messages logged by the server; it just freezes. The only difference I can find between the files that do and do not work, looking at them in Acrobat, is that the bad files are legal sized (8.5 x 14) ... but if I create my own legal-sized file and try that, it works fine.
Using PuTTY I've gone to the command line and replicated the same problem: PDFTK can't read the file and hangs there as well. I also tried PDFMerge, which uses FPDF to combine the files, and got an error with that file too (the error I get back is: FPDF error: Unable to find object (4, 0) at expected location). On the command line I was able to use ImageMagick to convert the PDF to JPG, but it gives me an error: "Warning: File has an invalid xref entry: 2. Rebuilding xref table." It then converts it to a JPG but gives a few other, less helpful warnings.
If I could get PHP to check the PDF file to determine whether it is valid, without hanging the system, I could use ImageMagick to convert the file and then convert it back to a PDF, but I don't want to do this for all files. How can I check the validity of a file on upload, to see whether it needs to be converted, without causing the system to hang?
Here is a link to a file that is causing problems: http://www.cssc-testing.org/accounting/school_9/20130604-a1atransportation-1.pdf
Thanks in advance for any guidance you can offer!
My Code (which I'm guessing is not very clean, as I'm new):
$pdftk = new pdftk();

if ($create_cover) {
    $pdftk->setInputFile(array("filename" => $cover_page['server']));
}

// Load a list of attachments
$sql = "SELECT * FROM actg_attachments WHERE trans_id = {$trans_id}";
$attachments = Attachment::find_by_sql($sql);

foreach ($attachments as $attachment) {
    // Check if the file exists from the attachments
    $attachment->set_variables();
    $file = $attachment->abs_path . DS . $attachment->filename;

    if (file_exists($file)) {
        // Use the pdftk tool to attach the documents to this PDF
        $pdftk->setInputFile(array("filename" => $file));
    }
}

$pdftk->setOutputFile($save_file);
$pdftk->_renderPdf();
The pdftk class it is calling is from: https://github.com/bensquire/php-pdtfk-toolkit
You could possibly use Ghostscript, via exec(), to check the file.
The non-accepted answer here may help:
How can you find a problem with a programmatically generated PDF?
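A rough sketch of what that could look like; the gs flags and the error heuristics here are my assumptions rather than something from the linked answer, and it assumes the gs binary is installed and on the PATH:
// Sketch: validate an uploaded PDF with Ghostscript before handing it to PDFTK
function pdf_seems_valid($path) {
    $cmd = 'gs -dNOPAUSE -dBATCH -sDEVICE=nullpage ' . escapeshellarg($path) . ' 2>&1';
    exec($cmd, $output, $exitCode);
    // A broken xref table or missing objects usually produces a non-zero exit
    // code and/or "Error" lines in the output.
    if ($exitCode !== 0) {
        return false;
    }
    foreach ($output as $line) {
        if (stripos($line, 'error') !== false) {
            return false;
        }
    }
    return true;
}

if (!pdf_seems_valid($file)) {
    // e.g. round-trip the file through ImageMagick before merging
}
You could run something like this right after the upload and only send the suspicious files through the ImageMagick conversion.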
I won't say this is an appropriate or best fix, but it may resolve your problem.
In pdf_parser.php, comment out the line:
$this->error("Unable to find object ({$obj_spec[1]}, {$obj_spec[2]}) at expected location");
It should be near line 544.
You'll also likely need to replace:
if (!is_array($kids))
    $this->error('Cannot find /Kids in current /Page-Dictionary');
with:
if (!is_array($kids)) {
    // $this->error('Cannot find /Kids in current /Page-Dictionary');
    return;
}
in the fpdi_pdf_parser.php file
Hope that helps. It worked for me.
I am trying to fetch the meta information from URL results passed after a search. I have been using the OpenGraph library and also PHP's native get_meta_tags function to retrieve the meta tags.
My problem is when I am reading through the contents of a URL that happens to have a .m4v extension. The program tries to read the contents of that file, but it is way too large (and, not to mention, completely useless, as it is all junk), and my program refuses to let it go. Therefore, I am stuck until the program throws a timeout error and moves on.
Is there any way to stop reading the contents of the file if it is way too large? I tried file_get_contents() with the maxlen parameter, but it still seems to read through the entire page. How can I quickly determine if a file is structured with tags before I dive in to farm it for meta?
get_headers() is what you need; there are Content-Type and Content-Length headers in the response that you might be interested in.
You might want to:
$headers = get_headers($url, 1);
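A small sketch of how that check might look before you hand the URL to get_meta_tags() or the OpenGraph library (the exact type check is an assumption on my part):
// Sketch: inspect the headers first and skip anything that is not HTML
$headers = get_headers($url, 1);
$type = isset($headers['Content-Type']) ? $headers['Content-Type'] : '';
if (is_array($type)) {
    $type = end($type); // the URL redirected; use the final response
}
if (stripos($type, 'text/html') === 0) {
    $meta = get_meta_tags($url); // safe to read the page now
}
// otherwise (.m4v, images, ...) skip the URL entirely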
Use PHP's filesize($yourFile) to find the file size in bytes:
$size = filesize($yourFile);
if ($size < 1000) {
    $string = file_get_contents($yourFile);
}
I have a php script that needs to determine the size of a file on the file system after being manipulated by a separate php script.
For example, there exists a zip file that has a fixed size but gets an additional file of unknown size inserted into it based on the user that tries to access it. So the page that's serving the file is something like getfile.php?userid=1234.
So far, I know this:
filesize('getfile.php'); //returns the actual file size of the php file, not the result of script execution
readfile('getfile.php'); //same as filesize()
filesize('getfile.php?userid=1234'); //returns false, as it can't find the file matching the name with GET vars attached
readfile('getfile.php?userid=1234'); //same as filesize()
Is there a way to read the result size of the php script instead of just the php file itself?
From the filesize() documentation:
"As of PHP 5.0.0, this function can also be used with some URL wrappers."
Something like
filesize('http://localhost/getfile.php?userid=1234');
should be enough.
Someone had posted an option for using curl to do this but removed their answer after a downvote. Too bad, because it's the one way I've gotten this to work. So here's their answer that worked for me:
$ch = curl_init('http://localhost/getfile.php?userid=1234');
// This was not part of the poster's answer, but I needed to add it to prevent
// the file being read from outputting with the requesting script:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);

$size = 0;
if (!curl_errno($ch)) {
    $info = curl_getinfo($ch);
    $size = $info['size_download'];
}
curl_close($ch);

echo $size;
The only way to get the size of the output is to run the script and then measure the result. Depending on the input, the result might differ from run to run, though for practical use the best thing to do is to estimate based on what you know, i.e. if you have a 5 MB file and add another 5 kB of user-specific content, it's still about 5 MB in the end.
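One way to "run it and look" without an HTTP round trip is output buffering. This is only a sketch and assumes getfile.php reads its input from $_GET and simply echoes the file:
// Sketch: capture the script's output in a buffer and measure it
$_GET['userid'] = 1234;   // simulate the query string the script expects
ob_start();
include 'getfile.php';    // run the script in-process
$output = ob_get_clean(); // grab (and discard) everything it printed
echo strlen($output);     // size in bytes of what the script produced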
To expand on Ivan's answer:
Your string is 'getfile.php', with or without GET parameters; it is being treated as a local file, and therefore you are retrieving the file size of the PHP file itself.
It is being treated as a local file because it doesn't start with the http:// protocol. See http://us1.php.net/manual/en/wrappers.php for supported protocols.
When using filesize() I got a warning:
Warning: filesize() [function.filesize]: stat failed for ...link... in ..file... on line 233
Instead of filesize(), I found two working options to replace it:
1)
$headers = get_headers($pdfUrl, 1);
$fileSize = $headers['Content-Length'];
echo $fileSize;
2)
echo strlen(file_get_contents($pdfUrl));
Now it's working fine.
Currently, if a user POSTs/uploads a photo to my PHP script, I start out with some code like this:
getimagesize($_FILES['picture1']['tmp_name']);
I then do a LOT more stuff to it, but I am also trying to be able to get a photo from a URL and process it with my other existing code if I can. So I want to know: if I use something like this
$image = ImageCreateFromString(file_get_contents($url));
Would I be able to then run getimagesize() on my $image variable?
UPDATE
I just tried this...
$url = 'http://a0.twimg.com/a/1262802780/images/twitter_logo_header.png';
$image = imagecreatefromstring(file_get_contents($url));
$imageinfo = getimagesize($image);
print_r($imageinfo);
But it didn't work; it gave this:
Warning: getimagesize(Resource id #4) [function.getimagesize]: failed to open stream: No such file or directory in
Any idea how I can do this or something similar to get the result I am after?
I suggest you follow this approach:
// if you need the image type
$type = exif_imagetype($url);
// if you need the image mime type
$type = image_type_to_mime_type(exif_imagetype($url));
// if you need the image extension associated with the mime type
$type = image_type_to_extension(exif_imagetype($url));
// if you don't care about the image type ignore all the above code
$image = ImageCreateFromString(file_get_contents($url));
echo ImageSX($image); // width
echo ImageSY($image); // height
Using exif_imagetype() is a lot faster than getimagesize(); the same goes for ImageSX() / ImageSY(). Plus, they don't return arrays, and they also report the correct image dimensions after the image has been resized or cropped, for instance.
Also, using getimagesize() on URLs isn't good because it will consume much more bandwidth than the alternative, exif_imagetype(). From the PHP manual:
When a correct signature is found, the appropriate constant value will be returned otherwise the return value is FALSE. The return value is the same value that getimagesize() returns in index 2 but exif_imagetype() is much faster.
That's because exif_imagetype() will only read the first few bytes of data.
If you've already got an image resource, you'd get the size using the imagesx and imagesy functions.
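For example, applied to the resource built in the question's update (just a sketch; $url is whatever image URL you fetched):
// Sketch: measure an existing GD image resource instead of passing it to getimagesize()
$image = imagecreatefromstring(file_get_contents($url));
if ($image !== false) {
    $width = imagesx($image);  // width in pixels
    $height = imagesy($image); // height in pixels
    echo "{$width}x{$height}";
}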
getimagesize can be used with HTTP.
Filename - It can reference a local file or (configuration permitting) a remote file using one of the supported streams.
Thus
$info = getimagesize($url);
$image = ImageCreateFromString(file_get_contents($url));
should be fine.
Not sure if this will help, but I ran into a similar issue, and it turned out the firewall controlled by my host was blocking outgoing HTTP connections from my server.
They changed the firewall settings, and my code then worked.
BTW: I first suspected this might be the issue when I tried file_get_contents() on a number of URLs, none of which worked!