Web page garbled (encoding?) when downloading from within PHP - php

I am trying to download this page (http://www.360.ru/) from within PHP. However, when I write the file out and view it, the content is garbled/corrupt. A different page from the same site downloads without problems (http://www.360.ru/goods/category/3/466/), and both work perfectly well in Chrome and Firefox (which both report the encoding as UTF-8). I cannot think what the problem could be. Here is my PHP code:
<?php
file_put_contents('/temp/out.html', fopen("http://www.360.ru/", 'r'));
file_put_contents('/temp/out2.html', fopen("http://www.360.ru/goods/category/3/466/", 'r'));
exit;
?>
When I open the two files, "out.html" is garbled and corrupt, while "out2.html" is perfectly okay. Any help would be really appreciated. Thanks!

Ah, figured it out - the first page was gzipped. Using gzopen instead of fopen fixed the problem. Hope this helps others...
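A minimal sketch for handling both cases in one code path, assuming the zlib extension is available: check the body for the gzip magic bytes (0x1f 0x8b) and decompress only when they are present. The helper name maybe_gunzip is hypothetical, not part of PHP.

```php
<?php
// Hypothetical helper: decompress $body only if it starts with the gzip magic bytes.
// Assumes the zlib extension (gzdecode) is available.
function maybe_gunzip(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) !== 0) {
        return $body; // not gzip: return unchanged
    }
    $decoded = gzdecode($body);
    // Fall back to the raw bytes if decompression fails.
    return $decoded === false ? $body : $decoded;
}

// Usage sketch (network call, same URL and path as in the question):
// $raw = file_get_contents('http://www.360.ru/');
// file_put_contents('/temp/out.html', maybe_gunzip($raw));
```

Checking the magic bytes avoids having to know in advance which pages the server compresses.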

Related

PDF Parser PHP Library Not Working

I'm using the PDF Parser PHP library to parse the text from several PDFs. It works perfectly for the majority of them, but seems to just time out and stop working for certain PDFs.
This is the code I'm using (straight from their demo page):
<?php
include 'vendor/autoload.php';
$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile('document.pdf');
$text = $pdf->getText();
echo $text;
?>
When I replace 'document.pdf' with the URL to this file, it works perfectly as expected.
However, when I replace 'document.pdf' with the URL to this file, it just times out with a blank page.
Any ideas why it would work for one file and not the other?
Thanks in advance for any advice!
Yes, I have seen this "ghost" error too: nothing in error_log, and nothing caught by try/catch, so it is very hard to diagnose. If you increase memory_limit in php.ini, it goes away. It is either poor garbage collection on the library developers' part or memory ballooning. I think the latter, because my loop failed after 4 PDFs, but when I quadrupled the available RAM it did not fail after 60.
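A minimal sketch of that workaround, assuming the failure really is memory exhaustion: raise the limit for this script only and log any fatal error at shutdown, since memory exhaustion is a fatal error that try/catch does not intercept. The '512M' value is illustrative, not a recommendation.

```php
<?php
// Raise the memory limit for this script only; '512M' is an illustrative value.
ini_set('memory_limit', '512M');

// Fatal errors such as memory exhaustion bypass try/catch, so log them at shutdown.
register_shutdown_function(function (): void {
    $error = error_get_last();
    if ($error !== null && $error['type'] === E_ERROR) {
        error_log(sprintf('Fatal: %s in %s:%d', $error['message'], $error['file'], $error['line']));
    }
});

// Parse each PDF here, then report how much memory the run needed.
error_log('Peak memory: ' . round(memory_get_peak_usage(true) / 1048576, 1) . ' MB');
```

Logging the peak usage after each file can help confirm whether memory grows from one PDF to the next.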

PHP Media Link shows gibberish

Never done a lot of work with media files but I have an odd problem. I have a media link
http://.../wb_media/3343/64999/0aa2233675f94a4fc8a3915175e218f3/1/4e5b9927-3a46-4c69-9929-cc7e2a52f616.png
which is supposed to show an image in the browser, yet it shows gibberish:
Not sure where I should start looking to solve this? I have verified this is indeed the correct link. I would even appreciate knowing what that gibberish is called so I can research the problem more.
You must set the Content-Type header for the file type.
<?php
header("Content-type: image/png");
print(file_get_contents("location/to/image.png"));
?>
Or, if you are not serving it through a PHP script, look into the server configuration and how the server handles MIME types.

External Image resizing

So I have a PHP script file that resizes images on the fly. While this has worked for many sites and servers I have one server where it just won't work.
The script works like this:
<img src="resize/thumb2.php?src=http://a8.sphotos.ak.fbcdn.net/hphotos-ak-snc6/284989_230936523610152_118543444849461_606799_3897837_n.jpg&w=150&h=100&type=crop.">
The result is the following error (it varies from browser to browser, but the gist is that it can't find the file):
Firefox can't find the file at http://xx.xx.xx.xx/~test/tools/resize/thumb2.php?src=http://a8.sphotos.ak.fbcdn.net/hphotos-ak-snc6/284989_230936523610152_118543444849461_606799_3897837_n.jpg&w=150&h=100&type=crop.
So from the above output you can see it's actually trying to open the whole link as the file.
As this is the only server that isn't working, I strongly suspect this is a server setting issue.
I've tried setting:
ini_set('allow_url_fopen', 1);
ini_set('allow_url_include', 1);
Any help would be appreciated.
Thanks
Not to steal @Pekka's thunder, but his comment is the correct answer. (If he posts it as an answer, I'll happily delete this.)
You need to urlencode the src and then decode it in thumb2.php:
<img src="resize/thumb2.php?src=<?php echo urlencode('http://example.com/logo.gif'); ?>">
All the text after "resize/thumb2.php?src=" is not properly encoded. Try using urlencode() or something similar to encode the query part of this URL.
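A minimal sketch of building the whole query string in one step, assuming the same parameter names as in the question (src, w, h, type) and the example image URL from the answer above. http_build_query() URL-encodes each value, and PHP already decodes values before placing them in $_GET, so thumb2.php would normally not need an extra urldecode() call.

```php
<?php
// Build the thumbnail URL; http_build_query() URL-encodes every value.
// The parameter names mirror the question; the image URL is the answer's example.
$query = http_build_query([
    'src'  => 'http://example.com/logo.gif',
    'w'    => 150,
    'h'    => 100,
    'type' => 'crop',
]);
$thumbUrl = 'resize/thumb2.php?' . $query;

// Escape for the HTML attribute so "&" becomes "&amp;".
echo '<img src="' . htmlspecialchars($thumbUrl) . '">', "\n";

// parse_str() shows the round trip that PHP performs when it fills $_GET.
parse_str($query, $params);
```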

TCPDF outputs weird characters in IE8

Today I started experimenting with PHP-based PDF generators. I tried TCPDF and it works fine for the most part, although it seems to be a little slow. But when I load the PHP file that generates my PDF in Internet Explorer 8, I see lines and lines of weird characters. Chrome however recognizes it as a PDF.
I'm assuming that I have to set a special MIME type to tell IE that it should interpret the page output as a PDF file. If yes, how can I do this?
Setting the "application/pdf" or "application/octet-stream" MIME type might help. Keep in mind that "application/octet-stream" will force a download of the file and might prevent it from opening in the browser.
In case you wonder, you can do it like this:
header('Content-type: application/octet-stream');
I had this problem too. What I did to get it to work was add
exit();
at the end of the PDF output.
You need to handle IE differently for dynamic-generated content. See this article,
http://support.microsoft.com/default.aspx?scid=kb;en-us;293792
In my code, I do this,
if(isset($_SERVER['HTTP_USER_AGENT']) AND ($_SERVER['HTTP_USER_AGENT']=='contype')) {
header('Content-Type: application/pdf');
exit;
}
This problem may also explain the slowness you mentioned, because without this logic your page actually sends the whole PDF multiple times.
@Pieter: I was experiencing the same issue using tcpdf (with fpdi), and loading the page that was generating the pdf using an ajax call. I changed the javascript to load the page using window.location instead, and the issue went away and the performance was much better. I believe that the other two posters are correct in the idea that the document header is causing the issue. In my case, because of the ajax call, the header was not being applied to the whole document, and causing the issue. Hope this helps.
I found this to be a problem too, and for me this all hinged on the code:
if (php_sapi_name() != 'cli') {
on line 7249 of the tcpdf.php file.
I commented out this 'if' statement (and the related '}') and everything works fine in IE8 and my other browsers.
Hope this helps

Displaying image with php

I have a script which displays images like this:
header("Content-Type: image/{$ext}");
readfile($image->path);
This has worked fine for weeks and now suddenly it has stopped working. The response header looks fine (Content-Type: image/jpg), I have no ending php-tag and I have made no changes to my code, server- or php-setup which could have caused this to malfunction. Does anyone have a clue as to what may be going wrong?
Thanks!
======================
UPDATE
The image doesn't display, although you can download it (File -> Save As) and save it to your computer. Opening it locally won't work either, which leads me to think that the image has been corrupted somehow. Has anyone experienced something similar? I'm thinking maybe some PHP errors/warnings get injected into the stream and corrupt the image.
One source of possible issues is that the MIME type for JPEG images is image/jpeg, not image/jpg. This is a case where the type doesn't agree with the fairly-common, 3-character version of the file extension.
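A minimal sketch of mapping extensions to MIME types explicitly instead of interpolating the extension into the header. The helper name image_mime_type is hypothetical, and the table is an illustrative subset rather than a complete list.

```php
<?php
// Hypothetical helper: map a file extension to an image MIME type.
// The table is an illustrative subset, not a complete list.
function image_mime_type(string $extension): ?string
{
    $map = [
        'jpg'  => 'image/jpeg',
        'jpeg' => 'image/jpeg',
        'png'  => 'image/png',
        'gif'  => 'image/gif',
    ];
    // Unknown extensions yield null so the caller can refuse to serve the file.
    return $map[strtolower($extension)] ?? null;
}

// Usage sketch with the question's $image object:
// $mime = image_mime_type(pathinfo($image->path, PATHINFO_EXTENSION));
// if ($mime !== null) { header("Content-Type: {$mime}"); readfile($image->path); }
```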
Some thoughts:
File is too big
File path causing problems
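The right content type for JPG images is "Content-type: image/jpeg".
Note that HTTP header names are case-insensitive, so the lowercase "t" in "type" makes no difference; the value "image/jpeg" is what matters.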
UPDATE
I don't know if it will be useful but try something like this:
$info=pathinfo($image->path);
$ext=strtolower($info["extension"]);
if($ext=="jpg") $ext="jpeg";
header("Content-type: image/$ext");
imagejpeg(imagecreatefromjpeg($image->path));
