The PHP code below fails to retrieve the correct characters:
echo $html = file_get_contents("http://www.tsetmc.com/tsev2/data/instinfofast.aspx?i=65883838195688438&c=34+");
The result is:
���\�%PKJDA��ۈ�0�o'�z��W�"�7o�E��J:�%�+�=o�h#Ĥ�T�Jv�L�$��IT��1҈IY �B L�g�Mt����� �S]>>�����������j#�Tu97������#"jD��C�3x0�����I"("D�W��Bd��9������J�^ȑ���T��[e��K����r�ZB����r�Z#�w��4G� � �C�b�%8��PR�/���ع���a=�o��s���H�G�
This is because the output is gzip-compressed; you need to decompress it (see the Content-Encoding response header):
D:\Temp>curl -v "http://www.tsetmc.com/tsev2/data/instinfofast.aspx?i=65883838195688438&c=34+" -o output.data
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 79.175.151.173...
* TCP_NODELAY set
* Connected to www.tsetmc.com (79.175.151.173) port 80 (#0)
> GET /tsev2/data/instinfofast.aspx?i=65883838195688438&c=34+ HTTP/1.1
> Host: www.tsetmc.com
> User-Agent: curl/7.55.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Cache-Control: public, max-age=1
< Content-Type: text/html; charset=utf-8
< Content-Encoding: gzip
< Expires: Sat, 21 Dec 2019 09:43:48 GMT
< Last-Modified: Sat, 21 Dec 2019 09:43:47 GMT
< Vary: *
< Server: Microsoft-IIS/10.0
< X-Powered-By: ASP.NET
< X-Powered-By: ARR/3.0
< X-Powered-By: ASP.NET
< Date: Sat, 21 Dec 2019 09:42:59 GMT
< Content-Length: 155
<
{ [155 bytes data]
100 155 100 155 0 0 155 0 0:00:01 --:--:-- 0:00:01 662
* Connection #0 to host www.tsetmc.com left intact
D:\Temp>
Decompressing it (on Windows) with 7-Zip:
D:\Temp>"c:\Program Files\7-Zip\7z.exe" x output.data output
7-Zip 18.05 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2018-04-30
Scanning the drive for archives:
1 file, 155 bytes (1 KiB)
Extracting archive: output.data
--
Path = output.data
Type = gzip
Headers Size = 10
Everything is Ok
Size: 239
Compressed: 155
D:\Temp>type output
12:29:59,A ,9055,9098,9131,9072,9217,9000,3582,17432646,158598409673,0,20191221,122959;;2#100400#9055#9055#20091#1,2#60000#9050#9058#554#1,1#1000#9040#9059#993#2,;66660,417193,674167;13450748,3981898,0,13913408,3519238,1255,9,0,899,11;;;1;
D:\Temp>
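Back in PHP, a minimal sketch of the fix (assuming the endpoint always returns gzip, as shown above; gzdecode() requires the zlib extension, and cURL can also negotiate the decoding itself):
<?php
$url = "http://www.tsetmc.com/tsev2/data/instinfofast.aspx?i=65883838195688438&c=34+";
// Option 1: fetch the raw gzip body and decompress it manually.
echo gzdecode(file_get_contents($url));
// Option 2: let cURL send Accept-Encoding and decode the response itself.
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_ENCODING, ""); // "" means accept any supported encoding
echo curl_exec($ch);
curl_close($ch);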
I started a project with Laravel 8 and was adjusting the seeders to generate fake data with Faker in the database. I have a table with images, for which I generate random images with Faker. My ImageFactory.php file is like this:
namespace Database\Factories;
use App\Models\Image;
use Illuminate\Database\Eloquent\Factories\Factory;
class ImageFactory extends Factory
{
/**
* The name of the factory's corresponding model.
*
* @var string
*/
protected $model = Image::class;
/**
* Define the model's default state.
*
* @return array
*/
public function definition()
{
return [
'url' => 'posts/' . $this->faker->image('./public/storage/posts', 640, 480, null, false)
];
}
}
When executing the command:
php artisan migrate:fresh --seed
it generates this error in the console:
copy(https://via.placeholder.com/640x480.png/001144?text=corrupti): Failed to open stream: Connection timed out at vendor/fakerphp/faker/src/Faker/Provider/Image.php:121
Looking for a solution to this, I found a suggestion that placing these two lines in the Image.php file that throws the error, after the line with CURLOPT_FILE, would solve the problem, so that it would look like this:
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false); // added line
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // added line
$success = curl_exec($ch) && curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200; // existing line
However, this did not solve my problem and the error persists. I am using Laravel 8 and PHP 8.
Update
I'm using Ubuntu 18.04. Running the command curl -vo /dev/null "via.placeholder.com/640x480.png/001144?text=corrupti" I got this:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 2600:3c00::f03c:91ff:fe60:d792...
* TCP_NODELAY set
* Trying 45.33.24.119...
* TCP_NODELAY set
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Connected to via.placeholder.com (45.33.24.119) port 80 (#0)
> GET /640x480.png/001144?text=corrupti HTTP/1.1
> Host: via.placeholder.com
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.6.2
< Date: Sat, 13 Feb 2021 14:36:37 GMT
< Content-Type: image/png
< Content-Length: 1558
< Last-Modified: Sat, 09 Jan 2021 14:00:02 GMT
< Connection: keep-alive
< ETag: "5ff9b6e2-616"
< Expires: Sat, 20 Feb 2021 14:36:37 GMT
< Cache-Control: max-age=604800
< X-Cache: L1
< Accept-Ranges: bytes
<
{ [1126 bytes data]
100 1558 100 1558 0 0 3386 0 --:--:-- --:--:-- --:--:-- 3386
* Connection #0 to host via.placeholder.com left intact
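As an aside, if the seeded rows only need an image URL rather than an actual file on disk, one sketch that sidesteps the remote download entirely is Faker's imageUrl(), which only fabricates a URL string and performs no HTTP request (this avoids the timeout rather than fixing it):
public function definition()
{
    return [
        // imageUrl() builds a placeholder URL without downloading anything.
        'url' => $this->faker->imageUrl(640, 480),
    ];
}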
I have a PHP Symfony application which is served by nginx behind Varnish. Here is the varnishlog output for a request:
* << BeReq >> 492062
- Begin bereq 492061 fetch
- Timestamp Start: 1572337898.474535 0.000000 0.000000
- BereqMethod GET
- BereqURL /
- BereqProtocol HTTP/1.0
- BereqHeader Host: xxx
- BereqHeader X-Forwarded-Host: xxx
- BereqHeader X-Real-IP: xxx
- BereqHeader X-Forwarded-Proto: https
- BereqHeader HTTPS: on
- BereqHeader User-Agent: Wget/1.19.4 (linux-gnu)
- BereqHeader Accept: */*
- BereqHeader X-Forwarded-For: 127.0.0.1
- BereqProtocol HTTP/1.1
- BereqHeader Accept-Encoding: gzip
- BereqHeader X-Varnish: 492062
- VCL_call BACKEND_FETCH
- VCL_return fetch
- BackendOpen 26 boot.default 127.0.0.1 8080 127.0.0.1 43676
- BackendStart 127.0.0.1 8080
- Timestamp Bereq: 1572337898.474685 0.000150 0.000150
- Timestamp Beresp: 1572337903.642006 5.167471 5.167321
- BerespProtocol HTTP/1.1
- BerespStatus 200
- BerespReason OK
- BerespHeader Server: nginx/1.14.0 (Ubuntu)
- BerespHeader Content-Type: text/html; charset=UTF-8
- BerespHeader Transfer-Encoding: chunked
- BerespHeader Connection: keep-alive
- BerespHeader Set-Cookie: PHPSESSID=slaurqvo3msh9uklerbht0nd2h; path=/; domain=.xxx; HttpOnly
- BerespHeader Cache-Control: max-age=3600, public
- BerespHeader Date: Tue, 29 Oct 2019 08:31:39 GMT
- BerespHeader Age: 20
- BerespHeader Content-Encoding: gzip
- TTL RFC 3600 10 0 1572337904 1572337884 1572337899 0 3600
- VCL_call BACKEND_RESPONSE
- TTL VCL 86420 10 0 1572337884
- TTL VCL 86420 3600 0 1572337884
- TTL VCL 140 3600 0 1572337884
- VCL_return deliver
- BerespHeader Vary: Accept-Encoding
- Storage malloc Transient
- ObjProtocol HTTP/1.1
- ObjStatus 200
- ObjReason OK
- ObjHeader Server: nginx/1.14.0 (Ubuntu)
- ObjHeader Content-Type: text/html; charset=UTF-8
- ObjHeader Set-Cookie: PHPSESSID=slaurqvo3msh9uklerbht0nd2h; path=/; domain=.xxx; HttpOnly
- ObjHeader Cache-Control: max-age=3600, public
- ObjHeader Date: Tue, 29 Oct 2019 08:31:39 GMT
- ObjHeader Content-Encoding: gzip
- ObjHeader Vary: Accept-Encoding
- Fetch_Body 2 chunked stream
- Gzip u F - 24261 118266 80 80 194017
- BackendReuse 26 boot.default
- Timestamp BerespBody: 1572337903.644744 5.170209 0.002738
- Length 24261
- BereqAcct 275 0 275 342 24261 24603
- End
* << Request >> 492061
- Begin req 492060 rxreq
- Timestamp Start: 1572337898.474380 0.000000 0.000000
- Timestamp Req: 1572337898.474380 0.000000 0.000000
- ReqStart 127.0.0.1 57354
- ReqMethod GET
- ReqURL /
- ReqProtocol HTTP/1.0
- ReqHeader Host: xxx
- ReqHeader X-Forwarded-Host: xxx
- ReqHeader X-Real-IP: xxx
- ReqHeader X-Forwarded-For: xxx
- ReqHeader X-Forwarded-Proto: https
- ReqHeader HTTPS: on
- ReqHeader Cache-Control: max-age=15000
- ReqHeader Connection: close
- ReqHeader User-Agent: Wget/1.19.4 (linux-gnu)
- ReqHeader Accept: */*
- ReqHeader Accept-Encoding: identity
- ReqUnset X-Forwarded-For: xxx
- ReqHeader X-Forwarded-For: xxx, 127.0.0.1
- VCL_call RECV
- ReqUnset X-Forwarded-For: xxx, 127.0.0.1
- ReqHeader X-Forwarded-For: 127.0.0.1
- VCL_return hash
- ReqUnset Accept-Encoding: identity
- VCL_call HASH
- VCL_return lookup
- HitMiss 492059 104.991348
- VCL_call MISS
- VCL_return fetch
- Link bereq 492062 fetch
- Timestamp Fetch: 1572337903.643494 5.169113 5.169113
- RespProtocol HTTP/1.1
- RespStatus 200
- RespReason OK
- RespHeader Server: nginx/1.14.0 (Ubuntu)
- RespHeader Content-Type: text/html; charset=UTF-8
- RespHeader Set-Cookie: PHPSESSID=slaurqvo3msh9uklerbht0nd2h; path=/; domain=.xxx; HttpOnly
- RespHeader Cache-Control: max-age=3600, public
- RespHeader Date: Tue, 29 Oct 2019 08:31:39 GMT
- RespHeader Content-Encoding: gzip
- RespHeader Vary: Accept-Encoding
- RespHeader X-Varnish: 492061
- RespHeader Age: 20
- RespHeader Via: 1.1 varnish (Varnish/5.2)
- VCL_call DELIVER
- VCL_return deliver
- Timestamp Process: 1572337903.643520 5.169140 0.000026
- RespUnset Content-Encoding: gzip
- RespHeader Accept-Ranges: bytes
- RespHeader Connection: close
- Gzip U D - 24261 118266 80 80 194017
- Timestamp Resp: 1572337903.645192 5.170812 0.001672
- ReqAcct 313 0 313 381 118266 118647
- End
* << Session >> 492060
- Begin sess 0 HTTP/1
- SessOpen 127.0.0.1 57354 a0 127.0.0.1 80 1572337898.474305 24
- Link req 492061 rxreq
- SessClose TX_EOF 5.171
- End
Somehow, Varnish seems to be caching the site:
- VCL_call DELIVER
- VCL_return deliver
VCL Configuration:
vcl 4.0;
# Default backend definition. Set this to point to your content server.
backend default {
.host = "127.0.0.1";
.port = "8080";
}
sub vcl_recv {
// Remove all cookies except the session ID.
if (req.http.Cookie) {
set req.http.Cookie = ";" + req.http.Cookie;
set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
set req.http.Cookie = regsuball(req.http.Cookie, ";(PHPSESSID)=", "; \1=");
set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
if (req.http.Cookie == "") {
// If there are no more cookies, remove the header to get page cached.
unset req.http.Cookie;
}
}
}
sub vcl_backend_response {
# Happens after we have read the response headers from the backend.
#
# Here you clean the response headers, removing silly Set-Cookie headers
# and other mistakes your backend does.
if (beresp.http.Surrogate-Control ~ "ESI/1.0") {
unset beresp.http.Surrogate-Control;
set beresp.do_esi = true;
}
}
sub vcl_deliver
{
# Insert Diagnostic header to show Hit or Miss
if (obj.hits > 0) {
set resp.http.X-Cache = "HIT";
set resp.http.X-Cache-Hits = obj.hits;
}
else {
set resp.http.X-Cache = "MISS";
}
}
What is wrong here?
You shall not cache pages with a Set-Cookie header!
Also, you are quoting the wrong lines to determine whether the response was cached; the relevant entry is:
VCL_call MISS
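A minimal VCL sketch of that rule, in the style of the built-in hit-for-miss pattern (the TTLs here are an assumption; adjust to your policy):
sub vcl_backend_response {
    # Never store responses that set a cookie; remember that decision briefly.
    if (beresp.http.Set-Cookie) {
        set beresp.uncacheable = true;
        set beresp.ttl = 120s;
        return (deliver);
    }
}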
I'm trying to download a file from an external server: I submit a form and it returns a file download (PDF).
Copying the request as cURL from the Network tab in Chrome works fine in the terminal (it downloads the PDF) but not in shell_exec() (I get the form page as output).
Here is the verbose output from both curl runs.
This one below works fine:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 107.180.12.118...
* TCP_NODELAY set
* Connected to operaciones.ahmex.com.mx (107.180.12.118) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
} [5 bytes data]
* (304) (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* (304) (IN), TLS handshake, Server hello (2):
{ [102 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [2881 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
* subject: OU=Domain Control Validated; CN=operaciones.ahmex.com.mx
* start date: May 25 17:30:17 2019 GMT
* expire date: Jul 24 17:04:13 2020 GMT
* subjectAltName: host "operaciones.ahmex.com.mx" matched cert's "operaciones.ahmex.com.mx"
* issuer: C=US; ST=Arizona; L=Scottsdale; O=GoDaddy.com, Inc.; OU=http://certs.godaddy.com/repository/; CN=Go Daddy Secure Certificate Authority - G2
* SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
} [5 bytes data]
* Using Stream ID: 1 (easy handle 0x55f8419d44c0)
} [5 bytes data]
> POST /potentials/generar_certificado HTTP/2
> Host: operaciones.ahmex.com.mx
> authority: operaciones.ahmex.com.mx
> cache-control: max-age=0
> origin: https://operaciones.ahmex.com.mx
> upgrade-insecure-requests: 1
> content-type: multipart/form-data; boundary=----WebKitFormBoundary1IwLan7m4erUfeoh
> user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36
> sec-fetch-mode: navigate
> sec-fetch-user: ?1
> accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
> sec-fetch-site: same-origin
> referer: https://operaciones.ahmex.com.mx/potentials/
> accept-encoding: gzip, deflate, br
> accept-language: en-US,en;q=0.9
> cookie: ci_session=(I removed this)
> Content-Length: 971
>
{ [5 bytes data]
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
} [5 bytes data]
* We are completely uploaded and fine
{ [5 bytes data]
100 971 0 0 100 971 0 701 0:00:01 0:00:01 --:--:-- 700
100 971 0 0 100 971 0 407 0:00:02 0:00:02 --:--:-- 406< HTTP/2 200
< date: Tue, 13 Aug 2019 14:57:41 GMT
< server: Apache
< x-powered-by: PHP/5.6.40
< content-disposition: attachment; filename="CER-0196084.pdf"
< cache-control: private, max-age=0, must-revalidate
< pragma: public
< vary: Accept-Encoding,User-Agent
< content-encoding: gzip
< content-type: application/x-download
<
{ [5 bytes data]
100 12590 0 11619 100 971 3823 319 0:00:03 0:00:03 --:--:-- 4141
100 65749 0 64778 100 971 15423 231 0:00:04 0:00:04 --:--:-- 15650
100 65749 0 64778 100 971 12452 186 0:00:05 0:00:05 --:--:-- 12710
100 65749 0 64778 100 971 10443 156 0:00:06 0:00:06 --:--:-- 13442
100 65749 0 64778 100 971 8990 134 0:00:07 0:00:07 --:--:-- 13439
100 65749 0 64778 100 971 7893 118 0:00:08 0:00:08 --:--:-- 10288
100 78659 0 77688 100 971 8609 107 0:00:09 0:00:09 --:--:-- 2675
100 104k 0 103k 100 971 11625 106 0:00:09 0:00:09 --:--:-- 10514
* Connection #0 to host operaciones.ahmex.com.mx left intact
This one does not:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 107.180.12.118...
* TCP_NODELAY set
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Connected to operaciones.ahmex.com.mx (107.180.12.118) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
} [5 bytes data]
* (304) (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* (304) (IN), TLS handshake, Server hello (2):
{ [102 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [2881 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
* subject: OU=Domain Control Validated; CN=operaciones.ahmex.com.mx
* start date: May 25 17:30:17 2019 GMT
* expire date: Jul 24 17:04:13 2020 GMT
* subjectAltName: host "operaciones.ahmex.com.mx" matched cert's "operaciones.ahmex.com.mx"
* issuer: C=US; ST=Arizona; L=Scottsdale; O=GoDaddy.com, Inc.; OU=http://certs.godaddy.com/repository/; CN=Go Daddy Secure Certificate Authority - G2
* SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
} [5 bytes data]
* Using Stream ID: 1 (easy handle 0x55a28da8e4c0)
} [5 bytes data]
> POST /potentials/generar_certificado HTTP/2
> Host: operaciones.ahmex.com.mx
> authority: operaciones.ahmex.com.mx
> cache-control: max-age=0
> origin: https://operaciones.ahmex.com.mx
> upgrade-insecure-requests: 1
> content-type: multipart/form-data; boundary=----WebKitFormBoundaryWbbB3yby9oTZCuNV
> user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36
> sec-fetch-mode: navigate
> sec-fetch-user: ?1
> accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
> sec-fetch-site: same-origin
> referer: https://operaciones.ahmex.com.mx/potentials/
> accept-encoding: gzip, deflate, br
> accept-language: en-US,en;q=0.9
> cookie: ci_session=(I removed this)
> Content-Length: 972
>
{ [5 bytes data]
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
} [5 bytes data]
* We are completely uploaded and fine
{ [5 bytes data]
< HTTP/2 200
< date: Tue, 13 Aug 2019 15:01:02 GMT
< server: Apache
< x-powered-by: PHP/5.6.40
< vary: Accept-Encoding,User-Agent
< content-encoding: gzip
< content-length: 3120
< content-type: text/html; charset=UTF-8
<
{ [5 bytes data]
100 4092 100 3120 100 972 3312 1031 --:--:-- --:--:-- --:--:-- 4343
* Connection #0 to host operaciones.ahmex.com.mx left intact
Here's the cURL command; I just removed the session cookie:
curl 'https://operaciones.ahmex.com.mx/potentials/generar_certificado' -H 'authority: operaciones.ahmex.com.mx' -H 'cache-control: max-age=0' -H 'origin: https://operaciones.ahmex.com.mx' -H 'upgrade-insecure-requests: 1' -H 'content-type: multipart/form-data; boundary=----WebKitFormBoundaryWbbB3yby9oTZCuNV' -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36' -H 'sec-fetch-mode: navigate' -H 'sec-fetch-user: ?1' -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3' -H 'sec-fetch-site: same-origin' -H 'referer: https://operaciones.ahmex.com.mx/potentials/' -H 'accept-encoding: gzip, deflate, br' -H 'accept-language: en-US,en;q=0.9' --data-binary $'------WebKitFormBoundaryWbbB3yby9oTZCuNV\r\nContent-Disposition: form-data; name="nombre2"\r\n\r\nSergio\r\n------WebKitFormBoundaryWbbB3yby9oTZCuNV\r\nContent-Disposition: form-data; name="apellido_paterno"\r\n\r\nMendoza\r\n------WebKitFormBoundaryWbbB3yby9oTZCuNV\r\nContent-Disposition: form-data; name="apellido_materno"\r\n\r\nNegrete\r\n------WebKitFormBoundaryWbbB3yby9oTZCuNV\r\nContent-Disposition: form-data; name="rfc"\r\n\r\nMENS8804144J4\r\n------WebKitFormBoundaryWbbB3yby9oTZCuNV\r\nContent-Disposition: form-data; name="importe"\r\n\r\n1500000\r\n------WebKitFormBoundaryWbbB3yby9oTZCuNV\r\nContent-Disposition: form-data; name="banco"\r\n\r\nSantander\r\n------WebKitFormBoundaryWbbB3yby9oTZCuNV\r\nContent-Disposition: form-data; name="estado"\r\n\r\nJalisco\r\n------WebKitFormBoundaryWbbB3yby9oTZCuNV\r\nContent-Disposition: form-data; name="municipio"\r\n\r\nZapopan\r\n------WebKitFormBoundaryWbbB3yby9oTZCuNV\r\nContent-Disposition: form-data; name="action"\r\n\r\nEnviar\r\n------WebKitFormBoundaryWbbB3yby9oTZCuNV--\r\n' --compressed -o cert.pdf
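For comparison, here is a sketch of the same multipart POST done natively with PHP's cURL extension instead of shell_exec() (the field values are taken from the command above; the ci_session value still has to be filled in):
<?php
$ch = curl_init('https://operaciones.ahmex.com.mx/potentials/generar_certificado');
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING       => '',               // auto-decode gzip/deflate
    CURLOPT_COOKIE         => 'ci_session=...', // the removed session value
    // Passing an array makes cURL build the multipart body and boundary itself.
    CURLOPT_POSTFIELDS     => array(
        'nombre2'          => 'Sergio',
        'apellido_paterno' => 'Mendoza',
        'apellido_materno' => 'Negrete',
        'rfc'              => 'MENS8804144J4',
        'importe'          => '1500000',
        'banco'            => 'Santander',
        'estado'           => 'Jalisco',
        'municipio'        => 'Zapopan',
        'action'           => 'Enviar',
    ),
));
file_put_contents('cert.pdf', curl_exec($ch));
curl_close($ch);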
If you are using PHP, why not try the readfile() method?
If you want to use the console, wget is better suited than cURL for downloads.
I've read tons of cURL tutorials (I'm using PHP) and there's always the same basic code, which doesn't work for me! No specific errors, just no result.
I want to make an HTTP request to the Wikipedia API and get the result in JSON format.
Here's the code:
$handle = curl_init();
$url = "http://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json";
curl_setopt_array($handle,
array(
CURLOPT_URL => $url,
CURLOPT_RETURNTRANSFER => true
)
);
$output = curl_exec($handle);
if (!$output) {
exit('cURL Error: '.curl_error($handle));
}
$result= json_decode($output,true);
print_r($result);
curl_close($handle);
I'd like to know what I'm doing wrong.
Your code is correct, but it seems Wikipedia doesn't send back the data when using PHP cURL (maybe some headers or other parameters must be set for it to work).
If all you need is to retrieve some data, though, you can simply use file_get_contents, which works fine:
$output = file_get_contents("http://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json");
echo $output;
Edit:
Just for information, I found what the issue is. When running curl -v on that URL, the following comes up:
* Trying 91.198.174.192...
* Connected to fr.wikipedia.org (91.198.174.192) port 80 (#0)
> GET /w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json HTTP/1.1
> Host: fr.wikipedia.org
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Date: Wed, 17 May 2017 13:54:31 GMT
< Server: Varnish
< X-Varnish: 852298595
< X-Cache: cp3031 int
< X-Cache-Status: int
< Set-Cookie: WMF-Last-Access=17-May-2017;Path=/;HttpOnly;secure;Expires=Sun, 18 Jun 2017 12:00:00 GMT
< Set-Cookie: WMF-Last-Access-Global=17-May-2017;Path=/;Domain=.wikipedia.org;HttpOnly;secure;Expires=Sun, 18 Jun 2017 12:00:00 GMT
< X-Client-IP: 86.214.172.57
< Location: https://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json
< Content-Length: 0
< Connection: keep-alive
<
* Connection #0 to host fr.wikipedia.org left intact
So what's happening is that the actual content is at the https URL, not http; requesting https://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json directly should work.
The reason it works with file_get_contents is that in this case the redirect is followed automatically.
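So a one-line sketch that keeps the original cURL code working is to let it follow that redirect:
curl_setopt($handle, CURLOPT_FOLLOWLOCATION, true); // follow the 301 to the https URL
or simply change $url to the https:// address.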
I have the following function that gets a blob from the database and should return the file to the browser; however, it returns a corrupt file:
$file_data is an array with the returned row from the files table, containing the blob, the content type, the last-modified time, and other such things.
$data is the blob component of the $file_data array.
function header_file($data, $file_data)
{
$last_modified = gmdate('D, d M Y H:i:s', $file_data['unix_last_modified_time'])." GMT";
// if the browser asks whether its cached copy is up to date
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']))
{
// parse header
$if_modified_since = preg_replace('/;.*$/', '', $_SERVER['HTTP_IF_MODIFIED_SINCE']);
if ($if_modified_since == $last_modified)
{
// the browser's cache is still up to date
header("HTTP/1.0 304 Not Modified");
header("Cache-Control: max-age=86400, must-revalidate");
exit;
}
}
header("Cache-Control: max-age=86400, must-revalidate");
header("Last-Modified: ".$last_modified);
header("Content-Type: ".$file_data['file_upload_type']);
// this prevents caching...
// yea, lots of hair lost to this one...
//header("Content-Length: " . strlen($data));
header("Content-Transfer-Encoding: binary");
if($file_data['file_upload_type'] == 'application/x-shockwave-flash')
header("Content-Disposition: inline; filename=\"".str_replace(' ','_',$file_data['file_upload_name'])."\"");
else
header("Content-Disposition: attachment; filename=\"".str_replace(' ','_',$file_data['file_upload_name'])."\"");
// send data to output
echo $data;
exit;
}
Before the function is run, the output buffer is cleared with:
if(ob_get_length() > 0)
{
ob_clean();
}
Results:
The file downloads with the correct filesize; however, it is corrupted. Related question: https://stackoverflow.com/questions/19768650/zend-caching-of-images-gives-problems-once-the-site-goes-down-for-a-while
Response:
Request URL:http://www.example.com/index.php?module=uploads&sub_module=getfile&id=4982
Request Method:GET
Status Code:200 OK
Request Headers
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8
Cache-Control:no-cache
Connection:keep-alive
Cookie:banner89=yes; banner90=yes; PHPSESSID=vhlk92ihtcdmtv2q4vhjbmsv54; __utmz=45276912.1383308583.8.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); __utma=45276912.1890999283.1366697926.1383574554.1383631349.15; __utmc=45276912
DNT:1
Host:www.example.com
Pragma:no-cache
User-Agent:Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36
Query String Parameters
module:uploads
sub_module:getfile
id:4982
Response Headers
Cache-Control:max-age=86400, must-revalidate
Connection:keep-alive
Content-Disposition:attachment; filename="Masthead-Banner.gif"
Content-Encoding:gzip
Content-Length:5920
Content-Type:image/gif
Date:Tue, 05 Nov 2013 09:14:03 GMT
Expires:Thu, 19 Nov 1981 08:52:00 GMT
Last-Modified:Tue, 05 Nov 2013 08:27:55 GMT
Pragma:no-cache
Server:Apache
Vary:Accept-Encoding
X-Cache:MISS from firewall.
More investigation:
Opening the downloaded file and comparing it with the one on the server reveals that the header and footer of the file are the same, but a lot of the characters are different:
Real Thing:
%PDF-1.5
%âãÏÓ
165 0 obj
<</Linearized 1/L 100758/O 167/E 86272/N 4/T 100409/H [ 498 245]>>
endobj
181 0 obj
<</DecodeParms<</Columns 5/Predictor 12>>/Filter/FlateDecode/ID[<47941C1B25C34A4EA92EE88606328B32><09EC517E475E964EB1CBEF770BC3C54D>]/Index[165 33]/Info 164 0 R/Length 91/Prev 100410/Root 166 0 R/Size 198/Type/XRef/W[1 3 1]>>stream
hÞbbd```b``º"§I~É"Á²`Ĺ,«æ*H®(;D&L#ÿí¿Hþßø h XÊäF¯O ~O
7
endstream
endobj
startxref
0
%%EOF
197 0 obj
<</C 163/Filter/FlateDecode/I 185/Length 151/O 147/S 94>>stream
hÞb```¢vV3A Ç%êzů K ULT«Ú1Q5}ukGGGGFGG#Ã#>(f`dàg¬áú¨}À!óÆÄF ?O1x7°30ínÒ#ôHs00íÍa ;Sn$Óu¨*E »!S
endstream
endobj
166 0 obj
<</MarkInfo<</Marked true>>/Metadata 10 0 R/Outlines 14 0 R/PageLayout/OneColumn/Pages 163 0 R/StructTreeRoot 25 0 R/Type/Catalog>>
endobj
167 0 obj
<</Contents 171 0 R/CropBox[0.0 0.0 612.0 792.0]/MediaBox[0.0 0.0 612.0 792.0]/Parent 163 0 R/Resources<</ColorSpace<</CS0 182 0 R/CS1 183 0 R>>/Font<</C2_0 188 0 R/TT0 190 0 R/TT1 192 0 R/TT2 194 0 R/TT3 196 0 R>>>>/Rotate 0/StructParents 0/Type/Page>>
endobj
168 0 obj
<</Filter/FlateDecode/First 121/Length 1269/N 15/Type/ObjStm>>stream
Corrupted:
%PDF-1.5
%âãÏÓ
165 0 obj
<</Linearized 1/L 100758/O 167/E 86272/N 4/T 100409/H [ 498 245]>>
endobj
181 0 obj
<</DecodeParms<</Columns 5/Predictor 12>>/Filter/FlateDecode/ID[<47941C1B25C34A4EA92EE88606328B32><09EC517E475E964EB1CBEF770BC3C54D>]/Index[165 33]/Info 164 0 R/Length 91/Prev 100410/Root 166 0 R/Size 198/Type/XRef/W[1 3 1]>>stream
hÞbbd```b``º"§‚I~É"™Á²`Ĺ,«æ*ƒH®(™;D&L‘#’‘ÿˆí¿Hþßø™ h‹ X–‘ÊäF¯O ~O
7
endstream
endobj
startxref
0
%%EOF
197 0 obj
<</C 163/Filter/FlateDecode/I 185/Length 151/O 147/S 94>>stream
hÞb```¢vV3AŠ dž%êŒzŽÅ–¯ K ™ULT«Ú1Q5}ukGGGGƒFGG#Õ#>(f`dàg¬‘áú¨}Àœ!“‹óÆÄF †—?O1x7€•°30ínÒŒ#ôHs00íÍa ;•Sn$Óu¨*E€ »!S
endstream
endobj
166 0 obj
<</MarkInfo<</Marked true>>/Metadata 10 0 R/Outlines 14 0 R/PageLayout/OneColumn/Pages 163 0 R/StructTreeRoot 25 0 R/Type/Catalog>>
endobj
167 0 obj
<</Contents 171 0 R/CropBox[0.0 0.0 612.0 792.0]/MediaBox[0.0 0.0 612.0 792.0]/Parent 163 0 R/Resources<</ColorSpace<</CS0 182 0 R/CS1 183 0 R>>/Font<</C2_0 188 0 R/TT0 190 0 R/TT1 192 0 R/TT2 194 0 R/TT3 196 0 R>>>>/Rotate 0/StructParents 0/Type/Page>>
endobj
168 0 obj
<</Filter/FlateDecode/First 121/Length 1269/N 15/Type/ObjStm>>stream
More info:
PDFs are now downloading correctly... I fiddled around but never changed anything permanently (always Ctrl+Z).
Latest revelations:
A space is added to the top of the files, which corrupts the image. However, I don't know how to get rid of it programmatically.
It has to do with Windows and Unix line endings: the web server is Unix-based, but \r\n ends up where just \n should. Hence the images are rendered with a space at the beginning, which corrupts the entire file.
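Given that diagnosis, a defensive sketch (assuming the stray bytes come from output already sitting in PHP's buffers, e.g. whitespace or a BOM before <?php in an included file) is to discard every open output buffer, not just ob_clean() the topmost one, before sending the file:
// Discard all buffered output (stray whitespace, BOMs from includes)
// so the response body starts exactly at the blob's first byte.
while (ob_get_level() > 0) {
    ob_end_clean();
}
header_file($data, $file_data);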