This is an example script from a larger application, but it shows the general process of what I'm trying to do. If I have the following script:
<?php
ob_start();
setcookie('test1', 'first');
setcookie('test1', 'second');
setcookie('test1', 'third');
setcookie('test2', 'keep');
//TODO remove duplicate test1 from headers
ob_end_clean();
die('end test');
I get the following response (as viewed via Fiddler):
HTTP/1.1 200 OK
Date: Tue, 25 Apr 2017 21:54:45 GMT
Server: Apache/2.4.17 (Win32) OpenSSL/1.0.2d PHP/5.5.30
X-Powered-By: PHP/5.5.30
Set-Cookie: test1=first
Set-Cookie: test1=second
Set-Cookie: test1=third
Set-Cookie: test2=keep
Content-Length: 8
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html
end test
The problem is that Set-Cookie: test1... appears three times, which increases the header size unnecessarily. (Again, this is a simplified example -
in reality, I'm dealing with ~10 duplicate cookies in the ~800-byte range.)
Is there anything I can write in place of the TODO that would remove the duplicate headers, either completely or so that each cookie only appears once? I.e. the following is my end goal:
HTTP/1.1 200 OK
Date: Tue, 25 Apr 2017 21:54:45 GMT
Server: Apache/2.4.17 (Win32) OpenSSL/1.0.2d PHP/5.5.30
X-Powered-By: PHP/5.5.30
Set-Cookie: test1=third
Set-Cookie: test2=keep
Content-Length: 8
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html
end test
It would also be fine if Set-Cookie: test1=third did not exist at all, but Set-Cookie: test2=keep needs to remain. When I try setcookie('test1', '', 1); to delete the cookie, it just adds an additional header marking it as expired:
Set-Cookie: test1=first
Set-Cookie: test1=second
Set-Cookie: test1=third
Set-Cookie: test2=keep
Set-Cookie: test1=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0
And if I try removing the header like:
if (!headers_sent()) {
    foreach (headers_list() as $header) {
        if (stripos($header, 'Set-Cookie: test1') !== false) {
            header_remove('Set-Cookie');
        }
    }
}
it removes all Set-Cookie headers when I only want test1 removed.
As you suggested in that last block of code, the headers_list() function can be used to inspect which headers are queued to be sent. Using that, the last value for each cookie can be stored in an associative array; the names and values can be extracted using explode() (along with trim()).
When multiple cookies with the same name have been detected, we can use the header_remove() call like you had, but then re-set each cookie to its final value. See the example below.
if (!headers_sent()) {
    $cookiesSet = [];        // associative array storing the last value seen for each cookie
    $rectifyCookies = false; // set to true when multiple values are detected for the same cookie name
    foreach (headers_list() as $header) {
        if (stripos($header, 'Set-Cookie:') === 0) {
            list($setCookie, $cookieValue) = explode(':', $header, 2);
            list($cookieName, $cookieValue) = explode('=', trim($cookieValue), 2);
            if (array_key_exists($cookieName, $cookiesSet)) {
                $rectifyCookies = true;
            }
            $cookiesSet[$cookieName] = urldecode($cookieValue); // decode so setcookie() doesn't re-encode it
        }
    }
    if ($rectifyCookies) {
        header_remove('Set-Cookie');
        foreach ($cookiesSet as $cookieName => $cookieValue) {
            // might need to consider the optional 3rd - 8th parameters
            setcookie($cookieName, $cookieValue);
        }
    }
}
Output:
Cache-Control max-age=0, no-cache, no-store, must-revalidate
Connection keep-alive
Content-Encoding gzip
Content-Type text/html; charset=utf-8
Date Wed, 26 Apr 2017 15:31:33 GMT
Expires Wed, 11 Jan 1984 05:00:00 GMT
Pragma no-cache
Server nginx
Set-Cookie test1=third
test2=keep
Transfer-Encoding chunked
Vary Accept-Encoding
I don't understand why you think that the cookie-removing code you showed us would remove the setcookie for test2.
If your code is setting the same cookie multiple times, then you need to change your code so it stops setting the cookie multiple times! Anything else is a sloppy workaround.
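For illustration, one way to do that (a sketch with hypothetical helper names, not code from the question) is to collect the desired cookie values in an array while the page runs and emit each one exactly once:
<?php
// Sketch: queue cookie values and send each one only once.
$pendingCookies = [];

function queue_cookie($name, $value) {
    global $pendingCookies;
    $pendingCookies[$name] = $value;   // later calls simply overwrite earlier ones
}

function flush_cookies() {
    global $pendingCookies;
    foreach ($pendingCookies as $name => $value) {
        setcookie($name, $value);      // add path/domain/secure arguments as needed
    }
}

queue_cookie('test1', 'first');
queue_cookie('test1', 'third');        // overwrites the value, no duplicate header
queue_cookie('test2', 'keep');
flush_cookies();                       // emits Set-Cookie: test1=third and Set-Cookie: test2=keep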
Related
I have to get the URL and image name from a returned Facebook API response. I have the response results and have tried to get the image URL and image name with the following. Please help me get the location URL and the image name.
preg_match('/Location: (.*?)\n/', $header, $matches);
output:
HTTP/2 302
x-app-usage: {"call_count":16,"total_cputime":0,"total_time":4}
x-fb-rlafr: 0
location: https://xxxxx.net/v/cccc/cccc/130282202_3518020318246580_4104659942029629494_o.jpg?_nc_cat=104&ccb=2&_nc_sid=9e2e56&_nc_ohc=pErMyD3PYFkAX8b7JiO&_nc_ht=scontent-ort2-1.xx&tp=6&oh=db3843917c53f747c3c3f860ca9144d1&oe=6040C6ED
expires: Sat, 01 Jan 2000 00:00:00 GMT
x-fb-request-id: dddddd
strict-transport-security: max-age=15552000; preload
x-fb-trace-id: dddddd
facebook-api-version: v3.2
content-type: image/jpeg
x-fb-rev: 1003270116
cache-control: private, no-cache, no-store, must-revalidate
pragma: no-cache
access-control-allow-origin: *
x-fb-debug: cvvvvvvvvvvvvvvvvvvvvvvvvvvv
content-length: 0
date: Fri, 05 Feb 2021 06:41:05 GMT
alt-svc: h3-29=":443"; ma=3600,h3-27=":443"; ma=3600
$img_array[$key]['url'] = trim(substr($matches['0'],10)); // to get the location url
// print_r($img_array[$key]['url']);
$img_array[$key]['name'] = substr($b['name'],0,-16); // to get the image name
The response is HTTP/2, where header names are sent in lowercase, so the pattern has to use the lowercase header name:
preg_match('/location: (.*?)\n/', $header, $matches);
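A slightly more robust sketch (the i and m modifiers, parse_url(), and basename() are illustrative additions, not part of the original answer) that matches either casing and also extracts the file name:
if (preg_match('/^location:\s*(\S+)/mi', $header, $matches)) {
    $img_array[$key]['url'] = $matches[1];                        // the redirect target
    $path = parse_url($matches[1], PHP_URL_PATH);                 // strip the query string
    $img_array[$key]['name'] = basename($path);                   // e.g. the ..._o.jpg file name
}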
I have a PHP Symfony application which is served by nginx.
* << BeReq >> 492062
- Begin bereq 492061 fetch
- Timestamp Start: 1572337898.474535 0.000000 0.000000
- BereqMethod GET
- BereqURL /
- BereqProtocol HTTP/1.0
- BereqHeader Host: xxx
- BereqHeader X-Forwarded-Host: xxx
- BereqHeader X-Real-IP: xxx
- BereqHeader X-Forwarded-Proto: https
- BereqHeader HTTPS: on
- BereqHeader User-Agent: Wget/1.19.4 (linux-gnu)
- BereqHeader Accept: */*
- BereqHeader X-Forwarded-For: 127.0.0.1
- BereqProtocol HTTP/1.1
- BereqHeader Accept-Encoding: gzip
- BereqHeader X-Varnish: 492062
- VCL_call BACKEND_FETCH
- VCL_return fetch
- BackendOpen 26 boot.default 127.0.0.1 8080 127.0.0.1 43676
- BackendStart 127.0.0.1 8080
- Timestamp Bereq: 1572337898.474685 0.000150 0.000150
- Timestamp Beresp: 1572337903.642006 5.167471 5.167321
- BerespProtocol HTTP/1.1
- BerespStatus 200
- BerespReason OK
- BerespHeader Server: nginx/1.14.0 (Ubuntu)
- BerespHeader Content-Type: text/html; charset=UTF-8
- BerespHeader Transfer-Encoding: chunked
- BerespHeader Connection: keep-alive
- BerespHeader Set-Cookie: PHPSESSID=slaurqvo3msh9uklerbht0nd2h; path=/; domain=.xxx; HttpOnly
- BerespHeader Cache-Control: max-age=3600, public
- BerespHeader Date: Tue, 29 Oct 2019 08:31:39 GMT
- BerespHeader Age: 20
- BerespHeader Content-Encoding: gzip
- TTL RFC 3600 10 0 1572337904 1572337884 1572337899 0 3600
- VCL_call BACKEND_RESPONSE
- TTL VCL 86420 10 0 1572337884
- TTL VCL 86420 3600 0 1572337884
- TTL VCL 140 3600 0 1572337884
- VCL_return deliver
- BerespHeader Vary: Accept-Encoding
- Storage malloc Transient
- ObjProtocol HTTP/1.1
- ObjStatus 200
- ObjReason OK
- ObjHeader Server: nginx/1.14.0 (Ubuntu)
- ObjHeader Content-Type: text/html; charset=UTF-8
- ObjHeader Set-Cookie: PHPSESSID=slaurqvo3msh9uklerbht0nd2h; path=/; domain=.xxx; HttpOnly
- ObjHeader Cache-Control: max-age=3600, public
- ObjHeader Date: Tue, 29 Oct 2019 08:31:39 GMT
- ObjHeader Content-Encoding: gzip
- ObjHeader Vary: Accept-Encoding
- Fetch_Body 2 chunked stream
- Gzip u F - 24261 118266 80 80 194017
- BackendReuse 26 boot.default
- Timestamp BerespBody: 1572337903.644744 5.170209 0.002738
- Length 24261
- BereqAcct 275 0 275 342 24261 24603
- End
* << Request >> 492061
- Begin req 492060 rxreq
- Timestamp Start: 1572337898.474380 0.000000 0.000000
- Timestamp Req: 1572337898.474380 0.000000 0.000000
- ReqStart 127.0.0.1 57354
- ReqMethod GET
- ReqURL /
- ReqProtocol HTTP/1.0
- ReqHeader Host: xxx
- ReqHeader X-Forwarded-Host: xxx
- ReqHeader X-Real-IP: xxx
- ReqHeader X-Forwarded-For: xxx
- ReqHeader X-Forwarded-Proto: https
- ReqHeader HTTPS: on
- ReqHeader Cache-Control: max-age=15000
- ReqHeader Connection: close
- ReqHeader User-Agent: Wget/1.19.4 (linux-gnu)
- ReqHeader Accept: */*
- ReqHeader Accept-Encoding: identity
- ReqUnset X-Forwarded-For: xxx
- ReqHeader X-Forwarded-For: xxx, 127.0.0.1
- VCL_call RECV
- ReqUnset X-Forwarded-For: xxx, 127.0.0.1
- ReqHeader X-Forwarded-For: 127.0.0.1
- VCL_return hash
- ReqUnset Accept-Encoding: identity
- VCL_call HASH
- VCL_return lookup
- HitMiss 492059 104.991348
- VCL_call MISS
- VCL_return fetch
- Link bereq 492062 fetch
- Timestamp Fetch: 1572337903.643494 5.169113 5.169113
- RespProtocol HTTP/1.1
- RespStatus 200
- RespReason OK
- RespHeader Server: nginx/1.14.0 (Ubuntu)
- RespHeader Content-Type: text/html; charset=UTF-8
- RespHeader Set-Cookie: PHPSESSID=slaurqvo3msh9uklerbht0nd2h; path=/; domain=.xxx; HttpOnly
- RespHeader Cache-Control: max-age=3600, public
- RespHeader Date: Tue, 29 Oct 2019 08:31:39 GMT
- RespHeader Content-Encoding: gzip
- RespHeader Vary: Accept-Encoding
- RespHeader X-Varnish: 492061
- RespHeader Age: 20
- RespHeader Via: 1.1 varnish (Varnish/5.2)
- VCL_call DELIVER
- VCL_return deliver
- Timestamp Process: 1572337903.643520 5.169140 0.000026
- RespUnset Content-Encoding: gzip
- RespHeader Accept-Ranges: bytes
- RespHeader Connection: close
- Gzip U D - 24261 118266 80 80 194017
- Timestamp Resp: 1572337903.645192 5.170812 0.001672
- ReqAcct 313 0 313 381 118266 118647
- End
* << Session >> 492060
- Begin sess 0 HTTP/1
- SessOpen 127.0.0.1 57354 a0 127.0.0.1 80 1572337898.474305 24
- Link req 492061 rxreq
- SessClose TX_EOF 5.171
- End
Somehow, Varnish seems to be caching the website:
- VCL_call DELIVER
- VCL_return deliver
VCL Configuration:
vcl 4.0;

# Default backend definition. Set this to point to your content server.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    // Remove all cookies except the session ID.
    if (req.http.Cookie) {
        set req.http.Cookie = ";" + req.http.Cookie;
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        set req.http.Cookie = regsuball(req.http.Cookie, ";(PHPSESSID)=", "; \1=");
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
        if (req.http.Cookie == "") {
            // If there are no more cookies, remove the header to get page cached.
            unset req.http.Cookie;
        }
    }
}
sub vcl_backend_response {
    # Happens after we have read the response headers from the backend.
    #
    # Here you clean the response headers, removing silly Set-Cookie headers
    # and other mistakes your backend does.
    if (beresp.http.Surrogate-Control ~ "ESI/1.0") {
        unset beresp.http.Surrogate-Control;
        set beresp.do_esi = true;
    }
}

sub vcl_deliver {
    # Insert Diagnostic header to show Hit or Miss
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
        set resp.http.X-Cache-Hits = obj.hits;
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
What is wrong here?
You shall not cache pages with a Set-Cookie header!
Also, you are quoting the wrong lines to determine whether the response was served from the cache or not; the relevant line is:
VCL_call MISS
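One way to address this on the application side (a sketch in plain PHP rather than Symfony-specific code; the conditional session start is an assumption about how the app could be restructured) is to avoid emitting a Set-Cookie header at all for anonymous, cacheable requests:
<?php
// Sketch: only resume a session if the visitor already has one, so anonymous
// page views go out without a Set-Cookie header and Varnish can cache them.
if (isset($_COOKIE[session_name()])) {
    session_start();
}
header('Cache-Control: public, max-age=3600');
echo 'cacheable page content';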
I've read tons of cURL tutorials (I'm using PHP) and there's always the same basic code, which doesn't work for me! No specific errors, just no result.
I want to make an HTTP request to Wikipedia and get the result in JSON format.
Here's the code:
$handle = curl_init();
$url = "http://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json";
curl_setopt_array($handle, array(
    CURLOPT_URL => $url,
    CURLOPT_RETURNTRANSFER => true
));
$output = curl_exec($handle);
if (!$output) {
    exit('cURL Error: ' . curl_error($handle));
}
$result = json_decode($output, true);
print_r($result);
curl_close($handle);
I would like to know what I'm doing wrong.
Your code is correct, but it seems Wikipedia doesn't send back the data when using PHP cURL (maybe some headers or other parameters must be set for it to work).
If all you need is to retrieve some data though, you can simply use file_get_contents which works fine:
$output = file_get_contents("http://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json");
echo $output;
Edit:
Just for information, I found what the issue is. When running curl -v on that URL, the following comes up:
* Trying 91.198.174.192...
* Connected to fr.wikipedia.org (91.198.174.192) port 80 (#0)
> GET /w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json HTTP/1.1
> Host: fr.wikipedia.org
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Date: Wed, 17 May 2017 13:54:31 GMT
< Server: Varnish
< X-Varnish: 852298595
< X-Cache: cp3031 int
< X-Cache-Status: int
< Set-Cookie: WMF-Last-Access=17-May-2017;Path=/;HttpOnly;secure;Expires=Sun, 18 Jun 2017 12:00:00 GMT
< Set-Cookie: WMF-Last-Access-Global=17-May-2017;Path=/;Domain=.wikipedia.org;HttpOnly;secure;Expires=Sun, 18 Jun 2017 12:00:00 GMT
< X-Client-IP: 86.214.172.57
< Location: https://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json
< Content-Length: 0
< Connection: keep-alive
<
* Connection #0 to host fr.wikipedia.org left intact
So what's happening is that the actual content is at the https URL, not http, so requesting https://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json should work directly.
The reason it works with file_get_contents is that, in that case, the redirection is followed automatically.
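Alternatively (a sketch, not part of the original answer), cURL can be told to follow the 301 redirect itself by adding CURLOPT_FOLLOWLOCATION to the options from the question:
curl_setopt_array($handle, array(
    CURLOPT_URL            => $url,   // the original http:// URL
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,   // follow the 301 to the https:// location
    CURLOPT_MAXREDIRS      => 3       // safety limit on redirects
));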
I'm having issues parsing a CSV file in PHP, using fopen() to read API data.
My code works when I use a URL that displays the CSV file in the browser, as stated in 1) below, but I get random characters output from a URL that ends in format=csv, as seen in 2) below.
1) Working URL: Returned expected values
https://www.kimonolabs.com/api/csv/duo2mkw2?apikey=yjEl780lSQ8IcVHkItiHzzUZxd1wqSJv
2) Not Working URL: Returns random characters
https://www.parsehub.com/api/v2/projects/tM9MwgKrh0c4b81WDT_4FkaC/last_ready_run/data?api_key=tD3djFMGmyWmDUdcgmBVFCd3&format=csv
Here is my code, using URL (2) above:
<?php
$f_pointer = fopen("https://www.parsehub.com/api/v2/projects/tM9MwgKrh0c4b81WDT_4FkaC/last_ready_run/data?api_key=tD3djFMGmyWmDUdcgmBVFCd3&format=csv", "r");
while (!feof($f_pointer)) {
    $ar = fgetcsv($f_pointer);
    echo $ar[1];
    echo "<br>";
}
?>
Output for the URL mentioned in (2) above:
root#MorryServer:/# php testing.php
?IU?Q?JL?.?/Q?R??/)?J-.?))VH?/OM?K-NI?T0?P?*ͩT0204jzԴ?H???X???# D??K
Correct output if I use the URL type as stated in (1):
root#MorryServer:/# php testing.php
PHP Notice: Undefined offset: 1 in /testing.php on line 24
jackpot€2,893,210
This is an encoding problem.
The given file contains UTF-8 characters. These are read by the fgetcsv function, which is binary safe. Line endings are in Unix format ("\n").
The output on the terminal is scrambled. Looking at the headers sent, we see:
GET https://www.parsehub.com/api/v2/projects/tM9MwgKrh0c4b81WDT_4FkaC/last_ready_run/data?api_key=tD3djFMGmyWmDUdcgmBVFCd3&format=csv --> 200 OK
Connection: close
Date: Sat, 11 Jul 2015 13:15:24 GMT
Server: nginx/1.6.2
Content-Encoding: gzip
Content-Length: 123
Content-Type: text/csv; charset=UTF-8
Last-Modified: Fri, 10 Jul 2015 11:43:49 GMT
Client-Date: Sat, 11 Jul 2015 13:15:23 GMT
Client-Peer: 107.170.197.156:443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=COMODO RSA Domain Validation Secure Server CA
Client-SSL-Cert-Subject: /OU=Domain Control Validated/OU=PositiveSSL/CN=www.parsehub.com
Mind the Content-Encoding: gzip: fgetcsv working on a URL obviously doesn't handle gzip encoding. The scrambled string is just the gzipped content of the "file".
Look at PHP's zlib functions to decompress that before parsing it.
Proof:
srv:~ # lwp-download 'https://www.parsehub.com/api/v2/projects/tM9MwgKrh0c4b81WDT_4FkaC/last_ready_run/data?api_key=tD3djFMGmyWmDUdcgmBVFCd3&format=csv' data
123 bytes received
srv:~ # file data
data: gzip compressed data, was "tcW80-EcI6Oj2TYPXI-47XwK.csv", from Unix, last modified: Fri Jul 10 11:43:48 2015, max compression
srv:~ # gzip -d < data
"title","jackpot"
"Lotto Results for Wednesday 08 July 2015","€2,893,210"
To get the proper output, minimal changes are needed: just add a stream wrapper:
<?php
$f_pointer = fopen("compress.zlib://https://www.parsehub.com/api/v2/projects/tM9MwgKrh0c4b81WDT_4FkaC/last_ready_run/data?api_key=tD3djFMGmyWmDUdcgmBVFCd3&format=csv", "r");
if ($f_pointer === false) {
    die("invalid URL");
}
$ar = array();
while (!feof($f_pointer)) {
    $ar[] = fgetcsv($f_pointer);
}
print_r($ar);
?>
Outputs:
Array
(
[0] => Array
(
[0] => title
[1] => jackpot
)
[1] => Array
(
[0] => Lotto Results for Wednesday 08 July 2015
[1] => €2,893,210
)
)
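As an alternative to the stream wrapper (a sketch not in the original answer, assuming the whole body fits in memory), the response could also be fetched in one go and decompressed with gzdecode() before parsing:
$raw  = file_get_contents("https://www.parsehub.com/api/v2/projects/tM9MwgKrh0c4b81WDT_4FkaC/last_ready_run/data?api_key=tD3djFMGmyWmDUdcgmBVFCd3&format=csv");
$csv  = gzdecode($raw);                                      // inflate the gzip-compressed body
$rows = array_map('str_getcsv', explode("\n", trim($csv))); // parse each CSV line
print_r($rows);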
$string = "Response 22: 404 (8345ms), headers: Accept-Ranges=bytes,
Cache-Control=no-cache, no-store, private, Connection=close,
Content-Encoding=gzip, Content-Language=it-it, Content-Length=1674,
Content-Location=index.html.it-it, Content-Type=text/html;
charset=utf-8, Date=Wed, 24 Sep 2014 19:01:30 GMT,
ETag='eb1-50331586750c0;503ac178f62dd', Last-Modified=Tue, 16 Sep 2014
16:35:55 GMT, Server=Apache,
Strict-Transport-Security=max-age=31536000; includeSubDomains,
TCN=choice, Vary=negotiate,accept,accept-language,Accept-Encoding,
X-Frame-Options=SAMEORIGIN, X-UA-Compatible=IE=Edge";
Here I want to grab the response number (=> 22), the response code (=> 404), and its milliseconds (=> 8345ms).
I think I have to use a regex, but I am new to that. Can you please give me any suggestions?
Response\s*(\d+):\s*(\d+)\s*\((\S+)?\)
Try this and get the three groups. See the demo:
http://regex101.com/r/qC9cH4/3
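A quick sketch of how that pattern could be applied in PHP to the $string from the question (the variable names are just for illustration):
if (preg_match('/Response\s*(\d+):\s*(\d+)\s*\((\S+)?\)/', $string, $m)) {
    $responseNumber = $m[1]; // "22"
    $responseCode   = $m[2]; // "404"
    $milliseconds   = $m[3]; // "8345ms"
}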