Goutte/Guzzle: not getting file download after form submission - PHP

I am using Goutte to click a button in a form and get the download link for a file. I am correctly logged in to the website.
The code I am using has already worked on other parts of the site and gave me the right results up to this point, but in this case I'm not sure it's enough:
$form = $crawler->selectButton('btn_name')->form();
$tableResults = $this->client->submit($form);
$result = $this->client->getResponse()->getContent(); // contains the same page as the form
The button sets some parameters and sends the request via AJAX (the page does not reload after the POST):
<button id="btn_id_was_long" name="btn_name" onclick="PrimeFaces.ab({source:'btn_name'});return false;" type="submit"><span>DAE</span></button>
When I inspect the content of the response, though, I get the HTML page that contains the form, but there is no trace of an attachment.
When the button is clicked with the above code, I get the following response headers:
Date: Tue, 08 Nov 2016 16:37:42 GMT
Server: Apache/2.2.29 (Unix) mod_ssl/2.2.29 OpenSSL/1.0.1e-fips DAV/2 mod_jk/1.2.31
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1
Liferay-Portal: Liferay Portal Enterprise Edition 6.2.10 EE GA1 (Newton / Build 6210 / November 1, 2013)
X-JAVAX-PORTLET-FACES-NAMESPACED-RESPONSE: true
X-Powered-By: JSP/2.2
Transfer-Encoding: chunked
Content-Type: text/html;charset=UTF-8
When I perform the same action in the browser I get this instead, with the file download:
HTTP/1.1 200 OK
Date: Tue, 08 Nov 2016 17:31:34 GMT
Server: Apache/2.2.29 (Unix) mod_ssl/2.2.29 OpenSSL/1.0.1e-fips DAV/2 mod_jk/1.2.31
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
X-XSS-Protection: 1
Content-Encoding: gzip
Liferay-Portal: Liferay Portal Enterprise Edition 6.2.10 EE GA1 (Newton / Build 6210 / November 1, 2013)
Content-Disposition: attachment;filename=DAE_00001_17/10/2016.pdf
Expires: 0
Pragma: public
Cache-Control: must-revalidate, post-check=0, pre-check=0
portlet.http-status-code: 200
Keep-Alive: timeout=5, max=99
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: application/pdf;charset=UTF-8
Any clue what I'm doing wrong?
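The onclick shows the button posts through PrimeFaces.ab(), i.e. a JSF partial (AJAX) request, while Goutte's submit() sends a plain form post without the partial-request parameters, so the server may never run the download action. A minimal sketch of replicating the AJAX post by hand, assuming the portlet follows the standard JSF/PrimeFaces conventions (the real client IDs and any javax.faces.ViewState value must be taken from the page source):

$form = $crawler->selectButton('btn_name')->form();

// Merge the standard JSF partial-request fields into the form values
// (assumption: this portlet follows the usual PrimeFaces conventions).
$values = array_merge($form->getPhpValues(), array(
    'javax.faces.partial.ajax' => 'true',
    'javax.faces.source'       => 'btn_name',
    'btn_name'                 => 'btn_name', // the clicked button itself
));

// Send the header JSF uses to detect AJAX requests.
$this->client->request('POST', $form->getUri(), $values, array(),
    array('HTTP_FACES_REQUEST' => 'partial/ajax'));

$result = $this->client->getResponse()->getContent();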

Related

In PHP, while adding events to iCalendar through the CalDAV protocol, received unknown HTTP status

I am working on a project that inserts events into iCalendar (iPhone) through the CalDAV protocol using PHP.
On localhost the code works fine, but when I add the same code to the server I receive an unknown HTTP status.
I tried using dataType: "text/plain" and I also tried contentType, but it is still not fixed. I removed the AJAX function and linked directly to the file, but it still shows the same error: sometimes HTTP/1.1 500 Internal Server Error and sometimes HTTP/1.1 415 Unsupported Media Type.
last request:
PUT /rpc/calendars/mediaj11/calendar~722ea7444446*******/.ics HTTP/1.1
Host: mail.mediajenie.com:2080
Authorization: Basic **********
User-Agent: cURL based CalDAV client
Accept: */*
Content-type: text/calendar; encoding="utf-8"
Depth: 1
Content-Length: 556
last response:
HTTP/1.1 500 Internal Server Error
Date: Fri, 28 Jun 2019 10:10:48 GMT
Server: cPanel
Persistent-Auth: false
Host: mail.mediajenie.com:2080
Cache-Control: no-cache, no-store, must-revalidate, private
Connection: Keep-Alive
Pragma: no-cache
Vary: Accept-Encoding
Content-Length: 3011
Content-Type: text/html; charset=UTF-8
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Set-Cookie: PHPSESSID=5e8045144d7823ac82049d0c7ad40247; path=/
Set-Cookie: horde_secret_key=5e8045144d7823ac82049d0c7ad40247; path=/; domain=mail.mediajenie.com; HttpOnly
Set-Cookie: default_horde_view=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=mail.mediajenie.com
X-Powered-By: PHP/7.2.7
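One thing that may be worth checking: the request sends Content-type: text/calendar; encoding="utf-8", but the parameter defined for the text/calendar media type is charset, and a server that does not recognize the media type can answer 415 Unsupported Media Type. A minimal cURL sketch of the PUT (the URL, credentials, and event data below are placeholders, not your real values):

$ics = "BEGIN:VCALENDAR\r\nVERSION:2.0\r\nPRODID:-//example//caldav test//EN\r\n"
     . "BEGIN:VEVENT\r\nUID:test-123@example.com\r\nDTSTAMP:20190628T100000Z\r\n"
     . "DTSTART:20190701T090000Z\r\nSUMMARY:Test event\r\nEND:VEVENT\r\nEND:VCALENDAR\r\n";

$ch = curl_init('https://mail.example.com:2080/rpc/calendars/user/calendar/event.ics'); // placeholder URL
curl_setopt_array($ch, array(
    CURLOPT_CUSTOMREQUEST  => 'PUT',
    CURLOPT_USERPWD        => 'user:password', // placeholder credentials
    CURLOPT_POSTFIELDS     => $ics,
    CURLOPT_HTTPHEADER     => array('Content-Type: text/calendar; charset=utf-8'),
    CURLOPT_RETURNTRANSFER => true,
));
$body = curl_exec($ch);
echo curl_getinfo($ch, CURLINFO_HTTP_CODE); // 201 (created) or 204 (overwritten) means the PUT was accepted
curl_close($ch);

The 500 response carrying a Horde session cookie suggests the request is at least reaching the groupware stack, so the server logs for that request would also be worth a look.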

NGINX page shows raw content instead of HTML

I am using NGINX 1.12 with PHP-FPM on an AWS server.
I am hitting a rare bug: sometimes my page shows the HTML content along with the response headers, prefixed and suffixed with a 0:
0
HTTP/1.1 200 OK
Cache-Control: no-store, no-cache, must-revalidate
Content-Type: text/html; charset=utf-8
Date: Mon, 26 Nov 2018 05:29:29 GMT
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Pragma: no-cache
Server: nginx
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
transfer-encoding: chunked
Connection: keep-alive
html content
0
When I refresh, immediately or later, the proper page is shown.
Can anyone suggest what the issue might be?
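For what it's worth, those stray 0 lines look like raw chunked transfer-encoding framing: in a chunked response each chunk is prefixed with its size in hex, and the body is terminated by a zero-length chunk, roughly like this on the wire:

HTTP/1.1 200 OK
Transfer-Encoding: chunked

1a4
...0x1a4 bytes of html content...
0

Seeing the status line, the headers, and that framing rendered as page content suggests the client and server have lost sync on where one response ends and the next begins (for example on a reused keep-alive connection), so a raw HTTP message is being treated as the body. This is a guess from the symptom, not a confirmed diagnosis.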

How to fetch redirected URL using python? (CURLOPT_FOLLOWLOCATION not working)

I'm working on crawling information from a website: http://www.fatwallet.com
There are many redirected URLs. For instance: http://www.fatwallet.com/ticket/store/A4C?s=storepage
is redirected to http://www.a4c.com/?siteID=.7WaaTN6umc-s1Ih0x_Q67n6r7gInoh6Ug
I would like to use PHP to find out the redirected URL.
I have used curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true). I know it will automatically follow up to 5 redirects.
However, the problem is that the page I get is not the final page; instead, it's a page in between.
curl_exec returns:
HTTP/1.1 302 Moved Temporarily
Server: Apache
Location: www.fatwallet.com/interstitial/signin
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 20
Content-Type: text/html
Date: Mon, 13 Apr 2015 12:03:19 GMT
Connection: keep-alive
Set-Cookie: JSESSIONID=A9E28337052B56ADAC8451854A276210; Path=/; HttpOnly

HTTP/1.1 302 Moved Temporarily
Server: Apache
Location: www.fatwallet.com/interstitial/signin
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 20
Content-Type: text/html
Date: Mon, 13 Apr 2015 12:03:19 GMT
Connection: keep-alive

HTTP/1.1 200 OK
Server: Apache
Cache-Control: no-cache,no-store,max-age=0
Expires: Wed, 31 Dec 1969 23:59:59 GMT
X-UA-Compatible: IE=edge,chrome=1
Vary: User-Agent,Accept-Encoding
Content-Language: en
Content-Encoding: gzip
Content-Type: text/html;charset=UTF-8
Content-Length: 16949
Date: Mon, 13 Apr 2015 12:03:20 GMT
Connection: keep-alive
Set-Cookie: list_styles=grid; Expires=Sat, 01-May-2083 15:17:27 GMT; Path=/
Set-Cookie: non_mem=f86c0692-826f-40f2-9fa1-1e2f9a957af8; Expires=Sat, 01-May-2083 15:17:27 GMT; Path=/
............
It seems the third response is HTTP/1.1 200 OK, but it is not the final page. If you check http://www.fatwallet.com/ticket/store/A4C?s=storepage you will see what I mean. Also, there is no way to find the final URL in the page returned.
So my question is: is it possible to make curl continue redirecting even after it receives HTTP/1.1 200 OK?
Is there another way to solve this (using Snoopy or Python)?
Thanks, all!
It seems the last redirect is done via JS, not a native HTTP redirect. You need a more advanced crawler with the ability to execute JS code.
Just look at the source code of the first redirected page (view-source:https://www.fatwallet.com/interstitial/signin) and you will find the final URL in some HTML elements; some JS code reads those values and performs the last redirect.
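Building on that, one way to stay in PHP is to let curl follow the HTTP-level redirects and then scrape the final target out of the interstitial HTML. A minimal sketch; the regex is a guess and has to be adapted to the actual markup of the signin page:

$ch = curl_init('http://www.fatwallet.com/ticket/store/A4C?s=storepage');
curl_setopt_array($ch, array(
    CURLOPT_FOLLOWLOCATION => true, // follow the HTTP-level 302s
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING       => '',   // let curl decode the gzip body transparently
));
$html = curl_exec($ch);
curl_close($ch);

// Pull the JS/meta redirect target out of the HTML (pattern is illustrative).
if (preg_match('/(?:window\.location(?:\.href)?\s*=|url=)\s*["\']?([^"\'>\s]+)/i', $html, $m)) {
    echo 'Final URL: ' . $m[1];
}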

Chrome totally ignoring Access-Control-Allow-Origin: * header

I am setting this with .htaccess. I know it's being set properly, because if I set another header:
Header set Access-Control-Allow-Origin2: *
Then Chrome does see it. As soon as I remove the 2, however, Chrome completely ignores it. If I make my file a PHP file and put this in it:
<?php header("Access-Control-Allow-Origin: *"); ?>
Then it works.
Here are the response headers as reported by Chrome for the .htaccess method, which I need to work and which does not:
HTTP/1.1 304 Not Modified
Date: Sun, 30 Mar 2014 00:13:06 GMT
Server: Apache/2.2.22 (Ubuntu)
Connection: Keep-Alive
Keep-Alive: timeout=5, max=100
ETag: "208f3-178a2-4f5c4f119cd34"
Vary: Accept-Encoding
Here are the response headers as reported by Chrome for the PHP method, which for some reason does work:
HTTP/1.1 200 OK
Date: Sun, 30 Mar 2014 00:13:09 GMT
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3.10
Access-Control-Allow-Origin: *
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 23
Keep-Alive: timeout=5, max=99
Connection: Keep-Alive
Content-Type: text/html
Again, I know the .htaccess is setting the header: even if I go to an online service that checks response headers, I see this back:
HTTP/1.1 200 OK
Date: Sun, 30 Mar 2014 00:18:14 GMT
Server: Apache/2.2.22 (Ubuntu)
Last-Modified: Sat, 29 Mar 2014 20:48:34 GMT
ETag: "208f3-178a2-4f5c4f119cd34"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Access-Control-Allow-Origin: *
Content-Length: 33393
Content-Type: application/javascript
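Worth noting: the failing response is a 304 Not Modified. With mod_headers, Header set operates on the default onsuccess table, which is only applied to 2xx responses, so a 304 served from cache revalidation can go out without the header even though the directive is working. A commonly suggested workaround, sketched here under the assumption that mod_headers is enabled, is to use the always table:

<IfModule mod_headers.c>
    # "always" also applies to non-2xx responses such as 304 Not Modified
    Header always set Access-Control-Allow-Origin "*"
</IfModule>

This would also explain why the PHP version appears to work: that request returned a fresh 200 rather than a cache revalidation.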

HTTP Headers difference - load page incrementally

I have an HTML page that shows a progress bar as it steps through a process. It uses flush() to send the data to the browser. I'm trying to get this to work in a Zend process, which I'm short-circuiting by specifically sending a header and content, then ending the process with an exit command.
The plain HTML page displays correctly (the progress bar steps through to done). The Zend/PHP page only shows the finished page (not the steps). I'm assuming this is a header problem, since the method (flush()) is identical.
In Chrome, the header for the HTML page comes up as:
HTTP/1.1 200 OK
Date: Fri, 27 Jul 2012 14:38:07 GMT
Server: Apache/2.2.16 (Unix) mod_ssl/2.2.16 OpenSSL/0.9.8r DAV/2 PHP/5.3.2
X-Powered-By: PHP/5.3.2
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
And the header for the Zend/PHP page comes up as:
HTTP/1.1 200 OK
Date: Fri, 27 Jul 2012 14:44:13 GMT
Server: Apache/2.2.16 (Unix) mod_ssl/2.2.16 OpenSSL/0.9.8r DAV/2 PHP/5.3.2
X-Powered-By: PHP/5.3.2
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-cache
Pragma: no-cache
Keep-Alive: timeout=5, max=98
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
The only header information I'm specifying in the PHP is:
header('Content-Type: text/html; charset=utf-8');
I'm using the code from this page: http://w3shaman.com/article/php-progress-bar-script
Any help would be appreciated. Thanks.
Call ob_flush() before you call flush(), as Zend could have output buffering activated.
Mathieu had the fix. Adding ob_flush() before flush() in the Zend/PHP page fixed the problem. I'm not sure if Zend is activating output buffering as suggested or not.
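For reference, a minimal sketch of the pattern in the short-circuited action (the step count and sleep() are illustrative stand-ins for the real work):

header('Content-Type: text/html; charset=utf-8');
for ($i = 1; $i <= 10; $i++) {
    echo '<div>Step ' . $i . ' of 10 done</div>';
    if (ob_get_level() > 0) {
        ob_flush(); // empty PHP's output buffer first (Zend may have buffering active)
    }
    flush();        // then push the SAPI/web-server buffer to the client
    sleep(1);       // stand-in for the real work
}
exit;               // short-circuit Zend's normal dispatch, as described above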
