I am trying to migrate a PHP application that uses Twilio to Google App Engine and have run into a bit of a snag. As a simple test, I sent a single text message to my cell phone from within the App Engine app that I created. It sends fine, but I receive the message twice; to confirm it was actually executing twice, I sent the epoch time - the two timestamps are about one second apart.
I checked the logs and saw this: "This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application." I tried removing the Twilio usage entirely and replacing it with a simple "Hello World" echo; the same message appeared in the log for that request.
How can I avoid this sort of behavior?
UPDATE
Here are the headers from my Requestb.in test using the following code. The bin was hit twice from the same IP address - I only visited the app's page one time.
<?php
$result = file_get_contents('http://requestb.in/BINID');
echo $result;
Headers -
First Request:
User-Agent: AppEngine-Google; (+http://code.google.com/appengine; appid: s~MYAPP)
Connection: close
Accept-Encoding: gzip
X-Request-Id: e7583bda-dfeb-4431-92a5-aa4af0bf06e8
Host: requestb.in
Second Request:
User-Agent: AppEngine-Google; (+http://code.google.com/appengine; appid: s~MYAPP)
X-Request-Id: e766375b-bea8-4b79-a869-e2603309bec7
Accept-Encoding: gzip
Host: requestb.in
Connection: close
SECOND UPDATE
I added the epoch time as a GET variable to the requestb.in address; the bin was hit twice with the exact same epoch, from two different IP addresses, one second apart. So this tells me the code was executed one time but the bin was somehow accessed twice from two IP addresses. Sometimes it seems to be only one IP address. Really puzzled here. I even tried from scratch with a new app - same result.
I think you will find this message, "This request caused a new process to be started for your application," is unrelated.
Unless you use warmup requests, you will always see this message when an instance is started to serve a user-facing request.
I would look at your code and see how the message sending code could be executed twice.
Try doing some logging around the sending code and see if you get that log message in the same request twice.
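To make that concrete, here is a minimal logging sketch. It assumes syslog() output shows up in the App Engine request log, and sendSms() is just a placeholder for your actual Twilio call:
<?php
// Log a unique marker before and after the send. If one request id in the
// App Engine log shows two different markers, the code runs twice within
// that request; if two separate requests each log one marker, the app
// itself is being hit twice.
$marker = uniqid('send-', true);
syslog(LOG_INFO, "before send $marker");
sendSms();  // placeholder for your Twilio API call
syslog(LOG_INFO, "after send $marker");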
Related
We have a PHP web app using Guzzle 5 to download WordPress RSS feeds.
It's working fine except for this feed https://www.socialquant.net/blog/feed/
The owner of this site does want us to pull the feed, and is not knowingly attempting to block access.
I can successfully download the file from my local machine and from the production web server (where we initially noticed the problem) using wget or curl with no special options.
This happened once before, and that time we believed the issue to be caused by mod_security on Apache; it was solved by adding an arbitrary User-Agent header. But that time I was able to reproduce the issue consistently on the command line, whereas this time it only fails through Guzzle/PHP.
I've copied the response headers from a browser request for the problem feed and for another feed that is working. I crossed off those that were the same and was left with the following:
Server:Apache/2.2.22
Vary:User-Agent
X-Powered-By:PHP/5.3.29
Content-Encoding:gzip
Server:Apache
Vary:Accept-Encoding
X-Powered-By:PHP/5.5.30
That's not offering much insight. The gzip content encoding jumps out; I'm trying to find another working feed that uses gzip to verify this, but it shouldn't matter since Guzzle handles content encoding automatically by default. And we're using the same settings to download images from CDNs that use gzip.
Does anyone have any ideas please? Thanks :)
EDIT
Using Guzzle 5.3.0
Code:
$client = new \GuzzleHttp\Client();
try {
    $res = $client->get($feed, [
        'headers' => ['User-Agent' => 'Mozilla/4.0']
    ]);
} catch (\Exception $e) {
    // NOTE: the exception is silently swallowed here, so any error
    // response from the feed never surfaces.
}
I'm afraid I don't have a proper solution to your problem, but I have it working again.
tl;dr version
It's the User-Agent header, changing it to pretty much anything else works.
This wget call fails:
wget -d --header="User-Agent: Mozilla/4.0" https://www.socialquant.net/blog/feed/
but this works
wget -d --header="User-Agent: SomeRandomText" https://www.socialquant.net/blog/feed/
And with that, the PHP below now also works:
require 'vendor/autoload.php';
$client = new \GuzzleHttp\Client();
$feed = 'https://www.socialquant.net/blog/feed/';
try {
$res = $client->get(
$feed,
[
'headers' => [
'User-Agent' => 'SomeRandomText',
]
]
);
echo $res->getBody();
} catch (\Exception $e) {
echo 'Exception: ' . $e->getMessage();
}
My thoughts
I started with wget and curl as you pointed out, which work when no special headers or options are set. Opening it in my browser also worked. I also tried using Guzzle without the User-Agent set, and that also works.
Once I set the User-Agent to Mozilla/4.0 or even Mozilla/5.0, it started failing with 406 Not Acceptable.
According to the HTTP Status Code definitions, a 406 means
The resource identified by the request is only capable of generating response entities which have content characteristics not acceptable according to the accept headers sent in the request.
In theory, adding Accept and Accept-Encoding headers should resolve the issue, but it didn't - not via Guzzle or wget.
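For reference, this is roughly what that attempt looks like in Guzzle 5 (a sketch - the header values here are illustrative, not the precise ones I used):
$client = new \GuzzleHttp\Client();
$res = $client->get('https://www.socialquant.net/blog/feed/', [
    'headers' => [
        'User-Agent'      => 'Mozilla/4.0',
        'Accept'          => 'application/rss+xml, application/xml;q=0.9, */*;q=0.8',
        'Accept-Encoding' => 'gzip',
    ],
]);
// Still comes back 406 Not Acceptable, which is what points back at the User-Agent.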
I then found the Mozilla Developer Network definition which states:
This response is sent when the web server, after performing server-driven content negotiation, doesn't find any content following the criteria given by the user agent.
This kinda points at the User-Agent again. This led me to believe that you are indeed correct that mod_security is doing something odd. I am convinced that an update to mod_security or Apache on the client's servers added a rule that parses the Mozilla/* user agents in a specific way, since sending User-Agent: Mozilla/4.0 () also works.
That's why I'm saying I don't have a proper solution for you. Even though the client wants you to pull the feed, they (or their hosting) are still in control of the rules.
Note: I noticed my IP getting blacklisted after a number of failed 406 attempts, after which I had to wait an hour before I could access the site again. Most likely a mod_security rule. mod_security might even be picking up on the automated requests with your user agent and starting to block or reject them with the 406.
I don't have a solution for you either, as I'm also experiencing this same issue (except I get error 503 and it fails 60% of the time). Let me know if you have found a solution.
However, I would like to share with you what I have found through my recent research. I found that certain User-Agents work better than others for me. This makes me believe that it's not what Donovan states to be the case (at least for me).
When I set User-Agent to null, it works 100% of the time. However, I haven't made any large requests yet, as I'm afraid of getting IP banned, as I know I would with a large request.
When I do a var_dump of the request itself, I see a lot of arrays which include Guzzle markers. I'm thinking maybe Amazon's detection services can tell that I'm spoofing the headers? I don't know.
Hope you figured it out.
I have the Telit LE910 4G LTE module connected to a Teensy board (an Arduino will do). While I am able to send data to my PHP server using HTTP requests (POST and GET), I am not able to send continuous data because of the delays needed to wait for the server to respond:
[...]
// SOCKET DIAL
LTESerial.print("AT#SD=1,0,80,\"SERVER IP\"\r\n");
delay(5000);
// POST
LTESerial.print("POST /server/index.php?data=");
LTESerial.print(random(1000));
LTESerial.print(" HTTP/1.1\r\n");
LTESerial.print("Host: SERVER IP\r\n\r\n");
delay(5000);
while (getResponse() > 0);
This is simply an example (written here), but it somewhat illustrates what I am doing. The above code is supposed to be put inside a while loop, so that once the data is uploaded to a .txt file on the server, the module reconnects to the server and POSTs another data point.
Obviously, I want to avoid these delays and parse data to the server as fast as possible (as soon as the data is available). This is why I opted for the 4G LTE version.
Tweaking the delays might give me an extra second or so, but my project includes plotting a lot of data points in "real time", so it is very time sensitive.
Any idea how to send a continuous data stream to the server over 4G? I am thinking about buffering some data points and using FTP to upload the data, but I assume uploading files to the server might take even more time than it does now.
Any help is much appreciated!
It sounds like your use case might be better suited to a dedicated IoT (Internet of Things) protocol rather than a more connection-oriented client-server protocol like HTTP.
There are several protocols in use in the IoT world but some of the most common are:
MQTT - http://mqtt.org
COAP - http://coap.technology
XMPP - https://xmpp.org
These should not only address your latency concerns but are also generally designed to minimise data overhead and processing/battery use.
You should be able to find PHP examples for these also - for example one for MQTT:
https://www.cloudmqtt.com/docs-php.html
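To give a feel for the shape of it, here is a rough subscriber sketch using the phpMQTT library from the CloudMQTT docs. The broker address, credentials, topic and file name are placeholder assumptions, and the require path/namespace depends on the library version:
<?php
require 'phpMQTT.php';

// Handle each sensor reading as it arrives instead of polling over HTTP.
function procmsg($topic, $msg) {
    file_put_contents('data.txt', $msg . "\n", FILE_APPEND);
}

$mqtt = new phpMQTT('broker.example.com', 1883, 'php-subscriber-1');
if (!$mqtt->connect(true, NULL, 'user', 'password')) {
    exit('Could not connect to the broker');
}

$topics['sensors/teensy/data'] = array('qos' => 0, 'function' => 'procmsg');
$mqtt->subscribe($topics, 0);

while ($mqtt->proc()) {
    // loop, processing incoming messages
}

$mqtt->close();
The module would then publish each data point to sensors/teensy/data over a single persistent connection instead of opening a new HTTP request per point.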
I somewhat got it to work using some of the existing code above, but it is still not optimal. This might be useful for others.
This is what I did:
1) I socket dial only once (during initialization)
2) The POST section now runs inside an infinite loop. The 5 second delay is reduced to 200 ms, and I added some headers, like so:
//unsigned long data = random(1000000000000000, 9999999999999999);
LTESerial.print("POST /index.php?data=");
LTESerial.print(data);
LTESerial.print(" HTTP/1.1\r\n");
LTESerial.print("Host: ADDRESS\r\n");
LTESerial.print("Connection: keep-alive\r\n\r\n");
delay(200);
while (getResponse() > 0);
3) It turns out my WAMP server (PHP) had default limits on maximum HTTP requests, timeouts and the like. I had to increase these values (I changed them to unlimited) in php.ini.
However, while I am able to "continuously" send data to my server, a delay of 200 ms is still a lot. I would like to see something close to serial communication, if possible.
Also, when looking at the serial monitor, I get:
[...]
408295030
4238727231
3091191349
2815507344
----------->(THEN SUDDENLY)<------------
HTTP/1.1 200 OK
Date: Thu, 02 Jun 2
2900442411
016 19:29:41 GMT
Server: Apache/2.4.17 (Win32) PHP/5.6.15
X-P16
3817418772
Keep-Alive: timeout=5
Connection: Keep-Alive
Content-Type: te
86026031
HTTP/1.1 200 OK
Date: Thu, 02 Jun 2016 19:29:4
3139838298
75272508
[...]
----------->(After 330 iterations/POSTs, I get)<------------
NO CARRIER
NO CARRIER
NO CARRIER
NO CARRIER
So my question is:
1) How do I eliminate the 200 ms delay as well?
2) If my data points have different sizes, the delay will have to change as well. How can I do this dynamically?
3) Why does it stop at around 330 iterations? This doesn't happen if the data is only 4 digits.
4) Why do I suddenly get responses from the server?
I hope someone can use this for their own project, however this does not suffice for mine. Any ideas?
I have a client application that sends data to a PHP file (hosted on Apache). Usually this works without any problem, but on one client site I get 206 Partial Content every time the client app sends data.
The data size is 10 - 30 kB so it is not huge.
If you have any suggestions - like changing Apache settings or something similar - I would appreciate it.
Thanks.
It's not an issue. Any 2xx code means "success". You can view details at: Why does Firebug show a "206 Partial Content" response on a video loading request?
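If you want to confirm what is triggering the 206 on that one site, here is a quick diagnostic sketch for the receiving PHP file (an assumption on my part, not something from the linked answer - a Range request is the usual reason for a 206):
<?php
// Log whether the client is sending a Range header, which is the usual
// reason a server answers 206 Partial Content instead of 200 OK.
if (isset($_SERVER['HTTP_RANGE'])) {
    error_log('Client sent Range header: ' . $_SERVER['HTTP_RANGE']);
}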
I am having the following problem:
I am running a BIG memory process but have divided the memory load into smaller chunks, so there is no CPU time-out issue.
On the server I am creating .xml files of around 100 kB each, and around 100+ of them will be created.
Now the main problem is that the browser shows a response time-out, and IE (just above the status bar) shows a prompt to download the .php file.
Meanwhile the backend (server-side) process is still running and continuously creating .xml files in incremental order, so there is no issue with that.
I have the following php.ini configuration:
max_execution_time = 10000 ; Maximum execution time of each script, in seconds
max_input_time = 10000 ; Maximum amount of time each script may spend parsing request data
memory_limit = 2000M ; Maximum amount of memory a script may consume (128MB)
; Maximum allowed size for uploaded files.
upload_max_filesize = 2000M
I am viewing my site in IE, and I am using Zend Server CE (ZSCE) with PHP 5.3.
Can anybody point me in the right direction on this issue?
Edit:
Uploaded an image of the time-out and of the resulting prompt to download the .php file.
Edit 2:
Let me briefly explain my execution flow:
I have one PHP file with objects of class hierarchies, which starts by executing Function1() from each class hierarchy.
I have a class file.
First, let's say, Function1() is executed, which contains the logic for creating the XML files in chunks.
Second, let's say, Function2() is executed, which displays the output generated by Function1().
All of this is done in a class-hierarchy manner, so I can't terminate the execution of Function1() in between until it has finished, and only after that is Function2() called.
Edit 3:
This is specifically for #hakre.
You asked some follow-up questions and I agree with some of your points, but let me describe the issue in more detail.
At first I was loading 100+ MB of XML files at a time, which is why the memory on my local setup was exhausted, everything on the machine hung, and the CPU was using most of its resources.
I then divided these big XML files into smaller ones (meaning I now load a single XML file at a time and unload it after use). This saved me from the memory overload and CPU issues on the local setup.
Now my backend process runs with no CPU or memory issues, but the problem is the browser timeout. I even tried cURL, but it doesn't seem to fit my current structure because of the class hierarchy: I have a set of classes in a hierarchy, and they all execute their Process functions first and then their Output functions. So until the Process functions have executed, the Output functions don't come into the picture, and that's why the browser shows a timeout.
I even followed the instructions suggested by #vortex and had a little success, but not what I am looking for. The reason I could not implement cURL is that my Process function creates all the required XML files in one go, so it takes too long before anything is output to the browser. Since the Process function takes that much time, no output can be sent to the client until it has completed.
cURL Output:
URL....: myurl
Code...: 200 (0 redirect(s) in 0 secs)
Content: text/html Size: -1 (Own: 433) Filetime: -1
Time...: 60.437 Start # 60.437 (DNS: 0 Connect: 0.016 Request: 0.016)
Speed..: Down: 7 (avg.) Up: 0 (avg.)
Curl...: v7.20.0
Contents of test.txt file
* About to connect() to mylocalhost port 80 (#0)
* Trying 127.0.0.1... * connected
* Connected to mylocalhost (127.0.0.1) port 80 (#0)
> GET myurl HTTP/1.1
Host: mylocalhost
Accept: */*
< HTTP/1.1 200 OK
< Date: Tue, 06 Aug 2013 10:01:36 GMT
< Server: Apache/2.2.21 (Win32) mod_ssl/2.2.21 OpenSSL/0.9.8o
< X-Powered-By: PHP/5.3.9-ZS5.6.0 ZendServer
< Set-Cookie: ZDEDebuggerPresent=php,phtml,php3; path=/
< Cache-Control: private
< Transfer-Encoding: chunked
< Content-Type: text/html
<
* Connection #0 to host mylocalhost left intact
* Closing connection #0
Disclaimer: An answer to this question was accepted based on the first small success it produced. The solution from #Hakre is also feasible when this type of question occurs, but right now no answer has fully fixed my problem, only partially. Hakre's answer is also more detailed, for anyone looking for more information about this type of issue.
Assuming you have made all the server-side modifications to dodge a server timeout (I saw pretty much everything explained above), in order to dodge the browser timeout it is crucial that you do something like this:
<?php
set_time_limit(0);        // no execution time limit for this script
error_reporting(E_ALL);
ob_implicit_flush(TRUE);  // flush output to the browser after every output call
ob_end_flush();           // drop the default output buffer
I can tell you from experience that Internet Explorer doesn't have any issues as long as you output some content to it every now and then. I run a 30 GB database update every day (it takes around 2-4 hours), and Opera seems to be the only browser that ignores the content output.
If you don't set ob_implicit_flush, you need to call ob_flush() after every piece of content.
References
ob_implicit_flush
ob_flush
If you don't use ob_implicit_flush at the top of your script as I wrote earlier, you need to do something like this:
<?php
echo 'dummy text or execution stats';
ob_flush();
within your execution loop
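Put together, a rough sketch of what the generation loop could look like (createXmlChunk() is a placeholder for the asker's own Function1() logic; the count of 100 files comes from the question):
<?php
// Keep the browser alive while the XML files are generated.
set_time_limit(0);
ob_implicit_flush(TRUE);
ob_end_flush();

for ($i = 1; $i <= 100; $i++) {
    createXmlChunk($i);                      // heavy work for one ~100 kB file
    echo "generated file $i of 100<br>\n";   // a little output keeps IE from timing out
    flush();                                 // push it to the client right away
}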
1. I am running a BIG memory process but have divided the memory load into smaller chunks, so there is no CPU time-out issue.
Now that's a wild guess. How did you find out it was a CPU time-out issue in the first place? Did you at all? If yes, what does your test give now? If not, how do you test now that this is not a time-out issue?
Although you state that a certain issue won't occur, you don't prove it, and many questions are still open. That invites guessing, which is counter-productive for troubleshooting (which is what you are doing here).
What you write here just means that you wrote code to chunk the memory usage; however, that is not a test for CPU time-out issues. One part is writing code, the other part is testing. Don't mix the two, and don't draw wild assumptions. Issues are established by tests; otherwise they didn't happen.
So much for your first point, just to show you that when troubleshooting you should look for facts (monitor, test, profile, step-debug), not run on assumptions. This is crucial, otherwise you look in the wrong places and ask the wrong questions.
From how you describe the client (browser) behaviour, this is not a time-out issue per se. The problem you've got is that the gap between the header response and the body response is taking too long for your browser's taste. One browser assumes a time-out (a boundary value has been triggered, and this looks more correct to me), and the other browser assumes something is coming up, so why not offer to save it.
So what you really have here is a processing issue. Please consult the manual of your internet browsers (HTTP clients) to see which configuration values you can change to alter this behaviour. E.g. monitor with a curl request on the command line how long the request actually takes, then configure your browser not to time out when connecting to that server within the amount of time you just measured. For example, if you're using Internet Explorer: http://www.ehow.com/how_6186601_change-internet-timeout-options.html or if you're using Mozilla Firefox: http://forums.mozillazine.org/viewtopic.php?f=7&t=102322&start=0
As you didn't show any server-side code, I assume you want to solve this problem with client settings. Curl will help you measure the number of seconds such a request takes; use the -v (verbose) switch to obtain detailed information about the request.
In case you don't want to solve this on the client, curl will still help you to measure important data and easily reproduce any underlying server-related timing issue. So you should go for curl on the command line in any case, especially as looking into the response headers might reveal what triggers the (again) esoteric Internet Explorer behaviour. Again, the -v switch reveals your request and response headers.
If you like to automate such tests with a PHP script, it's also possible with the PHP Curl Extension. This has been outlined in:
Php - Debugging Curl
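A minimal sketch of that idea with the PHP cURL extension (the URL is a placeholder):
<?php
// Measure how long the request takes and capture the response headers.
$ch = curl_init('http://mylocalhost/myurl');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
curl_setopt($ch, CURLOPT_HEADER, true);          // include response headers in the output

$response  = curl_exec($ch);
$totalTime = curl_getinfo($ch, CURLINFO_TOTAL_TIME);  // seconds, as a float
$httpCode  = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

echo "HTTP $httpCode in $totalTime seconds\n";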
The problem is with your web-server, not the browser.
If you're using Apache, you need to adjust your Timeout value at httpd.conf or virtual hosts config.
You have 3 pages:
Process - creates the XML files and then updates a database value saying that the process is done
A PHP page that returns {true} or {false} based on the process-completion value in the database (see the sketch below)
An ajax front end, polling page 2 every few seconds to check whether the process is done or not
Long Polling
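For illustration, page 2 could be as small as this sketch (the process_status table, its done column and the PDO connection details are assumptions - adjust them to your own schema):
<?php
// Report whether the background XML-generation process has finished.
$pdo  = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');
$stmt = $pdo->query('SELECT done FROM process_status WHERE id = 1');
$done = (bool) $stmt->fetchColumn();

header('Content-Type: application/json');
echo json_encode(array('done' => $done));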
I have had this issue several times while reading a large CSV file and putting it into a database. I solved it by dividing the reading-and-inserting process into smaller parts: I created a new table to log how much data had been read and inserted, and the page then reloads itself and starts from that position. You can do the same by creating one XML file per attempt, then reloading the page and starting on the next one. This way the memory used by the browser is refreshed.
Hope it will help.
Is it possible to send some output to the browser from the script while it's still processing, even whitespace? If so, do it; it should reset the timeout counter.
If it's not possible, you have to increase the timeout of IE in the registry:
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings
You need ReceiveTimeout; if it's not there, create it as a DWORD and set the value in milliseconds.
What is a "CPU time out issue"?
The right way to solve the problem is to run the heavy stuff asynchronously, in a separate session group (not the webserver process tree).
Try to include set_time_limit(0); in your PHP script page.
The following links might help you.
http://php.net/manual/en/function.set-time-limit.php
http://php.net/manual/en/function.ignore-user-abort.php
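For example, a minimal sketch combining the two functions those links describe:
<?php
// Let the script run without a time limit and keep it running even if the
// browser gives up on the connection.
set_time_limit(0);        // removes the max_execution_time limit for this script
ignore_user_abort(true);  // keep creating the XML files if the client disconnects

// ... heavy XML generation continues here ...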
I have a Flash app that requests XML generated by a PHP script. The data doesn't change much, and I would like Flash to cache the XML instead of loading it every time. I've been checking my access logs, and every single time I reload a page with the Flash app on it, the PHP file is accessed and the XML downloaded.
I've read that Flash doesn't control what is cached, as it just requests something from the browser, but everything else that Flash downloads (i.e. the MP3 files that are supplied by the XML) does get cached. So I'm not really sure what that means.
I've googled the heck out of this, but everything I find is telling me how to keep flash from caching stuff.
Here's the code I used (AS3):
xmlLoader.load(new URLRequest("info.php"));
It's not a huge deal but sometimes it takes 2-3 seconds to load if my host decides to respond slowly.
Edit: I got the headers:
HEAD /beatinfo.php HTTP/1.1[CRLF]
Host: spoonhands.com[CRLF]
Connection: close[CRLF]
User-Agent: Web-sniffer/1.0.37 (+http://web-sniffer.net/)[CRLF]
Accept-Encoding: gzip[CRLF]
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7[CRLF]
Cache-Control: no-cache[CRLF]
Accept-Language: de,en;q=0.7,en-us;q=0.3[CRLF]
Referer: http://web-sniffer.net/[CRLF]
Try looking at the header function (http://php.net/manual/en/function.header.php).
That is the one I always use to send HTTP headers so that a response will not be cached. I think you can send headers so that it will be cached instead.
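A small sketch of what the top of info.php could look like (the one-hour lifetime is just an example value):
<?php
// Tell the browser - and therefore the Flash player's request - that this
// XML response may be cached for a while.
$maxAge = 3600;  // one hour, as an example
header('Cache-Control: public, max-age=' . $maxAge);
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + $maxAge) . ' GMT');
header('Content-Type: text/xml');

// ... generate and echo the XML as before ...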