PHP, cURL, Sessions and Cookies - Oh my

I wrote a PHP web application which uses authentication and sessions (no cookies, though). All works fine for the users in their browsers. Now I need to add functionality that performs a task automatically; users don't need to see anything and can't interact with this process. So I wrote a new PHP script, import.php, which works in my browser. I set up a new cron job to call 'php import.php'. It doesn't work. I started Googling, and it seems maybe I need to be using cURL and possibly cookies, but I'm not certain. Basically, import.php needs to authenticate and then access functions in a separate file, funcs.php, in the same directory on the local server. So I added cURL to import.php and reran it from the command line; I see the following:
[me#myserver]/var/www/html/webapp% php ./import.php
* About to connect() to myserver.internal.corp port 443 (#0)
* Trying 192.168.111.114... * connected
* Connected to myserver.internal.corp (192.168.111.114) port 443 (#0)
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* Remote Certificate has expired.
* SSL certificate verify ok.
* SSL connection using SSL_RSA_WITH_3DES_EDE_CBC_SHA
* Server certificate:
* subject: CN=dept,O=Corp,L=Some City,ST=AK,C=US
* start date: Jan 11 16:48:38 2012 GMT
* expire date: Feb 10 16:48:38 2012 GMT
* common name: myserver
* issuer: CN=dept,O=Corp,L=Some City,ST=AK,C=US
> POST /webapp/import.php HTTP/1.1
Host: myserver.internal.corp
Accept: */*
Content-Length: 356
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------2c5ad35fd319
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
< Date: Thu, 27 Dec 2012 22:09:00 GMT
< Server: Apache/2.4.2 (Unix) OpenSSL/0.9.8g PHP/5.4.3
< X-Powered-By: PHP/5.4.3
* Added cookie webapp="tzht62223b95pww7bfyf2gl4h1" for domain myserver.internal.corp, path /, expire 0
< Set-Cookie: webapp=tzht62223b95pww7bfyf2gl4h1; path=/
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Content-Length: 344
< Content-Type: text/html
<
* Connection #0 to host myserver.internal.corp left intact
* Closing connection #0
I'm not sure what I'm supposed to do after I authenticate via cURL. Or is there an alternate way to authenticate that doesn't use cURL? Currently all pages in the web app take action (or not) based on $_SESSION and $_POST value checks. If cURL is the only way, do I need cookies? And if I need cookies, once I send one back to the server, what do I need to do to process it?
Basically, import.php checks for and reads files from the same directory; if files are present when the cron runs, it parses them and inserts the data into the DB. Again, everything works in the browser, just not the import from the command line.
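For reference, the cookie-jar pattern that keeps coming up in my searches looks roughly like this (a minimal sketch; the login URL and form field names are made up, not my real ones):
<?php
// Made-up URLs and field names: log in once, save the session cookie
// to a jar file, then reuse the jar so the next request is authenticated.
$jar = '/tmp/webapp-cookies.txt';

$ch = curl_init('https://myserver.internal.corp/webapp/login.php');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, array('user' => 'cronuser', 'pass' => 'secret'));
curl_setopt($ch, CURLOPT_COOKIEJAR, $jar);    // cookies (incl. PHP session id) written on close
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);
curl_close($ch);

$ch = curl_init('https://myserver.internal.corp/webapp/import.php');
curl_setopt($ch, CURLOPT_COOKIEFILE, $jar);   // send the saved cookies back
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);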
Having never done this before (or much PHP for that matter), I'm completely stumped.
Thanks for your help.

I've solved my problems with this one.
shell_exec('nohup php '.realpath(dirname(__FILE__)).'/yourscript.php > /dev/null &');
You can set this to run every x minutes, and it will run in the background without user delay.
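For example, a crontab entry for that (assuming the shell_exec() line above lives in a hypothetical wrapper script, cron.php) could be:
# hypothetical crontab entry: every 5 minutes, run the wrapper that
# backgrounds yourscript.php via the shell_exec() line above
*/5 * * * * php /var/www/html/webapp/cron.php > /dev/null 2>&1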

This is highly unlikely to help anybody, but the requirements for this project changed, so I ended up creating a PHP-based REST API and rewriting this import script in Python to integrate with some other tools being developed. All works as needed. In Python...
import cookielib              # Python 2 stdlib (http.cookiejar in Python 3)
import getopt
import os
import sys
import urllib
import urllib2                # Python 2 stdlib (urllib.request in Python 3)
import MultipartPostHandler   # third-party module for multipart/form-data POSTs
Shouldn't need to provide any more details - anybody versed enough in Python should get the drift. The script reads a file and submits it to my PHP API.

Related

Internet Explorer/Edge randomly displaying response headers with 200 OK and compressed (?) data instead of HTML

I'm the sole developer building a LAMP web application for a small infancy-stage startup and have been crying myself to sleep over a bug that only occurs when using the web app in Internet Explorer 10-11 and Edge (Chrome, FF, and Opera work like a charm). Worse yet, it happens randomly, about 50% of the time, after a user has authenticated and logged into the web app. (Screenshot omitted.)
Here's what shows up in the DOM Explorer when inspecting:
15447HTTP/1.1 200 OK
Date: Wed, 17 Aug 2016 09:27:27 GMT
Server: Apache/2.4.12 (Ubuntu)
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 4972
Keep-Alive: timeout=15, max=97
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
‹í]ëvÛ¶²þÝ<ʶGöÙÖÅò-±-u9²8©/µ¤mVVDBlŠdJ²Òößy’ýo¿Æy”ó$gðŠÔ
l÷ê^‹j‹¸|33`$}zݹÿùæ
‚¡Ý~vý!Øj?Cð:’CnàUÉç·ÓuâÕ`ê…O-# OAW?BæûŒ­w÷çÕ熊壠—d4°IûÄSæú/©õÿóÏãºLTêz¾ë?˜¶
º.ö­9IËS2ñ\?PŠO¨ZS“TÅâ
(¶«ÌÄ6imóÚÿSõÝIµã=ÐnŠ‰‹³±ú$‡<
®¯Í)ÓwݾMªŒ¤:&>íQ(¸ŽRë`ûm÷Ç;ëf·7ö¼ó÷¿ü8=øùËã®Îß¿ï¼ß}8MN'ýæ¥ê!
Ë`ÁÔ&l#ˆ‚÷^Øi&cø¤}ËXÝqý!¶éRãš ÒíÙã¡K­Š(TáÂëõˆ‡Õ¤ø°GYÍt‡ìûR{Úºöˆó;ì
k·Ñ¨6*ÉF%b£2Ër…A÷æ(#:£2P§Ã~Ývûn
R+\à²Ú×Õ*úÁÅz%xB'¶§5ªVCdfúÔäB½‘cò¾Þ [lËÝêoù[xk¸ùýX‘1Äu÷˜AåSË?¢ýO-þÏï¿Çõ7!
~ÿýã§Íš7bƒ
<more garbled text>
As one can see from the response headers, the server returned a status of 200, and there are no errors or warnings in the console. Under the 'Network' tab, everything appears to have returned with either 200 or 302, with the exception of a couple of 404s when retrieving profile pictures from the LinkedIn REST API (the pics still show up in the other 50% of the time, when the page actually displays properly...). On the server side, there is nothing in the Apache error log, and syslog is clean. The actual content appears to be compressed, which shouldn't be a problem given that the server specifies the content encoding as gzip. Either that, or I'm looking at encrypted content.
I'm running Apache 2.4.12 on Ubuntu 15.10. Content is (of course) served over HTTPS, and the cert doesn't expire for another year. The application is written in PHP, and this happens on both the staging and production servers. I've scoured SO, Serverfault, and Google for a similar problem but haven't been successful. If anyone has encountered this error before or has any possible idea as to what's going on, any help would be greatly appreciated.

Link between PHP and HTTP Request and Response Messages

When I did a networks course I learned about HTTP request and response messages, and I know how to code in PHP reasonably well to get around. Now my question is: PHP has to have some link to HTTP request and response messages, but how? I can't seem to see the link between the two. My reason for asking is that I am using the Twitter API console tool to query their API. The tool sends the following HTTP request:
GET /1.1/search/tweets.json?q=%40twitterapi HTTP/1.1
Authorization: OAuth oauth_consumer_key="DC0se*******YdC8r4Smg",oauth_signature_method="HMAC-SHA1",oauth_timestamp="1410970037",oauth_nonce="2453***055",oauth_version="1.0",oauth_token="796782156-ZhpFtSyPN5K3G**********088Z50Bo7aMWxkvgW",oauth_signature="Jes9MMAk**********CxsKm%2BCJs%3D"
Host: api.twitter.com
X-Target-URI: https://api.twitter.com
Connection: Keep-Alive
and then I get a HTTP response:
HTTP/1.1 200 OK
x-frame-options: SAMEORIGIN
content-type: application/json;charset=utf-8
x-rate-limit-remaining: 177
last-modified: Wed, 17 Sep 2014 16:07:17 GMT
status: 200 OK
date: Wed, 17 Sep 2014 16:07:17 GMT
x-transaction: 491****a8cb3f7bd
pragma: no-cache
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
x-rate-limit-limit: 180
expires: Tue, 31 Mar 1981 05:00:00 GMT
set-cookie: lang=en
set-cookie: guest_id=v1%3A14109******2451388; Domain=.twitter.com; Path=/; Expires=Fri, 16-Sep-2016 16:07:17 UTC
content-length: 59281
x-rate-limit-reset: 1410970526
server: tfe_b
strict-transport-security: max-age=631138519
x-access-level: read-write-directmessages
So how do these HTTP request and response messages fit into PHP? Does PHP auto-generate this? How do I add authorization to PHP requests, etc.? I'm confused about the deeper workings of PHP.
When the client sends the HTTP request to the server, there has to be something to receive the HTTP request, which is called a web server. Examples of web servers are Apache, IIS, Nginx, etc. You can also write your own server, which can handle input however it wants. In this case, I'll assume that you are requesting a PHP file.
When the web server captures the HTTP request, it determines how it should be handled. If the file requested is tweets.json, it will go make sure that file exists, and then pass control over to PHP.
PHP then begins its execution and performs whatever logic the script needs: it can query the database; it reads, writes, and makes decisions based on cookies; it does math; and so on.
When the PHP script is done, it returns an HTML page as well as a bunch of headers back to the web server that called it. From there, the web server turns the HTML page and headers into the HTTP response it sends back to the client.
That is a pretty simple overview, and web servers can work in many different ways, but this is a simple example of how it could work in an introductory use case. In more complex scenarios, people can write their own web servers, which perform more complex logic inside the web server software rather than passing it off to PHP.
When it comes down to it, PHP files are just scripts that the web server executes when they are called: the server provides the HTTP request as input and gets a web page and headers back as output.
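A minimal sketch of that input/output contract (a hypothetical script; only standard PHP superglobals and functions are used):
<?php
// Input: the web server hands the parsed HTTP request to PHP
// through superglobals.
$query = isset($_GET['q']) ? $_GET['q'] : '';   // from "GET /search.php?q=..."
$auth  = isset($_SERVER['HTTP_AUTHORIZATION']) ? $_SERVER['HTTP_AUTHORIZATION'] : null;

// Output: headers and body go back to the web server, which wraps
// them into the HTTP response sent to the client.
header('Content-Type: application/json; charset=utf-8');
header('Cache-Control: no-store');
setcookie('lang', 'en');                        // emitted as a Set-Cookie header

echo json_encode(array('query' => $query, 'authorized' => $auth !== null));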

How to politely ask remote webpage if it changed?

The remote webpage is updated irregularly: sometimes slowly, once in ten minutes or so; sometimes more often, every minute or more frequently. There's a piece of data on that page I want to store, updating it whenever it changes (not necessarily grabbing every change, but not falling too far behind the current version, and keeping the updates running 24/7).
Downloading the whole remote page every minute to check if it differs from previous version is definitely on the rude side.
Pinging the remote website for headers once a minute won't be too excessive.
Ideally, there would be a hint about when to recheck for updates, or a way to have the server reply with the content only after the content changes.
How should I go about minimizing unwanted traffic to the remote server while still staying up-to-date?
The "watcher/updater" is written in PHP, fetching the page using simplexml_load_file() to grab the remote URL every minute now, so something that plays nice with that (e.g. doesn't drop the connection upon determining the file differs only to reconnect for actual content half a second later, just proceeds with the content request) would be probably preferred.
edit: per request, sample headers.
> HEAD xxxxxxxxxxxxxxxxxxxxxxxxxxx HTTP/1.1
> User-Agent: curl/7.27.0
> Host: xxxxxxxxxxxxxx
> Accept: */*
>
* additional stuff not fine transfer.c:1037: 0 0
* HTTP 1.1 or later with persistent connection, pipelining supported
< HTTP/1.1 200 OK
< Server: nginx
< Date: Tue, 18 Feb 2014 19:35:04 GMT
< Content-Type: application/rss+xml; charset=utf-8
< Content-Length: 9865
< Connection: keep-alive
< Status: 200 OK
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< X-UA-Compatible: chrome=1
< ETag: "66509a4967de2c5984aa3475188012df"
< Cache-Control: max-age=0, private, must-revalidate
< X-Request-Id: 351a829a-641b-4e9e-a7ed-80ea32dcb071
< X-Runtime: 0.068888
< X-Powered-By: Phusion Passenger
< X-Frame-Options: SAMEORIGIN
< Accept-Ranges: bytes
< X-Varnish: 688811779
< Age: 0
< Via: 1.1 varnish
< X-Cache: MISS
ETag: "66509a4967de2c5984aa3475188012df"
This is a very promising header. If it indeed corresponds to changes in the page itself, you can query the server setting this request header:
If-None-Match: "<the last received etag value>"
If the content was not modified, the server should respond with a 304 Not Modified status and no body. See http://en.wikipedia.org/wiki/HTTP_ETag. It also seems to be running a cache front end, so you're probably not hitting it too hard anyway.
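A minimal PHP sketch of that conditional GET (hypothetical URL; the ETag value is whatever the previous response carried):
<?php
// Conditional GET: the body is only transferred if the content changed.
$etag = '"66509a4967de2c5984aa3475188012df"';   // saved from the previous response

$ch = curl_init('http://example.com/feed.rss'); // hypothetical URL
curl_setopt($ch, CURLOPT_HTTPHEADER, array('If-None-Match: ' . $etag));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$body = curl_exec($ch);
$status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

if ($status === 304) {
    // Not modified: no body was sent, nothing to do this round.
} elseif ($status === 200) {
    $xml = simplexml_load_string($body);        // fresh content, process it
}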
Send an HTTP HEAD request using cURL and retrieve the Last-Modified value. This is similar to GET, but HEAD only transfers the status line and header section, so you won't be "rude" to the other server.
On the command line, we can achieve this with the following (using -I, curl's built-in HEAD option, rather than forcing the method with -X HEAD, which can make curl hang waiting for a body):
curl -sI http://example.com/file.html | grep -i '^Last-Modified:'
It shouldn't be too hard to rewrite this using PHP's cURL library.
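A rough PHP equivalent, using CURLOPT_NOBODY for the HEAD request (hypothetical URL; $lastSeen stands in for whatever you stored from the previous poll):
<?php
$lastSeen = 0; // Last-Modified timestamp stored from the previous poll

$ch = curl_init('http://example.com/file.html'); // hypothetical URL
curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD: status line and headers only
curl_setopt($ch, CURLOPT_FILETIME, true);        // have cURL parse Last-Modified for us
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);
$modified = curl_getinfo($ch, CURLINFO_FILETIME); // Unix timestamp, or -1 if absent
curl_close($ch);

if ($modified !== -1 && $modified > $lastSeen) {
    // the page changed since the last check: now fetch the full content
    $xml = simplexml_load_file('http://example.com/file.html');
    $lastSeen = $modified;
}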

Trying to download a file using curl where file download is blocked by javascript?

I am trying to use curl to download a torrent file. The URL is
http://torcache.net/torrent/006DDC8C407ACCDAF810BCFF41E77299A373296A.torrent
You will notice that, upon getting to the page, the download of the file is blocked for a few seconds via JavaScript. I was wondering if there is any way to bypass this while using curl and PHP?
Thanks
The file is not blocked via JavaScript; that's just an informational message if you request that page, and the redirect is then done via JavaScript.
You can simulate the request on your own; the important part here is that you add the HTTP Referer request header. Example:
$ curl -I -H 'Referer: http://torcache.net/torrent/006DDC8C407ACCDAF810BCFF41E77299A373296A.torrent' http://torcache.net/torrent/006DDC8C407ACCDAF810BCFF41E77299A373296A.torrent
HTTP/1.1 200 OK
Server: nginx/1.3.0
Date: Sun, 10 Jun 2012 17:13:59 GMT
Content-Type: application/x-bittorrent
Content-Length: 10767
Last-Modified: Sat, 09 Jun 2012 22:17:03 GMT
Connection: keep-alive
Content-Encoding: gzip
Accept-Ranges: bytes
The Referer header is one thing to check; mind the misspelling ("referer" rather than "referrer") baked into the HTTP spec, see Wikipedia.
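In PHP, the same request could be made with CURLOPT_REFERER (a sketch of the command above, not tested against that site):
<?php
// Fetch the file with the Referer header set, as in the curl example above.
$url = 'http://torcache.net/torrent/006DDC8C407ACCDAF810BCFF41E77299A373296A.torrent';

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_REFERER, $url);          // the site checks this header
curl_setopt($ch, CURLOPT_ENCODING, '');           // transparently decode the gzip body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$torrent = curl_exec($ch);
curl_close($ch);

file_put_contents('download.torrent', $torrent);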

cURL gets Internal Server Error when posting to aspx page

I have a big problem.
I have some applications made on a Unix-based system, and I use PHP with cURL to POST an XML request to an IIS server running ASP.NET.
Every time I ask the server something, I get this error:
HTTP/1.1 500 Internal Server Error
Date: Tue, 04 May 2010 07:36:08 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 3032
But if I ask the same question on another server, almost identical to this one (BOTH configured by me), I get the results I should, with these headers:
HTTP/1.1 200 OK
Date: Tue, 04 May 2010 07:39:37 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 9169
I tried everything and searched hundreds of forums, but I can't find anything.
In IIS logs I only get:
2010-05-04 07:36:08 W3SVC1657587027 80.xx.xx.xx POST /XML_SERV/XmlAPI.aspx - 80 - 80.xx.xx.xx Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1 500 0 0
Any ideas where to look or what is going on?
I forgot to mention: if I use an XML request tool and ask the same question, it works.
Try reducing your ASPX page to the minimum, starting with an empty page. If that succeeds, begin adding the real bits back until it fails, so you can narrow down the error.
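It can also help to make the client request as explicit as possible, so both IIS servers receive identical input; a minimal sketch of posting raw XML with PHP's cURL (hypothetical URL and payload):
<?php
// POST raw XML with an explicit Content-Type header.
$xml = '<?xml version="1.0"?><request><q>...</q></request>';  // hypothetical payload

$ch = curl_init('http://iis-server.example/XML_SERV/XmlAPI.aspx'); // hypothetical URL
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $xml);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: text/xml; charset=utf-8'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
$status   = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);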
