How to obtain the download file name from a URL in PHP

I've been trying to get file names from URLs. That was simple with basename() until I came across URLs that give no sign of the real name until the file is downloaded.
Here is an example of the links I found; the real name is youtubedownloadersetup272.exe:
http://qdrive.net/index.php/page-file_share-choice-download_file-id_file-223658-ce-0
As you can see, it shows no name until the download starts.
I've been searching a lot and was getting desperate after finding nothing. I'd appreciate it if someone could point me in the right direction. Thanks.
Sorry to bother you again, but I found this link from download.com and I don't see the filename using curl:
<?php
function getFilename($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_NOBODY, 1);
$data = curl_exec($ch);
echo $data;
preg_match("#filename=([^\n]+)#is", $data, $matches);
return $matches[1];
}
echo getFilename("http://software-files-l.cnet.com/s/software/11/88/39/66/YouTubeDownloaderSetup272.exe?e=1302969716&h=89b64b6e8e7485eab1e560bbdf68281d&lop=link&ptype=1901&ontid=2071&siteId=4&edId=3&spi=fdc220b131cda22d9d3f715684d064ca&pid=11883966&psid=10647340&fileName=YouTubeDownloaderSetup272.exe");
?>
With echo $data it returns this:
HTTP/1.1 200 OK
Server: Apache
Accept-Ranges: bytes
Content-Disposition: attachment
Content-Type: application/download
Age: 866
Date: Sat, 16 Apr 2011 10:16:54 GMT
Last-Modified: Fri, 08 Apr 2011 18:04:41 GMT
Content-Length: 4700823
Connection: keep-alive
If I understood the script you gave me, it won't work here because the response has no filename.
Is there a way to get the name without having to use a regex or parse the URL (YouTubeDownloaderSetup272.exe?e.........), like the script you gave me?

You need to use curl, or some other library to request the file and look at the headers of the response.
You'll be looking for a header like:
Content-Disposition: attachment; filename=???
Where the question marks are the name of the file.
(You will still have to download the file, or at least look like you are downloading the file.)

function getFilename($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_NOBODY, 1);
$data = curl_exec($ch);
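// Stop at a quote, ";" or line break; the header may carry no filename at all.
if (!preg_match('#filename="?([^";\r\n]+)#i', $data, $matches)) {
return null;
}
return trim($matches[1]);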
}
echo getFilename("http://qdrive.net/index.php/page-file_share-choice-download_file-id_file-223658-ce-0"); //YouTubeDownloaderSetup272.exe
For download.com...
function download_com($url){
$filename = explode("?", $url);
$filename = explode("/", $filename[0]);
$filename = end($filename);
return $filename;
}
echo download_com("http://software-files-l.cnet.com/s/software/11/88/39/66/YouTubeDownloaderSetup272.exe?e=1302969716&h=89b64b6e8e7485eab1e560bbdf68281d&lop=link&ptype=1901&ontid=2071&siteId=4&edId=3&spi=fdc220b131cda22d9d3f715684d064ca&pid=11883966&psid=10647340&fileName=YouTubeDownloaderSetup272.exe");
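The two functions above can be combined so that the URL path is only used when the response carries no filename. What follows is a minimal sketch, not a definitive implementation: guessFilename() is a hypothetical name, it expects the raw header text you get from curl with CURLOPT_HEADER enabled, and it ignores the extended filename*= form.

```php
<?php
// Hypothetical helper (not from the answers above): take the filename from the
// raw response headers if present, otherwise fall back to the URL path.
function guessFilename(string $rawHeaders, string $url): ?string
{
    // Match filename="name" or filename=name on a Content-Disposition line.
    $pattern = '/^Content-Disposition:.*?\bfilename\s*=\s*("([^"\r\n]*)"|[^;\r\n]+)/im';
    if (preg_match($pattern, $rawHeaders, $m)) {
        $name = (isset($m[2]) && $m[2] !== '') ? $m[2] : trim($m[1], " \t\"'");
        if ($name !== '') {
            return basename($name); // drop any directory part the server sent
        }
    }
    // Fallback: last segment of the URL path, ignoring the query string.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '' || substr($path, -1) === '/') {
        return null;
    }
    return rawurldecode(basename($path));
}
```

With the download.com headers shown in the question (Content-Disposition: attachment, no filename) this falls through to the URL path; with a header that does carry filename=..., the header wins.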

Related

cURL returns null array

I have made a simple web Crawler with PHP cURL that should grab all the images of a particular page from Amazon where the keyword samsung has been searched.
Here is the code:
$curl = curl_init(); // $curl is going to be data type curl resource
$search_string = "samsung";
$url = "https://www.amazon.com/s?k$search_string";
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false); // ssl
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // storing in variable
$result = curl_exec($curl);
preg_match_all("!https://m.media-amazon.com/images/I/[^\s]*?._AC_UL320_.jpg!", $result, $matches);
print_r($matches);
curl_close($curl);
But I get an empty array:
Array ( [0] => Array ( ) )
I don't know why it shows that. If you know what is going wrong or how I can handle this, please let me know; I would really appreciate any ideas.
Thanks in advance.
Note that I used the regular expression [^\s]*? in place of the image name so that it matches all the available images on the web page.
UPDATE #1:
Results of curl --head https://www.amazon.com/s?k=samsung
HTTP/1.1 503 Service Unavailable
Content-Type: text/html
Content-Length: 2671
Connection: keep-alive
Server: Server
Date: Tue, 15 Jun 2021 20:59:38 GMT
x-amz-rid: 9BVX8KQMWJ4QDJ75ETYV
Vary: Content-Type,Accept-Encoding,X-Amzn-CDN-Cache,X-Amzn-AX-Treatment,User-Agent
Last-Modified: Fri, 14 May 2021 19:08:48 GMT
ETag: "a6f-5c24ef9383000"
Accept-Ranges: bytes
Strict-Transport-Security: max-age=47474747; includeSubDomains; preload
Permissions-Policy: interest-cohort=()
X-Cache: Error from cloudfront
Via: 1.1 5345148f0ba8ae3c67b69d035acdbfc5.cloudfront.net (CloudFront)
X-Amz-Cf-Pop: AMS50-C1
X-Amz-Cf-Id: AHdq2-QLEtCE4WvXZIEh_P75D8hCrHP09EAkNqBer5VBS-pI-blj1w==
First issue: Your code:
$url = "https://www.amazon.com/s?k$search_string";
should be (note the "=")
$url = "https://www.amazon.com/s?k=$search_string";
Second issue: Amazon is smart; they're not going to let you scrape as you please. What comes back is an error page rather than the search results (your update shows a 503 Service Unavailable).
You can see this with:
$result = curl_exec($curl);
var_dump($result);
Third issue: The regex is not working. You can test a regex at https://www.phpliveregex.com/#tab-preg-match-all
(Use right-click > View Source, then copy and paste the page content.)
From what I got, your regex did not return any results, but this did: https://m.media-amazon.com/images/I/[^\s]*?.jpg
It may be that the ._AC_UL320_ part of the string is also an Amazon anti-scraping thing... :(
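To try the regex idea offline, here is a small sketch that runs against a hard-coded string instead of a live page. The sample markup is an assumption (two image URLs in the shape that appears in this thread), and the pattern is my own variation: the dots are escaped so they match a literal ".", and the size suffix is left generic instead of being fixed to _AC_UL320_.

```php
<?php
// Stand-in markup; an assumption about the page, real Amazon HTML will differ.
$html = '<img src="https://m.media-amazon.com/images/I/81HdcaHSq4L._AC_UY218_.jpg">'
      . '<img src="https://m.media-amazon.com/images/I/216-OX9rBaL._SS72_.png">';

// "~" is the delimiter so the slashes in the URL need no escaping.
// \. matches a literal dot; [A-Z0-9_]+ accepts any size suffix such as AC_UY218.
$pattern = '~https://m\.media-amazon\.com/images/I/[^\s"\']+?\._[A-Z0-9_]+_\.jpg~';

preg_match_all($pattern, $html, $matches);
print_r($matches[0]); // only the .jpg URL is expected to match
```

If the suffix on the page is not the one hard-coded in the pattern, a fixed ._AC_UL320_ will never match, which would explain an empty result even when the request itself succeeds.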
It's not https://www.amazon.com/s?k$search_string; it's supposed to be 'https://www.amazon.com/s?k='.urlencode($search_string);
Also, Amazon.com requires you to send an Accept-Encoding header, otherwise you risk getting gzip-compressed responses with nothing to decompress them, which means you need CURLOPT_ENCODING.
Also, Amazon will block you if you don't supply a User-Agent header, so you must set CURLOPT_USERAGENT.
Also, Amazon will block you without a browser-like Accept header, so you need CURLOPT_HTTPHEADER => array('accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng')
Also, do not parse HTML with regex. Regular expressions are a tool that is insufficiently sophisticated to understand the constructs employed by HTML. HTML is not a regular language and hence cannot be parsed by regular expressions. Regex queries are not equipped to break down HTML into its meaningful parts.
Instead, use an HTML parser like DOMDocument.
this code
<?php
$curl = curl_init(); // $curl is going to be data type curl resource
$search_string = "samsung";
$url = "https://www.amazon.com/s?k=".urlencode($search_string);
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false); // ssl
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // storing in variable
curl_setopt_array($curl,array(
CURLOPT_ENCODING =>'',
CURLOPT_USERAGENT=>'libcurl',
CURLOPT_HTTPHEADER=>array(
'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
)
));
$html=curl_exec($curl);
$domd = new DOMDocument();
@$domd->loadHTML($html); // @ silences warnings about malformed markup
foreach($domd->getElementsByTagName("img") as $img){
echo $img->getAttribute("src"),"\n";
}
outputs
//fls-na.amazon.com/1/batch/1/OP/ATVPDKIKX0DER:136-7756522-9160852:777GSTVR1XJ9MBF1N0KN$uedata=s:%2Frd%2Fuedata%3Fstaticb%26id%3D777GSTVR1XJ9MBF1N0KN:0
https://images-na.ssl-images-amazon.com/images/G/01/gno/sprites/nav-sprite-global-1x-hm-dsk-reorg._CB405937547_.png
https://m.media-amazon.com/images/I/81HdcaHSq4L._AC_UY218_.jpg
https://m.media-amazon.com/images/I/91eAcgt9fSL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/81afsli5ctL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/61m1Dot5KCL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/61HFJwSDQ4L._AC_UY218_.jpg
https://m.media-amazon.com/images/I/216-OX9rBaL._SS72_.png
https://m.media-amazon.com/images/I/21OXy0oJ8VL._SS160_.png
https://m.media-amazon.com/images/I/61jfI8GyQgL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/61LUNEgB6iL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/813dec-cszS._AC_UY218_.jpg
https://m.media-amazon.com/images/I/81AT+Flc+EL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/216-OX9rBaL._SS72_.png
https://m.media-amazon.com/images/I/21OXy0oJ8VL._SS160_.png
https://m.media-amazon.com/images/I/61a5ejk6K2L._AC_UY218_.jpg
https://m.media-amazon.com/images/I/81+3SWSAhDL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/61pwE8H34zL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/71ejkOW4y2L._AC_UY218_.jpg
https://m.media-amazon.com/images/I/71G6eW8H8hL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/91dFUw5MUTS._AC_UY218_.jpg
https://m.media-amazon.com/images/I/81P4RzFnw6L._AC_UY218_.jpg
https://m.media-amazon.com/images/I/712iry8nIYL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/61VgW9ZZXiL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/61ft-L7HnUL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/51icdppvRVL._AC_UY218_.jpg
https://m.media-amazon.com/images/I/6164p9jY2jS._AC_UY218_.jpg
https://m.media-amazon.com/images/I/51skvShlcsL._AC_UY218_.jpg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/68995c82-c645-4ec0-9168-20f77b8ae24d/625e2c3f-01d9-401e-b4a4-bb865ad9e525/media._SL60_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/93913ead-ae42-4933-8fc4-e9f88b0396c9/1635f47b-1fa9-40ca-8d85-47f529c1ba8b/media._SL480_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/68995c82-c645-4ec0-9168-20f77b8ae24d/625e2c3f-01d9-401e-b4a4-bb865ad9e525/media._SL60_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/6aa489c6-af9d-48d0-94c8-cce1a4f50fc7/ff2a7805-3166-41b9-9881-d00901ca9dfd/media._SL480_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/68995c82-c645-4ec0-9168-20f77b8ae24d/625e2c3f-01d9-401e-b4a4-bb865ad9e525/media._SL60_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/73b89b9f-ee28-446f-8535-beacd328c95a/8caa5478-3583-49f9-9dcb-6e5b0a254fa6/media._SL480_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/68995c82-c645-4ec0-9168-20f77b8ae24d/625e2c3f-01d9-401e-b4a4-bb865ad9e525/media._SL60_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/457fd8ad-f566-4682-bb66-fd865954aec0/fb2cdc76-7ed6-4b86-9196-d40c3ead2914/media._SL480_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/68995c82-c645-4ec0-9168-20f77b8ae24d/625e2c3f-01d9-401e-b4a4-bb865ad9e525/media._SL60_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/5c60fcd5-17c1-4389-8423-2252436f21c8/0125e72d-9178-4048-bea3-9d268a406a05/media._SL480_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/68995c82-c645-4ec0-9168-20f77b8ae24d/625e2c3f-01d9-401e-b4a4-bb865ad9e525/media._SL60_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/f852e5ab-0fa9-4f91-b195-b0facc4d0d70/30b0ec08-79b2-428d-98df-aadffd2c00eb/media._SL480_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/68995c82-c645-4ec0-9168-20f77b8ae24d/625e2c3f-01d9-401e-b4a4-bb865ad9e525/media._SL60_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/d173de56-5162-463f-be97-d256c1895024/7974c773-0c53-43a1-bfb4-91d7cc3ce801/media._SL480_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/68995c82-c645-4ec0-9168-20f77b8ae24d/625e2c3f-01d9-401e-b4a4-bb865ad9e525/media._SL60_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif
https://m.media-amazon.com/images/S/mms-media-storage-prod/final/BrandPosts/brandPosts/2cfe5e10-6a7e-43f4-80c7-d87f212b8007/43e8a030-58c5-491a-9854-cd4d8824a873/media._SL480_.jpeg
https://images-na.ssl-images-amazon.com/images/G/01/personalization/ybh/loading-4x-gray._CB485916920_.gif
https://assoc-na.associates-amazon.com/abid/um?s=136-7756522-9160852&m=ATVPDKIKX0DER
//fls-na.amazon.com/1/batch/1/OP/ATVPDKIKX0DER:136-7756522-9160852:777GSTVR1XJ9MBF1N0KN$uedata=s:%2Frd%2Fuedata%3Fnoscript%26id%3D777GSTVR1XJ9MBF1N0KN:0
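The list above mixes product images with sprites and tracking pixels. As a sketch of how the DOMDocument approach could be narrowed down, the following filters img elements by URL prefix. The markup is a hard-coded stand-in (an assumption, so it runs without a network request), and filtering on the https://m.media-amazon.com/images/I/ prefix is my own choice based on the URLs seen in that output.

```php
<?php
// Stand-in markup, not Amazon's real page.
$html = '<html><body>'
      . '<img src="https://m.media-amazon.com/images/I/81HdcaHSq4L._AC_UY218_.jpg">'
      . '<img src="https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/grey-pixel.gif">'
      . '<img alt="no src attribute here">'
      . '</body></html>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
$previous = libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$productImages = [];
foreach ($doc->getElementsByTagName('img') as $img) {
    $src = $img->getAttribute('src'); // empty string when the attribute is missing
    // Keep only images served from the product-image path.
    if (strpos($src, 'https://m.media-amazon.com/images/I/') === 0) {
        $productImages[] = $src;
    }
}
print_r($productImages);
```

Using libxml_use_internal_errors() is an alternative to the @ operator for keeping malformed-markup warnings out of the output.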
$url = "https://www.amazon.com/s?k$search_string";
Yes, your URL is wrong.
The actual URL is below; you can try it:
$url = "https://www.amazon.com/s?k=$search_string";
Firstly, there is a typo.
Change
$url = "https://www.amazon.com/s?k".$search_string;
to
$url = "https://www.amazon.com/s?k=".$search_string;
Amazon expects some header values to be present when you request content. Please refer to the following curl request:
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36');
curl_setopt($curl, CURLOPT_HTTPHEADER, array(
'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
));
curl_setopt($curl, CURLOPT_ENCODING, '');
$result=curl_exec($curl);
Lastly, change your preg_match_all call from
preg_match_all("!https://m.media-amazon.com/images/I/[^\s]*?._AC_UL320_.jpg!", $result, $matches);
To
preg_match_all('/(https?:\/\/\S+\.(?:jpg|png|gif))\s+/', $result, $matches);
Complete Code :
<?php
$curl = curl_init();
$search_string = "samsung";
$url = "https://www.amazon.com/s?k=".$search_string;
// Set headers to match what a browser sends to Amazon. You can check them with any browser's developer tools.
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36');
curl_setopt($curl, CURLOPT_HTTPHEADER, array(
'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
));
curl_setopt($curl, CURLOPT_ENCODING, '');
$result=curl_exec($curl);
preg_match_all('/(https?:\/\/\S+\.(?:jpg|png|gif))\s+/', $result, $matches);
print_r($matches);

Subscribe using Superfeedr PubSubHubbub generating error hub.topic not found

I want to integrate Superfeedr API using PubSubHubbub in PHP. I am following this and my code is:
<?php
require_once('Superfeedr.class.php');
$superfeedr = new Superfeedr('http://push-pub.appspot.com/feed',
'http://mycallback.tld/push?feed=http%3A%2F%2Fpush-pub.appspot.com%2Ffeed',
'http://wallabee.superfeedr.com');
$superfeedr->verbose = true;
$superfeedr->subscribe();
?>
And my subscribe() function is
public function subscribe()
{
$this->request('subscribe');
}
private function request($mode)
{
$data = array();
$data['topic'] = $this->topic;
$data['callback'] = $this->callback;
$post_data = array (
"hub.mode" => 'subscribe',
"hub.verify" => "sync",
"hub.callback" => urlencode($this->callback),
"hub.topic" => urlencode($this->topic),
"hub.verify_token" => "26550615cbbed86df28847cec06d3769",
);
//echo "<pre>"; print_r($post_data); exit;
// url-ify the data for the POST
foreach ($post_data as $key=>$value) {
$post_data_string .= $key.'='. $value.'&';
}
rtrim($fields_string,'&');
// curl request
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $this->hub);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data_string);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Accept: application/json'));
curl_setopt($ch, CURLOPT_USERPWD, 'USERNAME:PASSWORD');
$output = curl_exec($ch);
if ($this->verbose) {
print('<pre>');
print_r($output);
print('</pre>');
}
}
But after execution I am getting this error
HTTP/1.1 422 Unprocessable Entity
X-Powered-By: The force, Luke
Vary: X-HTTP-Method-Override, Accept-Encoding
Content-Type: text/plain; charset=utf-8
X-Superfeedr-Host: supernoder16.superfeedr.com
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
Access-Control-Allow-Methods: GET, POST, PUT, DELETE
Access-Control-Allow-Headers: Authorization
Content-Length: 97
ETag: W/"61-db6269b5"
Date: Wed, 24 Aug 2016 14:01:47 GMT
Connection: close
Please provide a valid hub.topic (feed) URL that is accepted on this hub. The hub does not match.
The same data (topic, callback, etc.) works fine when requested from https://superfeedr.com/users/testdata/push_console,
but I don't know why I am getting this error locally. If anyone has experienced the same problem, please help me. Thanks.
You are using a strange hub URL. You should use https://push.superfeedr.com as the last parameter of your class constructor.
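Separately from the hub URL, the request() method in the question builds the POST body by hand: it appends to a string that was never initialised, and the result of rtrim() is discarded. A possible simplification, sketched under the assumption that the hub accepts an ordinary form-encoded body, is to let http_build_query() do the encoding. buildHubBody() is a hypothetical name, and the verify token is left out for brevity.

```php
<?php
// Hypothetical helper; parameter names and values are taken from the question.
function buildHubBody(string $topic, string $callback): string
{
    // http_build_query() URL-encodes each value exactly once and joins the
    // pairs with "&", so no manual urlencode() or rtrim() is needed.
    return http_build_query([
        'hub.mode'     => 'subscribe',
        'hub.verify'   => 'sync',
        'hub.callback' => $callback,
        'hub.topic'    => $topic,
    ], '', '&');
}

echo buildHubBody(
    'http://push-pub.appspot.com/feed',
    'http://mycallback.tld/push?feed=http%3A%2F%2Fpush-pub.appspot.com%2Ffeed'
), "\n";
```

The returned string can be passed to CURLOPT_POSTFIELDS as is. Passing raw values here matters: encoding them with urlencode() first and then again while building the body would double-encode them.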

get curl work with Content-Disposition of the response header after sending $_POST request

Ok, to understand the problem, first please visit
http://unblockproxy.nu/
Try to surf any website, say http://www.example.com/samplepage.html: put it in the field, then click the "unblock" button.
After the POST request is sent, the site should redirect you to something like:
http://unblockproxy.nu/index.php?x=Mfv0KjYRb3J3JO50MgBNbplFn2sTMoqPUIu1Unqn0bqdUoq5VbA9OnO8%3D
The response headers in the browser look like:
HTTP/1.1 302 Found
Date: Fri, 06 Mar 2015 12:49:30 GMT
Server: Apache/2.2.15
x-powered-by: PHP/5.3.3
Location: http://unblockproxy.nu/index.php?x=Mfv0KjYRb3J3JO50MgBNbplFn2sTMoqPUIu1Unqn0bqdUoq5VbA9OnO8%3D
Cache-Control: max-age=600, private, must-revalidate
Expires: Fri, 06 Mar 2015 12:59:30 GMT
Vary: Accept-Encoding
Connection: close
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
HTTP/1.1 200 OK
Date: Fri, 06 Mar 2015 12:49:34 GMT
Server: Apache/2.2.15
X-Powered-By: PHP/5.3.3
Content-Disposition: inline; filename="samplepage.html"
Cache-Control: max-age=600, private, must-revalidate
Expires: Fri, 06 Mar 2015 12:59:34 GMT
Vary: Accept-Encoding
Connection: close
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
That's easy; now you have the contents of the page as surfed through this web proxy.
Now I want to do the same job using curl.
My problem is that I don't know how to make curl deal with the Content-Disposition response header.
Here is some code to simulate my problem:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://unblockproxy.nu/index.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, array('x' => 'http://www.example.com/samplepage.html'));
curl_setopt($ch, CURLOPT_COOKIESESSION, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookies.txt');
$string = curl_exec($ch);
curl_close($ch);
echo $string;
This returns the contents of http://unblockproxy.nu/ itself, which is not what I want (I want http://www.example.com/samplepage.html as surfed through http://unblockproxy.nu/).
If you want to take a look into the script of this site (2 PHP files only), you can go here
Thank you.
Try this. It works just fine for me, if I'm understanding your question correctly. I removed a lot of code that did nothing. It turns out the problem was that you weren't setting the referer in the request headers.
Let me start from the beginning. Upon submitting the form via POST to view a given website with a proxy, a request is sent to http://unblockproxy.nu/index.php. As you mentioned in your question, index.php handles the form submission and generates an HTTP status code of 302 which essentially just redirects you to another page. Assuming that you send a properly formatted request to index.php, you can parse the response headers and get the value of the redirect URL. Follow the code below to get the redirect URL.
/**
* Submit the form via POST
* @param string $site_url The link to the page that you want to view
* eg: http://sitetoget.com/page.html
* @return string A string containing the response headers
*/
function GetRedirect($site_url) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://unblockproxy.nu/index.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, array('x' => $site_url));
$response = curl_exec($ch);
curl_close($ch);
return $response;
}
/**
* Turn a header string into an associative array
* @param string $response The response headers from the form submission
* @return array An array containing all of the headers
*/
function GetHeaders($response) {
$headers = [];
$text = substr($response, strpos($response, "\r\n\r\n"));
foreach(explode("\r\n", $text) as $i => $line) {
if($i === 0 || $i == 1) {
$headers['http_code'] = $line;
} else {
list($key, $value) = explode(': ', $line);
if($key != '' && $value != '') {
$headers[$key] = $value;
}
}
}
return $headers;
}
// Get the redirect URL
$redirect = GetRedirect('http://lancenewman.me/');
// Parse the response headers
$headers = GetHeaders($redirect);
// Save the redirect URL
$new_url = $headers['Location'];
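GetHeaders() above is sensitive to exactly where the header block starts in the response. As a hedged alternative, here is a sketch of a parser that walks over every header block in a raw response (including an interim "100 Continue") and parses the last one. parseLastHeaderBlock() is a hypothetical name, and it assumes the response was captured with CURLOPT_HEADER enabled, as in GetRedirect().

```php
<?php
// Hypothetical alternative to GetHeaders(), not the answerer's code.
function parseLastHeaderBlock(string $response): array
{
    // Peel off header blocks one by one; each starts with "HTTP/" and ends
    // with a blank line. Whatever follows the last block is the body.
    $blocks = [];
    $rest = $response;
    while (preg_match('~^HTTP/\d~', $rest) && ($end = strpos($rest, "\r\n\r\n")) !== false) {
        $blocks[] = substr($rest, 0, $end);
        $rest = substr($rest, $end + 4);
    }

    $headers = [];
    if ($blocks === []) {
        return $headers;
    }

    $lines = explode("\r\n", end($blocks));
    $headers['http_code'] = array_shift($lines); // the status line
    foreach ($lines as $line) {
        $parts = explode(':', $line, 2); // split on the first colon only
        if (count($parts) === 2) {
            // Header names are case-insensitive, so store them in lower case.
            $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }
    return $headers;
}
```

Because the keys are lower-cased, the redirect target would be read as $headers['location'] instead of $headers['Location'].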
Now that you have the URL that index.php redirects to, send a cURL request to it as follows. Strangely enough, almost all of the other request headers that I've tinkered with play no role in determining whether or not this solution works. The reason your code is getting the contents of http://unblockproxy.nu instead of the contents of the given site as viewed by http://unblockproxy.nu is because you're not following the redirections correctly and you're not setting the referer in request headers. The cookies, content-disposition and all of the other headers seem to play no role in solving this.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $new_url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_REFERER, 'http://unblockproxy.nu');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$string = curl_exec($ch);
curl_close($ch);
echo $string;
It's important to note that some of the images, CSS and JS on some of the pages might not properly load because some use relative URLs instead of absolute ones. Just keep that in mind.
The problem is that it requires two round trips to the server to complete the request. Many sites use this method to reduce the number of requests by "bots". The first request creates a cookie (typically for a "session") which must be present in order for the form to be processed.
Perform the curl_exec() twice and see if you get the results you want. The first time the response will send a cookie which curl will save since you have enabled cookies. The second time you should get the results you want.

logging in to Joomla from external script

I'm trying to log a user who was authenticated elsewhere in to a Joomla site. I was following Brent Friar's nice program, but had to apply two modifications:
added a field "return", which was contained in the form
referencing com_users, not com_user
I do not know whether that site has specific customizations, uses a specific login module, or is a different version. I do not have admin access to the site, so I cannot check.
Now my script runs, but it does not successfully log the user in: it doesn't get the cookie it is expecting in return.
Instead, the site returns
HTTP/1.1 100 Continue
HTTP/1.1 303 See other
Date: Wed, 23 Jul 2014 18:18:25 GMT
Server: Apache/2.2.22
X-Powered-By: PHP/5.2.17
Location: http://www.strassenbau.forum-kundenportal.de/login-erfolgreich
Content-Length: 0
Connection: close
Content-Type: text/html; charset=utf-8
I know a bit of Joomla, but know nothing about the depths of http-communication with it, so I have no idea what the problem is here.
Here's my code:
<?php
$uname = "*** secret";
$upswd = "*** credentials";
$url = "http://www.strassenbau.forum-kundenportal.de/login-anmeldung";
set_time_limit(0);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url );
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE );
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE );
curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE );
curl_setopt($ch, CURLOPT_COOKIEJAR, realpath('./cookie.txt'));
curl_setopt($ch, CURLOPT_COOKIEFILE, realpath('./cookie.txt'));
curl_setopt($ch, CURLOPT_HEADER, TRUE );
$ret = curl_exec($ch);
if (!preg_match('/name="([a-zA-Z0-9]{32})"/', $ret, $spoof)) {
preg_match("/name='([a-zA-Z0-9]{32})'/", $ret, $spoof);
}
preg_match('/name="return" value="(.*)"/', $ret, $return); // search for hidden field "return" and get its value
// POST fields
$postfields = array();
$postfields['username'] = urlencode($uname);
$postfields['password'] = urlencode($upswd);
$postfields['option'] = 'com_users';
$postfields['task'] = 'user.login';
$postfields['return'] = $return[1];
$postfields[$spoof[1]] = '1';
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postfields);
$ret = curl_exec($ch);
echo "ret2: <pre>"; var_dump($ret); echo "</pre>"; // no cookie being set here!
// Get logged in cookie and pass it to the browser
preg_match('/^Set-Cookie: (.*?);/m', $ret, $m);
$cookie=explode('=',$m[1]);
setcookie($cookie[0], $cookie[1]);
?>
I ended up using the AutoLogin-extension which did the job. Not as "elegant" as I wanted it to be (because it requires installation of that plugin), but hey, it works! :-)

How to perform a PUT operation using CURL in PHP?

I would like to perform a PUT operation on a webservice using CURL. Let's assume that:
webservice url: http://stageapi.myprepaid.co.za/api/ConsumerRegisterRequest/cac52674-1711-e311-b4a8-00155d4905d3
municipality= NMBM
sgc= 12345
I've written the code below, but it outputs this error message: "ExceptionMessage":"Object reference not set to an instance of an object.". Any help would be so much appreciated. Thanks!
<?php
function sendJSONRequest($url, $data)
{
$data_string = json_encode($data);
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "PUT");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-Type: application/json',
'Accept: application/json',
'X-MP-Version: 10072013')
);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$result = curl_exec($ch);
curl_close($ch);
// Both the success and the failure branch returned $result, so return it directly.
return $result;
}
$mun = $_GET['municipality'];
$sgc = $_GET['sgc'];
$req = $_GET['req']; //cac52674-1711-e311-b4a8-00155d4905d3
//myPrepaid PUT URL
echo $mpurl = "http://stageapi.myprepaid.co.za/api/ConsumerRegisterRequest/$req";
// Set Variables
$data = array("Municipality" => "$mun", "SGC" => "$sgc");
//Get Response
echo $response = sendJSONRequest($mpurl, $data);
?>
I copied your code, but changed it so it pointed at a very basic HTTP server on my localhost. Your code is working correctly, and making the following request:
PUT /api/ConsumerRegisterRequest/cac52674-1711-e311-b4a8-00155d4905d3 HTTP/1.1
Host: localhost:9420
Content-Type: application/json
Accept: application/json
X-MP-Version: 10072013
Content-Length: 37
{"Municipality":"NMBM","SGC":"12345"}
The error message you're receiving is coming from the stageapi.myprepaid.co.za server. This is the full response when I point it back to them:
HTTP/1.1 500 Internal Server Error
Cache-Control: no-cache
Pragma: no-cache
Content-Type: application/json; charset=utf-8
Expires: -1
Server: Microsoft-IIS/8.0
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Fri, 30 Aug 2013 04:30:41 GMT
Connection: close
Content-Length: 867
{"Message":"An error has occurred.","ExceptionMessage":"Object reference not set to an instance of an object.","ExceptionType":"System.NullReferenceException","StackTrace":" at MyPrepaidApi.Controllers.ConsumerRegisterRequestController.Put(CrmRegisterRequest value) in c:\\Workspace\\MyPrepaid\\Prepaid Vending System\\PrepaidCloud\\WebApi\\Controllers\\ConsumerRegisterRequestController.cs:line 190\r\n at lambda_method(Closure , Object , Object[] )\r\n at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.<>c__DisplayClass13.<GetExecutor>b__c(Object instance, Object[] methodParameters)\r\n at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.Execute(Object instance, Object[] arguments)\r\n at System.Threading.Tasks.TaskHelpers.RunSynchronously[TResult](Func`1 func, CancellationToken cancellationToken)"}
You may want to check out the API to make sure you're passing them the correct information. If you are, the problem could be on their end.
And while I realize this isn't part of your question and this is in development, please remember to sanitize any data from $_GET. :)
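On the sanitizing remark, here is a minimal sketch of validating the three query parameters before they go into the URL and the JSON body. buildRegisterRequest() is a hypothetical name, and the validation rules (a GUID-shaped request id, letters for the municipality, digits for the SGC) are assumptions based only on the sample values in the question; adjust them to whatever the API really accepts.

```php
<?php
// Hypothetical helper; returns null when any input fails validation.
function buildRegisterRequest(array $query): ?array
{
    $req = $query['req'] ?? '';
    $mun = $query['municipality'] ?? '';
    $sgc = $query['sgc'] ?? '';

    // The request id in the question looks like a GUID (8-4-4-4-12 hex digits).
    $guid = '/^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$/i';
    if (!is_string($req) || !preg_match($guid, $req)) {
        return null;
    }
    // Assumed formats: letters and spaces for the municipality, digits for the SGC.
    if (!is_string($mun) || !preg_match('/^[A-Za-z ]{1,64}$/', $mun)) {
        return null;
    }
    if (!is_string($sgc) || !ctype_digit($sgc)) {
        return null;
    }

    return [
        'url'  => 'http://stageapi.myprepaid.co.za/api/ConsumerRegisterRequest/' . $req,
        'body' => json_encode(['Municipality' => $mun, 'SGC' => $sgc]),
    ];
}
```

It could be called as buildRegisterRequest($_GET); when the result is not null, its 'url' and 'body' entries can be handed to the curl call in place of the hand-built $mpurl and the json_encode($data) result.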
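Try with the following (note that CURLOPT_PUT expects the body to be supplied as a file via CURLOPT_INFILE and CURLOPT_INFILESIZE, not via CURLOPT_POSTFIELDS):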
curl_setopt($ch, CURLOPT_PUT, true);
