I'm trying to parse an RSS feed. This code works with a feed that doesn't need authentication, so I assume the problem is in my cURL setup. Please help, thanks.
<?php
$curl = curl_init();
curl_setopt_array($curl, Array(
CURLOPT_URL => 'http://insite.unthsc.edu/dailynews/feed/',
CURLOPT_USERAGENT => 'spider',
CURLOPT_TIMEOUT => 120,
CURLOPT_CONNECTTIMEOUT => 30,
CURLOPT_RETURNTRANSFER => TRUE,
CURLOPT_ENCODING => '' // sets Accept-Encoding (gzip etc.), not a charset; '' allows every encoding cURL supports
));
$data = curl_exec($curl);
curl_close($curl);
$xml = simplexml_load_string($data, 'SimpleXMLElement', LIBXML_NOCDATA);
//die('<pre>' . print_r($xml, TRUE) . '</pre>');
?>
<!DOCTYPE html>
<html>
<head>
<title></title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
</head>
<body>
<?php foreach ($xml->channel->item as $item) {
$creator = $item->children('dc', TRUE);
echo '<h2>' . $item->title . '</h2>';
echo '<h2>' . $item->category . '</h2>';
}
?>
</body>
</html>
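A minimal sketch of one way to send credentials with the request, assuming the feed is protected by HTTP Basic authentication (the question does not say which scheme is used). The function names `fetchFeed` and `parseItems` and the credentials are hypothetical. Checking the HTTP status before parsing should make it clearer whether the failure is in the cURL step or in the XML step.

```php
<?php
// Hypothetical helper: fetch a feed with HTTP Basic credentials.
// Assumption: the server uses Basic auth; other schemes need a different CURLOPT_HTTPAUTH value.
function fetchFeed(string $url, string $user, string $pass): ?string
{
    $curl = curl_init();
    curl_setopt_array($curl, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_HTTPAUTH       => CURLAUTH_BASIC,
        CURLOPT_USERPWD        => $user . ':' . $pass,
        CURLOPT_TIMEOUT        => 120,
        CURLOPT_CONNECTTIMEOUT => 30,
    ]);
    $data   = curl_exec($curl);
    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    curl_close($curl);

    // Treat anything other than a 200 response (e.g. 401) as a failure.
    return ($data !== false && $status === 200) ? $data : null;
}

// Hypothetical helper: turn RSS text into a plain array of items.
function parseItems(string $rss): array
{
    libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    $xml = simplexml_load_string($rss, 'SimpleXMLElement', LIBXML_NOCDATA);
    if ($xml === false) {
        return [];
    }

    $items = [];
    foreach ($xml->channel->item as $item) {
        $dc = $item->children('dc', true); // elements in the dc: namespace
        $items[] = [
            'title'    => (string) $item->title,
            'category' => (string) $item->category,
            'creator'  => (string) $dc->creator,
        ];
    }
    return $items;
}
```

With this, `parseItems(fetchFeed($url, 'username', 'password') ?? '')` returns an empty array when the login or the parse fails, so the `foreach` in the page never runs on a broken object.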
I wrote a very simple script to see what a server was sending:
<?php
$html1 = <<<EOT
<!doctype html>
<html lang=en>
<head>
<meta charset = 'utf-8'>
<title>TradingView Test</title>
</head>
<body>
<pre>
EOT;
$html2 = <<<EOT
</pre>
</body>
</html>
EOT;
date_default_timezone_set('America/Phoenix');
$file = date('Y/m/d h:i:sa') . "\n";
$url = (isset($_SERVER['HTTPS']) && $_SERVER['HTTPS'] === 'on' ? "https" : "http")
. "://$_SERVER[HTTP_HOST]$_SERVER[REQUEST_URI]";
$file .= "\nURL: $url\n\n";
$html = $file;
if (!empty($_REQUEST)) {
$r = "\$_REQUEST:\n" . var_export($_REQUEST, true) . "\n\n";
$file .= $r;
$html .= htmlspecialchars($r);
}
$headers = apache_request_headers();
if (!empty($headers)) {
$h = "HEADERS:\n" . var_export($headers, true) . "\n\n";
$file .= $h;
$html .= htmlspecialchars($h);
}
// if (!empty($_SERVER)) {
// $s = "\$_SERVER:\n" . var_export($_SERVER, true) . "\n\n";
// $file .= $s;
// $html .= htmlspecialchars($s);
// }
$file .= str_repeat('-', 40) . "\n";
file_put_contents('./get-post.log', $file, FILE_APPEND);
# echo $html1 . $html . $html2;
?>
Their example of what the server sends is:
#!/bin/sh
curl -H 'Content-Type: text/plain; charset=utf-8' -d '1111111111,data,more-data' -X POST https://www.testtest.com/
I thought $_REQUEST would show everything that was sent, but it is empty. The script outputs:
2022/03/30 06:15:22pm
URL: https://www.testtest.com/
HEADERS:
array (
'X-Https' => '1',
'Connection' => 'close',
'Accept-Encoding' => 'gzip',
'Content-Type' => 'text/plain; charset=utf-8',
'Content-Length' => '45',
'User-Agent' => 'Go-http-client/1.1',
'Host' => 'www.testtest.com',
)
I suspect the issue is that they are not sending the data as a proper form POST with a named field, and I can't change how the data is sent because it's not my server.
Does anyone know what I can look at in PHP to see the data being sent? Or is this an Apache problem: PHP gets its data through Apache, so maybe a body in that format is just not passed through?
Note that the $_SERVER dump is commented out; even when it was enabled it didn't show anything that helped.
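Found it. PHP only fills $_POST (and therefore $_REQUEST) when the body is form-encoded with named fields (application/x-www-form-urlencoded or multipart/form-data). This request is sent as text/plain, so the string has to be read from the raw input stream: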
<?php
$str = file_get_contents('php://input');
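A short sketch of how the raw body could then be handled, assuming the payload is a single comma-separated line like the one in the curl example. The helper names `readRawBody` and `parsePayload` are made up for illustration.

```php
<?php
// Hypothetical helper: read the raw request body.
// php://input is not available for multipart/form-data requests.
function readRawBody(string $source = 'php://input'): string
{
    $body = file_get_contents($source);
    return $body === false ? '' : $body;
}

// Hypothetical helper. Assumption: the body is one comma-separated line,
// e.g. "1111111111,data,more-data".
function parsePayload(string $body): array
{
    $body = trim($body);
    return $body === '' ? [] : explode(',', $body);
}
```

In the logging script this would be something like `$file .= "BODY:\n" . readRawBody() . "\n\n";` next to the header dump.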
I have a cURL script that fetches JSON data from a server. When I save the data to a file, it works fine and retrieves data only when new values are available.
But when I add an INSERT query to save the values into a database, it starts fetching redundant values in a loop. New data is available every 2 minutes, and when I save to a file the script fetches only the new data every 2 minutes. When I replace the file-writing code with the query, it starts looping and fetches redundant data roughly every 5 seconds.
How can I correct the script so that it fetches data only when newer data is available, as it does when I save to a file?
Here is my code:
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="refresh" content="300">
<title>Weather Data</title>
<meta charset="utf-8">
</head>
<body>
<h2>Connect.php</h2>
<div>
<?php
require("Connection.php");
$curl = curl_init();
curl_setopt_array($curl, array(
CURLOPT_URL => "https://cors.io/?http://api.xxxxx.com/live/xxxxxxxxxx=m/s",
CURLOPT_RETURNTRANSFER => true,
CURLOPT_ENCODING => "",
CURLOPT_MAXREDIRS => 10,
//CURLOPT_TIMEOUT => 30,
CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
CURLOPT_CUSTOMREQUEST => "GET",
CURLOPT_HTTPHEADER => array(
"Cache-Control: no-cache"
),
));
$response = curl_exec($curl);
$Wdata = json_decode($response, true);
$tem = $Wdata["temperature"];
$tem2 = $Wdata["2nd_temp"];
$hum = $Wdata["humidity"];
$tim = $Wdata["dateTime"];
$err = curl_error($curl);
curl_close($curl);
if ($err) {
echo "cURL Error #:" . $err;
} else {
echo $response;
echo "<br>";
echo "temp1:".$tem;
echo "<br>";
echo "humidity:".$hum;
echo "<br>";
echo "time:".$tim;
echo "<br>";
if ( $fl = fopen('weatherData.json','a'))
{ fwrite($fl,"\"Weather Data\": { \"Time\" : \"". $tim . "\" ,"."\"Humidity\" : \"". $hum . "\"}\n" );
//echo $_id.';'.$_time;
fclose($fl); }
// Here code to save $response into database; When I un-comment it, the script starts fetching redundant data in a loop
/*
try{
$sql = "INSERT INTO weatherdata (datetime, temperature, humidity)
VALUES ('".$tim."','".$tem."','".$hum."')";
echo "<meta http-equiv='refresh' content='0'>";
($conn->query($sql));
//$conn = null;
}
catch(PDOException $e)
{
echo $e->getMessage();
}
*/
}
?>
</div>
</body>
</html>
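One thing worth checking in the script above: the commented-out block echoes `<meta http-equiv='refresh' content='0'>`, which asks the browser to reload the page immediately, so each reload repeats the fetch and the INSERT. A minimal sketch of an alternative is below. It assumes `$conn` is a PDO connection (the `catch (PDOException $e)` suggests so), that the `datetime` values sort chronologically, and the helper names are hypothetical. It drops the extra refresh tag, skips readings whose timestamp is already stored, and uses a prepared statement instead of string concatenation.

```php
<?php
// Hypothetical helper: true when the reading's timestamp differs from the last stored one.
function isNewReading(?string $lastStored, string $current): bool
{
    return $current !== '' && $current !== $lastStored;
}

// Hypothetical helper. Assumptions: $conn is a PDO connection and the table is
// weatherdata (datetime, temperature, humidity) as in the question.
function saveReadingIfNew(PDO $conn, array $reading): bool
{
    $last = $conn->query('SELECT MAX(datetime) FROM weatherdata')->fetchColumn();
    $last = ($last === false || $last === null) ? null : (string) $last;

    if (!isNewReading($last, (string) $reading['dateTime'])) {
        return false; // already stored, nothing to do
    }

    $stmt = $conn->prepare(
        'INSERT INTO weatherdata (datetime, temperature, humidity) VALUES (?, ?, ?)'
    );
    return $stmt->execute([
        $reading['dateTime'],
        $reading['temperature'],
        $reading['humidity'],
    ]);
}
```

Called as `saveReadingIfNew($conn, $Wdata);` in place of the commented-out block, the page would then reload only through the `refresh` value of 300 in the `<head>`.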
I want to scrape an HTML page, and I am using cURL in PHP to do it.
I can successfully scrape the content of a specific <div>, e.g.
<div class="someDiv">ABC</div>
With the following working code
<?php
$curl = curl_init('https://www.someUrl.com');
curl_setopt_array($curl, array( CURLOPT_ENCODING => '',
CURLOPT_FOLLOWLOCATION => FALSE,
CURLOPT_FRESH_CONNECT => TRUE,
CURLOPT_SSL_VERIFYPEER => FALSE,
CURLOPT_REFERER => 'http://www.google.com',
CURLOPT_RETURNTRANSFER => TRUE,
CURLOPT_USERAGENT => 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
CURLOPT_VERBOSE => FALSE));
$page = curl_exec($curl);
if(curl_errno($curl))
{
echo 'Scraper error: ' . curl_error($curl);
exit;
}
curl_close($curl);
$regex = '/<div class="someDiv">(.*?)<\/div>/s';
if (preg_match_all($regex, $page, $result)){
echo $result[1][0];
}
else{
print "Not found";
}
?>
Now I want to scrape an <img> nested inside a <span>. The code I want to scrape is as follows:
<span class="thumbnail">
<img src="image.gif" width="20" data-thumb="blabla/photo.jpg" height="20" alt="abc" >
</span>
I want to get the data-thumb from the <img> tag nested inside a <span> having class="thumbnail".
Here we go again... don't use regex to parse HTML. Use an HTML parser like DOMDocument along with DOMXPath, i.e.:
<?php
...
$page = curl_exec($curl);
$dom = new DOMDocument();
$dom->loadHTML($page);
$xpath = new DOMXPath($dom);
foreach ($xpath->query("//span[@class='thumbnail']/img") as $img){
echo $img->getAttribute('data-thumb');
}
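For reference, a self-contained version of the same idea that can be run without the cURL step. The function name `extractThumbs` is made up for this sketch, and the sample markup is the snippet from the question. Note that `@class='thumbnail'` only matches when the attribute is exactly `thumbnail`; a span with several classes would need a `contains()` test instead.

```php
<?php
// Hypothetical helper: collect data-thumb values from <img> tags
// directly inside <span class="thumbnail">.
function extractThumbs(string $html): array
{
    if ($html === '') {
        return []; // loadHTML() rejects an empty string
    }

    libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep warnings quiet
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath  = new DOMXPath($dom);
    $thumbs = [];
    foreach ($xpath->query("//span[@class='thumbnail']/img[@data-thumb]") as $img) {
        $thumbs[] = $img->getAttribute('data-thumb');
    }
    return $thumbs;
}

$sample = '<span class="thumbnail">'
        . '<img src="image.gif" width="20" data-thumb="blabla/photo.jpg" height="20" alt="abc" >'
        . '</span>';
print_r(extractThumbs($sample));
```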
I need to POST to a URL, and I am doing this with cURL. The problem is with the HTML content I am posting: the page I am requesting sends an HTML email, so the content has inline styles. When I urlencode() or rawurlencode() it, the style attributes are stripped, so the mail does not look correct. How can I avoid this and post the HTML as it is?
This is my code:
$mail_url = "to=".$email->uEmail;
$mail_url .= "&from=info@domain.com";
$mail_url .= "&subject=".$email_campaign[0]->email_subject;
$mail_url .= "&type=signleOffer";
$mail_url .= "&html=".rawurlencode($email_campaign[0]->email_content);
//open curl request to send the mail
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_POST,true);
curl_setopt($ch,CURLOPT_POSTFIELDS,$mail_url);
//execute post
$result = curl_exec($ch);
Here is an example. Use http_build_query() to build your POST data from an array of values:
<?php
//Receiver debug
if($_SERVER['REQUEST_METHOD']=='POST'){
file_put_contents('test.POST.values.txt',print_r($_POST,true));
/*
Array
(
[to] => example@example.com
[from] => info@domain.com
[subject] => subject
[type] => signleOffer
[html] =>
<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Mail Template</title>
<style>.yada{color:green;}</style>
</head>
<body>
<p style="color:red">Red</p>
<p class="yada">Green</p>
</body>
</html>
)
*/
die;
}
$curl_to_post_parameters = array(
'to'=>'example@example.com',
'from'=>'info@domain.com',
'subject'=>'subject',
'type'=>'signleOffer',
'html'=>'
<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Mail Template</title>
<style>.yada{color:green;}</style>
</head>
<body>
<p style="color:red">Red</p>
<p class="yada">Green</p>
</body>
</html>
'
);
$curl_options = array(
CURLOPT_URL => "http://localhost/test.php",
CURLOPT_POST => true,
CURLOPT_POSTFIELDS => http_build_query( $curl_to_post_parameters ), //<<<
CURLOPT_RETURNTRANSFER => true,
CURLOPT_HEADER => false
);
$curl = curl_init();
curl_setopt_array($curl, $curl_options);
$result = curl_exec($curl);
curl_close($curl);
?>
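To see that the encoding itself does not lose the inline styles, here is a small round-trip sketch without any network call. It uses `parse_str()` to stand in for what PHP does with a form-encoded body on the receiving side; the field values are made up for illustration.

```php
<?php
// Illustrative values only; the HTML carries an inline style attribute.
$fields = [
    'to'   => 'example@example.com',
    'type' => 'signleOffer',
    'html' => '<p style="color:red">Red &amp; bold</p>',
];

// Every key and value is URL-encoded and joined with "&".
$body = http_build_query($fields, '', '&');

// Decode the body the way the receiver would for form-encoded data.
parse_str($body, $decoded);

var_dump($decoded === $fields);
```

Because every value goes through the same encoder, characters such as `&`, `=` and `"` inside the HTML cannot be mistaken for field separators.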
Do a POST as described in this post:
Passing $_POST values with cURL
It should solve your problem.
I have this PHP file that gets the points (id, name, geolocation) of London. I get the correct results as JSON, but when I decode it and try to reach the contains array of results, I get an error. How can I get the data from the '/location/location/contains' attribute?
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<title>search</title>
</head>
<body>
<?php
function freebasequery ($fid){
$query = array(array('id' => $fid, '/location/location/contains'=>array(array('id'=>NULL,'name' => NULL,'/location/location/geolocation' =>array(array('/location/geocode/longitude' =>NULL,'/location/geocode/latitude' => NULL))))));
$query_envelope = array('query' => $query);
$service_url = 'http://api.freebase.com/api/service/mqlread';
$url = $service_url . '?query=' . urlencode(json_encode($query_envelope));
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
curl_close($ch);
return $response;
}
$points=freebasequery('/en/london');
//echo $points;
$results=json_decode($points)->result;
foreach($results as $poi){
echo $poi->id;
$contains="/location/location/contains";
$poisarray=$poi->$contains;
foreach($poisarray as $point){
echo $point->id;
}
}
?>
</body>
</html>
The error was in the json_decode() call: it needs true as the second argument so that it returns associative arrays. The solution is:
json_decode($points,true);
and then I can access the data array I want like this:
$results["result"][0]["/location/location/contains"];
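A small sketch of the array-based access, using a made-up response shaped like the query in the question (the ids and names are illustrative, not real API output). The function name `containedIds` is hypothetical.

```php
<?php
// Hypothetical helper: collect the ids listed under
// "/location/location/contains" for every result.
function containedIds(string $json): array
{
    $decoded = json_decode($json, true); // true => associative arrays
    if (!is_array($decoded) || !isset($decoded['result'])) {
        return [];
    }

    $ids = [];
    foreach ($decoded['result'] as $poi) {
        // Keys with slashes are ordinary array keys once decoded as arrays.
        foreach ($poi['/location/location/contains'] ?? [] as $point) {
            $ids[] = $point['id'];
        }
    }
    return $ids;
}

// Illustrative sample data only.
$points = '{"result":[{"id":"/en/london","/location/location/contains":'
        . '[{"id":"/en/soho","name":"Soho"},{"id":"/en/camden","name":"Camden"}]}]}';
print_r(containedIds($points));
```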