I want to insert values from the first and second foreach loops into a database, but I have run into trouble. I have described the problem in the code comments. I cannot solve this two-loop problem and would appreciate some help.
<?php
header('Content-type:text/html; charset=utf-8');
set_time_limit(0);
require_once ('../conn.php');
require_once ('../simple_html_dom.php');
$url = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&rsz=large&q=obama&key={api-key}";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, $url);
$body = curl_exec($ch);
curl_close($ch);
$data = json_decode($body);
foreach ($data->responseData->results as $result) {
$title = html_entity_decode($result->titleNoFormatting);
$link = html_entity_decode($result->unescapedUrl);
$html = @file_get_html($link );
foreach(@$html->find('h3') as $element) {
$table=$element;
echo $table;// here while the $table is empty, echo is null.
}
echo $table;// here while the $table is empty, echo will repeat the prev $table value.
mysql_query("SET NAMES utf8");
mysql_query("INSERT INTO ...");// I want insert all the $title and $table into database.
}
echo '<hr />';
}
?>
Here is the printed result. When $table is empty, echo repeats the previous $table value:
Organizing for America | BarackObama.com
Barack Obama - Wikipedia, the free encyclopedia
President Barack Obama | The White House
President Obama Nominates William Francis Kuntz, II to the United States District Court//the prev value
Change.gov - The Official Web Site of the
President Obama Nominates William Francis Kuntz, II to the United States District Court//here the $table is empty, it will repeat the prev $table value, and it should be empty.
Barack Obama on Myspace
Idle Friends▼
ob (obama) on Twitter
Piè di pagina
Barack Obama
Advertise with the NY Daily News!
Barack Obama on the Issues
Voting Record
PHP's variable initialization and scoping rules are kind of funny.
At no point are you initializing $table. It first gets referenced two foreaches deep. PHP allows this, and won't complain about it.
The problem is that you're constantly trying to set it to a value, but you're never actually resetting it.
Initialize it to null before the inner foreach:
$html = file_get_html($link );
$table = null; // <-- New!
foreach($html->find('h3') as $element) {
$table = $element;
echo $table;
}
This ensures that, when the foreach is completed, $table will either be null, or it will be the final H3 element in the HTML document you fetched. (Incidentally, if you really did want the final H3, you can probably just grab the array that find returns and look at the last element rather than looping through.)
Also, please get rid of the @ error-silencing operators, turn error_reporting all the way up, and make sure you've turned on display_errors. You may have other errors lurking that you are intentionally ignoring, and that leads to horror stories.
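As a rough sketch of the "just take the last element" idea, here is a self-contained version. It swaps in PHP's built-in DOMDocument for simple_html_dom so that it runs without the library, and the $pages array and the last_h3() helper are hypothetical names standing in for the fetched search results:

```php
<?php
// Hypothetical sample pages standing in for the fetched search results.
$pages = [
    'first'  => '<html><body><h3>One</h3><h3>Two</h3></body></html>',
    'second' => '<html><body><p>No headings here</p></body></html>',
];

/**
 * Return the text of the last <h3> in an HTML string, or null if there is none.
 */
function last_h3(string $html): ?string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
    $dom->loadHTML($html);
    libxml_clear_errors();

    $headings = $dom->getElementsByTagName('h3');
    if ($headings->length === 0) {
        return null;                    // no heading: nothing carried over from an earlier page
    }
    return trim($headings->item($headings->length - 1)->textContent);
}

$results = [];
foreach ($pages as $title => $html) {
    // The value is computed fresh for every page, so a page without
    // an <h3> yields null instead of the previous page's heading.
    $results[$title] = last_h3($html);
}
var_dump($results);
```

Because the helper returns a fresh value on every call, there is no variable left over from a previous iteration to reset.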
Related
I am trying to retrieve a table's content (everything under <tbody>) from a URL into my page.
It could also be everything under <table>, but with <thead>...</thead> removed.
I have searched many references on this forum but have not been able to get the result I want.
The HTML structure is shown in the image (the actual code is too lengthy to paste here):
[1]: https://i.stack.imgur.com/SgwM1.png
Appreciate if you can show me the light
Orz
My sample code:
$url = 'https://xxxxxx.com/tracking/SUA000085003';
$ch = curl_init($url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
$cl = curl_exec($ch);
$dom = new DOMDocument();
$dom->loadHTML($cl);
$dom->validate();
$rows = $dom->getElementsByTagName("tr");
foreach ($rows as $row) {
$cells = $row -> getElementsByTagName('td');
foreach ($cells as $cell) {
print $cell->nodeValue; // print cells' content as 124578
echo "<BR>";
}
}
The result I got is:
https://xxxxxx.com/tracking/SUA000085003
15 May 202101:35:33
the goods left the warehouse in guangzhou
15 May 202101:35:33
arrived at sorting facility
14 May 202123:35:33
express operation is complete
The URL in the result comes from <table><thead>...</thead>.
I would like to remove this text entirely, or show only the text after the last /, which is SUA000085003 in this case.
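One possible approach, sketched under assumptions: restrict the query to rows under <tbody> with DOMXPath, and use basename() to keep only the part of the header URL after the last slash. The markup below is a hypothetical stand-in modelled on the description (a URL cell inside <thead>, tracking rows inside <tbody>); the real page may be structured differently.

```php
<?php
// Hypothetical markup modelled on the structure described above.
$html = <<<HTML
<table>
  <thead><tr><td>https://xxxxxx.com/tracking/SUA000085003</td></tr></thead>
  <tbody>
    <tr><td>15 May 2021 01:35:33</td><td>arrived at sorting facility</td></tr>
    <tr><td>14 May 2021 23:35:33</td><td>express operation is complete</td></tr>
  </tbody>
</table>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Only rows inside <tbody> are visited, so the <thead> URL is skipped.
$rows = [];
foreach ($xpath->query('//tbody/tr') as $row) {
    $cells = [];
    foreach ($xpath->query('td', $row) as $cell) {
        $cells[] = trim($cell->nodeValue);
    }
    $rows[] = $cells;
}

// Optionally keep just the tracking code: the text after the last "/".
$headerCell = $xpath->query('//thead//td')->item(0);
$code = $headerCell ? basename(trim($headerCell->nodeValue)) : null;

print_r($rows);
echo $code, "\n";
```

If the header should simply disappear, the $code part can be dropped and only $rows printed.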
I have created a script where every other word in a paragraph is green, which is correct. However there is a problem because the original paragraph which I used appears above the new paragraph, which I do not want.
The solution may be simple, but I can't get my head around it.
Can anyone point me in the right direction?
Code:
<?php
$storyOfTheDay= "Once upon a time there was an old woman who loved baking gingerbread. She would bake gingerbread cookies, cakes, houses and gingerbread people, all decorated with chocolate and peppermint, caramel candies and colored frosting.
She lived with her husband on a farm at the edge of town. The sweet spicy smell of gingerbread brought children skipping and running to see what would be offered that day.
Unfortunately the children gobbled up the treats so fast that the old woman had a hard time keeping her supply of flour and spices to continue making the batches of gingerbread. Sometimes she suspected little hands of having reached through her kitchen window because gingerbread pieces and cookies would disappear.";
$storyOfTheDay = preg_split("/\s+/", $storyOfTheDay);
//Adding <span> to odd array index items
foreach (array_chunk($storyOfTheDay , 2) as $chunk) {
$storyOfTheDay[] = $chunk[0];
if(!empty( $chunk[1]))
{
$storyOfTheDay[] = $chunk[1]= "<span style='color:green'>". $chunk[1] ."</span>";
}
}
$storyOfTheDay = join(" ", $storyOfTheDay);
echo $storyOfTheDay;
Output:
Image of Output
You keep appending to the same array ($storyOfTheDay), so the original words stay at the front of it and are printed first. Build a new array instead:
$storyOfTheDay = preg_split("/\s+/", $storyOfTheDay);
$newStoryOfTheDay = [];
//Adding <span> to odd array index items
foreach (array_chunk($storyOfTheDay , 2) as $chunk) {
$newStoryOfTheDay[] = $chunk[0];
if( !empty($chunk[1]) ){
$newStoryOfTheDay[] = "<span style='color:green'>". $chunk[1] ."</span>";
}
}
$newStoryOfTheDay = join(" ", $newStoryOfTheDay);
echo $newStoryOfTheDay;
Please help me with XPath.
The following script scrapes the main body of a URL using XPath:
<?php
//sentimen order
if (PHP_SAPI != 'cli') {
echo "<pre>";
}
require_once __DIR__ . '/../autoload.php';
$sentiment = new \PHPInsight\Sentiment();
require_once 'Xpath.php';
$startUrl = "http://news.sky.com/story/1445575/suspect-held-over-shooting-of-ferguson-police/";
$xpath = new XPATH($startUrl);
// We start from the root element
$query = '/html/body/div[2]/div[3]/article/div/div[2]/div[2]/p[3]';
$strQuery = $xpath->query($query);
$strNode = $strQuery->item(0)->nodeValue;
$result = array($strNode);
foreach ($result as $string) {
// calculations:
$scores = $sentiment->score($string);
$class = $sentiment->categorise($string);
// output:
echo "Strings $string \n";
echo "Dominant: $class, scores: ";
print_r($scores);
echo "\n";
}
The script above runs well except for the array loop: XPath does not scrape ALL of the content, ONLY the first line of the main body.
I think the problem lies in the array and the foreach loop.
Can anyone please help me fix this loop?
You only fetch one paragraph. Additionally, you only put one string into the array.
You're perhaps looking for something more along these lines:
foreach ($xpath->query('
//header/h1
|//header/p
|//header//p[@class="last-updated__text"]
|//div[@class="story__content"]/p') as $p) {
echo string_normalize($p->textContent), "\n\n";
}
function string_normalize($string)
{
return preg_replace('~\s+~u', ' ', trim($string));
}
Output:
Shooting Of Ferguson Police: Suspect Charged
A prosecutor says the 20-year-old suspect claims he fired the shots in a dispute with other individuals and did not aim at police.
05:19, UK, Monday 16 March 2015
By Sky News US Team
A suspect has been charged in connection with the shooting and wounding last week of two police officers in Ferguson, Missouri.
St Louis County prosecutor Robert McCulloch told a news conference the accused was 20-year-old Jeffrey Williams.
He said the suspect, a local resident, was facing two counts of assault in the first degree.
Williams, who was arrested on Saturday night, is also charged with firing a handgun from a vehicle.
"He has acknowledged his participation in firing the shots," Mr McCulloch told reporters.
...
Well, what I'm trying to do is quite obvious. I am receiving tweets as shown below :
$options .= 'q='.urlencode($hash_tag);
$options .= '&page=15';
$options .= '&rpp=100';
$options .= '&result_type=recent';
$url = 'https://search.twitter.com/search.atom?'.$options ;
$ch = curl_init($url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, TRUE);
$xml = curl_exec ($ch);
curl_close ($ch);
$affected = 0;
$twelement = new SimpleXMLElement($xml);
foreach ($twelement->entry as $entry) {
$text = trim($entry->title);
$author = trim($entry->author->name);
$time = strtotime($entry->published);
$id = $entry->id;
echo '<hr>';
echo "Yazan : ".$author;
echo "</br>";
echo "Tarih : ".date('Ymd H:i:s',$time);
echo "</br>";
echo "Tweet : ".$text;
echo "</br>";
}
As you can see at this link (linkToTrial), I do receive tweets, but they are too old for me. I want tweets from the last few moments, or at least from the last 5 minutes. Here it says:
This sounds like something you can do on your end, as created_at is one of the fields returned in the result set. Just do your query, and only use the ones that are within the last 5 seconds.
but when you check my example, you will see that I'm not even receiving the latest tweets. What am I doing wrong?
Any answer will be appreciated. Thanks for your responses.
You're using a deprecated API (search.twitter.com) that will cease functioning on May 7, 2013 -- you'll want to move to the v1.1 Search API -- see https://dev.twitter.com/docs/api/1.1/get/search/tweet for docs.
It looks like the specific reason you're getting older results with this query is that you're starting on page 15 -- the end of the result set. The most recent tweets will be at the beginning of the result set -- page 1.
In API v1.1, the concept of paging no longer exists for the Search API. Instead, you navigate through the result set using since_id and max_id, details here: https://dev.twitter.com/docs/working-with-timelines
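For the endpoint used in the question, the page fix might look like the sketch below. It only builds the request URL and makes no network call; build_search_url() is a hypothetical helper name, and the parameters are the ones from the question with page defaulting to 1. Since that endpoint is deprecated, treat this as an illustration of the paging point rather than something to deploy.

```php
<?php
/**
 * Build the search URL from the question, defaulting to the first page,
 * which is where the most recent tweets are.
 */
function build_search_url(string $hashTag, int $page = 1): string
{
    $params = [
        'q'           => $hashTag,
        'page'        => $page,      // the question used 15, the end of the result set
        'rpp'         => 100,
        'result_type' => 'recent',
    ];
    // http_build_query() URL-encodes each value, so urlencode() is not needed.
    return 'https://search.twitter.com/search.atom?' . http_build_query($params, '', '&');
}

echo build_search_url('#php'), "\n";
```

Building the query string in one place also avoids appending to an $options variable that was never initialized.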
I found this site, which provides an IMDb API:
http://www.omdbapi.com
Getting, for example, the data for The Hobbit is as easy as this:
http://www.omdbapi.com/?i=tt0903624
Then I get all this information:
{"Title":"The Hobbit: An Unexpected Journey","Year":"2012","Rated":"11","Released":"14 Dec 2012","Runtime":"2 h 46 min","Genre":"Adventure, Fantasy","Director":"Peter Jackson","Writer":"Fran Walsh, Philippa Boyens","Actors":"Martin Freeman, Ian McKellen, Richard Armitage, Andy Serkis","Plot":"A curious Hobbit, Bilbo Baggins, journeys to the Lonely Mountain with a vigorous group of Dwarves to reclaim a treasure stolen from them by the dragon Smaug.","Poster":"http://ia.media-imdb.com/images/M/MV5BMTkzMTUwMDAyMl5BMl5BanBnXkFtZTcwMDIwMTQ1OA@@._V1_SX300.jpg","imdbRating":"9.2","imdbVotes":"5,666","imdbID":"tt0903624","Response":"True"}
The thing is that I only want, for example, the title, the year and the plot, and I wonder how I can retrieve just those.
I want to use PHP.
Here you go... simply decode the json, and pull out the data you need. If need be, you can re-encode it as json afterwards.
$data = file_get_contents('http://www.omdbapi.com/?i=tt0903624');
$data = json_decode($data, true);
$data = array('Title' => $data['Title'], 'Plot' => $data['Plot']);
$data = json_encode($data);
print($data);
Another way to do this (slightly more efficiently) is to unset unneeded keys, e.g.:
$data = file_get_contents('http://www.omdbapi.com/?i=tt0903624');
$data = json_decode($data, true);
$keys = array_keys($data);
foreach ($keys as $key) {
if ($key != 'Title' && $key != 'Plot') {
unset($data[$key]);
}
}
$data = json_encode($data);
print($data);
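A more compact variant of the same idea, as a sketch: array_intersect_key() keeps only the keys you list. The JSON string below is a shortened copy of the response from the question, used so the example runs without a network call, and $wanted is just an illustrative variable name.

```php
<?php
// Shortened copy of the response shown in the question (no network call).
$json = '{"Title":"The Hobbit: An Unexpected Journey","Year":"2012","Rated":"11",'
      . '"Plot":"A curious Hobbit, Bilbo Baggins, journeys to the Lonely Mountain with a vigorous group of Dwarves to reclaim a treasure stolen from them by the dragon Smaug.",'
      . '"Response":"True"}';

$wanted = ['Title', 'Year', 'Plot'];   // the fields to keep

$data = json_decode($json, true);      // true => associative array

// array_flip() turns the list into keys; array_intersect_key() then keeps
// only the entries of $data whose keys appear in that list.
$subset = array_intersect_key($data, array_flip($wanted));

echo json_encode($subset), "\n";
```

Adding or removing a field then only means editing the $wanted list.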
OMDBAPI.com is no longer free to use. As you can read on their website:
05/08/17 - Going Private! Please go read the post on the Patreon page about this major change.
This means you must become a donor to get access to the API. I used their API for over a year, and now it has stopped working for me. If you need to perform lots of queries, becoming a sponsor of OMDBAPI is probably a good idea. However, I only used their API in a small private project. After googling a bit, I found another API. Here's the code you can use:
<?php
$imdbID = 'tt2866360';
$data = json_decode(file_get_contents('http://api.rest7.com/v1/movie_info.php?imdb=' . $imdbID));
if (@$data->success !== 1)
{
die('Failed');
}
echo '<pre>';
print_r($data->movies[0]);
I am not affiliated with this website, but I use this API, so I can answer a question or two if anyone has any.