I'm trying to parse the price from a site. I can already retrieve the title from the source, but I get a Notice when I attempt to scrape the price.
Notice: Undefined offset: 1
Here's the code:
<?php
$file_string = file_get_contents('http://finance.google.com');
preg_match('/<title>(.*)<\/title>/i', $file_string, $title);
$title_out = $title[1];
preg_match('~<span id="ref_658274_l">(.*)</span>~', $file_string, $price);
//error on the line below
$price_out = $price[1];
?>
<?php echo "$title_out"; ?>
<?php echo "$price_out"; ?>
Parsing HTML may be more successful with DOMDocument:
$doc = new DOMDocument();
$doc->loadHTML(file_get_contents('http://finance.google.com'));
$titleElems = $doc->getElementsByTagName('title');
if ($titleElems->length) {
$title = $titleElems->item(0)->nodeValue;
}
$priceElem = $doc->getElementById('ref_658274_l');
if ($priceElem != null) {
$price = $priceElem->nodeValue;
}
Your regular expression doesn't match. When using the result, you should always validate that the index you are using, in this case 1, is within the bounds of your array.
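A minimal sketch of that validation, assuming a hypothetical helper named first_capture() and a made-up sample string in place of the downloaded page. It checks the return value of preg_match() and the presence of index 1 before reading it:

```php
<?php
// Hypothetical helper (not from the original post): returns the first
// capture group, or null when the pattern does not match.
function first_capture(string $pattern, string $subject): ?string
{
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    if (preg_match($pattern, $subject, $m) === 1 && isset($m[1])) {
        return $m[1];
    }
    return null;
}

// Sample markup standing in for the downloaded page (an assumption).
$html = '<title>Example</title><span id="other">12.34</span>';

$title = first_capture('/<title>(.*?)<\/title>/i', $html);
$price = first_capture('~<span id="ref_658274_l">(.*?)</span>~', $html);

echo $title ?? 'title not found', "\n";
echo $price ?? 'price not found', "\n";
```

With this shape the caller decides what to do when the span is missing, instead of hitting the undefined offset.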
I can't figure out where the mistake is in this code or how to solve this error.
Error reported is:
Notice: Undefined offset: 1 in /Applications/XAMPP/xamppfiles/htdocs/bat/rd-search.php on line 53, which is this line of code:
$final_result[$file_count]['page_title'][] = $page_title[1];
the code is:
$contents = file_get_contents($file);
preg_match("/\<title\>(.*)\<\/title\>/", $contents, $page_title); //getting page title
if (preg_match("#\<body.*\>(.*)\<\/body\>#si", $contents, $body_content)) { //getting content only between <body></body> tags
$clean_content = strip_tags($body_content[0]); //remove html tags
$clean_content = preg_replace('/\s+/', ' ', $clean_content); //remove duplicate whitespaces, carriage returns, tabs, etc
$found = strpos_recursive(mb_strtolower($clean_content, 'UTF-8'), $search_term);
$final_result[$file_count]['page_title'][] = $page_title[1];
$final_result[$file_count]['file_name'][] = preg_replace("/^.{3}/", "\\1", $file);
}
for ($j = 0; $j < count($template_tokens); $j++) {
if (preg_match("/\<meta\s+name=[\'|\"]" . $template_tokens[$j] . "[\'|\"]\s+content=[\'|\"](.*)[\'|\"]\>/", $contents, $res)) {
$final_result[$file_count][$template_tokens[$j]] = $res[1];
}
}
This is because the preg_match returns no matches.
You can resolve this by adding a line below such as the following:
preg_match("/\<title\>(.*)\<\/title\>/", $contents, $page_title); //getting page title
$page_title = empty($page_title) ? [0 => '', 1 => ''] : $page_title;
This will just create an array with 0 and 1 as indexes if preg_match failed to get any matches for a title.
The other alternative is to check the value of $page_title before trying to access it. This approach means that you will have to create a check each time you want to access this variable, whereas the first approach doesn't because the indexes have blank values.
if (!empty($page_title)) {
$final_result[$file_count]['page_title'][] = $page_title[1];
}
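A third option, not mentioned above, is the null coalescing operator (PHP 7.0 and later), which supplies a default at the point of access. A small sketch, assuming a sample string without a title tag:

```php
<?php
// Sample input without a <title> element (an assumption for illustration).
$contents = '<html><body>No title here</body></html>';

preg_match('/<title>(.*?)<\/title>/', $contents, $page_title);

// $page_title is an empty array here, so index 1 does not exist.
// The ?? operator returns the default without raising a notice.
$title = $page_title[1] ?? '';

var_dump($title);
```

This keeps the check next to the place where the value is used, at the cost of repeating the default wherever the index is read.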
I currently get this error while parsing the stock price from Yahoo.
I receive it only for some stocks, however.
I've looked through some other posts and haven't been able to resolve this.
Here's my scraper code:
<?php
//session_start();
#$stockname = "COALINDIA.NS";
$stockname = "AAPL";
#$stockname = "GOOG";
#$stockname = "TCS.NS";
$url = 'https://finance.yahoo.com/quote/' . $stockname;
$data = file_get_contents($url);
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($data);
libxml_clear_errors();
$dom->saveHTML();
$xpath = new DOMXPath($dom);
$price = $xpath->query('//*[@id="quote-header-info"]/div[3]/div[1]/div/span[1]/text()');
$sign = $xpath->query('//*[@id="quote-header-info"]/div[3]/div[1]/div/span[2]/text()');
$change = $xpath->query('//*[@id="quote-header-info"]/div[3]/div[1]/div/span[2]/text()[2]');
$current_price = $price->item(0)->nodeValue;
$gain_and_percent = $sign->item(0)->nodeValue . $change->item(0)->nodeValue;
echo $current_price;
echo "<br>";
echo "Gain (Gain %): ". $gain_and_percent;
?>
The error I receive is:
Notice: Trying to get property of non-object in /Applications/XAMPP/xamppfiles/htdocs/Portfolio-Manager/scripts/scraper.php on line 27
Notice: Trying to get property of non-object in /Applications/XAMPP/xamppfiles/htdocs/Portfolio-Manager/scripts/scraper.php on line 28
This refers to the lines where I assign the variables $current_price and $gain_and_percent. I do not receive this notice when I echo the values separately; however, I do when I output them into an HTML table as follows:
<?php
while ($record = mysqli_fetch_assoc($result)) {
?>
<td>
<?php
$stockname = $symbol;
require 'scripts/scraper.php';
echo $current_price;
?>
</td>
Thank you for looking over my code and your time.
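For reference, this notice typically appears when an XPath query matches nothing, because item(0) then returns null. A minimal sketch of a guard, assuming a hypothetical helper first_node_value() and a made-up snippet of markup in place of the real Yahoo page:

```php
<?php
// Sample markup standing in for the fetched page (an assumption).
$data = '<html><body><div id="quote-header-info"><span>189.50</span></div></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($data);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Hypothetical helper: value of the first matching node, or null.
function first_node_value(DOMXPath $xpath, string $expression): ?string
{
    $nodes = $xpath->query($expression);
    // query() returns false for a malformed expression;
    // an empty node list means nothing matched.
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->nodeValue;
}

$current_price = first_node_value($xpath, '//*[@id="quote-header-info"]/span[1]/text()');
$missing = first_node_value($xpath, '//*[@id="quote-header-info"]/div[3]/span[1]/text()');

echo $current_price ?? 'n/a', "\n";
echo $missing ?? 'n/a', "\n";
```

If a script like this is pulled in with require inside a loop, the helper would need to live in a separate file loaded with require_once, since PHP does not allow a function to be declared twice.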
There are many posts on this topic but I did not get the required answer. Hence, I am here.
I have been getting the error Notice: Trying to get property of non-object in /opt/lampp/htdocs/amit/crawlnepalstock.php on line 49 in my PHP page.
Here is my code:
<?php
include_once('simple_html_dom.php');
error_reporting(E_ALL);
$html = file_get_html('http://nepalstock.com/datanepse/index.php');
$indexarray = array('s_no','stocksymbol', 'LTP', 'LTV', 'point_change', 'per_change', 'open','high', 'low', 'volume','prev_close');
$stocks = array();
$maincount = 0;
$tables = $html->find('table[class=dataTable]');
$str = $html->plaintext;
$matches = array();
foreach ($tables[0]->find('tr') as $elementtr) {
$count = 0;
$temp = array();
$anchor = $elementtr->children(1)->find('a',0);
$splits = preg_split('/=/', $anchor->href); // line 49
$temp['stocksymbol'] = isset($splits[1]) ? $splits[1] : null;
$temp['fullname'] = $elementtr->children(1)->plaintext;
$temp['no_of_trans'] = $elementtr->children(2)->plaintext;
$temp['max_price'] = $elementtr->children(3)->plaintext;
$temp['min_price'] = $elementtr->children(4)->plaintext;
$temp['closing_price'] = $elementtr->children(5)->plaintext;
$temp['total_share'] = $elementtr->children(6)->plaintext;
$temp['amount'] = $elementtr->children(7)->plaintext;
$temp['previous_close'] = $elementtr->children(8)->plaintext;
$temp['difference'] = $elementtr->children(9)->plaintext;
$stocks[] = $temp;
}
$html->clear();
unset($html);
echo '<pre>';
print_r($stocks);
echo '</pre>';
?>
I have not included the simple_html_dom.php class as it is quite long. Your opinions are very much appreciated. You can find the simple_html_dom.php file online if needed: http://sourceforge.net/projects/simplehtmldom/files/
You are trying to access a property of a non-object or of a null value, e.g.
$obj = null;
echo $obj->first_name; // this will produce the same error as you are getting.
// another example may be
$obj = array();
echo $obj->first_name; // this will also produce same error.
It is not clear from the code sample which value on line 49 is the non-object, so you should check for this type of error on line 49 yourself.
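A sketch of such a check, assuming the non-object is the anchor lookup for rows that contain no link (a header row, for example). It uses PHP's built-in DOMDocument and a made-up table rather than simple_html_dom so that it runs on its own; the same kind of null test should apply to the result of find('a', 0):

```php
<?php
// Sample table (an assumption): the header row has no link.
$markup = '<table>'
    . '<tr><td>S.N.</td><td>Name</td></tr>'
    . '<tr><td>1</td><td><a href="company.php?symbol=ABC">ABC Ltd</a></td></tr>'
    . '</table>';

$doc = new DOMDocument();
$doc->loadHTML($markup);

$symbols = [];
foreach ($doc->getElementsByTagName('tr') as $row) {
    // item(0) returns null when the row contains no <a> element.
    $anchor = $row->getElementsByTagName('a')->item(0);
    if ($anchor === null) {
        continue; // skip the row instead of reading a property of null
    }
    $splits = explode('=', $anchor->getAttribute('href'));
    $symbols[] = $splits[1] ?? null;
}

print_r($symbols);
```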
This is happening because there is no longer a td[align="center"] tag found in the google.com document. Perhaps it was there when the code was first written.
So, what the others are saying about a non-object is true, but because the HTML was not found, there is not an object to use the ->plaintext method on.
As of 12/11/2020, if you change the URL found in line 6 of example_basic_selector.php to this:
$html = file_get_html('https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_td_align_css');
And change this line
echo $html->find('td[align="center"]', 1)->plaintext . '';
to:
echo $html->find('td[style="text-align:right"]', 1)->plaintext . '';
the error will go away because the text it searches for is found, and thus the method works as intended.
I am having a problem with the Yahoo search API: sometimes it works and sometimes it doesn't. Why am I getting this problem?
I am using this URL
http://api.search.yahoo.com/WebSearchService/rss/webSearch.xml?appid=yahoosearchwebrss&query=originurlextension%3Apdf+$search&adult_ok=1&start=$start
The code is given below:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<? $search = $_GET["search"];
$replace = " "; $with = "+";
$search = str_replace($replace, $with, $search);
if ($rs =
$rss->get("http://api.search.yahoo.com/WebSearchService/rss/webSearch.xml?appid=yahoosearchwebrss&query=originurlextension%3Apdf+$search&adult_ok=1&start=$start")
)
{ }
// Go through the list powered by the search engine listed and get
// the data from each <item>
$colorCount="0";
foreach($rs['items'] as $item) { // Get the title of result
$title = $item['title']; // Get the description of the result
$description = $item['description']; // Get the link eg amazon.com
$urllink = $item['guid'];
if($colorCount%2==0) {
$color = ROW1_COLOR;
} else {
$color = ROW2_COLOR;
}
include "resulttemplate.php"; $colorCount++;
echo "\n";
}
?>
Sometimes it gives results and sometimes it doesn't. I usually get this error:
Warning: Invalid argument supplied for foreach() in /home4/thesisth/public_html/pdfsearchmachine/classes/rss.php on line 14
Can anyone help?
The error Warning: Invalid argument supplied for foreach() in /home4/thesisth/public_html/pdfsearchmachine/classes/rss.php on line 14 means the foreach construct did not receive an iterable (usually an array). In your case that would mean $rs['items'] is not set or is not an array. Maybe the request failed or the search returned no results?
I would recommend adding some checks to the result of $rss->get("...") first, and also having an action for when the request fails or returns no results:
<?php
$search = isset($_GET["search"]) ? $_GET["search"] : "default search term";
$start = "something here"; // This was left out of your original code
$colorCount = "0";
$replace = " ";
$with = "+";
$search = str_replace($replace, $with, $search);
$rs = $rss->get("http://api.search.yahoo.com/WebSearchService/rss/webSearch.xml?appid=yahoosearchwebrss&query=originurlextension%3Apdf+$search&adult_ok=1&start=$start");
if (isset($rs) && isset($rs['items'])) {
foreach ($rs['items'] as $item) {
$title = $item['title']; // Get the title of the result
$description = $item['description']; // Get the description of the result
$urllink = $item['guid']; // Get the link eg amazon.com
$color = ($colorCount % 2) ? ROW2_COLOR : ROW1_COLOR;
include "resulttemplate.php";
echo "\n";
$colorCount++;
}
}
else {
echo "Could not find any results for your search '$search'";
}
Other changes:
$start was not declared before your $rss->get("...") call
compounded the $color if/else clause into a ternary operation with fewer comparisons
I wasn't sure what the purpose of the if ($rs = $rss->get("...")) { } was, so I removed it.
I would also recommend using require instead of include, as it will cause a fatal error if resulttemplate.php doesn't exist, which in my opinion is a better way to detect bugs than PHP warnings that let execution continue. However, I don't know your whole situation, so it might not be of great use.
Hope that helps!
Cheers
When I remove the @ sign from my $d, $x DOMDocument variables below, I'm getting the error...
Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: Empty string supplied as input in C:\xampplite\htdocs\mysite\wp-content\plugins\myplugin\index.php on line 50
This is raised for the $content variable when I run the function below, even though I can echo $content and get a string. What am I missing?
add_filter('wp_insert_post_data', 'decorate_keyword');
function decorate_keyword($postarray) {
global $post;
$keyword = getKeyword($post);
/*
Even though I can echo $content, I'm getting the error referenced above.
I have to explicitly set it to a string to overcome the error.
*/
$content = $postarray['post_content'];
//$content = "this is a test phrase";
$d = new DOMDocument();
$d->loadHTML($content);
$x = new DOMXpath($d);
$nodes = $x->query("//text()[contains(.,'$keyword') and not(ancestor::h1) and not(ancestor::h2) and not(ancestor::h3) and not(ancestor::h4) and not(ancestor::h5) and not(ancestor::h6)]");
if ($nodes && $nodes->length) {
$node = $nodes->item(0);
// Split just before the keyword
$keynode = $node->splitText(strpos($node->textContent, $keyword));
// Split after the keyword
$node->nextSibling->splitText(strlen($keyword));
// Replace keyword with <b>keyword</b>
$replacement = $d->createElement('b', $keynode->textContent);
$keynode->parentNode->replaceChild($replacement, $keynode);
}
$postarray['post_content'] = $d;
return $postarray;
}
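loadHTML() expects a string of HTML markup, not a URL or an array (to load from a URL or a file path, use loadHTMLFile() instead). The warning means the string it received was empty on at least one call, so check that $postarray['post_content'] is not empty before passing it to loadHTML().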
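A minimal sketch of such a guard, assuming a hypothetical helper load_html_or_null() and a made-up test string. It skips parsing when the content is empty and otherwise returns the parsed document:

```php
<?php
// Hypothetical helper (not part of WordPress or the original plugin):
// parses the markup, or returns null when there is nothing to parse.
function load_html_or_null(string $content): ?DOMDocument
{
    if (trim($content) === '') {
        return null; // nothing to parse, so loadHTML() is never called
    }
    $d = new DOMDocument();
    libxml_use_internal_errors(true); // collect parse warnings quietly
    $d->loadHTML($content);
    libxml_clear_errors();
    return $d;
}

$empty = load_html_or_null('');
$doc = load_html_or_null('<p>this is a test phrase</p>');

var_dump($empty === null);
echo $doc->saveHTML();
```

Note that saveHTML() returns the markup as a string, which is what a field such as post_content would need, rather than the DOMDocument object itself.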