I am building a web crawler that collects the links, titles and meta descriptions from the links found on a single submitted URL.
I think this if statement is correct. $description is the variable which holds the descriptions for the links in the array $link. But I noticed that not all sites have a meta description (Wikipedia, for example), so I would like the first twenty characters of the page text to act as the description when the meta description is empty. (By the way, the functions and the calls to them all work; I just wanted you to see the code.)
if ($description == '') {
$html = file_get_contents($link);
preg_match('%(<p[^>]*>.*?</p>)%i', $html, $re);
$res = get_custom_excerpt($re[1]);
echo "\n";
echo $res;
echo "\n";
}
However, in the array, the links are stored in [link], the title of each link in [title] and the description in [description]. I don't know how to add $res to my array, or how to use it only when the if statement is true.
$output = Array();
foreach ($links as $thisLink) {
$output[] = array("link" => $thisLink, "title" => Titles($thisLink), "description" => getMetas($thisLink), getMetas($res));
}
print_r($output);
You can use array_push() to add $res back to your array and then evaluate the array however you need to; not 100% sure what you're trying to do...
From your wording I think you want to do this:
$outputs = array();
foreach ($links as $thisLink) {
$output = array("link" => $thisLink, "title" => Titles($thisLink), "description" => getMetas($thisLink));
if ($output['description'] == null) {
$output['description'] = getMetas($res);
}
$outputs[] = $output;
}
You might want to adjust the if statement, because I do not know what getMetas() returns when there is no description.
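A minimal sketch of the fallback itself, assuming the page HTML is already available as a string. describePage() is a hypothetical helper, not one of the asker's functions, and the meta regex is a simplification that expects the name attribute before content, both in double quotes; a DOM parser would be more robust.

```php
<?php
// Hypothetical helper: return the meta description if present,
// otherwise the first $length characters of the first <p> element.
function describePage(string $html, int $length = 20): string
{
    // Simplified pattern: name="description" followed by content="..."
    if (preg_match('%<meta\s+name="description"\s+content="([^"]*)"%i', $html, $m)
        && trim($m[1]) !== '') {
        return trim($m[1]);
    }

    // Fall back to the text of the first paragraph, with tags removed
    if (preg_match('%<p[^>]*>(.*?)</p>%is', $html, $m)) {
        return substr(trim(strip_tags($m[1])), 0, $length);
    }

    return '';
}

echo describePage('<p>Hello <b>world</b>, this is a long paragraph.</p>'), "\n";
```

Inside the foreach loop, the result of such a helper could be assigned to $output['description'] whenever getMetas() comes back empty.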
I have created an XML document from an array using PHP. The result is listed below.
<Mst><Mstrow><sCode>10</sCode>Test<sName></sName></Mstrow></Mst>
But I want to show this XML with whitespace between each element, like this:
<Mst> <Mstrow> <sCode>10</sCode> <sName>Test</sName> </Mstrow> </Mst>
Below is my code:
$results = array(array('sCode' => 10, 'sName' => 'Test'));
$dom = new DOMDocument();
$main = $dom->appendChild($dom->createElement('Mst'));
if ($results != array()) {
    foreach ($results as $datas) {
        // Assign the new row element so child nodes can be appended to it
        $row = $main->appendChild($dom->createElement('Mstrow'));
        foreach ($datas as $name => $value) {
            $row
                ->appendChild($dom->createElement($name))
                ->appendChild($dom->createTextNode($value));
        }
    }
}
Please provide a solution
I may have misunderstood the question, and I assume that in your example Test should sit between the <sName> tags. I would have thought a simple string replace before you echo or save your XML string would do the trick.
e.g.
echo str_replace( '><' , '> <', $myXml->asXML());
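Since the question builds the document with DOMDocument rather than SimpleXML, another option is the formatOutput property, which makes saveXML() emit line breaks and indentation instead of single spaces. A minimal sketch, assuming the same data as in the question and that newline-separated output is acceptable:

```php
<?php
$results = array(array('sCode' => 10, 'sName' => 'Test'));

$dom = new DOMDocument('1.0', 'UTF-8');
$dom->formatOutput = true; // set before calling saveXML()

$main = $dom->appendChild($dom->createElement('Mst'));
foreach ($results as $datas) {
    $row = $main->appendChild($dom->createElement('Mstrow'));
    foreach ($datas as $name => $value) {
        $row->appendChild($dom->createElement($name))
            ->appendChild($dom->createTextNode((string) $value));
    }
}

$xml = $dom->saveXML();
echo $xml;
```

With a DOMDocument, the str_replace() approach would be applied to $dom->saveXML() rather than asXML().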
I am learning PHP. I have to extract all the script tags of a particular $url whose src contains a particular word, say 'abc'.
I have tried this:
$domd = new DOMDocument();
@$domd->loadHTML(@file_get_contents($url));
$data = array();
$items = $domd->getElementsByTagName('script');
foreach($items as $item) {
if($item->hasAttribute('src')){
$data[] = array(
'src' => $item->getAttribute('src')
);
}
}
print_r($data);
echo "\n";
The above code gives me a list of the src values of all the script tags present at $url.
But how should I check whether the src of a script tag contains the word 'abc'?
If you need to check each value in the array, to see if it contains 'abc', try a foreach() loop with an if statement using strpos() to see if the value contains 'abc', and then do something with it.
Something like this should do the trick:
foreach( $data as $key=>$value ) {
if( strpos( $value['src'],'abc' ) !== false ) {
//do something with it here
}
}
Just edited this. Call the ['src'] element of the subArray and use that in strpos(). Alternately, you could change your line that builds the $data array to this, since you only have one element in each subArray:
if($item->hasAttribute('src')){
$data[] = $item->getAttribute('src');
}
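Putting the pieces together, here is a minimal sketch that filters while walking the DOM, so no second loop is needed. scriptSrcsContaining() is a hypothetical name, and the HTML is passed in as a string so the example runs without a network request; libxml_use_internal_errors() is used in place of the @ operator to silence parser warnings.

```php
<?php
// Hypothetical helper: collect the src of every <script> whose src contains $needle.
function scriptSrcsContaining(string $html, string $needle): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // keep warnings about sloppy markup quiet
    $dom->loadHTML($html);
    libxml_clear_errors();

    $matches = array();
    foreach ($dom->getElementsByTagName('script') as $script) {
        $src = $script->getAttribute('src'); // empty string when the attribute is absent
        if ($src !== '' && strpos($src, $needle) !== false) {
            $matches[] = $src;
        }
    }
    return $matches;
}

$html = '<html><head><script src="/js/abc.min.js"></script>'
      . '<script src="/js/other.js"></script><script>var x = 1;</script>'
      . '</head><body></body></html>';

print_r(scriptSrcsContaining($html, 'abc'));
```

Note the strict !== false comparison: strpos() returns 0 when the match is at the very start of the string, which a loose comparison would treat as "not found".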
I am building a web crawler. It finds all the links on a page along with their titles, meta descriptions and so on, and that part works fine. I then wrote an array holding the URL prefixes of the links I want. If the crawler finds a link whose URL begins with any of the values in that array, the link should be inserted into $news_stories.
The only problem is that nothing seems to be inserted. The page comes back blank, and PHP now says that the array_intersect statement expects an array and that I haven't specified one, although I have.
In summary, I am struggling to understand where my code goes wrong and why the wanted URLs aren't being inserted.
$bbc_values = array(
'http://www.bbc.co.uk/news/health-',
'http://www.bbc.co.uk/news/politics-',
'http://www.bbc.co.uk/news/uk-',
'http://www.bbc.co.uk/news/technology-',
'http://www.bbc.co.uk/news/england-',
'http://www.bbc.co.uk/news/northern_ireland-',
'http://www.bbc.co.uk/news/scotland-',
'http://www.bbc.co.uk/news/wales-',
'http://www.bbc.co.uk/news/business-',
'http://www.bbc.co.uk/news/education-',
'http://www.bbc.co.uk/news/science_and_enviroment-',
'http://www.bbc.co.uk/news/entertainment_and_arts-',
'http://edition.cnn.com/'
);
// BBC Algorithm
foreach ($links as $link) {
$output = array(
"title" => Titles($link), //dont know what Titles is, variable or string?
"description" => getMetas($link),
"keywords" => getKeywords($link),
"link" => $link
);
if (empty($output["description"])) {
$output["description"] = getWord($link);
}
}
$new_stories = array();
foreach ($output as $new_array) {
if (array_intersect($output['link'], $bbc_values) == true) {
$news_stories[] = $new_array;
}
print_r($news_stories);
}
You declared the array as $new_stories but you are printing $news_stories; the difference is the 's'.
Check whether the code ever gets inside this if statement; I think it does not:
if (array_intersect($output['link'], $bbc_values) == true) {
echo 'here';
}
Hmm, I don't think array_intersect is what you need for a comparison: http://php.net/manual/en/function.array-intersect.php
Maybe you want to look at in_array instead: http://php.net/manual/en/function.in-array.php
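Keep in mind that in_array() tests for exact equality, so it would only match a link identical to one of the prefixes. If the goal is "URL begins with any of these values", a prefix comparison is closer to what the question describes. A minimal sketch, where startsWithAny() is a hypothetical helper and the sample links are made up for illustration:

```php
<?php
// Hypothetical helper: true if $url begins with any of the given prefixes.
function startsWithAny(string $url, array $prefixes): bool
{
    foreach ($prefixes as $prefix) {
        if (strncmp($url, $prefix, strlen($prefix)) === 0) {
            return true;
        }
    }
    return false;
}

$bbc_values = array(
    'http://www.bbc.co.uk/news/health-',
    'http://edition.cnn.com/',
);

// Made-up sample links standing in for the crawler's results
$links = array(
    'http://www.bbc.co.uk/news/health-12345',
    'http://www.bbc.co.uk/sport/football',
    'http://edition.cnn.com/2013/world',
);

$news_stories = array();
foreach ($links as $link) {
    if (startsWithAny($link, $bbc_values)) {
        // Build and store the entry inside the loop so that no link is lost
        $news_stories[] = array('link' => $link);
    }
}

print_r($news_stories);
```

In the question's code $output is overwritten on every iteration, so only the last link survives the first loop; collecting the matches inside the same loop, as above, avoids that.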
When the return parameter is used, this function uses internal output buffering so it cannot be used inside an ob_start() callback function.
In a MySQL database I store some image names without the full path.
What I would like to do is add the path before I pass the returned array to jQuery (as JSON) for use in the JQ Galleria plugin.
In the columns I've got names like this:
11101x1xTN.png 11101x2xTN.png 11101x3xTN.png
Which in the end should be like:
./test/img/cars/bmw/11101/11101x1xTN.png
./test/img/cars/bmw/11101/11101x2xTN.png
I could just store the whole path in the database, but that seems 1. a waste of db space, and 2. I would need to update the whole db if the image path changes.
I could edit the jQuery plugin, but it doesn't seem practical to modify its source code.
What is the right thing to do, and the fastest for processing?
Can you add a string after you make a db query and before you fetch the results?
part of the function where I make the query:
$needs = "thumb, image, big, title, description";
$result = $get_queries->getImagesById($id, $needs);
$sth=$this->_dbh->prepare("SELECT $needs FROM images WHERE id = :stockId");
$sth->bindParam(":stockId", $id);
$sth->execute();
$result = $sth->fetchAll(PDO::FETCH_ASSOC);
this is the foreach loop:
$addurl = array('thumb', 'image', 'big');
foreach ($result as $array) {
foreach ($array as $item => $val) {
if (in_array($item, $addurl)){
$val = '/test/img/cars/bmw/11101/'.$val;
}
}
}
the array looks like this:
Array
(
[0] => Array
(
[thumb] => 11101x1xTN.png
[image] => 11101x1xI.png
[big] => 11101x1xB.png
[title] => Title
[description] => This a blub.
)
)
The URL should be added to thumb, image and big.
I tried to change the array values using a foreach loop but that didn't work. I am also not sure whether doing this would cause an unnecessary slowdown.
Well, you almost nailed it. The only thing you forgot is to store your $val back in the array.
foreach ($result as $i => $array) {
foreach ($array as $item => $val) {
if (in_array($item, $addurl)){
$val = '/test/img/cars/bmw/11101/'.$val;
$result[$i][$item] = $val;
}
}
}
However, I'd make it a little shorter:
foreach ($result as $i => $array) {
    foreach ($addurl as $item) {
        $result[$i][$item] = '/test/img/cars/bmw/11101/'.$array[$item];
    }
}
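Another way to write the same thing is to iterate by reference, which avoids indexing back into $result. This is only a sketch: $basePath is an assumed variable holding the prefix from the question, and the isset() check is an added safeguard for rows where a column was not selected.

```php
<?php
$basePath = './test/img/cars/bmw/11101/'; // assumed: prefix taken from the question
$addurl = array('thumb', 'image', 'big');

$result = array(
    array(
        'thumb' => '11101x1xTN.png',
        'image' => '11101x1xI.png',
        'big' => '11101x1xB.png',
        'title' => 'Title',
        'description' => 'This a blub.',
    ),
);

foreach ($result as &$row) {
    foreach ($addurl as $column) {
        if (isset($row[$column])) {
            $row[$column] = $basePath . $row[$column];
        }
    }
}
unset($row); // break the reference so later code cannot modify the last row by accident

print_r($result);
```

For a handful of rows per gallery, the cost of a loop like this is negligible next to the database query itself.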
Assuming your array looks like this:
$result = array("11101x1xTN.png", "11101x2xTN.png", "11101x3xTN.png");
A simple array_map() can be used.
$result_parsed = array_map(function($str) { return './test/img/cars/bmw/11101/'.$str; }, $result);
I'm trying to make a universal script that adds keywords to my individual pages (since the header is in an include file), so I am getting the end of the URL (multi.php) and retrieving the description etc. from its array. For some reason, instead of returning the keywords or description, it just returns "m". It seems kind of random and has me scratching my head. Here's what I've got:
<html>
<head>
<title>Multi-Demensional Array</title>
<?php
$path = pathinfo($_SERVER['PHP_SELF']);
$allyourbase = $path['basename'];
$pages = array
(
"multi.php" => array
(
"keywords" => "index, home, test, etc",
"desc" => "This is the INDEX page",
"style" => "index.css"
),
"header.php" => array
(
"keywords" => "showcase, movies, vidya, etc",
"desc" => "SHOWCASE page is where we view vidya.",
"style" => "showcase.css"
)
);
?>
</head>
<body>
<?php
foreach($pages as $key => $value)
{
if($key == $allyourbase)
{
echo $key['desc'];
}
}
?>
</body>
</html>
The reason why this is happening is because in PHP if I had the following code:
$hello = 'world';
and I attempted to do the following:
echo $hello[0];
PHP would let me index into the string as if it were an array and return whatever is at position 0, which is w. When you use a foreach, you are asking PHP to put each key of the array in $key and its value in $value.
You then echo $key['desc']. Because $key is a string, PHP expects an integer offset, so the non-numeric 'desc' is treated as 0 and you get the first character (newer PHP versions also complain about, or reject, the illegal string offset). If you change echo $key['desc'] to echo $value['desc'], which indexes the associative array, it will return the desired result.
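A small sketch of the difference between the two variables inside the loop, using one entry from the question's array. It only uses the integer offset 0 on the string, because a non-numeric offset such as 'desc' is handled differently across PHP versions.

```php
<?php
$pages = array(
    'multi.php' => array('desc' => 'This is the INDEX page'),
);

foreach ($pages as $key => $value) {
    // $key is the string 'multi.php'; offset 0 is its first character
    $firstChar = $key[0];

    // $value is the inner associative array, so 'desc' is a real key here
    $desc = $value['desc'];

    echo $firstChar, "\n";
    echo $desc, "\n";
}
```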
You should just be able to do this:
if(isset($pages[$allyourbase]))
{
echo $pages[$allyourbase]['desc'];
}
No need for the loop
Replace
echo $key['desc'];
with
echo $value['desc'];
Other people have provided some great solutions, but it's important that you understand exactly what is happening here, so you don't make the same mistake again. Pay careful attention to the comments, and you will be on your way to successful coding!
Here's what is happening:
foreach ($pages as $key => $value) {
if ($key == $allyourbase) {
// At this point: $key = 'multi.php'
// Also: $value = array( ... );
// Keep in mind: $key['desc'] = $key[0] = 'm';
// You are grabbing the first letter of the 'multi.php' string.
// When dealing with strings, PHP sees $key['desc'] as $key[0],
// which is another way to grab the very first character of 'multi.php'
echo $key['desc'];
// You really want $pages[$key]['desc'], but below
// is a better way to do it, without the overhead of
// the loop.
}
}
If you kept the loop, which is really unnecessary, it would look like this:
foreach ($pages as $key => $value) {
if ($key == $allyourbase) {
echo $value['desc'];
}
}
The best solution is to replace the loop with the following code:
if (isset($pages[$allyourbase])) {
echo $pages[$allyourbase]['desc'];
} else {
// error handling
}
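The same lookup can be wrapped in a small function with a fallback value, which covers the "error handling" branch in one place. This is a sketch only: pageMeta() is a hypothetical helper, and it relies on the null coalescing operator, so it assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: look up one field for a page, or return $default.
function pageMeta(array $pages, string $page, string $field, string $default = ''): string
{
    // ?? yields $default when either the page or the field is missing
    return $pages[$page][$field] ?? $default;
}

$pages = array(
    'multi.php' => array(
        'keywords' => 'index, home, test, etc',
        'desc' => 'This is the INDEX page',
        'style' => 'index.css',
    ),
);

echo pageMeta($pages, 'multi.php', 'desc'), "\n";
echo pageMeta($pages, 'missing.php', 'desc', 'No description available'), "\n";
```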
If I'm reading this right, echo $key['desc']; should be echo $value['desc'];.