url_to_absolute query - php

Right, I am building a web crawler, and one section of my code translates relative URLs into absolute ones, for example /macbookpro/ into http://www.apple.com/macbookpro. But when I echo the result, it prints only one link, which is the first link it sees. Why? Do I have to create an array? When I tried that and echoed the array, all it printed was the word 'Array'.
<?php
require_once('simplehtmldom_1_5/simple_html_dom.php');
require_once('url_to_absolute/url_to_absolute.php');
$URL = 'http://www.theqlick.com'; // change it for urls to grab
// grabs the urls from URL
$file = file_get_html($URL);
foreach ($file->find('a') as $theelement) {
$links = url_to_absolute($URL, $theelement->href);
}
echo $links;
?>

Use var_dump on your array: it gives you a text representation of your variables and will show you the array and its elements. echo is meant for outputting strings. You could loop over your array and echo each element, but if you just want to inspect it, var_dump is the answer.
http://www.php.net/manual/en/function.var-dump.php

If you are trying to build an array in $links, you need to do
$links[] = url_to_absolute($URL, $theelement->href);
Right now, you are overwriting the value of $links on each loop iteration, so only the last link survives.
You should also declare $links = array(); somewhere before your foreach loop.

<?php
require_once('simplehtmldom_1_5/simple_html_dom.php');
require_once('url_to_absolute/url_to_absolute.php');
$links = array();
$URL = 'http://www.theqlick.com'; // change it for urls to grab
// grabs the urls from URL
$file = file_get_html($URL);
foreach ($file->find('a') as $theelement) {
$links[] = url_to_absolute($URL, $theelement->href);
}
print_r($links);
So you'll need to init the array, add to it with [] and finally use something suitable to actually print it out, such as print_r.
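As a minimal, self-contained sketch of the same pattern: the helper toAbsoluteSketch() below is a hypothetical stand-in for url_to_absolute() that only handles root-relative paths, and the base URL and hrefs are made-up sample data. It shows appending with [] and then echoing each element rather than the array itself.

```php
<?php
// Hypothetical stand-in for url_to_absolute(); it only handles
// root-relative paths such as "/about" and is NOT a full URL resolver.
function toAbsoluteSketch($base, $href) {
    return rtrim($base, '/') . '/' . ltrim($href, '/');
}

$base  = 'http://www.example.com';           // sample base URL (assumption)
$hrefs = array('/about', '/contact', '/blog/'); // sample hrefs (assumption)

$links = array();                             // initialise before the loop
foreach ($hrefs as $href) {
    $links[] = toAbsoluteSketch($base, $href); // append, do not overwrite
}

// echo works on strings, so print the elements one by one.
foreach ($links as $link) {
    echo $link, "\n";
}
```

If a single string is wanted instead, implode("\n", $links) would join the elements in one step.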

Related

php parse content for input->name value

I'm trying to parse HTML content for the value stored in the name attribute of its input elements.
My current try looks like the following:
function getAuthToken(){
$html = file_get_contents('url');
$values = array();
foreach($html->find('input') as $element)
{
$values[$element->name] = $element->value;
}
print_r($values);
}
What am I doing wrong?
$html in your code is not an object; it's a string (the result of file_get_contents).
$object->name accesses a property of an object.
$array[key] accesses an array element.
Neither of these will do what you want with the result of file_get_contents, because your $html variable is a string. You need to parse the string to get the desired results.
You can check that with:
echo gettype($html);
You can also just echo $html to see what the string looks like and get a better idea of what you are working with. For instance, is there a common character you can explode the string on, so that you get an array to work with in your foreach?
example:
$newarray = explode('&', $html);
foreach ($newarray as $key => $value) {
//do your thing
}
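If the string really is HTML, one alternative to manual string splitting is PHP's built-in DOMDocument. This is only a sketch, not the asker's code: the inline markup and the field names authenticity_token and user are made-up sample data standing in for whatever file_get_contents would return.

```php
<?php
// Inline sample markup (assumption); a real page would come from
// file_get_contents($url).
$html = '<form>'
      . '<input type="hidden" name="authenticity_token" value="abc123">'
      . '<input type="text" name="user" value="alice">'
      . '</form>';

$doc = new DOMDocument();
// Real-world HTML is often not well formed; @ suppresses the parser warnings.
@$doc->loadHTML($html);

$values = array();
foreach ($doc->getElementsByTagName('input') as $input) {
    // Map each input's name attribute to its value attribute.
    $values[$input->getAttribute('name')] = $input->getAttribute('value');
}

print_r($values);
```

The $html->find('input') call in the question belongs to the simple_html_dom library, so it would only work on an object created by that library, not on a plain string.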

PHP Echo XML Attributes Without Repeating

My question has to do with PHP and XML. I would like to echo some attribute values, but echo each repeated value only once.
Say this was the XML I was dealing with, and it was called beatles.xml:
<XML_DATA item="TheBeatles">
<Beatles>
<Beatle Firstname="George" Lastname="Harrison" Instrument="Guitar">Harrison, George</Beatle>
<Beatle Firstname="John" Lastname="Lennon" Instrument="Guitar">Lennon, John</Beatle>
<Beatle Firstname="Paul" Lastname="McCartney" Instrument="Bass">McCartney, Paul</Beatle>
<Beatle Firstname="Ringo" Lastname="Starr" Instrument="Drums">Starr, Ringo</Beatle>
</Beatles>
</XML_DATA>
This is the PHP I have so far:
$xml = simplexml_load_file("http://www.example.com/beatles.xml");
$beatles = $xml->Beatles->Beatle;
foreach($beatles as $beatle) {
echo $beatle->attributes()->Instrument.',';
}
I would expect this to echo out Guitar,Guitar,Bass,Drums, but I would like Guitar to only display once. How would I prevent repeat attribute values from echoing out?
Inside the foreach loop, cast the instrument name as a string and push it into an array. Once the loop finishes execution, you will have an array containing all the instrument names (with duplicates, of course). You can now use array_unique() to filter out the duplicate values from the array:
$instruments = array();
foreach($beatles as $beatle) {
$instruments[] = (string) $beatle->attributes()->Instrument;
}
$instruments = array_unique($instruments);
Demo.
$xml = simplexml_load_file("http://www.example.com/beatles.xml");
$beatles = $xml->Beatles->Beatle;
$result = array();
foreach($beatles as $beatle) {
// Cast to string so that plain values, not SimpleXMLElement objects, are compared and stored.
$instrument = (string) $beatle->attributes()->Instrument;
// in_array checks the stored values; array_key_exists would only check the numeric keys.
if (!in_array($instrument, $result, true)) {
$result[] = $instrument;
}
}
Then loop through the $result array with foreach.
Alternatively, use XPath.
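The XPath route is not spelled out in the thread, so the following is only a sketch of how it might look. It parses an inline copy of the sample XML with simplexml_load_string instead of loading it from a URL, selects the Instrument attributes with an XPath expression, and removes duplicates with array_unique.

```php
<?php
// Inline copy of the sample XML so the sketch runs without network access.
$source = '<XML_DATA item="TheBeatles"><Beatles>'
        . '<Beatle Firstname="George" Lastname="Harrison" Instrument="Guitar">Harrison, George</Beatle>'
        . '<Beatle Firstname="John" Lastname="Lennon" Instrument="Guitar">Lennon, John</Beatle>'
        . '<Beatle Firstname="Paul" Lastname="McCartney" Instrument="Bass">McCartney, Paul</Beatle>'
        . '<Beatle Firstname="Ringo" Lastname="Starr" Instrument="Drums">Starr, Ringo</Beatle>'
        . '</Beatles></XML_DATA>';

$xml = simplexml_load_string($source);

// Select every Instrument attribute of every Beatle element.
$nodes = $xml->xpath('//Beatle/@Instrument');

// Convert the attribute nodes to plain strings, then drop duplicates.
$instruments = array_unique(array_map('strval', $nodes));

$joined = implode(',', $instruments);
echo $joined, "\n"; // prints Guitar,Bass,Drums
```

array_unique keeps the first occurrence of each value, so the original order of the instruments is preserved.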

Extract only if the script tag's src contains a particular word in PHP

I am learning PHP. I have to extract all the script tags of a particular $url whose src contains a particular word, say 'abc'.
I have tried this:
$domd = new DOMDocument();
@$domd->loadHTML(@file_get_contents($url));
$data = array();
$items = $domd->getElementsByTagName('script');
foreach($items as $item) {
if($item->hasAttribute('src')){
$data[] = array(
'src' => $item->getAttribute('src')
);
}
}
print_r($data);
echo "\n";
The above code gives me a list of the src values of all the script tags present at the $url.
But how should I check whether the src of a script tag contains the word 'abc'?
If you need to check each value in the array, to see if it contains 'abc', try a foreach() loop with an if statement using strpos() to see if the value contains 'abc', and then do something with it.
Something like this should do the trick:
foreach( $data as $key=>$value ) {
if( strpos( $value['src'],'abc' ) !== false ) {
//do something with it here
}
}
Note that each entry in $data is itself an array, so pass its ['src'] element to strpos(). Alternatively, you could change the line that builds the $data array to this, since you only have one element in each sub-array:
if($item->hasAttribute('src')){
$data[] = $item->getAttribute('src');
}
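Putting the two steps together, the filter can also be applied while walking the script tags, so no second loop is needed. This is a sketch under assumptions: the inline markup and the variable name $needle are made up for illustration, and a real page would be loaded from $url instead.

```php
<?php
// Inline sample page (assumption); a real one would come from
// file_get_contents($url).
$html = '<html><head>'
      . '<script src="https://cdn.example.com/abc/lib.js"></script>'
      . '<script src="/js/other.js"></script>'
      . '<script>var inline = 1;</script>'
      . '</head><body></body></html>';

$needle = 'abc'; // the word the src must contain

$doc = new DOMDocument();
@$doc->loadHTML($html); // @ hides warnings from non-well-formed markup

$matches = array();
foreach ($doc->getElementsByTagName('script') as $script) {
    // getAttribute returns an empty string when the attribute is missing.
    $src = $script->getAttribute('src');
    // strpos can return 0 for a match at the start, hence the strict !== false.
    if ($src !== '' && strpos($src, $needle) !== false) {
        $matches[] = $src;
    }
}

print_r($matches);
```

strpos is case-sensitive; stripos could be swapped in if 'ABC' should match as well.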

Printing contents of a nested array

I am trying to print the contents of a nested array, but it simply returns "Array" as a string yet will not iterate with a foreach loop. The arrays are coming from a mongodb find(). The dataset looks like this:
User
Post_Title
Post_Content[content1,content2,content3]
I am trying to get at the content1,2,3.
My current code looks like this:
$results = $collection->find($query);
foreach ($results as $doc)
{
echo $doc['title']; //this works
$content[] = $doc['content'];
print_r($content); //this prints "Array ( [0] => Array )"
foreach ($content as $item)
{
echo $item;
}
}
All this code does is print the Title, followed by Array ( [0] => Array ), followed by Array.
I feel quite stupid for not being able to figure out something that seems so basic. Most posts on Stack Overflow refer to multidimensional associative arrays. In this case, the top-level array is associative, but the content array is indexed.
I have also tried
foreach ($doc['content'] as $item)
But that gives
Warning: Invalid argument supplied for foreach()
I even tried iterating over the returned array again using a nested foreach, like the following:
foreach ($results as $doc)
{
echo $doc['title']; //this works
$content[] = $doc['content'];
print_r($content); //this prints "Array ( [0] => Array )"
foreach ($content as $item)
{
foreach ($item as $next_item)
{
echo $next_item;
}
echo $item;
}
}
The second foreach failed with an invalid argument warning.
Any thoughts would be greatly appreciated.
edit: perhaps it has something to do with how I am inserting the data to the DB. That code looks like this
$title = $_POST['list_title'];
$content[] = $_POST['list_content1'];
$content[] = $_POST['list_content2'];
$content[] = $_POST['list_content3'];
$object = new Creation();
$object->owner = "$username";
$object->title = "$title";
$object->content = "$content";
$object->insert();
Is this not the proper way to add an array as a property to a class?
The issue is that some of the content values are arrays while others are strings. You should test whether $doc['content'] is an array using is_array(). If it is, loop through it like you would any other array; if it isn't, you can echo it (you may need to test the content type before doing this if you're uncertain what it can be).
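The problem is in the code that saves the data. Wrapping $content in double quotes converts the array to the string "Array", so that string is what gets stored. You should replace the following line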
$object->content = "$content";
with
$object->content = $content;
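Both answers can be illustrated together. In this sketch the $docs array is a hypothetical stand-in for the documents returned by find(), with one entry stored correctly as an array and one stored as the string "Array"; the helper describeContent() is also made up for illustration.

```php
<?php
// Hypothetical stand-ins for two MongoDB documents: one saved with
// $object->content = $content; and one saved with "$content".
$docs = array(
    array('title' => 'stored as array',
          'content' => array('content1', 'content2', 'content3')),
    array('title' => 'stored as string',
          'content' => 'Array'),
);

// Hypothetical helper: join arrays, pass everything else through as a string.
function describeContent($value) {
    return is_array($value) ? implode(', ', $value) : (string) $value;
}

$lines = array();
foreach ($docs as $doc) {
    $lines[] = $doc['title'] . ': ' . describeContent($doc['content']);
}

echo implode("\n", $lines), "\n";
```

The second document shows why the nested foreach in the question failed: once the array has been saved as a string, there is nothing left to iterate over.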

foreach results as feed_url

I'm trying to put the result of my foreach loop into a URL and then call simplexml_load_file on it.
So it goes like this:
(...) //SimpleXML_load_file to get $feed1
$x=1;
$count=$feed1->Count; //get a count for total number of loop from XML
foreach ($feed1->IdList->Id as $item1){
echo $item1;
if($count > $x) {
echo ',';} // Because I need a comma after every Id, except the last one.
$x++;
}
The two echo statements are just there to show the result. It gives me something like:
22927669,22039496,21326191,18396266,18295747,17360921,15705350,15681025,15254092,12939407,11943825,11495650,10964843
I would like to put that into a URL and pass it to simplexml_load_file, like this:
$feed_url = 'http://www.whatevergoeshere'. $RESULT_OF_FOREACH . 'someothertext';
So it would look like:
$feed_url = 'http://www.whatevergoeshere22927669,22039496,21326191,18396266,18295747,17360921,15705350,15681025,15254092,12939407,11943825,11495650,10964843someothertext';
I've tried to store it in an array or a function and then use it in $feed_url, but it did not work the way I tried it.
I hope it's clear. I'll answer questions quickly if not.
Thanks.
It's really difficult to make out what you want, so I'm going to guess you want to store the list as a comma-delimited string in a variable. The easiest way is to implode the array of IDs:
$ids = array();
foreach ($feed1->IdList->Id as $item1){
$ids[] = (string) $item1;
}
$RESULT_OF_FOREACH = implode(',', $ids);
$feed_url = 'http://www.whatevergoeshere'. $RESULT_OF_FOREACH . 'someothertext';
