The code below scrapes two values from a webpage and adds them to an array. I've got as far as being able to print the first row of that array but I'm unable to get the whole thing.
I presume some sort of loop will be required but my attempts so far have been unsuccessful.
I feel this should be fairly basic. Any idea what I can do to achieve the desired result?
if(!empty($html)) {
$doc->loadHTML($html);
libxml_clear_errors(); // remove errors for yucky html
$xpath = new DOMXPath($doc);
/* FIND LINK TO PRODUCT PAGE */
$products = array();
$row = $xpath->query("$product_location");
if ($row->length > 0) {
foreach ($row as $location) {
$products['product_url'] = $product_url_root.$location->getAttribute('href');
$products['shop_name'] = $shop_name;
$row = $xpath->query($photo_location);
/* FIND LINK TO IMAGE */
if ($row->length > 0) {
foreach ($row as $location) {
$products['photo_url'] = $photo_url_root.$location->getAttribute('src');
}
}
}
print_r($products);
}
}
EDIT
I should say that I'm hoping to get the array in this format:
Array (
[0] {product_url => 123, shop_name => name, photo_url => abc},
[1] {product_url => 456, shop_name => name, photo_url => def},
[2] {product_url => 789, shop_name => name, photo_url => ghi},
)
The plan is eventually to be able to use the following code in the place of print_r($products) to create an XML file:
$item = $channel->addChild("item");
$item->addChild("product_url", $entry['product_url']);
$item->addChild("shop_name", $entry['shop_name']);
$item->addChild("photo_url", $entry['photo_url']);
To build each entry of the associative array, you need three details:
the product URL
the shop name
the product image URL
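In your code, you loop through the product URLs and, for each one, loop through the whole list of product image URLs. The body of the nested foreach therefore runs once for every product/image combination (n × m times, or n^2 when the two counts match), and every pass overwrites the same keys in $products, so only the last values survive. You do not want that.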
Here's how you should structure your loops:
/* Create an array containing the product links */
$row = $xpath->query($product_location);
$product_urls = array();
if ($row->length > 0)
{
foreach ($row as $location)
{
$product_urls[] = $product_url_root . $location->getAttribute('href');
}
}
$imgs = $xpath->query($photo_location);
$photo_url = array();
/* Create an array containing the image links */
if ($imgs->length > 0)
{
foreach ($imgs as $img)
{
$photo_url[] = $photo_url_root . $img->getAttribute('src');
}
}
$result = array();
/* Create an associative array containing all the above values */
foreach ($product_urls as $i => $product_url)
{
$result[] = array(
'product_url' => $product_url,
'shop_name' => $shop_name,
'photo_url' => isset($photo_url[$i]) ? $photo_url[$i] : null // null when there are fewer images than products
);
}
print_r($result);
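Pairing by index only works when the two queries return the same number of nodes in the same order. If each product's link and image happen to share a container element, a variation is to query both relative to that container. The following is a minimal sketch under that assumption: the markup, the `div.product` container, the root URLs and the shop name are all made up for illustration and are not taken from the page being scraped.

```php
<?php
// Hypothetical markup: each product's link and image share one container.
$html = '<div class="product"><a href="/p/123">One</a><img src="/img/abc.jpg"></div>'
      . '<div class="product"><a href="/p/456">Two</a><img src="/img/def.jpg"></div>';

// Assumed values, for illustration only.
$product_url_root = 'http://shop.example';
$photo_url_root   = 'http://shop.example';
$shop_name        = 'name';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$products = array();

foreach ($xpath->query('//div[@class="product"]') as $node) {
    // Passing $node as the second argument makes the query relative to this product.
    $link = $xpath->query('.//a', $node)->item(0);
    $img  = $xpath->query('.//img', $node)->item(0);

    $products[] = array(
        'product_url' => $link ? $product_url_root . $link->getAttribute('href') : null,
        'shop_name'   => $shop_name,
        'photo_url'   => $img ? $photo_url_root . $img->getAttribute('src') : null,
    );
}

print_r($products);
```

Because every entry is built from a single container, a product without an image gets a null photo_url instead of shifting the images of all later products.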
Related
I have one or more objects in a foreach loop and I want to merge all of them into one in $refJSON.
$refObj = (object) array();
foreach($items as $item) { // here I'm looping over two items
$refObj->refId = $item->getId();
$refObj->refLastName = $item->getLastName();
$refObj->refPhone = $item->getPhone();
$refObj->refEmail = $item->getEmail();
}
$refJSON = json_encode($refObj);
var_dump($refJSON);
Output :
//just the last item object
string(92) "{
"refId":"2",
"refLastName":"Joe",
"refPhone":"xxxxxxx",
"refEmail":"example#domaine.com"
}"
The expected output merges all the items (ids 1 and 2), something like this:
[
{
"refId":"1",
"refLastName":"Steve",
"refPhone":"xxxxxxx",
"refEmail":"foo#domaine.com"
},
{
"refId":"2",
"refLastName":"Joe",
"refPhone":"xxxxxxx",
"refEmail":"example#domaine.com"
}
]
You are just overwriting the same object each time. Build a separate entry for each item, append it to an array (using []), and encode the resulting array:
$refOut = array();
foreach($items as $item) { //here Im looping Two items
$refOut[] = ['refId' => $item->getId(),
'refLastName' => $item->getLastName(),
'refPhone' => $item->getPhone(),
'refEmail' => $item->getEmail()];
}
$refJSON = json_encode($refOut);
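The same idea can be written with array_map, which returns one new array per item and never touches a shared object. This is only a sketch: the Item class below is a hypothetical stand-in for whatever class provides getId(), getLastName(), getPhone() and getEmail() in the real code, and the sample values are invented.

```php
<?php
// Hypothetical stand-in for the item class used in the question.
class Item
{
    private $id;
    private $lastName;
    private $phone;
    private $email;

    public function __construct($id, $lastName, $phone, $email)
    {
        $this->id = $id;
        $this->lastName = $lastName;
        $this->phone = $phone;
        $this->email = $email;
    }

    public function getId() { return $this->id; }
    public function getLastName() { return $this->lastName; }
    public function getPhone() { return $this->phone; }
    public function getEmail() { return $this->email; }
}

// Invented sample data.
$items = array(
    new Item('1', 'Steve', 'xxxxxxx', 'foo@example.com'),
    new Item('2', 'Joe', 'xxxxxxx', 'joe@example.com'),
);

// Map every item to its own associative array.
$refOut = array_map(function ($item) {
    return array(
        'refId'       => $item->getId(),
        'refLastName' => $item->getLastName(),
        'refPhone'    => $item->getPhone(),
        'refEmail'    => $item->getEmail(),
    );
}, $items);

// Sequential integer keys make json_encode emit a JSON array of objects.
$refJSON = json_encode($refOut);
echo $refJSON, "\n";
```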
I have a variable $a='san-serif' and an array $font_list, and I want to keep only the entries whose category is 'san-serif'. I have tried a lot of code but nothing seems to work. Here is my code:
public function filterFont() {
$a = $_POST['key'];
$url = "https://www.googleapis.com/webfonts/v1/webfonts?key=''";
$result = json_decode(file_get_contents( $url ));
$font_list = "";
foreach ( $result->items as $font )
{
$font_list[] = [
'font_name' => $font->family,
'category' => $font->category,
'variants' => implode(', ', $font->variants),
// subsets
// version
// files
];
}
$filter = filter($font_list);
print_r(array_filter($font_list, $filter));
}
Please help me :-(
As I understand it, you want something like the following:
<?php
$a = 'san-serif'; // category you want to search for
// sample data shaped like your original array
$font_list = array(
    array('font_name' => "sans-sherif", 'category' => "san-serif"),
    array('font_name' => "times-new-roman", 'category' => "san-serif"),
    array('font_name' => "sans-sherif", 'category' => "roman"),
);
echo "<pre>"; print_r($font_list); echo "</pre>"; // print original array
$filtered_data = array(); // create new array
foreach ($font_list as $key => $value) { // iterate through original array
    if ($value['category'] == $a) { // entry's category equals the search category
        $filtered_data[$key] = $value; // copy that entry to the new array
    }
}
echo "<pre>"; print_r($filtered_data); echo "</pre>"; // print the new array
Output:- https://eval.in/597605
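Since the question attempted array_filter, here is a minimal sketch of the same filter written that way. The sample data is invented, and the category value is spelled exactly as in the question; the closure pulls $a into scope with `use`.

```php
<?php
$a = 'san-serif'; // category to keep, spelled as in the question

// Invented sample data in the shape built by the question's loop.
$font_list = array(
    array('font_name' => 'Font A', 'category' => 'san-serif'),
    array('font_name' => 'Font B', 'category' => 'san-serif'),
    array('font_name' => 'Font C', 'category' => 'roman'),
);

// array_filter keeps the entries for which the callback returns true.
$filtered = array_filter($font_list, function ($font) use ($a) {
    return $font['category'] === $a;
});

// array_filter preserves the original keys; array_values renumbers from 0.
$filtered = array_values($filtered);

print_r($filtered);
```

Note that $font_list has to be initialised as an array (array()), not as an empty string, before entries are appended to it with [].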
I am developing a search engine based on the vector space model. I have successfully computed tf-idf with data defined in an associative array in the code. Now I want the data to come from a directory that contains folders, each holding a number of text files with dummy data. I have tried a lot but I am stuck at one point with the glob function: inside the foreach loop over glob's results, I want each .txt file as the key and its contents as the value. Below is my code.
Tf-idf With Associative Array Data
$collection = array(
1 => 'this string is a short string but a good string',
2 => 'this one isn\'t quite like the rest but is here',
3 => 'this is a different short string that\' not as short'
);
$dictionary = array();
$docCount = array();
foreach($collection as $docID => $doc) {
$terms = explode(' ', $doc);
$docCount[$docID] = count($terms);
foreach($terms as $term) {
if(!isset($dictionary[$term])) {
$dictionary[$term] = array('df' => 0, 'postings' => array());
}
if(!isset($dictionary[$term]['postings'][$docID])) {
$dictionary[$term]['df']++;
$dictionary[$term]['postings'][$docID] = array('tf' => 0);
}
$dictionary[$term]['postings'][$docID]['tf']++;
}
}
$temp = array('docCount' => $docCount, 'dictionary' => $dictionary);
As you can see, in the first foreach loop $docID is the key and $doc is the content (value) of each element of the $collection array. But I don't know how to do exactly the same thing when the files are read from a directory. See the code below.
Tf-idf With .txt Files and its contents read from directory
foreach (glob("C:\\wamp\\www\\Web-info\\documents\\awd_1990_00\\*.txt") as $file) {
$file_handle = fopen($file, "r");
//echo $file;
$dictionary = array();
$docCount = array();
foreach($file as $docID=> $value) {
echo $value;
$terms = explode(' ', $doc);
$docCount[$docID] = count($terms);
foreach($terms as $term) {
if(!isset($dictionary[$term])) {
$dictionary[$term] = array('df' => 0, 'postings' => array());
}
if(!isset($dictionary[$term]['postings'][$docID])) {
$dictionary[$term]['df']++;
$dictionary[$term]['postings'][$docID] = array('tf' => 0);
}
$dictionary[$term]['postings'][$docID]['tf']++;
}
}
}
$temp = array('docCount' => $docCount, 'dictionary' => $dictionary);
This gives me the error "Invalid argument supplied for foreach()". As I mentioned earlier, I want each .txt file as the key and its contents as the value in the foreach loop. Can anybody please tell me how to do this? Thanks in advance.
If you want to treat the entire file as one value, you can use file_get_contents() to read the file into a string:
$dictionary = array();
$docCount = array();
foreach (glob("C:\\wamp\\www\\Web-info\\documents\\awd_1990_00\\*.txt") as $docID) {
$value = file_get_contents($docID);
...
}
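To show how the pieces fit together, here is a self-contained sketch that fills in the loop body with the dictionary-building code from the question. It is an illustration under stated assumptions, not the asker's actual setup: it writes two invented sample files to a temporary directory instead of reading the real documents folder, uses the file's base name as the document ID, and splits on whitespace with preg_split rather than explode.

```php
<?php
// Build a throwaway directory with two invented sample files so the sketch runs anywhere.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'tfidf_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . DIRECTORY_SEPARATOR . 'a.txt', 'this string is a short string');
file_put_contents($dir . DIRECTORY_SEPARATOR . 'b.txt', 'this one is here');

// Initialise once, outside the loop, so every file adds to the same index.
$dictionary = array();
$docCount   = array();

foreach (glob($dir . DIRECTORY_SEPARATOR . '*.txt') as $file) {
    $docID = basename($file);          // the file name acts as the key
    $doc   = file_get_contents($file); // the file contents act as the value

    // Split on runs of whitespace and drop empty pieces.
    $terms = preg_split('/\s+/', $doc, -1, PREG_SPLIT_NO_EMPTY);
    $docCount[$docID] = count($terms);

    foreach ($terms as $term) {
        if (!isset($dictionary[$term])) {
            $dictionary[$term] = array('df' => 0, 'postings' => array());
        }
        if (!isset($dictionary[$term]['postings'][$docID])) {
            $dictionary[$term]['df']++;
            $dictionary[$term]['postings'][$docID] = array('tf' => 0);
        }
        $dictionary[$term]['postings'][$docID]['tf']++;
    }
}

$temp = array('docCount' => $docCount, 'dictionary' => $dictionary);
print_r($temp['docCount']);

// Remove the sample files and directory again.
array_map('unlink', glob($dir . DIRECTORY_SEPARATOR . '*.txt'));
rmdir($dir);
```

glob() returns plain path strings, which is why looping over $file itself with foreach fails; each path has to be read with file_get_contents() first.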
I am trying to convert a JSON feed from Twitter API 1.1 into arrays. The structure I am aiming for is:
foreach($user))
Main_Array{
Name:
Id:
Array{
Array {
Tweet:
created at:
}
Array {
Tweet:
created at:
}
}
}
}
var_dump(Main_Array()); //unable to get the main array here. Only the last element of array is pulled
Here is what I tried:
$tweets = $connection->get("https://api.twitter.com/1.1/statuses/user_timeline.json?screen_name=".$user."&count=".$notweets);
$status = array();
foreach($tweets as $key) {
$status['text'] = $key ->text;
$status['stamp'] = $key -> created_at;
}
$tweetfeed = array (
'name' => $name,
'id' => $tweetname,
'status' => $status
);
I am only getting the last value for status array.
Also, I would like to know whether the structure I am using is good; please suggest improvements if it can be better.
Thanks in advance.
You overwrite the previous $status entries on every iteration of your foreach loop. You have to create a new sub-array for each tweet.
Here is a way to do it :
$i = 0;
foreach($tweets as $key) {
$status[$i]['text'] = $key ->text;
$status[$i]['stamp'] = $key -> created_at;
++$i;
}
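A shorter variant appends with [] instead of keeping a counter, and then plugs the result into the $tweetfeed structure from the question. This is a sketch only: the $tweets array below is an invented stand-in for the decoded API response, and $name and $tweetname hold made-up values.

```php
<?php
// Invented stand-ins for the decoded API response and the user details.
$tweets = array(
    (object) array('text' => 'First tweet',  'created_at' => 'Mon Jan 01 10:00:00 +0000 2018'),
    (object) array('text' => 'Second tweet', 'created_at' => 'Tue Jan 02 11:30:00 +0000 2018'),
);
$name      = 'Example User';
$tweetname = 'example_user';

$status = array();
foreach ($tweets as $tweet) {
    // [] appends a new sub-array, so earlier tweets are kept.
    $status[] = array(
        'text'  => $tweet->text,
        'stamp' => $tweet->created_at,
    );
}

$tweetfeed = array(
    'name'   => $name,
    'id'     => $tweetname,
    'status' => $status,
);

print_r($tweetfeed);
```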
Here is an example of how my array should look:
$library = array(
'book' => array(
array(
'authorFirst' => 'Mark',
'authorLast' => 'Twain',
'title' => 'The Innocents Abroad'
),
array(
'authorFirst' => 'Charles',
'authorLast' => 'Dickens',
'title' => 'Oliver Twist'
)
)
);
I get the results from an Oracle database like this:
$row = oci_fetch_array($refcur, OCI_ASSOC+OCI_RETURN_NULLS);
But when I execute my code I only get one row.
For example: <books><book></book><name></name></books>
But I want all rows to be shown in xml.
EDIT:
This is my class for converting array to xml:
class ArrayToXML
{
public static function toXml($data, $rootNodeName = 'data', &$xml=null)
{
// turn off compatibility mode as simple xml throws a wobbly if you don't.
if (ini_get('zend.ze1_compatibility_mode') == 1)
{
ini_set ('zend.ze1_compatibility_mode', 0);
}
if (is_null($xml))
{
$xml = simplexml_load_string("<".key($data)."/>");
}
// loop through the data passed in.
foreach($data as $key => $value)
{
// if numeric key, assume array of rootNodeName elements
if (is_numeric($key))
{
$key = $rootNodeName;
}
// delete any char not allowed in XML element names
$key = preg_replace('/[^a-z0-9\-\_\.\:]/i', '', $key);
// if another array is found, recursively call this function
if (is_array($value))
{
// create a new node unless this is an array of elements
$node = ArrayToXML::isAssoc($value) ? $xml->addChild($key) : $xml;
// recursive call - pass $key as the new rootNodeName
ArrayToXML::toXml($value, $key, $node);
}
else
{
// add single node.
$value = htmlentities($value);
$xml->addChild($key,$value);
}
}
// pass back as string. or simple xml object if you want!
return $xml->asXML();
}
// determine if a variable is an associative array
public static function isAssoc( $array ) {
return (is_array($array) && 0 !== count(array_diff_key($array, array_keys(array_keys($array)))));
}
}
?>
I have now tried the response below, but the problem is that I get <book>...</book> tags after each row. Then I tried a 3-dimensional array; now I get <book><book>...</book></book> in the proper place, but I have two of them.
This is the line that determines the root element for the array, which is why I get this output, but I don't know how to change it: $xml = simplexml_load_string("<".key($data)."/>");
Thank you.
oci_fetch_array() returns a single row per call. To get all of the rows, call it repeatedly until there are no more rows to fetch:
while ($row = oci_fetch_array($refcur, OCI_ASSOC+OCI_RETURN_NULLS))
{
$library['book'][] = $row;
}
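The fetch-until-false pattern can be tried without a database. In the sketch below, fetch_row() is a hypothetical stand-in for oci_fetch_array(), the rows are invented, and the XML is built directly with SimpleXMLElement rather than with the ArrayToXML class, so treat it as an illustration of the loop and not as a drop-in replacement.

```php
<?php
// Hypothetical stand-in for oci_fetch_array(): returns the next row,
// or false once no rows are left.
function fetch_row(array &$pending)
{
    return $pending ? array_shift($pending) : false;
}

// Invented rows; Oracle returns upper-case column names by default.
$pending = array(
    array('AUTHORFIRST' => 'Mark', 'AUTHORLAST' => 'Twain', 'TITLE' => 'The Innocents Abroad'),
    array('AUTHORFIRST' => 'Charles', 'AUTHORLAST' => 'Dickens', 'TITLE' => 'Oliver Twist'),
);

// Collect every row, not just the first one.
$library = array('book' => array());
while (($row = fetch_row($pending)) !== false) {
    $library['book'][] = $row;
}

// One <book> element per row, one child element per column.
$xml = new SimpleXMLElement('<library/>');
foreach ($library['book'] as $row) {
    $book = $xml->addChild('book');
    foreach ($row as $column => $value) {
        // addChild() does not escape ampersands, so escape the value first.
        $book->addChild(strtolower($column), htmlspecialchars((string) $value));
    }
}

echo $xml->asXML();
```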