PHPquery lib. and parsing XML - php

I started using the phpquery thingy, but I got lost in all that documentation.
In case someone does not know what the hell I am talking about: http://code.google.com/p/phpquery/
My question is pretty much basic.
I succeeded at loading an XML document and now I want to parse all the tags from it.
Using pq()->find('title') I can output all of the contents inside the title tags. Great!
But I want to put every <title> tag in a variable. So, let's say there are 10 <title> tags: I want each of them in a separate variable, like $title1, $title2 ... $title10. How can this be done?
Hope you understand the question.
TIA!

You could do it like this:
phpQuery::unloadDocuments();
phpQuery::newDocument($content);
$allTitles = [];
pq('title')->each(function ($item) use (&$allTitles) {
    $allTitles[] = pq($item)->text();
});
var_dump($allTitles);
For example if there are 3 titles in the $content this var_dump will output:
array(3) {
[0] =>
string(6) "title1"
[1] =>
string(6) "title2"
[2] =>
string(6) "title3"
}
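If phpQuery is not a hard requirement, the same collection step can be sketched with PHP's built-in DOM extension. This is only an illustration: the sample XML string is made up, and the destructuring line (PHP 7.1+) is there only to show how named variables could be pulled out of the array if they are really needed.

```php
<?php
// Hypothetical sample input; in the question this would be the loaded XML document.
$xml = '<feed><item><title>title1</title></item>'
     . '<item><title>title2</title></item>'
     . '<item><title>title3</title></item></feed>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// Collect the text of every <title> element, in document order.
$allTitles = [];
foreach ($doc->getElementsByTagName('title') as $node) {
    $allTitles[] = $node->textContent;
}

// An array usually replaces $title1 ... $title10; destructure only if
// separately named variables are really required.
[$title1, $title2, $title3] = $allTitles;

var_dump($allTitles);
```

Keeping the values in one array means the code does not have to change when the number of titles changes.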


How to display print_r result in source code with multiple lines

I have a simple php code which changes the order of the name inside an array.
$arr = [
"Meier, Peter",
"Schulze, Monika",
"Schmidt, Ursula",
"Brosowski, Klaus",
];
foreach ($arr as $name => $name2) {
    $vname = explode(", ", $name2);
    $new = array_reverse($vname);
    $arr[$name] = implode(", ", $new);
}
echo "<pre>".print_r($arr, true)."</pre>";
Basically, I would like to edit the code so that the HTML source displays the array not on one line but on several lines (one for each first name + last name), as shown below:
Array
(
[0] => Peter, Meier
[1] => Monika, Schulze
[2] => Ursula, Schmidt
[3] => Klaus, Brosowski
)
Right now the source code shows exactly the same result, but all on one line. Is it possible to adapt the print_r command in this way?
Best Regards
Edit for clarification:
My code produces the array I posted in my question. In the HTML source code, the same array is on one line: [0] => Peter, Meier [1] => Monika, Schulze and so on. So there is a difference between how my result is structured and how the source code is structured. I would like to change the structure of the array in the source code so that it looks like the actual result.
You can carry the line breaks from print_r over to HTML with the function nl2br():
echo nl2br(print_r($arr, true));
Or you can build an HTML list instead:
echo '<ul>'.PHP_EOL;
foreach ($arr as $name) {
    echo "<li>{$name}</li>".PHP_EOL;
}
echo '</ul>'.PHP_EOL;
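As a quick check, here is a small sketch (with a shortened, made-up array) showing that the string returned by print_r($arr, true) already contains newline characters. As far as I can tell, that means the raw page source inside <pre> should already span several lines, and a one-line display is more likely down to the tool used to look at it. The htmlspecialchars() call is my own addition, not part of the original code.

```php
<?php
// Shortened sample data in the same shape as the question's array.
$arr = ["Peter, Meier", "Monika, Schulze"];

// With true as the second argument, print_r() returns the dump as a string.
$dump = print_r($arr, true);

// htmlspecialchars() keeps any <, > or & in the data from being parsed as markup.
$html = '<pre>' . htmlspecialchars($dump) . '</pre>';

echo $html;
```

Viewing the raw response (for example with "View Page Source" rather than the element inspector) shows whether the newlines arrive in the browser.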
I think I found out what you were trying to say:
The output is based on https://3v4l.org/LLvZW from @tcj
When printing with print_r your output looks like this:
When you inspect your HTML elements in your browser your inspector shows this:
(Note: everything is in one line)
The clarifications:
The output looks like this because <pre> uses a fixed-width font and applies white-space: pre;
If you remove white-space: pre; you should end with an output like this:
If you inspect your element in the browser, it will still all be in one line
HOWEVER, if you try to edit the element inside your browser inspector then it WON'T be in one line.
So my guess is that this is a display optimization in the browser's inspector, probably so you can see more of the surrounding HTML.
(I'm on Firefox, by the way.)

Simple way to read variables on different lines from STDIN?

I want to read two integers on two lines like:
4
5
This code works:
fscanf(STDIN,"%d",$num);
fscanf(STDIN,"%d",$v);
But I wonder if there's a shorter way to write this? (For more variables, I don't want to write a statement for each variable) Like:
//Each of the following two lines leaves the second variable as NULL
fscanf(STDIN,"%d%d",$num,$v);
fscanf(STDIN,"%d\n%d",$num,$v);
Update: I solved this by using the method from the answer to read the values into an array, and then list() to assign variables from that array.
Consider this example:
<?php
$formatCatalog = '%d,%s,%s,%d';
$inputValues = [];
foreach (explode(',', $formatCatalog) as $formatEntry) {
    fscanf(STDIN, trim($formatEntry), $inputValues[]);
}
var_dump($inputValues);
When executing and feeding it with
1
foo
bar
4
you will get this output:
array(4) {
[0] =>
int(1)
[1] =>
string(3) "foo"
[2] =>
string(3) "bar"
[3] =>
int(4)
}
Bottom line: you certainly can use loops or similar for this, which shortens the code a bit and, more importantly, makes it easier to maintain. However, if you want a different format for each read, you still have to specify those formats somewhere, so there is a limit to how much shorter the code can get.
Things are different if you do not want to handle different types of input formats. In that case you can use a generic loop:
<?php
$inputValues = [];
while (!feof(STDIN)) {
    fscanf(STDIN, '%d', $inputValues[]);
}
var_dump($inputValues);
Now if you feed this with
1
2
3
on standard input and then detach the input (by pressing CTRL-D for example), then the output you get is:
array(3) {
[0] =>
int(1)
[1] =>
int(2)
[2] =>
int(3)
}
The same code obviously works with input redirection, so you can feed a file into the script, which makes detaching standard input unnecessary.
If your code allows it, read the values into an array (the first line of input is the number of values that follow):
fscanf(STDIN, "%d\n", $n);
$num = array();
while ($n--) {
    fscanf(STDIN, "%d\n", $num[]);
}
print_r($num);
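The update in the question (read into an array, then assign variables from it) could look roughly like the sketch below. readInts() is a hypothetical helper, not a built-in, and an in-memory stream stands in for STDIN so the example runs without typing any input; in a real script you would pass STDIN instead.

```php
<?php
/**
 * Read up to $count integers, one per line, from an open stream.
 * readInts() is a hypothetical helper name used only for this sketch.
 */
function readInts($handle, int $count): array
{
    $values = [];
    // fscanf() returns the number of assigned values; anything other
    // than 1 (including false at end of input) stops the loop.
    while (count($values) < $count && fscanf($handle, '%d', $value) === 1) {
        $values[] = $value;
    }
    return $values;
}

// An in-memory stream stands in for STDIN here.
$input = fopen('php://memory', 'w+');
fwrite($input, "4\n5\n");
rewind($input);

// Assign the two values to named variables (PHP 7.1+ short list syntax).
[$num, $v] = readInts($input, 2);

var_dump($num, $v);
```

With this shape, adding a third variable only means changing the count and the destructuring line, not adding another fscanf() statement.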

PHP+JSON Display result with no key?

First of all, sorry because my english is bad.
I've got a problem.
I'm requesting data from a URL in JSON format and get something like:
array(1) {
  ["NICK_HERE"]=> array(5) {
    ["id"]=> int(123456789)
    ["name"]=> string(11) "NICK_HERE"
    ["class"]=> int(538)
    ["level"]=> int(97)
    ["online"]=> int(1420061059000)
  }
}
Then I want to display something from it. Normally that would be $x['NICK_HERE']['id'], but NICK_HERE will keep changing and I can't use a variable there. Is there a way around this?
For example, something like $x[0]['id']?
Sort of "choose the first element, no matter what its key is"?
Appreciate your help!
P.s. Happy New Year!
My advice (because I think it is the simplest) is to use current(), but you can also use array_column()...
Code:
$array=[
"NICK_HERE"=>["id"=>123456789,"name"=>"NICK_HERE","class"=>538,"level"=>97,"online"=>1420061059000]
];
echo current($array)['id'];
echo "\n";
echo array_column($array,'id')[0];
Output:
123456789
123456789
These methods assume that the subarray element id is guaranteed to exist. If it might not, you will need to check with isset() before trying to access -- to avoid a Notice.
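A guarded variant might look like the sketch below. It assumes the data arrives as a JSON string shaped like the dump in the question (the $json literal here is made up to match it) and uses reset() to fetch the first element whatever its key is, with an isset() check before reading id.

```php
<?php
// Sample payload shaped like the var_dump in the question (assumed, not real data).
$json = '{"NICK_HERE":{"id":123456789,"name":"NICK_HERE",'
      . '"class":538,"level":97,"online":1420061059000}}';

// Passing true decodes JSON objects as associative arrays.
$data = json_decode($json, true);

// reset() returns the first element regardless of its key, or false if the array is empty.
$first = is_array($data) ? reset($data) : false;

// Only read 'id' when it is actually there; otherwise fall back to null.
$id = (is_array($first) && isset($first['id'])) ? $first['id'] : null;

echo $id, "\n";
```

If $id comes back as null, either the response was not valid JSON or the first entry had no id field, so no Notice is raised in either case.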
Since you know the first array has one key name you can get the 0th key.
test.php
<?php
$x = array(
    "NICK_HERE" => array(
        "id" => 123456789,
        "name" => "NICK_HERE",
        "class" => 538,
        "level" => 97,
        "online" => 1420061059000
    )
);
$name = array_keys($x)[0];
echo $x[$name]["id"];
?>
output:
php test.php
123456789%
Use a foreach loop.
Suppose your array is named $array:
foreach ($array as $i => $a) {
    echo "index: ".$i." id:".$a['id']."<br>";
}
Suppose your JSON string is named $json. Then do it this way:
$array = json_decode($json);
foreach ($array as $i => $a) {
    echo "index: ".$i." id:".$a->id."<br>"; // here each entry is an object
}
Hope you got the idea.

Extract info from html?

First of all, I've seen a good deal of similar questions. I know regex or DOM can be used, but I can't find any good examples of DOM, and regex makes me pull my hair out. In addition, I need to pull multiple values out of the HTML source, some from element contents and some from attributes.
Here is an example of the html I need to get info from:
<div class="log">
  <div class="message">
    <abbr class="dt" title="time string">
      DATA_1
    </abbr>
    :
    <cite class="user">
      <a class="tel" href="tel:+xxxx">
        <abbr class="fn" title="DATA_2">
          Me
        </abbr>
      </a>
    </cite>
    :
    <q>
      DATA_3
    </q>
  </div>
</div>
The "message" block may occur once or hundreds of times. I am trying to end up with data like this:
array(4) {
[0] => array(3) {
["time"] => "DATA_1"
["name"] => "DATA_2"
["message"] => "DATA_3"
}
[1] => array(3) {
["time"] => "DATA_1"
["name"] => "DATA_2"
["message"] => "DATA_3"
}
[2] => array(3) {
["time"] => "DATA_1"
["name"] => "DATA_2"
["message"] => "DATA_3"
}
[3] => array(3) {
["time"] => "DATA_1"
["name"] => "DATA_2"
["message"] => "DATA_3"
}
}
I tried using SimpleXML, but it only seems to work on very simple HTML pages. Could someone link me to some examples? I get really confused since I need to get DATA_2 from a title attribute. What do you think is the best way to extract this data? It seems very similar to XML extraction, which I have done, but I need to use some other method.
Here is an example using DOMDocument and DOMXpath to parse your HTML.
$doc = new DOMDocument;
$doc->loadHTMLFile('your_file.html');
$xpath = new DOMXpath($doc);
$res = array();
foreach ($xpath->query('//div[@class="message"]') as $elem) {
    $res[] = array(
        'time' => $xpath->query('abbr[@class="dt"]', $elem)->item(0)->nodeValue,
        'name' => $xpath->query('cite/a/abbr[@class="fn"]', $elem)->item(0)->getAttribute('title'),
        'message' => $xpath->query('q', $elem)->item(0)->nodeValue,
    );
}
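A self-contained variant of the same idea is sketched below. It is an assumption-laden illustration rather than the answer's exact method: it loads the markup from a string instead of a file, uses DOMXPath::evaluate() with string() so a missing node yields an empty string instead of an error, and trims the values because the text nodes in the real markup carry surrounding whitespace.

```php
<?php
// Sample markup modeled on the question's HTML (condensed onto fewer lines).
$html = <<<'MARKUP'
<div class="log">
  <div class="message">
    <abbr class="dt" title="time string">DATA_1</abbr> :
    <cite class="user"><a class="tel" href="tel:+xxxx"><abbr class="fn" title="DATA_2">Me</abbr></a></cite> :
    <q>DATA_3</q>
  </div>
</div>
MARKUP;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$messages = [];
foreach ($xpath->query('//div[@class="message"]') as $message) {
    // string(...) evaluates to '' when nothing matches, so no null checks are needed.
    $messages[] = [
        'time'    => trim($xpath->evaluate('string(abbr[@class="dt"])', $message)),
        'name'    => $xpath->evaluate('string(cite/a/abbr[@class="fn"]/@title)', $message),
        'message' => trim($xpath->evaluate('string(q)', $message)),
    ];
}

var_dump($messages);
```

Each div with class "message" becomes one entry, so the same loop handles one block or hundreds.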
Can I suggest using xPath? It seems like a perfect candidate for what you want to do (but I may be misinterpreting what you're asking).
XPath will let you select particular nodes of an XML/HTML tree, and then you can operate on them from there. After that, it should be a simple task (or a tiny bit of simple regex at most). Personally, I love regex, so let me know if you need help with that.
Your XPath statements will look something like (assuming no conflicting names):
time (data 1):
/div/div/abbr/text()
name (data 2):
/div/div/cite/a/abbr/@title
message (data 3):
/div/div/q/text()
You can get more technical than this if, for example, you want to identify the elements via their attributes, but what I've given you will be pretty fast.

Parse Xml file for comparison

OK, this is driving me crazy.
I have been trying to parse an XML file into a specific array or object so I can compare it to a similar file to test for differences.
However, I have had no luck. I have been attempting to use SimpleXMLIterator and SimpleXMLElement to do this.
Here are some samples:
<xml>
//This is the first record of 1073
<viddb>
<movies>1074</movies>
<movie>
<title>10.5</title>
<origtitle>10.5</origtitle>
<year>2004</year>
<genre>Disaster</genre>
<release></release>
<mpaa></mpaa>
<director>John Lafia</director>
<producers>Howard Braunstein, Jeffrey Herd</producers>
<actors>Kim Delaney, Fred Ward, Ivan Sergei</actors>
<description>An earthquake reaching a 10.5 magnitude on the Richter scale, strikes the west coast of the U.S. and Canada. A large portion of land falls into the ocean, and the situation is worsened by aftershocks and tsunami.</description>
<path>E:\www\Media\Videos\Disaster\10.5.mp4</path>
<length>164</length>
<size>3648</size>
<resolution>640x272</resolution>
<framerate>29.97</framerate>
<videocodec>AVC</videocodec>
<videobitrate>2966</videobitrate>
<label>Roku Media</label>
<poster>images/10.5.jpg</poster>
</movie>
Here is the object this record produces using $iter = new SimpleXMLIterator($xml, 0, TRUE);
object(SimpleXMLIterator)#71 (1) {
["viddb"] => object(SimpleXMLIterator)#72 (2) {
["movies"] => string(4) "1074"
["movie"] => array(1074) {
[0] => object(SimpleXMLIterator)#73 (19) {
["title"] => string(4) "10.5"
["origtitle"] => string(4) "10.5"
["year"] => string(4) "2004"
["genre"] => string(8) "Disaster"
["release"] => object(SimpleXMLIterator)#1158 (0) {
}
["mpaa"] => object(SimpleXMLIterator)#1159 (0) {
}
["director"] => string(10) "John Lafia"
["producers"] => string(31) "Howard Braunstein, Jeffrey Herd"
["actors"] => string(35) "Kim Delaney, Fred Ward, Ivan Sergei"
["description"] => string(212) "An earthquake reaching a 10.5 magnitude on the Richter scale, strikes the west coast of the U.S. and Canada. A large portion of land falls into the ocean, and the situation is worsened by aftershocks and tsunami."
["path"] => string(37) "E:\www\Media\Videos\Disaster\10.5.mp4"
["length"] => string(3) "164"
["size"] => string(4) "3648"
["resolution"] => string(7) "640x272"
["framerate"] => string(5) "29.97"
["videocodec"] => string(3) "AVC"
["videobitrate"] => string(4) "2966"
["label"] => string(10) "Roku Media"
["poster"] => string(15) "images/10.5.jpg"
}
What I'm trying to produce (at the moment) is a single-level associative array for each movie. All the examples I've read and followed always produced an array of arrays, which is much more difficult to work with.
This is where I'm at:
$iter = new SimpleXMLIterator($xml, 0, TRUE);
Zend_Debug::dump($iter);
//so far xpath has not worked for me, I can't get $result to return anything
$result = $iter->xpath('/xml/viddb/movies/movie');
$movies = array();
for ($iter->rewind(); $iter->valid(); $iter->next()) {
    foreach ($iter->getChildren() as $key => $value) {
        //I can get each movie title to echo but when I try to put them into an
        // array it only has the last record
        echo $value->title . '<br />';
        $movies['title'] = $value->title;
    }
}
return $movies;
I feel like I'm missing something simple and obvious...as usual :)
[EDIT]
I found my error, I was tripping over the array of objects thing. I had to cast the data I wanted as a string to make it work how I wanted. Just for info here is what I came up with to put me on the track I wanted:
public function indexAction() {
    $xml = APPLICATION_PATH . '/../data/Videos.xml';
    $iter = new SimpleXMLElement($xml, 0, TRUE);
    $result = $iter->xpath('//movie');
    $movies = array();
    foreach ($result as $key => $movie) {
        $movies[$key + 1] = (string) $movie->title;
    }
    Zend_Debug::dump($movies, 'Movies');
}
XPath is the answer you are looking for. I think your XPath isn't working because it looks for a movie node under the movies node, but the movies node does not have any children.
Edit: I think it might be easier to just use a foreach loop instead of the iterator. I had to look up the iterator, as I had never seen it before, even though I've been using SimpleXML and XPath for a while. Also, I believe you should only use SimpleXMLElement if you are planning on editing the XML as well. If you simply want to read it for comparison, it's best to use simplexml_load_file. You can also simplify your XPath to:
xpath('//movie');
If you just need to compare the entire file contents, read the contents of both files into a string and do a string comparison. Otherwise, you can do the same at a lower level of the document by getting the innerXML of any node.
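For a field-by-field comparison, one possible approach is to flatten each movie into a single-level associative array of strings and diff the arrays. The sketch below is an assumption on my part: moviesByTitle() is a hypothetical helper, the two XML strings are tiny made-up stand-ins for the real files, and keying the movies by title only works if titles are unique.

```php
<?php
/**
 * Turn every <movie> element into a flat associative array of strings,
 * keyed by the movie's title. moviesByTitle() is a hypothetical helper.
 */
function moviesByTitle(string $xml): array
{
    $root = new SimpleXMLElement($xml);
    $movies = [];
    foreach ($root->xpath('//movie') as $movie) {
        $fields = [];
        foreach ($movie->children() as $name => $value) {
            // Casting to string avoids storing SimpleXMLElement objects.
            $fields[$name] = (string) $value;
        }
        $movies[$fields['title'] ?? ''] = $fields;
    }
    return $movies;
}

// Two tiny made-up documents standing in for the files being compared.
$old = moviesByTitle('<viddb><movie><title>10.5</title><year>2004</year></movie></viddb>');
$new = moviesByTitle('<viddb><movie><title>10.5</title><year>2005</year></movie></viddb>');

// Fields whose values differ in the new file for the same movie.
$changed = array_diff_assoc($new['10.5'], $old['10.5']);

var_dump($changed);
```

For files on disk, the same helper could be fed with file_get_contents(); array_diff_key() on the two title-keyed arrays would then show movies that were added or removed.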
