phpQuery: finding which tag a word is in - php

I'm currently using this code to retrieve tags.
$title = $pq->find("title")->text();
$h1 = $pq->find("h1")->text();
$p = $pq->find("p")->text();
Is this the proper way of doing it?
Secondly, I have to find out which word from my array $array_words is in which tag. I have retrieved the page with file_get_contents, removed all tags, and put all the words in an array. Take this for example:
Array
(
[0] => hello
[1] => there
[2] => this
[3] => is
[4] => a
[8] => test
[9] => array
)
and this would be the HTML:
<html>
<head>
<title>
hello there
</title>
</head>
<body>
<h1>
this is a
</h1>
<p>
test array
</p>
</body>
</html>
How can I find out which word is found in which tag?
I hope I made somewhat clear what I'm trying to do.

Based on the question, you need to build a reference of which words from $array_words appear in which HTML tag.
So you have an array of tags that you want to check, right?
This is how I see it:
Get all the tags that you want to check.
Loop over those tags with a foreach.
Inside the foreach, use phpQuery to find the text inside each tag.
phpQuery returns text, so break it into a new array of words called $words_from_text using explode.
Use in_array in a second foreach (inside the first one) to find which words from $array_words are in the text.
If a word from $words_from_text is found in $array_words, add it to an array stored under that tag's key in the array of tags.
$array_tags = array('h1', 'div', 'title');
$array_words = array(
    0 => 'hello',
    1 => 'there',
    2 => 'this',
    3 => 'is',
    4 => 'a',
    8 => 'test',
    9 => 'array',
);
The final array with the results should look like this:
$array_tags = array(
    'title' => array('word1', 'word2'),
    'h1' => array('word3', 'word4'),
    'div' => array('word5', 'word6'),
);
So if this example is what you need, you can use this guideline to solve your problem.
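The guideline above can be sketched roughly as follows. This is only an illustration: it uses PHP's built-in DOMDocument in place of phpQuery so that it runs without extra libraries, and the function name wordsByTag is made up for this example.

```php
<?php
// Sketch: map each tag to the words from $words found in its text.
// DOMDocument stands in for phpQuery here (an assumption of this example).
function wordsByTag(string $html, array $tags, array $words): array
{
    $doc = new DOMDocument();
    // Suppress the warnings libxml raises on imperfect markup.
    @$doc->loadHTML($html);

    $result = [];
    foreach ($tags as $tag) {
        $result[$tag] = [];
        foreach ($doc->getElementsByTagName($tag) as $node) {
            // Split the tag's text on whitespace into individual words.
            $wordsFromText = preg_split('/\s+/', trim($node->textContent), -1, PREG_SPLIT_NO_EMPTY);
            foreach ($wordsFromText as $word) {
                // Record each known word once per tag.
                if (in_array($word, $words, true) && !in_array($word, $result[$tag], true)) {
                    $result[$tag][] = $word;
                }
            }
        }
    }
    return $result;
}

$html = '<html><head><title>hello there</title></head>'
      . '<body><h1>this is a</h1><p>test array</p></body></html>';
$arrayWords = ['hello', 'there', 'this', 'is', 'a', 'test', 'array'];

$found = wordsByTag($html, ['title', 'h1', 'p'], $arrayWords);
print_r($found);
```

With phpQuery the inner lookup would be $pq->find($tag)->text() instead of getElementsByTagName; the rest of the loop stays the same.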

Related

Php Curl parsing a m3u file [duplicate]

This question already has an answer here:
How to retrieve variable="value" pairs from m3u string
(1 answer)
Closed 3 years ago.
Hope you guys can help me out. I have the following .m3u file
#EXTM3U
#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
#EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
#EXTINF:-1 tvg-id="" tvg-name="Animal Planet" tvg-logo="" group-title="ENTRETENIMIENTO",Animal Planet
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/185.ts
As you can see, the file starts with the main tag #EXTM3U. Below it comes a video information tag (#EXTINF:-1 ...), and below that the video link entry (http:// .....).
Can you tell me explicitly how I can parse this whole file (it's a pretty large one) and save the fields in an array, for example videos[ ],
so that later I can access every video's attributes? Say, videos[0]['title'] to get the title of the first video, and likewise videos[42]['link'] to get the link to video #42.
I am already using curl to get the file content into a variable like this:
<?php
$handler = curl_init("link to m3u file");
// Return the body as a string instead of printing it directly.
curl_setopt($handler, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($handler);
curl_close($handler);
echo $response;
?>
What I need now is to parse the curl response and save all the video information into an array where I can access every attribute of every video.
I know I must use a regexp or something like that, I just don't understand how. Can you please help me with some code? Thank you so much.
Behold the magic of regex:
$string = <<<CUT
#EXTM3U
#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
#EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
CUT;
preg_match_all('/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")|(?<something>,[^\r\n]+)|(?<url>http[^\s]+)/', $string, $match );
$count = count($match[0]);
$result = [];
$index = -1;
for ($i = 0; $i < $count; $i++) {
    $item = $match[0][$i];
    if (!empty($match['tag'][$i])) {
        // is a tag - increment the result index
        ++$index;
    } elseif (!empty($match['prop_key'][$i])) {
        // is a prop - store the value under its key
        $result[$index][$match['prop_key'][$i]] = $match['prop_val'][$i];
    } elseif (!empty($match['something'][$i])) {
        // is the text after the comma
        $result[$index]['something'] = $item;
    } elseif (!empty($match['url'][$i])) {
        // is a url
        $result[$index]['url'] = $item;
    }
}
print_r($result);
Returns
array (
0 =>
array (
'tvg-name' => 'A&E',
'group-title' => 'ENTRETENIMIENTO',
'something' => ',A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts',
'url' => 'http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts',
),
1 =>
array (
'tvg-name' => 'ABC Puerto Rico',
'group-title' => 'NACIONALES',
'something' => ',ABC Puerto Rico',
'url' => 'http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts',
),
)
Seriously though, I have no clue what some of this is; "something", for example. Anyway, it should get you started.
The regex is actually pretty simple once it's broken down. The real trick is using preg_match_all instead of preg_match.
Here is our regex:
/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")|(?<something>,[^\r\n]+)|(?<url>http[^\s]+)/
First we break it down into more manageable bits. These are separated by the pipe |, meaning "or". Each one can be thought of as a separate pattern: match this one or the next one. The order can be important, because the alternatives are tried left to right, and as soon as one matches the engine stops trying the rest. So you have to be careful not to have a regex that can match in two places (if you don't want that). However, it can be used to your advantage too, as I will show below. This is really what we are dealing with:
(?P<tag>#EXTINF:-1)
(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")
(?<something>,[^\r\n]+)
(?<url>http[^\s]+)
Four regular expressions. In all of these, (?P<name>...) is a named capture group; it just makes things more readable and the bits easier to find. If you look at the conditions I use to find the matches, for example !empty($match['tag'][$i]), we can use the tag key because of the named capture group; otherwise it would be 1. With several patterns combined, having 1, 2, 3 can get messy, especially as the result is nested, so it would be $match[1][$i] for tag, etc. Anyway, once the names are taken out we have:
#EXTINF:-1 matches this string literally.
(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)") is more complicated. (?: ... ) is a non-capturing group; it is there so the key and value wind up at the same index in the match array without being captured together. Broken down, this is ([-a-z]+)=\"([^"]+)\", or: match a word followed by =, then ", then anything but a ", ending with ". Basically one side captures the key and the other the value, excluding the double quotes. Because of the +, an empty value such as tvg-id="" does not match at all.
,[^\r\n]+ starts with a comma, then anything up to the end of the line.
And last, http[^\s]+ matches a URL.
Now remember what I said about order being important: the URL http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts would match the last expression, except that it is part of ,A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts, which matches the third one, so it never gets to number 4.
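A small sketch of that ordering effect, using only the last two alternatives. The sample line and the example.test host are made up for illustration:

```php
<?php
// A line where the URL directly follows the comma-prefixed title.
$line = ',A&E`http://example.test/76.ts';

preg_match_all('/(?<something>,[^\r\n]+)|(?<url>http[^\s]+)/', $line, $m);

// The "something" alternative matches at the comma and consumes the rest of
// the line, so the URL is never matched on its own.
$matchCount = count($m[0]);
$somethingMatch = $m['something'][0];
$urlWasMatched = !empty($m['url'][0]);

var_dump($matchCount, $somethingMatch, $urlWasMatched);
```

Here there is a single overall match, it belongs to the something group, and the url group stays empty.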
Hope that helps. Granted, you'll need a basic grasp of regex; this is not really the place for a full tutorial on that, and you can find better examples than I can provide in a few short minutes.
Just for the sake of completeness, here is part of what preg_match_all returns
(
[0] => Array(
[0] => #EXTINF:-1
[1] => tvg-name="A&E"
[2] => group-title="ENTRETENIMIENTO"
[3] => ,A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
[4] => http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
[5] => #EXTINF:-1
[6] => tvg-name="ABC Puerto Rico"
[7] => group-title="NACIONALES"
[8] => ,ABC Puerto Rico
[9] => http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
)
[tag] => Array(
[0] => #EXTINF:-1
[1] =>
[2] =>
[3] =>
[4] =>
[5] => #EXTINF:-1
[6] =>
[7] =>
[8] =>
[9] =>
)
[1] => Array(
[0] => #EXTINF:-1
[1] =>
[2] =>
[3] =>
[4] =>
[5] => #EXTINF:-1
[6] =>
[7] =>
[8] =>
[9] =>
)
[prop_key] => Array(
[0] =>
[1] => tvg-name
[2] => group-title
[3] =>
[4] =>
[5] =>
[6] => tvg-name
[7] => group-title
[8] =>
[9] =>
)
[2] => Array( ... duplicate of prop_key .. )
etc.
)
To find an item in the above array, look at the for loop. $match[0] contains all the matches, while the tag array only has a value at the positions where that sub-pattern matched, so we can correlate them using the $i index.
if( !empty($match['tag'][$i])){
//is a tag increment the result index
++$index;
}
We check whether $match['tag'][$i] is not empty. If you look at $match['tag'][0], when $i = 0, you will see that indeed it is not empty. On the second pass $match['tag'][1] is empty but $match['prop_key'][1] is not, so we know that when $i = 1 the item is a prop_key match. That's how that works.
P.S. If you can find a way to remove the duplicated numeric indexes, please share it with me ... lol ... these are the normal matches I would get if I didn't use named capture groups; as I said, it can get messy.
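One possible way to drop the numeric duplicates after the fact is to filter the result by key type. This is a sketch, not something the regex engine does for you; the shortened pattern and sample string are only for illustration:

```php
<?php
preg_match_all(
    '/(?P<key>[-a-z]+)="(?P<val>[^"]+)"/',
    'tvg-name="A&E" group-title="NEWS"',
    $match
);

// Keep only the entries whose key is a string, i.e. the named groups.
// Key 0 (the full matches) and the numeric copies 1 and 2 are removed.
$named = array_filter($match, 'is_string', ARRAY_FILTER_USE_KEY);

print_r(array_keys($named));
```

Note that this also removes $match[0], so keep a copy of it first if the loop still needs the full matches.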
I wrote a simple working m3u8 parser in PHP.
It parses a remote m3u8 file to JSON, but it is easy to change the output.
https://github.com/onigetoc/m3u8-PHP-Parser
I may soon change it to use, or add, a cURL fetcher instead of file_get_contents().
Usage: m3u-parser.php?url=https://raw.githubusercontent.com/onigetoc/m3u8-PHP-Parser/master/ressources/demofile.m3u
Once you have the cURL response, read the file from the remote location via cURL or fopen.
For that, you have to read the files in the directory at the remote location and save them all to the local server.
You can use the file function stat to get all the information and keep it in $files.
That is the idea for collecting all the information; from it you can create the array.
Once the array is created, you can serialize it for printing.
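For the array shape asked for in the question (videos[0]['title'], videos[42]['link']), a line-by-line approach is another option. The following is a minimal sketch, assuming every #EXTINF line is followed by its URL and that the title is whatever comes after the last comma; the function name parseM3u and the keys title and link are chosen for this example:

```php
<?php
// Sketch: parse M3U text line by line into a list of videos.
function parseM3u(string $content): array
{
    $videos = [];
    $current = null;
    foreach (preg_split('/\r\n|\r|\n/', $content) as $line) {
        $line = trim($line);
        if (strpos($line, '#EXTINF:') === 0) {
            $current = [];
            // Collect every key="value" attribute, including empty values.
            preg_match_all('/([\w-]+)="([^"]*)"/', $line, $attrs, PREG_SET_ORDER);
            foreach ($attrs as $attr) {
                $current[$attr[1]] = $attr[2];
            }
            // Assumption: the display title follows the last comma.
            $comma = strrpos($line, ',');
            $current['title'] = $comma === false ? '' : trim(substr($line, $comma + 1));
        } elseif ($line !== '' && $line[0] !== '#' && $current !== null) {
            // The first non-comment line after #EXTINF is the stream URL.
            $current['link'] = $line;
            $videos[] = $current;
            $current = null;
        }
    }
    return $videos;
}

$m3u = <<<M3U
#EXTM3U
#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
#EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
M3U;

$videos = parseM3u($m3u);
print_r($videos);
```

In practice $m3u would be the cURL response from the question. A title that itself contains a comma would be cut short by this sketch.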

Insert String from Array after X amount of characters (Outside HTML) in PHP

I've looked and can't find a solution to this feature we would like to write. I'm fairly new to PHP so any help, advice and code examples are always greatly appreciated.
Let me explain what we want to do...
We have a block of HTML inside a string - the content could be up to 2000 words with styling such as <p>, <ul>, <h2> included in this HTML content string.
We also have an array of images related to this content inside a separate string.
We need to add the images from the array string into the HTML content at equal spaces without breaking the HTML code. So a simple character count won't work as it could break the HTML tags.
We need to equally space the images. So, for example; if we had 2000 words inside the HTML content string and 10 images in the array, we need to place an image every 200 words.
Any help or coding samples provided in order to achieve this is greatly appreciated - thank you for your help in advance.
You can use
$numword = str_word_count($str, 0);
to get the number of words,
or
$array = str_word_count($str,1);
to get in $array an array of all the words (one per index). You can then iterate over this array to rebuild the text, adding the code for your image after every given number of words. Note that str_word_count knows nothing about HTML, so run it on the text rather than on the markup.
This sample is from the PHP manual:
<?php
$str = "Hello fri3nd, you're
looking good today!";
print_r(str_word_count($str, 1));
print_r(str_word_count($str, 2));
print_r(str_word_count($str, 1, 'àáãç3'));
echo str_word_count($str);
?>
This is the result:
Array
(
[0] => Hello
[1] => fri
[2] => nd
[3] => you're
[4] => looking
[5] => good
[6] => today
)
Array
(
[0] => Hello
[6] => fri
[10] => nd
[14] => you're
[29] => looking
[46] => good
[51] => today
)
Array
(
[0] => Hello
[1] => fri3nd
[2] => you're
[3] => looking
[4] => good
[5] => today
)
7
You can find it in the PHP manual page for str_word_count.
For the insert you can try it this way:
$num = 200; // number of words after which to insert the image
$text = $array[0]; // initialize the text with the first word in the array
for ($cnt = 1; $cnt < count($array); $cnt++) {
    $text .= ' ' . $array[$cnt]; // add the next word, separated by a space
    if (($cnt % $num) == 0) { // every 200th index, insert the image
        $text .= "<img src='your_img_path' >";
    }
}
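Because the content in the question is HTML, the word count has to skip the tags. One rough way to do that is to split the string into tags and text, count words only in the text pieces, and drop an image in after the next closing block tag once enough words have gone by. This is a sketch under those assumptions; insertImages is a hypothetical helper and the list of block tags is only a guess at what the content contains:

```php
<?php
// Sketch: insert images at roughly equal word intervals without splitting tags.
function insertImages(string $html, array $images): string
{
    if ($images === []) {
        return $html;
    }
    // Split into tags and text, keeping the tags as separate pieces.
    $parts = preg_split('/(<[^>]+>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    // Count the words in the text pieces only.
    $total = 0;
    foreach ($parts as $part) {
        if ($part[0] !== '<') {
            $total += str_word_count($part);
        }
    }
    $interval = max(1, intdiv($total, count($images)));

    $out = '';
    $seen = 0;         // words counted so far
    $next = $interval; // word count at which the next image is due
    foreach ($parts as $part) {
        $out .= $part;
        if ($part[0] !== '<') {
            $seen += str_word_count($part);
        } elseif ($images !== [] && $seen >= $next
            && preg_match('#^</(p|ul|ol|h[1-6]|div)>$#i', $part)) {
            // An image is due and a block element has just closed.
            $out .= '<img src="' . htmlspecialchars(array_shift($images)) . '">';
            $next += $interval;
        }
    }
    return $out;
}

$spaced = insertImages('<p>one two</p><p>three four</p>', ['a.jpg', 'b.jpg']);
echo $spaced, "\n";
```

The spacing is only approximate, since an image waits for the end of the current block. For anything beyond simple content, a DOM parser is the safer tool.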

PHP - Remove items from an array with given parameter

I've searched around and I found some similar questions asked, but none that really help me (as my PHP abilities aren't quite enough to figure it out). I'm thinking that my question will be simple enough to answer, as the similar questions I found were solved with one or two lines of code. So, here goes!
I have a bit of code that searches the contents of a given directory, and provides the files in an array. This specific directory only has .JPG image files named like this:
Shot01.jpg
Shot01_tn.jpg
and so on. My array gives me the file names in a way where I can use the results directly in an <img> tag to be displayed on a site I'm building. However, I'm having a little trouble, as I want my array not to return items that contain "_tn", so that I can use the thumbnail that links to the full-size image. I had thought about just not having thumbnails and resizing the images to make the PHP easier for me, but that feels like giving up. So, does anyone know how I can do this? Here's the code that I have currently:
$path = 'featured/';
$newest = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($path, RecursiveDirectoryIterator::SKIP_DOTS));
$array = iterator_to_array($newest);
foreach($array as $fileObject):
$filelist = str_replace("_tn", "", $fileObject->getPathname());
echo $filelist . "<br>";
endforeach;
I attempted to use a str_replace(), but I now realize that I was completely wrong. This returns my array like this:
Array
(
[0] => featured/Shot01.jpg
[1] => featured/Shot01.jpg
[2] => featured/Shot02.jpg
[3] => featured/Shot02.jpg
[4] => featured/Shot03.jpg
[5] => featured/Shot03.jpg
)
I only have 3 images (with thumbnails) currently, but I will have more, so I'm also going to want to limit the results from the array to be a random 3 results. But, if that's too much to ask, I can figure that part out on my own I believe.
So there's no confusion, I want to completely remove the items from the array if they contain "_tn", so my array would look something like this:
Array
(
[0] => featured/Shot01.jpg
[2] => featured/Shot02.jpg
[4] => featured/Shot03.jpg
)
Thanks to anyone who can help!
<?php
function filtertn($var)
{
    // Keep the entry only if "_tn" does not occur anywhere in it.
    return strpos($var, '_tn') === false;
}
$array = array(
    0 => 'featured/Shot01.jpg',
    1 => 'featured/Shot01_tn.jpg',
    2 => 'featured/Shot02.jpg',
    3 => 'featured/Shot02_tn.jpg',
    4 => 'featured/Shot03.jpg',
    5 => 'featured/Shot03_tn.jpg',
);
$filesarray = array_filter($array, "filtertn");
print_r($filesarray);
?>
Just use the stripos() function to check whether the filename contains the string _tn. If not, add it to the array.
Use this:
<?php
$array = array(
    0 => 'featured/Shot01.jpg',
    1 => 'featured/Shot01_tn.jpg',
    2 => 'featured/Shot02.jpg',
    3 => 'featured/Shot02_tn.jpg',
    4 => 'featured/Shot03.jpg',
    5 => 'featured/Shot03_tn.jpg',
);
foreach ($array as $k => $filename):
    // strpos() returns false when "_tn" is absent, so compare strictly.
    if (strpos($filename, "_tn") !== false) {
        unset($array[$k]);
    }
endforeach;
print_r($array);
// The output is the array with all the _tn files removed:
// Array
// (
//     [0] => featured/Shot01.jpg
//     [2] => featured/Shot02.jpg
//     [4] => featured/Shot03.jpg
// )
?>
I can't understand what the problem is. Is it required to add "_tn" entries to the array? Just check for "_tn" and don't add such elements to the result array.
Try strpos() to find out whether the filename contains the string "_tn". If not, add the filename to the array:
$path = 'featured/';
$newest = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($path, RecursiveDirectoryIterator::SKIP_DOTS));
$array = iterator_to_array($newest);
$filesarray = array();
foreach ($array as $fileObject):
    // Check whether the path contains the "_tn" substring
    if (strpos($fileObject->getPathname(), "_tn") === false) {
        // Check whether the value already exists in the array
        if (!in_array($fileObject->getPathname(), $filesarray)) {
            $filesarray[] = $fileObject->getPathname();
        }
    }
endforeach;
print_r($filesarray);
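The question also mentions limiting the result to three random images. A minimal sketch of that, using a hard-coded list in place of the directory iterator (the fourth image is added here only so that the random pick has something to leave out):

```php
<?php
$files = [
    'featured/Shot01.jpg', 'featured/Shot01_tn.jpg',
    'featured/Shot02.jpg', 'featured/Shot02_tn.jpg',
    'featured/Shot03.jpg', 'featured/Shot03_tn.jpg',
    'featured/Shot04.jpg', 'featured/Shot04_tn.jpg',
];

// Drop the thumbnails and reindex the remaining full-size images.
$fullSize = array_values(array_filter($files, function (string $name): bool {
    return strpos($name, '_tn') === false;
}));

// Shuffle in place, then take the first three entries.
shuffle($fullSize);
$picked = array_slice($fullSize, 0, 3);

print_r($picked);
```

array_slice simply returns fewer entries when fewer than three images are available.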

How to replace html tag with other tags using preg-match?

I have a string like the following.
<label>value1<label>:value<br>
<label>value2<label>:value<br>
<label>value3<label>:value<br>
and I need to rearrange it as follows:
<li><label>value1<label><span>value</span><li>
I have been trying for the last 2 days, but no luck. Any help?
This really isn't something you should do with regex. You might be able to fudge together a solution that works provided it makes a lot of assumptions about the content it's parsing, but it will always be fragile and liable to break should that content deviate from the expected by any significant degree.
A better bet is using PHP's DOM family of classes. I'm not really at liberty to write the code for you (and that's not what SO is for anyway), but I can give you a pointer regarding the steps you need to follow.
Locate the text nodes that follow a label and precede a BR (XPath may be useful here).
Put the text node into a span.
Insert the span into the DOM after the label.
Remove the BR.
Wrap the label and span in an li.
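For readers who want to see those steps in code, here is a rough sketch with DOMDocument and DOMXPath. It assumes the labels are properly closed (</label>), unlike the sample in the question, and it wraps the fragment in a div with the made-up id "wrap" so there is a known parent to query:

```php
<?php
// Assumption: well-formed input with closed label tags.
$html = "<label>value1</label>:value<br>\n<label>value2</label>:value<br>";

$doc = new DOMDocument();
@$doc->loadHTML('<div id="wrap">' . $html . '</div>');
$xpath = new DOMXPath($doc);

// Snapshot the labels first, because the tree is modified inside the loop.
$labels = iterator_to_array($xpath->query('//div[@id="wrap"]/label'));
$items = [];
foreach ($labels as $label) {
    $text = $label->nextSibling;      // the text node, e.g. ":value"
    if (!$text instanceof DOMText) {
        continue;
    }
    $br = $text->nextSibling;         // the BR that follows the text

    // Put the text (without the leading colon) into a span.
    $span = $doc->createElement('span');
    $span->appendChild($doc->createTextNode(ltrim($text->nodeValue, ':')));

    // Wrap label and span in an li that takes the label's old position.
    $li = $doc->createElement('li');
    $label->parentNode->replaceChild($li, $label);
    $li->appendChild($label);
    $li->appendChild($span);

    // Remove the original text node and the BR.
    $text->parentNode->removeChild($text);
    if ($br instanceof DOMElement && $br->nodeName === 'br') {
        $br->parentNode->removeChild($br);
    }
    $items[] = $doc->saveHTML($li);
}
echo implode("\n", $items), "\n";
```

Each entry in $items is the serialized li for one label/value pair; the exact whitespace in the output can vary with the libxml version.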
If you must use regex for this, proceed as follows:
$string = <<<TOK
<label>value1<label>:value<br>
<label>value2<label>:value<br>
<label>value3<label>:value<br>
TOK;
preg_match_all('/<label>(.*?)<label>\:(.*?)<br>/s', $string, $matches);
print_r($matches);
/*
Array
(
    [0] => Array
        (
            [0] => <label>value1<label>:value<br>
            [1] => <label>value2<label>:value<br>
            [2] => <label>value3<label>:value<br>
        )
    [1] => Array
        (
            [0] => value1
            [1] => value2
            [2] => value3
        )
    [2] => Array
        (
            [0] => value
            [1] => value
            [2] => value
        )
)
*/
$content = "";
// Iterate over the captured labels; $key indexes the matching value in $matches[2].
foreach ($matches[1] as $key => $label)
{
    $content .= "<li><label>{$label}<label><span>{$matches[2][$key]}</span><li>\n";
}
echo($content);
/*
Output:
<li><label>value1<label><span>value</span><li>
<li><label>value2<label><span>value</span><li>
<li><label>value3<label><span>value</span><li>
*/

regex extract/replace values from xml-like tags via named (sub)groups

I'm trying to create a simple text translator in PHP.
It should match something like:
Bla bla {translator id="TEST" language="de"/}
The language can be optional
Blabla <translator id="TEST"/>
Here is the code:
$result = preg_replace_callback(
'#{translator(\s+(?\'attribute\'\w+)="(?\'value\'\w+)")+/}#i',
array($this, 'translateTextCallback'),
$aText
);
It extracts the "attributes", but fetches only the last one. My first thought was, it has to do with the group naming, when PHP overwrites the (named) array elements on every match. But leaving out the group naming it also only returns the last match.
Here is an array as returned to the callback as example
Array
(
[0] => {translator id="TEST" language="de"/}
[1] => language="de"
[attribute] => language
[2] => language
[value] => de
[3] => de
)
When a group is repeated, you only get its last match. There is no way around this. You need to match the whole set of attribute/value pairs and then parse them in code.
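A sketch of that two-step approach: the outer pattern captures the whole attribute block, and the callback parses the individual pairs with a second preg_match_all. The translate() function, its output format, and the default language 'en' are placeholders invented for this example:

```php
<?php
// Hypothetical stand-in for the real translation lookup.
function translate(string $id, string $language): string
{
    return "[$language:$id]";
}

$text = 'Bla bla {translator id="TEST" language="de"/} and {translator id="OTHER"/}';

$result = preg_replace_callback(
    // Group 1 captures all attributes at once instead of one repeated group.
    '#\{translator((?:\s+\w+="\w+")+)\s*/\}#i',
    function (array $m): string {
        // Second pass: pull every attribute/value pair out of the block.
        preg_match_all('/(\w+)="(\w+)"/', $m[1], $pairs, PREG_SET_ORDER);
        $attrs = [];
        foreach ($pairs as $pair) {
            $attrs[strtolower($pair[1])] = $pair[2];
        }
        // Assumption: language falls back to 'en' when it is omitted.
        return translate($attrs['id'] ?? '', $attrs['language'] ?? 'en');
    },
    $text
);

echo $result, "\n";
```

In the class from the question, the closure would be replaced by array($this, 'translateTextCallback') with the same body.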
