Php Curl parsing a m3u file [duplicate] - php

Hope you guys can help me out. I have the following .m3u file
#EXTM3U
#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
#EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
#EXTINF:-1 tvg-id="" tvg-name="Animal Planet" tvg-logo="" group-title="ENTRETENIMIENTO",Animal Planet
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/185.ts
As you can see, the file starts with the main tag
#EXTM3U, followed for each video by an information tag (#EXTINF:-1 ...) and, on the next line, the video link entry (http:// .....)
Can you tell me explicitly how I can parse this whole file (it's a pretty large one) and save the fields in an array, for example videos[ ],
so that later I can access each video's attributes, e.g. videos[0]['title'] to get the title of the first video, or videos[42]['link'] to get the link to video #42?
I am already using curl to get the file content into a variable like this
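<?php
$handler = curl_init("link to m3u file");
// Return the body as a string instead of printing it directly
curl_setopt($handler, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($handler);
curl_close($handler);
echo $response;
?>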
What I need now is to parse the cURL response and save all the video information into an array, where I can access every attribute of every video.
I know I must use a regexp or something like that; I just don't understand how. Can you please help me with some code? Thank you so much.

Behold the magic of regex
$string = <<<CUT
#EXTM3U
#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
#EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
CUT;
preg_match_all('/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")|(?<something>,[^\r\n]+)|(?<url>http[^\s]+)/', $string, $match );
$count = count( $match[0] );
$result = [];
$index = -1;
for ($i = 0; $i < $count; $i++) {
    $item = $match[0][$i];
    if (!empty($match['tag'][$i])) {
        // is a tag: increment the result index
        ++$index;
    } elseif (!empty($match['prop_key'][$i])) {
        // is a prop: store the value under its key
        $result[$index][$match['prop_key'][$i]] = $match['prop_val'][$i];
    } elseif (!empty($match['something'][$i])) {
        // is the comma-prefixed text that follows the props
        $result[$index]['something'] = $item;
    } elseif (!empty($match['url'][$i])) {
        // is a url
        $result[$index]['url'] = $item;
    }
}
print_r( $result );
Returns
array (
0 =>
array (
'tvg-name' => 'A&E',
'group-title' => 'ENTRETENIMIENTO',
'something' => ',A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts',
'url' => 'http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts',
),
1 =>
array (
'tvg-name' => 'ABC Puerto Rico',
'group-title' => 'NACIONALES',
'something' => ',ABC Puerto Rico',
'url' => 'http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts',
),
)
Seriously though, I have no clue what some of this is ('something', for example). Anyway, this should get you started.
As for the regex, it's actually pretty simple once it's broken down. The real trick is using preg_match_all instead of preg_match.
Here is our regex
/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")|(?<something>,[^\r\n]+)|(?<url>http[^\s]+)/
First we will break it down into more manageable bits. These are separated by the pipe |, meaning "or". Each one can be thought of as a separate pattern: match this one or the next one. The order can be important, because the alternatives are tried left to right, so if one on the left matches, the engine stops there. So you have to be careful not to have a regex that can match in two places (if you don't want that). However, it can be used to your advantage too, as I will show below. This is really what we are dealing with
(?P<tag>#EXTINF:-1)
(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")
(?<something>,[^\r\n]+)
(?<url>http[^\s]+)
Four regular expressions. In all of these, (?P<name>...) is a named capture group (the (?<name>...) form without the P is equivalent); it just makes things more readable and the bits easier to find. If you look at the conditions I use to find the matches, for example !empty($match['tag'][$i]), we can use the tag index/key because of the named capture group; otherwise it would be 1. With a number of regexes all together, having 1 2 3 can get messy, especially since the array is nested, so it would be $match[1][$i] for tag, etc. Anyway, once that is taken out we have
#EXTINF:-1 match this string literally
(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)") this is more complicated. (?: .. ) is a non-capture group; it is there so that the key and value end up at the same index in the match array without being captured together. Broken down, this is ([-a-z]+)=\"([^"]+)\": match a word followed by =, then ", then anything but a ", ending with ". Basically one side captures the key and the other the value, excluding the double quotes. Note that [^"]+ needs at least one character, so empty attributes such as tvg-id="" are skipped.
,[^\r\n]+ starts with a comma then anything but a line return
and last http[^\s]+ a URL
Now remember what I said about order being important: the URL http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts would match the last expression, but in the first entry of the sample string it appears as ,A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts, which starts with a comma and so matches the 3rd one; it never gets to number 4.
Hope that helps. Granted, you'll need a basic grasp of regex; this is not really the place for a full tutorial on that, and you can find better examples than I can provide in a few short minutes.
Just for the sake of completeness, here is part of what preg_match_all returns
(
[0] => Array(
[0] => #EXTINF:-1
[1] => tvg-name="A&E"
[2] => group-title="ENTRETENIMIENTO"
[3] => ,A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
[4] => http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
[5] => #EXTINF:-1
[6] => tvg-name="ABC Puerto Rico"
[7] => group-title="NACIONALES"
[8] => ,ABC Puerto Rico
[9] => http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
)
[tag] => Array(
[0] => #EXTINF:-1
[1] =>
[2] =>
[3] =>
[4] =>
[5] => #EXTINF:-1
[6] =>
[7] =>
[8] =>
[9] =>
)
[1] => Array(
[0] => #EXTINF:-1
[1] =>
[2] =>
[3] =>
[4] =>
[5] => #EXTINF:-1
[6] =>
[7] =>
[8] =>
[9] =>
)
[prop_key] => Array(
[0] =>
[1] => tvg-name
[2] => group-title
[3] =>
[4] =>
[5] =>
[6] => tvg-name
[7] => group-title
[8] =>
[9] =>
)
[2] => Array( ... duplicate of prop_key .. )
etc.
)
The way to find the item in the above array: look at the for loop when it runs the first time (index 0). The main part of the match, $match[0], contains all the matches, while the tag array is only filled in at the positions that matched that part of the regex, so we can correlate them using the $i index.
if( !empty($match['tag'][$i])){
//is a tag increment the result index
++$index;
}
That is, if $match['tag'][$i] is not empty. If you look at $match['tag'][0] when $i = 0, you will see that it is indeed not empty. On the second loop $match['tag'][1] is empty but $match['prop_key'][1] is not, so we know that when $i = 1 the item is a prop_key match. That's how that works.
-ps- if you can find a way to remove the duplicated numeric indexes, please share it with me ... lol ... these are the normal matches you would get without named capture groups; as I said, it can get messy.
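On the duplicated numeric indexes mentioned in the postscript: one possible way to drop them (a sketch, not part of the original answer) is to filter the match array by key type. array_filter with ARRAY_FILTER_USE_KEY and is_string are standard PHP; the shortened pattern and the sample input below are assumptions made only to keep the example small.

```php
<?php
// Shortened pattern and sample input, for illustration only.
$string = '#EXTINF:-1 tvg-name="A&E" group-title="ENTRETENIMIENTO"';
preg_match_all(
    '/(?P<tag>#EXTINF:-1)|(?P<prop_key>[-a-z]+)="(?P<prop_val>[^"]+)"/',
    $string,
    $match
);

// Keep only the entries whose key is a string, i.e. the named groups.
$named = array_filter($match, 'is_string', ARRAY_FILTER_USE_KEY);

print_r(array_keys($named));
```

Note that this also drops index 0 (the full matches), so anything that relies on $match[0] should read it before filtering.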

I wrote a simple working m3u8 parser in PHP.
It parses a remote m3u8 file to JSON, but it is easy to change the output.
https://github.com/onigetoc/m3u8-PHP-Parser
I may soon change it or add a cURL parser instead of file_get_contents().
m3u-parser.php?url=https://raw.githubusercontent.com/onigetoc/m3u8-PHP-Parser/master/ressources/demofile.m3u
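For comparison, here is a minimal line-by-line sketch that builds the videos[n]['title'] / videos[n]['link'] structure the question asks for. It is not the code of the linked project. The function name parseM3u is hypothetical, and it assumes that every #EXTINF line is followed by its URL line and that the title is whatever follows the last comma.

```php
<?php
/**
 * Parse M3U text into a list of entries (hypothetical helper).
 * Each entry gets 'title', 'link', and any key="value" attributes found.
 */
function parseM3u(string $content): array
{
    $videos = [];
    $current = null;

    foreach (preg_split('/\R/', $content) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        if (strpos($line, '#EXTINF:') === 0) {
            // Attributes look like key="value"; empty values are kept.
            preg_match_all('/([-\w]+)="([^"]*)"/', $line, $attrs, PREG_SET_ORDER);
            $current = [];
            foreach ($attrs as $attr) {
                $current[$attr[1]] = $attr[2];
            }
            // Assumption: the display title is whatever follows the last comma.
            $comma = strrpos($line, ',');
            $current['title'] = $comma === false ? '' : trim(substr($line, $comma + 1));
        } elseif ($line[0] !== '#' && $current !== null) {
            // The first non-comment line after #EXTINF is taken as the stream URL.
            $current['link'] = $line;
            $videos[] = $current;
            $current = null;
        }
    }
    return $videos;
}

$m3u = <<<M3U
#EXTM3U
#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
#EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
M3U;

$videos = parseM3u($m3u);
echo $videos[0]['title'], ' => ', $videos[1]['link'], PHP_EOL;
```

The string passed to parseM3u could just as well be a cURL response or the result of file_get_contents(). A title that itself contains a comma would be cut short by this sketch.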

Once you get the cURL response, read the file from the remote location via cURL or the fopen function.
For that, you have to read the files in the directory at the remote location and save them all to the local server.
You can use the file function stat() to get all the information and keep it in $files.
This gives the idea of how to collect all the information; from it you can create the array.
Once the array is created, you can serialize it for printing.

Related

How to split array elements with regex patterns in run-together text?

I'm converting text from a txt file into an array. I need to split the strings in this array using regex.
This is the array from my text file.
Array
(
[0] => 65S34523APPLE IS VERY BEAUTIFUL6.000TX786.34563.675 234.89
[1] => 06W01232BOOK IS SUCCESSFUL1.000YJ160.00021.853 496.00
[2] => 67E45643DO YOU HAVE A PEN? 7/56.450EQ9000.3451.432 765.12
)
To explain with one line as an example:
input => 65S34523APPLE IS VERY BEAUTIFUL6.000TX786.34563.675 234.89
required sections => 65S34523 APPLE IS VERY BEAUTIFUL 6.000 TX 786.345 63.67 5 234.89
The target I want:
Array
(
[0] => 65S34523
[1] => APPLE IS VERY BEAUTIFUL
[2] => TX
[3] => 786.345
)
I need multiple regex patterns to achieve this. I need to extract the data I want, in order, in a loop, but since there is no specific layout, I don't know which regex patterns to choose.
I've tried various code to split this array.
$smash =
array('65S34523APPLE IS VERY BEAUTIFUL6.000TX786.34563.675 234.89',
'06W01232BOOK IS SUCCESSFUL1.000YJ160.00021.853 496.00',
'67E45643DO YOU HAVE A PEN? 7/56.450EQ9000.3451.432 765.12');
I'm trying to foreach over the array and parse it. For example, I first tried to get the text.
foreach ($smash as $row) {
$delete_numbers = preg_replace('/\d/', '', $smash);
}
echo "<pre>";
print_r($delete_numbers);
echo "</pre>";
But it turned out this way.
Array
(
[0] => SAPPLE IS VERY BEAUTIFUL.TX.. .
[1] => WBOOK IS SUCCESSFUL.YJ.. .
[2] => EDO YOU HAVE A PEN? /.EQ.. .
)
Naturally, this is not what I want. Each array element has a different structure, so I have to check with if-else too.
As you can see in the example, there is no pure text. Here
TX, YJ, EQ should be deleted. The dots should be wiped out. The first letters at the beginning of the text should
be removed. The remaining special characters must be removed.
I have tried many of the above. I have looked at alternative examples.
As a result,
I'm at a dead end.
Code: (Demo)
$smash = ['65S34523APPLE IS VERY BEAUTIFUL6.000TX786.34563.675 234.89',
'06W01232BOOK IS SUCCESSFUL1.000YJ160.00021.853 496.00',
'67E45643DO YOU HAVE A PEN? 7/56.450EQ9000.3451.432 765.12'];
$result = [];
foreach ($smash as $line) {
    $result[] = preg_match('~(\w+\d)(\D+)[^A-Z]+([A-Z]{2})(\d+\.\d{3})~', $line, $out) ? array_slice($out, 1) : [];
}
var_export($result);
Output:
array (
0 =>
array (
0 => '65S34523',
1 => 'APPLE IS VERY BEAUTIFUL',
2 => 'TX',
3 => '786.345',
),
1 =>
array (
0 => '06W01232',
1 => 'BOOK IS SUCCESSFUL',
2 => 'YJ',
3 => '160.000',
),
2 =>
array (
0 => '67E45643',
1 => 'DO YOU HAVE A PEN? ',
2 => 'EQ',
3 => '9000.345',
),
)
My pattern assumes:
The first group will consist of numbers and letters and conclude with a digit.
The second group contains no digits.
The third group is consistently 2 uppercase letters.
The fourth group will reliably have three decimal places.
p.s. If you don't want that pesky trailing space after PEN?, you could use this:
https://3v4l.org/9XpA6
~(\w+\d)([^\d ]+(?: [^\d ]+)*) ?[^A-Z]+([A-Z]{2})(\d+\.\d{3})~
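If numeric positions are hard to keep track of, the same pattern can carry named groups. The sketch below reuses the first pattern of this answer unchanged; the group names code, text, unit and amount are made up for illustration, and trim() is used instead of the longer pattern to drop the trailing space.

```php
<?php
$smash = [
    '65S34523APPLE IS VERY BEAUTIFUL6.000TX786.34563.675 234.89',
    '06W01232BOOK IS SUCCESSFUL1.000YJ160.00021.853 496.00',
    '67E45643DO YOU HAVE A PEN? 7/56.450EQ9000.3451.432 765.12',
];

// Same pattern as above, with hypothetical group names added.
$pattern = '~(?<code>\w+\d)(?<text>\D+)[^A-Z]+(?<unit>[A-Z]{2})(?<amount>\d+\.\d{3})~';

$rows = [];
foreach ($smash as $line) {
    if (preg_match($pattern, $line, $out)) {
        // Pick the named groups explicitly; trim() removes the space after "PEN?".
        $rows[] = [
            'code'   => $out['code'],
            'text'   => trim($out['text']),
            'unit'   => $out['unit'],
            'amount' => $out['amount'],
        ];
    }
}
var_export($rows);
```

Lines that do not match are skipped here rather than stored as empty arrays, which may or may not be what is wanted.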

Insert String from Array after X amount of characters (Outside HTML) in PHP

I've looked and can't find a solution to this feature we would like to write. I'm fairly new to PHP so any help, advice and code examples are always greatly appreciated.
Let me explain what we want to do...
We have a block of HTML inside a string - the content could be up to 2000 words with styling such as <p>, <ul>, <h2> included in this HTML content string.
We also have an array of images related to this content inside a separate string.
We need to add the images from the array string into the HTML content at equal spaces without breaking the HTML code. So a simple character count won't work as it could break the HTML tags.
We need to equally space the images. So, for example; if we had 2000 words inside the HTML content string and 10 images in the array, we need to place an image every 200 words.
Any help or coding samples provided in order to achieve this is greatly appreciated - thank you for your help in advance.
You can use
$numword = str_word_count($str, 0);
to get the number of words,
or
$array = str_word_count($str, 1);
to get in $array an array with all the words (one per index). You can then iterate over this array to rebuild the text, adding the code for your image after every N words.
This sample is from the PHP manual
<?php
$str = "Hello fri3nd, you're
looking good today!";
print_r(str_word_count($str, 1));
print_r(str_word_count($str, 2));
print_r(str_word_count($str, 1, 'àáãç3'));
echo str_word_count($str);
?>
This is the related result
Array
(
[0] => Hello
[1] => fri
[2] => nd
[3] => you're
[4] => looking
[5] => good
[6] => today
)
Array
(
[0] => Hello
[6] => fri
[10] => nd
[14] => you're
[29] => looking
[46] => good
[51] => today
)
Array
(
[0] => Hello
[1] => fri3nd
[2] => you're
[3] => looking
[4] => good
[5] => today
)
7
You can find it in this doc
For the insert you can try it this way
$num = 200; // number of words after which to insert the image
$text = $array[0]; // initialize the text with the first word in the array
for ($cnt = 1; $cnt < count($array); $cnt++) {
    $text .= ' ' . $array[$cnt]; // add the next word, separated by a space
    if (($cnt % $num) == 0) { // if the index is a multiple of 200, insert the image
        $text .= "<img src='your_img_path' >";
    }
}
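Because str_word_count() on the raw string also splits tag names and drops punctuation, a variation that leaves the markup intact may be closer to what the question needs. The following is only a sketch under stated assumptions: insertImages is a hypothetical helper, the HTML is assumed to be reasonably well-formed with no ">" inside attribute values, and images are placed only right after a closing </p>, so the spacing is approximate and some images may be left unplaced if paragraphs are very long.

```php
<?php
/**
 * Insert images at roughly even word intervals without cutting through tags.
 * Hypothetical helper: $html is the content string, $images a list of image URLs.
 */
function insertImages(string $html, array $images): string
{
    if ($images === []) {
        return $html;
    }
    // Split into tags and text runs; the tags are kept as their own pieces.
    $pieces = preg_split('/(<[^>]+>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    // Count the words in the text pieces only.
    $total = 0;
    foreach ($pieces as $piece) {
        if ($piece[0] !== '<') {
            $total += str_word_count($piece);
        }
    }
    $every = max(1, intdiv($total, count($images)));

    $out = '';
    $seen = 0; // words emitted so far
    $next = 0; // index of the next image to place
    foreach ($pieces as $piece) {
        $out .= $piece;
        if ($piece[0] !== '<') {
            $seen += str_word_count($piece);
        } elseif (strcasecmp($piece, '</p>') === 0
            && $next < count($images)
            && $seen >= ($next + 1) * $every) {
            // Enough words have passed and a paragraph just closed: place one image.
            $out .= '<img src="' . htmlspecialchars($images[$next]) . '">';
            $next++;
        }
    }
    return $out;
}

echo insertImages('<p>one two</p><p>three four</p>', ['a.jpg', 'b.jpg']), PHP_EOL;
```

For heavier or less predictable markup, a DOM parser would likely be a safer basis than a regex split.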

PHP regex containing its delimiters as occurrences

I have this string:
{include="folder/file" vars="key:value"}
I have a regex to catch the file and the vars like this:
|\{include\=[\'\"](.*)\/(.*)[\'\"](.*)\}|U
First (.*) = folder
Second (.*) = file
Third (.*) = params (and I have some functions to parse it)
But there are some cases where I need to catch params that contain brackets {}. Like this:
{include="file" vars="key:{value}"}
The regex works, but it captures the result only up to the first closing bracket. Like this:
{include="file" vars="key:{value}
So part of the code is left out.
How can I allow those brackets as part of the result instead of as the closing delimiter?
Thanks!
You can use this regex:
\{include=['"](?:(.*)\/(.*?)|(\w+))['"] vars="(.*?)"\}
Working demo
MATCH 1
1. [10-16] `folder`
2. [17-21] `file`
4. [29-38] `key:value`
MATCH 2
3. [51-55] `file`
4. [63-74] `key:{value}`
Having in mind what @naomik said, I think I should change my regex.
What I want to do now is detect this structure:
{word="value" word="value" ... n times}
I have this regex: (\w+)=['"](.*?)['"]
it detects :
{include="folder/file"}
{include="folder/file" vars="key:value"}
{vars="key:{value}" include="folder/file"} (order changed)
It works fine, BUT I don't know how to add the opening and closing brackets to the regex. When I add them, it no longer works the way I want.
Live Demo
Another robust regexp that covers your first question:
preg_match_all('~\{include=["\']([^"\']+)["\'] vars=["\']([^"]+)["\']\}~', $str, $matches);
You'll get this kind of result in $matches:
Array
(
[0] => Array
(
[0] => {include="folder/file" vars="key:{value}"}
[1] => {include="folder/file" vars="key:value"}
[2] => {include="folder/file" vars="key:value"}
[3] => {include="file" vars="key:{value}"}
)
[1] => Array
(
[0] => folder/file
[1] => folder/file
[2] => folder/file
[3] => file
)
[2] => Array
(
[0] => key:{value}
[1] => key:value
[2] => key:value
[3] => key:{value}
)
)
You can access what matters this way: $matches[1][0] and $matches[2][0] for the first element, $matches[1][1] and $matches[2][1] for the second, etc.
It does not store folder and file as separate results. For this, you'll have to write an extra piece of code. There is no elegant way to write a regex that covers both include="folder/file" and include="file".
It does not support swapping the order of include and vars. If you want to support this, you'll have to split your input data into chunks (line by line, or the text between braces) before you try to match the content with something like this:
preg_match_all('~(\w+)=["\']([^"\']+)["\']~', $chunk, $matches);
Then $matches will contain something like this:
Array
(
[0] => Array
(
[0] => vars="key:{value}"
[1] => include="folder/file"
)
[1] => Array
(
[0] => vars
[1] => include
)
[2] => Array
(
[0] => key:{value}
[1] => folder/file
)
)
Then, since $matches[1][0] contains 'vars', you can get the vars value from $matches[2][0]. Likewise $matches[1][1] contains 'include', so you can get 'folder/file' from $matches[2][1].
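The two-step idea (first isolate each brace chunk, then read its pairs) could be sketched as follows. This is one possible arrangement rather than the answer's exact code: the sample string is made up, and values are matched as whole quoted strings so that braces inside them do not end the chunk. The (?|...) branch-reset group makes both quote styles land in the same capture group.

```php
<?php
// Made-up sample input: two chunks, the second with swapped order and braces in a value.
$str = '{include="folder/file" vars="key:value"} text {vars="key:{value}" include="file"}';

// Step 1: find each {...} chunk that consists only of name="value" pairs.
preg_match_all('~\{((?:\s*\w+=(?:"[^"]*"|\'[^\']*\'))+)\s*\}~', $str, $chunks);

// Step 2: read the pairs of every chunk, in whatever order they appear.
$result = [];
foreach ($chunks[1] as $chunk) {
    preg_match_all('~(\w+)=(?|"([^"]*)"|\'([^\']*)\')~', $chunk, $pairs, PREG_SET_ORDER);
    $entry = [];
    foreach ($pairs as $pair) {
        $entry[$pair[1]] = $pair[2];
    }
    $result[] = $entry;
}
print_r($result);
```

Splitting folder from file would still be a separate step, for example with explode('/', ...) on the include value.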

preg_match() behaving strange?

I want to match two regexes against a URL:
$reg1 = "/(^(((www\.))|(?!(www\.)))domain\.com\/paramsindex\/([a-z]+)\/([a-z]+)\/((([a-z0-9]+)(\-[a-z0-9]+){0,})(\/([a-z0-9]+)(\-[a-z0-9]+){0,}){0,})|()\/?$)/";
$reg2 = "/(^(((www\.))|(?!(www\.)))domain\.com\/paramsassoc\/([a-z]+)\/([a-z]+)\/((([a-z0-9]+)(\-[a-z0-9]+){0,})(\/([a-z0-9]+)(\-[a-z0-9]+){0,}){0,})|()\/?$)/";
$uri = "www.domain.com/paramsindex/cont/meth/par1/par2/par3/";
$r1 = preg_match($reg1, $uri);
echo "<p>First regex returned: {$r1}</p>";
$r2 = preg_match($reg2, $uri);
echo "<p>Second regex returned: {$r2}</p>";
Now the URLs these two regexes describe are not the same; the difference is this:
www.domain.com/paramsindex/cont/meth/par1/par2/par3/
vs.
www.domain.com/paramsassoc/cont/meth/par1/par2/par3/
And yet PHP preg_match returns 1 for both of them.
Now you will say this is a long regex and ask why I use it. The thing is, I could build a shorter regex, but it is built on the fly and... it just needs to be like that.
What bothers me is that in Rubular the regexes work as they should.
I was testing them in Rubular, and now in PHP they won't work. I know Rubular is a Ruby regex editor, but I thought it should be the same :(
Rubular testing: here
What is the problem here? How should I write that regex in PHP so preg_match can see the difference? The regex should stay as close as possible to the one I already wrote. Is there some simple fix to my problem? Something I'm overlooking?
That behavior is by design, preg_match returns 1 when a match is found. If you want to capture matches, see the matches parameter at: http://php.net/manual/en/function.preg-match.php
Edit: For example
$matches = array();
$r2 = preg_match($reg2, $uri, $matches);
echo "<p>Second regex returned: ";
print_r($matches);
echo "</p>";
I'll leave the above to document my own stupidity for not answering the right question.
At the end of your regex you have |()\/?$)/ which will make the regex match any URL that ends with a slash. Take it out and it looks like you're golden from my tests.
Always remember to group your operands!
I assume this one can be quite hard to spot, but it's all because of your use of the or-operator |. You are not grouping the operands correctly, and that is what produces the result described in your post.
Your use of |() in the provided case will match either the full regular expression to the left of the operator | or the ()\/?$ part to its right, which can match at the end of any string.
To solve this issue you will need to put parentheses around the operands that should be ORed.
An easy way to see where everything goes wrong is to run the snippet below:
$reg1 = "/(^(((www\.))|(?!(www\.)))domain\.com\/paramsindex\/([a-z]+)\/([a-z]+)\/((([a-z0-9]+)(\-[a-z0-9]+){0,})(\/([a-z0-9]+)(\-[a-z0-9]+){0,}){0,})|()\/?$)/";
$reg2 = "/(^(((www\.))|(?!(www\.)))domain\.com\/paramsassoc\/([a-z]+)\/([a-z]+)\/((([a-z0-9]+)(\-[a-z0-9]+){0,})(\/([a-z0-9]+)(\-[a-z0-9]+){0,}){0,})|()\/?$)/";
$uri = "www.domain.com/paramsindex/cont/meth/par1/par2/par3/";
var_dump (preg_match($reg1, $uri, $match1));
var_dump (preg_match($reg2, $uri, $match2));
print_r ($match1);
print_r ($match2);
output
int(1)
int(1)
Array
(
[0] => www.domain.com/paramsindex/cont/meth/par1/par2/par3
[1] => www.domain.com/paramsindex/cont/meth/par1/par2/par3
[2] => www.
[3] => www.
[4] => www.
[5] =>
[6] => cont
[7] => meth
[8] => par1/par2/par3
[9] => par1
[10] => par1
[11] =>
[12] => /par3
[13] => par3
)
Array
(
[0] => /
[1] => /
[2] =>
[3] =>
[4] =>
[5] =>
[6] =>
[7] =>
[8] =>
[9] =>
[10] =>
[11] =>
[12] =>
[13] =>
[14] =>
[15] =>
)
As you can see, $reg2 matches a bunch of empty strings in $uri, which is an indication of what I described earlier.
If you come up with a short description of what you are trying to do, I can provide you with a fully functional (and probably a bit neater than your current) regular expression.
Your regex is a mess and you will have to change it if you want it to work.
Check out the Rubular for your paramsindex: http://www.rubular.com/r/3ptjQ5aIrD
Now, for paramsassoc: http://www.rubular.com/r/o7GCbCsHyX
They both return a result. Sure, it's an array full of empty strings, but it is a result nonetheless.
That is why both are TRUE.
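The grouping problem can be reproduced with much shorter patterns. The two regexes below are simplified stand-ins written for this illustration, not the asker's generated ones: in the first, the top-level | splits the pattern into "everything on the left" or "/?$", and the second branch matches at the end of any string; in the second, the optional trailing slash is part of the same branch as the prefix.

```php
<?php
$uri = 'www.domain.com/paramsindex/cont/meth/par1/';

// Ungrouped: reads as "(^...paramsassoc...)" OR "(/?$)".
$loose = '~^(?:www\.)?domain\.com/paramsassoc/[a-z]+|/?$~';

// Grouped: the trailing "/?$" belongs to the paramsassoc branch.
$strict = '~^(?:www\.)?domain\.com/paramsassoc/[a-z]+(?:/[a-z0-9-]+)*/?$~';

var_dump(preg_match($loose, $uri));  // int(1): the bare "/?$" branch matched
var_dump(preg_match($strict, $uri)); // int(0): paramsindex is not paramsassoc
```

With the grouped form, a paramsassoc URL still matches while the paramsindex one is rejected.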

Trying to count existing files with array walk, getting "Only variables can be passed by reference"

existingReports[$i]
= count(array_walk(file_exists,$data['reports'][$i]));
Is something wrong with this statement?
For example
print_r($filename[3]);
[0] => uploads/2011-05-10%20Philippines%20Philippine%20storm%2022%20dead.pdf
[1] => uploads/2011-05-10%20Philippines%20Philippine%20storm%2022%20dead.pdf
[2] => uploads/2011-05-10%20Philippines%20Philippine%20storm%2022%20dead.pdf
[3] => uploads/2011-05-12%20Philippines%20Nestle%20noodles.jpg
[4] => uploads/2011-05-12%20Philippines%20Nestle%20noodles.jpg
[5] => uploads/2011-05-12%20Philippines%20Nestle%20noodles.jpg
[6] => uploads/2011-05-13%20Algeria%20TESTTEST
Obviously I am checking to see which of those reports exists, I will also need to dedupe them now that I am looking at this.
You probably want to use array_filter here instead:
= count(array_filter($data['reports'][$i], "file_exists"));
Of course, the first parameter still needs to be an array. If it matches your example output of print_r($filename[3]); it should work, however.
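As for the error message itself: array_walk() takes the array as its first argument, by reference, so passing file_exists in that position triggers "Only variables can be passed by reference". A small sketch of the array_filter approach, with the de-duplication the question mentions added via array_unique, might look like this. The temporary files and the $reports list are stand-ins created only so the example can run on its own.

```php
<?php
// Stand-in data: two real temp files (one listed twice) plus one missing path.
$a = tempnam(sys_get_temp_dir(), 'rep');
$b = tempnam(sys_get_temp_dir(), 'rep');
$reports = [$a, $a, $b, $b . '.missing'];

// Drop duplicates first, then keep only the paths that exist on disk.
$existing = array_filter(array_unique($reports), 'file_exists');

echo count($existing), PHP_EOL; // 2

// Clean up the temp files.
unlink($a);
unlink($b);
```

If the stored paths are URL-encoded (the %20 in the example) while the files on disk use spaces, they would probably need rawurldecode() before the file_exists check; that depends on how the files were saved.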
