PHP difficult preg_match - php

I'm trying to manage some files on my personal web server that I use for XBMC but all the files (Movies from YIFY) have names like
Jumanji.1995-720p.YIFY.mp4
Silver.Linings.Playbook.2012.1080p.x264.YIFY.mp4
American Hustle (2013) 1080p BrRip x264 - YIFY.mp4
Notice that some of the items are separated with . and others with _ or spaces.
So what I need to do is preg_match the file name into an array of (title, year, quality). I only know some preg_match basics,
but this is way too hard for me.
For example:
echo extract('American Hustle (2013) 1080p BrRip x264 - YIFY.mp4');
This should output:
array(
'title' => 'American Hustle',
'year' => '2013',
'quality' => '1080p'
);
Thanks in advance

^(.*?)\W+(\d{4})(?=[\W ]+?(\d{3,4})p)
You can try this. See the demo:
https://regex101.com/r/nS2lT4/29
The regex lazily captures everything from the start of the string up to a run of one or more non-word characters that is followed by four digits (the year). A lookahead then requires that the year is followed by one or more non-word characters, then 3 or 4 digits and a p; those digits are captured as the quality. Because of the lookahead, a four-digit number inside the title is skipped unless the resolution comes right after it.
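A minimal sketch of how the pattern might be used from PHP. The function name parseMovieName is hypothetical, and replacing dots and underscores in the title with spaces is an assumption based on the expected output in the question. Note that the third group holds only the digits, so the p is appended again.

```php
<?php
// Hypothetical helper (not part of the answer): wraps the regex in preg_match().
function parseMovieName(string $file): ?array
{
    $pattern = '/^(.*?)\W+(\d{4})(?=[\W ]+?(\d{3,4})p)/';
    if (preg_match($pattern, $file, $m) !== 1) {
        return null; // the name does not follow the expected layout
    }
    return [
        // Assumption: dots and underscores in the title stand for spaces.
        'title'   => trim(preg_replace('/[._]+/', ' ', $m[1])),
        'year'    => $m[2],
        // Group 3 contains only the digits, so the "p" is added back.
        'quality' => $m[3] . 'p',
    ];
}

print_r(parseMovieName('American Hustle (2013) 1080p BrRip x264 - YIFY.mp4'));
```

File names that do not contain a year followed by a resolution return null, so the caller can decide how to handle them.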

You have 3 different formats, so you need 3 different parsing strategies.
Try this:
$tests = array(
// format 1
"Jumanji.1995-720p.YIFY.mp4",
"Silver.Linings.Playbook.2012-1080p.YIFY.mp4",
"American.Hustle.2013-1080p.YIFY.mp4",
// format 2
"Jumanji.1995.720p.x264.YIFY.mp4",
"Silver.Linings.Playbook.2012.1080p.x264.YIFY.mp4",
"American.Hustle.2013.1080p.x264.YIFY.mp4",
// format 3
"Jumanji (1995) 720p BrRip x264 - YIFY.mp4",
"Silver Linings Playbook (2012) 1080p BrRip x264 - YIFY.mp4",
"American Hustle (2013) 1080p BrRip x264 - YIFY.mp4",
);
function extractInfos($s) {
$infos = array();
if (FALSE === strpos($s, ".YIFY.")) {
// format 3
$tab = explode(" ", $s);
$yearIndex = count($tab) - 6;
$infos["year"] = trim($tab[$yearIndex], "()");
$infos["quality"] = $tab[$yearIndex + 1];
array_splice($tab, $yearIndex);
$infos["title"] = implode(" ", $tab);
} else {
// format 1 or 2
$tab = explode(".", $s);
$yearIndex = count($tab) - 3;
if (FALSE === strpos($tab[$yearIndex], "-")) {
// format 2
$yearIndex -= 2;
$infos["year"] = $tab[$yearIndex];
$infos["quality"] = $tab[$yearIndex + 1];
} else {
// format 1
list($infos["year"], $infos["quality"]) = explode("-", $tab[$yearIndex]);
}
array_splice($tab, $yearIndex);
$infos["title"] = implode(" ", $tab);
}
return $infos;
}
echo "<table border=\"1\">";
foreach ($tests as $s) {
$infos = extractInfos($s);
?>
<tr>
<td>
<?php echo $infos["title"];?>
</td>
<td>
<?php echo $infos["year"];?>
</td>
<td>
<?php echo $infos["quality"];?>
</td>
</tr>
<?php
}
echo "</table>";

Related

Separating phone numbers and extensions from poorly formatted data in PHP

I have a bunch of phone number strings. I need to separate the number from the extension. However, the formatting is obviously all over the place. How would you best achieve this in PHP?
555-555-5555 ext 230
555-555-5555 ex 230
555-555-5555 x 230
555-555-5555 ext. 230
555-555-5555 ext230
555-555-5555 x230
555-555-5555 ext # 230
I tried to use regex but I've not been able to come up with a pattern that matches everything above.
The phone numbers are also not exactly in good shape themselves.
Everything from (555)555-555-5555 to 555 555-555-5555. Oh and some records have multiple numbers separated by words like Mobile:, Cell:, or a newline :D . But, that problem is for another question.
Also, extensions are not always 3 numbers. Could be 2-4.
My expected result would be something along the lines of:
$array = [
'phone' => '555-555-5555',
'ext' => '123'
];
Assuming the phone number itself is well-formed, you can do it like this:
$array = array (
'555-555-5555 ext 230',
'555-555-5555 ex 230',
'555-555-5555 x 230',
'555-555-5555 ext. 230',
'555-555-5555 ext230',
'555-555-5555 x230',
'555-555-5555 ext # 230`',
);
$data = array();
foreach ($array as $val)
{
while (!is_numeric(substr($val,-1))) {
$val = substr_replace($val ,"", -1);
}
$data[] = array(
'num' => substr($val, 0, 12),
'ext' => substr($val, -3)
);
}
echo "<pre>"; print_r($data);
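The fixed-width substr() calls above assume a 12-character number and a 3-digit extension, while the question mentions extensions of 2 to 4 digits. A regex-based sketch that covers the listed variants might look like this; the function name splitExtension and the exact set of markers (ext, ex, x, optional "." and "#") are assumptions taken from the sample data.

```php
<?php
// Hypothetical helper: splits "number ext 123" style strings with one regex.
function splitExtension(string $raw): array
{
    // Number part (lazy, must end in a digit), then ext / ex / x,
    // an optional "." and "#", then a 2-4 digit extension.
    // Trailing non-digits such as a stray backtick are ignored.
    $pattern = '/^\s*(.*?\d)\s*(?:ext|ex|x)\.?\s*#?\s*(\d{2,4})\D*$/i';
    if (preg_match($pattern, $raw, $m) === 1) {
        return ['phone' => $m[1], 'ext' => $m[2]];
    }
    // No recognizable extension marker: return the trimmed input as the number.
    return ['phone' => trim($raw), 'ext' => null];
}

print_r(splitExtension('555-555-5555 ext. 230'));
```

This only separates the extension; it does not try to normalize the phone number itself.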
Try this:
<?php
$number = "555-555-5555 ext 230";
preg_match_all('!\d+!', $number, $matches);
for($x=0;$x<count($matches);$x++){
for($y=0;$y<count($matches[$x]);$y++){
if($y == (count($matches[$x]) - 1)){
echo $matches[$x][$y];
}else{
echo $matches[$x][$y]."-";
}
}
}
?>
Result: 555-555-5555-230. By the way, what is your expected result?
Update: I don't know if this is the best way, but please give it a try:
<?php
$number = "555-555-5555 x 230";
preg_match_all('!\d+!', $number, $matches);
for($x=0;$x<count($matches);$x++){
for($y=0;$y<count($matches[$x]);$y++){
if($y == (count($matches[$x]) - 1)){
$result[]= "#".$matches[$x][$y];
}else{
$result[] = $matches[$x][$y];
}
}
}
for($xy=0;$xy<count($result);$xy++){
if($xy == count($result) - 1 ){
$data['ext'][] = $result[$xy];
}else{
$data['number'][] = $result[$xy];
}
}
$num = implode("-", $data['number']);
$ext = implode("", str_replace("#","",$data['ext']));
$final = array("number" => $num, "ext" => $ext);
echo "<pre>";print_r($final);
?>

Extract and merge strings between different positions

I'm trying to make this work. I want to replace some parts of a sentence between given positions and then show the full sentence with the changes. With the following code I'm able to make the changes, but I don't know how to put the rest of the sentence back together.
I have two arrays, one with the positions where a woman's name appears and another with the positions of men's names. The code replaces the pronoun "his" with "her" when a woman comes before a man within an interval. The last thing I need is to reconstruct the sentence with the changes made, but I don't know how to extract the rest of the sentence. In the example, the result covers positions 0 to 20 (Maria her dress but) and 36 to 51 (Lorena her dog), but I also need to extract positions 20 to 36 (Peter his jeans) and 51 to the end (Juan his car) and merge them in at their positions.
The result should be: "Maria her dress but Peter his jeans Lorena her dog Juan his car". I'd appreciate any help with this. I've looked for similar questions but found nothing.
<?php
$womenpos = array("0","36"); //these arrays are just examples of positions
$menpos = array("20","51"); //they will change depending on the sentence
$sentence = "Maria his dress but Peter his jeans Lorena his dog Juan his car";
echo $sentence."\n";
foreach ($womenpos as $index => $value) {
$value2 = $menpos[$index];
if($value < $value2) {
echo "\nWoman(" . $value . ") is before man(" . $value2 . ")\n";
$end = ($value2 - $value);
$improved = str_replace(' his ', ' her ',
substr($sentence, $value, $end));
echo $improved."\n";
} else {
$improved = "Nothing changed";
echo $improved;
}
}
Ok, how about this:
$womenpos = array("0","36");
$menpos = array("20","51");
$bothpos = array_merge($womenpos,$menpos);
sort ($bothpos);
print_r($bothpos);
$sentence = "Maria his dress but Peter his jeans Lorena his dog Juan his car";
echo $sentence."\n";
$improved = ""; // start with an empty result so the .= below does not hit an undefined variable
for ($i = 0; $i<sizeof($bothpos); $i++) {
$start = $bothpos[$i];
if ($i ==sizeof($bothpos)-1) {
$end = strlen($sentence);
}
else {
$end = $bothpos[$i+1];
}
$length = $end-$start;
$segment = substr($sentence, $start, $length);
if (in_array($start, $womenpos)) {
$new_segment = str_replace (' his ', ' her ', $segment);
}
else { $new_segment = $segment; }
$improved .= $new_segment;
print "<li>$start-$end: $segment : $new_segment </li>\n";
}
print "<p>Improved: $improved</p>";
This combines the men's and women's position arrays to consider each stretch of text as one that might have an error. If that stretch of text starts at one of the womenpos points, then it changes 'his' to 'her'. If not it leaves it alone.
Does this get you in the direction you want to go in? I hope so!
This approaches the problem differently, but I wonder if it would provide the solution you're looking for:
$sentence = "Maria his dress but Peter his jeans Lorena his dog Juan his car";
$women = array ("Maria", "Lorena");
$words = explode (" ", $sentence);
for ($i=1; $i< sizeof($words); $i++) { // start at 1: the first word has no predecessor to check
if ($words[$i] == "his" && in_array($words[$i-1], $women)) {
$words[$i] = "her";
}
}
print (join(" ", $words));
This goes through the words one at a time; if the preceding word is in the $women array and the current word is "his", it changes the word to "her". Then it spits out all the words in order.
Does this do what you need, or do you really want a complex string positioning answer?

How can I divide streetnr and street out of one String?

I want to split a string into the variables street and streetnr. How can I do this with PHP?
The data look like:
Bakerstreet 5
Wild Street 5 a
Best Street 5a
Simplestreet
I have Streets without numbers and Streets with letters ... How can I do this?
So the street should always be like
var street = Bakerstreet, Wild Street, Best Street, Simplestreet
var streetnr should always be = 5, 5 a, 5a
My idea was to explode the string after every blank " " ... Then I reverse the array and look if the first element is just a letter. If it is, I put it into streetnr. Then I check the next element. If there are just numbers, I put it into streetnr ... and so on
I've used this as an exercise for my PHP training. And while I'm a huge fan of regex, I wanted to do it without them and came up with the following:
<?php
$addr = array("Bakerstreet 5",
"Wild Street 5 a",
"Best Street 5a",
"Simplestreet",
"Won't Work 47a Suite 18b",
"1st Street 10b ",
"Route 66 12a "
);
echo "<h1>Address-Parsing</h1><ol>";
foreach ($addr as $ad)
{
$no=""; // Number
$st=""; // Street
$GotNo = false;
$r = strrev(trim($ad));
echo "<li>ad=$ad";
do {
while (strlen($r) > 0 && $r[0]=="0") {// special handling for leading "0"s (in reverted string) which are ignored by sscanf; $r{0} is no longer valid in PHP 8
if (!$GotNo) $no = "0" . $no;
else $st = "0" . $st;
$r = substr($r,1);
}
$d = sscanf($r,"%d"); // get number
$s = sscanf($r,"%c"); // get string
if (is_null($d[0]) && !$GotNo) {
// no matching number and have not matched no yet - so this must be string following the nr
$no = strrev($s[0]) . $no;
$r = substr($r,strlen($s[0])); // remove match
} elseif (!$GotNo) {
$no = strrev($d[0]) . $no;
$GotNo = true;
$r = substr($r,strlen($d[0])); // remove match
} else {// we already have a number, so any text must be streetname
$st = strrev($r) . $st;
$r="";
}
if ($r !== trim($r)) {$st = " " . $st; $r = trim($r);}
} while (0<strlen($r));
$st = trim( $st);
if (empty($st)) {// might happen when no number was found...
$st=$no;
$no="";
}
echo "|st=$st|no=$no|</li>";
}
echo "</ol>";
?>
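Use a regular expression:
preg_match('~^\s*(.*)\s+([0-9]+\s*[a-zA-Z]{0,1})\s*$~', $street, $match);
preg_match returns 1 if the street is in the expected format (a name followed by a number), and $match is then an array containing [1] = street name and [2] = streetnr. A street without a number, such as Simplestreet, does not match, so preg_match returns 0.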
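As a rough illustration, the pattern above can be wrapped so that streets without a number still produce a result. The function name splitStreet and the choice to return an empty streetnr when nothing matches are assumptions, not part of the answer.

```php
<?php
// Hypothetical wrapper around the pattern from the answer.
function splitStreet(string $address): array
{
    $pattern = '~^\s*(.*)\s+([0-9]+\s*[a-zA-Z]{0,1})\s*$~';
    if (preg_match($pattern, $address, $match) === 1) {
        // rtrim() removes spaces the greedy (.*) may have kept.
        return ['street' => rtrim($match[1]), 'streetnr' => $match[2]];
    }
    // No trailing house number: treat the whole string as the street.
    return ['street' => trim($address), 'streetnr' => ''];
}

foreach (['Bakerstreet 5', 'Wild Street 5 a', 'Best Street 5a', 'Simplestreet'] as $a) {
    print_r(splitStreet($a));
}
```

Addresses with extra parts after the number (a suite, for example) are outside what this pattern handles.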
You could match using the following regular expression:
preg_match('/^(.+?)(?:\s+(\d.*))?$/', $street, $matches);
$street = $matches[1];
if (count($matches) == 3)
$streetnr = $matches[2];

PHP replace a random word of a string

I want to replace a word that occurs several times in a string, but only at random positions.
So let's say the string is
$str = 'I like blue, blue is my favorite colour because blue is very nice and blue is pretty';
And let's say I want to replace the word blue with red but only 2 times at random positions.
So after a function is done the output could be like
I like red, blue is my favorite colour because red is very nice and blue is pretty
Another one could be
I like blue, red is my favorite colour because blue is very nice and red is pretty
So I want to replace the same word multiple times but every time on different positions.
I thought of using preg_match, but it has no option to make the position of the replaced words random.
Does anybody have a clue how to achieve this?
Much as I am loath to use regex for something that is, on the face of it, very simple, I think it can help here to guarantee exactly n replacements: it lets us easily use array_rand(), which does exactly what you want, picking n random items from a list of indeterminate length.
<?php
function replace_n_occurences ($str, $search, $replace, $n) {
// Get all occurences of $search and their offsets within the string
$count = preg_match_all('/\b'.preg_quote($search, '/').'\b/', $str, $matches, PREG_OFFSET_CAPTURE);
// Get string length information so we can account for replacement strings that are of a different length to the search string
$searchLen = strlen($search);
$diff = strlen($replace) - $searchLen;
$offset = 0;
// Loop $n random matches and replace them, if $n < 1 || $n > $count, replace all matches
$toReplace = ($n < 1 || $n > $count) ? array_keys($matches[0]) : (array) array_rand($matches[0], $n);
foreach ($toReplace as $match) {
$str = substr($str, 0, $matches[0][$match][1] + $offset).$replace.substr($str, $matches[0][$match][1] + $searchLen + $offset);
$offset += $diff;
}
return $str;
}
$str = 'I like blue, blue is my favorite colour because blue is very nice and blue is pretty';
$search = 'blue';
$replace = 'red';
$replaceCount = 2;
echo replace_n_occurences($str, $search, $replace, $replaceCount);
echo preg_replace_callback('/blue/', function($match) { return rand(0,100) > 50 ? $match[0] : 'red'; }, $str);
Well, you could use this algorithm:
1. Calculate the random number of times you want to replace the string.
2. Explode the string into an array.
3. For that array, replace an occurrence of the string only if a random value between 1 and 100 is divisible by 3 (for instance).
4. Decrease the number calculated in step 1.
5. Repeat until the number reaches 0.
<?php
$amount_to_replace = 2;
$word_to_replace = 'blue';
$new_word = 'red';
$str = 'I like blue, blue is my favorite colour because blue is very nice and blue is pretty';
$words = explode(' ', $str); //convert string to array of words
$blue_keys = array_keys($words, $word_to_replace); //get index of all $word_to_replace
if(count($blue_keys) <= $amount_to_replace) { //if there are less to replace, we don't need to randomly choose. just replace them all
$keys_to_replace = array_keys($blue_keys);
}
else {
$keys_to_replace = array();
while(count($keys_to_replace) < $amount_to_replace) { //while we have more to choose
$replacement_key = rand(0, count($blue_keys) -1);
if(in_array($replacement_key, $keys_to_replace)) continue; //we have already chosen to replace this word, don't add it again
else {
$keys_to_replace[] = $replacement_key;
}
}
}
foreach($keys_to_replace as $replacement_key) {
$words[$blue_keys[$replacement_key]] = $new_word;
}
$new_str = implode(' ', $words); //convert array of words back into string
echo $new_str."\n";
?>
N.B. I just realized this will not replace the first blue, since it is entered into the word array as "blue," and so doesn't match in the array_keys call.
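One possible way around the "blue," problem is to match whole words with a regex instead of exploding on spaces, and to pick the occurrences to replace up front. The sketch below is an assumption-laden variant, not any of the answers' code: replaceRandomWords is a hypothetical name, and unlike the earlier function it leaves the string unchanged when $n is less than 1.

```php
<?php
// Hypothetical helper: replaces exactly $n randomly chosen whole-word matches
// (or all of them if there are fewer than $n).
function replaceRandomWords(string $str, string $search, string $replace, int $n): string
{
    $pattern = '/\b' . preg_quote($search, '/') . '\b/';
    $total = preg_match_all($pattern, $str);
    if (!$total || $n < 1) {
        return $str; // nothing to replace
    }
    // Pick which occurrences (0-based) to replace. array_rand() returns a
    // single key instead of an array when asked for one, hence the cast.
    $chosen = (array) array_rand(range(0, $total - 1), min($n, $total));
    $index = 0;
    return preg_replace_callback(
        $pattern,
        function (array $m) use (&$index, $chosen, $replace) {
            // Replace only the occurrences whose position was chosen.
            return in_array($index++, $chosen, true) ? $replace : $m[0];
        },
        $str
    );
}

$str = 'I like blue, blue is my favorite colour because blue is very nice and blue is pretty';
echo replaceRandomWords($str, 'blue', 'red', 2), "\n";
```

Because the word boundary \b is used, "blue," counts as an occurrence of blue and the comma is preserved.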

PHP Word Length Density / Count calc for a string

Given a text, how could I count the density / count of word lengths, so that I get an output like this
1 letter words : 52 / 1%
2 letter words : 34 / 0.5%
3 letter words : 67 / 2%
I found these, but they are for Python:
counting the word length in a file
Index by word length
You could start by splitting your text into words, using either explode() (as a very/too simple solution) or preg_split() (allows for stuff that's a bit more powerful) :
$text = "this is some kind of text with several words";
$words = explode(' ', $text);
Then, iterate over the words, getting, for each one of those, its length, using strlen() ; and putting those lengths into an array :
$results = array();
foreach ($words as $word) {
$length = strlen($word);
if (isset($results[$length])) {
$results[$length]++;
}
else {
$results[$length] = 1;
}
}
If you're working with UTF-8, see mb_strlen().
At the end of that loop, $results would look like this :
array
4 => int 5
2 => int 2
7 => int 1
5 => int 1
The total number of words, which you'll need to calculate the percentage, can be found either :
By incrementing a counter inside the foreach loop,
or by calling array_sum() on $results after the loop is done.
And for the percentages' calculation, it's a bit of maths -- I won't be that helpful, about that ^^
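The percentage step could be sketched as follows, reusing the counting loop from this answer. Rounding to one decimal place and the output format are assumptions modeled on the question's example.

```php
<?php
$text = "this is some kind of text with several words";

// Same counting idea as in the answer: word length => number of words.
$results = array();
foreach (explode(' ', $text) as $word) {
    $length = strlen($word);
    $results[$length] = isset($results[$length]) ? $results[$length] + 1 : 1;
}

ksort($results);               // report the shortest words first
$total = array_sum($results);  // total number of words

$lines = array();
foreach ($results as $length => $count) {
    // Share of all words that have this length, in percent.
    $percent = round($count * 100 / $total, 1);
    $lines[] = "$length letter words : $count / $percent%";
}
echo implode("\n", $lines), "\n";
```

For multibyte text, strlen() would count bytes, so mb_strlen() would be the safer choice there.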
You could explode the text by spaces and then for each resulting word, count the number of letters. If there are punctuation symbols or any other word separator, you must take this into account.
$lettercount = array();
$text = "lorem ipsum dolor sit amet";
foreach (explode(' ', $text) as $word)
{
$len = strlen($word);
$lettercount[$len] = isset($lettercount[$len]) ? $lettercount[$len] + 1 : 1; // isset() avoids an E_NOTICE on the first addition
}
foreach ($lettercount as $numletters => $numwords)
{
echo "$numletters letters: $numwords<br />\n";
}
PS: I have not tested this, but it should work.
You can be smarter about removing punctuation by using preg_replace; the version below uses plain str_replace calls.
$txt = "Sean Hoare, who was first named News of the World journalist to make hacking allegations, found dead at Watford home. His death is not being treated as suspiciou";
$txt = str_replace( "  ", " ", $txt ); // collapse double spaces into single ones
$txt = str_replace( ".", "", $txt );
$txt = str_replace( ",", "", $txt );
$a = explode( " ", $txt );
$cnt = array();
foreach ( $a as $b )
{
if ( isset( $cnt[strlen($b)] ) )
$cnt[strlen($b)] += 1;
else
$cnt[strlen($b)] = 1;
}
foreach ( $cnt as $k => $v )
{
echo $k . " letter words: " . $v . " " . round( ( $v * 100 ) / count( $a ) ) . "%\n";
}
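The preg_replace idea mentioned in this answer could look roughly like the sketch below. The function name wordLengths is hypothetical, and the character class assumes plain ASCII text; anything that is not a letter, digit or whitespace is dropped before counting.

```php
<?php
// Hypothetical helper: strips punctuation with preg_replace, then counts word lengths.
function wordLengths(string $txt): array
{
    // Remove everything that is not an ASCII letter, digit or whitespace.
    $clean = preg_replace('/[^A-Za-z0-9\s]+/', '', $txt);
    // Split on runs of whitespace; PREG_SPLIT_NO_EMPTY skips empty pieces.
    $words = preg_split('/\s+/', $clean, -1, PREG_SPLIT_NO_EMPTY);
    $cnt = array();
    foreach ($words as $w) {
        $len = strlen($w);
        $cnt[$len] = isset($cnt[$len]) ? $cnt[$len] + 1 : 1;
    }
    ksort($cnt); // order by word length
    return $cnt;
}

print_r(wordLengths("Hello, world.  It is fine."));
```

Splitting on \s+ also takes care of double spaces and newlines, so no separate cleanup pass is needed for those.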
My simple way to limit the number of characters per word in a string with PHP.
function checkWord_len($string, $nr_limit) {
$d = null; // stays null when every word is within the limit
$text_words = explode(" ", $string);
$text_count = count($text_words);
for ($i=0; $i < $text_count; $i++){ //Get the array words from text
// echo $text_words[$i] ; "
//Get the array words from text
$cc = (strlen($text_words[$i])) ;//Get the length in characters of each word from the array
if($cc > $nr_limit) //Check the limit
{
$d = "0" ;
}
}
return $d ; //Return the value or null
}
$string_to_check = " heare is your text to check"; //Text to check
$nr_string_limit = '5' ; //Value of limit len word
$rez_fin = checkWord_len($string_to_check,$nr_string_limit) ;
if($rez_fin =='0')
{
echo "false";
//Execute the false code
}
elseif($rez_fin == null)
{
echo "true";
//Execute the true code
}
?>
