PREG_ or Regex Question - php

I'd like to match the last instance of / (I believe you use [^/]+$) and copy the next four or fewer digits, up to a dash -.
I believe the "right" way to return this number is preg_split, but I'm not sure. The only other way I know is to explode on /, reverse the array, explode on -, and assign. I'm sure there's a more elegant way, though?
For instance
example.com/12-something // get 12
example.com/996-something // get 996
example.com/12345-no-deal // return nothing
I'm unfortunately not a regex guru like some of you folks though.
Here is an ugly way to do the same thing.
$strip = array_reverse(explode('/', $page));
$strip = $strip[0];
$strip = explode('-', $strip);
$strip = $strip[0];
echo (strlen($strip) <= 4) ? (int)$strip : null; // <= 4 so that four-digit numbers are accepted too

This should work
$str = "example.com/123-test";
preg_match("/\/([\d]{1,4})-[^\/]+$/", $str, $matches);
echo $matches[1]; // 123
It makes sure that the ###-word part is at the end and that there are only 1-4 digits.

A match on /\/(\d{1,4})-[^\/]+$/ should fit the bill with the number in the first capture var. My apologies, I don't write PHP and I don't want to deal with preg_match's interface, but that's the regex anyhow.
If PHP supports non-slash regex delimiters these days, m#/(\d{1,4})-[^/]+$# is the version with fewer leaning-toothpicks.
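PHP's preg functions do accept non-slash delimiters (written as part of the pattern string, without Perl's leading m), so the pattern above can be used directly. A minimal sketch, assuming the three example URLs from the question; the helper name leading_id is made up for illustration and is not from the original posts.

```php
<?php
// Hypothetical helper (the name is illustrative): return the 1-4 digit number
// that starts the last path segment, or null when there is no such number.
function leading_id($page)
{
    // Using '#' as the delimiter means '/' needs no escaping in the pattern.
    if (preg_match('#/(\d{1,4})-[^/]+$#', $page, $matches)) {
        return (int) $matches[1];
    }
    return null;
}

var_dump(leading_id('example.com/12-something'));   // int(12)
var_dump(leading_id('example.com/996-something'));  // int(996)
var_dump(leading_id('example.com/12345-no-deal'));  // NULL
```

Returning null for the no-match case avoids reading $matches[1] when preg_match found nothing.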

Related

How to remove repeated sequence of characters in a string?

Imagine if:
$string = "abcdabcdabcdabcdabcdabcdabcdabcd";
How do I remove the repeated sequence of characters (all characters, not just alphabets) in the string so that the new string would only have "abcd"? Perhaps running a function that returns a new string with removed repetitions.
$new_string = remove_repetitions($string);
The possible string before removing the repetition is always like above. I don’t know how else to explain since English is not my first language. Other examples are,
$string = "EqhabEqhabEqhabEqhabEqhab";
$string = "o=98guo=98guo=98gu";
Note that I want it to work with other sequence of characters as well. I tried using Regex but I couldn't figure out a way to accomplish it. I am still new to php and Regex.
For details: https://algorithms.tutorialhorizon.com/remove-duplicates-from-the-string/
Different programming languages have different ways to remove duplicate characters from a string.
Example, in PHP:
<?php
$str = "Hello World!";
echo count_chars($str,3);
?>
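Output:  !HWdelor (the result starts with a space, because count_chars in mode 3 returns each distinct character once, sorted by byte value)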
https://www.w3schools.com/php/func_string_count_chars.asp
Here, if we wish to remove the repeating substrings, I can't think of a way other than knowing in advance what we wish to collect, since the patterns seem complicated.
In that case, we could simply use a capturing group, put our desired output in it, and then remove everything else:
(abcd|Eqhab|guo=98)
I'm guessing there should be a simpler way to do this, though.
Test
$re = '/.+?(abcd|Eqhab|guo=98)\1.+/m';
$str = 'abcdabcdabcdabcdabcdabcdabcdabcd
EqhabEqhabEqhabEqhabEqhab
o98guo=98guo=98guo=98guo=98guo=98guo=98guo98';
$subst = '$1';
$result = preg_replace($re, $subst, $str);
echo $result;
You did not say exactly what to remove. A "sequence of characters" can be as small as a single character.
So this simple regex should work: it deletes every character that occurs again later in the string, leaving one copy of each.
preg_replace('/(.)(?=.*?\1)/', '', 'abcdabcdabcdabcdabcdabcd');
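If the input is always one unit repeated back to back, as in the question's examples, a backreference can find the unit without knowing it in advance. A minimal sketch of the asker's hypothetical remove_repetitions(), assuming the whole string consists of nothing but repetitions of a single unit:

```php
<?php
// Sketch of the asker's hypothetical remove_repetitions(): if the whole string
// is one unit repeated two or more times, return that unit; otherwise return
// the string unchanged.
function remove_repetitions($string)
{
    // (.+?) lazily finds the shortest candidate unit; \1+ requires the rest
    // of the string to be further copies of it. The s flag lets . match
    // newlines as well.
    return preg_replace('/^(.+?)\1+$/s', '$1', $string);
}

echo remove_repetitions('abcdabcdabcdabcdabcdabcdabcdabcd'), "\n"; // abcd
echo remove_repetitions('EqhabEqhabEqhabEqhabEqhab'), "\n";        // Eqhab
echo remove_repetitions('o=98guo=98guo=98gu'), "\n";               // o=98gu
echo remove_repetitions('no repeats'), "\n";                       // no repeats
```

Because the unit is matched lazily, a string like "aaaa" collapses to "a" rather than "aa".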

Using preg_replace and regex with varying formats

I have three types of strings that I encounter. My goal is to cycle through all of them and just get name.
page_1.name
page_2.name.text
page_1.name.something
The only way I can figure out doing this is to first remove the page_# with the following:
$remove_page = preg_replace("/(page_\d+\.)(\w+)/", "$2", $string);
Then remove the last bit like so:
$get_name = preg_replace("/(\w+)(\.\w+)/", "$1", $remove_page);
Is there a more efficient way to do this? This works, but I feel like I'm only slightly grasping the power of regex.
You can use this regex:
preg_match('/(?<=page_\d\.)[^.]+/', $input, $matches);
(?<=page_\d\.) is a lookbehind that makes sure our match is preceded by page_, a single digit, and a DOT. [^.]+ will match 1 or more characters that are not DOT.
Otherwise split by DOT and take the second element (index 1):
$arr = explode('.', $input);
$name = $arr[1];
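The lookbehind above only allows a single digit after page_, since PCRE lookbehinds must have a fixed length. A minimal sketch that uses a capture group instead, so page numbers of any length work; the helper name page_field and the page_12 input are made up for illustration.

```php
<?php
// Hypothetical helper: extract the segment right after "page_<number>.".
function page_field($input)
{
    // \d+ allows page numbers with more than one digit; the capture group
    // takes everything up to the next dot (or the end of the string).
    if (preg_match('/^page_\d+\.([^.]+)/', $input, $matches)) {
        return $matches[1];
    }
    return null;
}

foreach (array('page_1.name', 'page_2.name.text', 'page_12.name.something') as $input) {
    echo page_field($input), "\n"; // prints "name" each time
}
```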

PHP preg_replace, split or match?

I need to parse a string and replace a specific format for tv show names that don't fit my normal format of my media player's queue.
Some examples
Show.Name.2x01.HDTV.x264 should be Show.Name.S02E01.HDTV.x264
Show.Name.10x05.HDTV.XviD should be Show.Name.S10E05.HDTV.XviD
After the show name, there may be 1 or 2 digits before the x, I want the output to always be an S with two digits so add a leading zero if needed. After the x it should always be an E with two digits.
I looked through the manual pages for the preg_replace, split and match functions but couldn't quite figure out what I should do here. I can match the part of the string I want with /\dx\d{2}/ so I was thinking first check if the string has that pattern, then try and figure out how to split the parts out of the match but I didn't get anywhere.
I work best with examples, so if you can point me in the right direction with one, that would be great. My only test area right now is a PHP 4 install, so please no PHP 5 specific directions. Once I understand what's happening I can probably update it later for PHP 5 if needed :)
A different approach using sprintf, for PHP 4 and other versions that still support the e modifier:
$text = preg_replace('/([0-9]{1,2})x([0-9]{2})/ie',
'sprintf("S%02dE%02d", "$1", "$2")', $text); // quoted, so "08" is not read as an invalid octal literal
Note: The e modifier is deprecated as of PHP 5.5 and removed in PHP 7.0, so use preg_replace_callback() instead:
$text = preg_replace_callback('/([0-9]{1,2})x([0-9]{2})/',
    function($m) {
        return sprintf("S%02dE%02d", $m[1], $m[2]);
    }, $text);
Output
Show.Name.S02E01.HDTV.x264
Show.Name.S10E05.HDTV.XviD
preg_replace is the function you are looking for.
You have to write a regex pattern that picks out the right part of the string.
<?php
$replaced_data = preg_replace("~([0-9]{1,2})x([0-9]{2})~s", "S$1E$2", $data); // {1,2} so single-digit seasons like 2x01 match
$replaced_data = preg_replace("~S([1-9]{1})E~s", "S0$1E", $replaced_data);
?>
Sorry I could not test it but it should work.
Another way, using the preg_replace_callback() function:
$subject = <<<'LOD'
Show.Name.2x01.HDTV.x264 should be Show.Name.S02E01.HDTV.x264
Show.Name.10x05.HDTV.XviD should be Show.Name.S10E05.HDTV.XviD
LOD;
$pattern = '~([0-9]++)x([0-9]++)~i';
$callback = function ($match) {
    return sprintf("S%02sE%02s", $match[1], $match[2]);
};
$result = preg_replace_callback($pattern, $callback, $subject);
print_r($result);
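The patterns above match any digits around an x, which could also hit something like a resolution inside a file name. A minimal sketch that adds word boundaries to guard against that; the boundaries, the helper name normalize_episode, and the resolution example are assumptions beyond the original question, and the closure needs PHP 5.3 or later, so it does not suit the asker's PHP 4 install.

```php
<?php
// Hypothetical helper: rewrite "2x01"-style markers as "S02E01".
function normalize_episode($name)
{
    // \b on both sides is an added assumption: it keeps the pattern from
    // firing inside longer digit runs such as "1920x1080".
    return preg_replace_callback(
        '/\b(\d{1,2})x(\d{2})\b/i',
        function ($m) {
            // The (int) casts make the decimal intent explicit ("08" -> 8).
            return sprintf('S%02dE%02d', (int) $m[1], (int) $m[2]);
        },
        $name
    );
}

echo normalize_episode('Show.Name.2x01.HDTV.x264'), "\n";  // Show.Name.S02E01.HDTV.x264
echo normalize_episode('Show.Name.10x05.HDTV.XviD'), "\n"; // Show.Name.S10E05.HDTV.XviD
echo normalize_episode('Show.Name.1920x1080.mkv'), "\n";   // Show.Name.1920x1080.mkv
```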

PHP Regex to remove everything after a character

So I've seen a couple articles that go a little too deep, so I'm not sure what to remove from the regex statements they make.
I've basically got this
foo:bar all the way to anotherfoo:bar;seg98y34g.?sdebvw h segvu (anything goes really)
I need a PHP regex to remove EVERYTHING after the colon. The first part can be any length (but it never contains a colon), so in both cases above I'd end up with
foo and anotherfoo
after doing something like this horrendous example of pseudo-code
$string = 'foo:bar';
$newstring = regex_to_remove_everything_after_":"($string);
EDIT
After posting this: would an explode() work reliably enough? Something like
$pieces = explode(':', 'foo:bar');
$newstring = $pieces[0];
explode would do what you're asking for, but you can make it one step by using current.
$beforeColon = current(explode(':', $string));
I would not use a regex here (that involves some work behind the scenes for a relatively simple action), nor would I use strpos with substr (as that would, effectively, be traversing the string twice). Most importantly, this provides the person who reads the code with an immediate, "Ah, yes, that is what the author is trying to do!" instead of, "Wait, what is happening again?"
The only exception to that is if you happen to know that the string is excessively long: I would not explode a 1 GB file. Instead:
$beforeColon = substr($string, 0, strpos($string,':'));
I also feel substr isn't quite as easy to read: in current(explode you can see the delimiter immediately with no extra function calls, and the variable appears only once (which makes it less prone to human error). Basically I read current(explode as "I am taking the first piece of whatever comes before this string" as opposed to substr, which is "I am getting a substring starting at position 0 and continuing until this string."
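One caveat with the substr/strpos form: when the string contains no colon, strpos returns false and the result is an empty string rather than the original string. A minimal sketch of an alternative using strstr with its third argument (available since PHP 5.3); the helper name before_colon is made up for illustration.

```php
<?php
// Hypothetical helper: everything before the first colon, or the whole
// string when there is no colon at all.
function before_colon($string)
{
    // With the third argument set to true, strstr() returns the part of the
    // string before the first match, or false if nothing matches.
    $before = strstr($string, ':', true);
    return $before === false ? $string : $before;
}

echo before_colon('foo:bar'), "\n";                                  // foo
echo before_colon('anotherfoo:bar;seg98y34g.?sdebvw h segvu'), "\n"; // anotherfoo
echo before_colon('nocolon'), "\n";                                  // nocolon
```

Like the explode variants, this keeps the whole string when no colon is present.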
Your explode solution does the trick. If you really want to use regexes for some reason, you could simply do this:
$newstring = preg_replace("/(.*?):(.*)/", "$1", $string);
A bit more succinct than other examples:
current(explode(':', $string));
You can use RegEx that m.buettner wrote, but his example returns everything BEFORE ':', if you want everything after ':' just use $2 instead of $1:
$newstring = preg_replace("/(.*?):(.*)/", "$2", $string);
You could use something like the following. demo: http://codepad.org/bUXKN4el
<?php
$s = 'anotherfoo:bar;seg98y34g.?sdebvw h segvu';
$parts = explode(':', $s); // array_shift() takes its argument by reference, so it needs a variable
$result = array_shift($parts);
echo $result;
?>
Why do you want to use a regex?
list($beforeColon) = explode(':', $string);

Replace from one custom string to another custom string

How can I replace a string starting with 'a' and ending with 'z'?
basically I want to be able to do the same thing as str_replace but be indifferent to the values in between two strings in a 'haystack'.
Is there a built in function for this? If not, how would i go about efficiently making a function that accomplishes it?
That can be done with Regular Expression (RegEx for short).
Here is a simple example:
$string = 'coolAfrackZInLife';
$replacement = 'Stuff';
$result = preg_replace('/A.*Z/', $replacement, $string);
echo $result;
The above example will return coolStuffInLife
A little explanation of the given RegEx /A.*Z/:
- The slashes indicate the beginning and end of the RegEx;
- A and Z are the start and end characters between which you need to replace;
- . matches any single character
- * matches zero or more of the preceding element (in our case, any character)
- You can optionally use + instead of *, which will match only if there is at least one character in between
Take a look at Rubular.com for a simple way to test your RegExes. It also provides a short RegEx reference.
$string = "I really want to replace aFGHJKz with booo";
$new_string = preg_replace('/a[a-zA-Z]+z/', 'boo', $string);
echo $new_string;
Be wary of the regex, are you wanting to find the first z or last z? Is it only letters that can be between? Alphanumeric? There are various scenarios you'd need to explain before I could expand on the regex.
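The first-z-or-last-z question comes down to a lazy versus a greedy quantifier. A minimal sketch that makes the choice explicit and works with arbitrary start and end strings; the function name replace_between, its parameters, and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: replace everything from $start through $end.
// $greedy = true runs to the last $end; false stops at the first one.
function replace_between($haystack, $start, $end, $replacement, $greedy = false)
{
    $middle = $greedy ? '.*' : '.*?';
    // preg_quote() escapes regex metacharacters in the markers, including
    // the '/' delimiter passed as its second argument.
    $pattern = '/' . preg_quote($start, '/') . $middle . preg_quote($end, '/') . '/s';
    return preg_replace($pattern, $replacement, $haystack);
}

$string = 'xxaBBzCCzyy';
echo replace_between($string, 'a', 'z', '-'), "\n";       // xx-CCzyy
echo replace_between($string, 'a', 'z', '-', true), "\n"; // xx-yy
```

Note that preg_replace still interprets $1-style references and backslashes in the replacement, so a replacement containing those characters would need escaping first.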
Use preg_replace so you can use regex patterns.
