Remove numbers from string and create a new string - php

How can I remove numbers from a string like "take_me_apart_12_13" to make it "take_me_apart" and save the numbers (12 & 13) into an array?
I've tried things like preg_split("/_/", $string), but that breaks up the whole structure of the string (instead of just removing the numbers and keeping the words and underscores). I've been looking for a PHP function for this sort of thing, but I can't find one that accomplishes this task.
I know I could do something along the lines of using a for loop and checking each character and creating a new string in there but I was hoping to avoid that. If it's the only possible solution though, I'll have to use it.

Match the digits with [0-9]+ (or \d+), together with the leading underscore:
$data = "take_me_apart_12_13";
preg_match_all('!\d+!', $data , $matches);
print_r($matches);
echo $words = preg_replace('/_[0-9]+/', '', $data);
OUTPUT :: take_me_apart
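To cover the second half of the question (saving the numbers into an array), a minimal sketch could look like this. The variable names $numbers and $words are only illustrative, and casting to integers with intval is an assumption about how the numbers will be used afterwards.

```php
<?php
$data = "take_me_apart_12_13";

// preg_match_all stores the full matches in $matches[0].
preg_match_all('/\d+/', $data, $matches);
$numbers = array_map('intval', $matches[0]); // assumption: integers are wanted

// Remove each run of digits together with the underscore in front of it.
$words = preg_replace('/_\d+/', '', $data);

print_r($numbers);
echo $words, PHP_EOL;
```

Note that print_r($matches) in the answer above prints a nested array; the numbers themselves sit in $matches[0].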

Related

Replacing part of the string with regex

I have a string like this:
$string ='//upload.wikimedia.org/wikipedia/commons/thumb/6/6b/AkutanZero1.jpg/220px-AkutanZero1.jpg';
But I'm trying to replace a section of it with another:
$string ='//upload.wikimedia.org/wikipedia/commons/thumb/6/6b/AkutanZero1.jpg/123px-AkutanZero1.jpg';
I'm trying to use preg_replace, and I know that the string will always end with /thumb/(a hex value)/(two hex values)/(stuff)/(one or more numbers)-px-(stuff)
Unfortunately I haven't been successful in getting the text replaced and don't know what I'm doing wrong.
It would be easy if I could assume /(one or more numbers)-px existing only once but it could also exist in the /(stuff) part too.
preg_replace('/\/thumb\/[0-9a-f]\/[0-9a-f]{2}\/.+\/([0-9]+)-px-.+$/i', '328', $string);
preg_replace('/(\/thumb\/[0-9a-f]\/[0-9a-f]{2}\/.+\/)([0-9]+)(-px-.+)$/i', $1.'328'.$3, $string);
Based on your single sample input, you don't need any capture groups to get the expected result. Just find the occurrence(s) of digits followed by px- and swap in your preferred value. If this isn't robust enough, please improve your question.
Code:
$string='//upload.wikimedia.org/wikipedia/commons/thumb/6/6b/AkutanZero1.jpg/220px-AkutanZero1.jpg';
echo preg_replace('/\d+px-/','123px-',$string);
Output:
//upload.wikimedia.org/wikipedia/commons/thumb/6/6b/AkutanZero1.jpg/123px-AkutanZero1.jpg
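If the digits-plus-px- sequence can also occur earlier in the path, as the question worries, one possible sketch is to anchor the match to the last path segment and keep the surrounding text with capture groups. The $width variable and the second test URL are made up for illustration. The ${1} form is used because a plain $1 directly followed by digits would be read as a reference to a different group number.

```php
<?php
$string = '//upload.wikimedia.org/wikipedia/commons/thumb/6/6b/AkutanZero1.jpg/220px-AkutanZero1.jpg';
$width = 123; // hypothetical new width

// Group 1: everything from /thumb/ up to the last slash.
// Group 2: "px-" plus the rest of the final path segment.
$pattern = '~(/thumb/[0-9a-f]/[0-9a-f]{2}/.+/)\d+(px-[^/]+)$~i';
$result = preg_replace($pattern, '${1}' . $width . '$2', $string);
echo $result, PHP_EOL;

// A made-up URL whose file name itself contains "100px-"; only the width
// at the start of the last segment is replaced.
$tricky = '//x/thumb/a/ab/100px-foo.jpg/220px-100px-foo.jpg';
$trickyResult = preg_replace($pattern, '${1}' . $width . '$2', $tricky);
echo $trickyResult, PHP_EOL;
```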

How to get a number from a html source page?

I'm trying to retrieve the "followed by" count on my Instagram page. I can't seem to get the regex right and would very much appreciate some help.
Here's what I'm looking for:
y":{"count":
That's the beginning of the string, and I want the 4 numbers after that.
$string = preg_replace("{y"\"count":([0-9]+)\}","",$code);
Someone suggested this ^ but I can't get the formatting right...
You haven't posted your strings, so it's a guess as to what the regex should be... so I'll answer on why your code fails.
preg_replace('"followed_by":{"count":\d')
This is very far from the correct preg_replace usage. You need to give it the replacement string and the string to search on. See http://php.net/manual/en/function.preg-replace.php
Your second usage:
$string = preg_replace(/^y":{"count[0-9]/","",$code);
Is closer, but preg_replace is global, so it searches your whole file (or it would, if not for the anchor) and replaces the found value with nothing. What you really want (I think) is to use preg_match.
$string = preg_match('/y":\{"count(\d{4})/', $code, $match);
$counted = $match[1];
This presumes your regex was kind of correct already.
Per your update:
Demo: https://regex101.com/r/aR2iU2/1
$code = 'y":{"count:1234';
$string = preg_match('/y":\{"count:(\d{4})/', $code, $match);
$counted = $match[1];
echo $counted;
PHP Demo: https://eval.in/489436
I removed the ^, which requires the match to begin at the start of your string, escaped the {, and made the \d match exactly 4 digits. The () is a capture group and stores whatever is matched inside it, in this case the 4 digits.
Also, if this isn't just for learning, you should be prepared for this to stop working at some point, as the service provider may change the format. The API is a safer route to go.
This regexp should capture value you're looking for in the first group:
\{"count":([0-9]+)\}
Use it with the preg_match_all function to easily capture what you want into an array (you're using preg_replace, which isn't for retrieving data but for... well, replacing it).
Your regexp isn't working because you didn't escape the curly brackets. You also didn't add a quantifier (the plus sign in my example), so it would only capture the first digit anyway.
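Putting the two answers together, a hedged sketch might look like this. The $code sample is invented for illustration (the real page source was never posted), and the "followed_by" key is taken from the asker's first attempt quoted above.

```php
<?php
// Hypothetical fragment of the page source; the real markup may differ.
$code = '{"user":{"followed_by":{"count":1234},"follows":{"count":56}}}';

// Escape the braces, and use \d+ so any number of digits is captured.
if (preg_match('/"followed_by":\{"count":(\d+)\}/', $code, $match)) {
    $counted = (int) $match[1];
    echo $counted, PHP_EOL;
} else {
    echo "count not found", PHP_EOL;
}
```

Checking the return value of preg_match before reading $match[1] avoids an undefined-index notice if the page format changes.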

Multiple preg_replace

I have many strings that all start with # and a pseudonym, and I want to replace each pseudonym with the real name via regex.
I don't have many pseudonyms (maybe 5 to 10), so I can go with a simple regex like:
$find = array('#alex', '#donald');
$replace = array('Alex A.', 'Donald B.' );
$result= preg_replace($find, $replace, $feed->itemTitle);
My problem is that I already have a preg_replace on these strings that removes the link. So far this is my regex:
<?php echo preg_replace('#(https?://([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?).....#',' ',$feed->itemTitle); ?>
I can't come up with a solution that combines the two regexes. (Regex is something I am not comfortable with.)
Already having a preg_replace for the links isn't a problem; don't worry about that.
If you want, you can build one giant pattern with capture groups and use it with preg_replace_callback, which lets the callback function choose the replacement string according to the capture group number. However, this isn't a good approach.
Since you want to replace fixed strings (#alex and #donald are fixed strings), the best and fastest way is to use strtr (even if that means parsing the string a second time):
$trans = array('#alex' => 'Alex A.',
'#donald' => 'Donald B.');
$result = strtr($feed->itemTitle, $trans);
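As a rough sketch of how the two steps could be chained: the link pattern below is a deliberately simplified stand-in for the asker's longer expression, and $title is a made-up value in place of $feed->itemTitle. Note also that passing '#alex' directly as a preg_replace pattern would not work as written in the question, because PHP treats the leading # as the pattern delimiter and reports a missing ending delimiter.

```php
<?php
// Made-up sample title standing in for $feed->itemTitle.
$title = '#alex shared http://example.com/page?id=1 with #donald';

// Step 1: strip links (simplified pattern, an assumption for this sketch).
$withoutLinks = preg_replace('~\s*https?://\S+~', '', $title);

// Step 2: replace the fixed pseudonyms with strtr, no regex needed.
$trans = array(
    '#alex'   => 'Alex A.',
    '#donald' => 'Donald B.',
);
$result = strtr($withoutLinks, $trans);

echo $result, PHP_EOL;
```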

Regex for PHP seems simple but is killing me

I'm trying to make a replace in a string with a regex, and I really hope the community can help me.
I have this string :
031,02a,009,a,aaa,AZ,AZE,02B,975,135
And my goal is to remove the opposite of this regex
[09][0-9]{2}|[09][0-9][A-Za-z]
i.e.
a,aaa,AZ,AZE,135
(to see it in action : http://regexr.com?3795f )
My final goal is to preg_replace the first string to only get
031,02a,009,02B,975
(to see it in action : http://regexr.com?3795f )
I'm open to all solutions, but I admit that I'd really like to make this work with a preg_replace if possible (it has become something of a personal challenge).
Thanks for any help!
As @Taemyr pointed out in comments, my previous solution (using a lookbehind assertion) was incorrect, as it would consume 3 characters at a time even while substrings weren't always 3 characters.
Let's use a lookahead assertion instead to get around this:
'/(^|,)(?![09][0-9]{2}|[09][0-9][A-Za-z])[^,]*/'
The above matches the beginning of the string or a comma, then checks that what follows does not match one of the two forms you've specified to keep, and given that this condition passes, matches as many non-comma characters as possible.
However, this is identical to @anubhava's solution, meaning it has the same weakness: it can leave a leading comma in some cases. See this Ideone demo.
Trimming the comma with ltrim is the clean way to go there, but then again, if you were looking for the "clean way to go," you wouldn't be trying to use a single preg_replace to begin with, right? Your question is whether it's possible to do this without using any other PHP functions.
The answer is yes. We can take
'/(^|,)foo/'
and distribute the alternation,
'/^foo|,foo/'
so that we can tack on the extra comma we wish to capture only in the first case, i.e.
'/^foo,|,foo/'
That's going to be one hairy expression when we substitute foo with our actual regex, isn't it? Thankfully, PHP supports recursive patterns, so that we can rewrite the above as
'/^(foo),|,(?1)/'
And there you have it. Substituting foo for what it is, we get
'/^((?![09][0-9]{2}|[09][0-9][A-Za-z])[^,]*),|,(?1)/'
which indeed works, as shown in this second Ideone demo.
Let's take some time here to simplify your expression, though. [0-9] is equivalent to \d, and you can use case-insensitive matching by adding /i, like so:
'/^((?![09]\d{2}|[09]\d[a-z])[^,]*),|,(?1)/i'
You might even compact the inner alternation:
'/^((?![09]\d(\d|[a-z]))[^,]*),|,(?1)/i'
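For reference, here is a small sketch that runs the compacted expression against the string from the question, plus a second made-up input whose first element has to be removed (the case that left a leading comma before). Variable names are illustrative only.

```php
<?php
$pattern = '/^((?![09]\d(\d|[a-z]))[^,]*),|,(?1)/i';

// Input from the question.
$s = '031,02a,009,a,aaa,AZ,AZE,02B,975,135';
$cleaned = preg_replace($pattern, '', $s);
echo $cleaned, PHP_EOL;

// Made-up input where the first element must go: the first alternative
// removes "a," (element plus trailing comma), so no leading comma is left.
$leading = 'a,031,zz';
$cleanedLeading = preg_replace($pattern, '', $leading);
echo $cleanedLeading, PHP_EOL;
```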
Try it in more steps:
$newList = array();
foreach (explode(',', $list) as $element) {
    // Keep only the elements that match one of the two allowed forms.
    if (preg_match('/[09][0-9]{2}|[09][0-9][A-Za-z]/', $element)) {
        $newList[] = $element;
    }
}
$list = implode(',', $newList);
You still have your regex, see! Personal challenge completed.
Try matching what you want to keep and then joining it with commas:
preg_match_all('/[09][0-9]{2}|[09][0-9][A-Za-z]/', $input, $matches);
$result = implode(',', $matches[0]);
The problem you'll face with preg_replace is the extra commas you'll have to strip, because you don't just want to remove aaa; you actually want to remove aaa, or ,aaa. Now what happens when you have things to remove at both the beginning and the end of the string? You can't just say "I'll strip the comma before", because that might leave an extra comma at the beginning of the string, and vice versa. So basically, unless you want to mess with lookaheads and/or lookbehinds, you'd better do this in two steps.
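One way to sketch the two-step idea is with explode and preg_grep, which filters an array by a pattern. Anchoring the pattern with ^ and $ is an assumption made here so that each element has to match in full; the question's own regex is unanchored.

```php
<?php
$input = '031,02a,009,a,aaa,AZ,AZE,02B,975,135';

// Step 1: split into elements and keep those matching an allowed form.
// The ^...$ anchors are an assumption: each element must match entirely.
$kept = preg_grep('/^(?:[09][0-9]{2}|[09][0-9][A-Za-z])$/', explode(',', $input));

// Step 2: join the survivors, so no stray commas can appear.
$result = implode(',', $kept);
echo $result, PHP_EOL;
```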
This should work for you:
$s = '031,02a,009,a,aaa,AZ,AZE,02B,975,135';
echo ltrim(preg_replace('/(^|,)(?![09][0-9]{2}|[09][0-9][A-Za-z])[^,]+/', '', $s), ',');
OUTPUT:
031,02a,009,02B,975
Try this:
preg_replace('/(^|,)[1-8a-z][^,]*/i', '', $string);
This removes every substring that begins at the start of the string or at a comma, is followed by a disallowed first character, and runs up to (but not including) the next comma.
As per @GeoffreyBachelet's suggestion, to remove residual commas, you should do:
trim(preg_replace('/(^|,)[1-8a-z][^,]*/i', '', $string), ',');

Convert Notepad++ Regex to PHP Regular Expression

I'm trying to convert a Notepad++ regex to a PHP regular expression that extracts IDs from a list of URLs in this format:
http://www.example.com/category-example/1371937-text-blah-blah-blah-2012.html
http://www.example.com/category-example/1471337-text-blah-blah-2-blah-2010.html
Using the Notepad++ regex function, I get the output I need in two steps (a list of comma-separated IDs):
(.*)/ replace with space
-(.*) replace with comma
Result:
1371937,1471337
I tried to do something similar with PHP's preg_replace, but I can't figure out the correct regex. The example below removes everything except digits, but it doesn't work as expected, since the URLs can also contain numbers that do not belong to the ID.
$bb = preg_replace('/[^0-9]+/', ',', $_POST['Text']);
Which is the correct structure?
Thanks
If you are matching against:
http://www.example.com/category-example/1371937-text-blah-blah-blah-2012.html
To get:
1371937
You would:
$url = "http://www.example.com/category-example/1371937-text-blah-blah-blah-2012.html";
preg_match( "/[^\d]+(\d+)-/", $url, $matches );
$code = $matches[1];
This matches a run of non-numeric characters, then captures an unbroken string of digits that is immediately followed by a '-'.
If all you want to do is find the ID, then you should use preg_match, not preg_replace.
You've got lots of options for the pattern, the simplest being:
$url = 'http://www.example.com/category-example/1371937-text-blah-blah-blah-2012.html';
preg_match('/\d+/', $url, $matches);
echo $matches[0];
Which simply finds the first bunch of numbers in the URL. This works for the examples.
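To get the comma-separated list from the question for several URLs at once, a possible sketch uses preg_match_all with a capture group. The $text variable stands in for $_POST['Text'], and the pattern assumes every ID directly follows the last slash and is followed by a hyphen and a name ending in .html, as in the two sample URLs.

```php
<?php
// Stand-in for $_POST['Text']: one URL per line.
$text = "http://www.example.com/category-example/1371937-text-blah-blah-blah-2012.html\n"
      . "http://www.example.com/category-example/1471337-text-blah-blah-2-blah-2010.html";

// Capture the digits right after a slash, as long as the rest of that
// path segment looks like "-something.html".
preg_match_all('~/(\d+)-[^/\s]*\.html~', $text, $matches);

// $matches[1] holds the captured IDs, in order of appearance.
$ids = implode(',', $matches[1]);
echo $ids, PHP_EOL;
```

Because the ID has to follow a slash, the trailing years (2012, 2010) and the stray "2" inside the second URL are not picked up.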
