Replacing a string inside a string in PHP - php

I have strings in my application that users can send via a form, and they can optionally replace words in that string with replacements that they also specify. For example if one of my users entered this string:
I am a user string and I need to be parsed.
And chose to replace and with foo the resulting string should be:
I am a user string foo I need to be parsed.
I need to somehow find the starting position of what they want to replace, replace it with the word they want and then tie it all together.
Could anyone write this up or at least provide an algorithm? My PHP skills aren't really up to the task :(
Thanks. :)

$result = preg_replace('/\band\b/i', 'foo', $subject);
will find all occurences of and where it's a word on its own and replace it with foo. \b ensures that there is a word boundary before and after and.

use preg_replace. You don't need to think so hard about this though you will have to learn a little bit about regexes. :)

Read up on str_replace, or for more complex replacements on Regular Expressions and preg_replace.
Examples for both:
<?php
$str = 'I am a user string and I need to be parsed.';
echo str_replace( 'and', 'foo', $str ) . "\n";
echo preg_replace( '/and/', 'foo', $str ) . "\n";
?>
In response to the comments of this answer, note that both examples above will replace every occurrence of the search string (and), even when it happens to be within another word.
To take care of that you either have to add the word separators to the str_replace call (see the comment of an example), but this will get quite complicated when you want to take care of all common word separators (space, commas, dots, exclamation marks, question marks etc.).
An easier to way to fix this problem is to use the power of regular expressions and make sure, the actual search string is not found within another word. See Tim Pietzcker's example below for a possible solution.

Related

How to get a number from a html source page?

I'm trying to retrieve the followed by count on my instagram page. I can't seem to get the Regex right and would very much appreciate some help.
Here's what I'm looking for:
y":{"count":
That's the beginning of the string, and I want the 4 numbers after that.
$string = preg_replace("{y"\"count":([0-9]+)\}","",$code);
Someone suggested this ^ but I can't get the formatting right...
You haven't posted your strings so it is a guess to what the regex should be... so I'll answer on why your codes fail.
preg_replace('"followed_by":{"count":\d')
This is very far from the correct preg_replace usage. You need to give it the replacement string and the string to search on. See http://php.net/manual/en/function.preg-replace.php
Your second usage:
$string = preg_replace(/^y":{"count[0-9]/","",$code);
Is closer but preg_replace is global so this is searching your whole file (or it would if not for the anchor) and will replace the found value with nothing. What your really want (I think) is to use preg_match.
$string = preg_match('/y":\{"count(\d{4})/"', $code, $match);
$counted = $match[1];
This presumes your regex was kind of correct already.
Per your update:
Demo: https://regex101.com/r/aR2iU2/1
$code = 'y":{"count:1234';
$string = preg_match('/y":\{"count:(\d{4})/', $code, $match);
$counted = $match[1];
echo $counted;
PHP Demo: https://eval.in/489436
I removed the ^ which requires the regex starts at the start of your string, escaped the { and made the\d be 4 characters long. The () is a capture group and stores whatever is found inside of it, in this case the 4 numbers.
Also if this isn't just for learning you should be prepared for this to stop working at some point as the service provider may change the format. The API is a safer route to go.
This regexp should capture value you're looking for in the first group:
\{"count":([0-9]+)\}
Use it with preg_match_all function to easily capture what you want into array (you're using preg_replace which isn't for retrieving data but for... well replacing it).
Your regexp isn't working because you didn't escaped curly brackets. And also you didn't put count quantifier (plus sign in my example) so it would only capture first digit anyway.

Split string on non-alphanumerics in PHP? Is it possible with php's native function?

I was trying to split a string on non-alphanumeric characters or simple put I want to split words. The approach that immediately came to my mind is to use regular expressions.
Example:
$string = 'php_php-php php';
$splitArr = preg_split('/[^a-z0-9]/i', $string);
But there are two problems that I see with this approach.
It is not a native php function, and is totally dependent on the PCRE Library running on server.
An equally important problem is that what if I have punctuation in a word
Example:
$string = 'U.S.A-men's-vote';
$splitArr = preg_split('/[^a-z0-9]/i', $string);
Now this will spilt the string as [{U}{S}{A}{men}{s}{vote}]
But I want it as [{U.S.A}{men's}{vote}]
So my question is that:
How can we split them according to words?
Is there a possibility to do it with php native function or in some other way where we are not dependent?
Regards
Sounds like a case for str_word_count() using the oft forgotten 1 or 2 value for the second argument, and with a 3rd argument to include hyphens, full stops and apostrophes (or whatever other characters you wish to treat as word-parts) as part of a word; followed by an array_walk() to trim those characters from the beginning or end of the resultant array values, so you only include them when they're actually embedded in the "word"
Either you have PHP installed (then you also have PCRE), or you don't. So your first point is a non-issue.
Then, if you want to exclude punctuation from your splitting delimiters, you need to add them to your character class:
preg_split('/[^a-z0-9.\']+/i', $string);
If you want to treat punctuation characters differently depending on context (say, make a dot only be a delimiter if followed by whitespace), you can do that, too:
preg_split('/\.\s+|[^a-z0-9.\']+/i', $string);
As per my comment, you might want to try (add as many separators as needed)
$splitArr = preg_split('/[\s,!\?;:-]+|[\.]\s+/', $string, -1, PREG_SPLIT_NO_EMPTY);
You'd then have to handle the case of a "quoted" word (it's not so easy to do in a regular expression, because 'is" "this' quoted? And how?).
So I think it's best to keep ' and " within words (so that "it's" is a single word, and "they 'll" is two words) and then deal with those cases separately. For example a regexp would have some trouble in correctly handling
they 're 'just friends'. Or that's what they say.
while having "'re" and a sequence of words of which the first is left-quoted and the last is right-quoted, the first not being a known sequence ('s, 're, 'll, 'd ...) may be handled at application level.
This is not a php-problem, but a logical one.
Words could be concatenated by a -. Abbrevations could look like short sentences.
You can match your example directly by creating a solution that fits only on this particular phrase. But you cant get a solution for all possible phrases. That would require a neuronal-computing based content-recognition.

Regular Expressions find and replace

I am having problems with RegEx in PHP and can't seem to find the answer.
I have a string, which is 3 letters, all caps ie COS.
the letters will change but always be 3 chars long and in caps, it will also be in the center of another string, surrounded by commas.
I need a regEx to find 3 caps letter inside a string and cahnge them from COS to 'COS'
(im doing this to amend a sql insert string)
I can't seem to find the regEx unless i use spercifit letter but the letters will change.
I need something along the lines of
[A-z]{3} then replace with '[A-Z]' (I know this isnt anywere near correct, just shorthand)
Anyone any suggestions?
Cheers
EDIT:
Just wanted to add incase anyone comes accross this question at a later date:
the sql insert string (provided from an external source and ftp's to my server daily)
contained the 3 capital string twice, once with commas and once with out
so I had to also remove the double commas added from the first regEx
$sqlString = preg_replace('/([A-Z]{3})/', "'$1'", $isqlString);
$sqlString = preg_replace('/\'\'([A-Z]{3})\'\'/', "'$1'", $sqlStringt);
Thanks everyone
You were actually very close. You could use:
echo preg_replace('/([A-Z]{3})/', "'$1'", 'COS'); //will output 'COS'
For MySQL statements I would advise to use the function mysql_real_escape_string() though.
$string = preg_replace('/([A-Z]{3})/', "'$1'", $string);
http://php.net/manual/en/function.preg-replace.php
Assuming it's like you said, "three capital letters surrounded by commas, e.g.
Foo bar,COS,Foo Bar
You can use look-ahead and look-behinds and find the letters:
(?<=,)([A-Z]{3})(?=,)
Then a simple replace to surround with single quotes will be adequate:
'$1'
All together, Here's it working.
preg_replace('/(^|\b)([A-Z]{3})(\b|$)/', "'${2}'", $string);

Regular expression anchor text for a link

I am trying to pull the anchor text from a link that is formatted this way:
<h3><b>File</b> : i_want_this</h3>
I want only the anchor text for the link : "i_want_this"
"variable_text" varies according to the filename so I need to ignore that.
I am using this regex:
<a href=\"\/en\/browse\/file\/variable_text\">(.*?)<\/a>
This is matching of course the complete link.
PHP uses a pretty close version to PCRE (PERL Regex). If you want to know a lot about regex, visit perlretut.org. Also, look into Regex generators like exspresso.
For your use, know that regex is greedy. That means that when you specify that you want something, follwed by anything (any repetitions) followed by something, it will keep on going until that second something is reached.
to be more clear, what you want is this:
<a href="
any character, any number of times (regex = .* )
">
any character, any number of times (regex = .* )
</a>
beyond that, you want to capture the second group of "any character, any number of times". You can do that using what are called capture groups (capture anything inside of parenthesis as a group for reference later, also called back references).
I would also look into named subpatterns, too - with those, you can reference your choice with a human readable string rather than an array index. Syntax for those in PHP are (?P<name>pattern) where name is the name you want and pattern is the actual regex. I'll use that below.
So all that being said, here's the "lazy web" for your regex:
<?php
$str = '<h3><b>File</b> : i_want_this</h3>';
$regex = '/(<a href\=".*">)(?P<target>.*)(<\/a>)/';
preg_match($regex, $str, $matches);
print $matches['target'];
?>
//This should output "i_want_this"
Oh, and one final thought. Depending on what you are doing exactly, you may want to look into SimpleXML instead of using regex for this. This would probably require that the tags that we see are just snippits of a larger whole as SimpleXML requires well-formed XML (or XHTML).
I'm sure someone will probably have a more elegant solution, but I think this will do what you want to done.
Where:
$subject = "<h3><b>File</b> : i_want_this</h3>";
Option 1:
$pattern1 = '/(<a href=")(.*)(">)(.*)(<\/a>)/i';
preg_match($pattern1, $subject, $matches1);
print($matches1[4]);
Option 2:
$pattern2 = '()(.*)()';
ereg($pattern2, $subject, $matches2);
print($matches2[4]);
Do not use regex to parse HTML. Use a DOM parser. Specify the language you're using, too.
Since it's in a captured group and since you claim it's matching, you should be able to reference it through $1 or \1 depending on the language.
$blah = preg_match( $pattern, $subject, $matches );
print_r($matches);
The thing to remember is that regex's return everything you searched for if it matches. You need to specify that only care about the part you've surrounded in parenthesis (the anchor text). I'm not sure what language you're using the regex in, but here's an example in Ruby:
string = 'i_want_this'
data = string.match(/<a href=\"\/en\/browse\/file\/variable_text\">(.*?)<\/a>/)
puts data # => outputs 'i_want_this'
If you specify what you want in parenthesis, you can reference it:
string = 'i_want_this'
data = string.match(/<a href=\"\/en\/browse\/file\/variable_text\">(.*?)<\/a>/)[1]
puts data # => outputs 'i_want_this'
Perl will have you use $1 instead of [1] like this:
$string = 'i_want_this';
$string =~ m/<a href=\"\/en\/browse\/file\/variable_text\">(.*?)<\/a>/;
$data = $1;
print $data . "\n";
Hope that helps.
I'm not 100% sure if I understand what you want. This will match the content between the anchor tags. The URL must start with /en/browse/file/, but may end with anything.
#(.*?)#
I used # as a delimiter as it made it clearer. It'll also help if you put them in single quotes instead of double quotes so you don't have to escape anything at all.
If you want to limit to numbers instead, you can use:
#(.*?)#
If it should have just 5 numbers:
#(.*?)#
If it should have between 3 and 6 numbers:
#(.*?)#
If it should have more than 2 numbers:
#(.*?)#
This should work:
<a href="[^"]*">([^<]*)
this says that take EVERYTHING you find until you meet "
[^"]*
same! take everything with you till you meet <
[^<]*
The paratese around [^<]*
([^<]*)
group it! so you can collect that data in PHP! If you look in the PHP manual om preg_match you will se many fine examples there!
Good luck!
And for your concrete example:
<a href="/en/browse/file/variable_text">([^<]*)
I use
[^<]*
because in some examples...
.*?
can be extremely slow! Shoudln't use that if you can use
[^<]*
You should use the tool Expresso for creating regular expression... Pretty handy..
http://www.ultrapico.com/Expresso.htm

PHP: Preg replace string with nothing

I'm new to preg_replace() and I've been trying to get this to work, I couldn't so StackOverflow is my last chance.
I have a string with a few of these:
('pm_IDHERE', 'NameHere');">
I want it to be replaced with nothing, so it would require 2 wildcards for NameHere and pm_IDHERE.
But I've tried it and failed myself, so could someone give me the right code please, and thanks :)
Update:
You are almost there, you just have to make the replacement an empty string and escape the parenthesis properly, otherwise they will be treated as capture group (which you don't need btw):
$str = preg_replace("#\('pm_.+?', '.*?'\);#si", "", $str);
You probably also don't need the modifiers s and i but that is up to you.
Old answer:
Probably str_replace() is sufficient:
$str = "Some string that contains pm_IDHERE and NameHere";
$str = str_replace(array('pm_IDHERE', 'NameHere'), '', $str);
If this is not what you mean and pm_IDHERE is actually something like pm_1564 then yes, you probably need regular expressions for that. But if NameHere has no actual pattern or structure, you cannot replace it with regular expression.
And you definitely have to explain better what kind of string you have and what kind of string you have want to replace.

Categories