PHP regex title conversion / negative look ahead / toLowerCase

PHP regex title conversion / negative look ahead / toLowerCase - php

I'm trying to convert some titles in my html pages to <h2>. The pattern is simple.
<?php
$test = "<p><strong>THIS IS A TEST</strong></p><div>And this is Random STUFF</div><p><strong>CP</strong></p>";
$pattern = "/<p><strong>([A-Z ]*?)<\/strong><\/p>/";
$replacement = "<h2>$1</h2>";
$test = preg_replace($pattern, $replacement, $test);
?>
Basically, grab anything that's between <p><strong></strong></p> that is capitalized. Easy enough, so here's the complicated bit.
Firstly, I need to make a single exception. <p><strong>CP</strong></p> must not be converted to <h2>. I tried adding ?!(CP) right after the <p><strong> but it doesn't work.
Secondly, I need to be able to make the first letter capitalized. When I use "ucfirst" with "strtolower" on the preg_replace (ex:ucfirst(strtolower(preg_replace($pattern, $replacement, $test)));), it makes all the characters in the string to lowercase and ucfirst doesn't work as it's detecting "<" to be the first character.
Any hints or am I even going in the right direction?
EDIT
Thanks for the help, it was definitely better to use preg_replace_callback. I found that all my titles were more than 3 characters so I added the limiter. Also added special characters.
Here's my final code:
$pattern = "/<p><strong>([A-ZÀ-ÿ0-9 ']{3,}?)<\/strong><\/p>/";
$replacement = "<h2>$1</h2>";
$test[$i] = preg_replace_callback($pattern, create_function('$matches', 'return "<h2>".ucfirst(mb_strtolower($matches[1]))."</h2>";'), $test[$i]);

Try http://php.net/manual/de/function.preg-replace-callback.php .
You can create a custom function that is called on every match. In this function you can decide to a) not replace CP and b) to not put $1, but ucfirst.
Hope this helps & good luck.

Related

How to remove repeated sequence of characters in a string?

Imagine if:
$string = "abcdabcdabcdabcdabcdabcdabcdabcd";
How do I remove the repeated sequence of characters (all characters, not just alphabets) in the string so that the new string would only have "abcd"? Perhaps running a function that returns a new string with removed repetitions.
$new_string = remove_repetitions($string);
The possible string before removing the repetition is always like above. I don’t know how else to explain since English is not my first language. Other examples are,
$string = “EqhabEqhabEqhabEqhabEqhab”;
$string = “o=98guo=98guo=98gu”;
Note that I want it to work with other sequence of characters as well. I tried using Regex but I couldn't figure out a way to accomplish it. I am still new to php and Regex.

For details : https://algorithms.tutorialhorizon.com/remove-duplicates-from-the-string/
In different programming have a different way to remove the same or duplicate character from a string.
Example: In PHP
<?php
$str = "Hello World!";
echo count_chars($str,3);
?>
OutPut : !HWdelor
https://www.w3schools.com/php/func_string_count_chars.asp

Here, if we wish to remove the repeating substrings, I can't think of a way other than knowing what we wish to collect since the patterns seem complicated.
In that case, we could simply use a capturing group and add our desired output in it the remove everything else:
(abcd|Eqhab|guo=98)
I'm guessing it should be simpler way to do this though.
Test
$re = '/.+?(abcd|Eqhab|guo=98)\1.+/m';
$str = 'abcdabcdabcdabcdabcdabcdabcdabcd
EqhabEqhabEqhabEqhabEqhab
o98guo=98guo=98guo=98guo=98guo=98guo=98guo98';
$subst = '$1';
$result = preg_replace($re, $subst, $str);
echo $result;
Demo

You did not tell what exactly to remove. A "sequnece of characters" can be as small as just 1 character.
So this simple regex should work
preg_replace ( '/(.)(?=.*?\1)/g','' 'abcdabcdabcdabcdabcdabcd');

How to get a number from a html source page?

I'm trying to retrieve the followed by count on my instagram page. I can't seem to get the Regex right and would very much appreciate some help.
Here's what I'm looking for:
y":{"count":
That's the beginning of the string, and I want the 4 numbers after that.
$string = preg_replace("{y"\"count":([0-9]+)\}","",$code);
Someone suggested this ^ but I can't get the formatting right...

You haven't posted your strings so it is a guess to what the regex should be... so I'll answer on why your codes fail.
preg_replace('"followed_by":{"count":\d')
This is very far from the correct preg_replace usage. You need to give it the replacement string and the string to search on. See http://php.net/manual/en/function.preg-replace.php
Your second usage:
$string = preg_replace(/^y":{"count[0-9]/","",$code);
Is closer but preg_replace is global so this is searching your whole file (or it would if not for the anchor) and will replace the found value with nothing. What your really want (I think) is to use preg_match.
$string = preg_match('/y":\{"count(\d{4})/"', $code, $match);
$counted = $match[1];
This presumes your regex was kind of correct already.
Per your update:
Demo: https://regex101.com/r/aR2iU2/1
$code = 'y":{"count:1234';
$string = preg_match('/y":\{"count:(\d{4})/', $code, $match);
$counted = $match[1];
echo $counted;
PHP Demo: https://eval.in/489436
I removed the ^ which requires the regex starts at the start of your string, escaped the { and made the\d be 4 characters long. The () is a capture group and stores whatever is found inside of it, in this case the 4 numbers.
Also if this isn't just for learning you should be prepared for this to stop working at some point as the service provider may change the format. The API is a safer route to go.

This regexp should capture value you're looking for in the first group:
\{"count":([0-9]+)\}
Use it with preg_match_all function to easily capture what you want into array (you're using preg_replace which isn't for retrieving data but for... well replacing it).
Your regexp isn't working because you didn't escaped curly brackets. And also you didn't put count quantifier (plus sign in my example) so it would only capture first digit anyway.

PHP preg_replace, split or match?

I need to parse a string and replace a specific format for tv show names that don't fit my normal format of my media player's queue.
Some examples
Show.Name.2x01.HDTV.x264 should be Show.Name.S02E01.HDTV.x264
Show.Name.10x05.HDTV.XviD should be Show.Name.S10E05.HDTV.XviD
After the show name, there may be 1 or 2 digits before the x, I want the output to always be an S with two digits so add a leading zero if needed. After the x it should always be an E with two digits.
I looked through the manual pages for the preg_replace, split and match functions but couldn't quite figure out what I should do here. I can match the part of the string I want with /\dx\d{2}/ so I was thinking first check if the string has that pattern, then try and figure out how to split the parts out of the match but I didn't get anywhere.
I work best with examples, so if you can point me in the right direction with one that would be great. My only test area right now is a PHP 4 install, so please no PHP 5 specific directions, once I understand whats happening I can probably update it later for PHP 5 if needed :)

A different approach as a solution using #sprintf using PHP4 and below.
$text = preg_replace('/([0-9]{1,2})x([0-9]{2})/ie',
'sprintf("S%02dE%02d", $1, $2)', $text);
Note: The use of the e modifier is depreciated as of PHP5.5, so use preg_replace_callback()
$text = preg_replace_callback('/([0-9]{1,2})x([0-9]{2})/',
function($m) {
return sprintf("S%02dE%02d", $m[1], $m[2]);
}, $text);
Output
Show.Name.S02E01.HDTV.x264
Show.Name.S10E05.HDTV.XviD
See working demo

preg_replace is the function you are looking function.
You have to write a regex pattern that picks correct place.
<?php
$replaced_data = preg_replace("~([0-9]{2})x([0-9]{2})~s", "S$1E$2", $data);
$replaced_data = preg_replace("~S([1-9]{1})E~s", "S0$1E", $replaced_data);
?>
Sorry I could not test it but it should work.

An other way using the preg_replace_callback() function:
$subject = <<<'LOD'
Show.Name.2x01.HDTV.x264 should be Show.Name.S02E01.HDTV.x264
Show.Name.10x05.HDTV.XviD should be Show.Name.S10E05.HDTV.XviD
LOD;
$pattern = '~([0-9]++)x([0-9]++)~i';
$callback = function ($match) {
return sprintf("S%02sE%02s", $match[1], $match[2]);
};
$result = preg_replace_callback($pattern, $callback, $subject);
print_r($result);

Converting links occuring inside a string

I am attempting to change a string occurance e.g. http://www.bbc.co.uk/ so that it appears inside a html link e.g. http://www.bbc.co.uk
however for some reason my regex conversion does not work. Can someone please point me in the correct direction?
$text = "I love this website http://www.bbc.co.uk/";
$x = preg_replace("#[a-z]+://[^<>\s]+[[a-z0-9]/]#i", "\\0", $text);
var_dump($x);
outputs I love this website http://www.bbc.co.uk/ (No html link)

Your weird character class is at fault:
[[a-z0-9]/]
Double square brackets are for POSIX character classes like [[:digit:]].
You meant to write just:
[a-z0-9/]

It is because you regex is giving you a match (in fact it's really not even close to giving you a match as you are not accepting periods in the domain name at all). Try something like this:
$pattern = '#https?://.*\b#i';
$replace = '$0';
$x = preg_replace($pattern, $replace, $text);
Note that I am not actually trying to validate the URL format here, so I just accept anything like http():// up to the next word boundary. It didn't seem as if you were going for a true URL validation regex anyway (i.e. validating there is at least one ., that the TLD component has 2-6 characters, etc.), so I just figure I would give you the simplest pattern that would match.

Use this:
$x = preg_replace('#http://[?=&a-z0-9._/-]+#i', '<a target="_blank" href="$0">$0</a>', $text);

Replace from one custom string to another custom string

How can I replace a string starting with 'a' and ending with 'z'?
basically I want to be able to do the same thing as str_replace but be indifferent to the values in between two strings in a 'haystack'.
Is there a built in function for this? If not, how would i go about efficiently making a function that accomplishes it?

That can be done with Regular Expression (RegEx for short).
Here is a simple example:
$string = 'coolAfrackZInLife';
$replacement = 'Stuff';
$result = preg_replace('/A.*Z/', $replacement, $string);
echo $result;
The above example will return coolStuffInLife
A little explanation on the givven RegEx /A.*Z/:
- The slashes indicate the beginning and end of the Regex;
- A and Z are the start and end characters between which you need to replace;
- . matches any single charecter
- * Zero or more of the given character (in our case - all of them)
- You can optionally want to use + instead of * which will match only if there is something in between
Take a look at Rubular.com for a simple way to test your RegExs. It also provides short RegEx reference

$string = "I really want to replace aFGHJKz with booo";
$new_string = preg_replace('/a[a-zA-z]+z/', 'boo', $string);
echo $new_string;
Be wary of the regex, are you wanting to find the first z or last z? Is it only letters that can be between? Alphanumeric? There are various scenarios you'd need to explain before I could expand on the regex.

use preg_replace so you can use regex patterns.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.

PHP regex title conversion / negative look ahead / toLowerCase - php

Try http://php.net/manual/de/function.preg-replace-callback.php . You can create a custom function that is called on every match. In this function you can decide to a) not replace CP and b) to not put $1, but ucfirst. Hope this helps & good luck.

Related

How to remove repeated sequence of characters in a string?

How to get a number from a html source page?

PHP preg_replace, split or match?

Converting links occuring inside a string

Replace from one custom string to another custom string

Categories

Resources