Conditionally Replace a specific Character in a string - php

I am trying to remove the # sign from a block of text. The problem is that in certain cases (when it is at the beginning of a line), the # sign needs to stay.
I have partly succeeded with the regex pattern .\#, but whenever the # sign is removed, the character preceding it is removed as well.
Goal: remove all # signs UNLESS the # sign is the first character in the line.
<?php
function cleanFile($text)
{
$pattern = '/.\#/';
$replacement = '%40';
$val = preg_replace($pattern, $replacement, $text);
$text = $val;
return $text;
};
$text = ' Test: test#test.com'."\n";
$text .= '#Test: Leave the leading at sign alone'."\n";
$text .= '#Test: test#test.com'."\n";
$valResult = cleanFile($text);
echo $valResult;
?>
Output:
Test: tes%40test.com
#Test: Leave the leading at sign alone
#Test: tes%40test.com

You can do this with a regex using a negative lookbehind: /(?<!^)#/m matches a # sign that is not preceded by the start of a line (or by the start of the string, if you leave out the m modifier).
Regex 101 Demo
In code:
<?php
$string = "Test: test#test.com\n#Test: Leave the leading at sign alone\n#Test: test#test.com;";
$string = preg_replace("/(?<!^)#/m", "%40", $string);
var_dump($string);
?>
which outputs the following:
string(84) "Test: test%40test.com
#Test: Leave the leading at sign alone
#Test: test%40test.com;"
Codepad demo
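As a minimal sketch, the same lookbehind can be dropped into the asker's cleanFile() function. The sample input is the one from the question; the type declarations are an addition and assume PHP 7 or later.

```php
<?php
// Replace every '#' with '%40' unless it is the first character of a line.
function cleanFile(string $text): string
{
    // (?<!^) is a negative lookbehind on the line start; the m modifier
    // makes ^ match after every newline, not only at the start of the string.
    return preg_replace('/(?<!^)#/m', '%40', $text);
}

$text  = "Test: test#test.com\n";
$text .= "#Test: Leave the leading at sign alone\n";
$text .= "#Test: test#test.com\n";

echo cleanFile($text);
```

Because the lookbehind is zero-width, the character before the # is no longer consumed, which was the problem with the original .\# pattern.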

There's no need for a regexp in such a simple case.
function clean($source) {
$prefix = '';
$offset = 0;
if( $source[0] == '#' ) {
$prefix = '#';
$offset = 1;
}
return $prefix . str_replace('#', '', substr( $source, $offset ));
}
and test case
$test = array( '#foo#bar', 'foo#bar' );
foreach( $test as $src ) {
echo $src . ' => ' . clean($src) . "\n";
}
would give:
#foo#bar => #foobar
foo#bar => foobar
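The function above works on a single line and deletes the # rather than replacing it. A possible sketch that applies the same non-regex idea to a multi-line block follows; the name cleanLines and the '%40' default are assumptions taken from the question, and the text is assumed to use "\n" line endings.

```php
<?php
// Hypothetical helper: applies the non-regex idea line by line.
function cleanLines(string $text, string $replacement = '%40'): string
{
    $lines = explode("\n", $text);
    foreach ($lines as $i => $line) {
        if ($line === '') {
            continue; // nothing to replace on an empty line
        }
        // Keep the first character untouched, replace '#' in the rest.
        $lines[$i] = $line[0] . str_replace('#', $replacement, substr($line, 1));
    }
    return implode("\n", $lines);
}

echo cleanLines("Test: test#test.com\n#Test: test#test.com\n");
```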

The syntax [^...] is a negated character class (it matches any single character except those listed), but I don't think the following would work:
$pattern = '/[^]^#/';
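A negated class such as [^\n] still consumes the character it matches, which is the same reason the asker's .\# pattern ate the preceding character. One possible workaround, sketched here under the assumption that the text never contains two # signs in a row, is to capture that character and put it back in the replacement:

```php
<?php
$text = "Test: test#test.com\n#Test: test#test.com\n";

// ([^\n]) captures the character before '#'; $1 re-inserts it.
// A '#' at the start of the string or right after a newline has no
// such preceding character, so it is left alone.
$result = preg_replace('/([^\n])#/', '$1%40', $text);

echo $result;
```

With consecutive # signs (for example "a##b") the second # would be skipped, because the first one is consumed as the captured character of no later match. The lookbehind version does not have this limitation.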

Related

PHP Preg Replace. Remove strings inside {~ string ~} pattern, but skip <pre>{~ string ~}</pre> [duplicate]

I am using a WordPress plugin named Acronyms (https://wordpress.org/plugins/acronyms/). This plugin replaces acronyms with their description. It uses the PHP preg_replace function.
The issue is that it also replaces the acronyms contained in a <pre> tag, which I use to present source code.
Could you modify this expression so that it won't replace acronyms contained inside <pre> tags (not only as direct children, but at any nesting level)? Is it possible?
The PHP code is:
$text = preg_replace(
"|(?!<[^<>]*?)(?<![?.&])\b$acronym\b(?!:)(?![^<>]*?>)|msU"
, "<acronym title=\"$fulltext\">$acronym</acronym>"
, $text
);
You can use a PCRE SKIP/FAIL regex trick (also works in PHP) to tell the regex engine to only match something if it is not inside some delimiters:
(?s)<pre[^<]*>.*?<\/pre>(*SKIP)(*F)|\b$acronym\b
This means: skip all substrings starting with <pre> and ending with </pre>, and only then match $acronym as a whole word.
See demo on regex101.com
Here is a sample PHP demo:
<?php
$acronym = "ASCII";
$fulltext = "American Standard Code for Information Interchange";
$re = "/(?s)<pre[^<]*>.*?<\\/pre>(*SKIP)(*F)|\\b$acronym\\b/";
$str = "<pre>ASCII\nSometext\nMoretext</pre>More text \nASCII\nMore text<pre>More\nlines\nASCII\nlines</pre>";
$subst = "<acronym title=\"$fulltext\">$acronym</acronym>";
$result = preg_replace($re, $subst, $str);
echo $result;
Output:
<pre>ASCII
Sometext
Moretext</pre>More text 
<acronym title="American Standard Code for Information Interchange">ASCII</acronym>
More text<pre>More
lines
ASCII
lines</pre>
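The demo interpolates $acronym straight into the pattern and $fulltext straight into the replacement. A hedged sketch of a wrapper that escapes both is shown below; the function name wrapAcronym is hypothetical, and preg_replace_callback is used so that characters such as $ or \ in the description are not read as backreferences.

```php
<?php
// Hypothetical wrapper around the SKIP/FAIL pattern above.
function wrapAcronym(string $text, string $acronym, string $fulltext): string
{
    // Escape regex metacharacters (and the '/' delimiter) in the acronym.
    $word = preg_quote($acronym, '/');
    // First alternative: consume a whole <pre>...</pre> block, then discard it.
    // Second alternative: the acronym as a whole word anywhere else.
    $re = '/<pre[^<]*>.*?<\/pre>(*SKIP)(*F)|\b' . $word . '\b/s';

    return preg_replace_callback($re, function (array $m) use ($fulltext) {
        return '<acronym title="' . htmlspecialchars($fulltext) . '">' . $m[0] . '</acronym>';
    }, $text);
}

echo wrapAcronym('<pre>ASCII</pre> ASCII', 'ASCII', 'American Standard Code for Information Interchange');
```

Note that \b only behaves as a word boundary when the acronym starts and ends with a word character.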
It is also possible to use preg_split and keep the code block as a group, only replace the non-code block part then combine it back as a complete string:
function replace($s) {
return str_replace('"', '"', $s); // do something with `$s`
}
$text = 'Your text goes here...';
$parts = preg_split('#(<\/?[-:\w]+(?:\s[^<>]+?)?>)#', $text, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE); // -1 = no limit; passing null is deprecated as of PHP 8.1
$text = "";
$x = 0;
foreach ($parts as $v) {
if (trim($v) === "") {
$text .= $v;
continue;
}
if ($v[0] === '<' && substr($v, -1) === '>') {
if (preg_match('#^<(\/)?(?:code|pre)(?:\s[^<>]+?)?>$#', $v, $m)) {
$x = isset($m[1]) && $m[1] === '/' ? 0 : 1;
}
$text .= $v; // this is a HTML tag…
} else {
$text .= !$x ? replace($v) : $v; // process or skip…
}
}
return $text;
Taken from here.

Check if a word occurs in a string and is not the first or last word

I am trying to check whether a word occurs in a string but is not the first or last word. If so, I want to replace the spaces before and after the word with underscores.
Input:
$str = 'This is a cool area';
Output:
$str = 'This is a_cool_area';
I want to check that the word 'cool' is inside the string but is not the first or last word. If it is, remove the surrounding spaces and replace them with '_'.
You can use preg_replace to do this job, using this regex:
/(?<=\w)\s+(' . $word . ')\s+(?=\w)/i
which looks for the word surrounded by whitespace, with at least one word character beyond that whitespace on either side (to prevent matching at the beginning or end of the sentence). Usage in PHP:
$str = 'This is a cool area';
$word = 'cool';
$str = preg_replace('/(?<=\w)\s+(' . $word . ')\s+(?=\w)/i', '_$1_', $str);
echo $str . "\n";
$str = ' Cool areas are cool ';
$str = preg_replace('/(?<=\w)\s+(' . $word . ')\s+(?=\w)/i', '_$1_', $str);
echo $str . "\n";
Output:
This is a_cool_area
Cool areas are cool
Demo on 3v4l.org
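If the word comes from user input, it may contain regex metacharacters. A small sketch of the same replacement with the word escaped first; the function name joinInnerWord is an assumption, not part of the answer above.

```php
<?php
// Hypothetical helper: same pattern as above, with the word escaped.
function joinInnerWord(string $str, string $word): string
{
    // preg_quote() escapes metacharacters and the '/' delimiter.
    $quoted = preg_quote($word, '/');
    return preg_replace('/(?<=\w)\s+(' . $quoted . ')\s+(?=\w)/i', '_$1_', $str);
}

echo joinInnerWord('This is a cool area', 'cool'), "\n";
echo joinInnerWord('cool areas are cool', 'cool'), "\n";
```

The second call leaves the string unchanged, because the word only appears in first and last position.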
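function checkWord($str, $word)
{
    $arr = explode(" ", $str);
    // Search only the interior words; array_slice() reindexes from 0,
    // so add 1 to map the result back onto $arr.
    $key = array_search($word, array_slice($arr, 1, -1));
    if ($key === false) {
        return $str;
    }
    $key += 1;
    // Join the previous word, the word itself and the next word with '_',
    // and keep the rest of the sentence as it was.
    $before = array_slice($arr, 0, $key - 1);
    $joined = implode('_', array_slice($arr, $key - 1, 3));
    $after = array_slice($arr, $key + 2);
    return implode(' ', array_merge($before, [$joined], $after));
}
echo checkWord('This is a cool area', 'cool'); // This is a_cool_area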

How to get substring of string using starting and ending character in PHP?

I am trying to extract the middle part of a string using a starting and an ending character in PHP.
$str = 'Abc/hello#gmail.com/1267890(A-29)';
$agcodedup = substr($str, '(', -1);
$agcode = substr($agcodedup, 1);
final expected value of agcode:
$agcode = 'A-29';
You can use preg_match
$str = 'Abc/hello#gmail.com/1267890(A-29)';
if( preg_match('/\(([^)]+)\)/', $str, $match ) ) echo $match[1]."\n\n";
Outputs
A-29
You can check it out here
http://sandbox.onlinephpfunctions.com/code/5b6aa0bf9725b62b87b94edbccc2df1d73450ee4
Basically, the regular expression says:
\( starts the match with a literal open paren
( .. ) is a capture group
[^)]+ matches one or more characters that are not a close paren
\) ends the match with a literal close paren
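If a string can contain more than one parenthesised group, the same pattern can be used with preg_match_all. A short sketch follows; the input with a second group "(B-7)" is made up for illustration.

```php
<?php
// Hypothetical input with two parenthesised codes.
$str = 'Abc/hello#gmail.com/1267890(A-29)(B-7)';

// $matches[0] holds the full matches, $matches[1] the first capture group.
preg_match_all('/\(([^)]+)\)/', $str, $matches);

print_r($matches[1]);
```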
Oh and if you really have your heart set on substr here you go:
$str = 'Abc/hello#gmail.com/1267890(A-29)';
//this is the location/index of the ( OPEN_PAREN
//strpos is 0 based, so we add +1 to start just after the paren
$start = strpos( $str,'(') +1;
//this is the location/index of the ) CLOSE_PAREN.
$end = strpos( $str,')');
//we need the length of the substring for the third argument, not its index
$len = ($end-$start);
echo substr($str, $start, $len );
Outputs
A-29
And you can test this here
http://sandbox.onlinephpfunctions.com/code/88723be11fc82d88316d32a522030b149a4788aa
If it was me, I would benchmark both methods, and see which is faster.
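A rough way to run that comparison is sketched below. The function names viaRegex and viaSubstr and the iteration count are arbitrary choices, hrtime() requires PHP 7.3 or later, and the timings will vary by machine, so treat the numbers as indicative only.

```php
<?php
$str = 'Abc/hello#gmail.com/1267890(A-29)';

// Hypothetical wrapper for the preg_match approach.
function viaRegex(string $s): string
{
    return preg_match('/\(([^)]+)\)/', $s, $m) ? $m[1] : '';
}

// Hypothetical wrapper for the strpos/substr approach.
function viaSubstr(string $s): string
{
    $start = strpos($s, '(');
    $end = strpos($s, ')');
    if ($start === false || $end === false || $end < $start) {
        return ''; // no well-formed pair of parentheses
    }
    return substr($s, $start + 1, $end - $start - 1);
}

foreach (['viaRegex', 'viaSubstr'] as $fn) {
    $t0 = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < 100000; $i++) {
        $fn($str);
    }
    printf("%-10s %.2f ms\n", $fn, (hrtime(true) - $t0) / 1e6);
}
```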
Maybe this helps.
function getStringBetween($str, $from, $to, $withFromAndTo = false)
{
    // everything after the first occurrence of $from
    $sub = substr($str, strpos($str, $from) + strlen($from));
    // cut at the last occurrence of $to
    $sub = substr($sub, 0, strrpos($sub, $to));
    return $withFromAndTo ? $from . $sub . $to : $sub;
}
$inputString = "Abc/hello#gmail.com/1267890(A-29)";
$outputString = getStringBetween($inputString, '(', ')');
echo $outputString;
//output will be A-29
$outputString = getStringBetween($inputString, '(', ')', true);
echo $outputString;
//output will be (A-29)

Replacing comments in php with preg_replace

I need to replace all block comments with preg_replace() in php.
For example:
/**asdfasdf
fasdfasdf*/
echo "hello World\n";
For this:
echo "hello World\n";
I tried some solutions from this site, but none of them works for me.
My code:
$file = file_get_contents($fileinput);
$file = preg_replace('/\/\*([^\\n]*[\\n]?)*\*\//', '', $file);
echo $file;
For this example, my output is the same as the input.
Use token_get_all() (http://www.php.net/manual/en/function.token-get-all.php):
$file = file_get_contents($fileinput);
$tokens = token_get_all($file); // prepend an open tag if your file doesn't have one
$plain = '';
foreach ($tokens as $token) {
if (is_array($token)) {
list($number, $string) = $token;
if (!in_array($number, [T_OPEN_TAG, T_COMMENT, T_DOC_COMMENT])) { // add all tokens you don't want
$plain .= $string;
}
} else {
$plain .= $token;
}
}
print_r($plain);
Output:
echo "hello World\n";
Here is a list of all PHP tokens:
http://www.php.net/manual/en/tokens.php
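The advantage of the tokenizer is that comment-like text inside string literals is left alone. A minimal sketch that shows this is given below; the function name stripComments and the sample source are assumptions, and unlike the snippet above it keeps the opening tag.

```php
<?php
// Hypothetical helper built on token_get_all().
function stripComments(string $source): string
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            [$id, $text] = $token;
            if ($id === T_COMMENT || $id === T_DOC_COMMENT) {
                continue; // drop comment tokens of every style
            }
            $out .= $text;
        } else {
            $out .= $token; // single-character tokens such as ';' or '{'
        }
    }
    return $out;
}

$code = "<?php\n/**asdfasdf\nfasdfasdf*/\necho \"hello /* not a comment */\";\n";
echo stripComments($code);
```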
Try this (the s modifier lets . match newlines, so multi-line comments are covered):
$file = preg_replace('/^\s*?\/\*.*?\*\//ms', '', $file);
The best way to parse PHP code is to use the tokenizer.
However, it is not too difficult to do with a regex; you only need to skip all strings:
$pattern = <<<'EOD'
~
(?(DEFINE)
(?<sq> ' (?>[^'\\]++|\\{2}|\\.)* ' ) # single quotes
(?<dq> " (?>[^"\\]++|\\{2}|\\.)* " ) # double quotes
(?<hd> <<< \s* (["']?)(\w+)\g{-2} \R .*? (?<=\n) \g{-1} ;? (\R|$) ) # heredoc like
(?<string> \g<sq> | \g<dq> | \g<hd>)
)
\g<string> (*SKIP)(*FAIL) | /\* .*? \*/
~xs
EOD;
$result = preg_replace($pattern, '', $data);
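To see the skip-strings idea in isolation, here is a reduced sketch of the pattern. It handles only single- and double-quoted strings, not heredocs, and it can be fooled by an apostrophe inside a // comment. The sample $data is made up.

```php
<?php
// Simplified pattern: quoted strings are matched and then discarded with
// (*SKIP)(*FAIL), so a "comment" inside a string is never reached.
$pattern = <<<'EOD'
~
  ' (?: [^'\\] | \\. )* ' (*SKIP)(*FAIL)   # skip single-quoted strings
| " (?: [^"\\] | \\. )* " (*SKIP)(*FAIL)   # skip double-quoted strings
| /\* .*? \*/                              # remove block comments
~xs
EOD;

$data = "/* remove me */\necho '/* keep me */';\n";
$result = preg_replace($pattern, '', $data);

echo $result;
```

The x modifier allows the whitespace and # comments inside the pattern, and the s modifier lets . span newlines in multi-line comments.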
