PHP : Separate alphanumeric word with space in string - php

How can I separate alphanumeric values with a space in one statement?
Example:
$arr = "new stackoverflow 244code 7490script design";
How can I separate the letters and the numbers with a space, like this:
$arr = "new stackoverflow 244 code 7490 script design";

You can use the preg_split() function:
PHP
print_r(preg_split('#(?<=\d)(?=[a-z])#i', "new stackoverflow 244code 7490script design"));
Result
Array ( [0] => new stackoverflow 244 [1] => code 7490 [2] => script design )
You can also use the preg_replace() function:
PHP
echo preg_replace('#(?<=\d)(?=[a-z])#i', ' ', "new stackoverflow 244code 7490script design");
Result
new stackoverflow 244 code 7490 script design
Hope this helps!

You may use preg_replace (Example):
$arr = "new stackoverflow 244code 7490script design";
$newstr = preg_replace('#(?<=\d)(?=[a-z])#i', ' ', $arr);
echo $newstr; // new stackoverflow 244 code 7490 script design
The regex pattern is taken from user1153551's answer.

Use preg_replace like this:
$new = preg_replace('/(\d)([a-z])/i', "$1 $2", $arr);
(\d) matches and captures a digit. ([a-z]) matches and captures a letter. The replacement puts back the digit, adds a space, and puts back the letter.
If you don't want to use backreferences, you can use lookarounds:
$new = preg_replace('/(?<=\d)(?=[a-z])/i', ' ', $arr);
If you want to insert a space between a letter and a following number as well:
$new = preg_replace('/(?<=\d)(?=[a-z])|(?<=[a-z])(?=\d)/i', ' ', $arr);
(?<=\d) is a positive lookbehind that makes sure that there is a digit before the current position.
(?=[a-z]) is a positive lookahead that makes sure that there is a letter right after the current position.
Similarly, (?<=[a-z]) makes sure there's a letter before the current position and (?=\d) makes sure there's a digit right after the current position.
A different alternative would be to split and join back with spaces:
$new_arr = preg_split('/(?<=\d)(?=[a-z])/i', $arr);
$new = implode(' ', $new_arr);
Or...
$new = implode(' ', preg_split('/(?<=\d)(?=[a-z])/i', $arr));
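To see how the one-directional and the two-directional lookaround patterns differ, here is a minimal sketch. The input string is a hypothetical example chosen so that a letter is also followed by a digit; it is not from the question.

```php
<?php
// Hypothetical input: contains both digit->letter and letter->digit transitions.
$input = 'abc123def 244code';

// Only inserts a space where a digit is followed by a letter.
$oneWay = preg_replace('/(?<=\d)(?=[a-z])/i', ' ', $input);

// Also inserts a space where a letter is followed by a digit.
$bothWays = preg_replace('/(?<=\d)(?=[a-z])|(?<=[a-z])(?=\d)/i', ' ', $input);

echo $oneWay, "\n";   // abc123 def 244 code
echo $bothWays, "\n"; // abc 123 def 244 code
```

Because lookarounds match a position rather than characters, nothing from the original string is consumed or needs to be put back.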

preg_split
preg_split — Split string by a regular expression
<?php
// split at every position where a digit is followed by a letter
$matches = preg_split('#(?<=\d)(?=[a-z])#i', "new stackoverflow 244code 7490script design");
// join the three resulting pieces back together with spaces
echo $matches[0] . ' ' . $matches[1] . ' ' . $matches[2];
?>

Related

How to replace all occurrences of a character except the first one in PHP using a regular expression?

Given an address stored as a single string with newlines delimiting its components like:
1 Street\nCity\nST\n12345
The goal would be to replace all newline characters except the first one with spaces in order to present it like:
1 Street
City ST 12345
I have tried methods like:
[$street, $rest] = explode("\n", $input, 2);
$output = "$street\n" . preg_replace('/\n+/', ' ', $rest);
I have been trying to achieve the same result using a one liner with a regular expression, but could not figure out how.
I would suggest not solving this with a complicated regex but keeping it simple, as below. You can split the string on \n, shift off the first piece, and implode the rest with a space.
<?php
$input = explode("\n","1 Street\nCity\nST\n12345");
$input = array_shift($input) . PHP_EOL . implode(" ", $input);
echo $input;
You could use a regex trick here by reversing the string, and then replacing every occurrence of \n provided that we can lookahead and find at least one other \n:
$input = "1 Street\nCity\nST\n12345";
$output = strrev(preg_replace("/\n(?=.*\n)/", " ", strrev($input)));
echo $output;
This prints:
1 Street
City ST 12345
You can use a lookbehind pattern to ensure that the matching line is preceded with a newline character. Capture the line but not the trailing newline character and replace it with the same line but with a trailing space:
preg_replace('/(?<=\n)(.*)\n/', '$1 ', $input)
Demo: https://onlinephp.io/c/5bd6d
You can use an alternation pattern that matches either the first two lines or a newline character, capture the first two lines without the trailing newline character, and replace the match with what's captured and a space:
preg_replace('/(^.*\n.*)\n|\n/', '$1 ', $input)
Demo: https://onlinephp.io/c/2fb2f
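As a quick check, the two single-call patterns above can be run side by side on the address from the question. This is only a sketch; the variable names are made up for illustration.

```php
<?php
$input = "1 Street\nCity\nST\n12345";

// Lookbehind variant: a line preceded by a newline loses its trailing newline.
$viaLookbehind = preg_replace('/(?<=\n)(.*)\n/', '$1 ', $input);

// Alternation variant: keep the first two lines intact, then replace every other newline.
$viaAlternation = preg_replace('/(^.*\n.*)\n|\n/', '$1 ', $input);

var_dump($viaLookbehind === "1 Street\nCity ST 12345"); // bool(true)
var_dump($viaAlternation === $viaLookbehind);           // bool(true)
```

In the alternation variant, $1 is empty whenever the second branch matches, so a lone newline is replaced by just a space.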
Here is another method. It works as long as the address always has exactly four lines:
$string = explode("\n", "1 Street\nCity\nST\n12345");
echo $string[0] . "<br>";
echo $string[1] . " " . $string[2] . " " . $string[3];

PHP Word Replacing Issue

This problem shouldn't be that hard, but I searched Stack Overflow and couldn't find anything that works the way I want or that I can understand. Here's what I'm asking:
Imagine there is a text like:
"hi today the temperature is high"
I'd like to replace the string "hi" with "al", but I don't want the word "high" to become "algh". I know I need to use the preg_replace function, but I couldn't make it work.
PS: I would be even happier if you could show a solution with arrays too, i.e. one array of strings to be replaced and one array of replacement strings.
I appreciate your help, thanks :)
You can use regex with \b to make it work.
$string = 'hi today the temperature is high';
$pattern = '/\bhi\b/';
$replacement = 'al';
echo preg_replace($pattern, $replacement, $string);
\b assert position at a word boundary (^\w|\w$|\W\w|\w\W)
https://regex101.com/r/WdQTMp/2
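The question also asks for an array-based version. A minimal sketch of that, assuming a hypothetical $map of words to replacements (the second entry is invented for illustration), could build one \b-delimited pattern per word:

```php
<?php
// Hypothetical word map: keys are the words to find, values are their replacements.
$map = ['hi' => 'al', 'today' => 'now'];

// Build one whole-word pattern per key; preg_quote() escapes any regex metacharacters.
$patterns = [];
foreach (array_keys($map) as $word) {
    $patterns[] = '/\b' . preg_quote($word, '/') . '\b/';
}

$result = preg_replace($patterns, array_values($map), 'hi today the temperature is high');
echo $result; // al now the temperature is high
```

preg_replace() applies the patterns one after another, so a replacement produced by an earlier pattern can still be changed by a later one.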
I would recommend using a negative lookahead against the non-whitespace character \S.
This results in the simple regex hi(?!\S):
<?php
$string = "hi today the temperature is high";
$string2 = preg_replace('/hi(?!\S)/', 'al', $string);
echo $string2; // "al today the temperature is high";
Note that this only checks what comes after hi, so it would still match the hi at the end of a word like sushi. To exclude words that have text before hi, you'll need a negative lookbehind as well:
<?php
$string = "I eat sushi - hi today the temperature is high";
$string2 = preg_replace('/(?<!\S)hi(?!\S)/', 'al', $string);
echo $string2; // "I eat sushi - al today the temperature is high";
Hope this helps! :)
Note that str_replace() with arrays applies the replacements one after another. For example:
<?php
$arrFrom = array("1","2","3","B");
$arrTo = array("A","B","C","D");
$word = "ZBB2";
echo str_replace($arrFrom, $arrTo, $word);
?>
I would expect the result to be "ZDDB".
However, this returns "ZDDD"
(because the "B" produced from "2" is then replaced by "D" in the last step).
To make this work, use strtr() instead:
<?php
$arr = array("1" => "A","2" => "B","3" => "C","B" => "D");
$word = "ZBB2";
echo strtr($word,$arr);
?>
This returns: "ZDDB"

Using preg_replace to modify first space, but not inside a group of words using PHP

I would like to replace commas and potential spaces (i.e. that user can type or not) of an expression using preg_replace.
$expression = 'alfa,beta, gamma gmm, delta dlt, epsilon psln';
but I was unable to format the output as I want:
'alfa|beta|gamma gmm|delta dlt|epsilon psln'
Amongst others I tried this:
preg_replace (/,\s+/, '|', $expression);
and although it was the closest I got, it's not yet right. With code above I receive:
alfa,beta|gamm|gmm|delt|dlt|epsilo|psl|
Then I tried this (with | = OR):
preg_replace (/,\s+|,/, '|', $expression);
and although I solved the problem with the comma, it is still wrong:
alfa|beta|gamm|gmm|delt|dlt|epsilo|psl|
What should I do to only delete space after comma and not inside the word-group?
Many thanks in advance!
Use ,\s* instead of ,\s+ and replace the matched characters with the | symbol. ,\s+ matches a comma followed by one or more whitespace characters, so it misses commas that are not followed by a space. Making the whitespace optional (zero or more times) also matches those bare commas.
Code:
<?php
$string = 'alfa,beta, gamma gmm, delta dlt, epsilon psln';
$pattern = "~,\s*~";
$replacement = "|";
echo preg_replace($pattern, $replacement, $string);
?>
Output:
alfa|beta|gamma gmm|delta dlt|epsilon psln
How about using regular PHP functions to achieve this?
<?php
$expression = 'alfa,beta, gamma gmm, delta dlt, epsilon psln';
$pieces = explode(',', $expression);
foreach ($pieces as $k => $v) {
    $pieces[$k] = trim($v);
}
$result = implode('|', $pieces);
echo $result;
?>
Output:
alfa|beta|gamma gmm|delta dlt|epsilon psln
This will distinguish between spaces at start/end of piece and spaces in pieces.
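The two ideas above can also be combined. As a sketch (not taken from either answer), preg_split() can split on a comma together with any whitespace around it, which makes the separate trim() loop unnecessary:

```php
<?php
$expression = 'alfa,beta, gamma gmm, delta dlt, epsilon psln';

// Split on a comma plus any whitespace on either side of it;
// whitespace inside a piece ("gamma gmm") is not next to a comma and is kept.
$pieces = preg_split('/\s*,\s*/', trim($expression));

$result = implode('|', $pieces);
echo $result; // alfa|beta|gamma gmm|delta dlt|epsilon psln
```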

regex for matching three specific character

While attempting a question on SO, I tried to write a regular expression that matches strings containing three specific characters.
I am following the answer to "Regular Expressions: Is there an AND operator?"
<?php
$words = "systematic,gear,synthesis,mysterious";
$words=explode(",",$words);
$your_array = preg_grep("/^(^s|^m|^e)/", $words);
print_r($your_array);
?>
The output should be systematic and mysterious, but I am getting synthesis as well.
Why is that? What am I doing wrong?
** I don't want a new solution :)
You can do this:
$wordlist = 'systematic,gear,synthesis,mysterious';
$words = explode(',', $wordlist);
foreach ($words as $word) {
    if (preg_match('~(?=[^s]*s)(?=[^m]*m)(?=[^e]*e)~', $word)) {
        echo '<br/>' . $word;
    }
}
//or
$res = preg_grep('~(?=[^s]*s)(?=[^m]*m)(?=[^e]*e)~', $words);
print_r($res);
To test the presence of a character in the string, I use (?=[^s]*s).
[^s]*s means anything that is not an "s", zero or more times, followed by an "s".
(?=..) is a lookahead assertion and means "followed by". It is only a check: a lookahead contributes no characters to the match result. The main benefit of this feature is that you can check the same substring several times.
What is wrong with your pattern?
/^(^s|^m|^e)/ will give you only words that begin with "s", "m" or "e", because ^ is an anchor that means "start of the string". In other words, your pattern is the same as /^([sme])/.
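The difference between the two patterns can be checked directly on the word list from the question. In this sketch the variable names are illustrative only:

```php
<?php
$words = explode(',', 'systematic,gear,synthesis,mysterious');

// Equivalent to the question's pattern: the word merely starts with s, m or e.
$startsWith = array_values(preg_grep('/^[sme]/', $words));

// Three lookaheads: the word must contain an s AND an m AND an e, in any order.
$containsAll = array_values(preg_grep('~(?=[^s]*s)(?=[^m]*m)(?=[^e]*e)~', $words));

print_r($startsWith);  // systematic, synthesis, mysterious
print_r($containsAll); // systematic, mysterious
```

preg_grep() preserves the original array keys, so array_values() is used here only to get consecutive indexes.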

Removing long words regex

I would like to know how I can remove long words from a string, i.e. words longer than n characters.
I tried the following:
//remove words which have more than 5 characters from string
$s = 'abba bbbbbbbbbbbb 1234567 zxcee ytytytytytytytyt zczc xyz';
echo preg_replace("~\s(.{5,})\s~isU", " ", $s);
Gives the Output (which is incorrect):
abba 1234567 ytytytytytytytyt zczc xyz
Use this regex: \b\w{5,}\b. It will match long words.
\b - word boundary
\w{5,} - alphanumeric 5 or more repetitions
\b - word boundary
<?php
//remove words which have 5 or more characters from string
$s = 'abba bbbbbbbbbbbb 1234567 zxcee ytytytytytytytyt zczc xyz';
$patterns = array(
'long_words' => '/[^\s]{5,}/',
'multiple_spaces' => '/\s{2,}/'
);
$replacements = array(
'long_words' => '',
'multiple_spaces' => ' '
);
echo trim(preg_replace($patterns, $replacements, $s));
?>
Output:
abba zczc xyz
Update, to address the issue you presented in the comments. You can do it like this:
<?php
//remove words which have 5 or more characters from string
$s = '123&nbsp;ReallyLongStringComesHere&nbsp;123';
$patterns = array(
    'html_space' => '/&nbsp;/',
    'long_words' => '/[^\s]{5,}/',
    'multiple_spaces' => '/\s{2,}/'
);
$replacements = array(
    'html_space' => ' ',
    'long_words' => '',
    'multiple_spaces' => ' '
);
echo str_replace(' ', '&nbsp;', trim(preg_replace($patterns, $replacements, $s)));
?>
Output:
123&nbsp;123
A better approach may be to use regular string manipulation instead of a regex. A simple explode/implode with strlen will do nicely. It depends on the size of your string, of course, but for your example it should be fine.
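A minimal sketch of that explode/implode idea follows. The $maxLength threshold is a hypothetical name, set to 4 so that words of 5 or more characters are dropped, matching the regex answers:

```php
<?php
$s = 'abba bbbbbbbbbbbb 1234567 zxcee ytytytytytytytyt zczc xyz';
$maxLength = 4; // hypothetical threshold: keep words of at most 4 characters

// Split on single spaces, keep only the short words, and join them again.
$kept = array_filter(explode(' ', $s), function ($word) use ($maxLength) {
    return strlen($word) <= $maxLength;
});

$result = implode(' ', $kept);
echo $result; // abba zczc xyz
```

strlen() counts bytes, so for multibyte text mb_strlen() would be the safer choice.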
You're close:
preg_replace("~\w{5,}~", "", $s);
Working codepad example: http://codepad.org/c5AN1E6M
Also, you'll want to collapse multiple spaces into one:
preg_replace("~ +~", " ", $s);
PHP's preg functions have no global modifier g: preg_replace() already replaces every match, and preg_match_all() returns every match.
Summary:
Any answer whose pattern starts or ends with \s will fail to remove words at the beginning and the end of the string (and you should use a test string that fails with these!).
\b doesn't fail like that, but it won't remove the surrounding whitespace. You can combine it with one of the suggested double-space removers, but that won't preserve whitespace that was duplicated in the original (this may not be a problem).
explode+implode has the nice property that it preserves duplicated whitespace, but you have to do it for every whitespace character.
An alternative for preserving whitespace (which I haven't seen here) is to use two patterns, one starting with \b and ending with \s, and another one starting with \s and ending with $.
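The first two points of the summary can be illustrated with a test string that has long words at both ends. This is a sketch with a hypothetical input, not code from any of the answers:

```php
<?php
// Hypothetical test string: long words at the very start and the very end.
$s = 'bbbbbbbbbbbb abba ytytytytytytytyt';

// Pattern delimited by \s: neither long word has whitespace on both sides, so nothing is removed.
$withSpaces = preg_replace('~\s\S{5,}\s~', ' ', $s);

// Pattern delimited by \b: both long words are removed, then leftover whitespace is collapsed and trimmed.
$withBoundaries = trim(preg_replace('~\s+~', ' ', preg_replace('~\b\w{5,}\b~', '', $s)));

echo $withSpaces, "\n";     // bbbbbbbbbbbb abba ytytytytytytytyt
echo $withBoundaries, "\n"; // abba
```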
