Hey all. I have a string of names separated by commas. I'm exploding this string into an array of names with the comma as the delimiter. I need a RegEx to remove whitespace (if any) only after a comma, and not the whitespace between the first and last name.
So as an example:
$nameStr = "Sponge Bob,Bart Simpson, Ralph Kramden,Uncle Scrooge,Mickey Mouse";
See the space before Ralph Kramden? I need to have that space removed and not the space between the names. And I need to have any other spaces before names that would occur removed as well.
P.S.: I've noticed an interesting behavior regarding white space and this situation. Take the following example:
When not line breaking an echo like so:
$nameStr = "Sponge Bob,Bart Simpson, Ralph Kramden,Uncle Scrooge,Mickey Mouse";
$nameArray = explode(",", $nameStr);
foreach($nameArray as $value)
{
echo $value;
}
The result is: Sponge BobBart Simpson Ralph KramdenUncle ScroogeMickey Mouse
Notice the white space still there before Ralph Kramden
However, when line breaking the above echo like so:
echo $value . "<br />";
The output is:
Sponge Bob
Bart Simpson
Ralph Kramden
Uncle Scrooge
Mickey Mouse
And all of the names line up with what appears to be no white space before the name.
So what exactly is PHP's behavior regarding a white space at the start of a string?
Cheers all. Thanks for replies.
What's with today's fixation on using regexp to solve every little problem?
$nameStr = "Sponge Bob,Bart Simpson, Ralph Kramden,Uncle Scrooge,Mickey Mouse";
$nameArray = explode(",", $nameStr);
foreach($nameArray as $value)
{
echo trim($value);
}
EDIT
PHP's behaviour re white space is to treat it as the appropriate character and do what it's told by you.
HTML's behaviour (or at least that of web browsers) is rather different... and you'll need to learn and understand that difference
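To see the difference for yourself, here is a minimal sketch (the variable names are my own) that inspects the exploded value on the PHP side, where the leading space is still part of the string. The browser only hides it because HTML rendering collapses whitespace at the start of a line.

```php
<?php
$nameStr = "Sponge Bob,Bart Simpson, Ralph Kramden,Uncle Scrooge,Mickey Mouse";
$nameArray = explode(",", $nameStr);

// PHP keeps the leading space: it is part of the third element.
$raw = $nameArray[2];
$rawLength = strlen($raw);            // length including the leading space
$trimmedLength = strlen(trim($raw));  // length without it

// var_dump() prints the exact string and its length, so the space is visible
// even though a browser would not render it after a <br />.
var_dump($raw);
```

Viewing the page source in the browser (rather than the rendered page) shows the same thing: the space is in the output, it is just not displayed.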
Try
$nameStr = preg_replace("/,([\s])+/",",",$nameStr);
$nameArray = explode(",", $nameStr);
This is a workable regex solution, but as others have pointed out above, a simple trim() will do the job with what you already have.
Since you mentioned removing white space only after the comma, I assume that a space before the comma can be left alone.
You can also use below:
$nameStr = "Sponge Bob, Bart Simpson, Ralph Kramden,Uncle Scrooge,Mickey Mouse";
while (strpos($nameStr, ', ') !== FALSE) {
$nameStr = str_replace(', ', ',', $nameStr);
}
echo $nameStr;
After this, you can simply explode it as:
$allNames = explode(',', $nameStr);
Otherwise the regex by Michael is very good.
Why don't you just preg_split?
$names = preg_split('~,\s*~', $names);
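As a slightly extended sketch of the same idea (the sample string and the extra flags are my own additions, not part of the answer), the pattern can also absorb spaces before a comma, and PREG_SPLIT_NO_EMPTY drops empty pieces such as the one a trailing comma would produce:

```php
<?php
$nameStr = " Sponge Bob ,Bart Simpson,   Ralph Kramden,Uncle Scrooge , Mickey Mouse";

// Split on a comma plus any whitespace around it. Spaces inside a name
// are untouched because they are not adjacent to a comma.
// trim() handles whitespace at the very start and end of the whole string.
$names = preg_split('~\s*,\s*~', trim($nameStr), -1, PREG_SPLIT_NO_EMPTY);

print_r($names);
```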
PHP couldn't care less what's in a string, unless it's parsing it for variable interpolation. A space is like any other character, an ascii or unicode value that just happens to show up as a "blank".
How are you replacing those post-comma spaces?
$str = str_replace(', ', ',', $str); // the search string is a comma followed by a space
If that's not catching the space before your Ralph name, then most likely whatever's there isn't a space character (ascii 32). You can try displaying what it is with:
echo ord(substr($nameArray['index of Ralph here'], 0, 1));
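As an illustration of that diagnosis, the sketch below assumes (purely for the example) that the stray character is a UTF-8 non-breaking space, which is a common culprit when text is pasted from a web page. It shows what ord() and bin2hex() report and one way to strip it:

```php
<?php
// Assumption for illustration: the "space" is really a UTF-8 non-breaking
// space (bytes C2 A0) rather than an ASCII space (byte 20).
$name = "\u{00A0}Ralph Kramden";

$firstByte = ord($name[0]);           // first byte of the two-byte sequence
$hex = bin2hex(substr($name, 0, 2));  // both bytes as hex

// trim() only strips ASCII whitespace by default, so the NBSP survives.
$stillDirty = trim($name);

// A Unicode-aware pattern removes it; \x{00A0} is the NBSP code point.
$clean = preg_replace('/^[\s\x{00A0}]+|[\s\x{00A0}]+$/u', '', $name);

echo $firstByte, " ", $hex, " ", $clean, "\n";
```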
Have you tried it with the trim function?
$nameStr = "Sponge Bob,Bart Simpson, Ralph Kramden,Uncle Scrooge,Mickey Mouse";
$nameArray = explode(",", $nameStr);
for($i = 0; $i < count($nameArray); $i++) {
$nameArray[$i] = trim($nameArray[$i]);
}
$newNameStr = implode(',', $nameArray);
You can do this with a RegExp, but since it looks like you aren't very experienced with RegExp you shouldn't use them, because when doing it wrong they cost a good chunk of performance.
An easy solution is to avoid using regex and apply the trim function:
$nameStr = 'Bart Simpson, Lisa Simpson,Homer Simpson, Madonna';
$names = explode(',', $nameStr);
foreach($names as &$name) {
$name = trim($name);
}
unset($name); // break the reference left over from the loop
//this also doesn't falter when a name is only one word
this one works for me
$string = "test1, test2, test3, test4, , ,";
array_map('trim', array_filter(explode(',', $string), 'trim'))
output
=> [
"test1",
"test2",
"test3",
"test4",
]
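One caveat with filtering through 'trim': an item that is exactly "0" is falsy in PHP and gets dropped as well. A minimal sketch of a variation (the sample string with a "0" item is my own) that trims first and then removes only the truly empty strings:

```php
<?php
$string = "test1, test2, 0, test4, , ,";

// The approach above: "0" is falsy, so it disappears along with the blanks.
$original = array_map('trim', array_filter(explode(',', $string), 'trim'));

// Variation: trim first, then filter with 'strlen' so only empty strings go.
// array_values() re-indexes the result from 0.
$items = array_map('trim', explode(',', $string));
$items = array_values(array_filter($items, 'strlen'));

print_r($items);
```

If "0" can never be a valid item in your data, the shorter original version is fine.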
I have an array with rule field that has a string like this:
FREQ=MONTHLY;BYDAY=3FR
FREQ=MONTHLY;BYDAY=3SA
FREQ=WEEKLY;UNTIL=20170728T080000Z;BYDAY=MO,TU,WE,TH,FR
FREQ=MONTHLY;UNTIL=20170527T100000Z;BYDAY=4SA
FREQ=WEEKLY;BYDAY=SA
FREQ=WEEKLY;INTERVAL=2;BYDAY=TH
FREQ=WEEKLY;BYDAY=TH
FREQ=WEEKLY;UNTIL=20170610T085959Z;BYDAY=SA
FREQ=MONTHLY;BYDAY=2TH
Each line comes from a different array; I am showing a few examples to give an idea of what I need.
What I need is to write a regex that would take off all unnecessary values.
So, I don't need FREQ= ; BYDAY= etc. I basically need the values after = but each one I want to store in a different variable.
Taking third one as an example it would be:
$frequency = WEEKLY
$until = 20170728T080000Z
$day = MO, TU, WE, TH, FR
It doesn't have to be necessarily one regex, there can be one regex for each value. So I have one for FREQ:
preg_match("/[^FREQ=][A-Z]+/", $input_line, $output_array);
But I can't do it for the rest unfortunately, how can I solve this?
One way to go is PHP array destructuring:
$str = "FREQ=WEEKLY;UNTIL=20170728T080000Z;BYDAY=MO,TU,WE,TH,FR";
preg_match_all('~(\w+)=([^;]+)~', $str, $matches);
[$freq, $until, $byday] = $matches[2]; // As of PHP 7.1 (otherwise use list() function)
echo $freq, " ", $until, " ", $byday;
// WEEKLY 20170728T080000Z MO,TU,WE,TH,FR
Be more general
Using extract function:
preg_match_all('~(\w+)=([^;]+)~', $str, $m);
$m[1] = array_map('strtolower', $m[1]);
$vars = array_combine($m[1], $m[2]);
extract($vars);
echo $freq, " ", $until, " ", $byday;
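Note that positional destructuring assumes every line contains FREQ, UNTIL and BYDAY in that order, which is not true for all the sample lines, and extract() creates variables named after whatever keys the input contains. A possible sketch that avoids both by returning an associative array with defaults (parseRuleLine and the list of default keys are my own, hypothetical choices):

```php
<?php
// Hypothetical helper: returns the KEY=value pairs of one rule line as an
// associative array, with null for any known key that is absent.
function parseRuleLine(string $line): array
{
    $defaults = ['freq' => null, 'until' => null, 'interval' => null, 'byday' => null];
    if (!preg_match_all('~(\w+)=([^;]+)~', $line, $m)) {
        return $defaults;
    }
    // Lower-case the keys and pair them with their values.
    $pairs = array_combine(array_map('strtolower', $m[1]), $m[2]);
    // Values found in the line override the null defaults.
    return array_merge($defaults, $pairs);
}

$rule = parseRuleLine('FREQ=WEEKLY;INTERVAL=2;BYDAY=TH');
print_r($rule);
```

The caller can then test `$rule['until'] === null` instead of relying on a variable that may or may not exist.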
Notice: For this problem, I recommend the general approach #revo posted; it's concise, safe, and easy on the eyes. But keep in mind that regular expressions come with a performance penalty compared to fixed-string functions, so if you can use strpos/substr/explode/..., try to use them; don't 'knee-jerk' to a preg_-based solution.
Since the separators are fixed and don't seem to occur in the values you are interested in, and you furthermore rely on knowledge of the keys (FREQ, etc.), you don't need regular expressions (as much as I like to use them anywhere I can, and you can use them here). Why not simply explode and split in this case?
$lines = explode("\n", $text);
foreach($lines as $line) {
    $parts = explode(';', $line);
    $frequency = $until = $day = $interval = null;
    foreach($parts as $part) {
        list($key, $value) = explode('=', $part);
        switch($key) {
            case 'FREQ':
                $frequency = $value;
                break;
            case 'INTERVAL':
                $interval = $value;
                break;
            // and so on
        }
    }
    doSomethingWithTheValues();
}
This may be more readable and efficient if your use-case is as simple as stated.
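A compact, self-contained variation on the same fixed-string idea, in case you prefer an array over separate variables (parseRuleWithExplode is a hypothetical name, and splitting BYDAY into a list is my own assumption about what is useful):

```php
<?php
// Hypothetical helper using only fixed-string functions.
function parseRuleWithExplode(string $line): array
{
    $values = [];
    foreach (explode(';', trim($line)) as $part) {
        // Limit 2 keeps any further '=' inside the value.
        $pair = explode('=', $part, 2);
        if (count($pair) !== 2) {
            continue; // skip malformed or empty parts
        }
        $values[$pair[0]] = $pair[1];
    }
    // BYDAY may hold several days, so split it into a list when present.
    if (isset($values['BYDAY'])) {
        $values['BYDAY'] = explode(',', $values['BYDAY']);
    }
    return $values;
}

$parsed = parseRuleWithExplode('FREQ=WEEKLY;UNTIL=20170728T080000Z;BYDAY=MO,TU,WE,TH,FR');
print_r($parsed);
```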
You need to use the Pattern
;?[A-Z]+=
together with preg_split();
preg_split('/;?[A-Z]+=/', $str);
Explanation
; matches a semicolon
? makes the preceding character optional (zero or one occurrence)
[A-Z]+ matches one or more uppercase letters
= matches a literal =
If you want each line in a separate array, you can do it this way:
# split the input into one array element per line
$lines = preg_split('/[\n\r]+/', $str);
# initialize the array for the results
$results = array();
# loop through the lines
foreach($lines as $line){
# split each line by the regex mentioned above and
# put the resulting array into the results array
$results[] = preg_split('/;?[A-Z]+=/', $line);
}
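One detail worth knowing about this pattern: because it matches "FREQ=" at the very start of the line, a plain preg_split() returns an empty string as the first element. A short sketch (the variable names are mine) showing that, and the PREG_SPLIT_NO_EMPTY flag that removes it:

```php
<?php
$str = 'FREQ=WEEKLY;UNTIL=20170728T080000Z;BYDAY=MO,TU,WE,TH,FR';

// Without flags, the piece before the leading "FREQ=" is an empty string.
$withEmpty = preg_split('/;?[A-Z]+=/', $str);

// PREG_SPLIT_NO_EMPTY drops empty pieces; -1 means "no limit".
$values = preg_split('/;?[A-Z]+=/', $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($values);
```

Keep in mind that the keys are discarded, so the values are only identifiable by position.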
For my project I needed to analyze different sentences and work out which ones were questions by determining if they ended in question marks or not.
So I tried using explode but it didn't support multiple delimiters. I temporarily replaced all punctuation to be chr(1) so that I could explode all sentences no matter what they ended with (., !, ?, etc...).
Then I needed to find the last letter of each sentence however the explode function had removed all of the punctuation, so I needed some way of putting it back in there.
It took me a long time to solve the problem but eventually I cracked it. I am posting my solution here so that others may use it.
$array = preg_split('~([.!?:;])~u', $raw, -1, PREG_SPLIT_DELIM_CAPTURE);
Here is my function, multipleExplodeKeepDelimiters. And an example of how it can be used, by exploding a string into different sentences and seeing if the last character is a question mark:
function multipleExplodeKeepDelimiters($delimiters, $string) {
$initialArray = explode(chr(1), str_replace($delimiters, chr(1), $string));
$finalArray = array();
foreach($initialArray as $item) {
if(strlen($item) > 0) array_push($finalArray, $item . $string[strpos($string, $item) + strlen($item)]);
}
return $finalArray;
}
$punctuation = array(".", ";", ":", "?", "!");
$string = "I am not a question. How was your day? Thank you, very nice. Why are you asking?";
$sentences = multipleExplodeKeepDelimiters($punctuation, $string);
foreach($sentences as $question) {
if($question[strlen($question)-1] == "?") {
print("'" . $question . "' is a question<br />");
}
}
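For comparison, here is a sketch of how the preg_split() one-liner mentioned earlier could be turned into full sentences. splitSentences is a hypothetical name, and this is only one possible way to reassemble the pieces. Unlike the strpos() lookup above, it does not depend on each sentence being unique within the string.

```php
<?php
// Hypothetical alternative to multipleExplodeKeepDelimiters().
function splitSentences(string $text): array
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates between
    // sentence text (even indexes) and the punctuation that ended it (odd).
    $parts = preg_split('~([.!?:;])~u', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    $sentences = [];
    for ($i = 0; $i < count($parts); $i += 2) {
        // Re-attach the delimiter; the last piece may have none.
        $sentence = trim($parts[$i]) . ($parts[$i + 1] ?? '');
        if ($sentence !== '') {
            $sentences[] = $sentence;
        }
    }
    return $sentences;
}

$string = "I am not a question. How was your day? Thank you, very nice. Why are you asking?";
$questions = array_filter(splitSentences($string), fn($s) => substr($s, -1) === '?');
print_r($questions);
```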
Here is a long string like "abc,adbc,abcf,abc,adbc,abcf".
I want to use a regex to remove the duplicate strings, which are separated by commas.
The following is my code, but the result is not what I expect.
$a='abc,adbc,abcf,abc,adbc,abcf';
$b=preg_replace('/(,[^,]+,)(?=.*?\1)/',',',','.$a.',');
echo $b;
output:,adbc,abc,adbc,abcf,
It should be : ,abc,adbc,abcf,
Please point out my problem. Thanks.
Here I am sharing simple PHP logic instead of a regex:
$a='abc,adbc,abcf,abc,adbc,abcf';
$pieces = explode(",", $a);
$unique_values = array_unique($pieces);
$string = implode(",", $unique_values);
Here is a positive-lookahead-based attempt at a regex solution to the OP's problem.
$arr = array('ball ball code', 'abcabc bde bde', 'awycodeawy');
foreach($arr as $str)
echo "'$str' => '" . preg_replace('/(\w{2,})(?=.*?\\1)\W*/', '', $str) ."'\n";
OUTPUT
'ball ball code' => 'ball code'
'abcabc bde bde' => 'abc bde'
'awycodeawy' => 'codeawy'
As you can see, for the input 'awycodeawy' it produces 'codeawy' instead of 'awycode'. The reason is that a variable-length lookahead is possible, whereas a variable-length lookbehind is not.
You can also try
echo implode(",", array_unique(preg_split("/,/", $yourLongString)));
Try this....
$string='abc,adbc,abcf,abc,adbc,abcf';
$exp = explode(",", $string);
$arr = array_unique($exp);
$output=implode(',', $arr);
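As far as I can tell, the regex in the question fails because each match consumes the trailing comma, which is also the leading comma the next item would need, so every second item is skipped. The explode/array_unique approach from the answers above avoids that entirely. A small sketch wrapping it in a function (uniqueCsv is a hypothetical name, and the trimming is my own addition):

```php
<?php
// Hypothetical helper; trims each item so "abc" and " abc" count as equal.
function uniqueCsv(string $csv): string
{
    $items = array_map('trim', explode(',', $csv));
    // array_unique() keeps the first occurrence of each value.
    return implode(',', array_unique($items));
}

$result = uniqueCsv('abc,adbc,abcf,abc, adbc,abcf');
echo $result;
```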
I have some problems with preg_replace.
I want to turn mentions into links, but the name isn't a username.
So the names contain spaces. I found a good solution in principle, but I don't know how to implement it.
Basically, I want preg_replace to match the words that are between # and ,
For example:
#John Doeh, #Jenna Diamond, #Sir Duck Norman
and replace to
VAL
How do I do it?
I think that you want it like:
John Doeh
For this try:
$myString="#John Doeh, #Jenna Diamond, #Sir Duck Norman";
foreach(explode(',',$myString) as $str)
{
if (preg_match("/\\s/", $str)) { // there are spaces
$val=str_replace("#","",trim($str));
// escape the name for the URL and for the HTML output
echo "<a href='user.php?name=".urlencode($val)."'>".htmlspecialchars($val)."</a>";
}
}
Based on my assumption you want to remove strings which start with #Some Name, in a text like: #Some Name, this is a message.
Then replace that to an href, like: First_Name
If that is the case then the following regex will do:
$str = '#First_Name, say something';
echo preg_replace ( '/#([[:alnum:]\-_ ]+),.*/', '$1', $str );
Will output:
First_Name
I also added support for numbers, underscores and dashes. Are those valid in a name as well? Are any other characters valid in a #User Name? Those are things that are important to know.
Two methods:
<?php
// preg_replace method
$string = '#John Doeh, #Jenna Diamond, #Sir Duck Norman';
$result = preg_replace('/#([\w\s]+),?/', '$1', $string);
echo $result . "<br>\n";
// explode method
$arr = explode(',', $string);
$result2 = '';
foreach($arr as $name){
$name = trim($name, '# ');
$result2 .= ''.$name.' ';
}
echo $result2;
?>
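If the goal is to produce actual links, a callback makes it easy to escape each name. The sketch below is only an illustration: linkMentions is a hypothetical name, the user.php?name= URL is borrowed from the earlier answer as a placeholder, and it assumes every mention ends at a comma, at another #, or at the end of the string.

```php
<?php
// Hypothetical helper; the user.php?name= URL is only a placeholder
// for whatever the real profile link looks like.
function linkMentions(string $text): string
{
    return preg_replace_callback(
        '/#([^,#]+)/', // "#" followed by everything up to the next "," or "#"
        function (array $m): string {
            $name = trim($m[1]);
            // urlencode() for the query string, htmlspecialchars() for the text.
            return '<a href="user.php?name=' . urlencode($name) . '">'
                . htmlspecialchars($name) . '</a>';
        },
        $text
    );
}

$html = linkMentions('#John Doeh, #Jenna Diamond');
echo $html;
```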
Example Input: SMK SUNGAI PUNAI
My Code:
$school = 'SMK SUNGAI PUNAI';
echo ucwords(strtolower($school));
Unwanted Output: Smk Sungai Punai
Question
How can I make the output SMK Sungai Punai, so that SMK remains in ALL-CAPS?
Update.
The problem is that I have a list of 10,000 school names. I converted them from a PDF to MySQL, copying the school names exactly as they appear in the PDF: all in uppercase.
How can I implement conditional title-casing?
As far as I understand you want to have all school names with the first character of every word in uppercase and exclude some special words ($exceptions in my sample) from this processing.
You could do that like this:
function createSchoolName($school) {
$exceptions = array('SMK', 'PTS', 'SBP');
$result = "";
$words = explode(" ", $school);
foreach ($words as $word) {
if (in_array($word, $exceptions))
$result .= " ".$word;
else
$result .= " ".strtolower($word);
}
return trim(ucwords($result));
}
echo createSchoolName('SMK SUNGAI PUNAI');
This example would return SMK Sungai Punai as required by your question.
You can very simply create a pipe-delimited set of excluded words/acronyms, then use (*SKIP)(*FAIL) to prevent matching those whole words.
mb_convert_case() is an excellent function to call because it instantly provides TitleCasing and it is multibyte safe.
Code:
$pipedExclusions = 'SMK|USA|AUS';
echo preg_replace_callback(
'~\b(?:(?:' . $pipedExclusions . ')(*SKIP)(*FAIL)|\p{Lu}+)\b~u',
fn($m) => mb_convert_case($m[0], MB_CASE_TITLE),
'SMK SUNGAI PUNAI'
);
// SMK Sungai Punai
There's no really good way to do it. In this case you can assume it's an abbreviation because it's only three letters long and contains no vowels. You can write a set of rules that look for abbreviations in the string and then uppercase them, but in some cases it'll be impossible... consider "BOB PLAYS TOO MUCH WOW."
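To make that heuristic concrete, here is a rough sketch of the "at most three letters and no vowels" rule. The function names are hypothetical, and as noted above the rule is a guess about the data, not a reliable test for abbreviations.

```php
<?php
// Heuristic from the text above, not a reliable rule: a word of at most
// three letters with no vowels is assumed to be an abbreviation.
function looksLikeAbbreviation(string $word): bool
{
    return strlen($word) <= 3 && !preg_match('/[AEIOU]/i', $word);
}

function titleCaseSchool(string $school): string
{
    $words = array_map(
        // Keep assumed abbreviations upper-case, title-case everything else.
        fn($w) => looksLikeAbbreviation($w) ? strtoupper($w) : ucfirst(strtolower($w)),
        explode(' ', $school)
    );
    return implode(' ', $words);
}

echo titleCaseSchool('SMK SUNGAI PUNAI');
```

The "BOB PLAYS TOO MUCH WOW" example shows the limit of this approach: WOW contains a vowel, so it is title-cased like any other word.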
You can use something like this:
<?php
$str = 'SMK SUNGAI PUNAI';
$str = strtolower($str);
$arr = explode(" ", $str);
$new_str = strtoupper($arr[0]). ' ' .ucfirst($arr[1]). ' ' .ucfirst($arr[2]);
echo '<p>'.$new_str.'</p>';
// Result: SMK Sungai Punai
?>