How can I explode a string by one or more spaces or tabs?
Example:
A B C D
I want to make this an array.
$parts = preg_split('/\s+/', $str);
To separate by tabs:
$comp = preg_split("/\t+/", $var);
To separate by spaces/tabs/newlines:
$comp = preg_split('/\s+/', $var);
To separate by spaces alone:
$comp = preg_split('/ +/', $var);
This works:
$string = 'A B C D';
$arr = preg_split('/\s+/', $string);
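The author asked for explode, so you can use explode like this:
$resultArray = explode("\t", $inputString);
Note: you must use double quotes, not single quotes, so that \t is interpreted as a tab character. Unlike preg_split with \t+, explode splits on every single tab, so consecutive tabs produce empty elements.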
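To make the quoting difference concrete, here is a small sketch. The sample input is made up for illustration and is not from the original answer.

```php
<?php
// Hypothetical sample input: fields separated by single tabs, one of them doubled.
$inputString = "A\tB\t\tC";

// Double quotes: "\t" is a real tab character, so the string is split.
$byTab = explode("\t", $inputString);

// Single quotes: '\t' is a backslash followed by the letter t,
// which never occurs in the input, so nothing is split.
$byLiteral = explode('\t', $inputString);

var_export($byTab);     // ['A', 'B', '', 'C'] (the doubled tab yields an empty element)
var_export($byLiteral); // the whole input as a single element
```

If the empty elements are unwanted, preg_split with \t+ is probably the simpler choice.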
I think you want preg_split:
$input = "A B C D";
$words = preg_split('/\s+/', $input);
var_dump($words);
Instead of using explode, try preg_split: http://www.php.net/manual/en/function.preg-split.php
To account for a full-width space, such as in
full width
you can extend Ben's answer to this:
$searchValues = preg_split('#[\s\x{3000}]+#u', $searchString); // \x{3000} is the full-width (ideographic) space
Sources:
strip out multi-byte white space from a string PHP
What are all the Japanese whitespace characters?
(I don't have enough reputation to post a comment, so I wrote this as an answer.)
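A minimal sketch of splitting on the full-width space, assuming UTF-8 input and PHP 7 or later (for the "\u{3000}" escape). The sample string is invented for the example.

```php
<?php
// Assumed input: UTF-8 text containing an ideographic space (U+3000)
// and an ordinary ASCII space.
$searchString = "full\u{3000}width space";

// \x{3000} names the full-width space explicitly; the u modifier makes
// PCRE treat both the pattern and the subject as UTF-8.
$searchValues = preg_split('#[\s\x{3000}]+#u', $searchString, -1, PREG_SPLIT_NO_EMPTY);

var_export($searchValues); // ['full', 'width', 'space']
```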
Assuming $string = "\tA\t B \tC \t D "; (a mix of tabs and spaces including leading tab and trailing space)
Obviously splitting on just spaces or just tabs will not work. Don't use these:
preg_split('~ +~', $string) // one or more literal spaces, allow empty elements
preg_split('~ +~', $string, -1, PREG_SPLIT_NO_EMPTY) // one or more literal spaces, deny empty elements
preg_split('~\t+~', $string) // one or more tabs, allow empty elements
preg_split('~\t+~', $string, -1, PREG_SPLIT_NO_EMPTY) // one or more tabs, deny empty elements
Use these:
preg_split('~\s+~', $string) // one or more whitespace characters, allow empty elements
preg_split('~\s+~', $string, -1, PREG_SPLIT_NO_EMPTY) // one or more whitespace characters, deny empty elements
preg_split('~[\t ]+~', $string) // one or more tabs or spaces, allow empty elements
preg_split('~[\t ]+~', $string, -1, PREG_SPLIT_NO_EMPTY) // one or more tabs or spaces, deny empty elements
preg_split('~\h+~', $string) // one or more horizontal whitespace characters, allow empty elements
preg_split('~\h+~', $string, -1, PREG_SPLIT_NO_EMPTY) // one or more horizontal whitespace characters, deny empty elements
A demonstration of all techniques below can be found here.
Reference Horizontal Whitespace
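To see what "allow" versus "deny empty elements" means in practice, here is a short sketch using the sample string from this answer. The second input, containing a newline, is an assumption added to show how \h differs from \s.

```php
<?php
// Sample input from the answer: leading tab, trailing space, mixed delimiters.
$string = "\tA\t B \tC \t D ";

// Without the flag, the leading and trailing delimiters produce empty strings.
$withEmpty = preg_split('~\s+~', $string);

// PREG_SPLIT_NO_EMPTY drops them.
$withoutEmpty = preg_split('~\s+~', $string, -1, PREG_SPLIT_NO_EMPTY);

// \h matches horizontal whitespace only, so the newline is not a delimiter.
$horizontal = preg_split('~\h+~', "A \t B\nC", -1, PREG_SPLIT_NO_EMPTY);

var_export($withEmpty);    // ['', 'A', 'B', 'C', 'D', '']
var_export($withoutEmpty); // ['A', 'B', 'C', 'D']
var_export($horizontal);   // ['A', "B\nC"]
```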
Related
I have the following string:
$thetextstring = "jjfnj 948";
At the end I want to have:
echo $thetextstring; // should print jjf-nj948
So basically what I am trying to do is join the separated string and then separate the first 3 letters with a -.
So far I have
$string = trim(preg_replace('/\s+/', ' ', $thetextstring));
$result = explode(" ", $thetextstring);
$newstring = implode('', $result);
print_r($newstring);
I have been able to join the words, but how do I add the separator after the first 3 letters?
Use a regex with preg_replace function, this would be a one-liner:
^.{3}\K([^\s]*) *
Breakdown:
^ # Assert start of string
.{3} # Match 3 characters
\K # Reset match
([^\s]*) * # Capture everything up to space character(s) then try to match them
PHP code:
echo preg_replace('~^.{3}\K([^\s]*) *~', '-$1', 'jjfnj 948');
PHP live demo
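As a rough check of how the \K pattern behaves, the sketch below runs it over a few inputs. Only 'jjfnj 948' comes from the question; the other two strings are made-up cases (several spaces, and no space at all).

```php
<?php
// Pattern from the answer: keep the first three characters (\K drops them
// from the reported match), capture the rest of the first word, eat the spaces.
$pattern = '~^.{3}\K([^\s]*) *~';

// preg_replace accepts an array of subjects and returns an array of results.
$inputs = ['jjfnj 948', 'abcde   123', 'abcdef'];
$results = preg_replace($pattern, '-$1', $inputs);

var_export($results); // ['jjf-nj948', 'abc-de123', 'abc-def']
```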
Without knowing more about how your strings can vary, this is a working solution for your task:
Pattern:
~([a-z]{2}) ~ // 2 letters (contained in capture group1) followed by a space
Replace:
-$1
Demo Link
Code: (Demo)
$thetextstring = "jjfnj 948";
echo preg_replace('~([a-z]{2}) ~','-$1',$thetextstring);
Output:
jjf-nj948
Note that this pattern can easily be expanded to allow characters other than lowercase letters before the space: ~(\S{2}) ~
You can use str_replace to remove the unwanted space:
$newString = str_replace(' ', '', $thetextstring);
$newString:
jjfnj948
And then preg_replace to put in the dash:
$final = preg_replace('/^([a-z]{3})/', '\1-', $newString);
The meaning of this regex instruction is:
from the beginning of the line: ^
capture three a-z characters: ([a-z]{3})
replace this match with itself followed by a dash: \1-
$final:
jjf-nj948
$thetextstring = "jjfnj 948";
// replace all spaces with nothing
$thetextstring = str_replace(" ", "", $thetextstring);
// insert a dash after the third character
$thetextstring = substr_replace($thetextstring, "-", 3, 0);
echo $thetextstring;
This gives the requested jjf-nj948
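The two steps can be wrapped in a small function. This is only a sketch: the name insertSeparator and its parameters are hypothetical, chosen for illustration.

```php
<?php
// Hypothetical helper: strip spaces, then insert $separator after the
// first $position characters.
function insertSeparator(string $text, int $position = 3, string $separator = '-'): string
{
    $joined = str_replace(' ', '', $text);
    // A length of 0 makes substr_replace insert instead of overwrite.
    return substr_replace($joined, $separator, $position, 0);
}

echo insertSeparator('jjfnj 948'); // jjf-nj948
```

Keeping the position and separator as parameters is a design choice that makes the same helper usable for other formats.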
Your approach is correct. For the last step, which consists of inserting a - after the third character, you can use the substr_replace function as follows:
$thetextstring = 'jjfnj 948';
$string = trim(preg_replace('/\s+/', ' ', $thetextstring));
$result = explode(' ', $string); // split the normalized string, not the original
$newstring = substr_replace(implode('', $result), '-', 3, 0); // length 0 inserts without overwriting
If you are confident enough that your string will always have the same format (characters followed by a whitespace followed by numbers), you can also reduce your computations and simplify your code as follows:
$thetextstring = 'jjfnj 948';
$newstring = substr_replace(str_replace(' ', '', $thetextstring), '-', 3, 0);
Visit this link for a working demo.
Oldschool without regex
$test = "jjfnj 948";
$test = str_replace(" ", "", $test); // strip all spaces from string
echo substr($test, 0, 3)."-".substr($test, 3); // isolate first three chars, add hyphen, and concat all characters after the first three
I'm trying to convert plain links to HTML links using preg_replace. However it's replacing links that are already converted.
To combat this I'd like it to ignore the replacement if the link starts with a quote.
I think a positive lookahead may be needed but everything I've tried hasn't worked.
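$string = '<a href="http://www.example.com">test</a> http://www.example.com';
$string = preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $string);
var_dump($string);
The above outputs:
<a href="<a href="http://www.example.com">http://www.example.com</a>">test</a> <a href="http://www.example.com">http://www.example.com</a>
When it should output:
<a href="http://www.example.com">test</a> <a href="http://www.example.com">http://www.example.com</a>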
You might get along with lookarounds.
Lookarounds are zero-width assertions that check what does (or does not) appear immediately before or after the match. They do not consume any characters.
That being said, a negative lookbehind might be what you need in your situation:
(?<![">])\bhttps?://\S+\b
In PHP this would be:
<?php
$string = 'I want to be transformed to a proper link: http://www.google.com ';
$string .= 'But please leave me alone ';
$string .= '(https://www.google.com).';
$regex = '~ # delimiter
(?<![">]) # a neg. lookbehind
https?://\S+ # http:// or https:// followed by not a whitespace
\b # a word boundary
~x'; # verbose to enable this explanation.
$string = preg_replace($regex, "<a href='$0'>$0</a>", $string);
echo $string;
?>
See a demo on ideone.com. However, maybe a parser is more appropriate.
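Here is a compact sketch of the negative lookbehind applied to a string that already contains one anchor. The input URLs are invented for the example, and the replacement uses double-quoted attributes, which is an assumption rather than something the answer prescribes.

```php
<?php
// Made-up input: one link that is already an anchor, one plain URL.
$string = '<a href="http://a.example/x">site</a> and http://b.example/y';

// Skip any URL that is directly preceded by a double quote or ">".
$regex = '~(?<![">])\bhttps?://\S+\b~';

$linked = preg_replace($regex, '<a href="$0">$0</a>', $string);

echo $linked;
// <a href="http://a.example/x">site</a> and <a href="http://b.example/y">http://b.example/y</a>
```

Note that this only protects anchors whose href uses double quotes; a single-quoted href would still be rewritten, which is one reason a parser may be the safer route.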
Since you can use Arrays in preg_replace, this might be convenient to use depending on what you want to achieve:
<?php
$string = 'test http://www.example.com';
$rx = array("&(<a.+https?:\/\/[\w]+[^ \,\"\n\r\t<]*>)(.*)(<\/a\>)&si", "&(\s){1,}(https?:\/\/[\w]+[^ \,\"\n\r\t<]*)&");
$rp = array("$1$2$3", "$2");
$string = preg_replace($rx,$rp, $string);
var_dump($string);
// DUMPS:
// 'testhttp://www.example.com'
The Idea
You can split your string at the already existing anchors, and only parse the pieces in between.
The Code
$input = '<a href="http://www.example.com">test</a> http://www.example.com';
// Split the string at existing anchors.
// The PREG_SPLIT_DELIM_CAPTURE flag includes the delimiters in the result set;
// flags are the fourth argument, so the limit (-1, no limit) must be passed too.
$parts = preg_split('/(<a.*?>.*?<\/a>)/is', $input, -1, PREG_SPLIT_DELIM_CAPTURE);
// Use array_map to parse each piece, and then join all pieces together
$output = join('', array_map(function ($key, $part) {
// Because we return the delimiter in the result set,
// every $part with an odd key is an anchor and is left untouched.
return $key % 2
? $part
: preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", '<a href="$1">$1</a>', $part);
}, array_keys($parts), $parts));
I have a product feed where the product options is formatted like this:
Color{1} : Black[14], White[42] Size{2} : Small[16], Medium[17], Large[18]
For my script to understand and parse the product options correctly, it needs to be in the following format:
Color:Black,White|Size:Small,Medium,Large
I started out like this to remove unnecessary information:
$matches[1] = preg_replace("/\{\d{1,}\} : /", ': ', $matches[1]);
$matches[1] = preg_replace("/\[\d{1,}\]/", '', $matches[1]);
Which gives this output:
Color: Black, White Size: Small, Medium, Large
But my problem now is how to insert a pipe before each option name, except when there is only one option or it is the first option. I guess I need some sort of lookbehind, but I have no idea how.
First, split the string into several individual options using preg_split():
$arr = preg_split('/\s+(?=[a-z]+{\d+})/i', $str);
(?=[a-z]+{\d+}) is a positive lookahead that asserts that the whitespace (\s+) is followed by a string of the format <string>{xx}. It's used here to pinpoint on which spaces the split should happen. It's important to note that the lookahead assertion is zero-width, i.e. it doesn't consume any characters at all.
Once you have the split array, loop through it, and remove {xx}, [xx] parts and whitespace:
foreach ($arr as &$str) {
$str = preg_replace('/(?:{\d+}|\[\d+\]|\s*)/', '', $str);
}
unset($str); // break the reference left over from the loop
Join the array by |:
echo join('|', $arr);
Output:
Color:Black,White|Size:Small,Medium,Large
Demo
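The split, clean and join steps can be combined into one function. This is a sketch under the same assumptions as the answer; the function name normalizeOptions is hypothetical, and the braces are escaped in the patterns for clarity.

```php
<?php
// Hypothetical helper name; the patterns follow the steps described above.
function normalizeOptions(string $feed): string
{
    // Split on whitespace that is directly followed by "Name{digits}".
    $options = preg_split('/\s+(?=[a-z]+\{\d+\})/i', $feed);
    // Drop "{n}", "[n]" and all whitespace; preg_replace accepts an array subject.
    $options = preg_replace('/\{\d+\}|\[\d+\]|\s+/', '', $options);
    return implode('|', $options);
}

echo normalizeOptions('Color{1} : Black[14], White[42] Size{2} : Small[16], Medium[17], Large[18]');
// Color:Black,White|Size:Small,Medium,Large
```

Because the pipe comes from implode, a feed with a single option needs no special handling: it simply yields one element and no pipe.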
This method uses only two iterations of regex substitution
First, delete all spaces, along with the digits and the brackets or braces around them
$re = "/(.\\d+.|[ ]+)/";
$str = "Color{1} : Black[14], White[42] Size{2} : Small[16], Medium[17], Large[18]";
$subst = '';
$result = preg_replace($re, $subst, $str);
Then add in the pipe
$re = "/([a-z])([A-Z])/";
$subst = '\1|\2';
$endresult = preg_replace($re, $subst, $result);
Input:
Color{1} : Black[14], White[42] Size{2} : Small[16], Medium[17], Large[18]
Output:
Color:Black,White|Size:Small,Medium,Large
Here's a quick demo
Note: I'm assuming that the digits are always surrounded by curly braces or square brackets without any spacing in between, and that the option names contain only alphabetic characters (never digits).
I'm trying to remove all words of less than 3 characters from a string, specifically with RegEx.
The following doesn't work because it is looking for double spaces. I suppose I could convert all spaces to double spaces beforehand and then convert them back after, but that doesn't seem very efficient. Any ideas?
$text='an of and then some an ee halved or or whenever';
$text=preg_replace('# [a-z]{1,2} #',' ',' '.$text.' ');
echo trim($text);
Removing the Short Words
You can use this:
$replaced = preg_replace('~\b[a-z]{1,2}\b~', '', $yourstring);
In the demo, see the substitutions at the bottom.
Explanation
\b is a word boundary that matches a position where one side is a word character (a letter, digit or underscore) and the other side is not (for instance a space character, or the beginning of the string)
[a-z]{1,2} matches one or two letters
\b another word boundary
Replace with the empty string.
Option 2: Also Remove Trailing Spaces
If you also want to remove the spaces after the words, we can add \s* at the end of the regex:
$replaced = preg_replace('~\b[a-z]{1,2}\b\s*~', '', $yourstring);
Reference
Word Boundaries
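Applied to the string from the question, the second option can be sketched as follows. The trim call is an addition here, meant to tidy up whitespace left at the ends.

```php
<?php
$text = 'an of and then some an ee halved or or whenever';

// Remove words of one or two lowercase letters together with the whitespace
// after them, then trim what is left at the ends.
$replaced = trim(preg_replace('~\b[a-z]{1,2}\b\s*~', '', $text));

echo $replaced; // and then some halved whenever
```

Uppercase words are not matched by [a-z]; adding the i modifier would be one way to cover them.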
You can use the word boundary tag: \b:
Replace: \b[a-z]{1,2}\b with ''
Use this
preg_replace('/(\b.{1,2}\s)/','',$your_string);
While some of the solutions here worked, they had a problem with my language's multi-character letters, such as "ch". A simple explode and implode worked for me.
$maxWordLength = 3;
$string = "my super string";
$exploded = explode(" ", $string);
foreach($exploded as $key => $word) {
if(mb_strlen($word) < $maxWordLength) unset($exploded[$key]);
}
$string = implode(" ", $exploded);
echo $string;
// outputs "super string"
To me, it seems that this works fine with most PHP versions:
$string2 = preg_replace('/\b[a-zA-Z0-9]{1,2}\b/', '', trim($string1));
Where [a-zA-Z0-9] is the accepted character/number range.
I have to make a regex for two cases. For example, I have a string:
apps; chrome
I have to split the string in 2 pieces without spaces
1-> apps
2-> chrome
but the problem is that string might be "apps;chrome" (w/o space after ;)
I tried with explode
$part = explode(";", $search);
If the string has a space after the semicolon, the second piece has a leading space.
What I want is a regex for following cases to split them in 2 pieces
apps; chrome
apps;chrome
I hope you understand, sorry for my english :)
The trim function will help:
list($k1,$k2) = array_map("trim",explode(";",$search));
One-liner! =3
Try using trim on the various parts.
e.g.
$parts = array_map('trim', explode(';', $search));
Well, if you are sure about the separators, you basically have two options.
1) Using explode(';', $string) and array_map
This will explode the string and then apply trim() to each element of the array:
$slices = explode(';', $string);
$slices_filtered = array_map("trim", $slices);
2) Using preg_split("/[,; \t\n]+/",$string);
This will split strings like "we , are; the \n champions" into {we,are,the,champions}
$slices_filtered = preg_split("/[,; \t\n]+/",$string);
** This assumes the options themselves contain no spaces; if they do, you should use a pattern like
/[,;][ ]*/
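A variation on the last pattern, limited to the semicolon case from the question, could look like this. It is only a sketch: the helper name splitOnSemicolon is hypothetical.

```php
<?php
// Hypothetical helper: split on a semicolon with optional whitespace on
// either side, after trimming the outer ends of the input.
function splitOnSemicolon(string $search): array
{
    return preg_split('/\s*;\s*/', trim($search));
}

var_export(splitOnSemicolon('apps; chrome')); // ['apps', 'chrome']
var_export(splitOnSemicolon('apps;chrome'));  // ['apps', 'chrome']
```

Unlike the [,; \t\n]+ pattern, this keeps spaces inside a piece intact, so a value made of two words would not be split apart.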
Just because you'd specified a Regular Expression ... and this should allow you to match any 2 lower-case alpha strings separated by a semi-colon, with any number (or type) of whitespace "noise".
$sFullString = "app; chrome"; //or wherever you're getting your string from
//RegExp pattern to match many strings including "app;chrome"
$sRegExp = '/^\s*([a-z]+);\s*([a-z]+)\s*$/';
//first replacement
$sAppMatch = preg_replace($sRegExp, "$1", $sFullString);
//second replacement
$sChromeMatch = preg_replace($sRegExp, "$2", $sFullString);
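As an alternative to running preg_replace twice, the same pattern can be used with preg_match, which fills both groups in one pass. This is a sketch that swaps in preg_match and is not part of the answer's code; $aMatches is a name introduced here.

```php
<?php
$sFullString = "app; chrome";
$sRegExp = '/^\s*([a-z]+);\s*([a-z]+)\s*$/';

// preg_match fills $aMatches: index 0 is the whole match, 1 and 2 the groups.
if (preg_match($sRegExp, $sFullString, $aMatches) === 1) {
    $sAppMatch = $aMatches[1];
    $sChromeMatch = $aMatches[2];
    echo $sAppMatch, ' / ', $sChromeMatch; // app / chrome
}
```

It also makes a non-matching input easy to detect, whereas preg_replace would silently return the original string.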
Just use a trim() function before treating your $parts
Why regex?
<?php
$parts = explode(";", $search);
foreach ($parts as $k => $v) {
$parts[$k]=trim($v);
}