Split word by capital letter - php

I want to split a word at each capital letter in PHP.
For example:
$string = "facebookPageUrl";
I want it like this:
$array = array("facebook", "Page", "Url");
How should I do it? I want the shortest and most efficient way.

You can use preg_split with a look-ahead assertion:
preg_split('/(?=\p{Lu})/u', $str)
Here \p{Lu} is a character class of all Unicode uppercase letters. If you just work with US-ASCII characters, you could also use [A-Z] instead.
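A minimal sketch of how that split could be wrapped up, assuming you also want to avoid an empty piece when the word itself starts with a capital letter. The function name splitByCapital is hypothetical, and the PREG_SPLIT_NO_EMPTY flag is an addition that the answer above does not use:

```php
<?php
// Hypothetical helper; the name is an assumption for illustration only.
function splitByCapital(string $str): array
{
    // Split at every position that is followed by a Unicode uppercase letter.
    // PREG_SPLIT_NO_EMPTY drops empty pieces, such as one that a capital
    // letter at the very start of the string could produce.
    $parts = preg_split('/(?=\p{Lu})/u', $str, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false on failure (e.g. invalid UTF-8 with /u).
    return $parts === false ? [] : $parts;
}

print_r(splitByCapital('facebookPageUrl')); // facebook, Page, Url
print_r(splitByCapital('FacebookPageUrl')); // Facebook, Page, Url
```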

$string = "facebookPageUrl";
preg_match_all('((?:^|[A-Z])[^A-Z]*)', $string, $matches);
var_dump($matches);
http://ideone.com/wL9jM

Related

Search string for first word that has an exclamation-mark

I have a string like this:
$string = 'Hello k-on! Lorem Ipsum! Lorem.';
I want to get the first word that is followed by an exclamation-mark. So in the example above, it should be:
$word = 'k-on';
I'm lost as to what's the appropriate approach to take. Maybe a regex solution?
If you need to only support ASCII letter words, you can use
/\b[a-z]+(?:-[a-z]+)*!/i
If you plan to support Unicode, use \p{L}:
/\b\p{L}+(?:-\p{L}+)*!/u
Here is the pattern explanation:
\b - a word boundary (the previous character must be a non-word one or the beginning of the string)
\p{L}+ - 1 or more Unicode letters (or ASCII letters if [a-zA-Z] is used)
(?:-\p{L}+)* - zero or more sequences of:
- - a literal hyphen
\p{L}+ - 1 or more Unicode letters (or ASCII letters if [a-zA-Z] is used)
! - a literal ! symbol
PHP demo:
$re = '/\b\p{L}+(?:-\p{L}+)*!/u';
$str = "Hello k-ąn! Lorem Ipsum! Lorem.";
preg_match($re, $str, $match);
print_r($match);
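The pattern above includes the ! in the match. If the word alone is wanted, one possible variation (a sketch, not part of the answer above) is to turn the trailing ! into a look-ahead so that it is required but not captured:

```php
<?php
$str = 'Hello k-on! Lorem Ipsum! Lorem.';
$word = null;

// (?=!) requires a "!" right after the word without making it part of the match.
if (preg_match('/\b\p{L}+(?:-\p{L}+)*(?=!)/u', $str, $match)) {
    $word = $match[0];
}

echo $word, PHP_EOL; // k-on
```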
I think this might do what you're looking for. Basically split the string into words, look for the first word that ends in '!', do whatever then break out of the loop:
$string = 'Hello k-on! Lorem Ipsum! Lorem.';
$arry = explode(" ", $string);
foreach ($arry as $word) {
    if (substr($word, -1) == "!") {
        // do something with $word ...
        break;
    }
}
$string = 'Hello k-on! Lorem Ipsum! Lorem.';
preg_match('/[A-Za-z0-9-]+!/', $string, $match);
$yourWord = str_replace("!", "", $match[0]); // $yourWord is now "k-on"
A regular expression fits this requirement. Here I used a simple expression that allows alphanumeric characters plus the hyphen (-). preg_match finds the first match of the pattern in the string, which in your case is k-on!, and str_replace then removes the exclamation mark from the matched string.
More about preg_match: http://php.net/manual/en/function.preg-match.php

Replace all the first character of words in a string using preg_replace()

I have a string as
This is a sample text. This text will be used as a dummy for "various" RegEx "operations" using PHP.
I want to select and replace the first character of each word (in the example: T,i,a,s,t,T,t,w,b,u,a,d,f,",R,",u,P). How do I do it?
I tried /\b.{1}\w+\b/. I read the expression as "select any character that has a length of 1, followed by a word of any length", but it didn't work.
You may try this regex as well:
(?<=\s|^)([a-zA-Z"])
Your regex, /\b.{1}\w+\b/, matches a substring that starts at a word boundary with any single character (which can even be whitespace if a letter, digit, or underscore precedes it), followed by one or more word characters (\w) up to the next word boundary.
That \b. is the culprit here.
If you plan to match any non-whitespace character that is preceded by whitespace or sits at the start of the string, you can just use
/(?<!\S)\S/
Or
/(?<=^|\s)\S/
Then, replace with any symbol you need.
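As an illustration of that replacement step, here is a short sketch using preg_replace. The sample text and the underscore replacement are assumptions chosen for the example:

```php
<?php
$text = 'This is a "sample" text.';

// (?<!\S)\S matches a non-whitespace character that is not preceded by
// another non-whitespace character, i.e. the first character of each
// whitespace-separated word (including a leading quote).
$masked = preg_replace('/(?<!\S)\S/', '_', $text);

echo $masked, PHP_EOL; // _his _s _ _sample" _ext.
```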
You may try to use the following regex:
(.)[^\s]*\s?
Use preg_match_all and implode capture group 1 of the result:
<?php
$string = 'This is a sample text. This text will be used as a dummy for '
    . '"various" RegEx "operations" using PHP.';
$pattern = '/(.)[^\s]*\s?/';
$matches = [];
preg_match_all($pattern, $string, $matches);
$output = implode('', $matches[1]);
echo $output; //Output is TiastTtwbuaadf"R"uP
To replace, use something like preg_replace_callback:
$pattern = '/(.)([^\s]*\s?)/';
$output2 = preg_replace_callback($pattern,
function($match) { return '_' . $match[2]; }, $string);
//result: _his _s _ _ample _ext. _his _ext _ill _e _sed _s _ _ummy _or _various" _egEx _operations" _sing _HP.

PHP preg_replace special characters

I want to replace every character that is not a letter or number (e.g. /&%#$) with an underscore (_), and replace every ' (single quote) with "" (blank, so no underscore).
So "There wouldn't be any" (ignore the double quotes) would become "There_wouldnt_be_any".
I am useless at regular expressions, hence the post.
Cheers
If by "non-letters and numbers" you mean to exclude more than [A-Za-z0-9] (i.e. you consider letters like åäö to be letters too) and you want to handle UTF-8 strings accurately, \p{L} and \p{N} will help.
\p{N} will match any "Number"
\p{L} will match any "Letter Character", which includes
Lower case letter
Modifier letter
Other letter
Title case letter
Upper case letter
Documentation PHP: Unicode Character Properties
$data = "Thäre!wouldn't%bé#äny";
$new_data = str_replace ("'", "", $data);
$new_data = preg_replace ('/[^\p{L}\p{N}]/u', '_', $new_data);
var_dump($new_data);
output
string(23) "Thäre_wouldnt_bé_äny"
$newstr = preg_replace('/[^a-zA-Z0-9\']/', '_', "There wouldn't be any");
$newstr = str_replace("'", '', $newstr);
I put them on two separate lines to make the code a little more clear.
Note: If you need Unicode support, see Filip's answer, which uses \p{L} and \p{N}. It matches all characters that register as letters, not just A-Z and a-z.
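The quote removal and the underscore replacement can also be combined into a single preg_replace call by passing arrays of patterns and replacements, which are applied in order. This is only a sketch of that alternative, using the Unicode-aware character class; the variable names are assumptions:

```php
<?php
$data = "There wouldn't be any";

// Patterns are applied in order: first drop single quotes, then turn every
// remaining character that is not a Unicode letter or number into "_".
$clean = preg_replace(["/'/u", '/[^\p{L}\p{N}]/u'], ['', '_'], $data);

echo $clean, PHP_EOL; // There_wouldnt_be_any
```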
Do this in two steps.
Match the special characters with this regex:
[\/\&%#\$]
Match the quotes with this regex:
[\"\']
Then use preg_replace:
$stringWithoutNonLetterCharacters = preg_replace("/[\/\&%#\$]/", "_", $yourString);
$stringWithQuotesReplacedWithSpaces = preg_replace("/[\"\']/", " ", $stringWithoutNonLetterCharacters);

Replace symbol if it is preceded and followed by a word character

I want to change a specific character only if both the previous and the following characters are English letters. In other words, the target character is inside the word, not at its start or end.
For example:
$string = "I am learn*ing *PHP today*";
I want this string to be converted as following.
$newString = "I am learn'ing *PHP today*";
$string = "I am learn*ing *PHP today*";
$newString = preg_replace('/(\w)\*(\w)/', '$1\'$2', $string);
// $newString = "I am learn'ing *PHP today*"
This will match an asterisk surrounded by word characters (letters, digits, underscores). If you only want to do alphabet characters you can do:
preg_replace('/([a-zA-Z])\*([a-zA-Z])/', '$1\'$2', 'I am learn*ing *PHP today*');
The most concise way is to use word boundaries (\b) in your pattern. A word boundary is a zero-width position between a "word" character and a "non-word" character. Since * is a non-word character, the word boundaries require both neighboring characters to be word characters.
No capture groups, no references.
Code:
$string = "I am learn*ing *PHP today*";
echo preg_replace('~\b\*\b~', "'", $string);
Output:
I am learn'ing *PHP today*
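To compare the approaches side by side, here is a small sketch on a made-up input (the test string and variable names are assumptions). It illustrates one difference worth knowing: capture groups consume the neighboring characters, so back-to-back candidates such as a*b*c are only partly replaced, while zero-width assertions handle them:

```php
<?php
$string = 'a*b*c 1*2 *x y*';

// Capture groups consume the neighbouring letters, so the second "*" in
// "a*b*c" is skipped: "b" was already used up by the first match.
$withGroups = preg_replace('/([a-z])\*([a-z])/i', '$1\'$2', $string);
echo $withGroups, PHP_EOL; // a'b*c 1*2 *x y*

// Lookarounds are zero-width, so adjacent matches do not interfere.
$withLookarounds = preg_replace('/(?<=[a-z])\*(?=[a-z])/i', "'", $string);
echo $withLookarounds, PHP_EOL; // a'b'c 1*2 *x y*

// \b also treats digits and underscores as word characters.
$withBoundaries = preg_replace('~\b\*\b~', "'", $string);
echo $withBoundaries, PHP_EOL; // a'b'c 1'2 *x y*
```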
To match the asterisk only when letters surround it, use [a-z] as a character range with the i flag to make the regex case-insensitive. Since the character you want to replace is an asterisk, you also need to escape it with a backslash, because an unescaped asterisk means "match zero or more times" in a regular expression.
$newstring = preg_replace('/([a-z])\*([a-z])/i', "$1'$2", $string);
To replace all occurrences of an asterisk surrounded by word characters:
$string = preg_replace('/(\w)\*(\w)/', '$1\'$2', $string);
And to replace the asterisks where they are the first and last characters of a word:
$string = preg_replace('/\*(\w+)\*/', '\'$1\'', $string);

PHP RegEx - Get All Digits

I have this code:
$string = "123456ABcd9999";
$answer = ereg("([0-9]*)", $string, $digits);
echo $digits[0];
This outputs '123456'. I'd like it to output '1234569999', i.e. all the digits. How can I achieve this? I've been trying lots of different regex things but can't figure it out.
First, don't use ereg (it's deprecated, and it was removed in PHP 7.0). Second, why not just strip out everything else:
$answer = preg_replace('#\D#', '', $string);
Note that \D is the inverse of \d. So \d matches all decimal numeric characters (0-9), therefore \D matches anything that \d does not match...
You could use preg_replace for this, preg_replace("/[^0-9]/", "", $string) for example.
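For reference, a runnable sketch of the idea using the string from the question. The second option, collecting digit runs with preg_match_all and joining them, is an alternative added here for comparison and is not taken from the answers above:

```php
<?php
$string = '123456ABcd9999';

// Option 1: delete everything that is not a digit.
$digits = preg_replace('/\D/', '', $string);

// Option 2: collect each run of digits and join the runs.
preg_match_all('/\d+/', $string, $matches);
$joined = implode('', $matches[0]);

echo $digits, PHP_EOL; // 1234569999
echo $joined, PHP_EOL; // 1234569999
```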
