How to remove all non-uppercase characters in a string?

Yeah I'm basically just trying to explode a phrase like Social Inc. or David Jason to SI and DJ. I've tried using explode but couldn't figure out how to explode everything BUT the capital letters, do I need to use preg_match()?

You can use the regex (?![A-Z]). with preg_replace() to remove every character that is not an uppercase letter.
preg_replace("/(?![A-Z])./", "", $yourvariable)
The negative lookahead (?![A-Z]) asserts that the next character is NOT an uppercase letter, and the . then consumes that character.
I've created a regex101 if you wish to test it with other cases.
EDIT: As an update to this thread, you could also use the ^ character inside the square brackets to negate the character class.
preg_replace("/[^A-Z]/", "", $yourvariable)
This will match every character that is not an uppercase letter and replace it with nothing.

Quick and easy:
$ucaseletters = preg_replace('/[^A-Z]/', '', $input);
This will remove everything that is not an uppercase letter in the range A-Z.
Explanation:
^ at the start of a [] character class is the negation operator (= anything that is NOT ...)
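As a minimal sketch of how this could be wrapped up for the initials use case from the question (the function name initials() is made up for illustration):

```php
<?php
// Hypothetical helper: keep only the ASCII uppercase letters A-Z.
function initials(string $phrase): string
{
    // [^A-Z] matches any single character that is not an uppercase ASCII letter.
    return preg_replace('/[^A-Z]/', '', $phrase);
}

echo initials('Social Inc.'), "\n"; // SI
echo initials('David Jason'), "\n"; // DJ
```

Note that this only knows about A-Z, so accented capitals such as É would be removed as well.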

Nicholas and Bernhard have provided successful regex patterns but they are not as efficient as they could be.
Use /[^A-Z]+/ and an empty replacement string with preg_replace().
preg_replace('~[^A-Z]+~', '', $string)
The negated character class has a one or more quantifier, so longer substrings are matched and fewer replacements are required.
The multibyte/unicode equivalent would be: (Demo)
preg_replace('~[^\p{Lu}]+~u', '', 'Az+0ǻÉé') // outputs: AÉ
This is the best pattern to use with preg_split as well, but preg_split generates an array, so there is the extra step of calling implode.
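For comparison, a rough sketch of the preg_split route mentioned above. The PREG_SPLIT_NO_EMPTY flag is my addition here, used to drop the empty piece left after the trailing "nc.":

```php
<?php
$string = 'Social Inc.';

// Split on runs of non-uppercase characters; what remains are the capitals.
$parts = preg_split('~[^A-Z]+~', $string, -1, PREG_SPLIT_NO_EMPTY);

// preg_split returns an array, so it has to be glued back together.
$initials = implode('', $parts);

echo $initials, "\n"; // SI
```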

I've got a more complicated solution but it works too!
$s = str_split("Social Inc.");
foreach ($s as $idx => $char) {
    if (preg_match("/[A-Z]/", $char)) {
        echo $char;
    }
}
It will echo the upper-case letters.
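If you would rather collect the letters into a string than echo them one by one, the same loop could look roughly like this. This is only a sketch: upperOnly() is a hypothetical name, and ctype_upper() is swapped in for preg_match(), assuming the ctype extension is available and the default C locale is in use.

```php
<?php
// Hypothetical helper: collect uppercase letters without any regex.
function upperOnly(string $input): string
{
    $result = '';
    foreach (str_split($input) as $char) {
        // ctype_upper() is true for A-Z in the default C locale.
        if (ctype_upper($char)) {
            $result .= $char;
        }
    }
    return $result;
}

echo upperOnly('Social Inc.'), "\n"; // SI
```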

Related

Matching something with regex in a string and removing / cutting out everything that did not match

I am wondering how to solve this. Let's say I have a string looking like this:
xx-123-456-12-xxl-1235-6122
I also have a regex that will try to match anything that looks like this
[LETTER][LETTER]-[NUMBER][NUMBER][NUMBER]-[NUMBER][NUMBER][NUMBER]
meaning in the string above it would match this:
xx-123-456
How do I go about cutting everything else out of that string that did not match the regular expression? Meaning that everything after xx-123-456 should be cut out and removed. This would need to work no matter where in the string the regex finds the match.
Any ideas / solutions?

This will work:
$txt = 'xx-123-456-12-xxl-1235-6122';
preg_match( '/^[a-z]{2}-\d{3}-\d{3}/i', $txt, $matches );
echo $matches[0];
^ = start of the string;
[a-z] = any character from a through z (the i flag makes this case-insensitive);
{2} = repeat the previous pattern exactly 2 times;
\d = any digit
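Since the question asks for this to work wherever the match sits in the string, here is a small sketch of the same idea without the ^ anchor. The sample input with a leading "99-" is made up for illustration:

```php
<?php
// Made-up input where the wanted part is not at the start of the string.
$txt = '99-xx-123-456-12-xxl-1235-6122';

// Without ^, preg_match() returns the first match found anywhere in the string.
if (preg_match('/[a-z]{2}-\d{3}-\d{3}/i', $txt, $matches)) {
    echo $matches[0], "\n"; // xx-123-456
} else {
    echo "no match\n";
}
```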

There are several ways to do this in PHP:
1. Use preg_match to match what you want and print the matched array element
2. Use preg_replace and use a captured group to use a back-reference in the replacement
3. Use preg_replace and use a lookbehind assertion
4. Use preg_replace and use \K (match reset)
Here is one approach using #4:
$str = 'xx-123-456-12-xxl-1235-6122';
$str = preg_replace('/^\p{L}{2}-\p{N}{3}-\p{N}{3}\K.*$/u', '', $str);
//=> xx-123-456
RegEx Demo
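For comparison, the capture-group approach from the list above might look something like this. It is only a sketch; the ASCII classes [a-z] and \d are my simplification of the \p{L} and \p{N} used in the answer:

```php
<?php
$str = 'xx-123-456-12-xxl-1235-6122';

// Match the whole string, capture the wanted part, and replace everything
// with the capture. The lazy .*? lets the match start anywhere in the string.
$result = preg_replace('/^.*?([a-z]{2}-\d{3}-\d{3}).*$/is', '$1', $str);

echo $result, "\n"; // xx-123-456
```

One thing to watch out for: if the pattern does not match at all, preg_replace() returns the input unchanged, so this variant cannot tell you whether anything was found.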

preg_replace to remove stand-alone numbers

I'm looking to replace all standalone numbers from a string where the number has no adjacent characters (including dashes), example:
Test 3 string 49Test 49test9 9
Should return Test string 49Test 49test9
So far I've been playing around with:
$str = 'Test 3 string 49Test 49test9 9';
$str= preg_replace('/[^a-z\-]+(\d+)[^a-z\-]+?/isU', ' ', $str);
echo $str;
However with no luck, this returns
Test string 9Test 9test9
leaving out part of the string. I thought to add [0-9] to the matches, but to no avail. What am I missing? It seems so simple.
Thanks in advance

Try using a word boundary and negative look-arounds for hyphens, eg
$str = preg_replace('/\b(?<!-)\d+(?!-)\b/', '', $str);
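A quick sketch of that pattern applied to the string from the question. The whitespace cleanup in the second step is my own addition, since removing the numbers leaves doubled and trailing spaces behind:

```php
<?php
$str = 'Test 3 string 49Test 49test9 9';

// Remove numbers that stand alone (word boundary on both sides, no hyphen next to them).
$str = preg_replace('/\b(?<!-)\d+(?!-)\b/', '', $str);

// Collapse the leftover runs of whitespace and trim the ends.
$str = trim(preg_replace('/\s+/', ' ', $str));

echo $str, "\n"; // Test string 49Test 49test9
```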

Not that complicated, if you watch the spaces :)
<?php
$str = 'Test 3 string 49Test 49test9 9';
$str = preg_replace('/(\s(\d+)\s|\s(\d+)$|^(\d+)\s)/iU', '', $str);
echo $str;

Try this, I tried to cover your additional requirement to not match on 5-abc
\s*(?<!\B|-)\d+(?!\B|-)\s*
and replace with a single space!
See it here online on Regexr
The problem then is to extend the word boundary with the character -. I achieved this by using negative look arounds and looking for - or \B (not a word boundary)
Additionally I am matching the surrounding whitespace with the \s*, therefore you have to replace with a single space.

I would suggest using
explode(" ",$str)
to get an array of the "words" in your string. Then it should be easier to filter out single numbers.
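A rough sketch of that idea, assuming words are separated by single spaces and that ctype_digit() (from the ctype extension) is an acceptable test for "only digits":

```php
<?php
$str = 'Test 3 string 49Test 49test9 9';

// Split into "words" on spaces.
$words = explode(' ', $str);

// Keep every word that does not consist solely of digits.
$kept = array_filter($words, function ($word) {
    return !ctype_digit($word);
});

$result = implode(' ', $kept);
echo $result, "\n"; // Test string 49Test 49test9
```

Because 5-abc contains characters other than digits, it is kept as well.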

PHP: How to convert a string that contains upper case characters

I'm working on class names and I need to check if there is any upper camel case name and break it this way:
"UserManagement" becomes "user-management"
or
"SiteContentManagement" becomes "site-content-management"
After an extensive search I only found various uses of ucfirst, strtolower, strtoupper and ucwords, and I can't see how to use them to suit my needs. Any ideas?
thanks for reading ;)

You can use preg_replace to replace any instance of a lowercase letter followed by an uppercase one with your lower-dash-lower variant:
$dashedName = preg_replace('/([^A-Z-])([A-Z])/', '$1-$2', $className);
Then followed by a strtolower() to take care of any remaining uppercase letters:
return strtolower($dashedName);
The full function here:
function camel2dashed($className) {
    return strtolower(preg_replace('/([^A-Z-])([A-Z])/', '$1-$2', $className));
}
To explain the regular expression used:
/ Opening delimiter
( Start Capture Group 1
[^A-Z-] Character Class: Any character NOT an uppercase letter and not a dash
) End Capture Group 1
( Start Capture Group 2
[A-Z] Character Class: Any uppercase letter
) End Capture Group 2
/ Closing delimiter
As for the replacement string
$1 Insert Capture Group 1
- Literal: dash
$2 Insert Capture Group 2
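To see the breakdown in action, here is the same function run against the names from the question. The third input is made up to show why the dash is excluded from the first character class:

```php
<?php
function camel2dashed($className) {
    return strtolower(preg_replace('/([^A-Z-])([A-Z])/', '$1-$2', $className));
}

echo camel2dashed('UserManagement'), "\n";        // user-management
echo camel2dashed('SiteContentManagement'), "\n"; // site-content-management

// A name that already contains a dash does not get a second one,
// because "-" is excluded from capture group 1.
echo camel2dashed('Already-Dashed'), "\n";        // already-dashed
```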

There's no built-in way to do it.
This will ConvertThis into convert-this:
$str = preg_replace('/([a-z])([A-Z])/', '$1-$2', $str);
$str = strtolower($str);

You can use a regex to get each word, then add the dashes like this:
preg_match_all('/[A-Z][a-z]+/', $className, $matches); // get each CamelCase word
$newName = strtolower(implode('-', $matches[0])); // add the dashes and lowercase the result

This is simply done without any capture groups -- just find the zero-width position before an uppercase letter (excluding the first letter of the string), then replace it with a hyphen, then call strtolower on the new string.
Code: (Demo)
echo strtolower(preg_replace('~(?!^)(?=[A-Z])~', '-', $string));
The lookahead (?=...) makes the match but doesn't consume any characters.

The best way to do that might be preg_replace using a pattern that replaces uppercase letters with their lowercase counterparts adding a "-" before them.
You could also go through each letter and rebuild the whole string.
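The letter-by-letter idea could be sketched like this. The function name camelToDashedLoop() is hypothetical, and ctype_upper() is assumed to be available and to cover the ASCII range A-Z:

```php
<?php
// Hypothetical helper: rebuild the string one character at a time.
function camelToDashedLoop(string $name): string
{
    $out = '';
    foreach (str_split($name) as $i => $char) {
        if (ctype_upper($char)) {
            // Add a dash before every uppercase letter except the very first character.
            if ($i > 0) {
                $out .= '-';
            }
            $out .= strtolower($char);
        } else {
            $out .= $char;
        }
    }
    return $out;
}

echo camelToDashedLoop('UserManagement'), "\n";        // user-management
echo camelToDashedLoop('SiteContentManagement'), "\n"; // site-content-management
```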

Insert separators into a string in regular intervals

I have the following string in php:
$string = 'FEDCBA9876543210';
The string can have 2 or more hexadecimal characters.
I want to group the string in pairs, like:
$output_string = 'FE:DC:BA:98:76:54:32:10';
I wanted to use regex for that, I think I saw a way to do like "recursive regex" but I can't remember it.
Any help appreciated :)

If you don't need to check the content, there is no use for regex.
Try this
$outputString = chunk_split($string, 2, ":");
// generates: FE:DC:BA:98:76:54:32:10:
You might need to remove the last ":".
Or this :
$outputString = implode(":", str_split($string, 2));
// generates: FE:DC:BA:98:76:54:32:10
Resources :
www.w3schools.com - chunk_split()
www.w3schools.com - str_split()
www.w3schools.com - implode()
On the same topic :
Split string into equal parts using PHP
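If you do want to check the content but still avoid a regex, something along these lines might do. It is a sketch: colonSeparate() is a made-up name, and ctype_xdigit() is assumed to be available from the ctype extension.

```php
<?php
// Hypothetical helper: validate the input, then insert ":" between pairs.
function colonSeparate(string $hex): string
{
    // ctype_xdigit() is false for an empty string or any non-hex character.
    if (!ctype_xdigit($hex)) {
        throw new InvalidArgumentException('Expected only hexadecimal characters');
    }
    return implode(':', str_split($hex, 2));
}

echo colonSeparate('FEDCBA9876543210'), "\n"; // FE:DC:BA:98:76:54:32:10
```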

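Sounds like you want a regex substitution like this:
s/([0-9a-f]{2})/${1}:/gi
Which, in PHP, is:
<?php
$string = 'FEDCBA9876543210';
// PHP has no "g" modifier; preg_replace() replaces every match by default.
$pattern = '/([0-9A-F]{2})/i';
$replacement = '${1}:';
echo preg_replace($pattern, $replacement, $string);
// generates: FE:DC:BA:98:76:54:32:10:
?>
Please note that this leaves a trailing ":" which you might need to remove.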

You can make sure there are two or more hex letters (A-F) doing this:
if (preg_match('!^\d*[A-F]\d*[A-F][\dA-F]*$!i', $string)) {
...
}
No need for a recursive regex. By the way, recursive regex is a contradiction in terms, as a regular language (which a regex parses) can't be recursive, by definition.
If you want to also group the characters in pairs with colons in between, ignoring the two hex letters for a second, use:
if (preg_match('!^[\dA-F]{2}(?::[\dA-F]{2})*$!i', $string)) {
...
}
Now if you want to add the condition requiring two hex letters, use a positive lookahead:
if (preg_match('!^(?=[\d:]*[A-F][\d:]*[A-F])[\dA-F]{2}(?::[\dA-F]{2})*$!i', $string)) {
...
}
To explain how this works: the first thing it does is check (with a positive lookahead, i.e. (?=...)) that you have zero or more digits or colons, followed by a hex letter, followed by zero or more digits or colons, and then another hex letter. This ensures there are at least two hex letters in the expression.
After the positive lookahead is the original expression that makes sure the string is pairs of hex digits.
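As a small sanity check, here is a sketch of a pattern that accepts only pairs of hex digits separated by colons, tried against a few made-up inputs. The group after each colon is assumed to be exactly two hex digits:

```php
<?php
// Two hex digits, followed by any number of ":" plus two more hex digits.
$pattern = '!^[\dA-F]{2}(?::[\dA-F]{2})*$!i';

var_dump(preg_match($pattern, 'FE:DC:BA') === 1); // bool(true)
var_dump(preg_match($pattern, 'FE:DCB') === 1);   // bool(false), last group has three characters
var_dump(preg_match($pattern, 'FEDC') === 1);     // bool(false), colon is missing
```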

Recursive regular expressions are usually not possible. You may use a regular expression recursively on the results of a previous regular expression, but most regular expression grammars will not allow recursivity. This is the main reason why regular expressions are almost always inadequate for parsing stuff like HTML. Anyways, what you need doesn't need any kind of recursivity.
What you want, simply, is to match a group multiple times. This is quite simple:
preg_match_all("/[0-9a-f]{2}/i", $string, $matches);
This will fill $matches with all occurrences of two hexadecimal digits (in a case-insensitive way). To replace them, use preg_replace:
echo preg_replace("/([0-9a-f]{2})/i", '\1:', $string);
There will be one ':' too many at the end; you can strip it with substr:
echo substr(preg_replace("/([0-9a-f]{2})/i", '\1:', $string), 0, -1);

While it is not horrible practice to use rtrim(chunk_split($string, 2, ':'), ':'), I prefer to use direct techniques that avoid "mopping up" after making modifications.
Code: (Demo)
$string = 'FEDCBA9876543210';
echo preg_replace('~[\dA-F]{2}(?!$)\K~', ':', $string);
Output:
FE:DC:BA:98:76:54:32:10
Don't be intimidated by the regex. The pattern says:
[\dA-F]{2} # match exactly two numeric or A through F characters
(?!$) # that is not located at the end of the string
\K # restart the fullstring match
When I say "restart the fullstring match" I mean "forget the previously matched characters and start matching from this point forward". Because there are no additional characters matched after \K, the pattern effectively delivers the zero-width position where the colon should be inserted. In this way, no original characters are lost in the replacement.
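To make the effect of \K more concrete, here is a short sketch comparing the same pattern with and without it on a shorter, made-up input:

```php
<?php
$string = 'FEDCBA';

// Without \K the two matched characters themselves are replaced and lost.
$without = preg_replace('~[\dA-F]{2}(?!$)~', ':', $string);
echo $without, "\n"; // ::BA

// With \K only the zero-width position after each pair is replaced.
$with = preg_replace('~[\dA-F]{2}(?!$)\K~', ':', $string);
echo $with, "\n"; // FE:DC:BA
```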

How to replace double/more letters to a single letter?

I need to convert any letter that occurs twice or more within a word into a single letter of itself.
For example:
School -> Schol
Google -> Gogle
Gooooogle -> Gogle
VooDoo -> Vodo
I tried the following, but got stuck at the second parameter of eregi_replace.
$word = 'Goooogle';
$word2 = eregi_replace("([a-z]{2,})", "?", $word);
If I use \\\1 to replace ?, it would display the exact match.
How do I make it single letter?
Can anyone help? Thanks

See regular expression to replace two (or more) consecutive characters by only one?
By the way: you should use the preg_* (PCRE) functions instead of the deprecated ereg_* functions (POSIX), which were removed in PHP 7.
Richard Szalay's answer leads the right way:
$word = 'Goooogle';
$word2 = preg_replace('/(\w)\1+/', '$1', $word);

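Not only are you capturing the entire thing (instead of just the first character), but {2,} re-matches [a-z] rather than repeating the character that was originally matched. It should work if you use a backreference, here with preg_replace() since eregi_replace() is deprecated:
$word2 = preg_replace('/(\w)\1+/', '$1', $word);
This backreferences the original match. You can replace \w with [a-z] if you wish.
The + is required for your Goooogle example: \1 on its own matches only one repeat, so a run of four o's would be reduced to two rather than one.
In PHP, preg_replace() replaces every match by default, so there is no "global" flag ("g") to set as there would be in JavaScript.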

Try this:
$string = "thhhhiiiissssss hasss sooo mannnny letterss";
$string = preg_replace('/([a-zA-Z])\1+/', '$1', $string);
How this works:
/ ... / # Marks the start and end of the expression.
([a-zA-Z]) # Match any single a-z character lowercase or uppercase.
\1+ # One or more occurrence of the single character we matched previously.
$1 # The replacement: the same single character we matched previously.
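Applied to the words from the question, a sketch of this could look as follows (squeezeLetters() is a hypothetical name). Note that the match is case-sensitive, so "VooDoo" comes out as "VoDo" rather than "Vodo":

```php
<?php
// Hypothetical helper: collapse runs of the same letter into one.
function squeezeLetters(string $text): string
{
    // ([a-zA-Z]) captures one letter, \1+ matches one or more repeats of it.
    return preg_replace('/([a-zA-Z])\1+/', '$1', $text);
}

echo squeezeLetters('School'), "\n";   // Schol
echo squeezeLetters('Goooogle'), "\n"; // Gogle
echo squeezeLetters('VooDoo'), "\n";   // VoDo
```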
