mb_str_replace()... is slow. any alternatives? - php

I want to make sure some string replacements I'm running are multibyte safe. I've found a few mb_str_replace functions around the net, but they're slow. I'm talking about a 20% slowdown after passing maybe 500-900 bytes through one.
Any recommendations? I'm thinking about using preg_replace as it's native and compiled in so it might be faster. Any thoughts would be appreciated.

str_replace is safe to use in UTF-8 contexts as long as all parameters are valid UTF-8, because there can't be an ambiguous match between two valid multibyte-encoded strings. If you check the validity of your input, you have no need to look for a different function.
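A minimal sketch of that approach, assuming UTF-8 input and the mbstring extension. The name utf8_str_replace is a hypothetical helper, not a PHP built-in:

```php
<?php
// Hypothetical helper: validate every argument as UTF-8, then rely on
// the byte-based str_replace(), which cannot split a valid sequence.
function utf8_str_replace(string $search, string $replace, string $subject): string
{
    foreach ([$search, $replace, $subject] as $value) {
        if (!mb_check_encoding($value, 'UTF-8')) {
            throw new InvalidArgumentException('Input is not valid UTF-8');
        }
    }
    return str_replace($search, $replace, $subject);
}

echo utf8_str_replace('é', 'e', 'café crème'), "\n"; // cafe crème
```

The validation costs one pass over each string, so it may be worth doing once at the input boundary rather than on every replacement.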

As encoding is a real challenge when there are inputs from everywhere (utf8 or others), I prefer using only multibyte-safe functions. For str_replace, I am using this one which is fast enough.
if (!function_exists('mb_str_replace'))
{
function mb_str_replace($search, $replace, $subject, &$count = 0)
{
if (!is_array($subject))
{
$searches = is_array($search) ? array_values($search) : array($search);
$replacements = is_array($replace) ? array_values($replace) : array($replace);
$replacements = array_pad($replacements, count($searches), '');
foreach ($searches as $key => $search)
{
$parts = mb_split(preg_quote($search), $subject);
$count += count($parts) - 1;
$subject = implode($replacements[$key], $parts);
}
}
else
{
foreach ($subject as $key => $value)
{
$subject[$key] = mb_str_replace($search, $replace, $value, $count);
}
}
return $subject;
}
}

Here's my implementation, based off Alain's answer:
/**
* Replace all occurrences of the search string with the replacement string. Multibyte safe.
*
* @param string|array $search The value being searched for, otherwise known as the needle. An array may be used to designate multiple needles.
* @param string|array $replace The replacement value that replaces found search values. An array may be used to designate multiple replacements.
* @param string|array $subject The string or array being searched and replaced on, otherwise known as the haystack.
* If subject is an array, then the search and replace is performed with every entry of subject, and the return value is an array as well.
* @param string $encoding The encoding parameter is the character encoding. If it is omitted, the internal character encoding value will be used.
* @param int $count If passed, this will be set to the number of replacements performed.
* @return array|string
*/
public static function mbReplace($search, $replace, $subject, $encoding = 'auto', &$count=0) {
if(!is_array($subject)) {
$searches = is_array($search) ? array_values($search) : [$search];
$replacements = is_array($replace) ? array_values($replace) : [$replace];
$replacements = array_pad($replacements, count($searches), '');
foreach($searches as $key => $search) {
$replace = $replacements[$key];
$search_len = mb_strlen($search, $encoding);
$sb = [];
while(($offset = mb_strpos($subject, $search, 0, $encoding)) !== false) {
$sb[] = mb_substr($subject, 0, $offset, $encoding);
$subject = mb_substr($subject, $offset + $search_len, null, $encoding);
++$count;
}
$sb[] = $subject;
$subject = implode($replace, $sb);
}
} else {
foreach($subject as $key => $value) {
$subject[$key] = self::mbReplace($search, $replace, $value, $encoding, $count);
}
}
return $subject;
}
His doesn't accept a character encoding, although I suppose you could set it via mb_regex_encoding.
My unit tests pass:
function testMbReplace() {
$this->assertSame('bbb',Str::mbReplace('a','b','aaa','auto',$count1));
$this->assertSame(3,$count1);
$this->assertSame('ccc',Str::mbReplace(['a','b'],['b','c'],'aaa','auto',$count2));
$this->assertSame(6,$count2);
$this->assertSame("\xbf\x5c\x27",Str::mbReplace("\x27","\x5c\x27","\xbf\x27",'iso-8859-1'));
$this->assertSame("\xbf\x27",Str::mbReplace("\x27","\x5c\x27","\xbf\x27",'gbk'));
}

Top rated note on http://php.net/manual/en/ref.mbstring.php#109937 says str_replace works for multibyte strings.

PHP Update array index/key [duplicate]

If I had:
$string = "PascalCase";
I need
"pascal_case"
Does PHP offer a function for this purpose?
A shorter solution: Similar to the editor's one with a simplified regular expression and fixing the "trailing-underscore" problem:
$output = strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $input));
Note that cases like SimpleXML will be converted to simple_x_m_l using the above solution. That can also be considered a wrong usage of camel case notation (correct would be SimpleXml) rather than a bug of the algorithm since such cases are always ambiguous - even by grouping uppercase characters to one string (simple_xml) such algorithm will always fail in other edge cases like XMLHTMLConverter or one-letter words near abbreviations, etc. If you don't mind about the (rather rare) edge cases and want to handle SimpleXML correctly, you can use a little more complex solution:
$output = ltrim(strtolower(preg_replace('/[A-Z]([A-Z](?![a-z]))*/', '_$0', $input)), '_');
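To compare the two one-liners side by side, here is a small sketch that wraps each in a function. The names snake_simple and snake_grouped are made up for illustration, and ASCII input is assumed:

```php
<?php
// Hypothetical wrapper around the first one-liner.
function snake_simple(string $input): string
{
    // Prefix every uppercase letter except one at the very start.
    return strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $input));
}

// Hypothetical wrapper around the second one-liner.
function snake_grouped(string $input): string
{
    // Prefix an uppercase letter plus any following uppercase letters
    // that are not themselves followed by a lowercase letter.
    $marked = preg_replace('/[A-Z]([A-Z](?![a-z]))*/', '_$0', $input);
    return ltrim(strtolower($marked), '_');
}

echo snake_simple('SimpleXML'), "\n";         // simple_x_m_l
echo snake_grouped('SimpleXML'), "\n";        // simple_xml
echo snake_grouped('XMLHTMLConverter'), "\n"; // xmlhtml_converter
```

The last line shows the ambiguity mentioned above: adjacent abbreviations are merged into one word.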
Try this on for size:
$tests = array(
'simpleTest' => 'simple_test',
'easy' => 'easy',
'HTML' => 'html',
'simpleXML' => 'simple_xml',
'PDFLoad' => 'pdf_load',
'startMIDDLELast' => 'start_middle_last',
'AString' => 'a_string',
'Some4Numbers234' => 'some4_numbers234',
'TEST123String' => 'test123_string',
);
foreach ($tests as $test => $result) {
$output = from_camel_case($test);
if ($output === $result) {
echo "Pass: $test => $result\n";
} else {
echo "Fail: $test => $result [$output]\n";
}
}
function from_camel_case($input) {
preg_match_all('!([A-Z][A-Z0-9]*(?=$|[A-Z][a-z0-9])|[A-Za-z][a-z0-9]+)!', $input, $matches);
$ret = $matches[0];
foreach ($ret as &$match) {
$match = $match == strtoupper($match) ? strtolower($match) : lcfirst($match);
}
return implode('_', $ret);
}
Output:
Pass: simpleTest => simple_test
Pass: easy => easy
Pass: HTML => html
Pass: simpleXML => simple_xml
Pass: PDFLoad => pdf_load
Pass: startMIDDLELast => start_middle_last
Pass: AString => a_string
Pass: Some4Numbers234 => some4_numbers234
Pass: TEST123String => test123_string
This implements the following rules:
A sequence beginning with a lowercase letter must be followed by lowercase letters and digits;
A sequence beginning with an uppercase letter can be followed by either:
one or more uppercase letters and digits (followed by either the end of the string or an uppercase letter followed by a lowercase letter or digit ie the start of the next sequence); or
one or more lowercase letters or digits.
A concise solution and can handle some tricky use cases:
function decamelize($string) {
return strtolower(preg_replace(['/([a-z\d])([A-Z])/', '/([^_])([A-Z][a-z])/'], '$1_$2', $string));
}
Can handle all these cases:
simpleTest => simple_test
easy => easy
HTML => html
simpleXML => simple_xml
PDFLoad => pdf_load
startMIDDLELast => start_middle_last
AString => a_string
Some4Numbers234 => some4_numbers234
TEST123String => test123_string
hello_world => hello_world
hello__world => hello__world
_hello_world_ => _hello_world_
hello_World => hello_world
HelloWorld => hello_world
helloWorldFoo => hello_world_foo
hello-world => hello-world
myHTMLFiLe => my_html_fi_le
aBaBaB => a_ba_ba_b
BaBaBa => ba_ba_ba
libC => lib_c
You can test this function here: http://syframework.alwaysdata.net/decamelize
The Symfony Serializer Component has a CamelCaseToSnakeCaseNameConverter that has two methods normalize() and denormalize(). These can be used as follows:
$nameConverter = new CamelCaseToSnakeCaseNameConverter();
echo $nameConverter->normalize('camelCase');
// outputs: camel_case
echo $nameConverter->denormalize('snake_case');
// outputs: snakeCase
Ported from Ruby's String#camelize and String#decamelize.
function decamelize($word) {
return preg_replace(
'/(^|[a-z])([A-Z])/e',
'strtolower(strlen("\\1") ? "\\1_\\2" : "\\2")',
$word
);
}
function camelize($word) {
return preg_replace('/(^|_)([a-z])/e', 'strtoupper("\\2")', $word);
}
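One trick the above solutions may have missed is the 'e' modifier, which causes preg_replace to evaluate the replacement string as PHP code. Note that this modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so on current versions use preg_replace_callback instead.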
Most solutions here feel heavy handed. Here's what I use:
$underscored = strtolower(
preg_replace(
["/([A-Z]+)/", "/_([A-Z]+)([A-Z][a-z])/"],
["_$1", "_$1_$2"],
lcfirst($camelCase)
)
);
"CamelCASE" is converted to "camel_case"
lcfirst($camelCase) will lower the first character (avoids 'CamelCASE' converted output to start with an underscore)
[A-Z] finds capital letters
+ will treat every consecutive uppercase as a word (avoids 'CamelCASE' to be converted to camel_C_A_S_E)
Second pattern and replacement are for ThoseSPECCases -> those_spec_cases instead of those_speccases
strtolower([…]) turns the output to lowercases
php does not offer a built in function for this afaik, but here is what I use
function uncamelize($camel,$splitter="_") {
$camel=preg_replace('/(?!^)[[:upper:]][[:lower:]]/', '$0', preg_replace('/(?!^)[[:upper:]]+/', $splitter.'$0', $camel));
return strtolower($camel);
}
the splitter can be specified in the function call, so you can call it like so
$camelized="thisStringIsCamelized";
echo uncamelize($camelized,"_");
//echoes "this_string_is_camelized"
echo uncamelize($camelized,"-");
//echoes "this-string-is-camelized"
I had a similar problem but couldn't find any answer that satisfies how to convert CamelCase to snake_case, while avoiding duplicate or redundant underscores _ for names with underscores, or all caps abbreviations.
The problem is as follows:
CamelCaseClass => camel_case_class
ClassName_WithUnderscores => class_name_with_underscores
FAQ => faq
The solution I wrote is a simple call to two functions: a search and replace on consecutive lowercase-uppercase letters, followed by a lowercase conversion:
strtolower(preg_replace("/([a-z])([A-Z])/", "$1_$2", $name));
"CamelCase" to "camel_case":
function camelToSnake($camel)
{
$snake = preg_replace('/[A-Z]/', '_$0', $camel);
$snake = strtolower($snake);
$snake = ltrim($snake, '_');
return $snake;
}
or:
function camelToSnake($camel)
{
$snake = preg_replace_callback('/[A-Z]/', function ($match){
return '_' . strtolower($match[0]);
}, $camel);
return ltrim($snake, '_');
}
If you are looking for a PHP 5.4 version and later answer here is the code:
function decamelize($word) {
return $word = preg_replace_callback(
"/(^|[a-z])([A-Z])/",
function($m) { return strtolower(strlen($m[1]) ? "$m[1]_$m[2]" : "$m[2]"); },
$word
);
}
function camelize($word) {
return $word = preg_replace_callback(
"/(^|_)([a-z])/",
function($m) { return strtoupper("$m[2]"); },
$word
);
}
You need to run a regex through it that matches every uppercase letter except one at the beginning, and replace it with an underscore plus that letter. A UTF-8 solution is this:
header('content-type: text/html; charset=utf-8');
$separated = preg_replace('%(?<!^)\p{Lu}%usD', '_$0', 'AaaaBbbbCcccDdddÁáááŐőőő');
$lower = mb_strtolower($separated, 'utf-8');
echo $lower; //aaaa_bbbb_cccc_dddd_áááá_őőőő
If you are not sure what case your string is in, it is better to check first: this code assumes the input is camelCase rather than underscore_Case or dash-Case, so if those contain uppercase letters, it will add underscores to them as well.
The accepted answer from cletus is way too overcomplicated imho, and it works only with Latin characters. I find it a really bad solution and wonder why it was accepted at all. Converting TEST123String into test123_string is not necessarily a valid requirement. I preferred to keep it simple and split ABCccc into a_b_cccc instead of ab_cccc, because no information is lost this way and the backward conversion gives exactly the string we started with. Even if you want to do it the other way, it is relatively easy to write a regex for it with lookbehinds, (?<!^)\p{Lu}\p{Ll}|(?<=\p{Ll})\p{Lu}, or two regexes without lookbehind if you are not a regex expert. There is no need to split the string into substrings, let alone choose between strtolower and lcfirst where plain strtolower would be completely fine.
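The lookbehind regex mentioned in that answer is given without surrounding code. A sketch of how it might be used, assuming PCRE with Unicode support and the mbstring extension (snake_unicode_grouped is a hypothetical name):

```php
<?php
// Hypothetical helper built around the regex quoted in the answer above.
function snake_unicode_grouped(string $input): string
{
    // Insert "_" before an uppercase letter that starts a capitalised word
    // (not at the start of the string) or that follows a lowercase letter.
    $separated = preg_replace(
        '/(?<!^)\p{Lu}\p{Ll}|(?<=\p{Ll})\p{Lu}/u',
        '_$0',
        $input
    );
    return mb_strtolower($separated, 'UTF-8');
}

echo snake_unicode_grouped('ABCccc'), "\n";        // ab_cccc
echo snake_unicode_grouped('TEST123String'), "\n"; // test123_string
```

Unlike the a_b_cccc variant, this grouping is lossy: the original capitalisation cannot be recovered from the output.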
Short solution:
$subject = "PascalCase";
echo strtolower(preg_replace('/\B([A-Z])/', '_$1', $subject));
Not fancy at all but simple and speedy as hell:
function uncamelize($str)
{
$str = lcfirst($str);
$lc = strtolower($str);
$result = '';
$length = strlen($str);
for ($i = 0; $i < $length; $i++) {
$result .= ($str[$i] == $lc[$i] ? '' : '_') . $lc[$i];
}
return $result;
}
echo uncamelize('HelloAWorld'); //hello_a_world
A version that doesn't use regex can be found in the Alchitect source:
function decamelize($str, $glue='_')
{
$counter = 0;
$uc_chars = '';
$new_str = array();
$str_len = strlen($str);
for ($x=0; $x<$str_len; ++$x)
{
$ascii_val = ord($str[$x]);
if ($ascii_val >= 65 && $ascii_val <= 90)
{
$uc_chars .= $str[$x];
}
}
$tok = strtok($str, $uc_chars);
while ($tok !== false)
{
$new_char = chr(ord($uc_chars[$counter]) + 32);
$new_str[] = $new_char . $tok;
$tok = strtok($uc_chars);
++$counter;
}
return implode($glue, $new_str); // separator first; the legacy argument order was removed in PHP 8
}
So here is a one-liner:
strtolower(preg_replace('/(?|([a-z\d])([A-Z])|([^\^])([A-Z][a-z]))/', '$1_$2', $string));
danielstjules/Stringy provides a method to convert a string from camel case to snake case.
s('TestUCase')->underscored(); // 'test_u_case'
Laravel 5.6 provides a very simple way of doing this:
/**
* Convert a string to snake case.
*
* @param string $value
* @param string $delimiter
* @return string
*/
public static function snake($value, $delimiter = '_'): string
{
if (!ctype_lower($value)) {
$value = strtolower(preg_replace('/(.)(?=[A-Z])/u', '$1'.$delimiter, $value));
}
return $value;
}
What it does: if the given string is not made up solely of lowercase letters, it uses a positive lookahead to search for any character (.) followed by a capital letter ((?=[A-Z])). It then replaces the found character with its value followed by the separator _.
If you are not using Composer for PHP you are wasting your time.
composer require doctrine/inflector
use Doctrine\Inflector\InflectorFactory;
// Couple ways to get class name:
// If inside a parent class
$class_name = get_called_class();
// Or just inside the class
$class_name = get_class();
// Or straight get a class name
$class_name = MyCustomClass::class;
// Or, of course, a string
$class_name = 'App\Libs\MyCustomClass';
// Take the name down to the base name:
$parts = explode('\\', $class_name); // end() needs a variable, not an expression
$class_name = end($parts);
$inflector = InflectorFactory::create()->build();
$inflector->tableize($class_name); // my_custom_class
https://github.com/doctrine/inflector/blob/master/docs/en/index.rst
Use Symfony String
composer require symfony/string
use function Symfony\Component\String\u;
u($string)->snake()->toString()
The direct port from rails (minus their special handling for :: or acronyms) would be
function underscore($word){
$word = preg_replace('#([A-Z\d]+)([A-Z][a-z])#','\1_\2', $word);
$word = preg_replace('#([a-z\d])([A-Z])#', '\1_\2', $word);
return strtolower(strtr($word, '-', '_'));
}
Knowing PHP, this will be faster than the manual parsing that's happening in other answers given here. The disadvantage is that you don't get to choose what to use as a separator between words, but that wasn't part of the question.
Also check the relevant rails source code
Note that this is intended for use with ASCII identifiers. If you need to do this with characters outside of the ASCII range, use the '/u' modifier for preg_replace and use mb_strtolower.
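A rough sketch of what that non-ASCII variant could look like, swapping the ASCII classes for Unicode properties. This is an adaptation for illustration, not the Rails code, and underscore_unicode is a hypothetical name:

```php
<?php
// Hypothetical Unicode-aware variant of underscore(); needs mbstring.
function underscore_unicode(string $word): string
{
    // Split an acronym from a following capitalised word: HTMLParser -> HTML_Parser
    $word = preg_replace('#([\p{Lu}\d]+)(\p{Lu}\p{Ll})#u', '$1_$2', $word);
    // Split a lowercase letter or digit from a following uppercase letter.
    $word = preg_replace('#([\p{Ll}\d])(\p{Lu})#u', '$1_$2', $word);
    // strtr() on single-byte ASCII characters is safe for UTF-8 strings.
    return mb_strtolower(strtr($word, '-', '_'), 'UTF-8');
}

echo underscore_unicode('HTMLParser'), "\n";   // html_parser
echo underscore_unicode('καιΚάτιΑκόμα'), "\n"; // και_κάτι_ακόμα
```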
Here is my contribution to a six-year-old question with god knows how many answers...
It will convert all words in the provided string that are in camel case to snake case. For example, "SuperSpecialAwesome and also FizBuzz καιΚάτιΑκόμα" will be converted to "super_special_awesome and also fiz_buzz και_κάτι_ακόμα".
mb_strtolower(
preg_replace_callback(
'/(?<!\b|_)\p{Lu}/u',
function ($a) {
return "_$a[0]";
},
'SuperSpecialAwesome'
)
);
Yii2 has its own function for making a snake_case word from CamelCase:
/**
* Converts any "CamelCased" into an "underscored_word".
* @param string $words the word(s) to underscore
* @return string
*/
public static function underscore($words)
{
return strtolower(preg_replace('/(?<=\\w)([A-Z])/', '_\\1', $words));
}
This is one of shorter ways:
function camel_to_snake($input)
{
return strtolower(ltrim(preg_replace('/([A-Z])/', '_\\1', $input), '_'));
}
function camel2snake($name) {
$str_arr = str_split($name);
foreach ($str_arr as $k => &$v) {
if (ord($v) >= 65 && ord($v) <= 90) { // A = 65; Z = 90
$v = strtolower($v);
$v = ($k != 0) ? '_'.$v : $v;
}
}
return implode('', $str_arr);
}
The worst answer on here was so close to being the best (use a framework). NO, DON'T; just take a look at the source code. Seeing what a well-established framework uses is a far more reliable approach (tried and tested). The Zend framework has some word filters which fit your needs.
Here are a couple of methods I adapted from the source.
function CamelCaseToSeparator($value,$separator = ' ')
{
if (!is_scalar($value) && !is_array($value)) {
return $value;
}
if (defined('PREG_BAD_UTF8_OFFSET_ERROR') && preg_match('/\pL/u', 'a') == 1) {
$pattern = ['#(?<=(?:\p{Lu}))(\p{Lu}\p{Ll})#', '#(?<=(?:\p{Ll}|\p{Nd}))(\p{Lu})#'];
$replacement = [$separator . '\1', $separator . '\1'];
} else {
$pattern = ['#(?<=(?:[A-Z]))([A-Z]+)([A-Z][a-z])#', '#(?<=(?:[a-z0-9]))([A-Z])#'];
$replacement = ['\1' . $separator . '\2', $separator . '\1'];
}
return preg_replace($pattern, $replacement, $value);
}
function CamelCaseToUnderscore($value){
return CamelCaseToSeparator($value,'_');
}
function CamelCaseToDash($value){
return CamelCaseToSeparator($value,'-');
}
$string = CamelCaseToUnderscore("CamelCase");
There is a library providing this functionality:
SnakeCaseFormatter::run('CamelCase'); // Output: "camel_case"
If you use Laravel framework, you can use just snake_case() method.
How to de-camelize without using regex:
function decamelize($str, $glue = '_') {
$capitals = [];
$replace = [];
foreach(str_split($str) as $index => $char) {
if(!ctype_upper($char)) {
continue;
}
$capitals[] = $char;
$replace[] = ($index > 0 ? $glue : '') . strtolower($char);
}
if(count($capitals) > 0) {
return str_replace($capitals, $replace, $str);
}
return $str;
}
An edit:
How would I do that in 2019:
PHP 7.3 and before:
function toSnakeCase($str, $glue = '_') {
return ltrim(
preg_replace_callback('/[A-Z]/', function ($matches) use ($glue) {
return $glue . strtolower($matches[0]);
}, $str),
$glue
);
}
And with PHP 7.4+:
function toSnakeCase($str, $glue = '_') {
return ltrim(preg_replace_callback('/[A-Z]/', fn($matches) => $glue . strtolower($matches[0]), $str), $glue);
}
If you're using the Laravel framework, a simpler built-in method exists:
$converted = Str::snake('fooBar'); // -> foo_bar
See documentation here:
https://laravel.com/docs/9.x/helpers#method-snake-case
The open source TurboCommons library contains a general purpose formatCase() method inside the StringUtils class, which lets you convert a string to lots of common case formats, like CamelCase, UpperCamelCase, LowerCamelCase, snake_case, Title Case, and many more.
https://github.com/edertone/TurboCommons
To use it, import the phar file to your project and:
use org\turbocommons\src\main\php\utils\StringUtils;
echo StringUtils::formatCase('camelCase', StringUtils::FORMAT_SNAKE_CASE);
// will output 'camel_Case'

PHP ldap_add function to escape ldap special characters in DN syntax

I'm trying to add some users to my LDAP DB, but I get errors (invalid DN syntax) when the name contains special characters like "," or ".". I need a function that escapes all such characters. I tried preg_quote, but I still get errors in some cases.
Thanks in advance
Code:
$user = 'Test , Name S.L';
if(!(ldap_add($ds, "cn=" . $user . ",".LDAP_DN_BASE, $info))) {
include 'error_new_account.php';
}
EDIT Jan 2013: added support for escaping leading/trailing spaces in DN strings, per RFC 4514. Thanks to Eugenio for pointing out this issue.
EDIT 2014: I added this function to PHP 5.6. The code below is now a like-for-like drop-in replacement for earlier PHP versions.
if (!function_exists('ldap_escape')) {
define('LDAP_ESCAPE_FILTER', 0x01);
define('LDAP_ESCAPE_DN', 0x02);
/**
* @param string $subject The subject string
* @param string $ignore Set of characters to leave untouched
* @param int $flags Any combination of LDAP_ESCAPE_* flags to indicate the
* set(s) of characters to escape.
* @return string
*/
function ldap_escape($subject, $ignore = '', $flags = 0)
{
static $charMaps = array(
LDAP_ESCAPE_FILTER => array('\\', '*', '(', ')', "\x00"),
LDAP_ESCAPE_DN => array('\\', ',', '=', '+', '<', '>', ';', '"', '#'),
);
// Pre-process the char maps on first call
if (!isset($charMaps[0])) {
$charMaps[0] = array();
for ($i = 0; $i < 256; $i++) {
$charMaps[0][chr($i)] = sprintf('\\%02x', $i);
}
for ($i = 0, $l = count($charMaps[LDAP_ESCAPE_FILTER]); $i < $l; $i++) {
$chr = $charMaps[LDAP_ESCAPE_FILTER][$i];
unset($charMaps[LDAP_ESCAPE_FILTER][$i]);
$charMaps[LDAP_ESCAPE_FILTER][$chr] = $charMaps[0][$chr];
}
for ($i = 0, $l = count($charMaps[LDAP_ESCAPE_DN]); $i < $l; $i++) {
$chr = $charMaps[LDAP_ESCAPE_DN][$i];
unset($charMaps[LDAP_ESCAPE_DN][$i]);
$charMaps[LDAP_ESCAPE_DN][$chr] = $charMaps[0][$chr];
}
}
// Create the base char map to escape
$flags = (int)$flags;
$charMap = array();
if ($flags & LDAP_ESCAPE_FILTER) {
$charMap += $charMaps[LDAP_ESCAPE_FILTER];
}
if ($flags & LDAP_ESCAPE_DN) {
$charMap += $charMaps[LDAP_ESCAPE_DN];
}
if (!$charMap) {
$charMap = $charMaps[0];
}
// Remove any chars to ignore from the list
$ignore = (string)$ignore;
for ($i = 0, $l = strlen($ignore); $i < $l; $i++) {
unset($charMap[$ignore[$i]]);
}
// Do the main replacement
$result = strtr($subject, $charMap);
// Encode leading/trailing spaces if LDAP_ESCAPE_DN is passed
if ($flags & LDAP_ESCAPE_DN) {
if ($result[0] === ' ') {
$result = '\\20' . substr($result, 1);
}
if ($result[strlen($result) - 1] === ' ') {
$result = substr($result, 0, -1) . '\\20';
}
}
return $result;
}
}
So you would do:
$user = 'Test , Name S.L';
$cn = ldap_escape($user, '', LDAP_ESCAPE_DN);
if (!ldap_add($ds, "cn={$cn}," . LDAP_DN_BASE, $info)) {
include 'error_new_account.php';
}
The PHP 5.6 beta recently shipped an ldap_escape() function. That version is not production-ready at present, but you can use it for development purposes for now.
Just a heads up: if you're not on PHP 5.6 yet, you can mirror the exact PHP 5.6 ldap_escape() function using the methods I created below. Keep in mind this is meant for use in a class. The above answer doesn't behave exactly like the ldap_escape function, in that it doesn't escape all characters into a hex string if no flags have been given, so this is more suitable as a drop-in replacement for earlier versions of PHP, in an object-oriented way.
I've documented every line to make it easier to understand what's going on. Scroll down for output.
Methods (Compatible with PHP 5 or greater):
/**
* Escapes the inserted value for LDAP.
*
* @param string $value The value to escape
* @param string $ignore The characters to ignore
* @param int $flags The PHP flag to use
*
* @return bool|string
*/
public function escapeManual($value, $ignore = '*', $flags = 0)
{
/*
* If a flag was supplied, we'll send the value
* off to be escaped using the PHP flag values
* and return the result.
*/
if($flags) {
return $this->escapeWithFlags($value, $ignore, $flags);
}
// Convert ignore string into an array
$ignores = str_split($ignore);
// Convert the value to a hex string
$hex = bin2hex($value);
/*
* Separate the string, with the hex length of 2,
* and place a backslash on the end of each section
*/
$value = chunk_split($hex, 2, "\\");
/*
* We'll append a backslash at the front of the string
* and remove the ending backslash of the string
*/
$value = "\\" . substr($value, 0, -1);
// Go through each character to ignore
foreach($ignores as $charToIgnore)
{
// Convert the characterToIgnore to a hex
$hexed = bin2hex($charToIgnore);
// Replace the hexed variant with the original character
$value = str_replace("\\" . $hexed, $charToIgnore, $value);
}
// Finally we can return the escaped value
return $value;
}
/**
* Escapes the inserted value with flags. Supplying either 1
* or 2 into the flags parameter will escape only certain values
*
*
* @param string $value The value to escape
* @param string $ignore The characters to ignore
* @param int $flags The PHP flag to use
* @return bool|string
*/
public function escapeWithFlags($value, $ignore = '*', $flags = 0)
{
// Convert ignore string into an array
$ignores = str_split($ignore);
$escapeFilter = ['\\', '*', '(', ')'];
$escapeDn = ['\\', ',', '=', '+', '<', '>', ';', '"', '#'];
switch($flags)
{
case 1:
// Int 1 equals to LDAP_ESCAPE_FILTER
$escapes = $escapeFilter;
break;
case 2:
// Int 2 equals to LDAP_ESCAPE_DN
$escapes = $escapeDn;
break;
case 3:
// If both LDAP_ESCAPE_FILTER and LDAP_ESCAPE_DN are used
$escapes = array_merge($escapeFilter, $escapeDn);
break;
default:
// Customize your own default return value
return false;
}
foreach($escapes as $escape)
{
// Make sure the escaped value isn't inside the ignore array
if( ! in_array($escape, $ignores))
{
$hexed = chunk_split(bin2hex($escape), 2, "\\");
$hexed = "\\" . substr($hexed, 0, -1);
$value = str_replace($escape, $hexed, $value);
}
}
return $value;
}
Tests (be aware that LDAP_ESCAPE constants are only available in PHP 5.6):
// Value to escape
$value = 'testing=+<>"";:#()*\x00';
$php = ldap_escape($value, $ignore = '*');
$man = $this->escapeManual($value, $ignore = '*');
echo $php; // \74\65\73\74\69\6e\67\3d\2b\3c\3e\22\22\3b\3a\23\28\29*\5c\78\30\30
echo $man; // \74\65\73\74\69\6e\67\3d\2b\3c\3e\22\22\3b\3a\23\28\29*\5c\78\30\30
$php = ldap_escape($value, $ignore = '*', LDAP_ESCAPE_DN);
$man = $this->escapeManual($value, $ignore = '*', LDAP_ESCAPE_DN);
echo $php; // testing\3d\2b\3c\3e\22\22\3b:\23()*\5cx00
echo $man; // testing\3d\2b\3c\3e\22\22\3b:\23()*\5cx00
$php = ldap_escape($value, $ignore = '*', LDAP_ESCAPE_FILTER);
$man = $this->escapeManual($value, $ignore = '*', LDAP_ESCAPE_FILTER);
echo $php; // testing=+<>"";:#\28\29*\5cx00
echo $man; // testing=+<>"";:#\28\29*\5cx00
Github Gist link: https://gist.github.com/stevebauman/0db9b5daa414d60fc266
Those characters must be escaped to be part of the data of a distinguished name or relative distinguished name. Escape the character (as everywhere in LDAP) with a backslash followed by two hex digits, such as \2a. Anything else would not be in compliance with the standards body documents. See RFC 4514 for more specific information regarding the string representation of distinguished names.
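As an illustration of the backslash-plus-two-hex-digits form, here is a minimal sketch. The helper name hex_escape_chars is hypothetical, and it deliberately ignores the leading/trailing space rules that a full DN escaper has to handle:

```php
<?php
// Hypothetical helper: replace each listed character with "\" plus its
// two-digit lowercase hex code, e.g. "*" becomes "\2a".
function hex_escape_chars(string $value, string $chars): string
{
    if ($chars === '') {
        return $value;
    }
    $map = [];
    foreach (str_split($chars) as $char) {
        $map[$char] = sprintf('\\%02x', ord($char));
    }
    // strtr() with an array never re-scans text it has already replaced,
    // so the backslashes it inserts are not escaped a second time.
    return strtr($value, $map);
}

// The character list here is an assumption taken from the DN set used above.
echo hex_escape_chars('Test , Name S.L', '\\,=+<>;"#'), "\n"; // Test \2c Name S.L
```

On PHP 5.6 and later, the built-in ldap_escape() with LDAP_ESCAPE_DN is the better choice; this only shows the encoding itself.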

Replace all substring instances with a variable string

If you had the string
'Old string Old more string Old some more string'
and you wanted to get
'New1 string New2 more string New3 some more string'
how would you do it?
In other words, you need to replace all instances of 'Old' with variable string 'New'.$i. How can it be done?
An iterative solution that doesn't need regular expressions:
$str = 'Old string Old more string Old some more string';
$old = 'Old';
$new = 'New';
$i = 1;
$offset = 0; // must be initialised before its first use in strpos()
$tmpOldStrLength = strlen($old);
while (($offset = strpos($str, $old, $offset)) !== false) {
$str = substr_replace($str, $new . ($i++), $offset, $tmpOldStrLength);
}
Passing $offset to strpos() is just a small micro-optimization. I don't know if it's worth it (in fact, I don't even know if it changes anything), but the idea is that we don't need to search for $old in the part of the string that has already been processed.
Demo output:
Old string Old more string Old some more string
New1 string New2 more string New3 some more string
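A variant of the same loop, sketched as a function, that also moves the offset past each inserted replacement. That way a replacement containing the search string is not matched again. The function name replace_numbered is an assumption for illustration:

```php
<?php
// Hypothetical helper: replace each occurrence of $old with $new followed
// by a running number, starting at 1.
function replace_numbered(string $str, string $old, string $new): string
{
    if ($old === '') {
        return $str; // nothing sensible to search for
    }
    $i = 1;
    $offset = 0;
    $oldLength = strlen($old);
    while (($offset = strpos($str, $old, $offset)) !== false) {
        $replacement = $new . $i++;
        $str = substr_replace($str, $replacement, $offset, $oldLength);
        // Skip over the text just inserted so it is never searched again.
        $offset += strlen($replacement);
    }
    return $str;
}

echo replace_numbered('Old string Old more string Old some more string', 'Old', 'New'), "\n";
// New1 string New2 more string New3 some more string
```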
Use preg_replace_callback.
$count = 0;
$str = preg_replace_callback(
'~Old~',
function ($matches) use (&$count) {
// $count is captured by reference so it persists between matches
return 'New' . ++$count;
},
$str
);
From the PHP manual on str_replace:
Replace all occurrences of the search string with the replacement string
mixed str_replace ( mixed $search , mixed $replace , mixed $subject [, int &$count ] )
search
The value being searched for, otherwise known as the needle. An array may be used to designate multiple needles.
replace
The replacement value that replaces found search values. An array may be used to designate multiple replacements.
subject
The string or array being searched and replaced on, otherwise known as the haystack.
If subject is an array, then the search and replace is performed with every entry of subject, and the return value is an array as well.
count
If passed, this will be set to the number of replacements performed.
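For reference, a short sketch of what plain str_replace gives you here: the same replacement everywhere, with $count only reporting how many were made. Numbering each replacement still needs a loop or a callback:

```php
<?php
$str = 'Old string Old more string Old some more string';

$count = 0;
$result = str_replace('Old', 'New', $str, $count);

echo $result, "\n"; // New string New more string New some more string
echo $count, "\n";  // 3
```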
Use:
$str = 'Old string Old more string Old some more string';
$i = 1;
while (preg_match('/Old/', $str)) {
$str = preg_replace('/Old/', 'New'.$i++, $str, 1);
}
echo $str,"\n";
Output:
New1 string New2 more string New3 some more string
I had a solution similar to KingCrunch's, but as he had already answered, I wondered about a str_replace variant with a callback for replacements and came up with this:
$subject = array('OldOldOld', 'Old string Old more string Old some more string');
$search = array('Old', 'string');
$replace = array(
function($found, $count) {return 'New'.$count;},
function($found, $count) {static $c=0; return 'String'.(++$c);}
);
print_r(str_ureplace($search, $replace, $subject));
/**
* str_ureplace
*
* str_replace like function with callback
*
* @param string|array $search
* @param callback|array $replace
* @param string|array $subject
* @param int $replace_count
* @return string|array subject with replaces, FALSE on error.
*/
function str_ureplace($search, $replace, $subject, &$replace_count = null) {
$replace_count = 0;
// Validate input
$search = array_values((array) $search);
$searchCount = count($search);
if (!$searchCount) {
return $subject;
}
foreach($search as &$v) {
$v = (string) $v;
}
unset($v);
$replaceSingle = is_callable($replace);
$replace = $replaceSingle ? array($replace) : array_values((array) $replace);
foreach($replace as $index=>$callback) {
if (!is_callable($callback)) {
throw new Exception(sprintf('Unable to use %s (#%d) as a callback', gettype($callback), $index));
}
}
// Search and replace
$subjectIsString = is_string($subject);
$subject = (array) $subject;
foreach($subject as &$haystack) {
if (!is_string($haystack)) continue;
foreach($search as $key => $needle) {
if (!$len = strlen($needle))
continue;
$replaceSingle && $key = 0;
$pos = 0;
while(false !== $pos = strpos($haystack, $needle, $pos)) {
$replaceWith = isset($replace[$key]) ? call_user_func($replace[$key], $needle, ++$replace_count) : '';
$haystack = substr_replace($haystack, $replaceWith, $pos, $len);
}
}
}
unset($haystack);
return $subjectIsString ? reset($subject) : $subject;
}

PHP Like thing similar to MySQL Like, for if statement?

I want an if statement that uses same thingy like mysql something LIKE '%something%'
I want to build an if statement in php.
if ($something is like %$somethingother%)
Is it possible?
The reason for me asking this question is that I don't want to change the MySQL command, it's a long page with many stuff on it, I don't want to build a different function for this.
Let me know if this is possible, if possible then how to do it .
if ($something is like %$somethingother%)
Is it possible?
no.
I don't want to change the MySQL command, it's a long page with many stuff on it
Use some good editor, that supports regular expressions in find & replace, and turn it to something like:
if(stripos($something, $somethingother) !== FALSE){
}
I know this question is old, but I've solved a similar problem :)
My solution:
/**
* SQL Like operator in PHP.
* Returns TRUE if match else FALSE.
* @param string $pattern
* @param string $subject
* @return bool
*/
function like_match($pattern, $subject)
{
$pattern = str_replace('%', '.*', preg_quote($pattern, '/'));
return (bool) preg_match("/^{$pattern}$/i", $subject);
}
Examples:
like_match('%uc%','Lucy'); //TRUE
like_match('%cy', 'Lucy'); //TRUE
like_match('lu%', 'Lucy'); //TRUE
like_match('%lu', 'Lucy'); //FALSE
like_match('cy%', 'Lucy'); //FALSE
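A possible extension of like_match(), as a sketch only: SQL LIKE also treats _ as a single-character wildcard, which the function above does not translate. The name like_match_full is made up for illustration, and the s, u and D modifiers are my own choice (match across newlines, treat input as UTF-8, anchor $ at the very end).

```php
<?php
// Hypothetical helper, not part of the original answer.
function like_match_full(string $pattern, string $subject): bool
{
    // preg_quote() escapes regex metacharacters but leaves '%' and '_' alone,
    // so they can be mapped to their regex equivalents afterwards.
    $regex = strtr(preg_quote($pattern, '/'), ['%' => '.*', '_' => '.']);

    // i: case-insensitive, s: '.' matches newlines, u: UTF-8, D: '$' only at the end
    return preg_match('/^' . $regex . '$/isuD', $subject) === 1;
}

var_dump(like_match_full('L_cy', 'Lucy')); // bool(true)
var_dump(like_match_full('%uc_', 'Lucy')); // bool(true)
var_dump(like_match_full('L_y', 'Lucy'));  // bool(false)
```

Unlike SQL LIKE, this sketch has no escape character, so a literal % or _ cannot be matched with it.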
Look at the strstr function.
Use this function, which works like the SQL LIKE operator but returns a boolean value, so you can build your own condition with an if statement:
function like($str, $searchTerm) {
$searchTerm = strtolower($searchTerm);
$str = strtolower($str);
$pos = strpos($str, $searchTerm);
if ($pos === false)
return false;
else
return true;
}
$found = like('Apple', 'app'); //returns true
$notFound = like('Apple', 'lep'); //returns false
if($found){
// This will execute only when the text is like the desired string
}
Use a function that searches for a string in another string, such as strstr, strpos, or substr_count.
strpos() did not work for me, so I had to use preg_match():
$a = 'How are you?';
if (preg_match('/\bare\b/', $a)) {
echo 'true';
}
In this example I am matching the word "are".
I hope it will be helpful to someone.
But you will have to pass a lowercase string; then it will work fine.
Example of strstr function:
$myString = "Hello, world!";
echo strstr( $myString, "wor" ); // Displays 'world!'
echo ( strstr( $myString, "xyz" ) ? "Yes" : "No" ); // Displays 'No'
If you have access to a MySQL server, send a query like this with MySQLi:
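$Statement=$MySQLi->prepare("select case when ? like ? then 'True' else 'False' end as Result");
$Statement->bind_param('ss', $Value, $Pattern); // bound parameters avoid SQL injection
$Statement->execute();
$Result=$Statement->get_result()->fetch_assoc()['Result']; // get_result() needs the mysqlnd driver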
Result will be a string containing True or False. Let PHP do what it's good for and use SQL for likes.
I came across this requirement recently and came up with this:
/**
* Removes the diacritical marks from a string.
*
* Diacritical marks: {@link https://unicode-table.com/blocks/combining-diacritical-marks/}
*
* @param string $string The string from which to strip the diacritical marks.
* @return string Stripped string.
*/
function stripDiacriticalMarks(string $string): string
{
return preg_replace('/[\x{0300}-\x{036f}]/u', '', \Normalizer::normalize($string , \Normalizer::FORM_KD));
}
/**
* Checks if the string $haystack is like $needle, $needle can contain '%' and '_'
* characters which will behave as if used in a SQL LIKE condition. Character escaping
* is supported with '\'.
*
* @param string $haystack The string to check if it is like $needle.
* @param string $needle The string used to check if $haystack is like it.
* @param bool $ai Whether to check likeness in an accent-insensitive manner.
* @param bool $ci Whether to check likeness in a case-insensitive manner.
* @return bool True if $haystack is like $needle, otherwise, false.
*/
function like(string $haystack, string $needle, bool $ai = true, bool $ci = true): bool
{
if ($ai) {
$haystack = stripDiacriticalMarks($haystack);
$needle = stripDiacriticalMarks($needle);
}
$needle = preg_quote($needle, '/');
$tokens = [];
$needleLength = strlen($needle);
for ($i = 0; $i < $needleLength;) {
if ($needle[$i] === '\\') {
$i += 2;
if ($i < $needleLength) {
if ($needle[$i] === '\\') {
$tokens[] = '\\\\';
$i += 2;
} else {
$tokens[] = $needle[$i];
++$i;
}
} else {
$tokens[] = '\\\\';
}
} else {
switch ($needle[$i]) {
case '_':
$tokens[] = '.';
break;
case '%':
$tokens[] = '.*';
break;
default:
$tokens[] = $needle[$i];
break;
}
++$i;
}
}
return preg_match('/^' . implode($tokens) . '$/u' . ($ci ? 'i' : ''), $haystack) === 1;
}
/**
* Escapes a string in a way that `UString::like` will match it as-is, thus '%' and '_'
* would match a literal '%' and '_' respectively (and not behave as in a SQL LIKE
* condition).
*
* @param string $str The string to escape.
* @return string The escaped string.
*/
function escapeLike(string $str): string
{
return strtr($str, ['\\' => '\\\\', '%' => '\%', '_' => '\_']);
}
The code above is Unicode-aware, so it can handle cases like:
like('Hello 🙃', 'Hello _'); // true
like('Hello 🙃', '_e%o__'); // true
like('asdfas \\🙃H\\\\%🙃É\\l\\_🙃\\l\\o asdfasf', '%' . escapeLike('\\🙃h\\\\%🙃e\\l\\_🙃\\l\\o') . '%'); // true
You can try all of this on https://3v4l.org/O9LX0
I think it's worth mentioning the str_contains() function available in PHP 8 which performs a case-sensitive check indicating whether a string is contained within another string, returning true or false.
Example taken from the documentation:
$string = 'The lazy fox jumped over the fence';
if (str_contains($string, 'lazy')) {
echo "The string 'lazy' was found in the string\n";
}
if (str_contains($string, 'Lazy')) {
echo 'The string "Lazy" was found in the string';
} else {
echo '"Lazy" was not found because the case does not match';
}
//The above will output:
//The string 'lazy' was found in the string
//"Lazy" was not found because the case does not match
See the full documentation here.
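Since str_contains() is case-sensitive and SQL LIKE usually is not (depending on the collation), here is a minimal sketch of a case-insensitive variant built on stripos(). The name contains_ci is hypothetical, and stripos() is byte-based, so non-ASCII letters may not be case-folded.

```php
<?php
// Hypothetical helper: case-insensitive counterpart of str_contains().
function contains_ci(string $haystack, string $needle): bool
{
    // Treat an empty needle as always found, mirroring str_contains().
    return $needle === '' || stripos($haystack, $needle) !== false;
}

$string = 'The lazy fox jumped over the fence';
var_dump(contains_ci($string, 'Lazy')); // bool(true)
var_dump(contains_ci($string, 'dog'));  // bool(false)
```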
The like_match() example is the best.
The one with the SQL request is simple (I have used it before), but it works more slowly than like_match() and exhausts database server resources when you iterate over array keys and hit the database server with a request on every round, which is usually unnecessary. I made it faster by first shrinking the array by pattern elements, but a regexp on the array always works faster.
I like like_match() :)

How to convert PascalCase to snake_case?

If I had:
$string = "PascalCase";
I need
"pascal_case"
Does PHP offer a function for this purpose?
A shorter solution: Similar to the editor's one with a simplified regular expression and fixing the "trailing-underscore" problem:
$output = strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $input));
Note that cases like SimpleXML will be converted to simple_x_m_l using the above solution. That can also be considered a wrong usage of camel case notation (correct would be SimpleXml) rather than a bug of the algorithm since such cases are always ambiguous - even by grouping uppercase characters to one string (simple_xml) such algorithm will always fail in other edge cases like XMLHTMLConverter or one-letter words near abbreviations, etc. If you don't mind about the (rather rare) edge cases and want to handle SimpleXML correctly, you can use a little more complex solution:
$output = ltrim(strtolower(preg_replace('/[A-Z]([A-Z](?![a-z]))*/', '_$0', $input)), '_');
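To compare the two expressions side by side, here is a small sketch that wraps each one in a function and runs them on the inputs discussed above. The function names snake_simple and snake_grouped are made up for this example.

```php
<?php
// Hypothetical wrappers around the two one-liners from this answer.
function snake_simple(string $input): string
{
    // Underscore before every uppercase letter except the first character.
    return strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $input));
}

function snake_grouped(string $input): string
{
    // Treat a run of uppercase letters not followed by lowercase as one word.
    return ltrim(strtolower(preg_replace('/[A-Z]([A-Z](?![a-z]))*/', '_$0', $input)), '_');
}

echo snake_simple('SimpleXML'), "\n";         // simple_x_m_l
echo snake_grouped('SimpleXML'), "\n";        // simple_xml
echo snake_grouped('XMLHTMLConverter'), "\n"; // xmlhtml_converter
```

The last line shows the ambiguity mentioned above: adjacent abbreviations are merged into a single word.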
Try this on for size:
$tests = array(
'simpleTest' => 'simple_test',
'easy' => 'easy',
'HTML' => 'html',
'simpleXML' => 'simple_xml',
'PDFLoad' => 'pdf_load',
'startMIDDLELast' => 'start_middle_last',
'AString' => 'a_string',
'Some4Numbers234' => 'some4_numbers234',
'TEST123String' => 'test123_string',
);
foreach ($tests as $test => $result) {
$output = from_camel_case($test);
if ($output === $result) {
echo "Pass: $test => $result\n";
} else {
echo "Fail: $test => $result [$output]\n";
}
}
function from_camel_case($input) {
preg_match_all('!([A-Z][A-Z0-9]*(?=$|[A-Z][a-z0-9])|[A-Za-z][a-z0-9]+)!', $input, $matches);
$ret = $matches[0];
foreach ($ret as &$match) {
$match = $match == strtoupper($match) ? strtolower($match) : lcfirst($match);
}
return implode('_', $ret);
}
Output:
Pass: simpleTest => simple_test
Pass: easy => easy
Pass: HTML => html
Pass: simpleXML => simple_xml
Pass: PDFLoad => pdf_load
Pass: startMIDDLELast => start_middle_last
Pass: AString => a_string
Pass: Some4Numbers234 => some4_numbers234
Pass: TEST123String => test123_string
This implements the following rules:
A sequence beginning with a lowercase letter must be followed by lowercase letters and digits;
A sequence beginning with an uppercase letter can be followed by either:
one or more uppercase letters and digits (followed by either the end of the string or an uppercase letter followed by a lowercase letter or digit ie the start of the next sequence); or
one or more lowercase letters or digits.
A concise solution that can handle some tricky use cases:
function decamelize($string) {
return strtolower(preg_replace(['/([a-z\d])([A-Z])/', '/([^_])([A-Z][a-z])/'], '$1_$2', $string));
}
Can handle all these cases:
simpleTest => simple_test
easy => easy
HTML => html
simpleXML => simple_xml
PDFLoad => pdf_load
startMIDDLELast => start_middle_last
AString => a_string
Some4Numbers234 => some4_numbers234
TEST123String => test123_string
hello_world => hello_world
hello__world => hello__world
_hello_world_ => _hello_world_
hello_World => hello_world
HelloWorld => hello_world
helloWorldFoo => hello_world_foo
hello-world => hello-world
myHTMLFiLe => my_html_fi_le
aBaBaB => a_ba_ba_b
BaBaBa => ba_ba_ba
libC => lib_c
You can test this function here: http://syframework.alwaysdata.net/decamelize
The Symfony Serializer Component has a CamelCaseToSnakeCaseNameConverter that has two methods normalize() and denormalize(). These can be used as follows:
use Symfony\Component\Serializer\NameConverter\CamelCaseToSnakeCaseNameConverter;
$nameConverter = new CamelCaseToSnakeCaseNameConverter();
echo $nameConverter->normalize('camelCase');
// outputs: camel_case
echo $nameConverter->denormalize('snake_case');
// outputs: snakeCase
Ported from Ruby's String#camelize and String#decamelize.
function decamelize($word) {
return preg_replace(
'/(^|[a-z])([A-Z])/e',
'strtolower(strlen("\\1") ? "\\1_\\2" : "\\2")',
$word
);
}
function camelize($word) {
return preg_replace('/(^|_)([a-z])/e', 'strtoupper("\\2")', $word);
}
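One trick the above solutions may have missed is the 'e' modifier, which causes preg_replace to evaluate the replacement string as PHP code. Note that this modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the code above only runs on older versions; on current PHP, use preg_replace_callback instead.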
Most solutions here feel heavy handed. Here's what I use:
$underscored = strtolower(
preg_replace(
["/([A-Z]+)/", "/_([A-Z]+)([A-Z][a-z])/"],
["_$1", "_$1_$2"],
lcfirst($camelCase)
)
);
"CamelCASE" is converted to "camel_case"
lcfirst($camelCase) lowercases the first character (so the output for 'CamelCASE' does not start with an underscore)
[A-Z] finds capital letters
+ treats each run of consecutive uppercase letters as one word (so 'CamelCASE' is not converted to camel_c_a_s_e)
The second pattern and replacement handle ThoseSPECCases -> those_spec_cases instead of those_speccases
strtolower([…]) turns the output to lowercase
php does not offer a built in function for this afaik, but here is what I use
function uncamelize($camel,$splitter="_") {
$camel=preg_replace('/(?!^)[[:upper:]][[:lower:]]/', '$0', preg_replace('/(?!^)[[:upper:]]+/', $splitter.'$0', $camel));
return strtolower($camel);
}
the splitter can be specified in the function call, so you can call it like so
$camelized="thisStringIsCamelized";
echo uncamelize($camelized,"_");
//echoes "this_string_is_camelized"
echo uncamelize($camelized,"-");
//echoes "this-string-is-camelized"
I had a similar problem but couldn't find any answer that satisfies how to convert CamelCase to snake_case, while avoiding duplicate or redundant underscores _ for names with underscores, or all caps abbreviations.
The problem is as follows:
CamelCaseClass => camel_case_class
ClassName_WithUnderscores => class_name_with_underscores
FAQ => faq
The solution I wrote is a simple call to two functions, strtolower and a search and replace for consecutive lowercase-uppercase letters:
strtolower(preg_replace("/([a-z])([A-Z])/", "$1_$2", $name));
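A quick way to check the one-liner against the three inputs listed above is to wrap it in a function and loop over them. The name class_to_snake is only an illustration.

```php
<?php
// Hypothetical wrapper around the one-liner from this answer.
function class_to_snake(string $name): string
{
    // Insert '_' only between a lowercase letter and the uppercase letter after it.
    return strtolower(preg_replace('/([a-z])([A-Z])/', '$1_$2', $name));
}

foreach (['CamelCaseClass', 'ClassName_WithUnderscores', 'FAQ'] as $name) {
    echo $name, ' => ', class_to_snake($name), "\n";
}
// CamelCaseClass => camel_case_class
// ClassName_WithUnderscores => class_name_with_underscores
// FAQ => faq
```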
"CamelCase" to "camel_case":
function camelToSnake($camel)
{
$snake = preg_replace('/[A-Z]/', '_$0', $camel);
$snake = strtolower($snake);
$snake = ltrim($snake, '_');
return $snake;
}
or:
function camelToSnake($camel)
{
$snake = preg_replace_callback('/[A-Z]/', function ($match){
return '_' . strtolower($match[0]);
}, $camel);
return ltrim($snake, '_');
}
If you are looking for a version for PHP 5.4 and later, here is the code:
function decamelize($word) {
return $word = preg_replace_callback(
"/(^|[a-z])([A-Z])/",
function($m) { return strtolower(strlen($m[1]) ? "$m[1]_$m[2]" : "$m[2]"); },
$word
);
}
function camelize($word) {
return $word = preg_replace_callback(
"/(^|_)([a-z])/",
function($m) { return strtoupper("$m[2]"); },
$word
);
}
You need to run a regex through it that matches every uppercase letter except when it is at the beginning, and replace it with an underscore plus that letter. A UTF-8 solution is this:
header('content-type: text/html; charset=utf-8');
$separated = preg_replace('%(?<!^)\p{Lu}%usD', '_$0', 'AaaaBbbbCcccDdddÁáááŐőőő');
$lower = mb_strtolower($separated, 'utf-8');
echo $lower; //aaaa_bbbb_cccc_dddd_áááá_őőőő
If you are not sure what case your string is in, it is better to check it first, because this code assumes that the input is camelCase rather than underscore_Case or dash-Case; if the latter contain uppercase letters, it will add underscores to them.
The accepted answer from cletus is way too complicated imho and it works only with Latin characters. I find it a really bad solution and wonder why it was accepted at all. Converting TEST123String into test123_string is not necessarily a valid requirement. I rather kept it simple and separated ABCccc into a_b_cccc instead of ab_cccc because it does not lose information this way and the backward conversion will give the exact same string we started with. Even if you want to do it the other way, it is relatively easy to write a regex for it with lookbehind (?<!^)\p{Lu}\p{Ll}|(?<=\p{Ll})\p{Lu} or two regexes without lookbehind if you are not a regex expert. There is no need to split it up into substrings, not to mention deciding between strtolower and lcfirst where using just strtolower would be completely fine.
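For anyone who wants to try the alternative regex mentioned in the last paragraph, here is a minimal sketch. The function name snake_unicode_grouped is hypothetical, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical wrapper around the lookbehind regex mentioned above.
function snake_unicode_grouped(string $input): string
{
    // Underscore before "Upper+lower" (except at the start), or before an
    // uppercase letter that directly follows a lowercase letter.
    $separated = preg_replace('/(?<!^)\p{Lu}\p{Ll}|(?<=\p{Ll})\p{Lu}/u', '_$0', $input);

    return mb_strtolower($separated, 'UTF-8');
}

echo snake_unicode_grouped('ABCccc'), "\n";   // ab_cccc
echo snake_unicode_grouped('AaaaBbbb'), "\n"; // aaaa_bbbb
echo snake_unicode_grouped('fooBAR'), "\n";   // foo_bar
```

As said above, this grouping loses information: ab_cccc cannot be converted back to ABCccc unambiguously.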
Short solution:
$subject = "PascalCase";
echo strtolower(preg_replace('/\B([A-Z])/', '_$1', $subject));
Not fancy at all but simple and speedy as hell:
function uncamelize($str)
{
$str = lcfirst($str);
$lc = strtolower($str);
$result = '';
$length = strlen($str);
for ($i = 0; $i < $length; $i++) {
$result .= ($str[$i] == $lc[$i] ? '' : '_') . $lc[$i];
}
return $result;
}
echo uncamelize('HelloAWorld'); //hello_a_world
A version that doesn't use regex can be found in the Alchitect source:
function decamelize($str, $glue='_')
{
$counter = 0;
$uc_chars = '';
$new_str = array();
$str_len = strlen($str);
for ($x=0; $x<$str_len; ++$x)
{
$ascii_val = ord($str[$x]);
if ($ascii_val >= 65 && $ascii_val <= 90)
{
$uc_chars .= $str[$x];
}
}
$tok = strtok($str, $uc_chars);
while ($tok !== false)
{
$new_char = chr(ord($uc_chars[$counter]) + 32);
$new_str[] = $new_char . $tok;
$tok = strtok($uc_chars);
++$counter;
}
return implode($glue, $new_str); // separator first; the reversed order was removed in PHP 8
}
So here is a one-liner:
strtolower(preg_replace('/(?|([a-z\d])([A-Z])|([^\^])([A-Z][a-z]))/', '$1_$2', $string));
danielstjules/Stringy provides a method to convert a string from camel case to snake case.
s('TestUCase')->underscored(); // 'test_u_case'
Laravel 5.6 provides a very simple way of doing this:
/**
* Convert a string to snake case.
*
* @param string $value
* @param string $delimiter
* @return string
*/
public static function snake($value, $delimiter = '_'): string
{
if (!ctype_lower($value)) {
$value = strtolower(preg_replace('/(.)(?=[A-Z])/u', '$1'.$delimiter, $value));
}
return $value;
}
What it does: if the given string is not made up entirely of lowercase letters, it uses a positive lookahead to search for any character (.) followed by a capital letter ((?=[A-Z])). It then replaces the found character with its value followed by the separator _.
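To try the lookahead outside Laravel, the same logic can be copied into a plain function. This is only a sketch: snake_standalone is a made-up name, and the last call shows what happens with consecutive capitals.

```php
<?php
// Hypothetical standalone copy of the method quoted above.
function snake_standalone(string $value, string $delimiter = '_'): string
{
    // ctype_lower() is true only if every character is a lowercase letter.
    if (!ctype_lower($value)) {
        // Append the delimiter to any character that is followed by a capital.
        $value = strtolower(preg_replace('/(.)(?=[A-Z])/u', '$1' . $delimiter, $value));
    }

    return $value;
}

echo snake_standalone('fooBar'), "\n";      // foo_bar
echo snake_standalone('fooBar', '-'), "\n"; // foo-bar
echo snake_standalone('HTMLParser'), "\n";  // h_t_m_l_parser
```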
If you are not using Composer for PHP you are wasting your time.
composer require doctrine/inflector
use Doctrine\Inflector\InflectorFactory;
// Couple ways to get class name:
// If inside a parent class
$class_name = get_called_class();
// Or just inside the class
$class_name = get_class();
// Or straight get a class name
$class_name = MyCustomClass::class;
// Or, of course, a string
$class_name = 'App\Libs\MyCustomClass';
// Take the name down to the base name:
$parts = explode('\\', $class_name); // end() needs a variable, not a function result
$class_name = end($parts);
$inflector = InflectorFactory::create()->build();
$inflector->tableize($class_name); // my_custom_class
https://github.com/doctrine/inflector/blob/master/docs/en/index.rst
Use Symfony String
composer require symfony/string
use function Symfony\Component\String\u;
u($string)->snake()->toString()
The direct port from rails (minus their special handling for :: or acronyms) would be
function underscore($word){
$word = preg_replace('#([A-Z\d]+)([A-Z][a-z])#','\1_\2', $word);
$word = preg_replace('#([a-z\d])([A-Z])#', '\1_\2', $word);
return strtolower(strtr($word, '-', '_'));
}
Knowing PHP, this will be faster than the manual parsing that's happening in other answers given here. The disadvantage is that you don't get to choose what to use as a separator between words, but that wasn't part of the question.
Also check the relevant rails source code
Note that this is intended for use with ASCII identifiers. If you need to do this with characters outside of the ASCII range, use the '/u' modifier for preg_replace and use mb_strtolower.
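A rough sketch of what that non-ASCII variant could look like, swapping the ASCII classes for Unicode properties. The name underscore_unicode is hypothetical, this is my adaptation rather than anything taken from Rails, and mbstring is assumed to be installed.

```php
<?php
// Hypothetical Unicode-aware adaptation of underscore() above.
function underscore_unicode(string $word): string
{
    // Split an uppercase run from a following capitalised word: HTMLParser -> HTML_Parser
    $word = preg_replace('#([\p{Lu}\d]+)(\p{Lu}\p{Ll})#u', '$1_$2', $word);
    // Split a lowercase letter or digit from a following uppercase letter.
    $word = preg_replace('#([\p{Ll}\d])(\p{Lu})#u', '$1_$2', $word);

    // strtr() is byte-based, but '-' and '_' are ASCII, so this is UTF-8 safe.
    return mb_strtolower(strtr($word, '-', '_'), 'UTF-8');
}

echo underscore_unicode('HTMLParser'), "\n"; // html_parser
echo underscore_unicode('ÁrvízTűrő'), "\n";  // árvíz_tűrő
```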
Here is my contribution to a six-year-old question with god knows how many answers...
It will convert all words in the provided string that are in camelcase to snakecase. For example "SuperSpecialAwesome and also FizBuzz καιΚάτιΑκόμα" will be converted to "super_special_awesome and also fiz_buzz και_κάτι_ακόμα".
mb_strtolower(
preg_replace_callback(
'/(?<!\b|_)\p{Lu}/u',
function ($a) {
return "_$a[0]";
},
'SuperSpecialAwesome'
)
);
Yii2 has its own function to convert a CamelCase word to snake_case:
/**
* Converts any "CamelCased" into an "underscored_word".
* @param string $words the word(s) to underscore
* @return string
*/
public static function underscore($words)
{
return strtolower(preg_replace('/(?<=\\w)([A-Z])/', '_\\1', $words));
}
This is one of the shorter ways:
function camel_to_snake($input)
{
return strtolower(ltrim(preg_replace('/([A-Z])/', '_\\1', $input), '_'));
}
function camel2snake($name) {
$str_arr = str_split($name);
foreach ($str_arr as $k => &$v) {
if (ord($v) >= 65 && ord($v) <= 90) { // A = 65; Z = 90
$v = strtolower($v);
$v = ($k != 0) ? '_'.$v : $v;
}
}
return implode('', $str_arr);
}
The worst answer on here was so close to being the best (use a framework). No, don't; just take a look at the source code. Seeing what a well-established framework uses is a far more reliable approach (tried and tested). The Zend framework has some word filters which fit your needs. Source.
Here are a couple of methods I adapted from the source.
function CamelCaseToSeparator($value,$separator = ' ')
{
if (!is_scalar($value) && !is_array($value)) {
return $value;
}
if (defined('PREG_BAD_UTF8_OFFSET_ERROR') && preg_match('/\pL/u', 'a') == 1) {
$pattern = ['#(?<=(?:\p{Lu}))(\p{Lu}\p{Ll})#', '#(?<=(?:\p{Ll}|\p{Nd}))(\p{Lu})#'];
$replacement = [$separator . '\1', $separator . '\1'];
} else {
$pattern = ['#(?<=(?:[A-Z]))([A-Z]+)([A-Z][a-z])#', '#(?<=(?:[a-z0-9]))([A-Z])#'];
$replacement = ['\1' . $separator . '\2', $separator . '\1'];
}
return preg_replace($pattern, $replacement, $value);
}
function CamelCaseToUnderscore($value){
return CamelCaseToSeparator($value,'_');
}
function CamelCaseToDash($value){
return CamelCaseToSeparator($value,'-');
}
$string = CamelCaseToUnderscore("CamelCase");
There is a library providing this functionality:
SnakeCaseFormatter::run('CamelCase'); // Output: "camel_case"
If you use the Laravel framework, you can just use the snake_case() helper.
How to de-camelize without using regex:
function decamelize($str, $glue = '_') {
$capitals = [];
$replace = [];
foreach(str_split($str) as $index => $char) {
if(!ctype_upper($char)) {
continue;
}
$capitals[] = $char;
$replace[] = ($index > 0 ? $glue : '') . strtolower($char);
}
if(count($capitals) > 0) {
return str_replace($capitals, $replace, $str);
}
return $str;
}
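One caveat with the str_replace() approach: each capital is replaced everywhere in the string, so an input like 'AbcA', where the first letter occurs again later, comes out as 'abca'. A possible regex-free alternative is to build the result character by character. This is a sketch with a made-up name (decamelize_chars), and like the original it only handles ASCII capitals.

```php
<?php
// Hypothetical regex-free variant that decides per position, not per letter.
function decamelize_chars(string $str, string $glue = '_'): string
{
    $out = '';
    foreach (str_split($str) as $index => $char) {
        if (ctype_upper($char)) {
            // Prefix the glue unless the capital is the very first character.
            $out .= ($index > 0 ? $glue : '') . strtolower($char);
        } else {
            $out .= $char;
        }
    }

    return $out;
}

echo decamelize_chars('AbcA'), "\n";            // abc_a
echo decamelize_chars('helloWorld'), "\n";      // hello_world
echo decamelize_chars('HelloWorld', '-'), "\n"; // hello-world
```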
An edit:
How I would do that in 2019:
PHP 7.3 and before:
function toSnakeCase($str, $glue = '_') {
return ltrim(
preg_replace_callback('/[A-Z]/', function ($matches) use ($glue) {
return $glue . strtolower($matches[0]);
}, $str),
$glue
);
}
And with PHP 7.4+:
function toSnakeCase($str, $glue = '_') {
return ltrim(preg_replace_callback('/[A-Z]/', fn($matches) => $glue . strtolower($matches[0]), $str), $glue);
}
If you're using the Laravel framework, a simpler built-in method exists:
$converted = Str::snake('fooBar'); // -> foo_bar
See documentation here:
https://laravel.com/docs/9.x/helpers#method-snake-case
The open source TurboCommons library contains a general purpose formatCase() method inside the StringUtils class, which lets you convert a string to lots of common case formats, like CamelCase, UpperCamelCase, LowerCamelCase, snake_case, Title Case, and many more.
https://github.com/edertone/TurboCommons
To use it, import the phar file to your project and:
use org\turbocommons\src\main\php\utils\StringUtils;
echo StringUtils::formatCase('camelCase', StringUtils::FORMAT_SNAKE_CASE);
// will output 'camel_Case'
