Good morning,
I need to truncate a string at a specific delimiter character.
For example, with this string:
$myString = 'customname_489494984';
I would like to automatically strip everything after the "_"
(in this case "489494984").
Is there a function that truncates the part of a string after a specific
delimiter?
Many thanks!
François
You can also use strstr() which finds the first occurrence of a string:
$myString= "customname_489494984";
echo strstr($myString, '_', true);
Here's my reference: http://php.net/manual/en/function.strstr.php
Use a simple combo of substr and strpos:
$myString = 'customname_489494984';
echo substr($myString, 0, strpos($myString, '_'));
BONUS - Wrap it in a custom function:
function truncateStringAfter($string, $delim)
{
return substr($string, 0, strpos($string, $delim));
}
echo truncateStringAfter('customname_489494984', '_');
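One caveat with the substr/strpos combo: strpos() returns false when the delimiter is missing, and substr($string, 0, false) then gives back an empty string. A minimal sketch of a guarded variant follows; the function name truncateBeforeDelimiter is hypothetical and not part of the answer above.

```php
<?php
// Hypothetical helper name, used only for illustration.
function truncateBeforeDelimiter(string $string, string $delim): string
{
    $pos = strpos($string, $delim);
    // strpos() returns false when the delimiter is absent. Compare strictly,
    // because a match at offset 0 is also "falsy".
    if ($pos === false) {
        return $string;
    }
    return substr($string, 0, $pos);
}

echo truncateBeforeDelimiter('customname_489494984', '_'), PHP_EOL; // customname
echo truncateBeforeDelimiter('customname', '_'), PHP_EOL;           // customname
```

Returning the untouched string when there is no delimiter is a design choice; returning null or throwing would work just as well if the caller needs to tell the two cases apart.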
Try this. It will remove anything after the last "_".
$string= 'customname_489494984';
$string=substr($string, 0, strrpos($string, '_'));
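Note that strrpos() looks for the last occurrence while strpos() looks for the first, so the two only differ when the string contains more than one delimiter. A small sketch, using a made-up input with two underscores:

```php
<?php
// Hypothetical input with two underscores, chosen to show the difference.
$value = 'custom_name_489494984';

// strpos(): cut at the first "_"
$beforeFirst = substr($value, 0, strpos($value, '_'));
// strrpos(): cut at the last "_"
$beforeLast = substr($value, 0, strrpos($value, '_'));

echo $beforeFirst, PHP_EOL; // custom
echo $beforeLast, PHP_EOL;  // custom_name
```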
The problem with the other straightforward approaches is that if the string doesn't contain "_", an empty string is returned. Here is a small StringUtils class (part of a larger one) that handles this case with its static beforeFirst method, in a multibyte-safe way.
var_dump(StringUtils::beforeFirst("customname_489494984", "_"));
class StringUtils
{
/**
* Returns the part of a string <b>before the first</b> occurrence of the string to search for.
* @param <b>$string</b> The string to be searched
* @param <b>$search</b> The string to search for
* @param <b>$caseSensitive boolean :optional</b> Defines if the search will be case sensitive. By default true.
* @return string
*/
public static function beforeFirst($string,$search,$caseSensitive = true)
{
$firstIndex = self::firstIndexOf($string, $search,$caseSensitive);
// false means "not found"; compare strictly so a match at index 0 is not treated as a miss
return $firstIndex === false ? $string : self::substring($string, 0 , $firstIndex);
}
/**
* Returns a part of the string from a character and for as many characters as provided
* @param <b>$string</b> The string to retrieve the part from
* @param <b>$start</b> The index of the first character (0 for the first one)
* @param <b>$length</b> The length of the part that will be extracted from the string
* @return string
*/
public static function substring($string,$start,$length = null)
{
// strict null check: a length of 0 must yield "", not the rest of the string
return ( $length === null ) ? ( mb_substr($string, $start) ) : ( $length == 0 ? "" : mb_substr($string, $start , $length) );
}
/**
* Returns the index of <b>the first occurrence</b> of a substring within the string
* @param <b>$string</b> The string to be searched
* @param <b>$search</b> The string to search for
* @param <b>$caseSensitive boolean :optional</b> Defines if the search will be case sensitive. By default true.
* @return int|false The index, or false if the substring is not found
*/
public static function firstIndexOf($string,$search,$caseSensitive = true)
{
return $caseSensitive ? mb_strpos($string, $search) : mb_stripos($string, $search);
}
/**
* Returns the number of characters in the string
* @param <b>$string</b> The string
* @return int
*/
public static function length($string)
{
return mb_strlen($string);
}
}
I am using preg_match to validate an input:
$string = 'test';
if (preg_match ('/[^-a-z\d]/i', $string))
{
return true;
}
else
{
return false;
}
I cannot understand why it returns false here!
Am I missing something?
Shouldn't it return true?
The solution:
(preg_match ('/^[-a-z\d]*$/i', $string))
Try this:
(preg_match ('/^[-A-Za-z\d]*$/', $string))
preg_match() can return 0, 1, or false, so it is probably safer to do a strict check that the result equals 1, i.e.:
abstract class Pattern
{
/** @const int */
const TEXT_MATCH = 1;
/**
* @param string $pattern regex pattern to match against
* @param string $string the text string to search
*
* @return bool
*/
public static function match(string $pattern, string $string): bool
{
return preg_match($pattern, $string) === self::TEXT_MATCH;
}
}
Primarily the bit you're interested in is this
preg_match($pattern, $string) === 1;
If you then use it:
$result = Pattern::match('/[^-a-z\d]/i', 'test');
var_dump($result); // false
The string doesn't pass the check: the negated class [^-a-z\d] only matches a character outside the allowed set, and 'test' contains none. If that doesn't resolve your issue, we'll need to see a few examples of the strings you are passing through that always pass. It's also worth explaining what you want the regex to match, because that could be part of the issue.
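To make the three possible return values concrete, here is a short sketch using the patterns from this thread. The invalid-UTF-8 input in the last call is my own example of a failure case, not something from the question.

```php
<?php
$string = 'test';

// Negated class: matches if ANY character is outside [-a-z0-9].
// 'test' has no such character, so there is no match.
$negated = preg_match('/[^-a-z\d]/i', $string);
var_dump($negated);   // int(0)

// Anchored whitelist: matches only if EVERY character is inside the class.
$whitelist = preg_match('/^[-a-z\d]*$/i', $string);
var_dump($whitelist); // int(1)

// On failure preg_match() returns false, for example when the subject
// is not valid UTF-8 and the /u modifier is used.
$failed = preg_match('/./u', "\x80");
var_dump($failed);    // bool(false)
```

Because 0 and false are both falsy, a plain if (preg_match(...)) cannot tell "no match" from "error", which is why the strict === 1 comparison is the safer habit.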
What is the easiest way to check if a string contains a valid float?
For example
is_string_float("1") = true
is_string_float("1.234") = true
is_string_float("1.2e3") = true
is_string_float("1b2") = false
is_string_float("aldhjsfb") = false
The built-in is_float() only checks whether a variable's type is float, so it returns false for every string. To test whether a variable is a number or a numeric string, you must use is_numeric().
If you really want to know whether a string contains a float and ONLY a float, neither is_float() (wrong type) nor is_numeric() (returns true for a string like "1" too) is enough on its own. I'd use
<?php
function foo(string $string) : bool {
return is_numeric($string) && strpos($string, '.') !== false;
}
?>
instead.
Maybe you can combine a couple of built-in functions:
function is_string_float($string) {
if(is_numeric($string)) {
$val = $string+0;
return is_float($val);
}
return false;
}
This can easily be achieved by double casting.
/**
* @param string $text
* @return bool
*/
function is_string_float(string $text): bool
{
return $text === (string) (float) $text;
}
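Another option, offered as a sketch rather than the definitive answer, is filter_var() with FILTER_VALIDATE_FLOAT. It accepts integer-looking strings and exponent notation, so it agrees with all five examples in the question. The function name is_string_float is simply reused from the question.

```php
<?php
function is_string_float(string $text): bool
{
    // FILTER_VALIDATE_FLOAT returns the parsed float on success and
    // false on failure, so compare strictly against false.
    return filter_var($text, FILTER_VALIDATE_FLOAT) !== false;
}

foreach (['1', '1.234', '1.2e3', '1b2', 'aldhjsfb'] as $candidate) {
    echo $candidate, ' => ', var_export(is_string_float($candidate), true), PHP_EOL;
}
```

The strict !== false matters here: a valid input such as "0" parses to 0.0, which is falsy.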
From the research I've done, I can't find the correct way to format a multiline PHPDoc @param line. What is the recommended way to do so?
Here's an example:
/**
* Prints 'Hello World'.
*
* Prints out 'Hello World' directly to the output.
* Can be used to render examples of PHPDoc.
*
* @param string $noun Optional. Sends a greeting to a given noun instead.
*                     Input is converted to lowercase and capitalized.
* @param bool $surprise Optional. Adds an exclamation mark after the string.
*/
function helloYou( $noun = 'World', $surprise = false ) {
$string = 'Hello ' . ucwords( strtolower( $string ) );
if( !!$surprise ) {
$string .= '!';
}
echo $string;
}
Would that be correct, or would you not add indentation, or would you just keep everything on one, long line?
You can simply do it this way:
/**
*
* @param string $string Optional. Sends a greeting to a given noun instead.
*                       Input is converted to lowercase and capitalized.
* @param bool $surprise
*/
function helloYou( $string = 'World', $surprise = false )
{
$string = 'Hello ' . ucwords( strtolower( $string ) );
if( !!$surprise ) {
$string .= '!';
}
echo $string;
}
So your example is fine except for one thing: the PHPDoc @param needs to have the same name as the PHP parameter. You called it $noun in the doc and $string in the actual code.
I'm trying to count how many digits there are in a variable. Here is the regex that I use:
preg_match('/[^0-9]/', $password, $numbers, PREG_OFFSET_CAPTURE);
When I try to get all numbers one by one, I use:
print_r($this->filter->password_filter($data["user_password"]));
user_password is 123d4sd6789. Result is an empty array.
Array ( )
Well, you can easily do it using preg_split:
$temp = preg_split('/\d/', $password);
$numCount = count($temp) - 1;
Your regex is flawed. Since you're trying to count the number of digits in a password, using:
/[^0-9]/
won't cut it. Perhaps you meant to write:
/[0-9]/
because what you have now matches everything EXCEPT a digit.
There are a great number of ways to do what you're trying to do. I benchmarked four different approaches and found regex to be the fastest. Using the bundled PCRE extension that PHP ships with, preg_split outperforms all other approaches ~70% of the time; the loops are faster ~20% of the time, and preg_match_all is fastest ~10% of the time.
On codepad, which for some reason doesn't use the standard PCRE, preg_match_all doesn't work, and shuffle didn't prove reliable either, so I added a Knuth shuffle and tested the difference between /\d/ and /[0-9]/ in combination with preg_split instead. As a result, regex is faster >95% of the time on codepad.
In short: use preg_split + regex for the best results.
Anyway, here's the benchmark code. It may seem silly to put it all into a class, but really, it's the fair way to benchmark. The string that is processed is kept in memory, as are all the arrays that are used to time the functions, and compare speeds.
I'm not calling the test methods directly either, but through a timeCall method, simply because I want the garbage collector to collect whatever needs collecting after each call. Anyway, this code isn't too difficult to figure out, and it's the results that matter:
class Bench
{
/**
* @var string
*/
private $str = '123d4sd6789';
private $functions = array(
'regex' => null,
'regex2' => null,
'loop' => null,
'loop2' => null
);
private $random = null;
public function __construct()
{
$this->random = array_keys($this->functions);
if (!shuffle($this->random)) $this->knuth();
}
/**
* Knuth shuffle
*/
private function knuth()
{
for ($i=count($this->random)-1,$j=mt_rand(0,$i);$i>0;$j=mt_rand(0,--$i))
{
$temp = $this->random[$j];
$this->random[$j] = $this->random[$i];
$this->random[$i] = $temp;
}
}
/**
* Call all functions in random order, timing each function
* determine fastest approach, and echo results
* @param $randomize
* @return string
*/
public function test($randomize)
{
if ($randomize) if (!shuffle($this->random)) $this->knuth();
foreach($this->random as $func) $this->functions[$func] = $this->timeCall($func);
$fastest = array('f', 100000);
foreach($this->functions as $func => $time)
{
$fastest = $fastest[1] > $time ? array($func, $time) : $fastest;
echo 'Function ', $func, ' took ', $time, 'ms', PHP_EOL;
}
echo 'Fastest approach: ', $fastest[0], ' (', $fastest[1], 'ms)', PHP_EOL;
return $fastest[0];
}
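/**
* Time function call
* @param string $func
* @return float elapsed time in milliseconds
*/
private function timeCall($func)
{
echo $func, PHP_EOL;
$start = microtime(true);
$this->{$func}();
// microtime(true) is in seconds; convert to milliseconds to match the 'ms' label in test()
return (microtime(true) - $start) * 1000;
}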
/**
* Count digits in string using preg_split
* @return int
*/
private function regex()
{
return count(preg_split('/\d/', $this->str)) - 1;
}
/**
* count digits in string using str_split + is_numeric + loop
* @return int
*/
private function loop()
{
$chars = str_split($this->str);
$counter = 0;
foreach($chars as $char) if (is_numeric($char)) ++$counter;
return $counter;
}
/**
* count digits by iterating over string, using is_numeric
* @return int
*/
private function loop2()
{
// square brackets: the curly-brace offset syntax $str{$i} was removed in PHP 8
for($i=0,$j=strlen($this->str),$counter=0;$i<$j;++$i) if (is_numeric($this->str[$i])) ++$counter;
return $counter;
}
/**
* use preg_split + [0-9] instead of \d
* @return int
*/
private function regex2()
{
return count(preg_split('/[0-9]/', $this->str)) - 1;
}
}
$benchmark = new Bench();
$totals = array();
for($i=0;$i<10;++$i)
{
$func = $benchmark->test($i);
if (!isset($totals[$func])) $totals[$func] = 0;
++$totals[$func];
if ($i < 9) echo PHP_EOL, '---------------------------------------------', PHP_EOL;
}
var_dump($totals);
Here's the codepad I set up
Do you really want a regex?
$arr1 = str_split($password);
$counter=0;
foreach($arr1 as $v){
if(is_numeric($v))$counter++;
}
Use preg_match_all to select all the numbers:
$reg = '/[0-9]/';
$string = '123d4sd6789';
preg_match_all($reg, $string, $out);
echo (count($out[0]));
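As a side-by-side sketch of why the original call came up short: preg_match() stops at the first match, and the negated class [^0-9] matches a non-digit, whereas preg_match_all() returns the number of full matches directly, so the $out array isn't even needed for counting.

```php
<?php
$password = '123d4sd6789';

// The pattern from the question: the first character that is NOT a digit.
preg_match('/[^0-9]/', $password, $first);
echo $first[0], PHP_EOL; // d

// preg_match_all() returns how many times the pattern matched,
// which for /[0-9]/ is the number of digits.
$digitCount = preg_match_all('/[0-9]/', $password);
echo $digitCount, PHP_EOL; // 8
```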
I'm wondering why I'm having trouble inserting strings like "hey hey %80" into the DB. The '%80' still produces an exception:
Uncaught exception 'MongoException' with message 'non-utf8 string: hey hey �'
What do I need to do? :( Is %80 not a UTF-8 char? :O
The JS passes the string to the controller:
function new_pool_post(_url,_data,_starter){
$.ajax({
type:'POST',
data:_data,
dataType:'json',
url:_url,
beforeSend:function(){
$('.ajax-loading').show();
$(_starter).attr('disabled','disabled');
},
error:function(){
$('.ajax-loading').hide();
$(_starter).removeAttr('disabled');
},
success:function(json){
$('.ajax-loading').hide();
$(_starter).removeAttr('disabled');
if(json){
$('.pool-append').prepend(json.pool_post);
}
}
});
}
The controller receives the data:
$id_project = $this->input->post('id_project',true);
$id_user = $this->session->userdata('user_id');
$pool_post = $this->input->post('pool_post',true);
The controller sanitizes the data:
public function xss_clean($str, $is_image = FALSE)
{
/*
* Is the string an array?
*
*/
if (is_array($str))
{
while (list($key) = each($str))
{
$str[$key] = $this->xss_clean($str[$key]);
}
return $str;
}
/* Remove non-UTF-8 chars */
$str = htmlspecialchars(urlencode(preg_replace('/[\x00-\x1F\x80-\xFF]/','',$str)));
/*
* Remove Invisible Characters
*/
$str = remove_invisible_characters($str);
// Validate Entities in URLs
$str = $this->_validate_entities($str);
/*
* URL Decode
*
* Just in case stuff like this is submitted:
*
* <a href="http://%77%77%77%2E%67%6F%6F%67%6C%65%2E%63%6F%6D">Google</a>
*
* Note: Use rawurldecode() so it does not remove plus signs
*
*/
$str = rawurldecode($str);
/*
* Convert character entities to ASCII
*
* This permits our tests below to work reliably.
* We only convert entities that are within tags since
* these are the ones that will pose security problems.
*
*/
$str = preg_replace_callback("/[a-z]+=([\'\"]).*?\\1/si", array($this, '_convert_attribute'), $str);
$str = preg_replace_callback("/<\w+.*?(?=>|<|$)/si", array($this, '_decode_entity'), $str);
/*
* Remove Invisible Characters Again!
*/
$str = remove_invisible_characters($str);
/*
* Convert all tabs to spaces
*
* This prevents strings like this: ja vascript
* NOTE: we deal with spaces between characters later.
* NOTE: preg_replace was found to be amazingly slow here on
* large blocks of data, so we use str_replace.
*/
if (strpos($str, "\t") !== FALSE)
{
$str = str_replace("\t", ' ', $str);
}
/*
* Capture converted string for later comparison
*/
$converted_string = $str;
// Remove Strings that are never allowed
$str = $this->_do_never_allowed($str);
/*
* Makes PHP tags safe
*
* Note: XML tags are inadvertently replaced too:
*
* <?xml
*
* But it doesn't seem to pose a problem.
*/
if ($is_image === TRUE)
{
// Images have a tendency to have the PHP short opening and
// closing tags every so often so we skip those and only
// do the long opening tags.
$str = preg_replace('/<\?(php)/i', "&lt;?\\1", $str);
}
else
{
$str = str_replace(array('<?', '?'.'>'), array('&lt;?', '?&gt;'), $str);
}
/*
* Compact any exploded words
*
* This corrects words like: j a v a s c r i p t
* These words are compacted back to their correct state.
*/
$words = array(
'javascript', 'expression', 'vbscript', 'script',
'applet', 'alert', 'document', 'write', 'cookie', 'window'
);
foreach ($words as $word)
{
$temp = '';
for ($i = 0, $wordlen = strlen($word); $i < $wordlen; $i++)
{
$temp .= substr($word, $i, 1)."\s*";
}
// We only want to do this when it is followed by a non-word character
// That way valid stuff like "dealer to" does not become "dealerto"
$str = preg_replace_callback('#('.substr($temp, 0, -3).')(\W)#is', array($this, '_compact_exploded_words'), $str);
}
/*
* Remove disallowed Javascript in links or img tags
* We used to do some version comparisons and use of stripos for PHP5,
* but it is dog slow compared to these simplified non-capturing
* preg_match(), especially if the pattern exists in the string
*/
do
{
$original = $str;
if (preg_match("/<a/i", $str))
{
$str = preg_replace_callback("#<a\s+([^>]*?)(>|$)#si", array($this, '_js_link_removal'), $str);
}
if (preg_match("/<img/i", $str))
{
$str = preg_replace_callback("#<img\s+([^>]*?)(\s?/?>|$)#si", array($this, '_js_img_removal'), $str);
}
if (preg_match("/script/i", $str) OR preg_match("/xss/i", $str))
{
$str = preg_replace("#<(/*)(script|xss)(.*?)\>#si", '[removed]', $str);
}
}
while($original != $str);
unset($original);
// Remove evil attributes such as style, onclick and xmlns
$str = $this->_remove_evil_attributes($str, $is_image);
/*
* Sanitize naughty HTML elements
*
* If a tag containing any of the words in the list
* below is found, the tag gets converted to entities.
*
* So this: <blink>
* Becomes: &lt;blink&gt;
*/
$naughty = 'alert|applet|audio|basefont|base|behavior|bgsound|blink|body|embed|expression|form|frameset|frame|head|html|ilayer|iframe|input|isindex|layer|link|meta|object|plaintext|style|script|textarea|title|video|xml|xss';
$str = preg_replace_callback('#<(/*\s*)('.$naughty.')([^><]*)([><]*)#is', array($this, '_sanitize_naughty_html'), $str);
/*
* Sanitize naughty scripting elements
*
* Similar to above, only instead of looking for
* tags it looks for PHP and JavaScript commands
* that are disallowed. Rather than removing the
* code, it simply converts the parenthesis to entities
* rendering the code un-executable.
*
* For example: eval('some code')
* Becomes: eval&#40;'some code'&#41;
*/
$str = preg_replace('#(alert|cmd|passthru|eval|exec|expression|system|fopen|fsockopen|file|file_get_contents|readfile|unlink)(\s*)\((.*?)\)#si', "\\1\\2&#40;\\3&#41;", $str);
// Final clean up
// This adds a bit of extra precaution in case
// something got through the above filters
$str = $this->_do_never_allowed($str);
/*
* Images are Handled in a Special Way
* - Essentially, we want to know that after all of the character
* conversion is done whether any unwanted, likely XSS, code was found.
* If not, we return TRUE, as the image is clean.
* However, if the string post-conversion does not matched the
* string post-removal of XSS, then it fails, as there was unwanted XSS
* code found and removed/changed during processing.
*/
if ($is_image === TRUE)
{
return ($str == $converted_string) ? TRUE: FALSE;
}
log_message('debug', "XSS Filtering completed");
return $str;
}
The controller passes the sanitized data to the model, and the model inserts it into MongoDB:
nothing more ... :)
I had a related problem.
For example, ucfirst on UTF-8 text needs a multibyte-aware version such as mb_ucfirst('helo','UTF-8');
I think in your situation the problem is substr: you need to use mb_substr instead.
Otherwise, maybe convert to ISO-8859-1 with iconv at the beginning, and convert back to UTF-8 with iconv when writing to the DB.
To prevent the problem you can use
header("Content-Type: text/html; charset=UTF-8");
at the top of the PHP file.
I found the solution in this Stack Overflow post, and it worked for me when migrating a MySQL DB with Latin special chars to MongoDB.
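For the "%80" case in the question, one plausible explanation is the rawurldecode() call inside xss_clean: it turns the escape %80 into the single byte 0x80, which is not valid UTF-8 on its own. The sketch below only demonstrates that effect and one possible cleanup. It assumes the mbstring extension is available, and the re-encoding step is my own suggestion, not something taken from the asker's code.

```php
<?php
// rawurldecode() converts "%80" into the raw byte 0x80.
$decoded = rawurldecode('hey hey %80');
$validBefore = mb_check_encoding($decoded, 'UTF-8');
var_dump($validBefore); // bool(false)

// Assumption: converting from UTF-8 to UTF-8 lets mbstring replace
// invalid byte sequences with its substitute character, so the
// result is a valid UTF-8 string again.
$clean = mb_convert_encoding($decoded, 'UTF-8', 'UTF-8');
$validAfter = mb_check_encoding($clean, 'UTF-8');
var_dump($validAfter);  // bool(true)
```

Checking with mb_check_encoding() right before the insert would at least turn the MongoException into a validation error you control.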