Delete first character of string in-place using PHP - php

There is this post:
PHP Subtract First Character of String
It advises me to use substr(...);
I want to keep a rolling text to log if an error occurs (the latest 1000 characters from a stream), but it seems like there should be a better way than creating a 1000-character string from a 1001-character string and then assigning it back to the original variable.
I will be doing this in a very tight loop, so performance is not negligible (even though I haven't measured this yet).
Is there any way to delete first character of a string in-place?

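This appears to work on older PHP versions, but it is not a good choice:
<?php
$str = '12345678';
$str[0] = null; // before PHP 7.1: overwrites the first byte with a NUL byte; PHP 7.1+: throws an Error
echo $str; // looks like 2345678 when displayed, but the NUL byte is still in the string
?>
The first character is not deleted, it is only "hidden":
echo strlen($str); // output: 8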

The obvious question would be why you would want to do this in PHP? You probably have operating system support for rolling logs.
However, if you wish to have a robust solution you are most likely best off using substr.
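Array access on the string is not an alternative, though. String offsets can be read and overwritten but not removed, so the following is a fatal error ("Cannot unset string offsets"):
unset($your_string[0]);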
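For the rolling-log case in the question, a minimal sketch of the substr approach could look like the following. The helper name appendRolling and its parameters are made up for illustration. It appends a chunk and keeps only the last $limit bytes; whether this is fast enough for a tight loop has not been measured here.

```php
<?php
// Hypothetical helper: append $chunk to $buffer and keep only the
// last $limit bytes. Assumes $limit >= 1.
function appendRolling(string $buffer, string $chunk, int $limit = 1000): string
{
    $buffer .= $chunk;
    if (strlen($buffer) <= $limit) {
        return $buffer; // still within the window, nothing to trim
    }
    // A negative start counts from the end of the string.
    return substr($buffer, -$limit);
}
```

Appending whole chunks from the stream rather than single characters means one trim per chunk instead of one per character.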

Related

How to get NumberFormatter::parse() to only parse actual numeric strings?

I’m trying to parse some strings in some messed-up CSV files (about 100,000 rows per file). Some columns have been squished together in some rows, and I’m trying to get them unsquished back into their proper columns. Part of the logic needed there is to find whether a substring in a given column is numeric or not.
Non-numeric strings can be anything, including strings that happen to begin with a number; numeric strings are generally written the European way, with dots used for thousand separators and commas for decimals, so without going through a bunch of string replacements, is_numeric() won’t do the trick:
\var_dump(is_numeric('3.527,25')); // bool(FALSE)
I thought – naïvely, it transpires – that the right thing to do would be to use NumberFormatter::parse(), but it seems that function doesn’t actually check whether the string given as a whole is parseable as a numeric string at all – instead it just starts at the beginning and when it reaches a character not allowed in a numeric string, cuts off the rest.
Essentially, what I’m looking for is something that will yield this:
$formatter = new \NumberFormatter('de-DE', \NumberFormatter::DECIMAL);
\var_dump($formatter->parse('3.527,25')); // float(3527.25)
\var_dump($formatter->parse('3thisisnotanumber')); // bool(FALSE)
But all I can get is this:
$formatter = new \NumberFormatter('de-DE', \NumberFormatter::DECIMAL);
\var_dump($formatter->parse('3.527,25')); // float(3527.25)
\var_dump($formatter->parse('3thisisnotanumber')); // float(3)
I figured perhaps the problem was that the LENIENT_PARSE attribute was set to true, but setting it to false ($formatter->setAttribute(\NumberFormatter::LENIENT_PARSE, 0)) has no effect; non-numeric strings still get parsed just fine as long as they begin with a number.
Since there are so many rows and each row may have as many as ten columns that need to be validated, I’m looking at upwards of a million validations per file – for that reason, I would prefer avoiding a preg_match()-based solution, since a million regex match calls would be quite expensive.
Is there some way to tell the NumberFormatter class that you would like it to please not be lenient and only treat the string as parseable if the entire string is numeric?
You can strip all the separators and check if whatever remains is a numeric value.
function customIsNumeric(string $value): bool
{
return is_numeric(str_replace(['.', ','], '', $value));
}
You can use is_numeric() to check that it is only numbers before parsing. But NumberFormatter does not do what you are looking for here.
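One possible workaround, sketched below, uses the optional third parameter of NumberFormatter::parse(), which is passed by reference and on return holds the offset at which parsing stopped. If that offset is not the end of the string, something was left unparsed. The helper name parseWholeNumber is hypothetical. The sketch requires the intl extension and compares the offset with strlen(), so it assumes plain ASCII input such as the de-DE examples above.

```php
<?php
// Hypothetical helper, not part of the intl API. Requires ext-intl.
// Returns the parsed number, or false unless the whole string was consumed.
function parseWholeNumber(NumberFormatter $formatter, string $value)
{
    $offset = 0; // parse() updates this to where parsing ended
    $number = $formatter->parse($value, NumberFormatter::TYPE_DOUBLE, $offset);
    if ($number === false || $offset !== strlen($value)) {
        return false; // nothing parsed, or trailing characters left over
    }
    return $number;
}
```

With a de-DE DECIMAL formatter, '3.527,25' should be accepted in full, while '3thisisnotanumber' should be rejected because parsing stops after the first character.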

PHP: convert a hexadecimal number 273ef9 into a path 27/3e/f9

As the title reads, what is an efficient way to convert a hexadecimal number such as 273ef9 into a path such as 27/3e/f9 in PHP?
Update:
Actually, I want to convert an unusual number to hexadecimal and then convert that to a path... but maybe we can skip the middle step.
How about combining a str_split with implode? Might not be super efficient but very readable:
implode('/',str_split("273ef9",2));
As a side note, this will of course work well with larger hex strings and can handle partial (3,5,7 in length) hex numbers (by just printing it as a single letter after the last slash).
Edit: With what you're asking now (decimal -> hex -> path), it would look like this:
$num = 2572025;
$hex = dechex($num);
implode('/',str_split($hex,2));
Of course, you can combine it for an even shorter but less readable representation:
implode('/',str_split(dechex($num),2));
The most efficient approach is to touch each character in the hex value exactly once, building up the string as you go. Because the string may have either an odd or even number of digits, you'll have to start with a check for this, outputting a single digit if it's an odd-length string. Then use a for loop to append groups of two digits, being careful with whether or not to add a slash. It will be a few lines of code.
Unless this code is being executed many millions of times, it probably isn't worth writing out this algorithm; Michael Petrov's is so readable and so nice. Go with this unless you have a real need to optimize.
By the way, to go from a decimal number to a hex string, just use dechex :)
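The single-pass loop described above could be sketched roughly as follows. The function name hexToPath is made up for illustration. Note that, as described, an odd-length string gets its lone digit at the front, whereas the str_split version puts the leftover digit at the end.

```php
<?php
// Hypothetical helper: split a hex string into two-digit path segments.
// For odd-length input the single leading digit becomes its own segment.
function hexToPath(string $hex): string
{
    $len = strlen($hex);
    $start = $len % 2;                    // 1 if there is a lone leading digit
    $path = $start === 1 ? $hex[0] : '';
    for ($i = $start; $i < $len; $i += 2) {
        if ($path !== '') {
            $path .= '/';                 // separator between segments only
        }
        $path .= $hex[$i] . $hex[$i + 1];
    }
    return $path;
}
```

For a decimal input, hexToPath(dechex($num)) combines both steps.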

PHP dealing with huge string

I have to replace xmlns with ns in my incoming XML in order to fix SimpleXMLElement's xpath() function. Most functions do not have a performance problem, but there always seems to be an overhead as the string grows.
E.g. preg_replace on a 2 MB string takes 50ms to process, even if I limit the replaces to 1 and the replace is done at the very beginning.
If I substr the first few characters and replace only in that part, it is slightly faster, but that is not really what I want.
Is there any PHP method that would perform better for my problem? And if there is no option, could a simple PHP extension help that just does replace => SimpleXMLElement in C?
If you know exactly where the offending "x", "m" and "l" are, you can just use something like $xml[$x_pos] = ' '; $xml[$m_pos] = ' '; $xml[$l_pos] = ' ' to transform them into spaces. Or transform them into ns___ (where _ = space).
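A rough sketch of that idea: find the first xmlns attribute with strpos() and overwrite its five bytes with "ns" plus three spaces, so the string length never changes. The helper name blankFirstXmlns is hypothetical. The sketch assumes the attribute is written as a space followed by xmlns=, and it ignores prefixed declarations such as xmlns:foo and any match inside text content. The string is taken by reference, but PHP may still copy it internally if the value is shared with another variable.

```php
<?php
// Hypothetical helper: overwrite the first " xmlns=" attribute name with
// "ns   " (same length), modifying the caller's string directly.
function blankFirstXmlns(string &$xml): bool
{
    $pos = strpos($xml, ' xmlns=');
    if ($pos === false) {
        return false;                     // no default namespace declaration found
    }
    $replacement = 'ns   ';               // exactly as long as "xmlns"
    for ($i = 0; $i < 5; $i++) {
        $xml[$pos + 1 + $i] = $replacement[$i]; // +1 skips the leading space
    }
    return true;
}
```

The result still parses as XML because whitespace is allowed between an attribute name and the equals sign.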
You're always going to get an overhead when trying to do this - you're dealing with a char array and trying to replace multiple matching elements of the array (i.e. words).
50ms is not much of an overhead, unless (as I suspect) you're trying to do this in a loop?
50ms sounds pretty reasonable to me, for something like this. The requirement itself smells of something being wrong.
Is there any particular reason that you're using regular expressions? Why do people keep jumping to the overkill regex solution?
There is a bog-standard string replace function called str_replace that may do what you want in a fraction of the time (though whether this is right for you depends on how complex your search/replace is).
From the PHP source, as we can see, for example here:
http://svn.php.net/repository/php/php-src/branches/PHP_5_2/ext/standard/string.c
I don't see any copies, but I'm no expert in C. On the other hand, there are many convert-to-string calls, which at first sight could copy values. If they do copy values, then we are in trouble here.
Only if we are in trouble:
Try to reinvent the str_replace wheel here with character-by-character string processing. For example, take the string $somestring = "somevalue". In PHP we can access its characters by index: echo $somestring[0] gives "s" and echo $somestring[2] gives "m". (The older $somestring{0} curly-brace syntax was deprecated in PHP 7.4 and removed in PHP 8.0.) I'm not sure about this approach, but it is an option if the official implementations don't use references, as they should.

Blocking Cuss/Vulgar/Obscenity Terms in PHP

I know you might laugh, but actually this is a common need in most apps. Many apps that take in customer/visitor input may need to filter cuss words or vulgar terms.
Sometimes PHP changes and new stuff gets added in. For instance, just the other day I learned about MultiCurl API in PHP5. So, anyway, is there a new native function in PHP that lets me filter most common English-based cuss words in a string, as well as flip a boolean to say, "string had English-based cuss words in it"? It doesn't need to be perfect, obviously, but cut out a good bit of garbage and let me replace it with ### for instance.
If that's not part of PHP yet, then does anyone have a function that I can use which cloaks the cuss word list? For instance, I want it such that I can drop the class in a project and not have to worry about another programmer getting offended. In other words, a decently encoded cuss word list -- not one actually spelled out.
Now, obviously it needs to be flexible and let words like "rebuttal" get through.
tl;dr: Does PHP5 now have a native function that can filter obscene words? And if not, does anyone have a class that encodes a cuss word list so that it doesn't offend other programmers?
I doubt this is something that would be a high priority for the core PHP team since that treads dangerously close to censorship. Censorship in that they would have a 'master' list of 'inappropriate' language which should be filtered.
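You can do this fairly simply. Make up an array of all the words you want filtered out and, when a page is displayed that contains user input, run preg_replace() over it. (preg_filter() takes the same arguments but returns null for a string that contains no match, so it would blank out clean text.)
$bad_words = array('/bleeping/i', '/blooping/i'); // patterns need delimiters
$replace = '###';
$submitted_text = 'bleh blah....';
echo preg_replace($bad_words, $replace, $submitted_text);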
Note: you will have to deal with the edge cases where a bad word might be inside of a good word (i.e.- 'shitzu[sic] dog')
EDIT
For the bad-words-inside-good-words issue, you can add to the regular expression to require space at the beginning and end of the bad word. If you have lots of submissions though, it's going to be a constant battle to keep up with the trolls.
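As a sketch of the whole-word idea, the pattern below uses the \b word-boundary assertion instead of literal spaces, so punctuation next to a word is also handled. The function name filterWords, the ### mask and the placeholder word list are assumptions for illustration. It returns both the cleaned text and a boolean saying whether anything was replaced, which covers the flag the question asks for.

```php
<?php
// Hypothetical helper: mask whole-word, case-insensitive matches.
// Returns [cleaned text, whether anything was masked].
function filterWords(string $text, array $blocked, string $mask = '###'): array
{
    if ($blocked === []) {
        return [$text, false];
    }
    // Escape each word so it is matched literally inside the pattern.
    $escaped = array_map(static function ($word) {
        return preg_quote($word, '/');
    }, $blocked);
    $pattern = '/\b(?:' . implode('|', $escaped) . ')\b/i';
    $clean = preg_replace($pattern, $mask, $text, -1, $count);
    return [$clean ?? $text, $count > 0];
}
```

Because of the word boundaries, a blocked word that only occurs inside a longer word is left alone.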
<?php
$badwords = "fuc";
$replacebad = "****";
$string = $_POST['something'];
$filtered = str_ireplace($badwords, $replacebad, "$string");
echo $filtered;
?>
Something like this?
Edit:
Sorry, I didn't notice the PHP5 part.

php - Is strpos the fastest way to search for a string in a large body of text?

if (strpos(htmlentities($storage->getMessage($i)),'chocolate'))
Hi, I'm using gmail oauth access to find specific text strings in email addresses. Is there a way to find text instances quicker and more efficiently than using strpos in the above code? Should I be using a hash technique?
According to the PHP manual, yes: strpos() is the quickest way to determine whether one string contains another.
Note:
If you only want to determine if a particular needle occurs within haystack,
use the faster and less memory intensive function strpos() instead.
This is quoted time and again in any php.net article about other string comparators (I pulled this one from strstr())
Although there are two changes that should be made to your statement.
if (strpos($storage->getMessage($i),'chocolate') !== FALSE)
This is because if(0) evaluates to false (and therefore doesn't run), however strpos() can return 0 if the needle is at the very beginning (position 0) of the haystack. Also, removing htmlentities() will make your code run a lot faster. All that htmlentities() does is replace certain characters with their appropriate HTML equivalent. For instance, it replaces every & with &amp;.
As you can imagine, checking every character in a string individually and replacing many of them takes extra memory and processor power. Not only that, but it's unnecessary if you plan on just doing a text comparison. For instance, compare the following statements:
strpos('Billy & Sally', '&'); // 6
strpos('Billy &amp; Sally', '&'); // 6
strpos('Billy & Sally', 'S'); // 8
strpos('Billy &amp; Sally', 'S'); // 12
Or, in the worst case, you may even cause something true to evaluate to false.
strpos('<img src...', '<'); // 0
strpos('&lt;img src...', '<'); // FALSE
In order to circumvent this you'd end up using even more HTML entities.
strpos('&lt;img src...', '&lt;'); // 0
But this, as you can imagine, is not only annoying to code but also redundant. You're better off leaving HTML entities out entirely. htmlentities() is usually only needed when you're outputting text, not when comparing it.
strpos is likely to be faster than preg_match and the alternatives in this case. The best idea would be to run some benchmarks of your own with real example data and see what works best for your needs, although that may be overdoing it. Don't worry too much about performance until it starts to become a problem.
strpos() returns the starting position of the first occurrence of the string, or false (not null) if there is no match, so the result has to be compared strictly against false:
if (strpos($storage->getMessage($i),'chocolate') !== false) {}
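A minimal sketch that wraps the strict comparison in a helper, so the position-0 case cannot be mistaken for "not found". The name containsText is made up for illustration. On PHP 8.0 and later the built-in str_contains() does the same job.

```php
<?php
// Hypothetical helper: true if $needle occurs anywhere in $haystack,
// including at position 0. Assumes a non-empty $needle.
function containsText(string $haystack, string $needle): bool
{
    return strpos($haystack, $needle) !== false;
}
```

Used in the question's condition, this would read if (containsText($storage->getMessage($i), 'chocolate')).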
