PHP: break string into smaller parts based on string, not length

I have read up on the php functions wordwrap and chunk_split but I can't figure out how to break down a string into smaller chunks when there are no physical breaks in the string.
I have a URL-encoded string:
%5B%7B%22partNumber%22%3A%2243160-1104%22%7D%2C%7B%22partNumber%22%3A%2242410-6170%22%7D%2C%7B%22partNumber%22%3A%2222-10-2021%22%7D%2C%7B%22partNumber%22%3A%2255091-0674%22%7D%2C%7B%22partNumber%22%3A%2243160-0106%22%7D%2C%7B%22partNumber%22%3A%2287832-1420%22%7D%2C%7B%22partNumber%22%3A%2273415-1001%22%7D%2C%7B%22partNumber%22%3A%2253627-1274%22%7D%2C%7B%22partNumber%22%3A%2243650-0510%22%7D%5D
of a bunch of part numbers I'm feeding into an API. This API can only take 500 characters at a time before it returns a false to me, so I need to break my string down to UNDER 500 characters, but still be a complete, searchable string.
Meaning, however it's broken down, each iteration of this new string needs to be under 500 characters and end with B%22, so that the next iteration of the string starts with partNumber%22.
I'm not sure how I would accomplish this using the wordwrap + explode method as I've only ever used this to break a string by length. Is there a function similar to this that I can use where I can specify an exact string to break at after so many characters?

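Use explode. Note that explode removes the delimiter from each piece and does not enforce the 500-character limit, so the pieces may still need to be regrouped before each request.
$apiStrings = explode("B%22", $string);
foreach ($apiStrings as $apiString) {
    // Do request
}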
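Because the string in the question is a URL-encoded JSON array, another option is to decode it, group the items, and re-encode each group, so that every piece is a complete array below the limit. A minimal sketch, assuming the payload always decodes to a JSON array; the function name chunkEncodedJson and the $limit parameter are made up for illustration and are not from the original posts:

```php
<?php
// Hypothetical helper: splits a URL-encoded JSON array into several
// URL-encoded JSON arrays, each shorter than $limit characters.
function chunkEncodedJson(string $encoded, int $limit = 500): array
{
    $items = json_decode(rawurldecode($encoded), true);
    if (!is_array($items)) {
        throw new InvalidArgumentException('Input is not a URL-encoded JSON array.');
    }

    $chunks = [];
    $current = [];
    foreach ($items as $item) {
        $candidate = array_merge($current, [$item]);
        // Start a new chunk when adding this item would reach the limit.
        // A single item that is too long on its own is still emitted as-is.
        if ($current !== [] && strlen(rawurlencode(json_encode($candidate))) >= $limit) {
            $chunks[] = rawurlencode(json_encode($current));
            $current = [$item];
        } else {
            $current = $candidate;
        }
    }
    if ($current !== []) {
        $chunks[] = rawurlencode(json_encode($current));
    }
    return $chunks;
}
```

Each returned string is itself a valid encoded array (it starts with %5B and ends with %5D), so it can be sent to the API without further patching.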

Related

Check if a string longer than xx characters without going through the entire string with strlen()?

I'm displaying some very very long text strings with a show more link. So I need a way to find out how long the string is.
I know I can simply use strlen() to size the string up and then compare the returned length against a limit like 50 characters but for very long strings I think it's a performance drag, especially considering there are nearly a thousand strings to be gauged per user request.
So is there any way to just make sure if the string is longer than 50 characters and then stop?
I know I can make an in-house function to go through the string character by character but is there any better practice in this case as I believe this is a rather common problem?
If the only thing you need to check is whether the string is longer than 50 characters, then you can check whether the character at index 50 (the 51st character) exists.
For example, for a string 25 chars long, I need to know whether it's longer than 20 characters or not.
$str = '123456789qwertyuiopasdfgh';
// isset() avoids the "Uninitialized string offset" warning on shorter strings
if (isset($str[20])) echo 'It is longer';
else echo 'It is not';
I'm not sure if it's faster than a simple strlen, but you can try some performance tests C:
UPD: Also, note that the index starts from 0.
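The offset check can be wrapped in a small function. This is only a sketch; the name isLongerThan is hypothetical. PHP stores each string's length alongside its bytes, so strlen() does not walk the string, and the plain comparison shown in the comment should be just as cheap:

```php
<?php
// Hypothetical helper, not part of the original answer.
function isLongerThan(string $str, int $limit): bool
{
    // Offsets are zero-based, so offset $limit exists only when the
    // string holds at least $limit + 1 bytes.
    return isset($str[$limit]);
}

// Equivalent check with strlen(): strlen($str) > $limit
$str = '123456789qwertyuiopasdfgh';
echo isLongerThan($str, 20) ? 'It is longer' : 'It is not';
```

Both checks count bytes rather than characters, which matters for multibyte text such as UTF-8.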

Find nth character except if its enclosed in brackets php

I use the following function to find the nth occurrence of a character in a string, which works well. However, there is one exception. Let's say the character is a comma for this purpose: what I need to change is that if the comma is within ( and ), it shouldn't count.
function strposnth($haystack, $needle, $nth=1, $insenstive=0)
{
    // if it's case insensitive, convert strings into lower case
    if ($insenstive) {
        $haystack = strtolower($haystack);
        $needle = strtolower($needle);
    }
    // count number of occurrences
    $count = substr_count($haystack, $needle);
    // first check if the needle exists in the haystack, return false if it does not
    // also check if the requested nth is within the count, return false if it isn't
    if ($count < 1 || $nth > $count) return false;
    // run a loop up to the nth occurrence
    // $pos and $len both start at 0, so the very first search begins at offset 0
    for ($i = 0, $pos = 0, $len = 0; $i < $nth; $i++)
    {
        // get the position of needle in haystack
        // provide starting point 0 for first time ($pos=0, $len=0)
        // provide starting point as position + length of needle for next time
        $pos = strpos($haystack, $needle, $pos + $len);
        // check the length of needle to specify in strpos
        // do this only first time
        if ($i == 0) $len = strlen($needle);
    }
    // return the position
    return $pos;
}
So I've got the regex working that only captures the comma when it is outside of (), which is:
'/,(?=[^)]*(?:[(]|$))/'
and you can see a live example working here:
http://regex101.com/r/xE4jP8
but I'm not sure how to make it work within the strpos loop. I know what I need to do (tell it the needle has this regex exception), but I am not sure how to make it work. Maybe I should ditch the function and use another method?
Just to mention, the end result I want is to split the string after every 6 commas, before the next string starts. Example:
rttr,ertrret,ertret(yes,no),eteert,ert ert,rtrter,0 rttr,ert(yes,no)rret,ert ret,eteert,ertert,rtrter,1 rttr,ertrret,ert ret,eteert,ertert,rtrter,0 rttr,ertrret,ert ret,eteert,ertert,rtrter,2 rttr,ert(white,black)rret,ert ret,eteert,ertert,rtrter,0 rttr,ertrret,ert ret,eteert,ertert,rtrter,0 rttr,ertrret,ert ret,et(blue,green)eert,ertert,rtrter,1
Note that there is always a 1-digit number (1-3) and a space after the 6th comma before the next part of the string begins. I can't rely on that pattern alone, as it's possible that it occurs earlier in the string, but I can always rely on the fact that I'll need to split the string after the first digit and space that follow the 6th comma. So I want to split the string directly after this.
For example the above string would be split like this:
rttr,ertrret,ertret(yes,no),eteert,ert ert,rtrter,0
rttr,ert(yes,no)rret,ert ret,eteert,ertert,rtrter,1
rttr,ertrret,ert ret,eteert,ertert,rtrter,0
rttr,ertrret,ert ret,eteert,ertert,rtrter,2
rttr,ert(white,black)rret,ert ret,eteert,ertert,rtrter,0
rttr,ertrret,ert ret,eteert,ertert,rtrter,0
rttr,ertrret,ert ret,et(blue,green)eert,ertert,rtrter,1
I can do that myself pretty easily if I know how to get the position of the character, since I can then use substr to split it. An easier way might be preg_split, but I'm not sure how that would work until I figure this part out.
I hope I wasn't too confusing in explaining, I bet I was :)
For these kinds of nesting problems, regex is usually not the right tool. However, when the problem is actually not that complicated, as yours seems to be, regex will do just fine.
Try this:
(?:^|,)((?:[^,(]*(?:\([^)]*\))?)*)

(?:^|,)            start the search with a comma or the start of the string
((?:               start capture group, then a non-capture group
[^,(]*             search until comma or open parenthesis
(?:\([^)]*\))?     if parenthesis found then capture until end of parenthesis
)*)                end of groups, repeat if necessary
See it in action: http://regex101.com/r/eS0cX4
As you can see, this will capture everything between the commas that are outside of the parentheses. If you get all these matches into an array using preg_match_all, you can split it any which way you like.
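If the regex route gets awkward, the split described in the question can also be done with a plain scan that tracks parenthesis depth. This is a sketch under the question's stated assumptions (six top-level commas per record, then a digit and a single space); the function name splitRecords and its parameter are hypothetical:

```php
<?php
// Hypothetical helper: splits $input into records, where a record ends at
// the first space after its 6th comma that is outside parentheses.
function splitRecords(string $input, int $commasPerRecord = 6): array
{
    $records = [];
    $start = 0;   // offset where the current record begins
    $depth = 0;   // current parenthesis nesting depth
    $commas = 0;  // top-level commas seen in the current record
    $length = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        if ($char === '(') {
            $depth++;
        } elseif ($char === ')' && $depth > 0) {
            $depth--;
        } elseif ($char === ',' && $depth === 0) {
            $commas++;
            if ($commas === $commasPerRecord) {
                // The record runs up to the next space, or to the end of input.
                $end = strpos($input, ' ', $i);
                if ($end === false) {
                    $end = $length;
                }
                $records[] = substr($input, $start, $end - $start);
                $start = $end + 1;
                $i = $end;    // continue scanning after the space
                $commas = 0;
            }
        }
    }
    // Keep any trailing text that did not form a complete record.
    if ($start < $length) {
        $records[] = substr($input, $start);
    }
    return $records;
}
```

Spaces inside earlier fields (such as "ert ert") are not a problem, because the search for the separating space only starts at the 6th top-level comma.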

How to split a string and find the occurence of one string in another?

I need to figure out how to do some C# code in PHP, and I'm not sure exactly how.
First off, I need the Split function. I'm going to have a string like
"identifier 82asdjka271akshjd18ajjd"
and I need to split the identifier word from the rest. In C#, I used string.Split(new char{' '}); or something like that (working off the top of my head) and got two strings: the first word, and then the second part. I understand that the PHP split function has been deprecated as of PHP 5.3.0, so that's not an option. What are the alternatives?
I'm also looking for an IndexOf function. Taking the above string as an example again, I would need the location of 271 in the string, so I can generate a substring.
You can use explode for splitting and strpos for finding the index of one string inside another.
$a = "identifier 82asdjka271akshjd18ajjd";
$arr = explode(' ',$a); // split on space to get an array of size 2.
$pos = strpos($arr[1],'271'); // search for '271' in the 2nd element of the array.
echo $pos; // prints 8
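A slightly more defensive variant of the same idea is sketched below. The variable names are illustrative only. The limit argument of explode keeps everything after the first space in one piece, and the result of strpos is compared with === because a match at offset 0 would otherwise look like a failure:

```php
<?php
$a = "identifier 82asdjka271akshjd18ajjd";

// Limit of 2: split only at the first space, giving the first word and the rest.
[$word, $rest] = explode(' ', $a, 2);

// strpos() returns false when the needle is missing, so test with ===.
$pos = strpos($rest, '271');
$tail = ($pos === false) ? null : substr($rest, $pos);

echo $word, "\n";  // identifier
echo $pos, "\n";   // 8
echo $tail, "\n";  // 271akshjd18ajjd
```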

How to count characters including white space and then break onto a new line after a certain length using PHP

How do I count characters, including white space, and then break after a certain length? For instance, how would I break a string onto a new line after 25 characters using PHP?
Fortunately somebody's already done the work. Use wordwrap.
If you really want to reinvent the wheel for learning's sake, here are a few pieces to get you started:
for (...) { }
strlen()
$str[$x] to access character x of string $str
% (the modulus operator)
. (the string concatenation operator)
Try chunk_split() if you don't mind cutting words in half. It treats whitespace as any other char.
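To make the difference between the two functions concrete, here is a small sketch using a made-up sample sentence and the 25-character width from the question:

```php
<?php
$text = 'How to count characters including white space and then break after a certain length';

// Break at word boundaries so that no line exceeds 25 characters.
// The fourth argument (true) also cuts single words longer than the width.
$wrapped = wordwrap($text, 25, "\n", true);

// Break strictly every 25 characters, even in the middle of a word.
// chunk_split() appends the separator after the last chunk too, so trim it.
$chunked = rtrim(chunk_split($text, 25, "\n"), "\n");

echo $wrapped, "\n\n", $chunked, "\n";
```

wordwrap() drops the space at each break point, while chunk_split() keeps every character and only inserts separators.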

php regular expression to filter out junk

So I have an interesting problem: I have a string, and for the most part I know what to expect:
http://www.someurl.com/st=????????
Except in this case, the ?'s are either upper case letters or numbers. The problem is, the string has garbage mixed in: the string is broken up into 5 or 6 pieces, and in between there's lots of junk: unprintable characters, foreign characters, as well as plain old normal characters. In short, stuff that's apt to look like this: Nyþ=mî;ëMÝ×nüqÏ
Usually the last 8 characters (the ?'s) are together right at the end, so at the moment I just have PHP grab the last 8 chars and hope for the best. Occasionally, that doesn't work, so I need a more robust solution.
The problem is technically unsolvable, but I think the best solution is to grab characters from the end of the string while they are upper case or numeric. If I get 8 or more, assume that is correct. Otherwise, find the st= and grab characters going forward, as many as I need to fill up the 8-character quota. Is there a regex way to do this, or will I need to roll up my sleeves and go nested-loop style?
update:
To clear up some confusion, I get an input string that's like this:
[garbage]http:/[garbage]/somewe[garbage]bsite.co[garbage]m/something=[garbage]????????
except the garbage is in unpredictable locations in the string (except the end is never garbage), and has unpredictable length (at least, I have been able to find patterns in neither). Usually the ?s are all together hence me just grabbing the last 8 chars, but sometimes they aren't which results in some missing data and returned garbage :-\
$var = '†http://þ=www.ex;üßample-website.î;ëcomÝ×ü/joy_hÏere.html'; // test case
$clean = join(
    array_filter(
        str_split($var, 1),
        function ($char) {
            return (
                array_key_exists(
                    $char,
                    array_flip(array_merge(
                        range('A','Z'),
                        range('a','z'),
                        range((string)'0',(string)'9'),
                        array(':','.','/','-','_')
                    ))
                )
            );
        }
    )
);
Hah, that was a joke. Here's a regex for you:
$clean = preg_replace('/[^A-Za-z0-9:.\/_-]/','',$var);
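The heuristic described in the question (trust the last 8 characters if they are all upper-case letters or digits, otherwise collect matching characters after st=) can be sketched with the same kind of character class. This assumes the st= marker itself survives intact, which the question does not guarantee; the function name extractCode is hypothetical:

```php
<?php
// Hypothetical helper implementing the heuristic from the question.
// Returns null when no plausible code can be recovered.
function extractCode(string $input, int $length = 8): ?string
{
    // Case 1: the string ends with $length upper-case letters or digits.
    if (preg_match('/[A-Z0-9]{' . $length . '}$/D', $input, $m)) {
        return $m[0];
    }

    // Case 2: look after the last "st=" and drop every byte that is not
    // an upper-case letter or digit, then keep the first $length bytes.
    $pos = strrpos($input, 'st=');
    if ($pos === false) {
        return null;
    }
    $candidates = preg_replace('/[^A-Z0-9]/', '', substr($input, $pos + 3));
    return strlen($candidates) >= $length ? substr($candidates, 0, $length) : null;
}
```

If the garbage itself contains upper-case letters or digits, the fallback will pick up wrong characters, so the result is a best guess rather than a guaranteed recovery.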
As stated, the problem is unsolvable. If the garbage can contain "plain old normal characters", and the garbage can fall at the end of the string, then you cannot know whether the target string from this sample is "ABCDEFGH" or "BCDEFGHI":
__http:/____/somewe___bsite.co____m/something=__ABCDEFGHI__
What do these values represent? If you want to retain all of it, just without having to deal with garbage in your database, maybe you should hex-encode it using bin2hex().
You can use this regular expression :
if (preg_match('/[\'^£$%&*()}{##~?><>,|=_+¬-]/', $string) ==1)
