Find nth character except if it's enclosed in brackets (PHP)

I use the following function to find the nth occurrence of a character in a string, and it works well. There is one exception, though. Say the character is a comma: if the comma is within ( and ), it shouldn't be counted.
function strposnth($haystack, $needle, $nth=1, $insenstive=0)
{
    //if it's case insensitive, convert both strings to lower case
    if ($insenstive) {
        $haystack=strtolower($haystack);
        $needle=strtolower($needle);
    }
    //count number of occurrences
    $count=substr_count($haystack,$needle);
    //return false if the needle does not occur in the haystack,
    //or if it occurs fewer than $nth times
    if ($count<1 || $nth > $count) return false;
    //loop up to the nth occurrence
    //$pos and $len both start at 0, so the first search begins at offset 0
    for($i=0,$pos=0,$len=0;$i<$nth;$i++)
    {
        //get the position of needle in haystack,
        //starting after the previous match (position + length of needle)
        $pos=strpos($haystack,$needle,$pos+$len);
        //set the needle length after the first search only
        if ($i==0) $len=strlen($needle);
    }
    //return the position
    return $pos;
}
So I've got a regex working that only captures the comma when it is outside of (), which is:
'/,(?=[^)]*(?:[(]|$))/'
and you can see a live example working here:
http://regex101.com/r/xE4jP8
But I'm not sure how to make it work within the strpos loop. I know what I need to do (tell it the needle has this regex exception), but I'm not sure how. Maybe I should ditch the function and use another method?
Just to mention, the end result I want is to split the string after every 6 commas, before the next record starts. Example:
rttr,ertrret,ertret(yes,no),eteert,ert ert,rtrter,0 rttr,ert(yes,no)rret,ert ret,eteert,ertert,rtrter,1 rttr,ertrret,ert ret,eteert,ertert,rtrter,0 rttr,ertrret,ert ret,eteert,ertert,rtrter,2 rttr,ert(white,black)rret,ert ret,eteert,ertert,rtrter,0 rttr,ertrret,ert ret,eteert,ertert,rtrter,0 rttr,ertrret,ert ret,et(blue,green)eert,ertert,rtrter,1
Note that after the 6th comma there is always a single digit (1-3) and a space before the next part of the string begins. I can't rely on that pattern alone, since it could also occur earlier in the string, but I can rely on it once I have found the 6th comma: the split point is directly after the first digit and space that follow it.
For example the above string would be split like this:
rttr,ertrret,ertret(yes,no),eteert,ert ert,rtrter,0
rttr,ert(yes,no)rret,ert ret,eteert,ertert,rtrter,1
rttr,ertrret,ert ret,eteert,ertert,rtrter,0
rttr,ertrret,ert ret,eteert,ertert,rtrter,2
rttr,ert(white,black)rret,ert ret,eteert,ertert,rtrter,0
rttr,ertrret,ert ret,eteert,ertert,rtrter,0
rttr,ertrret,ert ret,et(blue,green)eert,ertert,rtrter,1
I can do that myself pretty easily if I know how to get the position of the character, since I can then use substr to split it. An easier way might be preg_split, but I'm not sure how that would work until I figure this part out.
I hope I wasn't too confusing in explaining. I bet I was :)

For this kind of nesting problem, regex usually is not the right tool. However, when the problem is actually not that complicated, as yours seems to be, regex will do just fine.
Try this:
(?:^|,)((?:[^,(]*(?:\([^)]*\))?)*)
(?:^|,)     start the search with a comma or the start of the string
((?:        start the capture group and a non-capture group inside it
[^,(]*      search until a comma or an open parenthesis
(?:\([^)]*  if a parenthesis is found, then capture until
\))?        the end of the parenthesis
)*)         end of the capture group, repeating if necessary
See it in action: http://regex101.com/r/eS0cX4
As you can see, this will capture everything between the commas that are outside of the parentheses. If you get all these matches into an array using preg_match_all, you can split it any which way you like.
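If what you want is the position of the nth comma outside parentheses, a plain character scan is an alternative to the regex. This is only a minimal sketch: it assumes the needle is a single character and that parentheses are balanced, and the function name strposnth_outside is made up for this example.

```php
<?php
// Hypothetical helper: position of the $nth occurrence of the single
// character $needle that is not inside ( ... ). Returns false if not found.
function strposnth_outside(string $haystack, string $needle, int $nth = 1)
{
    $depth = 0; // how many unclosed "(" we are currently inside
    $seen  = 0; // how many top-level needles we have counted so far
    $len   = strlen($haystack);
    for ($i = 0; $i < $len; $i++) {
        $ch = $haystack[$i];
        if ($ch === '(') {
            $depth++;
        } elseif ($ch === ')') {
            if ($depth > 0) {
                $depth--;
            }
        } elseif ($ch === $needle && $depth === 0) {
            if (++$seen === $nth) {
                return $i;
            }
        }
    }
    return false;
}

$row = 'rttr,ertrret,ertret(yes,no),eteert,ert ert,rtrter,0 rttr,ert(yes,no)rret';
$pos = strposnth_outside($row, ',', 6);
// The record ends one digit after the 6th top-level comma.
$first = substr($row, 0, $pos + 2);
```

Here $first holds the first record, and the next record starts after the space that follows it, so the same call can be repeated on the remainder of the string.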

Related

How can I get all occurrences of this pattern with the regex of PHP?

How can I get, into an array, all occurrences of this pattern 4321[5-9][7-9]{6} but excluding, for example, the occurrences where there is a digit immediately before the value, or immediately after it?
For instance, 43217999999 should be valid but 143217999999 (note the number 1 at the beginning) should not be valid.
As the first example, 432179999991 shouldn't be valid because of the 1 that it has in the end.
The added difficulty, at least for me, is that I have to parse this in whatever position I can find it inside a string.
The string looks like this, literally:
43217999997 / 543217999999 // 43217999998 _ 43217999999a43216999999-43216999999 arandomword 432159999997
As you would be able to note, it has no standard way of separating the values (I marked in bold the values that would make it invalid, so I shouldn't match those)
My idea right now is something like this:
(\D+|^)(4321[5-9][7-9]{6})(\D+|$)
(\D+|^) meaning that I expect in that position the start of the string or at least one non-digit and (\D+|$) meaning that I expect there the end of the string or at least one non-digit.
That obviously doesn't do what I picture in my head.
I also tried to do it in two steps, first:
preg_match_all("/\D+4321[5-9][7-9]{6}\D+|4321[5-9][7-9]{6}\D+|4321[5-9][7-9]{6}$/", $input, $outputArray);
and then:
for($cont = 0; $cont < count($outputArray[0]); $cont++) {
    preg_match("/4321[5-9][7-9]{6}/", $outputArray[0][$cont], $outputArray2[]);
}
so I can print
echo "<pre>" . print_r($outputArray2, true) . "</pre>";
but that doesn't let me exclude the ones that have a number before the start of the value (5432157999999 for example), and then, I am not making any progress with my idea.
Thanks in advance for any help.
If you literally want to check that there is no digit before or after the match, you can use a negative lookahead and a negative lookbehind.
(?![0-9]) at the end means: "is not followed by 0-9"
(?<![0-9]) at the start means: "is not preceded by 0-9"
See this example https://regex101.com/r/6xbmJk/1
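Put together with the pattern from the question, the two assertions might be used like this. It is a sketch using the sample string from the question; the variable names are only for illustration.

```php
<?php
$input = '43217999997 / 543217999999 // 43217999998 _ 43217999999a43216999999-43216999999 arandomword 432159999997';

// (?<![0-9]) : the match is not preceded by a digit
// (?![0-9])  : the match is not followed by a digit
$pattern = '/(?<![0-9])4321[5-9][7-9]{6}(?![0-9])/';

// $matches[0] receives every full match found anywhere in the string.
preg_match_all($pattern, $input, $matches);
print_r($matches[0]);
```

The values 543217999999 (digit before) and 432159999997 (digit after) are skipped, while values delimited by letters, dashes or spaces are kept.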

PHP finding "words" within a string

I need to compare 2 lists of strings against each other and output the strings which contain the strings searched for. It should be very easy, I just can't figure it out.
To oversimplify, let's use arrays. I am actually accessing an API with SOAP and running it against my own list contained in a table, but let's use arrays. The comparison is what I'm having trouble with.
hit submit button on listsearch.php and it executes.
ARRAY Mylist : TED, DEAD, FIRST, LAST, PUPPY
ARRAY TheirList: teddybearnoose, hauntedhouse, hehasdeparted, deadmouse, walkingdead, thegratefuldead, firstkiss, thinkfirst, firsttobelast, firstmanonthemoon, firstreattempted, somecrap, something, notdisplayed, 50000otherwords, miscjunk
outputs as:
TEDdybearnoose
haunTEDhouse
hehasdeparTED
DEADmouse
walkingDEAD
thegratefulDEAD
FIRSTkiss
thinkFIRST
FIRSTtobeLAST <--- note
FIRSTmanonthemoon
FIRSTreattempTED <--- note
It only outputs strings which contain a string in my list, in any position. The CAPS are just to make the words stand out; they're not important.
Now, part 2:
Same "TheirList", except I type a keyword into a text area and select from a dropdown whether I want it at the beginning, at the end, or anywhere.
keywordsearch.php
search for: [ TED ] at: [beginning / end / anywhere] of string.
How would you make that one work?
Thanks in advance. This should be a breeze for most of you. I appreciate it. I'll try to answer questions promptly.
You can use strpos() to find the position of a substring.
It makes it very easy to check whether the substring occurred at the beginning or at the end of the string:
// String contains substring
strpos($string, $substring) !== false;
// String starts with substring
strpos($string, $substring) === 0;
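// String ends with substring
// (strrpos() finds the last occurrence, so an earlier occurrence of the substring does not get in the way)
strrpos($string, $substring) === strlen($string) - strlen($substring);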
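For the keyword search in part 2, those checks could be wrapped in one function. This is a rough sketch: the name filterByKeyword and the 'beginning' / 'end' / 'anywhere' values are assumptions taken from the dropdown described in the question, the keyword is assumed to be non-empty, and stripos()/strripos() are swapped in for strpos() so that "TED" matches "teddybearnoose".

```php
<?php
// Hypothetical helper: keep the strings from $list that contain $keyword
// at the requested position ('beginning', 'end' or 'anywhere').
// stripos()/strripos() make the comparison case-insensitive.
function filterByKeyword(array $list, string $keyword, string $where = 'anywhere'): array
{
    $result = [];
    foreach ($list as $string) {
        $first = stripos($string, $keyword);
        if ($first === false) {
            continue; // keyword does not occur at all
        }
        // Use the last occurrence for the "end" test, so that an earlier
        // occurrence of the keyword does not hide a match at the end.
        $last    = strripos($string, $keyword);
        $atStart = ($first === 0);
        $atEnd   = ($last === strlen($string) - strlen($keyword));
        if ($where === 'anywhere'
            || ($where === 'beginning' && $atStart)
            || ($where === 'end' && $atEnd)) {
            $result[] = $string;
        }
    }
    return $result;
}

$theirList = ['teddybearnoose', 'hauntedhouse', 'hehasdeparted', 'deadmouse', 'somecrap'];
$starts = filterByKeyword($theirList, 'TED', 'beginning');
$ends   = filterByKeyword($theirList, 'TED', 'end');
$any    = filterByKeyword($theirList, 'TED');
```

For part 1, the same function can be called once per entry of your own list with 'anywhere', merging the results.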

Get string between first and third occurrence of character

I have many Strings looking like this:
QR-DF-6549-1 and QR-DF-6549
I want to get these parts of the strings:
DF-6549
Edit:
So I also want to get rid of the "-1" at the end, in case it exists.
How can I do this with php? I know substr but I am a bit lost at this one.
Thank you very much!
A regular expression is probably the best way, given your sample data:
// ^ start matching at start of string
// [A-Z]{2}- must start with two capital letters and a dash
// ( we want to capture everything that follows
// [A-Z]{2}- next part must start with two capital letters and a dash
// \d+ a sequence of one or more digits
// ) end the capture - this will be index 1 in the $match array
if (preg_match('/^[A-Z]{2}-([A-Z]{2}-\d+)/', $str, $match)) {
    $data=$match[1];
}
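If the strings are always dash-separated but not always two capital letters plus digits, splitting on the dash is a possible alternative to the regex. A minimal sketch, assuming the wanted parts are always the second and third ones; the function name middleParts is made up for illustration.

```php
<?php
// Hypothetical helper: return the 2nd and 3rd dash-separated parts,
// or null if the string has fewer than three parts.
function middleParts(string $str): ?string
{
    $parts = explode('-', $str);
    if (count($parts) < 3) {
        return null;
    }
    // Anything after the third part (such as a trailing "-1") is ignored.
    return $parts[1] . '-' . $parts[2];
}

echo middleParts('QR-DF-6549-1'), "\n";
echo middleParts('QR-DF-6549'), "\n";
```

Unlike the regex, this does not validate what the parts contain, so it is only suitable when the input format is trusted.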

Delete multiple file for/while

I have a PHP pull-down from which I select an item and delete all files associated with it.
It works well if there are only 5 or 6. After I put in the first 4 to test and get it working, I realized it could take a very long time to enter a couple hundred, and it would bloat the script.
I don't know enough about for and while loops. Is there anyone who might have a way to help?
There will never be more than one set deleted at a time.
Thanks in advance.
<?php
$workitem = $_POST["workitem"];
$workdirPath = "/var/work.files/";
if($workitem == 'item1.php')
{
    unlink("$workdirPath/page1.php");
    unlink("$workdirPath/temp1.php");
    unlink("$workdirPath/all1.php");
}
if($workitem == 'item2.php')
{
    unlink("$workdirPath/page2.php");
    unlink("$workdirPath/temp2.php");
    unlink("$workdirPath/all2.php");
}
if($workitem == 'item3.php')
{
    unlink("$workdirPath/page3.php");
    unlink("$workdirPath/temp3.php");
    unlink("$workdirPath/all3.php");
}
if($workitem == 'item4.php')
{
    unlink("$workdirPath/page4.php");
    unlink("$workdirPath/temp4.php");
    unlink("$workdirPath/all4.php");
}
?>
Some simple pattern matching and substitution is all you need here.
First, the code:
if (preg_match('/^item(\d+)\.php$/', $workitem, $matches)) {
    $number = $matches[1];
    foreach(array('page','temp','all') as $base) {
        unlink("$workdirPath/$base$number.php");
    }
} else {
    # unrecognized work item value; complain to user or whatever
}
The preg_match function takes a pattern, a string, and an array. If the string matches the pattern, the parts that match are stored in the array. The particular type of pattern is a *p*erl5-compatible *reg*ular expression, which is where the preg_ part of the name comes from.
Regular expressions are scary-looking to the uninitiated, but they're a handy way to scan a string and get some values out of it. Most characters just represent themselves; the string "foo" matches the regular expression /foo/. But some characters have special meanings that let you make more general patterns to match a whole set of strings where you don't have to know ahead of time exactly what's in them.
The /s just mark the beginning and end of the actual regular expression; they're there because you can stick additional modifier flags after the closing delimiter, inside the same string as the expression itself.
The ^ and $ represent the beginning and end of the string. /foo/ matches "foo", but also "foobar", "bunnyfoofoo", and so on - any string that contains "foo" will match. But /^foo$/ matches only "foo" exactly.
\d means "any digit". + means "one or more of that last thing". So \d+ means "one or more digits".
The period (.) is special; it matches any character at all. Since we want a literal period, we have to escape it with a backslash; \. just matches a period.
So our regular expression is '/^item\d+\.php$/', which will match any itemnumber.php filename. But that's not quite enough. The preg_match function is basically a binary test: does the string match the pattern or not, yes or no? In this case, it's not enough to just say "yup, the string is valid"; we need to know which items specifically the user specified. That's what capture groups are for. We use parentheses to say "remember what matched this part", and provide an array name that gets filled with those remembrances.
The part of the string that matches the whole regular expression (which may not be the whole string, if the regular expression isn't anchored with ^...$ like this one is) is always put in element 0 of the array. If you use parentheses in the regular expression, then the part of the string that matches the part of the regular expression inside the first pair of parentheses is stored in element 1 of the array; if there's a second set of parentheses, the matching part of the string goes in element 2 of the array, and so on.
So we put parentheses around our number ((\d+)) and then the actual number will be remembered in element 1 of our $matches array.
Great, we have a number. Now we just need to use it to build up the filenames we want to delete.
In each case, we want to delete three files: page$n.php, temp$n.php, and all$n.php, where $n is the number we extracted above. We could just put three unlink calls, but since they're all so similar, we can use a loop instead.
Take the different prefixes that are the same no matter the number, and make an array out of them. Then loop over that array. In the body of the loop, the variable $base will contain whichever element of the array it's currently on. Stick that between the $workdirPath prefix and the $number we got from the match, append .php, and that's your file. unlink it and go back to the top of the loop to grab the next one.
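The whole walkthrough can also be packaged so that building the file names is separate from deleting them, which makes it easy to check before anything is removed. This is a sketch only: the function name filesForWorkItem is hypothetical, and the is_file() guard is an addition that is not in the code above.

```php
<?php
// Hypothetical helper: list the files belonging to a work item such as
// "item12.php". Returns an empty array for unrecognized values.
function filesForWorkItem(string $workitem, string $workdirPath): array
{
    // Capture group 1 holds the item number; anything else is rejected.
    if (!preg_match('/^item(\d+)\.php$/D', $workitem, $matches)) {
        return [];
    }
    $files = [];
    foreach (['page', 'temp', 'all'] as $base) {
        // rtrim() avoids a double slash when the directory ends in "/".
        $files[] = rtrim($workdirPath, '/') . '/' . $base . $matches[1] . '.php';
    }
    return $files;
}

foreach (filesForWorkItem('item2.php', '/var/work.files/') as $file) {
    if (is_file($file)) { // only delete files that actually exist
        unlink($file);
    }
}
```

Because the pattern is anchored and only digits are captured, a posted value such as "../something.php" yields no files at all.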

Filter array of numeric PIN code strings which may be in the format "######" or "### ###"

I have a PHP array of strings. The strings are supposed to represent PIN codes which are of 6 digits like:
560095
Having a space after the first 3 digits is also considered valid e.g. 560 095.
Not all array elements are valid. I want to filter out all invalid PIN codes.
Yes, you can make use of regex for this.
PHP has a function called preg_grep to which you pass your regular expression and it returns a new array with entries from the input array that match the pattern.
$new_array = preg_grep('/^\d{3} ?\d{3}$/',$array);
Explanation of the regex:
^ - Start anchor
\d{3} - 3 digits. Same as [0-9][0-9][0-9]
 ? - optional space (there is a space before the ?).
     If you want to allow any number of any whitespace between the groups, you can use \s* instead
\d{3} - 3 digits
$ - End anchor
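A short run of preg_grep on some sample data, as a sketch. The sample values are made up, and the D modifier is an addition to the pattern above: without it, $ also matches just before a trailing newline, so "560095\n" would be accepted.

```php
<?php
$array = ['560095', '560 095', '56009', '560  095', 'abc123', "560095\n"];

// The D modifier makes $ match only at the very end of the string.
$valid = preg_grep('/^\d{3} ?\d{3}$/D', $array);

// preg_grep() preserves the original keys; array_values() reindexes them.
$valid = array_values($valid);
print_r($valid);
```

Only the first two entries survive: the others are too short, contain two spaces, contain letters, or end in a newline.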
Yes, you can use a regular expression to make sure there are 6 digits with or without a space.
A neat tool for playing with regular expressions is RegExr. Here's the regex I came up with:
^[0-9]{3}\s?[0-9]{3}$
It matches the beginning of the string ^, then any three numbers [0-9]{3} followed by an optional space \s? followed by another three numbers [0-9]{3}, followed by the end of the string $.
Passing the array into the PHP function preg_grep along with the regex will return a new array containing only the matching elements.
If you just want to iterate over the valid responses (loop over them), you could always use a RegexIterator:
$regex = '/^\d{3}\s?\d{3}$/';
$it = new RegexIterator(new ArrayIterator($array), $regex);
foreach ($it as $valid) {
//Only matching items will be looped over, non-matching will be skipped
}
It has the benefit of not copying the entire array (it computes the next one when you want it). So it's much more memory efficient than doing something with preg_grep for large arrays. But it also will be slower if you iterate multiple times (but for a single iteration it should be faster due to the memory usage).
If you want to get an array of the valid PIN codes, use the preg_grep answer above.
You could also, at the same time as filtering only valid PINs, remove the optional space character so that all PINs become 6 digits by using preg_filter:
$new_array = preg_filter('/^(\d{3}) ?(\d{3})$/D', '$1$2', $array);
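A quick sketch of what that call returns, using made-up sample values. preg_filter only returns the subjects that matched, with the replacement applied.

```php
<?php
$array = ['560095', '560 095', '56009', 'abc123'];

// $1 and $2 are the two captured digit groups; the optional space
// between them is simply left out of the replacement.
$normalized = preg_filter('/^(\d{3}) ?(\d{3})$/D', '$1$2', $array);
print_r($normalized);
```

Both valid entries come out as the same 6-digit string, and the invalid ones are dropped. As with preg_grep, the original keys are preserved.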
The best answer might depend on your situation, but if you wanted to do a simple and low cost check first...
$item = str_replace( " ", "", $var );
if ( strlen( $item ) !== 6 ){
echo 'fail early';
}
Following that, you could equally go on and do some type checking, as long as valid numbers do not start with a 0, in which case it might be more difficult.
If you don't fail early, then go on with the regex solutions already posted.
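One way to combine the length check with a digit check that is not affected by leading zeros is ctype_digit(), which works on the string itself rather than on a number. This is a sketch; the function name looksLikePin is made up, and it is deliberately looser than the regex solutions.

```php
<?php
// Hypothetical helper: cheap pre-check without a regex. It strips every
// space, so it is looser than the regex ("5 60095" passes here).
function looksLikePin(string $var): bool
{
    $item = str_replace(' ', '', $var);
    // ctype_digit() checks characters, so "012345" keeps its leading zero.
    return strlen($item) === 6 && ctype_digit($item);
}

var_dump(looksLikePin('560 095'));
var_dump(looksLikePin('abc123'));
```

Anything that passes this check can then be run through one of the stricter patterns if the exact position of the space matters.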
