php regular expression to detect index level

I had a look at regular expressions in PHP, but I don't really understand how they work.
I have various strings like "1-title", "1-1-secondTitle", "1-2-otherTitle". This goes up to three levels ("1-2-1-text"). Every string has this formatting, and I would like to check whether there are one, two or three numbers before the text starts and then output "0", "1" or "2".
To make it clearer:
"1-index" should return "0"
"3-1-text" should return "1"
"5-2-1-otherTitle" should return "2"
Is it possible to count the characters before the first letter of a string?

Alternatively, if you don't want to use regex, you could just iterate through the string and overwrite the index each time you find a digit. Consider this example:
$string = '5-2-1-otherTitle';
$string = str_ireplace('-', '', $string);
$index = null;
for($x = 0; $x < strlen($string); $x++) {
if(is_numeric($string[$x])) {
$index = $x;
}
}
echo $index; // outputs 2
// works
// echo preg_match_all("/[0-9]/", $string) - 1;
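
Since the question asks for a regular expression, here is a minimal sketch of that route. It assumes the level is the number of leading "digits plus hyphen" groups minus one; the function name indexLevel is made up for illustration.

```php
<?php
// Hypothetical helper: returns 0 for "1-x", 1 for "1-2-x", 2 for "1-2-3-x",
// and null when the string has no numeric prefix at all.
function indexLevel(string $s): ?int
{
    // ^(?:\d+-)+ matches one or more "digits followed by a hyphen" groups at the start.
    if (!preg_match('/^(?:\d+-)+/', $s, $m)) {
        return null;
    }
    // Each group ends in exactly one hyphen, so counting hyphens counts groups.
    return substr_count($m[0], '-') - 1;
}

echo indexLevel('1-index'), "\n";          // 0
echo indexLevel('3-1-text'), "\n";         // 1
echo indexLevel('5-2-1-otherTitle'), "\n"; // 2
```

Unlike the character loop, this also copes with multi-digit numbers such as "12-3-title" and ignores digits that appear inside the title itself.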

Related

Is it possible to cut a word in parts? [PHP]

I would like to ask if it is possible to cut a word like
"Keyboard" in multiple strings, in PHP?
I want the string to be cut whenever a / is in it.
Example:
String: "Key/boa/rd"
Now I want the result of the cut to look like this:
String1: "Key"
String2: "boa"
String3: "rd"

You can use the PHP explode function. So, if your string was "Key/boa/rd", you would do:
explode('/', 'Key/boa/rd');
and get:
[
"Key",
"boa",
"rd",
]
It's unclear from your question, but if you don't want an array (and instead would like variables) you can use array destructuring like so:
[$firstPart, $secondPart, $thirdPart] = explode('/', 'Key/boa/rd');
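However, if the string contained only one /, the array would have just two elements: PHP would then report an undefined index/key (a notice or warning, depending on the version) and set $thirdPart to null, rather than throw an exception.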
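
If the number of parts is not known in advance, one way to keep the destructuring safe is to pad the array first. A small sketch, assuming exactly three variables are wanted:

```php
<?php
// array_pad() fills the missing slots with null so that destructuring never
// touches an undefined index; the third argument to explode() caps the number of parts.
[$firstPart, $secondPart, $thirdPart] = array_pad(explode('/', 'Key/boa', 3), 3, null);

var_dump($firstPart, $secondPart, $thirdPart);
// string(3) "Key", string(3) "boa", NULL
```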

The answer by Nathaniel assumes that your original string contains / characters. It is possible that you only used those in your example and you want to split the string into equal-length substrings. The function for that is str_split and it looks like:
$substrings = str_split($original, 3);
That will split the string $original into an array of strings, each of length 3 (except the very last one if it doesn't divide equally).
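
As a quick illustration, using the word from the question as sample input:

```php
<?php
$original = 'Keyboard';                 // sample input, taken from the question
$substrings = str_split($original, 3);  // chunks of 3 characters; the last one may be shorter

print_r($substrings); // Key, boa, rd
```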

You can travel through the line character by character, checking for your delimiter.
<?php
$str = "Key/boa/rd";
$i = $j = 0;
$result = []; // initialise so var_export() also works for an empty string
while(true)
{
if(isset($str[$i])) {
$char = $str[$i++];
} else {
break;
}
if($char === '/') {
$j++;
} else {
if(!isset($result[$j])) {
$result[$j] = $char;
} else {
$result[$j] .= $char;
}
}
}
var_export($result);
Output:
array (
0 => 'Key',
1 => 'boa',
2 => 'rd',
)
However, explode, preg_split or strtok are probably the go-to PHP functions when you want to split strings.
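
For comparison, a short sketch of the other two functions mentioned, applied to the same input (the variable names are illustrative):

```php
<?php
$str = "Key/boa/rd";

// preg_split(): split on a pattern; PREG_SPLIT_NO_EMPTY drops empty pieces
// (for example from a doubled or trailing slash).
$parts = preg_split('~/~', $str, -1, PREG_SPLIT_NO_EMPTY);

// strtok(): pull one token at a time; it returns false when nothing is left.
$tokens = [];
for ($tok = strtok($str, '/'); $tok !== false; $tok = strtok('/')) {
    $tokens[] = $tok;
}

var_export($parts);
var_export($tokens);
```

Both arrays should hold the same three pieces as the explode() result.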

how to get last integer in a string?

I have variables like this:
$path1 = "<span class='span1' data-id=2>lorem ipsum</span>";
$path2 = "<span class='span2' data-id=14>lorem ipsum</span>";
I need to get the value of data-id, but I think it's not possible in PHP.
Maybe it is possible to get the last integer in a string?
Something like:
$a = $path1.lastinteger(); // 2
$b = $path2.lastinteger(); // 14
Any help?

You could use a simple regex:
/data-id=["']?(\d+)["']?/
Then you can build this function. The full code:
function lastinteger($item) {
    // The quotes are optional because the sample markup leaves data-id unquoted;
    // \d+ (instead of [1-9]+) also accepts values that contain a 0.
    preg_match_all('/data-id=["\']?(\d+)["\']?/', $item, $array);
    $out = end($array); // matches of the capture group
    return $out[0] ?? null; // null when there is no data-id
}
$path1 = "<span class='span1' data-id=2>lorem ipsum</span>";
$path2 = "<span class='span2' data-id=14>lorem ipsum</span>";
$a = lastinteger($path1); //2
$b = lastinteger($path2); //14
References:
preg_match_all()
end()
Tutorial for regex: tutorialspoint.com
Good tool to create regex: regexr.com

If you'd rather not use a regular expression you can use the DOM API:
$dom = new DOMDocument();
$dom->loadHTML($path2); // loadHTML() is an instance method, so create the document first
echo $dom->getElementsByTagName('span')->item(0)->getAttribute('data-id');

Depending on how those variables get set, one solution could be to get the integer first and then inject it into the paths:
$dataId = 2;
$path1 = "<span class='span1' data-id='$dataId'>lorem ipsum</span>";
(Note that I added quotes around the data-id value. HTML5 does allow unquoted values in simple cases like this one, but quoting is safer and makes the markup easier to parse.)
If you aren't building the strings yourself, but need to parse them, I would not simply get the last integer. The reason is that you might get a tag like this:
<span class='span1' data-id='4'>Happy 25th birthday!</span>
In that case, the last integer in the string is 25, which isn't what you want. Instead, I would use a regular expression to capture the value of the data-id attribute:
$path1 = "<span class='span1' data-id='2'>lorem ipsum</span>";
preg_match('/data-id=\'(\d+)\'/', $path1, $matches);
$dataId = $matches[1];
echo($dataId); // => 2

I think that you should use regex, for example:
<?php
$paths = [];
$paths[] = '<span class=\'span1\' data-id=2>lorem ipsum dolor</span>';
$paths[] = '<span class=\'span2\' data-id=14>lorem ipsum</span>';
foreach($paths as $path){
preg_match('/data-id=([0-9]+)/', $path, $data_id);
echo 'data-id for '.$path.' is '.$data_id[1].'<br />';
}
This will output:
data-id for lorem ipsum dolor is 2
data-id for lorem ipsum is 14

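function find($path) {
    $countPath = strlen($path);
    for ($i = 0; $i < $countPath; $i++) {
        if (substr($path, $i, 3) == "-id") {
            // skip "-id=" and print every consecutive digit, not just the first one
            $j = $i + 4;
            while ($j < $countPath && ctype_digit($path[$j])) {
                echo $path[$j++];
            }
        }
    }
}
find($path1);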

Wrote an appropriate function for this purpose.
function LastNumber($text){
    $strlength = strlen($text); // get length of the string
    $lastNumber = ''; // this variable will accumulate digits
    $numberFound = 0;
    for($i = $strlength - 1; $i >= 0; $i--){ // the loop reads the string from end to start (>= 0 so that the first character is included)
        if(ctype_digit($text[$i])){
            $lastNumber = $lastNumber . $text[$i]; // if a digit is found, it is appended to the last-number string
            $numberFound = 1; // and numberFound is set to 1
        }else if($numberFound){ // if at least one digit was found and we now hit a non-digit, the last number is complete, so we break out of the loop
            break;
        }
    }
    return intval(strrev($lastNumber), 10); // strrev reverses the collected digits, because we read the original string from end to start; intval then converts the string to an integer in base 10
}

I know this has been answered, but if you really want to get the last int in a line without referring specifically to attributes or tag names, consider using this.
$str = "
asfdl;kjas;lkjfasl;kfjas;lf 999 asdflkasjfdl;askjf
<span class='span1' data-id=2>lorem ipsum</span>
<span class='span2' data-id=14>lorem ipsum</span>
Look at me I end with a number 1234
";
$matches = NULL;
$num = preg_match_all('/(.*)(?<!\d)(\d+)[^0-9]*$/m', $str, $matches);
if (!$num) {
echo "no matches found\n";
} else {
var_dump($matches[2]);
}
It will return an array from a multiline input of the last integers for each line:
array(4) {
[0] =>
string(3) "999"
[1] =>
string(1) "2"
[2] =>
string(2) "14"
[3] =>
string(4) "1234"
}

Cut string in PHP at nth-from-end occurrence of character

I have a string which can be written in a number of different ways, it will always follow the same pattern but the length of it can differ.
this/is/the/path/to/my/fileA.php
this/could/also/be/the/path/to/my/fileB.php
another/example/of/a/long/address/which/is/a/path/to/my/fileC.php
What I am trying to do is cut the string so that I am left with
path/to/my/file.php
I have some code which I got from this page and modified it to the following
$max = strlen($full_path);
$n = 0;
for($j=0;$j<$max;$j++){
if($full_path[$j]=='/'){
$n++;
if($n>=3){
break 1;
}
}
}
$path = substr($full_path,$j+1,$max);
Which basically cuts it at the 3rd instance of the '/' character, and gives me what is left. This was fine when I was working in one environment, but when I migrated it to a different server, the path would be longer, and so the cut would give me too long an address. I thought that rather than changing the hard coded integer value for each instance, it would work better if I had it cut the string at the 4th from last instance, as I always want to keep the last 4 'slashes' of information
Many thanks
EDIT - final code solution
$exploded_name = explode('/', $full_path);
$exploded_trimmed = array_slice($exploded_name, -4);
$imploded_name = implode('/', $exploded_trimmed);

Just use explode on your string. If the pattern is always the same, take the last elements of the resulting array and your work is done.
$pizza = "piece1/piece2/piece3/piece4/piece5/piece6";
$pieces = explode("/", $pizza);
echo $pieces[0]; // piece1
echo $pieces[1]; // piece2
Then reverse the array, take its first four elements, reverse them back, and combine them using implode
to get the desired string.
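
The reverse / take four / implode idea can be sketched as follows. The sample path comes from the question; the variable names are illustrative. Note the second array_reverse(), which restores the original order before joining.

```php
<?php
$full_path = 'this/could/also/be/the/path/to/my/fileB.php';

$pieces   = explode('/', $full_path);
$reversed = array_reverse($pieces);                 // last segment first
$lastFour = array_slice($reversed, 0, 4);           // fileB.php, my, to, path
$path     = implode('/', array_reverse($lastFour)); // back in the original order

echo $path; // path/to/my/fileB.php
```

array_slice($pieces, -4) selects the same four elements without either reversal.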

The function below works like a substr that starts after the nth occurrence of a delimiter:
function substr_after_nth($str, $needle, $key)
{
$array = explode($needle, $str);
$temp = array();
for ($i = $key; $i < count($array); $i++) {
$temp[] = $array[$i];
}
return implode($needle, $temp);
}
Example:
$str = "hello-world-how-are-you-doing";
To get everything after the 4th occurrence of "-" (that is, "you-doing"), call the function above as
echo substr_after_nth($str, "-", 4);
which prints
you-doing

PHP: Check string for certain words

How can I check if data submitted from a form or querystring has certain words in it?
I'm trying to look for words containing admin, drop, create etc in form [Post] data and querystring data so I can accept or reject it.
I'm converting from ASP to PHP. I used to do this using an array in ASP (keep all illegal words in a string and use ubound to check the whole string for those words), but is there a better (efficient) way to do this in PHP?
Eg: A string like this would be rejected: "The administrator dropped a blah blah" because it has admin and drop in it.
I intend using this to check usernames when creating accounts and for other things too.
Thanks

You could use stripos()
int stripos ( string $haystack , string $needle [, int $offset = 0 ] )
You could have a function like:
function checkBadWords($str, $badwords) {
foreach ($badwords as $word) {
if (stripos(" $str ", " $word ") !== false) {
return false;
}
}
return true;
}
And to use it:
if (!checkBadWords('something admin', array('admin'))) {
// ...
}

strpos() will let you search for a substring within a larger string. It's quick and works well. It returns false if the string's not found, and a number (which could be zero, so you need to use === to check) if it finds the string.
stripos() is a case-insensitive version of the same.
I'm trying to look for words containing admin, drop, create etc in form [Post] data and querystring data so I can accept or reject it.
I suspect that you are trying to filter the string so that it's suitable for including in something like a database query. If this is the case, this is probably not a good way to go about it; you'd actually need to escape the string using mysql_real_escape_string() or an equivalent (the mysql_* functions were removed in PHP 7, so today that usually means prepared statements with PDO or mysqli).
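
The === false point is easy to get wrong, so here is a small sketch of a substring check that handles a match at position 0 correctly. The function name containsAny and the word list are illustrative, not from the question.

```php
<?php
// Returns true when any of $needles occurs anywhere in $haystack (case-insensitive).
function containsAny(string $haystack, array $needles): bool
{
    foreach ($needles as $needle) {
        // stripos() returns 0 for a match at the very start, and 0 == false,
        // so the comparison has to be strict.
        if (stripos($haystack, $needle) !== false) {
            return true;
        }
    }
    return false;
}

var_dump(containsAny('Administrator dropped a table', ['admin', 'drop'])); // bool(true)
var_dump(containsAny('Nothing to see here', ['admin', 'drop']));           // bool(false)
```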

$badwords = array("admin", "drop",);
foreach (str_word_count($string, 1) as $word) {
foreach ($badwords as $bw) {
if (strpos($word, $bw) === 0) {
//contains word $word that starts with bad word $bw
}
}
}
For JGB146, here is a performance comparison with regular expressions:
<?php
function has_bad_words($badwords, $string) {
foreach (str_word_count($string, 1) as $word) {
foreach ($badwords as $bw) {
if (stripos($word, $bw) === 0) {
return true;
}
}
}
return false; // only after every word has been checked
}
function has_bad_words2($badwords, $string) {
$regex = array_map(function ($w) {
return "(?:\\b". preg_quote($w, "/") . ")"; }, $badwords);
$regex = "/" . implode("|", $regex) . "/";
return preg_match($regex, $string) != 0;
}
$badwords = array("abc", "def", "ghi", "jkl", "mnop");
$string = "The quick brown fox jumps over the lazy dog";
$start = microtime(true);
for ($i = 0; $i < 10000; $i++) {
has_bad_words($badwords, $string);
}
echo "elapsed: ". (microtime(true) - $start);
$start = microtime(true);
for ($i = 0; $i < 10000; $i++) {
has_bad_words2($badwords, $string);
}
echo "elapsed: ". (microtime(true) - $start);
Example output:
elapsed: 0.076514959335327
elapsed: 0.29999899864197
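So regular expressions are slower in this test. Keep in mind that has_bad_words, as benchmarked, returned after inspecting only the first word of the string, so the loop timing above is lower than it would be for a full scan.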

You could use regular expression like this:
preg_match("~(admin)|(drop)|(another token)|(yet another)~",$subject);
Building the pattern string from an array:
$pattern = implode(")|(", $banned_words);
$pattern = "~(".$pattern.")~";
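
If the banned words can contain characters that are special in a regex (such as . or +), they need escaping before being joined. A hedged sketch of that, assuming a case-insensitive substring match is wanted; the helper name buildBannedPattern is hypothetical:

```php
<?php
function buildBannedPattern(array $bannedWords): string
{
    // preg_quote() escapes regex metacharacters; its second argument also escapes the delimiter.
    $escaped = array_map(function ($w) {
        return preg_quote($w, '~');
    }, $bannedWords);

    // (?:...) groups the alternatives without capturing; the i flag ignores case.
    return '~(?:' . implode('|', $escaped) . ')~i';
}

$pattern = buildBannedPattern(['admin', 'drop', 'c++']);
var_dump(preg_match($pattern, 'The Administrator dropped a table')); // int(1)
var_dump(preg_match($pattern, 'harmless text'));                     // int(0)
```

The sketch assumes a non-empty word list; with an empty array the resulting pattern would match every string.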

function check($string, $array) {
foreach($array as $item) {
if( preg_match("/($item)/", $string) )
return true;
}
return false;
}

You can certainly do a loop, as others have suggested. But I think you can get closer to the behavior you're looking for with an operation that directly uses arrays, plus it allows execution via a single if statement.
Originally, I was thinking you could do this with a simple preg_match() call (hence the downvote), however preg_match does not support arrays. Instead, you can do a replacement via preg_replace to have all rejected strings replaced with nothing, and then check to see if the string is changed. This is simple and avoids requiring a loop iteration for each rejected string.
$rejectedStrs = array("/admin/", "/drop/", "/create/");
if($input == preg_replace($rejectedStrs, "", $input)) {
//do stuff
} else {
//reject
}
Note also that you can provide case-insensitive searches by using the i flag on the regex patterns, changing the array of patterns to $rejectedStrs = array("/admin/i", "/drop/i", "/create/i");
On Efficiency
There has been some debate about the efficiency of doing it this way vs the accepted nested loop method. I ran some tests and found the preg_replace method executed around twice as fast as the nested loop. Here is the code and output of those tests:
$input = "You can certainly do a loop, as others have suggested. But I think you can get closer to the behavior you're looking for with an operation that directly uses arrays, plus it allows execution via a single if statement. You can certainly do a loop, as others have suggested. But I think you can get closer to the behavior you're looking for with an operation that directly uses arrays, plus it allows execution via a single if statement.";
$input = "Short string with no matches";
$input2 = "Longer string with a lot more words but still no matches. Longer string with a lot more words but still no matches. Longer string with a lot more words but still no matches. Longer string with a lot more words but still no matches. Longer string with a lot more words but still no matches. Longer string with a lot more words but still no matches. Longer string with a lot more words but still no matches. ";
$input3 = "Short string which loop will match quickly";
$input4 = "Longer string that will eventually be matches but first has a lot of words, followed by more words and then more words, followed by more words and then more words, followed by more words and then more words, followed by more words and then more words, followed by more words and then more words, followed by more words and then more words, followed by more words and then more words, followed by more words and then more words and then finally the word create near the end";
$start1 = microtime(true);
$rejectedStrs = array("/loop/", "/operation/", "/create/");
$p_matches = 0;
for ($i = 0; $i < 10000; $i++) {
if (preg_check($rejectedStrs, $input)) $p_matches++;
if (preg_check($rejectedStrs, $input2)) $p_matches++;
if (preg_check($rejectedStrs, $input3)) $p_matches++;
if (preg_check($rejectedStrs, $input4)) $p_matches++;
}
$start2 = microtime(true);
$rejectedStrs = array("loop", "operation", "create");
$l_matches = 0;
for ($i = 0; $i < 10000; $i++) {
if (loop_check($rejectedStrs, $input)) $l_matches++;
if (loop_check($rejectedStrs, $input2)) $l_matches++;
if (loop_check($rejectedStrs, $input3)) $l_matches++;
if (loop_check($rejectedStrs, $input4)) $l_matches++;
}
$end = microtime(true);
echo "preg_match: ".$start1." ".$start2."= ".($start2-$start1)."\nloop_match: ".$start2." ".$end."=".($end-$start2);
function preg_check($rejectedStrs, $input) {
if($input == preg_replace($rejectedStrs, "", $input))
return true;
return false;
}
function loop_check($badwords, $string) {
foreach (str_word_count($string, 1) as $word) {
foreach ($badwords as $bw) {
if (stripos($word, $bw) === 0) {
return true;
}
}
}
return false; // only after every word has been checked
}
Output:
preg_match: 1281908071.4032 1281908071.9947= 0.5915060043335
loop_match: 1281908071.9947 1281908073.006=1.0112948417664

This is actually pretty simple, use substr_count.
An example for you would be:
if (substr_count($variable_to_search, "drop"))
{
echo "error";
}
And to make things even simpler, put your keywords (ie. "drop", "create", "alter") in an array and use foreach to check them. That way you cover all your words. An example
foreach ($keywordArray as $keyword)
{
if (substr_count($variable_to_search, $keyword))
{
echo "error"; //or do whatever you want to do when you find something you don't like
}
}

How to find first non-repetitive character from a string?

I've spent half day trying to figure out this and finally I got working solution.
However, I feel like this can be done in simpler way.
I think this code is not really readable.
Problem: Find first non-repetitive character from a string.
$string = "abbcabz";
In this case, the function should output "c".
The reason I use concatenation instead of $input[index_to_remove] = '' to remove a character from the string
is that the assignment actually just leaves an empty cell, so my return value $input[0] does not return the character I want.
For instance,
$str = "abc";
$str[0] = '';
echo $str;
This will output "bc"
But actually if I test,
var_dump($str);
it will give me:
string(3) "bc"
Here is my intention:
Given: input
while first char exists in substring of input {
get index_to_remove
input = chars left of index_to_remove . chars right of index_to_remove
if dupe of first char is not found from substring
remove first char from input
}
return first char of input
Code:
function find_first_non_repetitive2($input) {
while(strpos(substr($input, 1), $input[0]) !== false) {
$index_to_remove = strpos(substr($input,1), $input[0]) + 1;
$input = substr($input, 0, $index_to_remove) . substr($input, $index_to_remove + 1);
if(strpos(substr($input, 1), $input[0]) === false) { // strict check: position 0 is a valid match
$input = substr($input, 1);
}
}
return $input[0];
}

<?php
// In an array mapped character to frequency,
// find the first character with frequency 1.
echo array_search(1, array_count_values(str_split('abbcabz')));
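
The one-liner can be wrapped in a function that also covers the case where no character is unique, since array_search() returns false then. A sketch, with firstUniqueChar as an illustrative name; it works on bytes, so it assumes single-byte characters:

```php
<?php
function firstUniqueChar(string $s): ?string
{
    if ($s === '') {
        return null; // nothing to search
    }
    // array_count_values() keeps keys in order of first appearance,
    // so the first key with a count of 1 is the first non-repeated character.
    $counts = array_count_values(str_split($s));
    $key = array_search(1, $counts, true);

    // Digit characters become integer keys, hence the cast back to string.
    return $key === false ? null : (string) $key;
}

var_dump(firstUniqueChar('abbcabz')); // string(1) "c"
var_dump(firstUniqueChar('aabb'));    // NULL
```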

Python:
def first_non_repeating(s):
    for c in s:
        # count() looks at the whole string, so a character that was
        # already seen earlier is not mistaken for a unique one
        if s.count(c) == 1:
            return c
    return None
Same in PHP:
function find_first_non_repetitive($s)
{
    for ($i = 0; $i < strlen($s); $i++) {
        if (substr_count($s, $s[$i]) === 1)
            return $s[$i];
    }
    return null;
}

Pseudocode:
Array N;
For each letter in string
if letter not exists in array N
Add letter to array and set its count to 1
else
go to its position in array and increment its count
End for
for each position in array N
if value at position == 1
return the letter at position and exit for loop
else
//do nothing (for clarity)
end for
Basically, you find all distinct letters in the string, and you associate each letter with a count of how many times it occurs in the string. Then you return the first one that has a count of 1.
The complexity of this method is O(n^2) in the worst case if using arrays. You can use an associative array to improve its performance.

1- Use a sorting algorithm like mergesort (or quicksort, which has better performance with small inputs).
2- Then check for repeated characters:
non-repeated characters will stand alone
repeated ones will follow each other
Performance: sort + compare
Performance: O(n log n) + O(n) = O(n log n)
For example:
$string = "abbcabz";
$string = mergesort($string);
// $string = "aabbbcz"
Then take the first char from the string and compare it with the next one; if they match, it is repeated.
Move to the next different character and compare again.
The first character with no matching neighbour is non-repeated. Note that sorting discards the original order, so to get the first non-repeated character of the original string you still have to look up which of these candidates appears earliest in it.

This can be done in much more readable code using some standard PHP functions:
// Count number of occurrences for every character
$counts = count_chars($string);
// Keep only unique ones (a closure replaces create_function(), which was removed in PHP 8)
$counts = array_filter($counts, function ($n) { return $n == 1; });
// Convert to a list, then to a string containing every unique character
$chars = array_map('chr', array_keys($counts));
$chars = implode($chars);
// Get a string starting from any of the characters found
// This "strpbrk" is probably the most cryptic part of this code
$substring = strlen($chars) ? strpbrk($string, $chars) : '';
// Get the first character from the new string
$char = strlen($substring) ? $substring[0] : '';
// PROFIT!
echo $char;

$str="abbcade";
$checked= array(); // we will store all checked characters in this array, so we do not have to check them again
for($i=0; $i<strlen($str); $i++)
{
$c=0;
if(in_array($str[$i],$checked)) continue;
$checked[]=$str[$i];
for($j=$i+1;$j<strlen($str);$j++)
{
if($str[$i]==$str[$j])
{
$c=1;
break;
}
}
if($c!=1)
{
echo "First non-repetitive char is:".$str[$i];
break;
}
}

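This should replace your code...
$array = str_split($string);
$array = array_count_values($array);
// array_filter() passes only the value to the callback by default
$array = array_filter($array, function ($val) { return $val == 1; });
// key() returns the first remaining key, i.e. the first non-repeated character
$first_non_repeated_letter = key($array);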
Edit: spoke too soon. Took out 'array_unique', thought it actually dropped duplicate values. But character order should be preserved to be able to find the first character.

Here's a function in Scala that would do it:
def firstUnique(chars:List[Char]):Option[Char] = chars match {
case Nil => None
case head::tail => {
val filtered = tail filter (_!=head)
if (tail.length == filtered.length) Some(head) else firstUnique(filtered)
}
}
scala> firstUnique("abbcabz".toList)
res5: Option[Char] = Some(c)
And here's the equivalent in Haskell:
firstUnique :: [Char] -> Maybe Char
firstUnique [] = Nothing
firstUnique (head:tail) = let filtered = (filter (/= head) tail) in
if (tail == filtered) then (Just head) else (firstUnique filtered)
*Main> firstUnique "abbcabz"
Just 'c'
You can solve this more generally by abstracting over lists of things that can be compared for equality:
firstUnique :: Eq a => [a] -> Maybe a
Strings are just one such list.

This can also be done using array_key_exists while building an associative array from the string. Each character becomes a key, and its value is the number of times it occurs.
$sample = "abbcabz";
$check = [];
for($i=0; $i<strlen($sample); $i++)
{
if(!array_key_exists($sample[$i], $check))
{
$check[$sample[$i]] = 1;
}
else
{
$check[$sample[$i]] += 1;
}
}
echo array_search(1, $check);
