splitting long regex to pieces PHP


I have a very long list of names, and I am using preg_replace to check whether any name from the list appears anywhere in a string. With only a few names in the regex it works fine, but with over 5000 names it fails with the error "preg_replace(): Compilation failed: regular expression is too large".
I cannot figure out how to split the regex into smaller pieces (if that is even possible).
The list of names is created dynamically from a database. Here is my code:
$query_gdpr_names = "select name FROM gdpr_names";
$result_gdpr_names = mysqli_query($connect, $query_gdpr_names);
while ($row_gdpr_names = mysqli_fetch_assoc($result_gdpr_names))
{
$AllNames .= '"/'.$row_gdpr_names['name'].'\b/ui",';
}
$AllNames = rtrim($AllNames, ',');
$AllNames = "[$AllNames]";
$search = preg_replace($AllNames, '****', $search);
The generated $AllNames string looks like this (only 3 names in this example):
$AllNames = ["/Lola\b/ui", "/Monica\b/ui", "/Chris\b/ui"];
And the test string:
$search = "I am Lola and my friend name is Chris";
Any help is much appreciated.

Since it appears that you can't easily handle the replacement from PHP using a single regex alternation, one alternative would be to just iterate each name in the result set one by one and make a replacement:
while ($row_gdpr_names = mysqli_fetch_assoc($result_gdpr_names)) {
$name = $row_gdpr_names['name'];
$regex = "/\b" . $name . "\b/ui";
$search = preg_replace($regex, '----', $search);
}
$search = preg_replace("/----/", '****', $search);
This is not the most efficient way to do it. Perhaps you can limit your result set so that a single alternation does not become too long.

OK, I did a lot of debugging, even isolating this part of the code from everything else:
$search = "Lola and Chris";
$query_gdpr_names = "select * FROM gdpr_names";
$result_gdpr_names = mysqli_query($connect, $query_gdpr_names);
while ($row_gdpr_names = mysqli_fetch_assoc($result_gdpr_names)) {
$name = $row_gdpr_names['name'];
$regex = "/\b" . $name . "\b/ui";
$search = preg_replace($regex, '****', $search);
}
echo $search;
Still, echo prints the string inside the loop but nothing after the loop.

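The problem was actually in the database records: one of them contained a slash. Because / is the pattern delimiter here, an unescaped slash inside a name ends the pattern early, so preg_replace() fails and returns NULL, which leaves $search empty after the loop.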
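Quoting each name with preg_quote() guards against records containing a slash or other regex metacharacters, and array_chunk() is one way to keep each compiled pattern below the size limit. A minimal sketch, assuming the names have already been fetched into an array; the helper name maskNames, the chunk size of 500, and the sample record "A/B" are illustrative and not from the original posts:

```php
<?php
// Hypothetical helper: masks every listed name that occurs in $text.
function maskNames(string $text, array $names, int $chunkSize = 500): string
{
    // Skip empty records, which would otherwise match at every word boundary.
    $names = array_filter($names, function ($n) { return $n !== ''; });

    // Escape regex metacharacters and the "/" delimiter in every name.
    $quoted = array_map(function ($n) { return preg_quote($n, '/'); }, $names);

    // One alternation per chunk keeps each compiled pattern small.
    foreach (array_chunk($quoted, $chunkSize) as $chunk) {
        $pattern = '/\b(?:' . implode('|', $chunk) . ')\b/ui';
        $result = preg_replace($pattern, '****', $text);
        // preg_replace() returns null on error; keep the previous text then.
        if ($result !== null) {
            $text = $result;
        }
    }
    return $text;
}

// "A/B" stands in for a record that contains a slash.
echo maskNames("I am Lola and my friend name is Chris", ["Lola", "Monica", "Chris", "A/B"]);
// prints "I am **** and my friend name is ****"
```

The chunk size is a guess; if the "too large" error still appears, lowering it should help.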

Related

How to Split a name and use a function to display a set result

Using PHP, I have to create a function that splits a name and then displays the first character of the first name followed by the whole last name. An example is below:
what it looks like before
what it looks like after
The only thing is, I have no idea how. The only hint I was given was to research the PHP strpos() function. Any help would be appreciated.

You can use explode() to separate the names, like this:
$names = explode(' ', $fullName);
The first letter of the first name:
$flfn = substr($names[0], 0,1);
Output:
echo $flfn . " " . $names[1];
You should probably check that the length of the $names array is 2 after the explode so you can manage different cases such as someone with 2 last names.
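Since the hint in the question was strpos(), here is a rough sketch of that route as well. It keeps everything after the first space, so a person with two last names keeps both. The function name shortName and the sample name are made up for illustration, and substr-style indexing assumes single-byte characters (mb_substr() would be the multibyte equivalent):

```php
<?php
// Hypothetical helper: "John Smith" -> "J Smith".
function shortName(string $fullName): string
{
    $fullName = trim($fullName);
    // Position of the first space, or false when there is only one name.
    $pos = strpos($fullName, ' ');
    if ($pos === false) {
        return $fullName;
    }
    // First character of the first name, then the space and the rest.
    return $fullName[0] . substr($fullName, $pos);
}

echo shortName("John Smith"); // prints "J Smith"
```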

Hello and welcome to SO.
There are multiple ways to do this. Here is one:
function getShortName($userId, $connect) {
// Look up the user's full name by id ($connect is a PDO connection)
$query = $connect->prepare("SELECT username FROM users WHERE id = ?");
$query->execute(array($userId));
// FETCH_ASSOC returns an array, so use array access rather than ->
$user = $query->fetch(PDO::FETCH_ASSOC);
$names = explode(" ", $user['username']);
$realName = $names[0][0]." ".$names[1];
echo $realName;
}
// For a user whose username is "Prateik Darji" this prints "P Darji"
getShortName($userId, $connect);

Search and replace all lines in a multiline string

I have a string with a large list with items named as follows:
$str = "f05cmdi-test1-name1
f06dmdi-test2-name2";
So the first 4 characters are random. I would like to have an output like this:
'mdi-test1-name1',
'mdi-test2-name2',
As you can see, the first characters of each line need to be replaced with a ' and every line needs to end with ',
How can I change the string into the output shown above? I've tried for hours with strstr and str_replace, but I can't get it working. It would save me a lot of time if I could get it to work.
Thanks for your help guys!

Here is a way to do the job:
$input = "f05cmdi-test1-name1
f05cmdi-test2-name2";
$result = preg_replace("/.{4}(\S+)/", "'$1',", $input);
echo $result;
Here \S stands for any non-whitespace character.
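If a line could ever contain a space, anchoring the pattern to line boundaries with the m modifier might be a little safer. A small sketch under the assumption that lines are separated by "\n"; the function name quoteLines is made up for this example:

```php
<?php
// Hypothetical helper: strips the first four characters of every line
// and wraps the remainder in quotes followed by a comma.
function quoteLines(string $input): string
{
    // With the m modifier, ^ and $ match at the start and end of each line.
    return preg_replace('/^.{4}(.+)$/m', "'$1',", $input);
}

echo quoteLines("f05cmdi-test1-name1\nf06dmdi-test2-name2");
```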

EDIT: I deleted my earlier approach, since the following method is more reliable and works for any possible combination of four characters.
So what do you do if there are a million different possibilities for the starting characters?
In your specific example, the only space is between the full strings (full string = "f05cmdi-test1-name1").
So:
$str = "f05cmdi-test1-name1 f06dmdi-test2-name2";
$result_array = [];
// Split at the spaces
$result = explode(" ", $str);
foreach($result as $item) {
// Skip the four random chars and keep the rest of the string
$result_array[] = substr($item, 4);
}
Resulting in:
$result_array = [
"mdi-test1-name1",
"mdi-test2-name2",
"....."
];
If you would like a single string in the style of:
"'mdi-test1-name1','mdi-test2-name2','...'"
Then you can simply do the following:
$result_final = "'" . implode("','" , $result_array) . "'";
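The string in the question is separated by line breaks rather than spaces, so the same idea can be sketched with preg_split() on \R, which matches any line-break sequence. The function name quoteItems is only illustrative:

```php
<?php
// Hypothetical helper: one quoted, comma-terminated item per input line.
function quoteItems(string $str): array
{
    // \R matches \n, \r\n and \r; empty lines are skipped.
    $lines = preg_split('/\R/', $str, -1, PREG_SPLIT_NO_EMPTY);
    return array_map(function ($line) {
        // Drop the four leading characters and add the quotes and comma.
        return "'" . substr($line, 4) . "',";
    }, $lines);
}

echo implode("\n", quoteItems("f05cmdi-test1-name1\nf06dmdi-test2-name2"));
```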

This is doable with a rather simple regex pattern:
<?php
$str = "f05cmdi-test1-name1
f05cmdi-test2-name2";
$str = preg_replace("~[a-z0-9]{1,4}mdi-test([0-9]+-[a-z0-9]+)~", "'mdi-test\\1',", $str);
echo $str;
Alter it to suit your more specific needs.

find position of word in array php

I've nearly figured out my sms issue, and have it narrowed down to one small issue that I can't seem to get to work.
Here's what I have:
include('Services/Twilio.php');
/* Read the contents of the 'Body' field of the Request. */
$body = $_REQUEST['Body'];
/* Remove formatting from $body until it is just lowercase
characters without punctuation or spaces. */
$result = rtrim($body);
$result = strtolower($result);
$text = explode(' ',$result);
$keyword = array('dog','pigeon','owl');
$word = array_intersect($keyword,$text);
$key = array_search($word, array_keys($keyword));
$word[$key]();
/* ^^this is the issue */
So the SMS app now can read a sentence and find the keyword no sweat. Problem is, I also need to have the position of the word in relation to where it is in the keyword array. If I manually change the number to the correct position and send a text containing that keyword it functions flawlessly.
Unfortunately array_search doesn't work because it can't accept a dynamic needle. Is there any way to have the array position auto populate based on what keyword is found? In my code above, you can see (hopefully) what I'm trying to do.

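Solved the issue. It can be done dynamically: array_intersect() preserves the keys of $keyword, so key() returns the position of the matched keyword.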
$body = $_REQUEST['Body'];
/* Remove formatting from $body until it is just lowercase
characters without punctuation or spaces. */
$result = rtrim($body);
$result = strtolower($result);
$text = explode(' ',$result);
$keyword = array('listing','pigeon_show','owl');
$word = array_intersect($keyword,$text);
$key = key($word);
if ($word) {
$word[$key]();
} else {
index();
}
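To see why this works, here is a small standalone sketch of the key-preserving behaviour of array_intersect(). The helper name findKeywordIndex and the sample message are made up for illustration, and $keywords is assumed to be a plain list with integer keys:

```php
<?php
// Hypothetical helper: returns the index in $keywords of the first keyword
// (in keyword-list order) that occurs in $message, or null when none does.
function findKeywordIndex(string $message, array $keywords): ?int
{
    $words = explode(' ', strtolower(trim($message)));
    // array_intersect() keeps the keys of its first argument.
    $found = array_intersect($keywords, $words);
    return $found ? key($found) : null;
}

var_dump(findKeywordIndex("I saw an OWL today", ['dog', 'pigeon', 'owl'])); // int(2)
```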

PHP SEO Functions

I am having a problem trying to understand functions with variables. Here is my code. I am trying to create friendly URLs for a site that reports scams. I created a DB full of bad words to remove from the URL if they are present. If the name in the URL contains a link, I would like it to look like this: example.com-scam.php or .html (whichever is better). However, right now it strips the (.) so it looks like this: examplecom. How can I fix this to keep the (.) and add -scam.php or -scam.html to the end?
functions/seourls.php
/* takes the input, scrubs bad characters */
function generate_seo_link($link, $replace = '-', $remove_words = true, $words_array = array()) {
//make it lowercase, remove punctuation, remove multiple/leading/ending spaces
$return = trim(ereg_replace(' +', ' ', preg_replace('/[^a-zA-Z0-9\s]/', '', strtolower($link))));
//remove words, if not helpful to seo
//i like my defaults list in remove_words(), so I wont pass that array
if($remove_words) { $return = remove_words($return, $replace, $words_array); }
//convert the spaces to whatever the user wants
//usually a dash or underscore..
//...then return the value.
return str_replace(' ', $replace, $return);
}
/* takes an input, scrubs unnecessary words */
function remove_words($link,$replace,$words_array = array(),$unique_words = true)
{
//separate all words based on spaces
$input_array = explode(' ',$link);
//create the return array
$return = array();
//loops through words, remove bad words, keep good ones
foreach($input_array as $word)
{
//if it's a word we should add...
if(!in_array($word,$words_array) && ($unique_words ? !in_array($word,$return) : true))
{
$return[] = $word;
}
}
//return good words separated by dashes
return implode($replace,$return);
}
This is my test.php file:
require_once "dbConnection.php";
$query = "select * from bad_words";
$result = mysql_query($query);
while ($record = mysql_fetch_assoc($result))
{
$words_array[] = $record['word'];
}
$sql = "SELECT * FROM reported_scams WHERE id=".$_GET['id'];
$rs_result = mysql_query($sql);
while ($row = mysql_fetch_array($rs_result)) {
$link = $row['business'];
}
require_once "functions/seourls.php";
echo generate_seo_link($link, '-', true, $words_array);
Any help understanding this would be greatly appreciated :) Also, why do I have to echo the function?

Your first real line of code has the comment:
//make it lowercase, remove punctuation, remove multiple/leading/ending spaces
Periods are punctuation, so they're being removed. Add . to the accepted character set if you want to make an exception.

Alter your regular expression (second line) to allow full stops:
$return = trim(ereg_replace(' +', ' ', preg_replace('/[^a-zA-Z0-9\.\s]/', '', strtolower($link))));
You need to echo the result because the function returns a value rather than printing it. You can change return in the function to echo/print if you want the output printed as soon as you call the function.
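Note that ereg_replace() was removed in PHP 7.0, so on current PHP versions the whitespace step has to use preg_replace() too. A minimal sketch of the dot-preserving cleanup with the suffix appended, leaving out the bad-word removal; the function name seoSlug and the default suffix are assumptions for this example:

```php
<?php
// Hypothetical helper: builds a slug such as "example.com-scam.php".
function seoSlug(string $name, string $suffix = '-scam.php'): string
{
    // Lowercase, then drop everything except letters, digits, dots and whitespace.
    $clean = preg_replace('/[^a-z0-9.\s]/', '', strtolower($name));
    // Collapse runs of whitespace into single dashes.
    $clean = preg_replace('/\s+/', '-', trim($clean));
    return $clean . $suffix;
}

echo seoSlug("Example.com"); // prints "example.com-scam.php"
```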

php tokenisation

I have a string of characters separated by many hashes (#). I need to get the individual words between the hashes in PHP. Here's what my code looks like:
$sql = "SELECT attribute_type.at_name,attribute_type.at_id FROM attribute_type
WHERE attribute_type.prodType_id = $pt_id
AND attribute_type.at_id NOT IN (SELECT at_id
FROM attribute_type
WHERE attribute_type.at_name = 'Product')";
while($items. strpos("#")>0){
// add the selected AT in every loop to be excluded
// .
// here tokens from $items are stored individually in
// $selectedAT (whose value changes in every loop/cycle)
//
// add to the current sql statement the values $at_id and $searchparam
$sql = $sql . "AND attribute_type.at_id NOT IN
(SELECT at_id FROM attribute_type
WHERE attribute_type.at_name = '$selectedAT')";
}
$dbcon = new DatabaseManager();
$rs = $dbcon->runQuery($sql);

explode() creates an array by splitting a string on a given delimiter:
$words = explode("#", $items);
Now if you need to take these words you extracted from the string and use them to compare to some column in a SQL query...
$sql = "SELECT ... WHERE column IN ('" . implode("', '", $words) . "')";
You should not need to build a query in a loop as you are doing once you have the words in an array.
Even if you did want to do it that way, you don't want to create a subquery for every word when you could just OR the words together in one subquery.
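Because the words end up inside SQL, it may be safer to generate placeholders and bind the words through a prepared statement than to implode them into the query text. A sketch of just the string-building part, assuming the clause is appended to the query from the question; the helper name buildExclusion and the sample values are invented for illustration, and the binding step depends on your database layer:

```php
<?php
// Hypothetical helper: splits "a#b#c" and builds a parameterised NOT IN clause.
// Returns [clause, values]; the values are meant to be bound when executing.
function buildExclusion(string $items): array
{
    // Drop empty tokens produced by leading, trailing or doubled hashes.
    $words = array_values(array_filter(explode('#', $items), function ($w) {
        return $w !== '';
    }));
    if (!$words) {
        return ['', []];
    }
    // One "?" placeholder per word.
    $placeholders = implode(', ', array_fill(0, count($words), '?'));
    return [" AND attribute_type.at_name NOT IN ($placeholders)", $words];
}

list($clause, $params) = buildExclusion("Color#Size##Weight");
echo $clause; // prints " AND attribute_type.at_name NOT IN (?, ?, ?)"
```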

Try strtok. Example paste:
$string = "This is\tan example\nstring";
$tok = strtok($string, " \n\t");
while ($tok !== false) {
echo "Word=$tok<br />";
$tok = strtok(" \n\t");
}

Do not use split() as suggested in another answer (which has now been updated). It uses old POSIX regular expressions; it is deprecated and was removed in PHP 7.0.
To split a string, use $words = explode('#', $items); which does not use a regular expression but a plain string.
Docref: http://php.net/explode
