PHP - Search, put letters written in bold - php

I'm trying to do a search engine where I write in a textbox, for example, "Mi" and it selects and shows "Mike Ross". However it's not working with spaces. I write "Mike" and I get "Mike Ross", but when I write "Mike " I get "Mike Ross" (no bold).
The same is happening with accents.
So I write "Jo" and the result is "João Carlos". If I write "Joa", the result is "João Carlos" (without any bold part). I want to ignore the accents while writing but still display them in the results.
So this is my script after the SELECT:
while($row = $result->fetch_array()) {
$name = $row['name'];
$array = explode(' ',trim($name));
$array_length = count($array);
for ($i=0; $i<$array_length; $i++ ) {
$letters = substr($array[$i], 0, $q_length);
if (strtoupper($letters) == strtoupper($q)) {
$bold_name = '<strong>'.$letters.'</strong>';
$final_name = preg_replace('~'.$letters.'~i', $bold_name, $array[$i], 1);
$array[$i] = $final_name;
}
array[$i] = array[$i]." ";
}
foreach ($array as $t_name) { echo $t_name;
}
Thank you for your help!

if (strtoupper($letters) == strtoupper($q))
This will never evaluate to "true" with spaces since you're removing spaces from the matchable letter set with explode(' ', trim($name), effectively making any value of $q with a space unmatchable to $letters
Here's a quick example that does what I think you're looking for
<?php
$q = "Mike "; // User query
$name = "Mike Ross"; // Database row value
if(stripos($name, $q) !== false) // Case-insensitive match
{
// Case-insensitive replace of match with match enclosed in strong tag
$result = preg_replace("/($q)/i", '<strong>$1</strong>', $name);
print_r($result);
}
// Result is
// <strong>Mike </strong>Ross
From what I can tell (a quick google for "replace accented characters PHP"), you're kind of out of luck with that one. This question provides a quick solution using strtr, and this tip uses a similar method with str_replace.
Unfortunately, these rely on predefined character sets, so incoming accents you haven't prepared for will fail. You may be better off relying on users to enter the special characters when they search, or create a new column with a "searchable" name with the accented characters replaced as best as you can, and return the real name as the "matched" display field.
One more Note
I found another solution that can do most of what you want, except the returned name will not have the accent. It will, however, match the accented value in the DB with a non-accented search. Modified code is:
<?php
$q = "Joa";
$name = "João Carlos";
$searchable_name = replace_accents($name);
if(stripos($searchable_name, $q) !== false)
{
$result = preg_replace("/($q)/i", '<strong>$1</strong>', $searchable_name);
print_r($result);
}
function replace_accents($str) {
$str = htmlentities($str, ENT_COMPAT, "UTF-8");
$str = preg_replace('/&([a-zA-Z])(uml|acute|grave|circ|tilde);/','$1',$str);
return html_entity_decode($str);
}

Related

php foreach counting new lines (\n)

If I have a piece of code that works like this:
$i = 0;
$names = explode(",", $userInput);
foreach($names as $name) {
$i++;
}
It works perfectly, provided the user has placed a comma after each name entered into the html textarea this comes from. But I want to make it more user friendly and change it so that each name can be entered on a new line and it'll count how many lines the user has entered to determine the number of names entered into the field. So I tried:
$i = 0;
$names = explode("\n", $userInput);
foreach($names as $name) {
$i++;
}
But this just gives me "1" as a result, regardless the number of new lines in the textarea. How do I make my explode count new lines instead of basing the count on something specifically entered into the text string?
EDIT Thanks to the people who answered, I don't believe there were any wrong answers as such, just one that suited my original code better than the others, and functioned. I ended up adopting this and modifying it so that numerous blank line returns did not result in artificially inflating the $userInput count. Here is what I am now using:
if(($userInput) != NULL) {
$i = 0;
$names = explode(PHP_EOL, trim($userInput));
foreach($names as $name) {
$i++;
}
}
It trims the empty space from the $userInput so that the remainder of the function is performed on only valid line content. :)
Don't make it complicated, you don't have to explode it into an array, just use this:
(Just count the new line character (PHP_EOL) in your string with substr_count())
echo substr_count($userInput, PHP_EOL);
Try using the PHP end of line constant PHP_EOL
$names = explode(PHP_EOL, $userInput);
A blank string as input:
var_dump(explode("\n", ''));
gives this as a result from a call to explode():
array(1) { [0]=> string(0) "" }
so you could use a ternary statement:
$names = $userInput == '' ? array() : explode("\n", $userInput);
Maybe you can change the explode function with preg_split to explode the user string with a regex
$users = preg_split('/[\n\r]+/', $original);
That's the idea, but I'm not on the computer so I can't test my code.
That regex would split the string if it founds one or more line breaks.

PHP word censor with keeping the original caps

We want to censor certain words on our site but each word has different censored output.
For example:
PHP => P*P, javascript => j*vascript
(However not always the second letter.)
So we want a simple "one star" censor system but with keeping the original caps. The datas coming from the database are uncensored so we need the fastest way that possible.
$data="Javascript and php are awesome!";
$word[]="PHP";
$censor[]="H";//the letter we want to replace
$word[]="javascript";
$censor[]="a"//but only once (j*v*script would look wierd)
//Of course if it needed we can use the full censored word in $censor variables
Expected value:
J*vascript and p*p are awesome!
Thanks for all the answers!
You can put your censored words in key-based array, and value of the array should be the position of what char is replaced with * (see $censor array example bellow).
$string = 'JavaSCRIPT and pHp are testing test-ground for TEST ŠĐČĆŽ ŠĐčćŽ!';
$censor = [
'php' => 2,
'javascript' => 2,
'test' => 3,
'šđčćž' => 4,
];
function stringCensorSlow($string, array $censor) {
foreach ($censor as $word => $position) {
while (($pos = mb_stripos($string, $word)) !== false) {
$string =
mb_substr($string, 0, $pos + $position - 1) .
'*' .
mb_substr($string, $pos + $position);
}
}
return $string;
}
function stringCensorFast($string, array $censor) {
$pattern = [];
foreach ($censor as $word => $position) {
$word = '~(' . mb_substr($word, 0, $position - 1) . ')' . mb_substr($word, $position - 1, 1) . '(' . mb_substr($word, $position) . ')~iu';
$pattern[$word] = '$1*$2';
}
return preg_replace(array_keys($pattern), array_values($pattern), $string);
}
Use example :
echo stringCensorSlow($string, $censor);
# J*vaSCRIPT and p*p are te*ting te*t-ground for TE*T ŠĐČ*Ž ŠĐč*Ž!
echo stringCensorFast($string, $censor) . "\n";
# J*vaSCRIPT and p*p are te*ting te*t-ground for TE*T ŠĐČ*Ž ŠĐč*Ž!
Speed test :
foreach (['stringCensorSlow', 'stringCensorFast'] as $func) {
$time = microtime(true);
for ($i = 0; $i < 10000; $i++) {
$func($string, $censor);
}
$time = microtime(true) - $time;
echo "{$func}() took $time\n";
}
output on my localhost was :
stringCensorSlow() took 1.9752140045166
stringCensorFast() took 0.11587309837341
Upgrade #1: added multibyte character safe.
Upgrade #2: added example for preg_replace, which is faster than mb_substr. Tnx to AbsoluteƵERØ
Upgrade #3: added speed test loop and result on my local PC machine.
Make an array of words and replacements. This should be your fastest option in terms of processing, but a little more methodical to setup. Remember when you're setting up your patterns to use the i modifier to make each pattern case insensitive. You could ultimately pull these from a database into the arrays. I've hard-coded the arrays here for the example.
<!DOCTYPE html>
<html>
<meta content="text/html; charset=UTF-8" http-equiv="content-type">
<?php
$word_to_alter = array(
'!(j)a(v)a(script)(s|ing|ed)?!i',
'!(p)h(p)!i',
'!(m)y(sql)!i',
'!(p)(yth)o(n)!i',
'!(r)u(by)!i',
'!(ВЗЛ)О(М)!iu',
);
$alteration = array(
'$1*$2*$3$4',
'$1*$2',
'$1*$2',
'$1$2*$3',
'$1*$2',
'$1*$2',
);
$string = "Welcome to the world of programming. You can learn PHP, MySQL, Python, Ruby, and Javascript all at your own pace. If you know someone who uses javascripting in their daily routine you can ask them about becoming a programmer who writes JavaScripts. взлом прохладно";
$newstring = preg_replace($word_to_alter,$alteration,$string);
echo $newstring;
?>
</html>
Output
Welcome to the world of programming. You can learn P*P, M*SQL, Pyth*n,
R*by, and J*v*script all at your own pace. If you know someone who
uses j*v*scripting in their daily routine you can ask them about
becoming a programmer who writes J*v*Scripts. взл*м прохладно
Update
It works the same with UTF-8 characters, note that you have to specify a u modifier to make the pattern treated as UTF-8.
u (PCRE_UTF8)
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern strings are treated as UTF-8. This
modifier is available from PHP 4.1.0 or greater on Unix and from PHP
4.2.3 on win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.
Why not just use a little helper function and pass it a word and the desired censor?
function censorWord($word, $censor) {
if(strpos($word, $censor)) {
return preg_replace("/$censor/",'*', $word, 1);
}
}
echo censorWord("Javascript", "a"); // returns J*avascript
echo censorWord("PHP", "H"); // returns P*P
Then you can check the word against your wordlist and if it is a word that should be censored, you can pass it to the function. Then, you also always have the original word as well as the censored one to play with or put back in your sentence.
This would also make it easy to change the number of letters censored by just changing the offset in the preg_replace. All you have to do is keep an array of words, explode the sentence on spaces or something, and then check in_array. If it is in the array, send it to censorWord().
Demo
And here's a more complete example doing exactly what you said in the OP.
function censorWord($word, $censor) {
if(strpos($word, $censor)) {
return preg_replace("/$censor/",'*', $word, 1);
}
}
$word_list = ['php','javascript'];
$data = "Javascript and php are awesome!";
$words = explode(" ", $data);
// pass each word by reference so it can be modified inside our array
foreach($words as &$word) {
if(in_array(strtolower($word), $word_list)) {
// this just passes the second letter of the word
// as the $censor argument
$word = censorWord($word, $word[1]);
}
}
echo implode(" ", $words); // returns J*vascript and p*p are awesome!
Another Demo
You could store a lowercase list of the censored words somewhere, and if you're okay with starring the second letter every time, do something like this:
if (in_array(strtolower($word), $censored_words)) {
$word = substr($word, 0, 1) . "*" . substr($word, 2);
}
If you want to change the first occurrence of a letter, you could do something like:
$censored_words = array('javascript' => 'a', 'php' => 'h', 'ruby' => 'b');
$lword = strtolower($word);
if (in_array($lword, array_keys($censored_words))) {
$ind = strpos($lword, $censored_words[$lword]);
$word = substr($word, 0, $ind) . "*" . substr($word, $ind + 1);
}
This is what I would do:
Create a simple database (text file) and make a "table" of all your censored words and expected censored results. E.G.:
PHP --- P*P
javascript --- j*vascript
HTML --- HT*L
Write PHP code to compare the database information to your simple censored file. You will have to use array explode to create an array of only words. Something like this:
/* Opening database of censored words */
$filename = "/files/censored_words.txt";
$file = fopen( $filename, "r" );
if( $file == false )
{
echo ( "Error in opening file" );
exit();
}
/* Creating an array of words from string*/
$data = explode(" ", $data); // What was "Javascript and PHP are awesome!" has
// become "Javascript", "and", "PHP", "are",
// "awesome!". This is useful.
If your script finds matching words, replace the word in your data with the censored word from your list. You would have to delimit the file first by \r\n and finally by ---. (Or whatever you choose for separating your table with.)
Hope this helped!

PHP SEO Functions

I am having a problem trying to understand functions with variables. Here is my code. I am trying to create friendly urls for a site that reports scams. I created a DB full of bad words to remove from the url if it is preset. If the name in the url contains a link I would like it to look like this: example.com-scam.php or html (whichever is better). However, right now it strips the (.) and it looks like this examplecom. How can I fix this to leave the (.) and add a -scam.php or -scam.html to the end?
functions/seourls.php
/* takes the input, scrubs bad characters */
function generate_seo_link($link, $replace = '-', $remove_words = true, $words_array = array()) {
//make it lowercase, remove punctuation, remove multiple/leading/ending spaces
$return = trim(ereg_replace(' +', ' ', preg_replace('/[^a-zA-Z0-9\s]/', '', strtolower($link))));
//remove words, if not helpful to seo
//i like my defaults list in remove_words(), so I wont pass that array
if($remove_words) { $return = remove_words($return, $replace, $words_array); }
//convert the spaces to whatever the user wants
//usually a dash or underscore..
//...then return the value.
return str_replace(' ', $replace, $return);
}
/* takes an input, scrubs unnecessary words */
function remove_words($link,$replace,$words_array = array(),$unique_words = true)
{
//separate all words based on spaces
$input_array = explode(' ',$link);
//create the return array
$return = array();
//loops through words, remove bad words, keep good ones
foreach($input_array as $word)
{
//if it's a word we should add...
if(!in_array($word,$words_array) && ($unique_words ? !in_array($word,$return) : true))
{
$return[] = $word;
}
}
//return good words separated by dashes
return implode($replace,$return);
}
This is my test.php file:
require_once "dbConnection.php";
$query = "select * from bad_words";
$result = mysql_query($query);
while ($record = mysql_fetch_assoc($result))
{
$words_array[] = $record['word'];
}
$sql = "SELECT * FROM reported_scams WHERE id=".$_GET['id'];
$rs_result = mysql_query($sql);
while ($row = mysql_fetch_array($rs_result)) {
$link = $row['business'];
}
require_once "functions/seourls.php";
echo generate_seo_link($link, '-', true, $words_array);
Any help understanding this would be greatly appreciated :) Also, why am I having to echo the function?
Your first real line of code has the comment:
//make it lowercase, remove punctuation, remove multiple/leading/ending spaces
Periods are punctuation, so they're being removed. Add . to the accepted character set if you want to make an exception.
Alter your regular expression (second line) to allow full stops:
$return = trim(ereg_replace(' +', ' ', preg_replace('/[^a-zA-Z0-9\.\s]/', '', strtolower($link))));
The reason your code needs to be echoed is because you are returning a variable in the function. You can change return in the function to echo/print if you want to print it out as soon as you call the function.

Create acronym from a string containing only words

I'm looking for a way that I can extract the first letter of each word from an input field and place it into a variable.
Example: if the input field is "Stack-Overflow Questions Tags Users" then the output for the variable should be something like "SOQTU"
$s = 'Stack-Overflow Questions Tags Users';
echo preg_replace('/\b(\w)|./', '$1', $s);
the same as codaddict's but shorter
For unicode support, add the u modifier to regex: preg_replace('...../u',
Something like:
$s = 'Stack-Overflow Questions Tags Users';
if(preg_match_all('/\b(\w)/',strtoupper($s),$m)) {
$v = implode('',$m[1]); // $v is now SOQTU
}
I'm using the regex \b(\w) to match the word-char immediately following the word boundary.
EDIT:
To ensure all your Acronym char are uppercase, you can use strtoupper as shown.
Just to be completely different:
$input = 'Stack-Overflow Questions Tags Users';
$acronym = implode('',array_diff_assoc(str_split(ucwords($input)),str_split(strtolower($input))));
echo $acronym;
$initialism = preg_replace('/\b(\w)\w*\W*/', '\1', $string);
If they are separated by only space and not other things. This is how you can do it:
function acronym($longname)
{
$letters=array();
$words=explode(' ', $longname);
foreach($words as $word)
{
$word = (substr($word, 0, 1));
array_push($letters, $word);
}
$shortname = strtoupper(implode($letters));
return $shortname;
}
Regular expression matching as codaddict says above, or str_word_count() with 1 as the second parameter, which returns an array of found words. See the examples in the manual. Then you can get the first letter of each word any way you like, including substr($word, 0, 1)
The str_word_count() function might do what you are looking for:
$words = str_word_count ('Stack-Overflow Questions Tags Users', 1);
$result = "";
for ($i = 0; $i < count($words); ++$i)
$result .= $words[$i][0];
function initialism($str, $as_space = array('-'))
{
$str = str_replace($as_space, ' ', trim($str));
$ret = '';
foreach (explode(' ', $str) as $word) {
$ret .= strtoupper($word[0]);
}
return $ret;
}
$phrase = 'Stack-Overflow Questions IT Tags Users Meta Example';
echo initialism($phrase);
// SOQITTUME
$s = "Stack-Overflow Questions IT Tags Users Meta Example";
$sArr = explode(' ', ucwords(strtolower($s)));
$sAcr = "";
foreach ($sArr as $key) {
$firstAlphabet = substr($key, 0,1);
$sAcr = $sAcr.$firstAlphabet ;
}
using answer from #codaddict.
i also thought in a case where you have an abbreviated word as the word to be abbreviated e.g DPR and not Development Petroleum Resources, so such word will be on D as the abbreviated version which doesn't make much sense.
function AbbrWords($str,$amt){
$pst = substr($str,0,$amt);
$length = strlen($str);
if($length > $amt){
return $pst;
}else{
return $pst;
}
}
function AbbrSent($str,$amt){
if(preg_match_all('/\b(\w)/',strtoupper($str),$m)) {
$v = implode('',$m[1]); // $v is now SOQTU
if(strlen($v) < 2){
if(strlen($str) < 5){
return $str;
}else{
return AbbrWords($str,$amt);
}
}else{
return AbbrWords($v,$amt);
}
}
}
As an alternative to #user187291's preg_replace() pattern, here is the same functionality without needing a reference in the replacement string.
It works by matching the first occurring word characters, then forgetting it with \K, then it will match zero or more word characters, then it will match zero or more non-word characters. This will consume all of the unwanted characters and only leave the first occurring word characters. This is ideal because there is no need to implode an array of matches. The u modifier ensures that accented/multibyte characters are treated as whole characters by the regex engine.
Code: (Demo)
$tests = [
'Stack-Overflow Questions Tags Users',
'Stack Overflow Close Vote Reviewers',
'Jean-Claude Vandàmme'
];
var_export(
preg_replace('/\w\K\w*\W*/u', '', $tests)
);
Output:
array (
0 => 'SOQTU',
1 => 'SOCVR',
2 => 'JCV',
)

Apply title-case to all words in string except for nominated acronyms

Example Input: SMK SUNGAI PUNAI
My Code:
$school = 'SMK SUNGAI PUNAI';
echo ucwords(strtolower($school));
Unwanted Output: Smk Sungai Punai
Question
How to make the output SMK Sungai Punai which allows SMK to remain in ALL-CAPS.
Update.
The problem I have list of 10,000 school names. From PDF, I convert to mysql. I copied exactly from PDF the name of schools -- all in uppercase.
How can I implement conditional title-casing?
As far as I understand you want to have all school names with the first character of every word in uppercase and exclude some special words ($exceptions in my sample) from this processing.
You could do that like this:
function createSchoolName($school) {
$exceptions = array('SMK', 'PTS', 'SBP');
$result = "";
$words = explode(" ", $school);
foreach ($words as $word) {
if (in_array($word, $exceptions))
$result .= " ".$word;
else
$result .= " ".strtolower($word);
}
return trim(ucwords($result));
}
echo createSchoolName('SMK SUNGAI PUNAI');
This example would return SMK Sungai Punai as required by your question.
You can very simply create a pipe-delimited set of excluded words/acronyms, then use (*SKIP)(*FAIL) to prevent matching those whole words.
mb_convert_case() is an excellent function to call because it instantly provides TitleCasing and it is multibyte safe.
Code: (Demo)
$pipedExclusions = 'SMK|USA|AUS';
echo preg_replace_callback(
'~\b(?:(?:' . $pipedExclusions . ')(*SKIP)(*FAIL)|\p{Lu}+)\b~u',
fn($m) => mb_convert_case($m[0], MB_CASE_TITLE),
'SMK SUNGAI PUNAI'
);
// SMK Sungai Punai
There's no really good way to do it. In this case you can assume it's an abbreviation because it's only three letters long and contains no vowels. You can write a set of rules that look for abbreviations in the string and then uppercase them, but in some cases it'll be impossible... consider "BOB PLAYS TOO MUCH WOW."
You can use something like this:
<?php
$str = 'SMK SUNGAI PUNAI';
$str = strtolower($str);
$arr = explode(" ", $str);
$new_str = strtoupper($arr[0]). ' ' .ucfirst($arr[1]). ' ' .ucfirst($arr[2]);
echo '<p>'.$new_str.'</p>';
// Result: SMK Sungai Punai
?>

Categories