if (in_array(<title>, $SomeArray)) return true - php

My problem is that this code does not check each word in the string. I want something like a filter, tag, or category match.
$title = "What iS YouR NAME?";
$english = Array( 'Name', 'Vacation' );
if(in_array(strtolower($title),$english)){
$language = 'english';
} else if(in_array(strtolower($title),$france)){
$language = 'france';
} else if(in_array(strtolower($title),$spanish)){
$language = 'spanish';
} else if(in_array(strtolower($title),$chinese)){
$language = 'chinese';
} else if(in_array(strtolower($title),$japanese)){
$language = 'japanese';
} else {
$language = null;
}
The output is null.

There is no real problem here: the string in $title isn't in any of the arrays you are testing, even in lower case.
Try testing every word in each language array against the string instead.
$english = Array( 'Name', 'Vacation' );
$languages = array('english'=>$english,
'france'=>$france,
'spanish'=>$spanish,
'chinese'=>$chinese,
'japanese'=>$japanese);
while($line = mysql_fetch_assoc($result)) {
$title = $line['title'];
//$lower_title = strtolower($title); stristr() instead.
$language_found = false;
foreach($languages as $language_name=>$language) {
$idx = 0;
while(($language_found === false) && $idx < count($language)) {
$word = $language[$idx];
if(stristr($title, $word) !== false) {
$language_found = $language_name;
}
$idx++;
}
}
// got language name or false...
}
You could also break the string up with explode, of course, and test each word of the resulting array against each language.

The output is null because the string "what is your name?" is not in any of the language arrays. Note that you should not write language names in code.
Instead, a dictionary or array of languages (in the form of dictionaries or objects) allows future extension and separates data from control logic.

There are several logic issues:
The first problem is that you are checking whether a multi-word string is in an array of single words. That will never find a match. You need to break up $title with explode and loop over the words, for example:
$title_words = explode(' ', strtolower($title));
foreach($title_words as $word){
//check language
//now you can use in_array and expect some matches
}
However, that raises a second issue: what if a word appears in more than one language?
A third issue is that your sample converts the search string to lowercase, but every word in your array starts with an upper-case letter. For example,
$english = Array( 'Name', 'Vacation' );
should be
$english = array( 'name', 'vacation' );
if you expect matches.
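The points above can be put together in a minimal sketch. The word lists, the 'french' key, and the detectLanguage() helper are assumptions made up for illustration, and the split pattern only handles ASCII letters. It splits on non-letters rather than on spaces so that "NAME?" becomes "name" and can match.

```php
<?php
// Hypothetical word lists, all lowercase so they compare cleanly with strtolower() output.
$languages = [
    'english' => ['name', 'vacation'],
    'french'  => ['nom', 'vacances'],
];

// Returns the first language whose list contains a word of $title, or null.
function detectLanguage(string $title, array $languages): ?string
{
    // Split on anything that is not an ASCII letter, so "NAME?" becomes "name".
    $words = preg_split('/[^a-z]+/', strtolower($title), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($languages as $name => $list) {
        // array_intersect() returns the words found in both arrays.
        if (array_intersect($words, $list)) {
            return $name;
        }
    }
    return null;
}

var_dump(detectLanguage('What iS YouR NAME?', $languages)); // string(7) "english"
```

Because the first matching language wins, a word that appears in several lists still needs a tie-breaking rule of your own.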

Related

php strpos and approximate match with 1 character difference

I searched the system but couldn't find any help I could understand on this, so here goes...
I need to find an approximate match for a string in php.
Essentially I'm checking that all the $names are in the $cv string and if not it sets a flag to true.
foreach( $names as $name ) {
if ( strrpos( $cv, $name ) === false ) {
$nonameincv = true;
}
}
It works fine. However, I had a case of $cv = "marie_claire" and a $name = "clare" which set the flag (of course) but which I'd have liked for strpos to have "found" as it were.
Is it possible to do an approximate match so that if a string has 1 extra letter anywhere in it, it would match? For example so that:
$name = "clare" is found in $cv = "marie_claire"
$name = "caire" is found in $cv = "marie_claire"
$name = "laire" is found in $cv = "marie_claire"
and so on...
Note: this works when the difference is exactly one character, as described in the question.
Try this code snippet:
<?php
ini_set('display_errors', 1);
$stringToSearch="mare";
$wholeString = "marie_claire";
$wholeStringArray= str_split($wholeString);
for($x=0;$x<strlen($wholeString);$x++)
{
$tempArray=$wholeStringArray;
unset($tempArray[$x]);
if(strpos(implode("", $tempArray), $stringToSearch)!==false)
{
echo "Found: $stringToSearch in ".implode("", $wholeStringArray);
break;
}
}
Try this. It ignores performance, but it should work for your case. You can adjust the number of differing characters you are willing to accept. Note that array_diff compares the characters as sets and ignores their order.
$names = array("clare", "caire", "laire");
$cv = "marie_claire";
foreach( $names as $name ) {
$sname = str_split($name);
$words = explode('_', $cv);
foreach($words as $word) {
$sword = str_split($word);
$result = array_diff($sword, $sname);
if(count($result) < 2)
echo $name. ":true\r\n";
}
}
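As an alternative to the two snippets above, here is a sketch that uses PHP's built-in levenshtein() on a sliding window. This is a different technique from either answer, and the containsApprox() helper name is hypothetical. Each window is one character longer than the name, so an edit distance of exactly 1 means the name can be obtained by deleting a single character from that window.

```php
<?php
// Hypothetical helper: true if $needle occurs in $haystack exactly,
// or with one extra character inserted somewhere inside it.
function containsApprox(string $haystack, string $needle): bool
{
    if (strpos($haystack, $needle) !== false) {
        return true; // exact match
    }
    $window = strlen($needle) + 1;
    for ($i = 0; $i + $window <= strlen($haystack); $i++) {
        // The window is one character longer than the needle, so a distance
        // of 1 can only come from deleting one character from the window.
        if (levenshtein(substr($haystack, $i, $window), $needle) === 1) {
            return true;
        }
    }
    return false;
}

$cv = 'marie_claire';
foreach (['clare', 'caire', 'laire', 'bob'] as $name) {
    echo $name . ': ' . (containsApprox($cv, $name) ? 'found' : 'not found') . "\n";
}
```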

Simultaneous Preg_replace operation in php and regex

I know many of the users have asked this type of question but I am stuck in an odd situation.
I am working on logic where multiple occurrences of a specific pattern, each with a unique identifier, are replaced with content from the database when a match is found.
My regex pattern is
'/{code#(\d+)}/'
where the '\d+' is the unique identifier of the pattern above.
My PHP code is:
<?php
$text="The old version is {code#1}, The new version is {code#2}, The stable version is {code#3}";
$newsld=preg_match_all('/{code#(\d+)}/',$text,$arr);
$data = array("first Replace","Second Replace", "Third Replace");
echo $data=str_replace($arr[0], $data, $text);
?>
This works, but it is not dynamic at all. The numbers after the # in the pattern are IDs (1, 2 and 3), and their respective data is stored in the database.
How can I fetch the content for each ID mentioned in the pattern from the DB and replace the entire pattern with that content?
I can't see a way to do it. Thank you in advance.
It's not that difficult if you think about it. I'll be using PDO with prepared statements. So let's set it up:
$db = new PDO( // New PDO object
'mysql:host=localhost;dbname=projectn;charset=utf8', // Important: utf8 all the way through
'username',
'password',
array(
PDO::ATTR_EMULATE_PREPARES => false, // Turn off prepare emulation
PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION
)
);
This is the most basic setup for our DB. Check out this thread to learn more about emulated prepared statements and this external link to get started with PDO.
We got our input from somewhere, for the sake of simplicity we'll define it:
$text = 'The old version is {code#1}, The new version is {code#2}, The stable version {code#3}';
Now there are several ways to achieve our goal. I'll show you two:
1. Using preg_replace_callback():
$output = preg_replace_callback('/{code#(\d+)}/', function($m) use($db) {
$stmt = $db->prepare('SELECT `content` FROM `footable` WHERE `id`=?');
$stmt->execute(array($m[1]));
$row = $stmt->fetch(PDO::FETCH_ASSOC);
if($row === false){
return $m[0]; // Default value is the code we captured if there's no match in the DB
}else{
return $row['content'];
}
}, $text);
echo $output;
Note how we use use() to bring $db into the scope of the anonymous function, which avoids relying on global.
Now the downside is that this code is going to query the database for every single code it encounters to replace it. The advantage would be setting a default value in case there's no match in the database. If you don't have that many codes to replace, I would go for this solution.
2. Using preg_match_all():
if(preg_match_all('/{code#(\d+)}/', $text, $m)){
$codes = $m[1]; // For sanity/tracking purposes
$inQuery = implode(',', array_fill(0, count($codes), '?')); // Nice common trick: https://stackoverflow.com/a/10722827
$stmt = $db->prepare('SELECT `content` FROM `footable` WHERE `id` IN(' . $inQuery . ')');
$stmt->execute($codes);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
$contents = array_map(function($v){
return $v['content'];
}, $rows); // Get the content in a nice (numbered) array
$patterns = array_fill(0, count($codes), '/{code#(\d+)}/'); // Create an array of the same pattern N times (N = the amount of codes we have)
$text = preg_replace($patterns, $contents, $text, 1); // Do not forget to limit a replace to 1 (for each code)
echo $text;
}else{
echo 'no match';
}
The problem with the code above is that it replaces the code with an empty value if there's no match in the database. This could also shift up the values and thus could result in a shifted replacement. Example (code#2 doesn't exist in db):
Input: foo {code#1}, bar {code#2}, baz {code#3}
Output: foo AAA, bar CCC, baz
Expected output: foo AAA, bar , baz CCC
The preg_replace_callback() version works as expected. Maybe you could think of a hybrid solution. I'll leave that as homework for you :)
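One possible shape for such a hybrid, as a sketch only: collect the IDs with preg_match_all(), fetch them in a single query, then let preg_replace_callback() look each ID up in the fetched map and fall back to the original code when it is missing. The replaceCodes() helper and the $fetchByIds callback are hypothetical names. The database is replaced by an array here so the example runs on its own, and the commented PDO lookup assumes the same `footable` schema as above.

```php
<?php
// Hypothetical helper: $fetchByIds receives a list of ids and returns [id => content].
function replaceCodes(string $text, callable $fetchByIds): string
{
    if (!preg_match_all('/{code#(\d+)}/', $text, $m)) {
        return $text; // nothing to replace
    }
    $ids = array_values(array_unique($m[1]));
    $map = $fetchByIds($ids); // one lookup for all codes

    return preg_replace_callback('/{code#(\d+)}/', function ($m) use ($map) {
        // Keep the original {code#N} when the id is not in the map.
        return $map[$m[1]] ?? $m[0];
    }, $text);
}

// With PDO, the lookup could look like this (assumed schema: footable.id, footable.content):
// $fetch = function (array $ids) use ($db) {
//     $in = implode(',', array_fill(0, count($ids), '?'));
//     $stmt = $db->prepare("SELECT `id`, `content` FROM `footable` WHERE `id` IN($in)");
//     $stmt->execute($ids);
//     return $stmt->fetchAll(PDO::FETCH_KEY_PAIR); // [id => content]
// };

// Stand-in for the database: id 2 is deliberately missing.
$fakeDb = [1 => 'AAA', 3 => 'CCC'];
$fetch = function (array $ids) use ($fakeDb) {
    return array_intersect_key($fakeDb, array_flip($ids));
};

echo replaceCodes('foo {code#1}, bar {code#2}, baz {code#3}', $fetch);
// prints: foo AAA, bar {code#2}, baz CCC
```

Because the replacement is keyed by ID rather than by position, a missing row no longer shifts the remaining replacements.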
Here is another variant on how to solve the problem: As access to the database is most expensive, I would choose a design that allows you to query the database once for all codes used.
The text you've got could be represented with various segments, that is any combination of <TEXT> and <CODE> tokens:
The old version is {code#1}, The new version is {code#2}, ...
<TEXT_____________><CODE__><TEXT_______________><CODE__><TEXT_ ...
Tokenizing your string buffer into such a sequence allows you to obtain the codes used in the document and index which segments a code relates to.
You can then fetch the replacements for each code and then replace all segments of that code with the replacement.
Let's set this up and define the input text, your pattern and the token types:
$input = <<<BUFFER
The old version is {code#1}, The new version is {code#2}, The stable version is {code#3}
BUFFER;
$regex = '/{code#(\d+)}/';
const TOKEN_TEXT = 1;
const TOKEN_CODE = 2;
Next comes the part that splits the input into tokens. I use two arrays for that: one stores the type of each token ($tokens; text or code) and the other contains the string data ($segments). The input is copied into a buffer and the buffer is consumed until it is empty:
$tokens = [];
$segments = [];
$buffer = $input;
while (preg_match($regex, $buffer, $matches, PREG_OFFSET_CAPTURE, 0)) {
if ($matches[0][1]) {
$tokens[] = TOKEN_TEXT;
$segments[] = substr($buffer, 0, $matches[0][1]);
}
$tokens[] = TOKEN_CODE;
$segments[] = $matches[0][0];
$buffer = substr($buffer, $matches[0][1] + strlen($matches[0][0]));
}
if (strlen($buffer)) {
$tokens[] = TOKEN_TEXT;
$segments[] = $buffer;
$buffer = "";
}
Now all the input has been processed and is turned into tokens and segments.
Now this "token-stream" can be used to obtain all codes used. Additionally all code-tokens are indexed so that with the number of the code it's possible to say which segments need to be replaced. The indexing is done in the $patterns array:
$patterns = [];
foreach ($tokens as $index => $token) {
if ($token !== TOKEN_CODE) {
continue;
}
preg_match($regex, $segments[$index], $matches);
$code = (int)$matches[1];
$patterns[$code][] = $index;
}
Now that all codes have been obtained from the string, a database query can be formulated to obtain the replacement values. I mock that functionality by creating a result array of rows, which should do for the example. In practice you would fire a SELECT ... FROM ... WHERE code IN (12, 44, ...) query that fetches all results at once. I fake this by calculating a result:
$result = [];
foreach (array_keys($patterns) as $code) {
$result[] = [
'id' => $code,
'text' => sprintf('v%d.%d.%d%s', $code * 2 % 5 + $code % 2, 7 - 2 * $code % 5, 13 + $code, $code === 3 ? '' : '-beta'),
];
}
All that is left is to process the database result and replace the segments the result has codes for:
foreach ($result as $row) {
foreach ($patterns[$row['id']] as $index) {
$segments[$index] = $row['text'];
}
}
And then do the output:
echo implode("", $segments);
And that's it then. The output for this example:
The old version is v3.5.14-beta, The new version is v4.3.15-beta, The stable version is v2.6.16
The whole example in full:
<?php
/**
* Simultaneous Preg_replace operation in php and regex
*
* #link http://stackoverflow.com/a/29474371/367456
*/
$input = <<<BUFFER
The old version is {code#1}, The new version is {code#2}, The stable version is {code#3}
BUFFER;
$regex = '/{code#(\d+)}/';
const TOKEN_TEXT = 1;
const TOKEN_CODE = 2;
// convert the input into a stream of tokens - normal text or fields for replacement
$tokens = [];
$segments = [];
$buffer = $input;
while (preg_match($regex, $buffer, $matches, PREG_OFFSET_CAPTURE, 0)) {
if ($matches[0][1]) {
$tokens[] = TOKEN_TEXT;
$segments[] = substr($buffer, 0, $matches[0][1]);
}
$tokens[] = TOKEN_CODE;
$segments[] = $matches[0][0];
$buffer = substr($buffer, $matches[0][1] + strlen($matches[0][0]));
}
if (strlen($buffer)) {
$tokens[] = TOKEN_TEXT;
$segments[] = $buffer;
$buffer = "";
}
// index which tokens represent which codes
$patterns = [];
foreach ($tokens as $index => $token) {
if ($token !== TOKEN_CODE) {
continue;
}
preg_match($regex, $segments[$index], $matches);
$code = (int)$matches[1];
$patterns[$code][] = $index;
}
// lookup all codes in a database at once (simulated)
// SELECT id, text FROM replacements_table WHERE id IN (array_keys($patterns))
$result = [];
foreach (array_keys($patterns) as $code) {
$result[] = [
'id' => $code,
'text' => sprintf('v%d.%d.%d%s', $code * 2 % 5 + $code % 2, 7 - 2 * $code % 5, 13 + $code, $code === 3 ? '' : '-beta'),
];
}
// process the database result
foreach ($result as $row) {
foreach ($patterns[$row['id']] as $index) {
$segments[$index] = $row['text'];
}
}
// output the replacement result
echo implode("", $segments);

Retrieve word from string

I have this code:
$getClass = $params->get('pageclass_sfx');
var_dump($getClass); die();
The code above returns this:
string(24) "sl-articulo sl-categoria"
How can I retrieve the specific word I want regardless of its position?
I've seen people use arrays for this, but that would depend (I think) on the position in which these strings were entered, and those positions may vary.
For example:
$myvalue = $params->get('pageclass_sfx');
$arr = explode(' ',trim($myvalue));
echo $arr[0];
$arr[0] would return: sl-articulo
$arr[1] would return: sl-categoria
Thanks.
You can use substr for that in combination with strpos:
http://nl1.php.net/substr
http://nl1.php.net/strpos
$word = 'sl-categoria';
$page_class_sfx = $params->get('pageclass_sfx');
if (false !== ($pos = strpos($page_class_sfx, $word))) {
// stupid because you already have the word... But this is what you request if I understand correctly
echo 'found: ' . substr($page_class_sfx, $pos, strlen($word));
}
Not sure if you want to get a word from the string if you already know the word... You want to know if it's there? false !== strpos($page_class_sfx, $word) would be enough.
If you know exactly what strings you're looking for, then stripos() should be sufficient (or strpos() if you need case-sensitivity). For example:
$myvalue = $params->get('pageclass_sfx');
$pos = stripos($myvalue, "sl-articulo");
if ($pos === FALSE) {
// string "sl-articulo" was not found
} else {
// string "sl-articulo" was found at character position $pos
}
If you need to check whether a word is in a string, you can use the preg_match function.
if (preg_match('/some-word/', 'many some-words')) {
echo 'some-word';
}
This approach is only practical for a small list of words.
For other cases I suggest something like this.
$myvalue = $params->get('pageclass_sfx');
$arr = explode(' ',trim($myvalue));
$result = array();
foreach($arr as $key=> $value) {
// Index every word in the string.
if (!isset($result[$value])) {
$result[$value] = array(); // or 0 if you don`t need to use positions
}
$result[$value][] = $key; // For all positions
// $result[$value] ++; // For count of this word in string
}
// You can then test for specific words as follows:
if (isset($result['sl-categoria'])) {
var_dump($result['sl-categoria']);
}
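If the goal is only to know whether one class is present, regardless of where it sits in the string, a whole-token check may be enough. The sketch below assumes the classes are separated by whitespace. The hasClass() helper is a hypothetical name, and the literal string stands in for $params->get('pageclass_sfx').

```php
<?php
// Hypothetical helper: true if $class is one of the whitespace-separated tokens in $classList.
function hasClass(string $classList, string $class): bool
{
    $tokens = preg_split('/\s+/', trim($classList), -1, PREG_SPLIT_NO_EMPTY);
    return in_array($class, $tokens, true);
}

$myvalue = 'sl-articulo sl-categoria'; // stands in for $params->get('pageclass_sfx')
var_dump(hasClass($myvalue, 'sl-categoria')); // bool(true)
var_dump(hasClass($myvalue, 'sl-cat'));       // bool(false), although strpos() would find it
```

Unlike strpos() or stripos(), this does not report a hit when the word is only part of a longer class name.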

wildcard array comparison - improving efficiency

I have two arrays that I'm comparing and I'd like to know if there is a more efficient way to do it.
The first array is user submitted values, the second array is allowed values some of which may contain a wildcard in the place of numbers e.g.
// user submitted values
$values = array('fruit' => array(
'apple8756apple333',
'banana234banana',
'apple4apple333',
'kiwi435kiwi'
));
//allowed values
$match = array('allowed' => array(
'apple*apple333',
'banana234banana',
'kiwi*kiwi'
));
I need to know whether or not all of the values in the first array, match a value in the second array.
This is what I'm using:
// the number of values to validate
$valueCount = count($values['fruit']);
// the number of allowed to compare against
$matchCount = count($match['allowed']);
// the number of values passed validation
$passed = 0;
// update allowed wildcards to regular expression for preg_match
foreach($match['allowed'] as &$allowed)
{
$allowed = str_replace(array('*'), array('([0-9]+)'), $allowed);
}
// for each value match against allowed values
foreach($values['fruit'] as $fruit)
{
$i = 0;
$status = false;
while($i < $matchCount && $status == false)
{
$result = preg_match('/' . $match['allowed'][$i] . '/', $fruit);
if ($result)
{
$status = true;
$passed++;
}
$i++;
}
}
// check all passed validation
if($passed === $valueCount)
{
echo 'hurray!';
}
else
{
echo 'fail';
}
I feel like I might be missing out on a PHP function that would do a better job than a while loop within a foreach loop. Or am I wrong?
Update: Sorry I forgot to mention, numbers may occur more than 1 place within the values, but there will only ever be 1 wildcard. I've updated the arrays to represent this.
If you don't want one loop inside another, you can combine your $match patterns into a single regex.
This gives the same functionality with a lot less code, and is arguably more efficient than your current solution:
// user submitted values
$values = array(
'fruit' => array(
'apple8756apple',
'banana234banana',
'apple4apple',
'kiwi51kiwi'
)
);
$match = array(
'allowed' => array(
'apple*apple',
'banana234banana',
'kiwi*kiwi'
)
);
$allowed = '('.implode(')|(',$match['allowed']).')';
$allowed = str_replace(array('*'), array('[0-9]+'), $allowed);
$matched = array(); // make sure print_r() has an array even when nothing matches
foreach($values['fruit'] as $fruit){
if(preg_match('#'.$allowed.'#',$fruit))
$matched[] = $fruit;
}
print_r($matched);
See here: http://codepad.viper-7.com/8fpThQ
Try replacing /\d+/ in the first array with '*', then do array_diff() between the 2 arrays
Edit: after clarification, here's a more refined approach:
<?php
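$allowed = array_map(function ($pattern) {
return '/' . str_replace('*', '\d+', $pattern) . '/'; // preg_replace needs delimited patterns
}, $match['allowed']);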
$passed = 0;
foreach ($values['fruit'] as $fruit) {
$count = 0;
preg_replace($allowed, "", $fruit, -1, $count); //preg_replace accepts an array as 1st argument and stores the replaces done on $count;
if ($count) $passed++;
}
if ($passed == sizeof($values['fruit'])) {
echo 'hurray!';
} else {
echo 'fail';
}
?>
The solution above does not remove the nested loop; it merely lets PHP run the inner loop, which may be faster (you should benchmark it).
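One caveat applies to all the variants above: without anchors, a pattern such as kiwi[0-9]+kiwi also accepts values with extra text before or after the match. The following sketch assumes that whole-value matching is what is wanted. It builds one anchored alternation and stops at the first value that fails. The allAllowed() helper name is hypothetical, and preg_quote() is added so that any other regex metacharacters in the allowed values are treated literally.

```php
<?php
// Hypothetical helper: true only if every value fully matches one allowed pattern.
function allAllowed(array $values, array $allowed): bool
{
    // Escape each allowed value, then turn the escaped wildcard back into "one or more digits".
    $parts = array_map(function ($a) {
        return str_replace('\*', '[0-9]+', preg_quote($a, '/'));
    }, $allowed);
    $regex = '/^(?:' . implode('|', $parts) . ')$/';

    foreach ($values as $value) {
        if (!preg_match($regex, $value)) {
            return false; // stop at the first value that is not allowed
        }
    }
    return true;
}

$allowed = ['apple*apple333', 'banana234banana', 'kiwi*kiwi'];
var_dump(allAllowed(['apple8756apple333', 'kiwi435kiwi'], $allowed)); // bool(true)
var_dump(allAllowed(['apple8756apple333', 'kiwiXkiwi'], $allowed));   // bool(false)
```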

Filter a set of bad words out of a PHP array

I have a PHP array of about 20,000 names, I need to filter through it and remove any name that has the word job, freelance, or project in the name.
Below is what I have started so far, it will cycle through the array and add the cleaned item to build a new clean array. I need help matching the "bad" words though. Please help if you can
$data1 = array('Phillyfreelance' , 'PhillyWebJobs', 'web2project', 'cleanname');
// freelance
// job
// project
$cleanArray = array();
foreach ($data1 as $name) {
# if a term is matched, we remove it from our array
if(preg_match('~\b(freelance|job|project)\b~i',$name)){
echo 'word removed';
}else{
$cleanArray[] = $name;
}
}
Right now it only matches whole words: if "freelance" is a name in the array, that item is removed, but something like ImaFreelancer is not. I need to remove any name that contains one of these words anywhere.
A regular expression is not really necessary here; it would likely be faster to use a few stripos calls. (Performance matters at this level because the search runs for each of the 20,000 names.)
With array_filter, which only keeps elements in the array for which the callback returns true:
$data1 = array_filter($data1, function($el) {
return stripos($el, 'job') === FALSE
&& stripos($el, 'freelance') === FALSE
&& stripos($el, 'project') === FALSE;
});
Here's a more extensible / maintainable version, where the list of bad words can be loaded from an array rather than having to be explicitly denoted in the code:
$data1 = array_filter($data1, function($el) {
$bad_words = array('job', 'freelance', 'project');
$word_okay = true;
foreach ( $bad_words as $bad_word ) {
if ( stripos($el, $bad_word) !== FALSE ) {
$word_okay = false;
break;
}
}
return $word_okay;
});
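One detail worth knowing when using array_filter this way: it preserves the original keys, so the filtered array can have gaps in its numbering. The sketch below uses made-up sample data and passes the word list in with use() so that it can come from anywhere. It then reindexes the result with array_values().

```php
<?php
// Sample data for illustration only.
$badWords = ['job', 'freelance', 'project'];
$names = ['Phillyfreelance', 'PhillyWebJobs', 'web2project', 'cleanname', 'another'];

$clean = array_filter($names, function ($name) use ($badWords) {
    foreach ($badWords as $bad) {
        if (stripos($name, $bad) !== false) {
            return false; // drop names containing a bad word anywhere
        }
    }
    return true;
});

// array_filter() kept the original keys (3 and 4 here); reindex from 0.
$clean = array_values($clean);
print_r($clean);
```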
I'd be inclined to use the array_filter function and change the regex to not match on word boundaries
$data1 = array('Phillyfreelance' , 'PhillyWebJobs', 'web2project', 'cleanname');
$cleanArray = array_filter($data1, function($w) {
return !preg_match('~(freelance|project|job)~i', $w);
});
Use of the preg_match() function and some regular expressions should do the trick; this is what I came up with and it worked fine on my end:
<?php
$data1=array('JoomlaFreelance','PhillyWebJobs','web2project','cleanname');
$cleanArray=array();
$badWords='/(job|freelance|project)/i';
foreach($data1 as $name) {
if(!preg_match($badWords,$name)) {
$cleanArray[]=$name;
}
}
echo(implode(',',$cleanArray));
?>
Which returned:
cleanname
Personally, I would do something like this:
$badWords = ['job', 'freelance', 'project'];
$names = ['JoomlaFreelance', 'PhillyWebJobs', 'web2project', 'cleanname'];
// Escape characters with special meaning in regular expressions.
$quotedBadWords = array_map(function($word) {
return preg_quote($word, '/');
}, $badWords);
// Create the regular expression.
$badWordsRegex = implode('|', $quotedBadWords);
// Filter out any names that match the bad words.
$cleanNames = array_filter($names, function($name) use ($badWordsRegex) {
return preg_match('/' . $badWordsRegex . '/i', $name) === 0; // preg_match returns 0 when nothing matches
});
This should be what you want:
if (!preg_match('/(freelance|job|project)/i', $name)) {
$cleanArray[] = $name;
}
