Matching a substring (an apostrophe) in a given word using regex - php

I have a server application which looks up where the stress is in Russian words. The end user writes a word жажда. The server downloads a page from another server which contains the stresses indicated with apostrophes for each case/declension like this жа'жда. I need to find that word in the downloaded page.
In Russian the stress is always written after a vowel. I've been using so far a regex that is a grouping of all possible combinations (жа'жда|жажда'). Is there a more elegant solution using just a regex pattern instead of making a PHP script which creates all these combinations?
EDIT:
I have a word жажда
The downloaded page contains the string жа'жда. (notice the
apostrophe, I do not before-hand know where the apostrophe in the
word is)
I want to match the word with apostrophe (жа'жда).
P.S.: So far I have a PHP script creating the string (жа'жда|жажда') used in regex (apostrophe is only after vowels) which matches it. My goal is to get rid of this script and use just regex in case it's possible.

If I understand your question,
have these options (d'isorder|di'sorder|dis'order|diso'rder|disor'der|disord'er|disorde'r|disorder‌​') and one of these is in the downloaded page and I need to find out which one it is
this may suit your needs:
<pre>
<?php
$s = "d'isorder|di'sorder|dis'order|diso'rder|disor'der|disord'er|disorde'r|disorder'|disorde'";
$s = explode("|",$s);
print_r($s);
$matches = preg_grep("#[aeiou]'#", $s);
print_r($matches);
running example: https://eval.in/207282

Uhm... Is this ok with you?
<?php
function find_stresses($word, $haystack) {
$pattern = preg_replace('/[aeiou]/', '\0\'?', $word);
$pattern = "/\b$pattern\b/";
// word = 'disorder', pattern = "diso'?rde'?r"
preg_match_all($pattern, $haystack, $matches);
return $matches[0];
}
$hay = "something diso'rder somethingelse";
find_stresses('disorder', $hay);
// => array(diso'rder)
You didn't specify if there can be more than one match, but if not, you could use preg_match instead of preg_match_all (faster). For example, in Italian language we have àncora and ancòra :P
Obviously if you use preg_match, the result would be a string instead of an array.

Based, on your code, and the requirements that no function is called and disorder is excluded. I think this is what you want. I have added a test vector.
<pre>
<?php
// test code
$downloadedPage = "
there is some disorde'r
there is some disord'er in the example
there is some di'sorder in the example
there also' is some order in the example
there is some disorder in the example
there is some dso'rder in the example
";
$word = 'disorder';
preg_match_all("#".preg_replace("#[aeiou]#", "$0'?", $word)."#iu"
, $downloadedPage
, $result
);
print_r($result);
$result = preg_grep("#'#"
, $result[0]
);
print_r($result);
// the code you need
$word = 'also';
preg_match("#".preg_replace("#[aeiou]#", "$0'?", $word)."#iu"
, $downloadedPage
, $result
);
print_r($result);
$result = preg_grep("#'#"
, $result
);
print_r($result);
Working demo: https://eval.in/207312

Related

How to replace matching characters in two sets of strings with bolded characters?

currently working on a following function:
public function boldText($searchSuggestions)
{
$search = $this->getRequestParameter('search');
$pattern = "/".$search."/u";
$searchSuggestions = preg_replace($pattern, '<b>'.$search.'</b>', $searchSuggestions);
echo $searchSuggestions;
}
Let's say $searchSuggestions = hello
While the user is typing in the search box, which in this case the variable $search contains this input, a dropdown menu of all possible result suggestions are displayed. If a user types 'hello', then the search results like 'helloworld' or 'hello2' would pop up and the inputted word, int this case 'hello' would be bold in all outputted search results. So far it is working fine, however, big Characters are being replaced with small Characters and vice versa in the outputted search results. I have a feeling that the underlying problem might be in this function, however I am not entirely sure. If anyone has any suggestions or tips on where to look, it would be great!
If I should give out more info please do let me know, and I will edit the question immediately.
Thankyou!
Example output currently -
User types in search bar - 'hello'
result shown should be - 'Hello'
result actually being shown - 'hello'
P.S The results are accessed from an sql query. If a user types, than a query that gets data related to the words inputted is shown. For instance - 'SELECT * FROM test WHERE example LIKE '%hello%'
In database one can find the word Hello. Note the H has a big character.
I tried this following code
public function boldText($searchSuggestions)
{
$search = $this->getRequestParameter('search');
$pattern = "/".$search."/u";
$searchSuggestions = preg_replace($pattern, '<b>'.$search.'</b>', $searchSuggestions);
echo $searchSuggestions;
}
Brief Explanation
$searchSuggestions
Hello
$search(being typed by the user)
h
The output should be
<b>H</b>ello
The output I am currently getting is
<b>h</b>ello
I tried to look further on how to properly solve this and it was a bit simpler than I expected. Did not have to use preg_replace for this. The following code did the trick for me:
public function boldText($searchSuggestions)
{
$search = $this->getRequestParameter('search');
$lastPosition = strlen($search);
$firstPosition = stripos($searchSuggestions, $search);
$replacement = substr($searchSuggestions, $firstPosition, $lastPosition);
$searchSuggestions = substr_replace($searchSuggestions, '<b><u>'.$replacement.'</u></b>', $firstPosition, $lastPosition);
echo $searchSuggestions;
}
$lastPosition - we find how long the search input is, therfore getting the last position
$firstPosition - compare the two strings by finding a match(case insensitive), therefore finding the first position
$replacement - we remove what matches between search and searchSuggestion from the searchSuggestion string, with the help of the firstPosition and lastPosition variables.
At the end the searchSuggestion variable is replaced accordingly. Hope I was clear enough if not please let me know. (Answered my own question lol!)
You need to capture the found text you then can use that in the replacement bit. Your current implementation uses the input which may or may not match the case of the original string. So you should:
use the i modifier so your search is case insensitive
use word boundaries so partial matching doesn't occur
use capture group to capture the match
use preg_quote so any special regex characters don't affect regex operations
public function boldText($searchSuggestions)
{
$search = $this->getRequestParameter('search');
$pattern = "/\b(". preg_quote($search) . ")\b/iu";
$searchSuggestions = preg_replace($pattern, '<b>$1</b>', $searchSuggestions);
echo $searchSuggestions;
}
If partial word matching is intended you can remove word boundaries:
public function boldText($searchSuggestions)
{
$search = $this->getRequestParameter('search');
$pattern = "/(". preg_quote($search) . ")/iu";
$searchSuggestions = preg_replace($pattern, '<b>$1</b>', $searchSuggestions);
echo $searchSuggestions;
}
consider https://en.wikipedia.org/wiki/Scunthorpe_problem#Origin_and_history also though.

PHP trouble with preg_match

I thought I had this working; however after further evaluation it seems it's not working as I would have hoped it was.
I have a query pulling back a string. The string is a comma separated list just as you see here:
(1,145,154,155,158,304)
Nothing has been added or removed.
I have a function that I thought I could use preg_match to determine if the user's id was contained within the string. However, it appears that my code is looking for any part.
preg_match('/'.$_SESSION['MyUserID'].'/',$datafs['OptFilter_1']))
using the same it would look like such
preg_match('/1/',(1,145,154,155,158,304)) I would think. After testing if my user id is 4 the current code returns true and it shouldn't. What am I doing wrong? As you can see the id length can change.
It's better to have all your IDs in an array then checking if a desired ID is existed:
<?php
$str = "(1,145,154,155,158,304)";
$str = str_replace(array("(", ")"), "", $str);
$arr = explode(',', $str);
if(in_array($_SESSION['MyUserID'], $arr))
{
// ID existed
}
As your string - In dealing with Regular Expressions, however it's not recommended here, below regex will match your ID if it's there:
preg_match("#[,(]$ID[,)]#", $str)
Explanations:
[,(] # a comma , or opening-parenthesis ( character
$ID # your ID
[,)] # a comma , or closing-parenthesis ) character

Trying to grab a string with one substring static, and one dynamic

I'm trying to find the best way to grab the dynamic substring, but replace all of the content after.
This is what I'm trying to achieve:
{table_telecommunications}
The substring {table_ is always the same, the only that varies is telecommunications}.
I want to grab the word telecommunications so I can do a search on a MySQL table and then replace {table_telecommunications} with the content returned.
I thought of making a strpos and then explode and so on.
But I guess it would be easier with regex, but I have no skills on creating regex.
Could you possibly give me the best way to do this?
Edit: I'm saying possibly regex is the best way because I need to find strings that are in this format, but the second part is variable, just like {table_*}
Use Regex.
<?php
$string = "{table_telecommunications} blabla blabla {table_block}";
preg_match_all("/\{table_(.+?)\}/is", $string, $matches);
$substrings = $matches[1];
print_r($substrings);
?>
if (preg_match('#table_([^}]+)}#', '{table_telecommunications}', $matches)){
echo $matches[1];
}
That's a regex solution. You can do the same with explode:
$parts = explode('table_', '{table_telecommunications}');
echo substr($parts[1], 0, -1);
$input = '{table_telecommunications}';
$table_name = trim(implode('_', array_shift(explode($input, '_'))), '}');
Should be fast, no regex required.

PHP get specific string from url before and after unknown characters

I know it may sound as a common question but I have difficulty understanding this process.
So I have this string:
http://domain.com/campaign/tgadv?redirect
And I need to get only the word "tgadv". But I don't know that the word is "tgadv", it could be whatever.
Also the url itself may change and become:
http://domain.com/campaign/tgadv
or
http://domain.com/campaign/tgadv/
So what I need is to create a function that will get whatever word is after campaign and before any other particular character. That's the logic..
The only certain thing is that the word will come after the word campaign/ and that any other character that will be after the word we are searching is a special one ( i.e. / or ? )
I tried understanding preg_match but really cannot get any good result from it..
Any help would be highly appreciated!
I would not use a regex for that. I would use parse_url and basename:
$bits = parse_url('http://domain.com/campaign/tgadv?redirect');
$filename = basename($bits['path']);
echo $filename;
However, if want a regex solution, use something like this:
$pattern = '~(.*)/(.*)(\?.*)~';
preg_match($pattern, 'http://domain.com/campaign/tgadv?redirect', $matches);
$filename = $matches[2];
echo $filename;
Actually, preg_match sounds like the perfect solution to this problem. I assume you are having problems with the regex?
Try something like this:
<?php
$url = "http://domain.com/campaign/tgadv/";
$pattern = "#campaign/([^/\?]+)#";
preg_match($pattern, $url, $matches);
// $matches[1] will contain tgadv.
$path = "http://domain.com/campaign/tgadv?redirect";
$url_parts = parse_url($path);
$tgadv = strrchr($url_parts['path'], '/');
You don't really need a regex to accomplish this. You can do it using stripos() and substr().
For example:
$str = '....Your string...';
$offset = stripos($str, 'campaign/');
if ( $offset === false ){
//error, end of h4 tag wasn't found
}
$offset += strlen('campaign/');
$newStr = substr($str, $offset);
At this point $newStr will have all the text after 'campaign/'.
You then just need to use a similar process to find the special character position and use substr() to strip the string you want out.
You can also just use the good old string functions in this case, no need to involve regexps.
First find the string /campaign/, then take the substring with everything after it (tgadv/asd/whatever/?redirect), then find the next / or ? after the start of the string, and everything in between will be what you need (tgadv).

Get data out of string

I am going to parse a log file and I wonder how I can convert such a string:
[5189192e][game]: kill killer='0:Tee' victim='1:nameless tee' weapon=5 special=0
into some kind of array:
$log['5189192e']['game']['killer'] = '0:Tee';
$log['5189192e']['game']['victim'] = '1:nameless tee';
$log['5189192e']['game']['weapon'] = '5';
$log['5189192e']['game']['special'] = '0';
The best way is to use function preg_match_all() and regular expressions.
For example to get 5189192e you need to use expression
/[0-9]{7}e/
This says that the first 7 characters are digits last character is e you can change it to fits any letter
/[0-9]{7}[a-z]+/
it is almost the same but fits every letter in the end
more advanced example with subpatterns and whole details
<?php
$matches = array();
preg_match_all('\[[0-9]{7}e\]\[game]: kill killer=\'([0-9]+):([a-zA-z]+)\' victim=\'([0-9]+):([a-zA-Z ]+)\' weapon=([0-9]+) special=([0-9])+\', $str, $matches);
print_r($matches);
?>
$str is string to be parsed
$matches contains the whole data you needed to be pared like killer id, weapon, name etc.
Using the function preg_match_all() and a regex you will be able to generate an array, which you then just have to organize into your multi-dimensional array:
here's the code:
$log_string = "[5189192e][game]: kill killer='0:Tee' victim='1:nameless tee' weapon=5 special=0";
preg_match_all("/^\[([0-9a-z]*)\]\[([a-z]*)\]: kill (.*)='(.*)' (.*)='(.*)' (.*)=([0-9]*) (.*)=([0-9]*)$/", $log_string, $result);
$log[$result[1][0]][$result[2][0]][$result[3][0]] = $result[4][0];
$log[$result[1][0]][$result[2][0]][$result[5][0]] = $result[6][0];
$log[$result[1][0]][$result[2][0]][$result[7][0]] = $result[8][0];
$log[$result[1][0]][$result[2][0]][$result[9][0]] = $result[10][0];
// $log is your formatted array
You definitely need a regex. Here is the pertaining PHP function and here is a regex syntax reference.

Categories