I have a system where users can input customer information. When the information is entered I do a few things to clean it, such as changing the case and removing special characters. The one issue I have is that limited companies use the following syntax:
company name SL
As I currently change everything to lower case and then use ucwords I end up with Sl. I am looking for the best way to overcome this and regex sprang to mind.
Unfortunately regex is not my strong point, and I was hoping someone could point me in the right direction. What I am hoping to do is find a string that contains the two letters S and L, in this order. I need to be able to find the string regardless of punctuation (i.e. S.L., S.L) and also regardless of case. If the string is found, it should be replaced with SL.
Within this I would also need to know the characters it found to use string replace to change it.
If you imagine my current method using string replace is growing quite big:
return str_replace(array(',','sl.','s.l.','s.l','sl ',' sl','SL.','Sl.','S.L.','S.l.','S.l','S.L','SL'),array("","SL","SL","SL","SL","SL","SL","SL","SL","SL","SL","SL","SL"),self::properCase($name));
The other issue with the above is that if someone enters, say, "Bill Slade sl", then without a regex that matches only those two letters I could never upper-case just them. I need to ensure there is nothing on either side.
Any help or pointers would be greatly appreciated.
Use this RegEx:
/\bs\.?l\b\.?/i
RegEx Demo and Explanation
Using this RegEx with PHP:
$regex_pattern = "/\bs\.?l\b\.?/i";
$string = "company name S.l\ncompany name Sl.\ncompany name S.l.\ncompany name Sl\ncompany name s.l.\ncompany name sl\ncompany name s.L";
$replacement = "SL";
$result = preg_replace($regex_pattern, $replacement, $string);
echo $result;
Try this working code on http://writecodeonline.com/php/ so you can see the results quickly. :)
Hope it helps.
Read up: preg_replace | PHP manual
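To show how a pattern of this kind could slot into the cleaning step from the question, here is a minimal sketch. The function name `normalizeSl` is hypothetical, and it assumes proper-casing is done with `strtolower` plus `ucwords`, as the question describes. The word boundaries on both sides are what keep a name like "Slade" untouched.

```php
<?php
// Hypothetical helper: proper-case a name, then normalize "SL" variants.
function normalizeSl(string $name): string
{
    // Strip commas and proper-case, as described in the question.
    $name = ucwords(strtolower(str_replace(',', '', $name)));

    // \b before "s" and after "l" means only a standalone s/l pair matches;
    // the dot between the letters and the trailing dot are both optional.
    return preg_replace('/\bs\.?l\b\.?/i', 'SL', $name);
}

echo normalizeSl('bill slade s.l.'), "\n";
echo normalizeSl('ACME, sl'), "\n";
```

The whole `str_replace` list collapses into one call, and new spellings only need a change to the pattern.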
You can use this regex: \bs[.]?l\b[.]? and substitute with SL.
Here is an example and sample working code on TutorialsPoint:
$re = "/\\bs[.]?l\\b[.]?/i";
$str = "company name Sl\ncompany name s.l.\ncompany name sl\ncompany name s.L";
$subst = "SL";
$result = preg_replace($re, $subst, $str);
$replacement= "SL";
$pattern = '/\bs[.,]?\s?l\b\.?/i';
$input = "
company name Sl
company name s.l.
company name sl
company name s.L
company name s. L
company name s l
company name S, L.
";
$result = preg_replace($pattern, $replacement, $input);
echo $result;
Open http://www.phpliveregex.com/p/aJi and click preg_replace to see the result.
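The question also asks how to know which characters were actually found. One possible way, sketched here with made-up sample input, is to run `preg_match_all` with the same kind of boundary-anchored pattern and inspect the matches before replacing anything.

```php
<?php
// Sample input is invented for illustration.
$input = "Alpha s.l.\nBeta SL\nBill Slade sl";

// Collect every variant the pattern would replace.
preg_match_all('/\bs\.?l\b\.?/i', $input, $matches);

// $matches[0] holds the full text of each match, in order of appearance.
print_r($matches[0]);
```

Note that "Slade" is not reported, because there is no word boundary after its "l".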
Related
I am using a regular expression to convert #user name to links.
For example, if a user enters #Alex Ferguson it should convert Alex Ferguson to a hyperlink.
Currently it converts only the first name to a hyperlink and excludes the last name. It matches only the word immediately after the #; if there is no space between the first name and last name it works fine.
Is there any way to convert both the first name and the last name to a hyperlink?
Here is my code:
function convert($msg){
$message = preg_replace(array('/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))/', '/(^|[^a-z0-9_])#([a-z0-9_]+)/i', '/(^|[^a-z0-9_])#([a-z0-9_]+)/i'), array('$1', '$1#$2', '$1#$2'), $msg);
return $message;
}
Thanks..
The general method for this would be:
$regex = '~(?i)#[a-z]+[ ][a-z]+~';
$replaced = preg_replace($regex,'$0',$string);
Notes
I'll leave it for you to fill in the blanks
One issue with names is the range of allowable characters. What about Julie O'Hara? M.C. Cocoa? etc.
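As a rough sketch of how the blanks might be filled in: the function name `linkMentions` and the `/user/...` URL scheme below are assumptions, not part of the original code. It allows letters, dots, apostrophes, and hyphens in each name part, and treats the second word as optional, so it will also swallow an ordinary word that follows a one-word name.

```php
<?php
// Hypothetical helper; the /user/<name> URL scheme is an assumption.
function linkMentions(string $msg): string
{
    // One name part = a letter followed by letters, dots, apostrophes or hyphens.
    // An optional second part, separated by a single space, is included too.
    $pattern = "~#(\\p{L}[\\p{L}.'-]*(?: \\p{L}[\\p{L}.'-]*)?)~u";

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            $name = $m[1];
            return '<a href="/user/' . rawurlencode($name) . '">'
                . htmlspecialchars($name, ENT_QUOTES) . '</a>';
        },
        $msg
    );
}

echo linkMentions('Hi #Alex Ferguson!'), "\n";
```

Using a callback instead of a plain replacement string makes it possible to URL-encode the name for the link and HTML-escape it for display separately.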
I'm currently looking to implement a search function on my website.
I have it working with 1 word/name, but I can't seem to figure out how to split and identify certain parts of the search string.
Example:
I have a user in my database with the name "Steve de Vette"
(In my country most names have words between the first and last name, but not always, and sometimes more than one, e.g. "Kees van der Berg".) In the database his name is of course split into multiple parts: "vNaam", "Tvoegsel" (meaning the "de" or "van der") and "aNaam".
This complicates things a bit for me, since I now have to split the search string, which on its own isn't a big deal. But I need to know how I can get the correct results every time.
So I guess it comes down to this: how can I split the name the way it should be split? Or maybe there's a way to strip these things out altogether, but for the life of me I can't seem to figure it out.
Any help would be greatly appreciated!
EDIT:
I have tried just exploding the name and searching with multiple OR_LIKE clauses. This works until I have no "tussenvoegsel" and one of the Like statements reads "OR anaam LIKE '%%'"
Split the string with explode and search for the first and last items.
$string1 = "Steve de Vette";
$string2 = "Kees van der Berg";
$ex1 = explode(" ", $string2);
$nr = count($ex1);
echo $ex1[0]; //firstname
echo ' ';
echo $ex1[$nr-1]; //lastname
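Building on the first/last idea, the words in between can be kept as the infix, and empty parts can be dropped before any LIKE clauses are built, which avoids the `LIKE '%%'` problem from the question. This is only a sketch: the function name `splitDutchName` is hypothetical, and the array keys simply reuse the column names mentioned in the question.

```php
<?php
// Hypothetical helper: first word, last word, and whatever sits in between.
function splitDutchName(string $full): array
{
    $parts = preg_split('/\s+/', trim($full), -1, PREG_SPLIT_NO_EMPTY);

    $first = array_shift($parts) ?? '';   // first word, or '' for empty input
    $last  = array_pop($parts) ?? '';     // last remaining word, or '' if none

    return [
        'vNaam'    => $first,
        'Tvoegsel' => implode(' ', $parts), // e.g. "van der", or '' when absent
        'aNaam'    => $last,
    ];
}

$name = splitDutchName('Kees van der Berg');

// Keep only the non-empty parts, so no "LIKE '%%'" clause is ever generated.
$searchable = array_filter($name, fn (string $v): bool => $v !== '');
print_r($searchable);
```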
Well, you can use the PHP string searching function.
$pos = strpos($string, $character);
You could use this to find the first space in the name. So if you take "Steve de Vette", you could first find Steve as the first name, then the rest of the string you could search again or keep the rest of it as the last name.
This is a snippet of code taken from my own site.
$fname = strstr($entry, " ", true);        // first name: all characters up to the first space
$len = strlen($fname) + 1;                 // skip over the space to the last name
$entrylen = strlen($entry);                // length of the search string used
$sname = substr($entry, $len, $entrylen);  // the rest of the string (last name)
Hope you find this helpful
What I do is strip out spaces altogether. I store the names in my database with their spaces as normal, but when searching I use the replace function to strip the spaces out. I then strip the spaces from the search field as well, and use LIKE with the wildcard on the right-hand side. I try to make the search as simple as possible. Searching with one word seems to work better overall, so forcing one word is what works for me.
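A minimal sketch of that idea follows. The function name `buildLikeTerm`, the `users` table, and the MySQL functions in the commented query are all assumptions; the column names are the ones mentioned in the question.

```php
<?php
// Hypothetical helper: collapse the search text to one "word" and
// turn it into a right-anchored LIKE term.
function buildLikeTerm(string $search): string
{
    // Remove all whitespace from the user's input.
    $collapsed = preg_replace('/\s+/', '', $search);

    // Escape LIKE wildcards so user input cannot act as a pattern,
    // then add the wildcard on the right-hand side.
    return addcslashes($collapsed, '%_\\') . '%';
}

$term = buildLikeTerm('Steve de  Vette');
echo $term, "\n";

// Assumed MySQL query, to be run as a prepared statement with $term bound to "?":
// SELECT * FROM users
// WHERE REPLACE(CONCAT_WS('', vNaam, Tvoegsel, aNaam), ' ', '') LIKE ?
```

`CONCAT_WS` is used in the assumed query because it skips NULL values, so a missing infix does not turn the whole expression into NULL.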
Currently I am developing a web application that fetches a Twitter stream, and I am trying to build natural language processing for it on my own.
Since my data comes from Twitter (limited to 140 characters), many words are shortened or, in this case, have their spaces omitted.
For example:
"Hi, my name is Bob. I m 19yo and 170cm tall"
Should be tokenized to:
- hi
- my
- name
- bob
- i
- 19
- yo
- 170
- cm
- tall
Notice that 19 and yo in 19yo have no space between them. I use it mostly for extracting numbers with their units.
Simply put, what I need is a way to 'explode' each token that contains a number into chunks of digits and chunks of letters, without any delimiter.
'123abc' will be ['123', 'abc']
'abc123' will be ['abc', '123']
'abc123xyz' will be ['abc', '123', 'xyz']
and so on.
What is the best way to achieve it in PHP?
I found something close to it, but it's in C# and specifically for day/month splitting: How do I split a string in C# based on letters and numbers
You can use preg_split
$string = "Hi, my name is Bob. I m 19yo and 170cm tall";
$parts = preg_split("/(,?\s+)|((?<=[a-z])(?=\d))|((?<=\d)(?=[a-z]))/i", $string);
var_dump ($parts);
When matching against the digit-letter boundary, the regular expression match must be zero-width. The characters themselves must not be included in the match. For this the zero-width lookarounds are useful.
http://codepad.org/i4Y6r6VS
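For a single token, an alternative to zero-width splitting is to collect the runs directly with `preg_match_all`. This is a sketch only; the function name `splitAlnum` is hypothetical, and it assumes tokens contain just ASCII letters and digits (anything else is silently dropped).

```php
<?php
// Hypothetical helper: split one token into runs of digits and runs of letters.
function splitAlnum(string $token): array
{
    // Each match is either a run of digits or a run of ASCII letters.
    preg_match_all('/\d+|[a-z]+/i', $token, $m);
    return $m[0];
}

print_r(splitAlnum('123abc'));
print_r(splitAlnum('abc123xyz'));
```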
How about this:
Extract the numbers from the string using a regex and store them in an array, then replace each number in the string with a special placeholder that holds its position. After parsing the string, which now consists only of placeholders and normal characters, put the numbers from the array back into their reserved places.
It's just an idea, but IMHO it might work for you.
EDIT:
Try running this short code; hopefully you will see my point in the output. (This code doesn't work on codepad, I don't know why.)
<?php
$str = "Hi, my name is Bob. I m 19yo and 170cm tall";
preg_match_all("#\d+#", $str, $matches);
$str = preg_replace("!\d+!", "#SPEC#", $str);
print_r($matches[0]);
print $str;
This is a homework assignment and my first experience using RegEx. I am starting to grasp the syntax and symbols used and can do some simple pattern matching/manipulation, but can't quite foresee how to achieve some of the goals of this assignment.
I have been given a text file that is formatted like this:
Steve Blenheim:238-923-7366:95 Latham Lane, Easton, PA 83755:11/12/56:20300
Betty Boop:245-836-8357:635 Cutesy Lane, Hollywood, CA 91464:6/23/23:14500
Igor Chevsky:385-375-8395:3567 Populus Place, Caldwell, NJ 23875:6/18/68:23400
Norma Corder:397-857-2735:74 Pine Street, Dearborn, MI 23874:3/28/45:245700
There are about 50 lines of names and corresponding info, each entry is on a new line and each 'field' is separated by a colon. Mostly I need to find specific things from the file and print them on a webpage but I don't quite understand.
Here is one problem I solved:
$myFile = "datebook.txt";
$data = file($myFile);//I have used this to place all data in an array, but it may be necessary to place the data into a string?
//1) Print all lines containing the pattern Street (case insensitive).
$pattern = "/street/i";
$linesFound = preg_grep($pattern, $data);
echo "<pre>", print_r($linesFound, true), "</pre>";
Here are some I have not solved, with specific questions about them:
2) Print the first and last names in which the first name starts with a letter ‘B’.
How do I only search for first names and not last names, city names, etc?
How do I print the full name and only the full name?
5) Print Lori Gortz’s name and address.
I understand how to find the pattern 'Lori Gortz' but how do I return her address as well?
11) Print lines that end in exactly five digits.
12) Print the file with the first and last names reversed.
14) Give everyone a $250.00 raise.
Don't know how to do any of these. I assume the last number for each entry is their salary.
Any help is appreciated. Please respond with an explanation of the code as well, thank you.
Check the RegEx quick reference; I think you'll find most of what you need for your tasks there. For example, Lori's address would be the string after the number after the second colon and before the second comma (in her line, of course).
The best way to do all the tasks would be to go over each line and build an array with all the elements. That way you could easily replace names, increase salaries, check whether a line ends with five digits, etc.
You can also try this online tester. Good luck.
Edit:
Little help for a start:
^[A-Za-z ]* this gets full names
^[A-Za-z]* this gets first names
etc...
Edit2:
See what this code does:
$line = "Betty Boop:245-836-8357:635 Cutesy Lane, Hollywood, CA 91464:6/23/23:14500";
$regex = "/\s|:/";
$result = preg_split($regex, $line);
:)
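The "array of elements per line" approach could look roughly like the sketch below. The field names are guesses based on the sample data (the question itself only assumes that the last number is the salary), and two sample lines stand in for the file.

```php
<?php
// Two sample lines copied from the question; a real script would read them with
// file('datebook.txt', FILE_IGNORE_NEW_LINES) instead.
$lines = [
    'Steve Blenheim:238-923-7366:95 Latham Lane, Easton, PA 83755:11/12/56:20300',
    'Betty Boop:245-836-8357:635 Cutesy Lane, Hollywood, CA 91464:6/23/23:14500',
];

$records = [];
foreach ($lines as $line) {
    // Field names are assumptions; the file only shows five colon-separated fields.
    [$name, $phone, $address, $date, $salary] = explode(':', $line);
    $records[] = [
        'name'    => $name,
        'phone'   => $phone,
        'address' => $address,
        'date'    => $date,
        'salary'  => (int) $salary,
    ];
}

foreach ($records as $r) {
    // Task 12: swap first and last name (assumes exactly two words).
    $reversed = preg_replace('/^(\S+)\s+(\S+)$/', '$2 $1', $r['name']);
    // Task 14: add the 250 raise to the assumed salary field.
    $raised = $r['salary'] + 250;
    echo "$reversed | {$r['address']} | $raised\n";
}
```

Splitting on the colon alone keeps each address in one piece, whereas splitting on whitespace as well would break it into separate words.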
I don't want to do all of them, but here's some hints.. For question 2:
^[A-Z]* B.*$
^ anchors the match at the start of the line.
[A-Z]* means any number of characters from A-Z
Next we match a space
Next we match a B
The .* means any number of other characters.
Lastly, we match with an end of line using $
This can definitely be improved and made more flexible, but I'll let you do that..
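As one possible way to take these hints further, the sketch below anchors a "B" at the very start of the line, so only the first name is tested, and captures everything up to the first colon as the full name. It also shows a pattern for lines ending in exactly five digits. The sample lines are copied from the question; the patterns are illustrative assumptions, not the assignment's expected answers.

```php
<?php
$lines = [
    'Steve Blenheim:238-923-7366:95 Latham Lane, Easton, PA 83755:11/12/56:20300',
    'Betty Boop:245-836-8357:635 Cutesy Lane, Hollywood, CA 91464:6/23/23:14500',
    'Norma Corder:397-857-2735:74 Pine Street, Dearborn, MI 23874:3/28/45:245700',
];

// Task 2: the first name is the first word of the line, so "^B" tests only it.
// The capture group grabs "First Last" and stops at the first colon.
$names = [];
foreach ($lines as $line) {
    if (preg_match('/^(B\w*\s+\w+):/', $line, $m)) {
        $names[] = $m[1];
    }
}
print_r($names);

// Task 11: five digits at the end of the line, not preceded by another digit.
$fiveDigits = preg_grep('/(?<!\d)\d{5}$/', $lines);
print_r($fiveDigits);
```

When the lines come from `file()`, passing `FILE_IGNORE_NEW_LINES` keeps trailing newlines out of the way of the `$` anchor.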
I'm looking for the best reliable way to return the first and last name of a person given the full name, so far the best I could think of is the following regular expression:
$name = preg_replace('~\b(\p{L}+)\b.+\b(\p{L}+)\b~i', '$1 $2', $name);
The expected output should be something like this:
William -> William // Regex Fails
William Henry -> William Henry
William Henry Gates -> William Gates
I also want it to support accents, for instance "João".
EDIT: I understand that some names will not be properly identified, but this isn't a problem for me, since this is going to be used on a local site where the last word is the last name (might not be the whole surname though) but this isn't a problem since all I want is a quick way to say "Dear FIRST_NAME LAST_NAME"... So all this discussion, while totally valid, is useless to me.
Can someone help me with this?
This might not be what you want to hear, but I don't think this problem is suited to a regular expression since names are not regular. I don't think they are even context-sensitive or context-free. If anything, they are unrestricted (I would have to sit down and think that through more than I did before I say that for sure, though) and no regular expression engine can parse an unrestricted grammar.
Instead of a regex you might find it easier to do something like:
$parts = explode(" ", $name);
$first = $parts[0];
$last = "";
if (count($parts) > 1) {
$last = $parts[count($parts) - 1];
}
You might want to replace multiple consecutive bits of whitespace with a single space first, so you don't get empty bits, and get rid of trailing/leading whitespace:
$name = preg_replace('/\s+/', ' ', trim($name));
As is, you're requiring a last name -- which, of course, your first example doesn't have.
Use clustered grouping, (?:...), and 0-or-1 count, ?, for the middle and last names as a whole to allow them to be optional:
'~\b(\p{L}+)\b (?: .+\b(\p{L}+)\b )?~ix' # x for spacing
This should allow the first name to be captured whether middle/last names are given or not.
$name = preg_replace('~\b(\p{L}+)\b(?:.+\b(\p{L}+)\b)?~i', '$1 $2', $name);
Depending on how clean your data is, I think you are going to have a tough time finding a single regex that does what you want. What different formats do you expect the names to be in? I've had to write similar code and there can be a lot of variations:
- first last
- last, first
- first middle last
- last, first middle
And then you have things like suffixes (Junior, senior, III, etc.) and prefixes ( Mr., Mrs, etc), combined names (e.g. John and Mary Smith). As some others have already mentioned you also have to deal with multi-part last names (e.g. Victor de la Hoya) as well.
I found I had to deal with all of those possibilities before I could reliably pull out the first and last names.
If you're defining first and last name as the text before the first space and after the last space, then just split the string on spaces and grab the first and last elements of the array.
However, depending on the context/scope of what you're doing, you may need to re-evaluate things - not all names around the world will meet this pattern.
I think your best option is to simply treat everything after the first name as the surname i.e.
William Henry Gates
Forename: William
Surname: Henry Gates
It's the safest mechanism, as not everyone will enter their middle name anyway. You can't simply extract William, ignore Henry, and extract Gates, since for all you know Henry is part of the surname.
Here is a simple non-regex way:
$name=explode(" ",$name);
$first_name=reset($name);
$last_name=end($name);
$result=$first_name.' '.$last_name;
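The non-regex approaches above can be combined with whitespace cleanup into one small function that also handles the single-name case from the question. This is a sketch under the asker's stated assumption that the last word is the last name; the function name `firstAndLast` is hypothetical, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: return "First Last", or just "First" for a single name.
function firstAndLast(string $name): string
{
    // Split on any run of whitespace and drop empty pieces,
    // which also handles leading and trailing spaces.
    $parts = preg_split('/\s+/u', $name, -1, PREG_SPLIT_NO_EMPTY);

    if (count($parts) === 0) {
        return '';
    }

    $first = $parts[0];
    $last  = $parts[count($parts) - 1];

    // With a single word there is no separate last name to append.
    return count($parts) === 1 ? $first : "$first $last";
}

echo firstAndLast('William'), "\n";
echo firstAndLast('William Henry Gates'), "\n";
echo firstAndLast('  João   Pedro Silva '), "\n";
```

Because it splits on whitespace rather than matching letters, accented names pass through unchanged.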