I'm looking for a good way to validate a US address. I know there may be no proper way of doing this, so I'm going for the basic form: #, Street name, City, State, and Zip Code.
Any ideas would be appreciated. Thanks
Don't try. Somebody is likely to have a post office box, or an apartment number etc., and they will be really irate with you. Even a "normal" street name can have numbers, like 125th Street (and many others) in New York City. Even a suburb can have some numbered streets.
And city names can have spaces.
Ask the user to enter parts of the address in separate fields (Street name, City, State, and Zip Code) and use whatever validation appropriate for such a field. This is the general practice.
Alternatively, if you want the simplest regex that matches four strings separated by three commas, try this:
/^(.+),([^,]+),([^,]+),([^,]+)$/
If it matches, you can use additional pattern matching to check the components of the address. There is no way to check the validity of the street address with a pattern, but you might be able to test postal codes and state codes.
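A minimal Python sketch of that two-step idea: match the four-part pattern first, then check the state and ZIP components. The names `check_address` and `STATE_CODES` are made up for illustration, and the state list is deliberately partial; a real check would need all state and territory codes.

```python
import re

# The four-part pattern from the answer above.
ADDRESS_RE = re.compile(r"^(.+),([^,]+),([^,]+),([^,]+)$")
# 5-digit ZIP with an optional +4 suffix.
ZIP_RE = re.compile(r"^\d{5}(?:-\d{4})?$")
# Hypothetical, partial list of state codes (illustration only).
STATE_CODES = {"CA", "IL", "NY", "TX", "WA"}


def check_address(line):
    """Split a one-line address and check the parts that can be checked."""
    match = ADDRESS_RE.match(line)
    if match is None:
        return None  # not four comma-separated parts
    street, city, state, zip_code = (part.strip() for part in match.groups())
    return {
        "street": street,  # not validated: any non-empty text is accepted
        "city": city,
        "state": state,
        "zip": zip_code,
        "state_ok": state.upper() in STATE_CODES,
        "zip_ok": ZIP_RE.match(zip_code) is not None,
    }


result = check_address("742 Evergreen Terrace, Springfield, IL, 62704")
```

Because the first group is greedy, extra commas (an apartment or suite, say) end up in the street part rather than breaking the match.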
There are way too many variations in address to be able to do this using regular expressions. You're better off finding a web service that can validate addresses. USPS has one - you'll have to request permission to use it.
I agree with salman: have the user enter the data in different fields (one for zip, one for state, one for city, and one for the #/street name). Use a different regex for each field. For the street #/name, the best expression I came up with was
/^[0-9]{1,7} [a-zA-Z0-9]{2,35}\a*/
This is not a bulletproof solution but the assumption is that an address begins with a numeric for the street number and ends with a zip code which can either be 5 or 9 numbers.
([0-9]{1,} [\s\S]*? [0-9]{5}(?:-[0-9]{4})?)
Like I said, it's not bulletproof, but I've used it with marginal success in the past.
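To see what that expression accepts and rejects, here is a small Python sketch using the same pattern unchanged. The helper name `find_address` and the sample strings are made up for illustration.

```python
import re

# Same pattern as above: a street number, anything at all, then a 5- or 9-digit ZIP.
ADDRESS_RE = re.compile(r"([0-9]{1,} [\s\S]*? [0-9]{5}(?:-[0-9]{4})?)")


def find_address(text):
    """Return the first substring that looks like 'number ... ZIP', or None."""
    match = ADDRESS_RE.search(text)
    return match.group(1) if match else None


found = find_address("Ship to 123 Main St, Springfield, IL 62704-1234 by Friday")
missed = find_address("PO Box 12")
false_hit = find_address("I paid 5 dollars for order 12345")
```

Here `found` is the full address, `missed` is `None` because a PO box has no leading street number followed by a ZIP, and `false_hit` is a match on text that is not an address at all, which is the "not bulletproof" part.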
Over here in New Zealand, you can license the official list of postal addresses from New Zealand Post - giving you the data needed to populate a table with every valid postal address in New Zealand.
Validating against this list is a whole lot easier than trying to come up with a Regex - and the accuracy is much much higher as well, as you end up with three cases:
The address you're validating is in the list, so you know it is a real address
The address you're validating is very similar to one in the list, so you know it is probably a real address
The address you're validating is not similar to anything in the list, so it may or may not be real.
The best you'll get with a RegEx is
The address you're validating matches the regex, so it might be a real address
The address you're validating does not match the regex, so it might not be a real address
Needing to know postal addresses is a pretty common situation for many businesses, so I believe that licensing a list will be possible in most areas.
The only sticky bit will be pricing.
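The three list-based cases can be sketched in Python with the standard library's `difflib`. This is only an illustration: the sample addresses, the `classify` helper, and the 0.85 similarity cutoff are all assumptions, and a licensed address file would be far too large for this kind of linear comparison without an index.

```python
import difflib

# Hypothetical sample standing in for a licensed address list.
KNOWN_ADDRESSES = [
    "1 Queen Street, Auckland 1010",
    "22 Lambton Quay, Wellington 6011",
]


def normalise(address):
    """Lower-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(address.lower().split())


def classify(address, known=KNOWN_ADDRESSES, cutoff=0.85):
    """Return ('exact' | 'similar' | 'unknown', matching list entry or None)."""
    index = {normalise(entry): entry for entry in known}
    key = normalise(address)
    if key in index:
        return ("exact", index[key])
    close = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
    if close:
        return ("similar", index[close[0]])
    return ("unknown", None)


exact = classify("1 queen street,  auckland 1010")
similar = classify("1 Queen Stret, Auckland 1010")
unknown = classify("99 Nowhere Road, Atlantis")
```

The cutoff decides how forgiving the "very similar" case is; it would need tuning against real data.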
I have searched the internet but didn't find a perfect solution, so I hope someone has already done something like this.
My issue is that I'm using a web service where you send a VAT number and, if it is valid, you get the company info back. However, the address is returned as one full string, not divided into parts.
For example:
Google Ireland VAT is IE6388047V
I got:
Company Name: GOOGLE IRELAND LIMITED
Address: 3RD FLOOR ,GORDON HOUSE ,BARROW STREET ,DUBLIN 4
So what I need is something like this:
3RD FLOOR ,GORDON HOUSE ,BARROW STREET ,DUBLIN 4
converted to this:
Address: Ringsend Post Office, Gordon House, Barrow St
City: Dublin 4
Country: Ireland
Can someone help me and make my day?
Thank you so much!
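You can split your address into an array using the PHP function explode(",", $address). It splits the string at each "," and stores the pieces in an array. You may want to trim() each piece afterwards, since the input has spaces around the commas.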
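The same split-and-trim idea as a Python sketch. It assumes the last comma-separated part is the city and that the country is known separately (for example from the VAT prefix); both are assumptions that will not hold for every address, and `split_address` is a made-up helper name.

```python
def split_address(raw, country):
    """Split a comma-separated address; treat the last part as the city."""
    parts = [part.strip() for part in raw.split(",") if part.strip()]
    if not parts:
        return {"address": "", "city": "", "country": country}
    return {
        "address": ", ".join(parts[:-1]),
        "city": parts[-1],
        "country": country,
    }


parsed = split_address("3RD FLOOR ,GORDON HOUSE ,BARROW STREET ,DUBLIN 4", "Ireland")
```

This only rearranges the text that is already there; it cannot correct or enrich the address the way a lookup service could.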
Parsing addresses is challenging because of the many formats you will encounter, even within a single country. There are services that can parse the input and even verify whether the address exists. They usually return the address in components as well as a composed whole. Unfortunately, parsing international addresses is usually only available via a paid service. One such service is provided by smartystreets.com:
International API documentation
International output fields
(Disclosure: I'm a developer at smartystreets.)
I am working on an email application. Here is the background:
Email addresses are stored in an addresses table
Message contents are stored in a message_contents table
There is a search bar at the top of the page. The user would type, for example, Jonathan Kushner programming, and the search would drill down to an email that has Jonathan Kushner in the addresses and the word programming in the message body.
This is where the problem lies. Currently, I'm running MySQL LIKE queries against the email addresses and against the message contents, using the user's whole query. The problem is that, for our example, this is never going to find an email address that matches Jonathan Kushner programming, and it's not going to match any body contents that say Jonathan Kushner programming.
So what I need is some way to recognize, within the query input, what is a name and what is body content. I can't imagine I'm the first person in the world to separate addresses from body contents in a database, so it's surely possible.
Maybe I should set it up so the user has to type either a name or an email address as the first words of the query, and then after X words they can type keywords that would be in the contents. I don't really like this idea, and it couldn't work anyway: if the user didn't have a name or an email address to search for, the whole theory is busted. I'm just not sure how to solve this problem.
Another idea is to possibly separate the words and perform logic on each word and combinations of words.
Any help would be greatly appreciated.
I have done something like this for an ecommerce search function. I split the words up and then used a loop to add the words to a query.
WHERE address LIKE '%$search[0]%' OR address LIKE '%$search[1]%' OR contents LIKE '%$search[0]%'
Hope this is useful. I would play about with some queries to see what works before jumping into the code. When you do start coding, I would recommend splitting the search query into an array of words, then using foreach loops to add them to the query.
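A rough Python sketch of the split-and-loop approach, using bound parameters instead of interpolating the words into the SQL string. Several things here are assumptions for the sake of a runnable example: it uses an in-memory SQLite database rather than MySQL, a single hypothetical `emails` table instead of the separate addresses and message_contents tables, and it joins the per-word conditions with AND (every word must appear in the address or the contents) rather than the OR chain shown above.

```python
import sqlite3


def build_search(query):
    """Build a parameterised query: each word must match address or contents."""
    clauses = []
    params = []
    for word in query.split():
        clauses.append("(address LIKE ? OR contents LIKE ?)")
        pattern = f"%{word}%"  # note: % and _ inside the word are not escaped
        params.extend([pattern, pattern])
    where = " AND ".join(clauses) if clauses else "1"
    return f"SELECT address, contents FROM emails WHERE {where}", params


# Hypothetical sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (address TEXT, contents TEXT)")
conn.executemany(
    "INSERT INTO emails VALUES (?, ?)",
    [
        ("Jonathan Kushner <jk@example.com>", "Notes on programming in PHP"),
        ("Jonathan Kushner <jk@example.com>", "Lunch on Friday?"),
        ("Someone Else <se@example.com>", "More programming tips"),
    ],
)

sql, params = build_search("Jonathan Kushner programming")
rows = conn.execute(sql, params).fetchall()
```

With this data only the first row comes back, since it is the only one where all three words appear somewhere. The search does not need to know which words are names and which are body keywords.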
Have fun and let me know how you get on!
I am getting spam because Gmail ignores the . in its email addresses, so someone like this spammer
q.i.n.ghu.im.i.n.g.o.u.r#gmail.com
can get through by removing or adding periods in the name.
This happens to be on a Joomla install, so I am specifically looking to create a component that I can add to multiple sites, or a simple regex that I can add inline to existing code. Also, is anything being done about this? It amounts to a loosely typed email address, which is crazy to me.
If your goal is to match this address against the others that are equivalent to it (because you've already got them blacklisted), then I'd simply normalize the address to its most basic state before storing it. Lowercase it, split it at the #, and if the right side is "gmail.com" then remove all dots from the left side and put the halves back together.
start with JOE.SCHMOE#GMAIL.COM
lowercase to joe.schmoe#gmail.com
split to joe.schmoe and gmail.com
since right side is gmail.com, remove dots from left
reassemble to joeschmoe#gmail.com
Now you've got the base address that you can block/ban/whatever.
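Those steps can be sketched in Python as follows. The sketch assumes the `#` in the addresses above stands in for `@` and splits on `@`; the function name `normalize_email` is made up for illustration.

```python
def normalize_email(address):
    """Lower-case an address and strip dots from the local part of gmail.com."""
    cleaned = address.strip().lower()
    # Assumption: '@' is the real separator (written as '#' in the thread).
    local, sep, domain = cleaned.rpartition("@")
    if not sep:
        return cleaned  # no separator found: leave it alone
    if domain == "gmail.com":
        local = local.replace(".", "")
    return f"{local}@{domain}"


base = normalize_email("JOE.SCHMOE@GMAIL.COM")
```

Store and compare the normalized form, so every dotted variant maps to the same blacklist entry.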
You could do something simple like: /^(?:[^#]+\.){5,}[^#]+#(?:[^#]+\.)+[^#]+/
This is just a quick sketch, not meant for validation but as a pointer to tell you whether an email looks sketchy. The key here is the {5,} quantifier, which makes the pattern match if the part before the # has 5 or more dots (like a.b.c.d.e.f). In other words, such an address gets flagged as sketchy.
I hope this helps!
Explanation: http://regex101.com/r/lB5vG3
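For reference, the same pattern in Python. This assumes the `#` in the pattern stands in for `@`, so it is swapped here; `looks_sketchy` is a made-up helper name.

```python
import re

# Same structure as the pattern above, with '@' substituted for '#' (assumption).
MANY_DOTS_RE = re.compile(r"^(?:[^@]+\.){5,}[^@]+@(?:[^@]+\.)+[^@]+")


def looks_sketchy(address):
    """True when the part before the '@' contains five or more dots."""
    return MANY_DOTS_RE.match(address) is not None


flagged = looks_sketchy("q.i.n.ghu.im.i.n.g.o.u.r@gmail.com")
normal = looks_sketchy("joe.schmoe@gmail.com")
```

A legitimate address with many dots would be flagged too, so this is better used to queue signups for review than to reject them outright.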
I have an HTML form that takes an input shipping address in parts (street address, city, state/province, postal code, and country). This form is then processed with PHP.
I'd like to convert this address into the correct format for the destination country. Are there any libraries or external services that I could use to do this conversion in PHP? If not, could I do it with Perl or a similar language?
Never used it but Geo::PostalAddress is a good starting point. Useful links to regulations if nothing else.
Note that various shipping companies (Fedex, DHL etc) have their own rules for address format.
For anyone still looking for a solution, there is now a professionally made, well-maintained open source library, https://github.com/commerceguys/addressing, that solves this exact problem.
In Perl you can use Class::Phrasebook. Using it is very easy.
use Class::Phrasebook;
my $pb = new Class::Phrasebook($log, "test.xml");
$pb->load("NL"); # using Dutch as the language
$phrase = $pb->get("ADDRESS",
{ street => "Chaim Levanon",
number => 88,
city => "Tel Aviv" } );
In your case the shipping address will be dynamic (provided by the user), so you'll have to do some more work. You can create an XML file, add a dictionary for each country, and add phrases (street address, city, state/province, postal code) to each dictionary. Write country-specific data in each phrase, like "Street address: $street" for the English dictionary, "adresse: $street" for the French dictionary, etc. Then access the dictionary that matches the user's country.
More information at CPAN.
I've thought about this problem and I've decided that a file/database with address templates listed for each country is the best solution for me.
However, I'm certain that the other solutions given would work as well.
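A minimal Python sketch of the template-per-country idea. The two templates are simplified illustrations, not authoritative postal formats, and the field names and `format_address` helper are made up; real templates would come from each country's postal regulations or from a maintained library.

```python
# Hypothetical, simplified templates keyed by ISO country code.
ADDRESS_TEMPLATES = {
    "US": "{street}\n{city}, {region} {postal_code}\nUNITED STATES",
    "NL": "{street}\n{postal_code} {city}\nNETHERLANDS",
}
# Fallback used when no template exists for the country.
DEFAULT_TEMPLATE = "{street}\n{city} {region} {postal_code}\n{country}"


def format_address(fields):
    """Fill the template for fields['country'], or the fallback template."""
    template = ADDRESS_TEMPLATES.get(fields["country"], DEFAULT_TEMPLATE)
    return template.format(**fields)


us_label = format_address({
    "street": "1600 Pennsylvania Ave NW",
    "city": "Washington",
    "region": "DC",
    "postal_code": "20500",
    "country": "US",
})
```

Keeping the templates in a file or database table, as described above, means new countries can be added without code changes.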
I am currently developing a website for an electrical company. They would like some sort of postcode check on there. It would somehow work like this:
User enters postcode
See if we cover it
display results.
But I have never worked with postcodes before. How would I check whether they cover a given postcode? I obviously need some sort of database listing the postcodes or areas they cover. But how would I also check whether the postcode is valid?
The postcode lookup is obviously to see if the electrical company covers the user's area.
Thanks in advance.
I think that instead of using a database to search your results, you would be better off looking at geolocation and using a third party to calculate everything for you.
Google and Sony both provide Geo Location platforms
Sony has: http://www.placeengine.com/en
Google has: http://code.google.com/apis/maps/documentation/distancematrix/
I obviously need some sort of database listing the postcode or area they cover.
Yes. Then do a simple look up.
But how would I also check if the postcode is valid.
You would have to have a database listing all possible post codes and do a similar check.
I don't know which country you are in, but in most countries there is no programmatic way of determining whether a postcode is covered by a service or not.
There is also no way of knowing whether it is valid or not without consulting a database of postcodes.
There are databases available with postal codes, here is a dutch one: http://www.postcode.nl/index/269/1/0/overzicht-producten-en-diensten.html
You can validate the postal code using a regular expression.
The electrical company could surely give you a list of postcodes they do cover? Then it is a simple string matching from there....
I assume you are interested in the UK. To see if a postcode is valid grab the free CodePoint open data set from OS: http://www.ordnancesurvey.co.uk/oswebsite/products/code-point-open/
You could pull that into a DB and cross-check user input. Just remember that this data is based on the Royal Mail PAF, so do not assume it is 100% accurate. Build a bit of flexibility/fault tolerance into your code.
If the client wants a specific radius covered, you could also use the OS data for distance calculations... and it is all FREE, as in both freedom and free beer :-)
As a first step you could check if the postcode is in a valid format: http://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation - this check will pre-filter input for you.
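Putting the format pre-filter and the coverage lookup together, a rough Python sketch might look like this. The regex is a simplified UK postcode shape check, not the full validation rules, and the set of covered outward codes is hypothetical; the real list would come from the electrical company.

```python
import re

# Simplified UK postcode shape: outward code, optional space, inward code.
UK_POSTCODE_RE = re.compile(r"^([A-Z]{1,2}[0-9][A-Z0-9]?) ?([0-9][A-Z]{2})$")
# Hypothetical outward codes (the part before the space) that the company covers.
COVERED_OUTWARD_CODES = {"SW1A", "M1", "LS6"}


def check_postcode(raw):
    """Return 'invalid', 'covered', or 'not covered' for the user's input."""
    cleaned = " ".join(raw.upper().split())
    match = UK_POSTCODE_RE.match(cleaned)
    if match is None:
        return "invalid"  # fails the format pre-filter
    outward = match.group(1)
    return "covered" if outward in COVERED_OUTWARD_CODES else "not covered"


status = check_postcode("sw1a 1aa")
```

Passing the format check only means the input is shaped like a postcode; confirming that it actually exists still needs a lookup against a postcode data set such as the one mentioned above.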