I found a similar question, but its answers are not very helpful because they fail in many cases. So I'm asking again in case somebody has a better solution.
Here are some example addresses:
[0] => "Skattkarr Varmland SE-65671" //Sweden
[1] => "Rayleigh , Essex SS6 8YJ" //UK
[2] => "Horgen, Zürich 8810" //Switzerland
[3] => "Edmonton Alberta T5A 2L8" //Canada
[4] => "REDDING, CA 96003" //USA
[5] => "New York, NY 96003" //USA
[6] => "New York NY 96003" //USA
I have tried a lot, but it fails in many cases.
I can get 2 or 3 of them to work, but not all of them, especially when the country changes.
I tried explode(" ",$addr[0]), which gives me the city at index 0 and the state at index 1, but explode(" ",$addr[6]) gives me New as the city and York as the state. Likewise, the zip code will be wrong for the UK and Canada.
My last question was marked as a duplicate, but my problem is different and the linked question does not help me.
To separate these strings into state, city and zip code, you need to define rules that apply to all of your strings.
If we split on spaces, New York will not work, because New York is a city with a space in its name.
If we split on commas, some of the strings don't contain a comma.
If we split on both spaces and commas, we cannot assume the last item is the zip code, because T5A 2L8 is a single zip code but would be split in two.
So I can't think of any rule that would work with your data. Start by working out how these strings can be separated and identified, then try to express that in code, and we will gladly help you.
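To make the point above concrete, here is a minimal sketch that peels the postcode off the end of each sample string using per-format rules. The patterns are simplified assumptions that only cover the sample addresses, and the function name splitPostcode is hypothetical. Note that the remaining text ("New York NY", "Edmonton Alberta") is still ambiguous, which is exactly the problem described above.

```php
<?php
// Try postcode patterns from most to least specific, anchored at the end.
// These patterns are illustrative assumptions, not complete national formats.
function splitPostcode(string $address): array
{
    $patterns = [
        '/\b[A-Z]\d[A-Z] ?\d[A-Z]\d$/i',           // Canada, e.g. "T5A 2L8"
        '/\b[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$/i',  // UK, e.g. "SS6 8YJ"
        '/\b[A-Z]{2}-\d{5}$/i',                    // country prefix, e.g. "SE-65671"
        '/\b\d{4,5}$/',                            // plain 4 or 5 digits
    ];
    $address = trim($address);
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $address, $m, PREG_OFFSET_CAPTURE)) {
            // $m[0] is [matched text, byte offset]; keep what precedes it
            $rest = rtrim(substr($address, 0, $m[0][1]), " ,");
            return ['rest' => $rest, 'postcode' => $m[0][0]];
        }
    }
    return ['rest' => $address, 'postcode' => null];
}

print_r(splitPostcode("Edmonton Alberta T5A 2L8"));
print_r(splitPostcode("New York NY 96003"));
```

Splitting "rest" into city and state would still need a lookup of known place names or an external geocoder.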
I tried OSM Nominatim to separate and validate the data.
Request
https://nominatim.openstreetmap.org/search?format=json&limit=1&addressdetails=1&q=1088+Burton+Dr.+REDDING,+CA+96003+US
Response
[
  {
    "place_id": 266720693,
    "licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
    "osm_type": "way",
    "osm_id": 10591437,
    "boundingbox": [
      "40.591203388592",
      "40.591303388592",
      "-122.34939939898",
      "-122.34929939898"
    ],
    "lat": "40.59125338859231",
    "lon": "-122.34934939898274",
    "display_name": "1088, Burton Drive, Lancer Hills Estates, Redding, Shasta County, California, 96003, United States",
    "class": "place",
    "type": "house",
    "importance": 0.621,
    "address": {
      "house_number": "1088",
      "road": "Burton Drive",
      "neighbourhood": "Lancer Hills Estates",
      "city": "Redding",
      "county": "Shasta County",
      "state": "California",
      "postcode": "96003",
      "country": "United States",
      "country_code": "us"
    }
  }
]
And this is what I actually want: a breakdown of the address string.
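A minimal PHP sketch of the request above, assuming the response has the shape shown. The function names, the User-Agent string and the fallback from "city" to "town" or "village" are my own assumptions. The public Nominatim server expects clients to identify themselves, so the sketch sends a User-Agent header.

```php
<?php
// Build a search URL with the same parameters as the request above.
function buildNominatimUrl(string $query): string
{
    $params = [
        'format'         => 'json',
        'limit'          => 1,
        'addressdetails' => 1,
        'q'              => $query,
    ];
    return 'https://nominatim.openstreetmap.org/search?' . http_build_query($params);
}

// Pick the wanted parts out of a decoded response; null when nothing was found.
function extractAddressParts(array $results): ?array
{
    if (!isset($results[0]['address'])) {
        return null;
    }
    $address = $results[0]['address'];
    return [
        // assumption: smaller places are reported under "town" or "village"
        'city'     => $address['city'] ?? $address['town'] ?? $address['village'] ?? null,
        'state'    => $address['state'] ?? null,
        'postcode' => $address['postcode'] ?? null,
        'country'  => $address['country_code'] ?? null,
    ];
}

// Fetch and decode one address (performs a network request when called).
function geocode(string $query): ?array
{
    $context = stream_context_create([
        'http' => ['header' => "User-Agent: my-address-parser/0.1\r\n"], // placeholder name
    ]);
    $json = file_get_contents(buildNominatimUrl($query), false, $context);
    if ($json === false) {
        return null;
    }
    $results = json_decode($json, true);
    return is_array($results) ? extractAddressParts($results) : null;
}
```

Calling geocode("REDDING, CA 96003 US") should then return the city, state, postcode and country code, provided the service finds a match.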
I need to parse a street address in PHP from a string that might contain abbreviations.
This string comes from a text input.
The fields I need to search are:
street (alphanumeric - might have
building (alphanumeric - might have
number (alphanumeric - might have
area (numeric from 1 to 5)
other (unknown field & used to search in all the above fields in the database)
For example, a user submits one of these texts:
street Main Road Bulding H7 Number 5 Area 1
st Main Road bldg H7 Nr 5 Ar 5
stMain bldgh7
ar5 unknown other search parameter
street Main Road h7 2b
street main street str main road
The outcome I would like to see, as an array:
[street]=>Main Road [building]=>h7 [number]=>5 [area]=>1
[street]=>Main Road [building]=>h7 [number]=>5 [area]=>5
[street]=>Main [building]=>h7
[area]=>5 [other]=>unknown other search parameter
[street]=>Main Road [other]=>h7 2b
[street]=>Main Street&&Main Road
My code so far, which doesn't work with examples 3, 4, 5 and 6:
<?php
// posted address
$address = "str main one bldg 5b other param area 1";
// abbreviations to replace, keyed by field name
$replace = ['street'   => ['st', 'str'],
            'building' => ['bldg', 'bld'],
            'number'   => ['nr', 'numb', 'nmbr']];
// replace each abbreviation with its field name
foreach ($replace as $field => $abbrs)
    foreach ($abbrs as $abbr)
        $address = str_replace($abbr.' ', $field.' ', $address);
// fields
$fields = array_keys($replace);
// match ($fields already holds the field names, so no second array_keys() here)
if (preg_match_all('/('.implode('|', $fields).')\s+([^\s]+)/si', $address, $matches)) {
    // matches
    $search = array_combine($matches[1], $matches[2]);
    // other
    $search['other'] = str_replace($matches[0], "", $address);
} else {
    // search in all the fields
    $search['other'] = $address;
}
// search
print_r($search);
Code tester: http://ideone.com/j3q4YI
Wow, you've got one hairy mess to clean up. I've toiled for a few hours on this. It works on all of your samples, but I would NOT stake my career on it being perfect on all future cases. There are simply too many variations in addresses. I hope you can understand my process and modify it if/when new samples fail to be captured properly. I'll leave all my debugging comments in place, because I reckon you'll use them for future edits.
$addresses=array(
    "street Main Road Bulding H7 Number 5 Area 1",
    "st Main Road bldg H7 Nr 5 Ar 5",
    "stMain bldgh7",
    "ar5 unknown other search parameter",
    "street Main Road h7 2b",
    "street main street str main road"
);
$regex["area"]="/^(.*?)(ar(?:ea)?\s?)([1-5])(.*?)$/i";
$regex["number"]="/^(.*?)(n(?:umbe)?r\s?)([0-9]+)(.*?)$/i";
$regex["building"]="/^(.*?)(bu?i?ldi?n?g\s?)([^\s]+)(.*?)$/i";
$regex["corner"]="/^(.*?str?(?:eet)?)\s?(str?(?:eet)?.*)$/i"; // 2 streets in string
$regex["street"]="/^(.*?)(str?(?:eet)?\s?)([^\s]*(?:\s?ro?a?d|\s?str?e?e?t?|.*?))(\s?.*?)$/i";
$regex["other"]="/^(.+)$/";
$search=[];
foreach($addresses as $i=>$address){
    echo "<br><div><b>$address</b> breakdown:</div>";
    foreach($regex as $key=>$rgx){
        if(strlen($address)>0){
            //echo "<div>addr(",strlen($address),") $address</div>";
            if(preg_match($rgx,$address,$matches)){
                if($key=="other"){
                    $search[$i][$key]=$matches[0]; // everything that remains
                }elseif($key=="corner"){
                    $search[$i]["street"]=""; // NOTICE suppression
                    // loop through both halves of corner address omitting element[0]
                    foreach(array_diff_key($matches,array('')) as $half){
                        //echo "half= $half<br>";
                        if(preg_match($regex["street"],$half,$half_matches)){
                            //print_r($half_matches);
                            $search[$i]["street"].=(strlen($search[$i]["street"])>0?"&&":"").ucwords($half_matches[3]);
                            $address=trim($half_matches[1].$half_matches[4]);
                            // $matches[2] is the discarded identifier
                            //echo "<div>$key Found: {$search[$i][$key]}</div>";
                            //echo "<div>Remaining: $address</div>";
                        }
                    }
                }else{
                    $search[$i][$key]=($key=="street"?ucwords($matches[3]):$matches[3]);
                    $address=trim($matches[1].$matches[4]);
                    // $matches[2] is the discarded identifier
                    //echo "<div>$key Found: {$search[$i][$key]}</div>";
                    //echo "<div>Remaining: $address</div>";
                    //print_r($matches);
                }
            }
        }else{
            break; // address is fully processed
        }
    }
    echo "<pre>";
    var_export($search[$i]);
    echo "</pre>";
}
The output is an array that satisfies your brief, but the keys are out of order because I captured the address components out of order -- this may not matter to you, so I didn't bother re-sorting it.
street Main Road Bulding H7 Number 5 Area 1 breakdown:
array (
'area' => '1',
'number' => '5',
'building' => 'H7',
'street' => 'Main Road',
)
st Main Road bldg H7 Nr 5 Ar 5 breakdown:
array (
'area' => '5',
'number' => '5',
'building' => 'H7',
'street' => 'Main Road',
)
stMain bldgh7 breakdown:
array (
'building' => 'h7',
'street' => 'Main',
)
ar5 unknown other search parameter breakdown:
array (
'area' => '5',
'other' => 'unknown other search parameter',
)
street Main Road h7 2b breakdown:
array (
'street' => 'Main Road',
'other' => 'h7 2b',
)
street main street str main road breakdown:
array (
'street' => 'Main Street&&Main Road',
)
...boy am I glad this project doesn't belong to me. Good luck!
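If the key order does matter, one way to restore it afterwards is sketched below. The order list is taken from the expected output in the question, and the helper name reorderFields is hypothetical.

```php
<?php
// Desired key order, as in the question's expected output.
$order = ['street', 'building', 'number', 'area', 'other'];

// Put known keys in the given order; keys not listed in $order are appended.
function reorderFields(array $fields, array $order): array
{
    // keep only the ordered keys that are actually present in $fields
    $template = array_intersect_key(array_flip($order), $fields);
    // overwrite the placeholder values with the real ones
    return array_replace($template, $fields);
}

$found = ['area' => '1', 'number' => '5', 'building' => 'H7', 'street' => 'Main Road'];
var_export(reorderFields($found, $order));
```

In the loop above this could be applied to $search[$i] just before var_export().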
Thank you for the help! I thought that I should do something like multiple preg_matches.
I just found a PHP extension that does exactly what I want.
The library is PHP Postal (https://github.com/openvenues/php-postal), which requires libpostal. Loading the library takes about 15-20 seconds when PHP starts; after that everything works fine.
Total execution time for parsing: 0.00030-0.00060 seconds.
$parsed = Postal\Parser::parse_address("The Book Club 100-106 Leonard St, Shoreditch, London, Greater London, EC2A 4RH, United Kingdom");
foreach ($parsed as $component) {
echo "{$component['label']}: {$component['value']}\n";
}
Output:
house: the book club
house_number: 100-106
road: leonard st
suburb: shoreditch
city: london
state_district: greater london
postcode: ec2a 4rh
country: united kingdom
All I had to do after this was replace the labels with my own and format the address.
I hope this helps others who want to parse an address in PHP.
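The relabelling step could look roughly like the sketch below. It only assumes that the parser returns a list of label/value pairs, as in the loop above. The field names in $labelMap and both helper functions are hypothetical, and the sample data is copied from the output above, so the sketch runs without the extension.

```php
<?php
// Hypothetical mapping from parser labels to my own field names.
$labelMap = [
    'house_number' => 'number',
    'road'         => 'street',
    'city'         => 'city',
    'postcode'     => 'zip',
    'country'      => 'country',
];

// Keep only the components I have a field for, under my own names.
function relabel(array $parsed, array $labelMap): array
{
    $address = [];
    foreach ($parsed as $component) {
        $label = $component['label'];
        if (!isset($labelMap[$label])) {
            continue; // no field for this label
        }
        $address[$labelMap[$label]] = $component['value'];
    }
    return $address;
}

// Restore capitalisation, since the parser output is lowercased.
function formatAddress(array $address): string
{
    $line = ucwords(trim(($address['number'] ?? '') . ' ' . ($address['street'] ?? '')));
    $city = ucwords($address['city'] ?? '');
    $zip  = strtoupper($address['zip'] ?? '');
    return implode(', ', array_filter([$line, $city, $zip]));
}

$parsed = [
    ['label' => 'house_number', 'value' => '100-106'],
    ['label' => 'road',         'value' => 'leonard st'],
    ['label' => 'suburb',       'value' => 'shoreditch'],
    ['label' => 'city',         'value' => 'london'],
    ['label' => 'postcode',     'value' => 'ec2a 4rh'],
];
echo formatAddress(relabel($parsed, $labelMap)), "\n";
```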
I am querying the Wikipedia API. Normally I get the following, and I echo out the extract.
array:4 [▼
"pageid" => 13275
"ns" => 0
"title" => "Hungary"
"extract" => """
<p><span></span></p>\n
<p><b>Hungary</b> (<span><span>/<span><span title="/ˈ/ primary stress follows">ˈ</span><span title="'h' in 'hi'">h</span><span title="/ʌ/ short 'u' in 'bud'">ʌ</span><span title="/ŋ/ 'ng' in 'sing'">ŋ</span><span title="'g' in 'guy'">ɡ</span><span title="/ər/ 'er' in 'finger'">ər</span><span title="/i/ 'y' in 'happy'">i</span></span>/</span></span>; Hungarian: <span lang="hu"><i>Magyarország</i></span> <span title="Representation in the International Phonetic Alphabet (IPA)">[ˈmɒɟɒrorsaːɡ]</span>) is a parliamentary constitutional republic in Central Europe. It is situated in the Carpathian Basin and is bordered by Slovakia to the north, Romania to the east, Serbia to the south, Croatia to the southwest, Slovenia to the west, Austria to the northwest, and Ukraine to the northeast. The country's capital and largest city is Budapest. Hungary is a member of the European Union, NATO, the OECD, the Visegrád Group, and the Schengen Area. The official language is Hungarian, which is the most widely spoken non-Indo-European language in Europe.</p>\n
But if the entry does not exist in Wiki then I get this.
array:3 [▼
"ns" => 0
"title" => "Kisfelegyhaza"
"missing" => ""
]
So my question is how do I check if extract exists?
I tried the following but it does not work.
// $wiki_array holds the data received from Wiki
if (array_key_exists('extract', $wiki_array)) {
    // do something
}
// $wiki_array holds the data received from Wiki
if (isset($wiki_array['extract'])) {
    // do something
}
Use isset($var) to check whether a variable is set (that is, it exists and is not null).
For anyone facing the same problem, here is the solution I used.
foreach ($wiki_array['query']['pages'] as $page) {
    if (isset($page['extract'])) {
        echo '<p>';
        echo $page['extract'];
        echo '</p>';
    }
}
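Judging by the solution above, the earlier checks probably failed because 'extract' is not a top-level key: it sits inside each entry of query.pages. A small self-contained sketch with made-up sample data in that shape (the helper name collectExtracts is hypothetical):

```php
<?php
// Collect extracts by title, skipping pages that have no 'extract' key.
function collectExtracts(array $response): array
{
    $extracts = [];
    foreach ($response['query']['pages'] ?? [] as $page) {
        if (isset($page['extract'])) {
            $extracts[$page['title']] = $page['extract'];
        }
    }
    return $extracts;
}

// Sample data shaped like the two dumps in the question (extract shortened).
$wiki_array = [
    'query' => [
        'pages' => [
            '13275' => ['pageid' => 13275, 'ns' => 0, 'title' => 'Hungary', 'extract' => '<p><b>Hungary</b> ...</p>'],
            '-1'    => ['ns' => 0, 'title' => 'Kisfelegyhaza', 'missing' => ''],
        ],
    ],
];

var_dump(isset($wiki_array['extract']));   // false: wrong nesting level
print_r(collectExtracts($wiki_array));     // only the Hungary entry
```

Note that isset($page['missing']) is true for a missing page, because an empty string is still a set, non-null value.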
I have a database with many tables. One table represents products and another represents categories. One product can belong to many categories, so when I query the database to display the products with their categories, I get one row per category for each product.
For instance the result would be
product_name|category |city
Test1 |cinema |paris
Test1 |entertainment |paris
Test1 |Other |paris
Test2 |Food |new york
Test2 |Restaurant |new york
Test2 |Night |new york
What I am trying to do is to create a JSON object using a PHP script for each product name which looks like this :
[
{
"product_name": "Test1",
"categorie": [
"cinema",
"entertainment",
"Other"
],
"city": "paris"
},
{
"product_name": "Test2",
"categorie": [
"Food",
"Restaurant",
"Night"
],
"city": "new york"
}
]
When I tried json_encode directly, it did not work: I got duplicate rows.
Thanks for your help.
You need to regroup your data before calling json_encode.
If $rows is the result from your database, loop through it like this:
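$data = [];
foreach ($rows as $row) {
    $name = $row['product_name'];
    // first time we see this product: create its entry
    if (!isset($data[$name])) {
        $data[$name] = [
            'product_name' => $name,
            'city' => $row['city'],
            'categories' => []
        ];
    }
    // every row adds one category to its product
    $data[$name]['categories'][] = $row['category'];
}
// array_values() drops the product-name keys so the result is a JSON array
echo json_encode(array_values($data)); // this is what you want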
That's it!
Depending on the database API you use, you may need while ($row = mysqli_fetch_assoc($result)) instead of foreach (mysqli_fetch_row would return numerically indexed rows, so $row['product_name'] would not exist). The idea is the same.
I am trying to decode a JSON file in PHP (which I do all the time). But when I use this file: https://www.hyzyne.co.nz/updatewlg/nzta.json, it decodes, but the resulting array only contains the last part of the JSON.
PHP:
$xml = file_get_contents('https://www.hyzyne.co.nz/updatewlg/nzta.json');
$data = json_decode($xml, true);
print_r($data);
If you view the file you will see it is quite long.
When I print the above all I get is:
Array ( [data] => Array ( [0] => Array ( [roadevent] => Array ( [alternativeRoute] => Follow Detours [directLineDistance1] => 1.55 km southeast of Taradale [directLineDistance2] => 1.62 km southwest of Jervoistown [directLineDistance3] => 1.66 km east of Waiohiki [endDate] => 2013-12-03T18:25:18.677+13:00 [eventComments] => Now Clear [eventDescription] => Crash [eventId] => 84295 [eventIsland] => North Island [eventType] => Road Hazard [expectedResolution] => Until further notice [impact] => Caution [locationArea] => SH 50 Taradale [locations] => Array ( [location] => 050-0005/04.88 Taradale ) [planned] => false [startDate] => 2013-12-03T17:17:00.000+13:00 [status] => Resolved [wktGeometry] => SRID=27200;POINT (2841479.57385071 6176775.805777006) [eventCreated] => 2013-12-03T17:20:18.450+13:00 [eventModified] => 2013-12-03T18:25:18.380+13:00 [informationSource] => Police [supplier] => Official [eventRegions] => Array ( [eventRegion] => Taranaki, Manawatu-Wanganui, Hawke's Bay & Gisborne Region ) ) ) ) )
That is not all of the JSON. I created the JSON file myself, but I cannot spot any problems in it.
Even when I put my JSON through http://jsonlint.com/ I only get the last segment. Does anyone know what I am doing wrong?
Thanks
There is an issue with your JSON: you repeat the property "roadevent" within a single object, so each occurrence overwrites the previous one. I assume each event should be in its own object { } inside the array [ ], rather than all of them sharing one object:
{
"data": [{
"roadevent": {
"alternativeRoute": "Local Roads",
"directLineDistance1": "1.43 km west of Heathcote Valley",
"directLineDistance2": "1.45 km southwest of Ferrymead",
"directLineDistance3": "1.68 km southwest of Mount Pleasant",
"eventComments": "Bridges Not To Be Crossed By Any Overweight Loads Except For Iso Containers Being Moved On Existing Overweight Permits. Any Other Overweight Loads (including Those Travelling On Area Permits) Will Be Considered On A Case By Case Basis.",
"eventDescription": "Other",
"eventId": "47078",
"eventIsland": "South Island",
"eventType": "Road Hazard",
"expectedResolution": "Until further notice",
"impact": "Vehicle Restrictions",
"locationArea": "SH 74 Christchurch - Tunnel Rd, Horotane Valley Overpasses No 1 And 2 (bsn 217 & 218)",
"locations": {
"location": "074-0019/02.60-D Heathcote Valley"
},
"planned": "false",
"restrictions": "Road constricted for oversize and non-standard vehicles",
"startDate": "2011-03-03T17:47:00.000+13:00",
"status": "Active",
"wktGeometry": "SRID=27200;POINT (2485362.087794318 5737151.779842964)",
"eventCreated": "2011-03-03T17:49:29.260+13:00",
"eventModified": "2012-08-15T17:52:54.210+12:00",
"informationSource": "NMC",
"supplier": "Official"
},
"roadevent": {
"alternativeRoute": "-",
"directLineDistance1": "0.69 km southwest of Hakataramea",
"directLineDistance2": "0.92 km northeast of Kurow",
"directLineDistance3": "6.43 km southeast of Lake Waitaki",
"eventComments": "Speed Restriction In Place For Heavy Vehicles Of 20kph.",
"eventDescription": "Other",
"eventId": "62595",
"eventIsland": "South Island",
"eventType": "Road Hazard",
"expectedResolution": "Until further notice",
"impact": "Caution",
"locationArea": "SH 82 Waitaki River Bridge No 1 ( Kurow Bridges)",
"locations": {
"location": "082-0053/16.68 -"
},
"planned": "false",
"startDate": "2012-06-14T12:17:00.000+12:00",
"status": "Active",
"wktGeometry": "SRID=27200;POINT (2310306.0879597953 5605922.516155325)",
"eventCreated": "2012-06-14T12:17:55.127+12:00",
"eventModified": "2012-08-15T17:46:26.197+12:00",
"informationSource": "NMC",
"supplier": "Official"
},
Just replace every ,"roadevent": with , so that roadevent becomes a single array of events.
The JSON should then end with Gisborne Region"}}]}}
and start with {"data":{"roadevent":[{"alternativeRoute"
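A small demonstration of both points, using shortened stand-in JSON (only the eventId values are taken from the excerpt above): with a repeated key json_decode keeps only the last value, while the array form keeps every event.

```php
<?php
// Repeated "roadevent" key inside one object: later values overwrite earlier ones.
$duplicated = '{"data":[{"roadevent":{"eventId":"47078"},"roadevent":{"eventId":"62595"}}]}';
$decoded    = json_decode($duplicated, true);
$keyCount   = count($decoded['data'][0]);                 // only one key survives
$survivor   = $decoded['data'][0]['roadevent']['eventId']; // the last event

// Restructured: "roadevent" holds an array of event objects.
$fixed  = '{"data":{"roadevent":[{"eventId":"47078"},{"eventId":"62595"}]}}';
$events = json_decode($fixed, true)['data']['roadevent'];
$ids    = array_column($events, 'eventId');

echo "duplicated form keeps $keyCount event (id $survivor)\n";
echo "array form keeps ", count($events), " events: ", implode(', ', $ids), "\n";
```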
OK, let's say that when a user selects a country, they are also assigned a "federation". These federations are pretty much region-centric.
Let's say I have something like this:
function getFederation($country_iso) {
    // 6 federations
    // afc = Asian nations
    // caf = African nations
    // concacaf = North & Central American and Caribbean nations
    // conmebol = South American nations
    // ofc = Oceanian nations
    // uefa = European nations
    $afc = array("Japan", "China", "South Korea");
    $caf = array("Cameroon", "Chad", "Ivory Coast");
    $concacaf = array("United States" , "Canada", "Mexico");
    $conmebol = array("Argentina", "Brazil", "Chile");
    $ofc = array("Fiji", "New Zealand", "Samoa");
    $uefa = array("Spain", "England", "Montenegro");
    /*
    PSEUDO-code
    If $country_iso is in one of the six arrays... mark that as the federation...
    */
    return $federation;
}
I know, it says a country's name but when it comes down to it, it will be country's iso like JP instead of Japan, CN instead of China, et cetera.
So, I was wondering, is this a feasible thing or is there a better way you'd think?
How about putting all federations into an array, in order to loop through it? Makes things easier, like so:
function countryToFederation($country_iso) {
    $federations = array(
        "afc" => array("Japan", "China", "South Korea"),
        "caf" => array("Cameroon", "Chad", "Ivory Coast"),
        "concacaf" => array("United States" , "Canada", "Mexico"),
        "conmebol" => array("Argentina", "Brazil", "Chile"),
        "ofc" => array("Fiji", "New Zealand", "Samoa"),
        "uefa" => array("Spain", "England", "Montenegro"),
    );
    // $name is the federation key, $countries its list of members
    foreach($federations as $name => $countries) {
        if(in_array($country_iso, $countries)) {
            return $name; // return the federation name, not the member list
        }
    }
    return null; // country not found in any federation
}
If a country can only belong to one federation, I would create one array instead:
$countryToFederationMap = array(
    'Japan' => 'AFC',
    'China' => 'AFC',
    'Cameroon' => 'CAF',
    // ...
);
Then the federation is simply:
return $countryToFederationMap[$country];
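Since the question says ISO codes will be used in the end, here is a hedged sketch of the same lookup keyed by ISO 3166-1 alpha-2 codes. The function name is hypothetical and the map lists only a few countries from the examples above. Unknown codes return null instead of raising an undefined-index notice.

```php
<?php
// Look up the federation for an ISO 3166-1 alpha-2 country code.
// The map is deliberately incomplete; extend it with the remaining countries.
function federationForCountry(string $countryIso): ?string
{
    static $map = [
        'JP' => 'AFC',
        'CN' => 'AFC',
        'KR' => 'AFC',
        'CM' => 'CAF',
        'US' => 'CONCACAF',
        'BR' => 'CONMEBOL',
        'NZ' => 'OFC',
        'ES' => 'UEFA',
    ];
    // normalise case so "jp" and "JP" both work
    return $map[strtoupper($countryIso)] ?? null;
}

var_dump(federationForCountry('JP'));
var_dump(federationForCountry('xx'));
```

A single map like this makes each lookup a direct array access, whereas looping over per-federation lists searches every list in turn.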