php : parse xml file Multi line strings

I need to build a table with one row per road, holding a shape_leng column and a coordinates (MULTILINESTRING) column. A road can have any number of lines, and I need to save all of them in that road's single row.
The XML file is in this format; a few roads are multi-line strings, and others have only one line:
This is how I have tried to parse it:
(Note that there is a single shape_leng per road, but a road can have one or many coordinate lines.)
Because of this, I am unable to pair each shape_leng with the coordinates that belong to it.

If you want to insert all coordinates into a single database row, I think you have to construct the XPath and the loop a bit differently. Loop through the roads, then use XPath to get all the coordinates belonging to each road. For example:
// get all the roads and loop through them
$roads = $xml->xpath("//e:featureMember/b:AA_ROAD");
$i = 0;
while (isset($roads[$i]))
{
    // get the coordinates and the length for the current road
    // (relative paths, so the search starts at this road, not at the document root)
    $coordinates = $roads[$i]->xpath("b:the_geom/e:MultiLineString/e:lineStringMember/e:LineString/e:coordinates");
    $shapel = $roads[$i]->xpath("b:SHAPE_Leng");
    // second loop: concatenate all the $coordinates of this road
    $lines = array();
    $j = 0;
    while (isset($coordinates[$j])) {
        // TODO convert each coordinate list to the WKT "x y, x y" form if needed
        $lines[] = '(' . trim($coordinates[$j]) . ')';
        $j++;
    }
    $a = implode(', ', $lines);
    // insert the row; $shapel[0] is the single SHAPE_Leng of this road
    $b = mysql_query("INSERT IGNORE INTO `new`.`road1` (`coordstr`, `shapeleng`) values (GEOMFROMTEXT(concat('MULTILINESTRING ($a)')), '$shapel[0]')");
    $i++;
    echo "<br />";
    echo $i;
}
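To make the concatenation step concrete, here is a minimal sketch that builds one MULTILINESTRING per road. The sample XML, its namespace URIs, and the "x,y x,y" coordinate layout are assumptions made for illustration (the original file is not shown), and coordinatesToWkt() is a hypothetical helper, not part of any library.

```php
<?php
// Hypothetical sample document: the namespace URIs and the "x,y x,y"
// coordinate layout are assumptions, not taken from the original file.
$source = <<<XML
<root xmlns:e="http://example.com/e" xmlns:b="http://example.com/b">
  <e:featureMember>
    <b:AA_ROAD>
      <b:SHAPE_Leng>12.5</b:SHAPE_Leng>
      <b:the_geom>
        <e:MultiLineString>
          <e:lineStringMember><e:LineString><e:coordinates>1,2 3,4</e:coordinates></e:LineString></e:lineStringMember>
          <e:lineStringMember><e:LineString><e:coordinates>5,6 7,8</e:coordinates></e:LineString></e:lineStringMember>
        </e:MultiLineString>
      </b:the_geom>
    </b:AA_ROAD>
  </e:featureMember>
</root>
XML;

// Turn "1,2 3,4" into the WKT form "(1 2, 3 4)".
function coordinatesToWkt(string $coordinates): string
{
    $points = preg_split('/\s+/', trim($coordinates));
    foreach ($points as $k => $point) {
        $points[$k] = str_replace(',', ' ', $point);
    }
    return '(' . implode(', ', $points) . ')';
}

$xml = simplexml_load_string($source);
$rows = [];
foreach ($xml->xpath('//e:featureMember/b:AA_ROAD') as $road) {
    $lines = [];
    // relative path: only the coordinates that belong to this road
    foreach ($road->xpath('b:the_geom/e:MultiLineString/e:lineStringMember/e:LineString/e:coordinates') as $c) {
        $lines[] = coordinatesToWkt((string) $c);
    }
    $length = $road->xpath('b:SHAPE_Leng');
    $rows[] = [
        'shapeleng' => (string) $length[0],
        'wkt'       => 'MULTILINESTRING(' . implode(', ', $lines) . ')',
    ];
}
print_r($rows);
```

Each element of $rows then corresponds to one database row; passing 'wkt' and 'shapeleng' to the INSERT as bound parameters avoids building the SQL string by hand.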

Related

Grab Lists from Database then Grab Top Tags In Each List

So, let's say I have a database where users can add "tags" that PHP turns into a comma-separated list. The user puts in 'Orange, Peppers, Biscuits Onions Grapes' and it turns into 'orange,peppers,biscuits,onions,grapes'. I'm pretty sure that part will be easy enough, and I don't need help there. These "tags" are then stored in the SQL database.
$individuallist = $databaserow['database_list'];
$arrayoflist = explode(',', $individuallist );
foreach($arrayoflist as $individualtag) {
//Display Tags
}
So, good, I can grab these tags and use them for the specific item they relate to and I can take the list and turn it into an array and foreach them to display each individual one.
However, I need to take all the lists in the database and add them together. For example:
while($databaserow = mysql_fetch_assoc($databaseresult)) {
$database_array[] = $databaserow ['database_list'];
}
So these two example lists will be combined into an array
// The Two Lists
// 'orange,peppers,biscuits,onions,grapes'
// 'peppers,orange,market,turkey,juice'
$database_full_list = implode(',', $database_array);
// The Full List
// 'orange,peppers,biscuits,onions,grapes,peppers,orange,market,turkey,juice'
Now that I have the full list of tags, I need to count to see which Tags are the Top 30. The idea is that as more tags are added to the database, the Top 30 Tags would be listed in order of how many there are of them.
Orange (2)
Peppers (2)
Biscuits (1)
Market (1)
etc.
I don't know how to do this part of the coding.

$split_tags = explode(',', $database_full_list);
$count_tags = array();
foreach ($split_tags as $tag) {
    if (!array_key_exists($tag, $count_tags)) {
        $count_tags[$tag] = 0;
    }
    $count_tags[$tag]++;
}
// sort by count, highest first, keeping the tag names as keys
arsort($count_tags);
// show only the top 30
foreach (array_slice($count_tags, 0, 30, true) as $tag => $count) {
    echo "$tag ($count)<br/>";
}
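The same counting can also be done with PHP's built-in array_count_values(). The following is only a sketch: topTags() is a hypothetical helper name, and the trimming and lower-casing of each tag are assumptions about how the stored lists should be normalised.

```php
<?php
// Count tags across several comma-separated lists and return the top $limit.
function topTags(array $lists, int $limit = 30): array
{
    $tags = [];
    foreach ($lists as $list) {
        foreach (explode(',', $list) as $tag) {
            $tag = strtolower(trim($tag));
            if ($tag !== '') {
                $tags[] = $tag;
            }
        }
    }
    $counts = array_count_values($tags); // tag => number of occurrences
    arsort($counts);                     // highest count first, keys preserved
    return array_slice($counts, 0, $limit, true);
}

$top = topTags([
    'orange,peppers,biscuits,onions,grapes',
    'peppers,orange,market,turkey,juice',
]);
foreach ($top as $tag => $count) {
    echo ucfirst($tag) . " ($count)\n";
}
```

Tags with the same count come out in no particular order relative to each other; add a secondary sort on the tag name if a stable listing matters.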

PHP: check user input against text file?

I have the following street names and house numbers in a text file:
Albert Dr: 4116-4230, 4510, 4513-4516
Bergundy Pl: 1300, 1340-1450
David Ln: 3400, 4918, 4928, 4825
Garfield Av: 5000, 5002, 5004, 5006, 8619-8627, 9104-9113
....
This data represents the boundary data for a local neighborhood (i.e., what houses are inside the community).
I want to make a PHP script that will take a user's input (in the form of something like "4918 David Lane" or "3000 Bergundy") search this list, and return a yes/no response whether that house exists within the boundaries.
What would be an efficient way to parse the input (regex?) and compare it to the text list?
Thanks for the help!

It's better to store this info in a database so that you don't have to parse the data out of a text file. Regexes are also not well suited to checking whether a number falls within a range, so general-purpose code is advisable for that part as well.
But... if you want to do it with regexes (and see why it's not a good idea):
To look up the numbers for a street, use
David Ln:(.*)
To then get the numbers use
[^,]*

You could simply read the file into a string. Then break it into an array with one element per line, so Array(Line 1 => array(), Line 2 => array(), etc.). After that, you can explode each line on ":". Finally, you simply need to search the array. It's not the fastest way, but it may be faster than a regex.
You should seriously consider using a database or rethinking how your file is structured.

Try something like this: put your street names inside test.txt. Once you have the details from the text file in an array, just compare them with the values submitted in your form.
$filename = 'test.txt';
$name = array();
if (file_exists($filename) && ($handle = fopen($filename, 'r'))) {
    while (($line = fgets($handle)) !== false) {
        // split "Street: numbers" at the colon; skip lines that don't match
        if (!preg_match('#(.*):(.*)#', $line, $match)) {
            continue;
        }
        foreach (explode(',', $match[2]) as $val) {
            // trim surrounding spaces and any trailing line break
            $name[trim($match[1])][] = trim($val);
        }
    }
    fclose($handle);
}
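For the comparison step itself, here is a minimal sketch that keeps each entry as a [min, max] pair and checks the user's input against it. parseBoundaries() and isInBoundary() are hypothetical helper names, and matching on the first word of the street name is an assumption about how loosely the input should be matched.

```php
<?php
// Parse "Street: n, a-b, ..." lines into [street => [[min, max], ...]].
function parseBoundaries(array $lines): array
{
    $streets = [];
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip lines without a street name
        }
        [$street, $numbers] = explode(':', $line, 2);
        foreach (explode(',', $numbers) as $item) {
            $bounds = array_map('intval', explode('-', trim($item)));
            // a single number is stored as a range of one
            $streets[strtolower(trim($street))][] = [$bounds[0], $bounds[1] ?? $bounds[0]];
        }
    }
    return $streets;
}

// Return true if input such as "4918 David Lane" falls inside a listed range.
function isInBoundary(array $streets, string $input): bool
{
    if (!preg_match('/^\s*(\d+)\s+([a-z]+)/i', $input, $m)) {
        return false; // expected "<number> <street name>"
    }
    $number = (int) $m[1];
    $word = strtolower($m[2]);
    foreach ($streets as $street => $ranges) {
        // assumption: the first word of the street name identifies the street
        if (strpos($street, $word) !== 0) {
            continue;
        }
        foreach ($ranges as [$min, $max]) {
            if ($number >= $min && $number <= $max) {
                return true;
            }
        }
    }
    return false;
}

$streets = parseBoundaries([
    'Albert Dr: 4116-4230, 4510, 4513-4516',
    'Bergundy Pl: 1300, 1340-1450',
    'David Ln: 3400, 4918, 4928, 4825',
]);
var_dump(isInBoundary($streets, '4918 David Lane')); // bool(true)
var_dump(isInBoundary($streets, '3000 Bergundy'));   // bool(false)
```

In practice the lines would come from file('test.txt', FILE_IGNORE_NEW_LINES) rather than a hard-coded array.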

As mentioned, using a database to store the street numbers related to your street names would be ideal. With your text file, though, I think you could implement this by creating a 2D array: the street names in the outer array and the valid street numbers in their respective inner arrays.
Parse the file line by line in a loop. Parse the street name and store it in the array, then use a nested loop to parse all of the numbers (for ones in a range like 1414-1420, you can use an additional loop to get each number in the range) and build the inner array under that street name. When you have your 2D array, you can do a simple nested loop to check it for a match.
A rough sketch (the file name test.txt is only a placeholder):
$addresses = array();
foreach (file('test.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
    if (strpos($line, ':') === false) {
        continue; // not a "Street: numbers" line
    }
    list($street, $numbers) = explode(':', $line, 2);
    $street = trim($street);
    foreach (explode(',', $numbers) as $item) {
        $bounds = explode('-', trim($item));
        // a range like 1414-1420 is expanded into each number it covers
        $last = isset($bounds[1]) ? $bounds[1] : $bounds[0];
        foreach (range((int) $bounds[0], (int) $last) as $num) {
            $addresses[$street][] = $num;
        }
    }
}

It's better to store your streets in a separate table with IDs, and to store the numbers in another table, with one row per range or single number, keyed by street ID.
For example:
streets:
ID, street
-----------
1, Albert Dr
2, Bergundy Pl
3, David Ln
4, Garfield Av
...
houses:
street_id, house_min, house_max
-----------------
1, 4116, 4230
1, 4510, 4510
1, 4513, 4516
2, 1300, 1300
2, 1340, 1450
...
In rows that hold a single house number rather than a range, set both min and max to the same value.
You can write a script that parses your txt file and saves all the data to the database. That should take no more than a few loops, explode() with different delimiters, and some INSERT queries.
Then, with the first query, you get the street ID:
SELECT id FROM streets WHERE street LIKE '%[street name]%'
After that, you run a second query to find out whether that house number exists on that street:
SELECT COUNT(*)
FROM houses
WHERE street_id = [street_id]
AND [house_num] BETWEEN house_min AND house_max
Replace each [...] with the real value, and don't forget to escape the values (or use prepared statements) to prevent SQL injection.
You can even run just one query using a JOIN.
You should also make sure that the given house number is an integer, not a float.
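As an illustration of the single-query variant, here is a sketch using PDO with a prepared statement. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in for the real one; houseExists() is a hypothetical helper name, and only a few sample rows are loaded.

```php
<?php
// Assumption: pdo_sqlite is installed; the in-memory database is a stand-in.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE streets (id INTEGER PRIMARY KEY, street TEXT)');
$pdo->exec('CREATE TABLE houses (street_id INTEGER, house_min INTEGER, house_max INTEGER)');
$pdo->exec("INSERT INTO streets VALUES (1, 'Albert Dr'), (3, 'David Ln')");
$pdo->exec('INSERT INTO houses VALUES (1, 4116, 4230), (1, 4510, 4510), (3, 4918, 4918)');

// One JOIN query: does any range on a matching street contain the number?
function houseExists(PDO $pdo, string $street, int $number): bool
{
    $stmt = $pdo->prepare(
        'SELECT COUNT(*)
           FROM houses h
           JOIN streets s ON s.id = h.street_id
          WHERE s.street LIKE :street
            AND :num BETWEEN h.house_min AND h.house_max'
    );
    // bound parameters take care of escaping
    $stmt->bindValue(':street', '%' . $street . '%', PDO::PARAM_STR);
    $stmt->bindValue(':num', $number, PDO::PARAM_INT);
    $stmt->execute();
    return (int) $stmt->fetchColumn() > 0;
}

var_dump(houseExists($pdo, 'David', 4918));  // bool(true)
var_dump(houseExists($pdo, 'Albert', 4300)); // bool(false)
```

Declaring $number as int in the function signature also covers the "integer, not float" check, provided the caller validates the raw input first (for example with ctype_digit()).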

Limiting XML/HTML string length

So I am trying to parse an XML file and display the first 150 words of an article with a READ MORE link. It doesn't take exactly 150 words, though. I am also not sure how to make it skip IMG tag code, etc. The code is below:
// Script displays 3 most recent blog posts from blog.pinchit.com (blog.pinchit.com/api/read)
// The entries on homepage show the first 150 words of description and "READ MORE" link
// PART 1 - PARSING
// if it was a JSON file
// $string=file_get_contents("http://blog.pinchit.com/api/read");
// $json_a=json_decode($string,true);
// var_export($json_a);
// XML Parsing
$file = "http://blog.pinchit.com/api/read";
$posts_to_display = 3;
$posts = array();
// get all the file nodes
if(!$xml=simplexml_load_file($file)){
trigger_error('Error reading XML file',E_USER_ERROR);
}
// counter for posts member array
$counter = 0;
// Accessing elements within an XML document that contain characters not permitted under PHP's naming convention
// (e.g. the hyphen) can be accomplished by encapsulating the element name within braces and the apostrophe.
foreach($xml->posts->post as $post){
//post's title
$posts[$counter]['title'] = $post->{'regular-title'};
// post's full body
$posts[$counter]['body'] = $post->{'regular-body'};
// post's body's first 150 words
//for some reason, I am not sure if it's exactly 150
$posts[$counter]['preview'] = substr($posts[$counter]['body'], 0, 150);
//strip all the html tags so it doesn't mess up the page
$posts[$counter]['preview'] = strip_tags($posts[$counter]['preview']);
//post's id
$posts[$counter]['id'] = $post->attributes()->id;
$posts_to_display--;
$counter++;
//exit the for loop after we parse out all the articles that we want
if ($posts_to_display == 0 ) break;
}
// Displays all of the posts
foreach($posts as $post){
echo "<b>" . $post['title'] . "</b>";
echo "<br/>";
echo $post['preview'];
echo " <a href='http://blog.pinchit.com/post/" . $post['id'] . "'>Read More</a>";
echo "<br/><br/>";
}
Here is how the results look now.
Editor's Pick: Club Sportiva
Nothing makes you feel as totally free and in control as a day behind the wheel of a sleek, sophisticated, sexy sports car. It’s no surprise Read More
Pinchy Drinks & Rocks: The Hotel Utah Saloon
Hotel Utah Read More
Monday Menu: Spicy Grapefruit, Paprika, Creamsicles
Feeling summery and savory today, and we have to admit it took a lot to resist the urge to make this an all appetizers, all desserts, or all drinks Read More

The HTML tags are counting against your character total. Strip the tags out first, then take your preview sample:
$preview = strip_tags($posts[$counter]['body']);
$posts[$counter]['preview'] = substr($preview, 0, 150).'...';
Also, one usually adds an ellipsis ("...") to the end of truncated text to indicate that it continues.
Note that this has the potential disadvantage of removing tags you DO want, like <p> and <br>. If you want to preserve those, you can pass them as the second argument for strip_tags:
$preview = strip_tags($posts[$counter]['body'], '<br><p>');
$posts[$counter]['preview'] = substr($preview, 0, 150).'...';
BUT, be forewarned that XML-style tags might throw this off (<br />). If you're dealing with XML/HTML mixed, you might need to elevate your tag filtering using something like htmLawed, but the concept remains the same - get rid of the HTML before you truncate.
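Note that substr() counts characters, while the question asks for 150 words. A minimal word-based sketch follows; previewWords() is a hypothetical helper name, and splitting on whitespace is an assumption about what counts as a word.

```php
<?php
// Return the first $limit words of $html with all tags removed.
function previewWords(string $html, int $limit = 150): string
{
    $text = strip_tags($html);
    // split on any run of whitespace, dropping empty pieces
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) <= $limit) {
        return implode(' ', $words); // short enough, no ellipsis needed
    }
    return implode(' ', array_slice($words, 0, $limit)) . '...';
}

echo previewWords('<p>Hotel Utah <img src="a.png"> is a <b>saloon</b> in San Francisco.</p>', 4);
// prints: Hotel Utah is a...
```

One caveat: strip_tags() inserts nothing where a tag was, so "<p>one</p><p>two</p>" becomes "onetwo". If the body has back-to-back block tags, replacing "<" with " <" before stripping is one way to keep the words apart.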

Looking at the tag <regular-body> it seems to contain HTML. Therefore I would recommend trying to parse that into a DOMDocument ( http://www.php.net/manual/en/domdocument.loadhtml.php ). You then would be able to loop through all the items and ignore certain tags (ex. ignore <img> but keep <p>). After that, you can then render out what you want and truncate it to 150 characters.
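A rough sketch of that DOMDocument approach is below. previewFromHtml() and its $skipTags parameter are hypothetical names, and the '<?xml encoding="UTF-8">' prefix is a commonly used workaround (not an official option) to make loadHTML() treat the input as UTF-8.

```php
<?php
// Plain-text preview that drops unwanted elements before truncating.
function previewFromHtml(string $html, int $maxChars = 150, array $skipTags = ['img', 'script', 'style']): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate sloppy real-world markup
    // common workaround so that loadHTML() reads the input as UTF-8
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();

    foreach ($skipTags as $tag) {
        // copy the live node list first, then remove each matching element
        foreach (iterator_to_array($doc->getElementsByTagName($tag)) as $node) {
            $node->parentNode->removeChild($node);
        }
    }

    // collapse whitespace left behind by the removed markup
    $text = trim(preg_replace('/\s+/u', ' ', $doc->documentElement->textContent));

    // take at most $maxChars characters (UTF-8 aware, no mbstring needed)
    preg_match('/^.{0,' . $maxChars . '}/us', $text, $m);
    return $m[0] === $text ? $text : $m[0] . '...';
}

echo previewFromHtml('<p>Hello <img src="a.png" alt="x"> world</p><script>var a = 1;</script>');
// prints: Hello world
```

Unlike strip_tags(), this also discards the contents of elements such as <script>, which would otherwise leak into the preview.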

PHP/HTML - Multiple page screen scrape, export to .txt with commas between dates and values

I am attempting to scrape the web page (see code) - as well as those pages going back in time (you can see the date '20110509' in the page itself) - for simple numerical strings. I can't seem to figure out through much trial and error (I'm new to programming) how to parse the specific data in the table that I want. I have been trying to use simple PHP/HTML without curl or other such things. Is this possible? I think my main issue is using the delimiters that are necessary to get the data from the source code.
What I'd like is for the program to start at the very first page it can, say for example '20050101', and scan through each page till the current date, grabbing the specific data for example, the "latest close" (column), "closing arm" (row), and have that value for the corresponding date exported to a single .txt file, with the date being separated from the value with a comma. Each time the program is run, the date/value should be appended to the existing text file.
I am aware many lines of the code below are junk, it's part of my learning process.
<html>
<title>HTML with PHP</title>
<body>
<?php
$rawdata = file_get_contents('http://online.wsj.com/mdc/public/page/2_3021-tradingdiary2-20110509.html?mod=mdc_pastcalendar');
//$data = substr(' ', $data);
//$begindate = '20050101';
//$newlines = array("\t","\n","\r","\x20\x20","\0","\x0B");
//if (preg_match(' <td class="text"> ' , $data , $content)) {
//$content = str_replace($newlines
echo $rawdata;
///file_put_contents( 'NYSETRIN.html' , $content , FILE_APPEND);
?>
<b>some more html</b>
<?php
?>
</body>
</html>

All right so let's do this. We're going to first load the data into an HTML parser, then create an XPath parser out of it. XPath will help us navigate around the HTML easily. So:
$date = "20110509";
$data = file_get_contents("http://online.wsj.com/mdc/public/page/2_3021-tradingdiary2-{$date}.html?mod=mdc_pastcalendar");
$doc = new DOMDocument();
@$doc->loadHTML($data);
$xpath = new DOMXpath($doc);
Now then we need to grab some data. First off let's get all the data tables. Looking at the source, these tables are indicated by a class of mdcTable:
$result = $xpath->query("//table[@class='mdcTable']");
echo "Tables found: {$result->length}\n";
So far:
$ php test.php
Tables found: 5
Okay, so we have the tables. Now we need to get a specific column. So let's use the latest close column you mentioned:
$result = $xpath->query("//table[@class='mdcTable']/*/td[contains(.,'Latest close')]");
foreach($result as $td) {
echo "Column contains: {$td->nodeValue}\n";
}
The result so far:
$ php test.php
Column contains: Latest close
Column contains: Latest close
Column contains: Latest close
... etc ...
Now we need the column index for getting the specific column for the specific row. We do this by counting all of the previous sibling elements, then adding one. This is because element index selectors are 1 indexed, not 0 indexed:
$result = $xpath->query("//table[@class='mdcTable']/*/td[contains(.,'Latest close')]");
$column_position = $xpath->query('preceding-sibling::*', $result->item(0))->length + 1;
echo "Position is: $column_position\n";
Result is:
$ php test.php
Position is: 2
Now we need to get our specific row:
$data_row = $xpath->query("//table[@class='mdcTable']/*/td[starts-with(.,'Closing Arms')]");
echo "Returned {$data_row->length} row(s)\n";
Here we use starts-with, since the row label has a utf-8 symbol in it. This makes it easier. Result so far:
$ php test.php
Returned 4 row(s)
Now we need to use the column index to get the data we want:
$data_row = $xpath->query("//table[@class='mdcTable']/*/td[starts-with(.,'Closing Arms')]/../*[$column_position]");
foreach($data_row as $row) {
echo "{$date},{$row->nodeValue}\n";
}
Result is:
$ php test.php
20110509,1.26
20110509,1.40
20110509,0.32
20110509,1.01
Which can now be written to a file. Now, we don't have the markets these apply to, so let's go ahead and grab those:
$headings = array();
$market_headings = $xpath->query("//table[@class='mdcTable']/*/td[@class='colhead'][1]");
foreach($market_headings as $market_heading) {
$headings[] = $market_heading->nodeValue;
}
Now we can use a counter to reference which market we're on:
$data_row = $xpath->query("//table[@class='mdcTable']/*/td[starts-with(.,'Closing Arms')]/../*[$column_position]");
$i = 0;
foreach($data_row as $row) {
echo "{$date},{$headings[$i]},{$row->nodeValue}\n";
$i++;
}
The output being:
$ php test.php
20110509,NYSE,1.26
20110509,Nasdaq,1.40
20110509,NYSE Amex,0.32
20110509,NYSE Arca,1.01
Now for your part:
This can be made into a function that takes a date.
You'll need code to write out the file. Check out the filesystem functions for hints.
This can be made extensible to use different columns and different rows.
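As a starting point for those three items, here is one possible sketch. extractRow() and appendLines() are hypothetical helper names, and the small HTML sample only imitates the table layout described above; the real page markup may differ.

```php
<?php
// Collect "date,market,value" lines for one row label and one column label.
// Note: the labels are placed into the XPath as-is, so they must not contain quotes.
function extractRow(string $html, string $date, string $rowLabel, string $columnLabel): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // real pages rarely validate cleanly
    $doc->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($doc);

    $lines = [];
    foreach ($xpath->query("//table[@class='mdcTable']") as $table) {
        $header = $xpath->query(".//td[contains(., '$columnLabel')]", $table)->item(0);
        $label  = $xpath->query(".//td[starts-with(normalize-space(.), '$rowLabel')]", $table)->item(0);
        $market = $xpath->query(".//td[@class='colhead'][1]", $table)->item(0);
        if (!$header || !$label || !$market) {
            continue; // this table has no such row or column
        }
        // XPath positions are 1-based: preceding siblings + 1
        $position = $xpath->query('preceding-sibling::*', $header)->length + 1;
        $cell = $xpath->query("../*[$position]", $label)->item(0);
        if ($cell) {
            $lines[] = $date . ',' . trim($market->nodeValue) . ',' . trim($cell->nodeValue);
        }
    }
    return $lines;
}

// Append the lines to the output file, one per line.
function appendLines(string $file, array $lines): void
{
    if ($lines) {
        file_put_contents($file, implode("\n", $lines) . "\n", FILE_APPEND | LOCK_EX);
    }
}

// Made-up sample markup standing in for one downloaded page.
$html = <<<HTML
<table class="mdcTable">
<tr><td class="colhead">NYSE</td><td class="colhead">Latest close</td><td class="colhead">Previous close</td></tr>
<tr><td>Closing Arms</td><td>1.26</td><td>0.98</td></tr>
</table>
<table class="mdcTable">
<tr><td class="colhead">Nasdaq</td><td class="colhead">Latest close</td><td class="colhead">Previous close</td></tr>
<tr><td>Closing Arms</td><td>1.40</td><td>1.10</td></tr>
</table>
HTML;

$file = tempnam(sys_get_temp_dir(), 'arms');
$lines = extractRow($html, '20110509', 'Closing Arms', 'Latest close');
appendLines($file, $lines);
echo file_get_contents($file);
```

For the real pages, $html would come from file_get_contents() with the URL built from the date as shown earlier, called once per date in a loop (for example over a DatePeriod, formatting each day with 'Ymd').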

I'd recommend using the HTML Agility Pack; it's an HTML parser which is very handy for finding particular content within an HTML document.

SimpleXML problem, wrapping a list of items in one <tag>

Problem: I want to get all locations from a database and encase them all in the <loc> tag, the following code puts each location in a <loc> tag, leading to several <loc> when I only need one. I know it's doing this because it's inside a loop (the simpleXML part), but have no idea how to solve it.
if($r7){ //$r7 = If query was successful..
while($row = mysqli_fetch_array($r7, MYSQLI_ASSOC)){
$convXML_from_loc = $convXML_from->addChild('loc',$row['location']);
}
}
If I take it out of the loop, it just puts the first location in the database there (iirc).
The alternative to this is just echo "<xml>"; which I thought was bad practice, because nothing would have parent and child elements, and everything would be on the same level.
I would appreciate any guidance on this issue, as well as links to any relevant information on this subject.
Regards.
EDIT: If it was unclear, I need to put them all within a single loc tag, like <loc>Row 1, Row 2</loc>. At the moment it's giving <loc>Row 1</loc><loc>Row 2</loc>.

You have a while loop that iterates over every row. You're going to do whatever is in the loop for as many times as there are rows. So you will be making as many <loc> elements as there are rows, one for each row.
The solution is to generate a string that contains all of the row data inside of the while loop, and then add the <loc> element with that string outside of the loop. This causes addChild() to be called only one time, thus creating only one <loc> element.
$location_data = '';
if($r7){ //$r7 = If query was successful..
while($row = mysqli_fetch_array($r7, MYSQLI_ASSOC)){
$location_data .= $row['location'] . ', ';
}
}
$location_data = substr($location_data, 0, -2); // strip off the final comma and space
$convXML_from_loc = $convXML_from->addChild('loc',$location_data); // will result in '<loc>Row 1, Row 2</loc>'

You mean something that will generate the below XML?
<loc>
<location>row 1</location>
<location>row 2</location>
<location>row 3</location>
</loc>
Try:
if($r7){ //$r7 = If query was successful..
$loc = $convXML_from->addChild('loc'); // Add the <loc> tag
while($row = mysqli_fetch_array($r7, MYSQLI_ASSOC)){
$loc->addChild('location',$row['location']); // Add a <location> with value
}
}
Given your further explanation, try:
if($r7){ //$r7 = If query was successful...
$locations = array();
while($row = mysqli_fetch_array($r7, MYSQLI_ASSOC)){
$locations[] = $row['location'];
}
$convXML_from->addChild('loc', implode(', ', $locations)); // Add the <loc> tag
}
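Here is a self-contained version of that last variant which can be run without a database. The $rows array is a made-up stand-in for the mysqli result set, and '<from/>' is an assumed root element. The escaping step is there because addChild() does not escape "&" in the value it is given.

```php
<?php
// Hypothetical rows standing in for the mysqli result set.
$rows = [
    ['location' => 'Row 1'],
    ['location' => 'Row 2'],
    ['location' => 'Fish & Chips'],
];

$xml = new SimpleXMLElement('<from/>');

// collect every location first, then add a single <loc> element
$locations = array_column($rows, 'location');

// addChild() leaves "&" unescaped, so escape the joined text beforehand
$xml->addChild('loc', htmlspecialchars(implode(', ', $locations), ENT_XML1));

echo $xml->asXML();
```

The output contains exactly one <loc> element holding "Row 1, Row 2, Fish &amp; Chips".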
