I have a requirement to insert a string between two markers.
Initially I get a string (from a file stored on the server) between #DATA# and #END# using:
function getStringBetweenStrings($string,$start,$end){
$startsAt=strpos($string,$start)+strlen($start);
$endsAt=strpos($string,$end, $startsAt);
return substr($string,$startsAt,$endsAt-$startsAt);
}
I do some processing and based on the details of the string, query for some records. If there are records I need to be able to append them at the end of the string and then re-insert the string between #DATA# and #END# within the file on the server.
How can I best achieve this?
Is it possible to insert a record at a time in the file before #END# or is it best to manipulate the string on the server and just re-insert over the existing string in the file on the server?
Example of Data:
AGENT_REF^ADDRESS_1^ADDRESS_2^ADDRESS_3^ADDRESS_4^TOWN^POSTCODE1^POSTCODE2^SUMMARY^DESCRIPTION^BRANCH_ID^STATUS_ID^BEDROOMS^PRICE^PROP_SUB_ID^CREATE_DATE^UPDATE_DATE^DISPLAY_ADDRESS^PUBLISHED_FLAG^LET_RENT_FREQUENCY^TRANS_TYPE_ID^NEW_HOME_FLAG^MEDIA_IMAGE_00^MEDIA_IMAGE_TEXT_00^MEDIA_IMAGE_01^MEDIA_IMAGE_TEXT_01^~
#DATA#
//Property records would appear here and match the string above, each field separated with ^ and terminating with ~
//Once the end of data has been reached, it will be fully terminated with:
#END#
When I check for new properties, I do the following:
Get all existing properties between #DATA# and #END#
Get the IDs of the properties and query for new properties which don't match these IDs
I then need to re-insert the new properties before #END# but after the last property in the file.
The structure of the file is a Rightmove BLM file.
Just do an str_replace() of the old data with the new:
$str = str_replace('#DATA#'.$oldstr.'#END#', '#DATA#'.$newstr.'#END#', $str);
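If only new records need to be appended, another option is to locate the closing marker and insert the new rows just before it, so the old data never has to be matched exactly. The following is a minimal sketch under that assumption; the helper name appendBeforeEnd and the sample records are made up for illustration:

```php
<?php
// Hypothetical helper: insert $newRecords immediately before the last end marker.
function appendBeforeEnd(string $contents, string $newRecords, string $endMarker = '#END#'): string
{
    // strrpos finds the last occurrence, in case the marker text also appears earlier.
    $endPos = strrpos($contents, $endMarker);
    if ($endPos === false) {
        throw new InvalidArgumentException("Marker {$endMarker} not found");
    }
    // A length of 0 makes substr_replace insert without overwriting anything.
    return substr_replace($contents, $newRecords, $endPos, 0);
}

// Illustrative data only; real records would follow the BLM field list.
$contents = "#DATA#\nA1^Old Street^~\n#END#\n";
$updated  = appendBeforeEnd($contents, "A2^New Street^~\n");
echo $updated;
```

To persist the change, the file can be read with file_get_contents(), passed through the helper, and written back with file_put_contents(), ideally with the LOCK_EX flag so two runs do not overwrite each other.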
I would extract the data in 3 steps:
1) Extract the data from the file:
<?php
preg_match("/#DATA#(.+)#END#/s", $string, $data);
?>
2) Extract each row of data:
<?php
preg_match_all("/((?:.+\^){2,})~/", $data[1], $rows, PREG_PATTERN_ORDER);
// The rows with data will be stored in $rows[1]
?>
3) Manipulate the data in each row or add new rows:
<?php
//Add
// Add new row to the end of the array of rows
// (each captured row ends with '^', so the new row gets one too)
$rows[1][] = implode('^', $newRowArray) . '^';
//Use
// Creates an array with all the data from the row '0'
$rowData = preg_split("/\^/", $rows[1][0], -1, PREG_SPLIT_NO_EMPTY);
//Save the changes
//$newData should be all the rows together (with the '~' at the end of each row)
//$string is the original string with all the information
// '${1}' and '${2}' are the captured markers; "\1" in double quotes would be an octal escape
$file = preg_replace("/(#DATA#\r?\n).+(\r?\n#END#)/s", '${1}' . $newData . '${2}', $string);
?>
I hope this helps with your problem.
Related
I am importing live data from an XML feed into my live WordPress website using WP-ALL-IMPORT, and I have a problem.
I need to import the location for my post, but the XML gives the coordinates as a single string (not longitude and latitude separately), like below:
<geopoints>55.25424242,25.15498337</geopoints>
So how do I extract the value before and the value after the comma with str_replace,
or is there any other way to do this?
If you have the value of the geopoints tag,
then you can use explode:
$myGeoPointFromXml = '55.25424242,25.15498337';
$myRes = explode(',', $myGeoPointFromXml); // returns an array with one element per comma-separated value
echo $myRes[0]; // shows 55.25424242
echo $myRes[1]; // shows 25.15498337
Hello everyone,
I want to increment the integer part of a string. For example, I have two text fields, one called series start and the other called series end. If the user enters
FAHD1000001 into the series start field
AND
FAHD1000100 into the series end field
the algorithm should store 100 values in the database, incrementing the number for each entry, i.e.
FAHD1000001, FAHD1000002, FAHD1000003, ........., FAHD1000100
Is it possible to do so, and if yes, how? Please help me.
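You could do something like this, but there is a loophole: if the name part contains a digit, as in F1AHD00001, it will not work, and any leading zeros in the number are lost.
$str1 = $request['start'];
$str2 = $request['end'];
// Keep only the digits of each value
$startint = (int) preg_replace("/[^0-9]/", "", $str1);
$endint = (int) preg_replace("/[^0-9]/", "", $str2);
// Keep only the non-digit part as the prefix
$words = preg_replace('/[[:digit:]]/', '', $str1);
// Use <= so that the end value itself is included
for ($i = $startint; $i <= $endint; $i++) {
    $newstring = $words . $i;
    // Save this new string
}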
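One way around the loophole mentioned above is to treat only the trailing run of digits as the counter and everything before it as the prefix. The sketch below assumes both codes share the same prefix and that the counter should keep its original width; the function name expandSeries is hypothetical:

```php
<?php
// Hypothetical helper: expand a start/end pair into every code in the series.
function expandSeries(string $start, string $end): array
{
    // Split each code into a prefix and its trailing run of digits.
    if (!preg_match('/^(.*?)(\d+)$/', $start, $s) || !preg_match('/^(.*?)(\d+)$/', $end, $e)) {
        throw new InvalidArgumentException('Both codes must end in digits');
    }
    if ($s[1] !== $e[1]) {
        throw new InvalidArgumentException('Prefixes differ');
    }

    $width = strlen($s[2]); // keep leading zeros by padding back to this width
    $codes = [];
    for ($i = (int) $s[2]; $i <= (int) $e[2]; $i++) {
        $codes[] = $s[1] . str_pad((string) $i, $width, '0', STR_PAD_LEFT);
    }
    return $codes;
}

$codes = expandSeries('FAHD1000001', 'FAHD1000100');
echo count($codes), "\n";
echo $codes[0], ' ... ', $codes[99], "\n";
```

Each element of $codes can then be saved to the database inside a loop or as one batched insert.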
I have extracted records from a database and stored them on an HTML page with only text. Each record is stored in a <p> paragraph field and separated by a line break <br /> and a line <hr>.
For example:
Company Name<br/>
555-555-555<br />
Address Line 1<br />
Address Line 2<br />
Website: www.example.com<br />
I just need to place these records into a CSV file. I used fputcsv in combination with array() and file_get_contents(), but it wrote the entire source code of the webpage into the .csv file, and a lot of data was missing as well. There are multiple records stored in the same format. After each record block, as seen above, there is an <hr> tag as a separator. I want to read the company name into the Name column, the phone number into the Phone column, the addresses into the Address column and the website into the Website column, as shown below.
http://i.stack.imgur.com/00Gxw.png
How can i do this?
Snippet of the HTML:
1 Stop Signs<br />
480-961-7446<br />
500 N. 56th Street<br />
Chandler, AZ 85226<br />
<br />
Website: www.1stopsigns.com<br />
<br />
</p><br /><hr><br />
It's spaced like this in the source of the HTML.
Assuming that your data follows a pattern where every record is separated by a <hr> tag and every field within is separated by a <br /> then you should be able to split out the data.
There are loads of ways to do this, but a naive way that might work using explode() might be something like:
// open a file pointer to csv
$fp = fopen('records.csv', 'w');
// first, split each record into a separate array element
$records = explode('<hr>', $str);
// then iterate over this array
foreach ($records as $record) {
// strip tags and trim enclosing whitespace
$stripped = trim(strip_tags($record));
// explode by end-of-line
$fields = explode(PHP_EOL, $stripped);
// array walk over each field and trim whitespace
array_walk($fields, function(&$field) {
$field = trim($field);
});
// create row
$row = array(
$fields[0], // name
$fields[1], // phone
sprintf('%s, %s', $fields[2], $fields[3]), // address
$fields[6], // web
);
// write cleaned array of fields to csv
fputcsv($fp, $row);
}
// done
fclose($fp);
Where $str is the page data you are parsing. Hope this helps.
EDIT
Didn't notice the specific field requirements originally. Updated the example.
Assuming the HTML shown above is well formed, my approach to this problem would be in two phases.
First, clean up the HTML a little so that it is easier to export or manage the information. Keep the items you want to save and delete those you know you will not need.
$html = preg_replace("|\s{2,}|si"," ",$html); // collapse unnecessary whitespace
$html = preg_replace("|\n{2,}|si","\n",$html); // convert repeated line breaks into one
$html = preg_replace("|<br />|si","##",$html); // replace <br /> tags with a ## marker
Then you'll have cleaner HTML to work with, similar to this:
1 Stop Signs##
480-961-7446##
500 N. 56th Street##
Chandler, AZ 85226##
Website: www.1stopsigns.com##
##
</p>##<hr>##
Second, you can now explode the fields, or implode them into comma-separated values to form a CSV:
// the fields to work with end up in the array called $csv_parts
$csv_parts = explode("##",$html);
// imploding gives you the formatted CSV, similar to 1 Stop Signs,480-961-7446,..
$csv = implode(",",$csv_parts);
Now you have two ways to work with the HTML: extracting the fields or exporting the CSV.
I hope this helps or gives you an idea for developing what you need.
By far the easiest way would be to simply take the block, drop everything from the <hr> tag forward then split the string as a string array on the <br /> tags.
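A minimal sketch of that idea follows. It assumes every record has a name, a phone number, exactly two address lines and a "Website:" line, in that order; the helper name htmlToRows is hypothetical, and the sample HTML is the snippet from the question wrapped in an opening <p> tag:

```php
<?php
// Hypothetical helper: turn <hr>-separated blocks into rows of [name, phone, address, website].
function htmlToRows(string $html): array
{
    $rows = [];
    foreach (preg_split('/<hr\s*\/?>/i', $html) as $block) {
        // Split on <br>, strip any remaining tags, trim, and drop empty pieces.
        $fields = preg_split('/<br\s*\/?>/i', $block);
        $fields = array_map(fn ($f) => trim(strip_tags($f)), $fields);
        $fields = array_values(array_filter($fields, fn ($f) => $f !== ''));

        if (count($fields) < 5) {
            continue; // not a complete record, e.g. markup after the last <hr>
        }
        $rows[] = [
            $fields[0],                                       // name
            $fields[1],                                       // phone
            $fields[2] . ', ' . $fields[3],                   // address
            preg_replace('/^Website:\s*/i', '', $fields[4]),  // website without its label
        ];
    }
    return $rows;
}

$html = "<p>1 Stop Signs<br />\n480-961-7446<br />\n500 N. 56th Street<br />\n"
      . "Chandler, AZ 85226<br />\n<br />\nWebsite: www.1stopsigns.com<br />\n<br />\n"
      . "</p><br /><hr><br />";

// Write the rows as CSV; php://output is used here only to keep the example self-contained.
$out = fopen('php://output', 'w');
foreach (htmlToRows($html) as $row) {
    fputcsv($out, $row, ',', '"', '');
}
fclose($out);
```

Filtering out empty pieces means blank <br /> lines between fields do not shift the column positions, which is the main fragility of index-based approaches.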
I have some columns in my table; the descriptions column contains information like:
a1b01,Value 1,2,1,60|a1b01,Value2,1,1,50|b203c,Value 1,0,2,20
I need to update it with a SQL command.
I'll use a PHP function for the update, which should apply only if the first and second parameters exist together in the current records (in the descriptions column).
E.g. if the user wants to change the description entry that starts with a1b01,Value 1, I'll execute a SQL command like this:
function do_action ($code,$value,$new1,$new2,$newresult) {
UPDATE the_table SET descriptions = REPLACE(descriptions, $code.','.$value.'*', $code.','.$value.','.$new1.','.$new2.','.$newresult)";
}
The * (star) indicates that this part is unknown (this is why I need a regex).
My question is: how can I get the
a1b01,Value 1,2,1,60|
part from below string
a1b01,Value 1,2,1,60|a1b01,Value2,1,1,50|b203c,Value 1,0,2,20
via regex, with a1b01 and Value 1 passed in as parameters?
All I want is that when I call do_action like this:
do_action ("a1b01","Value 1",2,3,25);
the record becomes: a1b01,Value 1,2,3,25|a1b01,Value2,1,1,50|b203c,Value 1,0,2,20 (the first part is updated).
You don't necessarily need a regular expression to do this; since the data is all delimited, you could use the explode function.
So you could do as follows:
$descriptionArray = explode('|', $descriptions); // creates an array of blocks like a1b01,Value 1,2,1,60
// then explode each description on ,
for ($i = 0; $i < count($descriptionArray); $i++) {
    $parameters = explode(',', $descriptionArray[$i]);
    do_action($parameters[0], $parameters[1], $parameters[2], $parameters[3], $parameters[4]);
}
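Building on the explode idea, the matching entry can be rebuilt in PHP and the whole column value written back afterwards. This is only a sketch; the function name updateDescription is hypothetical, and it assumes that neither the code nor the value ever contains a comma or a pipe:

```php
<?php
// Hypothetical helper: replace the entry whose first two fields match $code and $value.
function updateDescription(string $descriptions, string $code, string $value, $new1, $new2, $newResult): string
{
    $entries = explode('|', $descriptions);
    foreach ($entries as $i => $entry) {
        $parts = explode(',', $entry);
        // Only touch entries where both the code and the value match exactly.
        if (count($parts) >= 2 && $parts[0] === $code && $parts[1] === $value) {
            $entries[$i] = implode(',', [$code, $value, $new1, $new2, $newResult]);
        }
    }
    return implode('|', $entries);
}

$descriptions = 'a1b01,Value 1,2,1,60|a1b01,Value2,1,1,50|b203c,Value 1,0,2,20';
echo updateDescription($descriptions, 'a1b01', 'Value 1', 2, 3, 25), "\n";
```

The returned string can then be stored with a plain UPDATE on the descriptions column, preferably through a prepared statement rather than string concatenation.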
I am attempting to scrape the web page (see code) - as well as those pages going back in time (you can see the date '20110509' in the page itself) - for simple numerical strings. I can't seem to figure out through much trial and error (I'm new to programming) how to parse the specific data in the table that I want. I have been trying to use simple PHP/HTML without curl or other such things. Is this possible? I think my main issue is using the delimiters that are necessary to get the data from the source code.
What I'd like is for the program to start at the very first page it can, say for example '20050101', and scan through each page till the current date, grabbing the specific data for example, the "latest close" (column), "closing arm" (row), and have that value for the corresponding date exported to a single .txt file, with the date being separated from the value with a comma. Each time the program is run, the date/value should be appended to the existing text file.
I am aware many lines of the code below are junk, it's part of my learning process.
<html>
<title>HTML with PHP</title>
<body>
<?php
$rawdata = file_get_contents('http://online.wsj.com/mdc/public/page/2_3021-tradingdiary2-20110509.html?mod=mdc_pastcalendar');
//$data = substr(' ', $data);
//$begindate = '20050101';
//$newlines = array("\t","\n","\r","\x20\x20","\0","\x0B");
//if (preg_match(' <td class="text"> ' , $data , $content)) {
//$content = str_replace($newlines
echo $rawdata;
///file_put_contents( 'NYSETRIN.html' , $content , FILE_APPEND);
?>
<b>some more html</b>
<?php
?>
</body>
</html>
All right so let's do this. We're going to first load the data into an HTML parser, then create an XPath parser out of it. XPath will help us navigate around the HTML easily. So:
$date = "20110509";
$data = file_get_contents("http://online.wsj.com/mdc/public/page/2_3021-tradingdiary2-{$date}.html?mod=mdc_pastcalendar");
$doc = new DOMDocument();
@$doc->loadHTML($data);
$xpath = new DOMXpath($doc);
Now then we need to grab some data. First off let's get all the data tables. Looking at the source, these tables are indicated by a class of mdcTable:
$result = $xpath->query("//table[@class='mdcTable']");
echo "Tables found: {$result->length}\n";
So far:
$ php test.php
Tables found: 5
Okay so we have the tables. Now we need to get specific column. So let's use the latest close column you mentioned:
$result = $xpath->query("//table[@class='mdcTable']/*/td[contains(.,'Latest close')]");
foreach($result as $td) {
echo "Column contains: {$td->nodeValue}\n";
}
The result so far:
$ php test.php
Column contains: Latest close
Column contains: Latest close
Column contains: Latest close
... etc ...
Now we need the column index for getting the specific column for the specific row. We do this by counting all of the previous sibling elements, then adding one. This is because element index selectors are 1 indexed, not 0 indexed:
$result = $xpath->query("//table[@class='mdcTable']/*/td[contains(.,'Latest close')]");
$column_position = $xpath->query('preceding-sibling::*', $result->item(0))->length + 1;
echo "Position is: $column_position\n";
Result is:
$ php test.php
Position is: 2
Now we need to get our specific row:
$data_row = $xpath->query("//table[@class='mdcTable']/*/td[starts-with(.,'Closing Arms')]");
echo "Returned {$data_row->length} row(s)\n";
Here we use starts-with, since the row label has a utf-8 symbol in it. This makes it easier. Result so far:
$ php test.php
Returned 4 row(s)
Now we need to use the column index to get the data we want:
$data_row = $xpath->query("//table[@class='mdcTable']/*/td[starts-with(.,'Closing Arms')]/../*[$column_position]");
foreach($data_row as $row) {
echo "{$date},{$row->nodeValue}\n";
}
Result is:
$ php test.php
20110509,1.26
20110509,1.40
20110509,0.32
20110509,1.01
Which can now be written to a file. Now, we don't have the markets these apply to, so let's go ahead and grab those:
$headings = array();
$market_headings = $xpath->query("//table[@class='mdcTable']/*/td[@class='colhead'][1]");
foreach($market_headings as $market_heading) {
$headings[] = $market_heading->nodeValue;
}
Now we can use a counter to reference which market we're on:
$data_row = $xpath->query("//table[@class='mdcTable']/*/td[starts-with(.,'Closing Arms')]/../*[$column_position]");
$i = 0;
foreach($data_row as $row) {
echo "{$date},{$headings[$i]},{$row->nodeValue}\n";
$i++;
}
The output being:
$ php test.php
20110509,NYSE,1.26
20110509,Nasdaq,1.40
20110509,NYSE Amex,0.32
20110509,NYSE Arca,1.01
Now for your part:
This can be made into a function that takes a date
You'll need code to write out the file. Check out the filesystem functions for hints
This can be made extendible to use different columns and different rows
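As a starting point for those steps, here is a rough sketch of the two pieces that do not depend on the page itself: listing the dates to visit and appending a line to the output file. Both helper names (dateRange and appendResult) are hypothetical, and the scraping code above would still need to be wrapped in a function that is called once per date:

```php
<?php
// Hypothetical helper: list every date from $from to $to (inclusive) as YYYYMMDD strings.
function dateRange(string $from, string $to): array
{
    // The leading "!" resets the time to midnight so the comparison below is by date only.
    $start = DateTimeImmutable::createFromFormat('!Ymd', $from);
    $end   = DateTimeImmutable::createFromFormat('!Ymd', $to);
    if ($start === false || $end === false) {
        throw new InvalidArgumentException('Dates must be in YYYYMMDD format');
    }

    $dates = [];
    for ($d = $start; $d <= $end; $d = $d->modify('+1 day')) {
        $dates[] = $d->format('Ymd');
    }
    return $dates;
}

// Hypothetical helper: append one "date,value" line to the output file.
function appendResult(string $path, string $date, string $value): void
{
    file_put_contents($path, "{$date},{$value}\n", FILE_APPEND | LOCK_EX);
}

print_r(dateRange('20110506', '20110509'));
```

Some dates, such as weekends, will presumably have no page, so the per-date function should skip a date when file_get_contents() returns false instead of stopping the whole run.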
I'd recommend using the HTML Agility Pack; it's an HTML parser that is very handy for finding particular content within an HTML document.