Merchant Id|Merchant|Website|Transaction In Period|Voids In Period|Gross Sales Amount|Total Commision in Period
9766|Mountains Plus Outdoor Gear|www.MPGear.com|1|0|88.91|8.89
12447|Meritline.com|www.meritline.com|5|0|52.86|3.71
16213|East Coast Fashions|www.discountstripper.com|1|0|32.27|3.23
17226|HRM USA|www.HeartRateMonitorsUSA.com|1|0|189.9|6.65
I am getting the string above from a URL, and I want to convert it to an array by splitting on the delimiter |.
The problem is that there is no | delimiter at the end of each row, so I want to add one after each row.
Note: the columns are predefined and the same columns are returned on every request.
I am using this code to convert the string to an array. It works perfectly if all the delimiters are in place.
$NumColumns = 7; // the sample data has seven columns per row
$Delim = "|";
$data = array_chunk(explode($Delim,$contents), $NumColumns);
The output should look like this:
Array
(
[0] => Array
(
[0] => Merchant Id
[1] => Merchant
[2] => Website
[3] => Transaction In Period
[4] => Voids In Period
[5] => Gross Sales Amount
[6] => Total Commision in Period
)
[1] => Array
(
[0] => 9766
[1] => Mountains Plus Outdoor Gear
[2] => www.MPGear.com
[3] => 1
[4] => 0
[5] => 88.91
[6] => 8.89
)
-----
----
)
Try using str_getcsv or explode.
<?php
$data_from_url = "Merchant Id|Merchant|Website|Transaction In Period|Voids In Period|Gross Sales Amount|Total Commision in Period
9766|Mountains Plus Outdoor Gear|www.MPGear.com|1|0|88.91|8.89
12447|Meritline.com|www.meritline.com|5|0|52.86|3.71
16213|East Coast Fashions|www.discountstripper.com|1|0|32.27|3.23
17226|HRM USA|www.HeartRateMonitorsUSA.com|1|0|189.9|6.65";
$splitted_data = explode(PHP_EOL, $data_from_url);
/**
* print_r($splitted_data) will be
* Array
* (
* [0] => "Merchant Id|Merchant|Website|Transaction In Period|Voids In Period|Gross Sales Amount|Total Commision in Period"
* [1] => "9766|Mountains Plus Outdoor Gear|www.MPGear.com|1|0|88.91|8.89"
* ...
* )
*/
// You can now iterate through the lines
$output1 = array();
foreach($splitted_data as $row) {
$output1[] = str_getcsv(trim($row), '|'); // parses a delimited line into an array of fields
// OR
//$output1[] = explode('|', trim($row));
}
// OR use array_map with callback function
$output2 = array_map(function($line){ return explode('|', trim($line)); }, $splitted_data);
$output3 = array_map(function($line){ return str_getcsv(trim($line), '|'); }, $splitted_data);
var_dump($output1, $output2, $output3); // The result you want to achieve
?>
I would do this as a 2-step process. First split it into lines on the \n (newline) character, and then split each line on the | character.
You can do that in only a couple lines, like this:
$lines = explode("\n", $contents);
$data = array_map(function($line) {return explode('|', trim($line));}, $lines);
You can see this working here: http://phpfiddle.org/main/code/9fv-h2c
Once I split the contents into individual lines, I'm using the array_map() function to apply the same operation to every element of the array (every line).
array_map() calls a callback function (which I define as an anonymous function) for each element in the array. In this instance, I've defined a simple function that trims the line to remove any extra spaces there may be, and then splits it on the | character to get the individual fields.
If the array_map line is a bit complicated, I could illustrate how it's working by rewriting it without the anonymous function like this:
function processLine($line) {
$line = trim($line);
$fields = explode('|', $line);
return $fields;
}
$data = array_map('processLine', $lines);
...or even rewriting it without using array_map like this:
function processLine($line) {
$line = trim($line);
$fields = explode('|', $line);
return $fields;
}
$data = array();
foreach ($lines as $l) {
$data[] = processLine($l);
}
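If the header row should become the keys of each record, the two-step split can be extended roughly as follows. This is only a sketch: it assumes the first line is always the header and that well-formed rows have as many fields as the header. `$contents` is shortened sample data, and `$header` and `$records` are names made up for illustration.

```php
<?php
// Shortened sample data; the real string comes from the URL.
$contents = "Merchant Id|Merchant|Website\n"
          . "9766|Mountains Plus Outdoor Gear|www.MPGear.com\n"
          . "12447|Meritline.com|www.meritline.com";

// Step 1: split into lines. Step 2: split each line on the pipe.
$rows = array_map(function ($line) {
    return explode('|', trim($line));
}, explode("\n", trim($contents)));

// Assumption: the first row holds the column names.
$header = array_shift($rows);

// Pair every remaining row with the header to get associative arrays.
$records = array();
foreach ($rows as $row) {
    if (count($row) === count($header)) { // skip malformed rows
        $records[] = array_combine($header, $row);
    }
}

print_r($records);
```

Each element of `$records` can then be read by column name, for example `$records[0]['Merchant']`.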
If I understand your question correctly, you're trying to split the data, but want to keep the delimiter | as part of the string.
Using preg_split, you can do this like so:
$arr = preg_split('/([^|]*\|)/', $string, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
The expression matches zero or more characters that aren't | ([^|]*), followed by the delimiting | itself. The combination of the two is used as the delimiter. In other words, everything is a delimiter now.
That's why we have to use the predefined constant PREG_SPLIT_DELIM_CAPTURE.
Of course, between the delimiters there's nothing, but preg_split will capture this nothingness too and add empty matches to the resulting array. That's why we have to use the other predefined constant: PREG_SPLIT_NO_EMPTY.
The two constants are combined by means of the bitwise OR operator | and passed as the fourth argument; the third argument is the limit, where -1 means no limit.
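To see what that split returns, here is a small self-contained run. preg_split() takes the limit as its third argument and the flags as its fourth, so the sketch passes -1 (no limit) before the flags. The sample `$string` is a shortened row made up from the question's data.

```php
<?php
// Shortened sample row; the last field has no trailing pipe.
$string = "9766|Mountains Plus Outdoor Gear|8.89";

// Third argument: limit (-1 = no limit). Fourth argument: flags.
$arr = preg_split(
    '/([^|]*\|)/',
    $string,
    -1,
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
);

// Fields that end in a pipe come back via the captured delimiter;
// the final field is the only real "text between delimiters".
print_r($arr);
```

The first two elements keep their trailing |, while the last field comes back without one because no delimiter follows it.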
$output = explode(PHP_EOL, $input);
foreach($output as &$line)
{//$line is a reference here, important!
$line = preg_split('/([^|]*\|)/', $line, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
}
unset($line); // break the reference so later code can't change the last element by accident
This should produce the desired output, assuming you want the delimiting | kept in the strings. If not:
$output = explode(PHP_EOL, $input);
foreach($output as &$line)
{
$line = explode('|', $line);
}
unset($line); // break the reference after the loop
That's it, really...
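One caveat with explode(PHP_EOL, ...): PHP_EOL reflects the platform the script runs on, not the line endings of the downloaded data. A possible way around that, sketched below, is to split on the PCRE escape \R, which matches any line-break sequence. The `$input` value here is hypothetical sample data that mixes both styles.

```php
<?php
// Hypothetical input mixing Windows (\r\n) and Unix (\n) line endings.
$input = "a|b|c\r\n1|2|3\n4|5|6\n";

// \R matches any line-break sequence; PREG_SPLIT_NO_EMPTY drops the
// empty string left behind by the trailing newline.
$lines = preg_split('/\R/', $input, -1, PREG_SPLIT_NO_EMPTY);

$output = array();
foreach ($lines as $line) {
    $output[] = explode('|', $line);
}

print_r($output);
```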
I've recently started working with PHP and am trying to make a list from a .txt file, removing all the unnecessary components.
A line looks like this:
item = 0 egg came_from_chicken
I want to remove the item = and the came_from_chicken, which leaves me with 0 egg.
After some searching I found that substr() can remove the first 5 characters of each line. Later I also found that strtok() can remove the unwanted text after the second tab. Unfortunately I cannot combine these. So my question is: how do I remove the first 5 characters from each line and also remove everything after the second tab?
I've got this so far:
<?php
$lines = file('../doc/itemlist.txt');
$newf = array();
foreach ($lines as $line) {
$newf[] = strtok($line, "\t") . "\t" . strtok("\t");
}
var_dump($newf);
?>
This works like a charm to remove everything after egg, but I still have to remove item =.
The quick-and-dirty way is to just wrap it all:
$newf[] = substr(strtok($line, "\t") . "\t" . strtok("\t"), 5);
But, I personally have a dislike for strtok() (can't explain why, I just don't like it). If you don't need to strip everything off from the second tab, but from the last tab (in your example the second tab is the last tab), I would use strrpos() to find the location of the last tab, and dump that into substr():
$newf[] = substr($line, 5, strrpos($line, "\t")-5);
That -5 is to compensate for the 5 characters you strip off from the beginning. If you need to start at character 6 instead of 5, you should also subtract 6 from whatever strrpos() returns.
Edit: Never mind that whole last part. I just saw the example format you posted, and you really need the second tab instead of the last tab.
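For the second tab specifically, strpos() accepts an offset, so the search for the second tab can start just after the first one. The sketch below assumes a line whose 5-character prefix is "item=" and whose fields are tab-separated; the exact layout of the real file is not shown in the question, so `$line` and `$prefixLength` are illustrative only.

```php
<?php
// Assumed sample line: 5-character prefix "item=", then tab-separated fields.
$line = "item=0\tegg\tcame_from_chicken\n";
$prefixLength = 5; // number of leading characters to drop

$firstTab = strpos($line, "\t");
// Look for the next tab, starting just after the first one.
$secondTab = ($firstTab === false) ? false : strpos($line, "\t", $firstTab + 1);

if ($secondTab !== false) {
    // Keep everything from the end of the prefix up to the second tab.
    $result = substr($line, $prefixLength, $secondTab - $prefixLength);
} else {
    $result = null; // the line does not have the expected shape
}

var_dump($result);
```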
You can use a regular expression:
<?php
$lines = file('./file.txt');
$newf = array();
foreach ($lines as $line) {
$newf[] = preg_replace('/.*=\s*(.*)\t.*/s', '$1', $line);
}
var_dump($newf);
Output:
array(1) {
[0]=>
string(5) "0 egg"
}
Assuming that you will be receiving similar patterns to the example you gave.
You can just do a simple line of:
$str = "item = 0 egg came_from_chicken";
$parts = preg_split('/\s+/', $str);
echo $parts[2] . ' ' . $parts[3]; // prints "0 egg"
I know this is not the answer you are looking for using strtok but I believe it would be much easier to do the following code below:
<?php
$lines = file('../doc/itemlist.txt'); // file() reads the file into an array of lines
// array to store all data
$newf = array();
foreach ($lines as $line) {
// you can do an explode which will turn it into an array and you can then get any values you want
// $newf [] = strtok ($line, "\t") . "\t" . strtok("\t"); // throw away
// lets say we use [item = 0 egg came_from_chicken] as our string
// we split it into an array for every tab or spaces found
$values = preg_split ('/\s+/', $line);
//returns
// Array
// (
// [0] => item
// [1] => =
// [2] => 0
// [3] => egg
// [4] => came_from_chicken
// [5] =>
// )
// lastly store your values, which are at indexes 2 and 3
$newf [] = $values [2] . ' ' . $values [3];
}
var_dump($newf);
// return
// array (size=3)
// 0 => string '0 aaa' (length=5)
// 1 => string '1 bbb' (length=5)
// 2 => string '2 ccc' (length=5)
?>
This approach will work with any string preceding the = symbol.
foreach ($lines as $line) {
$newf[] = implode("\t", array_slice(explode("\t", trim(explode('=', $line, 2)[1])), 0, 2));
}
I am scraping the following kind of strings from an external resource which I can't change:
["one item",0,0,2,0,1,"800.12"],
["another item",1,3,2,5,1,"1,713.59"],
(etc...)
I use the following code to explode the elements into an array.
<?php
$id = 0;
foreach($lines AS $line) {
$id = 0;
// remove brackets and line end comma's
$found_data[] = str_replace(array('],', '[',']', '"'), array('','','',''), $line);
// add data to array
$results[$id] = explode(',', $line);
}
This works fine for the first line, but the second line uses a comma as the thousands separator in the last item, so it fails there. Somehow I need to stop explode from splitting on commas between " characters.
If all values would be surrounded by " characters, I could just use something like
explode('","', $line);
However, unfortunately that's not the case here: some values are surrounded by ", some aren't (not always the same values are). So I'm a bit lost in how I should proceed. Anyone who can point me in the right direction?
You can use json_decode here since your input string appears to be a valid json string.
$str = '["another item",1,3,2,5,1,"1,713.59"]';
$arr = json_decode($str);
You can then access individual indices from resulting array or print the whole array using:
print_r($arr);
Output:
Array
(
[0] => another item
[1] => 1
[2] => 3
[3] => 2
[4] => 5
[5] => 1
[6] => 1,713.59
)
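The scraped lines in the question end with a trailing comma, which is not valid JSON on its own. A minimal sketch of handling that line by line, assuming each line holds exactly one array as in the question's sample, could look like this (`$lines` and `$results` mirror the question's variable names):

```php
<?php
// Two scraped lines in the shape shown in the question (trailing commas included).
$lines = array(
    '["one item",0,0,2,0,1,"800.12"],',
    '["another item",1,3,2,5,1,"1,713.59"],',
);

$results = array();
foreach ($lines as $line) {
    // Drop surrounding whitespace and the trailing comma so that each
    // line is a standalone JSON array.
    $json = rtrim(trim($line), ',');
    $row  = json_decode($json, true);

    if (is_array($row)) { // json_decode() returns null on invalid input
        $results[] = $row;
    }
}

print_r($results);
```

Quoted values stay strings (including the comma inside "1,713.59"), while unquoted numbers are decoded as integers.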
Thanks for taking a look at this. I'm using PHP. I have a string like so:
[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]I don't so much dance as rhythmically convulse.[/QUOTE]
And I want to pull out the values in the quotes and create an associative array like so:
["name" => "Max-Fischer", "post" => "486662533", "member" => "123"]
Then, I would like to remove the opening and closing [QUOTE] tags and replace them with custom HTML like so:
<blockquote>Max-Fischer wrote: I don't so much dance as rhythmically convulse.</blockquote>
So the main problem is creating the preg_match() or preg_replace() to handle first: grabbing the values out in an array, and second: removing the tags and replacing them with my custom content. I can figure out how to use the array to create the custom HTML, I just can't figure how to use regular expressions well enough to achieve it.
I tried a match like this to get the attribute values:
/(\S+)=[\"\']?((?:.(?![\"\']?\s+(?:\S+)=|[>\"\']))+.)[\"\']?/
But this only returns:
[QUOTE
And that's not even addressing how to put the values (if I can get them) into an array.
Thanks in advance for your time.
Cheers.
If the tag you're looking for is always going to be quote, then perhaps something a little simpler is possible:
$s ='[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]I don\'t so much dance as rhythmically convulse.[/QUOTE]';
$r = '/\[QUOTE="(.*?)"\](.*)\[\/QUOTE\]/';
$m = array();
$arr = array();
preg_match($r, $s, $m);
// m[0] = the initial string
// m[1] = the string of attributes
// m[2] = the quote itself
foreach(explode(',', $m[1]) as $valuepair) { // split the attributes on the comma
preg_match('/\s*(.*): (.*)/', $valuepair, $mm);
// mm[0] = the attribute pairing
// mm[1] = the attribute name
// mm[2] = the attribute value
$arr[$mm[1]] = $mm[2];
}
print_r($arr);
print $m[2] . "\n";
this gives the following output:
Array
(
[name] => Max-Fischer
[post] => 486662533
[member] => 123
)
I don't so much dance as rhythmically convulse.
If you want to handle the case where there is more than one quote in the string, we can do this by making the regex slightly less greedy and then using preg_match_all instead of preg_match:
$s ='[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]I don\'t so much dance as rhythmically convulse.[/QUOTE]';
$s .='[QUOTE="name: Some-Guy, post: 486562533, member: 1234"]Quidquid latine dictum sit, altum videtur[/QUOTE]';
$r = '/\[QUOTE="(.*?)"\](.*?)\[\/QUOTE\]/';
// (.*?) instead of (.*) for the second group: the added ? makes it less greedy
$m = array();
$arr = array();
preg_match_all($r, $s, $m, PREG_SET_ORDER);
// with PREG_SET_ORDER, $m holds one element for each quote found in the string
// m[0] = the first quote
// m[1] = the second quote
// m[0][0] = the full text of the first match
// m[0][1] = the string of attributes
// m[0][2] = the quote itself
foreach($m as $match) { // since there is more than one quote, we loop and operate on them individually
$quote = array();
foreach(explode(',', $match[1]) as $valuepair) { // split the attributes on the comma
preg_match('/\s*(.*): (.*)/', $valuepair, $mm);
// mm[0] = the attribute pairing
// mm[1] = the attribute name
// mm[2] = the attribute value
$quote[$mm[1]] = $mm[2];
}
$arr[] = $quote; // we now build a parent array, to hold each individual quote
}
print_r($arr);
This gives output like:
Array
(
[0] => Array
(
[name] => Max-Fischer
[post] => 486662533
[member] => 123
)
[1] => Array
(
[name] => Some-Guy
[post] => 486562533
[member] => 1234
)
)
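The question also asks for the [QUOTE] tags to be replaced with custom HTML. One way to combine both steps is preg_replace_callback(), sketched below. The helper name `parseQuoteAttributes` and the fallback name 'Someone' are invented for this example, and the pattern assumes quotes are not nested.

```php
<?php
$s = '[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]'
   . "I don't so much dance as rhythmically convulse.[/QUOTE]";

// Hypothetical helper: turn "name: x, post: y" into an associative array.
function parseQuoteAttributes($attributeString) {
    $attributes = array();
    foreach (explode(',', $attributeString) as $pair) {
        $parts = explode(':', $pair, 2);
        if (count($parts) === 2) {
            $attributes[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $attributes;
}

// Replace every [QUOTE="..."]...[/QUOTE] block with a <blockquote>.
$html = preg_replace_callback(
    '/\[QUOTE="(.*?)"\](.*?)\[\/QUOTE\]/s',
    function ($m) {
        $attr = parseQuoteAttributes($m[1]);
        $name = isset($attr['name']) ? $attr['name'] : 'Someone';
        // Escape &, < and > because both parts end up inside HTML.
        return '<blockquote>' . htmlspecialchars($name, ENT_NOQUOTES) . ' wrote: '
             . htmlspecialchars($m[2], ENT_NOQUOTES) . '</blockquote>';
    },
    $s
);

echo $html, "\n";
```

Because the callback runs once per match, the same code should also handle a string containing several quotes.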
I managed to solve your problem of getting an associative array. I hope it helps.
Here is the code:
$str = <<< PP
[QUOTE=" name : Max-Fischer,post : 486662533,member : 123 "]I don't so much dance as rhythmically convulse.[/QUOTE]
PP;
preg_match_all('/^\[QUOTE=\"(.*?)\"\](?:.*?)]$/', $str, $matches);
preg_match_all('/([a-zA-Z0-9]+)\s+:\s+([a-zA-Z0-9-]+)/', $matches[1][0], $result); // the hyphen in the value class lets "Max-Fischer" match in full
$your_data = array_combine($result[1],$result[2]);
echo "<pre>";
print_r($your_data);
I am using explode and str_replace on a GET parameter from the URL's query string. My goal is to split the string on certain characters to get to the value I want. I am having issues: it should work, but it doesn't.
Here are two sample links with the query strings and the delimiters I'm passing to str_replace.
http://computerhelpwanted.com/jobs/?occupation=analyst&position=data-analyst
As you can see, in the URL above the parameter is position and the value is data-analyst. The delimiter is the dash -.
http://computerhelpwanted.com/jobs/?occupation=analyst&position=business+systems+analyst
This URL uses the same parameter, position, and the value is business+systems+analyst. The delimiter is the + sign.
The value I am trying to get from the query string is the word analyst. It is the last word after the delimiters.
Here is my code which should do the trick, but doesn't for some reason.
$last_wordPosition = str_replace(array('-', '+')," ", end(explode(" ",$position)));
It works if the delimiter is a + sign, but fails if the delimiter is a - sign.
Anyone know why?
You have things in the wrong order:
$last_wordPosition = end(explode(" ", str_replace(array('-', '+'), " ", $position)));
You probably want to split it up so that you don't get the E_STRICT notice for passing something other than a variable to end():
$words = explode(" ", str_replace(array('-', '+'), " ", $position));
echo end($words);
Or something like:
echo preg_replace('/[^+-]+(\+|-)/', '', $position);
As @MarkB suggested, you should use parse_url and parse_str, since they are more appropriate in your case.
From the documentation of parse_url:
This function parses a URL and returns an associative array containing any of the various components of the URL that are present.
From the documentation of parse_str:
Parses str as if it were the query string passed via a URL and sets variables in the current scope.
So here is what you want to do:
$url1 = 'http://computerhelpwanted.com/jobs/?occupation=analyst&position=data-analyst';
$url2 = 'http://computerhelpwanted.com/jobs/?occupation=analyst&position=business+systems+analyst';
function mySplit($str)
{
if (preg_match('/\-/', $str))
$strSplited = explode('-', $str); // split() was removed in PHP 7; explode() does the same job here
else
$strSplited = explode(' ', $str);
return $strSplited;
}
parse_str(parse_url($url1)['query'], $output);
print_r($values = mySplit($output['position']));
parse_str(parse_url($url2)['query'], $output);
print_r($values = mySplit($output['position']));
OUTPUT
Array
(
[0] => data
[1] => analyst
)
Array
(
[0] => business
[1] => systems
[2] => analyst
)
You said that you needed the last element of those values. Therefore you could find end and reset useful:
echo end($values);
reset($values);
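If only the last word is needed, the two delimiter cases can also be handled in one step by splitting on a character class. The following is a sketch along those lines, not a replacement for the function above; `$params`, `$words` and `$lastWord` are illustrative names.

```php
<?php
$url = 'http://computerhelpwanted.com/jobs/?occupation=analyst&position=business+systems+analyst';

// parse_str() decodes the query string, so "+" arrives as a space.
parse_str(parse_url($url, PHP_URL_QUERY), $params);

// Split on any run of dashes, plus signs, or whitespace.
$words = preg_split('/[-+\s]+/', $params['position'], -1, PREG_SPLIT_NO_EMPTY);

// end() takes its argument by reference, so pass a variable.
$lastWord = end($words);

echo $lastWord, "\n";
```

The same split works for a value such as data-analyst, since the dash is part of the character class.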
Answering my own question to show how I ended up doing this. It seems like a lot more code than the accepted answer, but since it was suggested that I use parse_url and parse_str and I couldn't get them working right, I did it a different way.
function convertUrlQuery($query) {
$queryParts = explode('&', $query);
$params = array();
foreach ($queryParts as $param) {
$item = explode('=', $param);
$params[$item[0]] = $item[1];
}
return $params;
}
$arrayQuery = convertUrlQuery($_SERVER['QUERY_STRING']); // Returns - Array ( [occupation] => designer [position] => webmaster+or+website-designer )
$array_valueOccupation = $arrayQuery['occupation']; // Value of occupation parameter
$array_valuePosition = $arrayQuery['position']; // Value of position parameter
$split_valuePosition = explode(" ", str_replace(array('-', '+', ' or '), " ", $array_valuePosition)); // Splits the value of the position parameter into separate words using delimiters (- + or)
Then, to access different parts of the array:
print_r($arrayQuery); // prints the array
echo $array_valueOccupation; // echos the occupation parameters value
echo $array_valuePosition; // echos the position parameters value
print_r($split_valuePosition); // prints the array of the split position parameter
foreach ($split_valuePosition as $value) { // foreach outputs all the values in the split_valuePosition array
echo $value.' ';
}
end($split_valuePosition); // gets the last value in the split_valuePosition array
implode(' ',$split_valuePosition); // puts the split_valuePosition back into a string with only spaces between each word
which outputs the following
arrayQuery = Array
(
[occupation] => analyst
[position] => data-analyst
)
array_valueOccupation = analyst
array_valuePosition = data-analyst
split_valuePosition = Array
(
[0] => data
[1] => analyst
)
foreach split_valuePosition =
- data
- analyst
end split_valuePosition = analyst
implode split_valuePosition = data analyst
Let's say I have this input:
I can haz a listz0rs!
# 42
# 126
I can haz another list plox?
# Hello, world!
# Welcome!
I want to split it so that each set of hash-started lines becomes a list:
I can haz a listz0rs!
<ul>
<li>42</li>
<li>126</li>
</ul>
I can haz another list plox?
<ul>
<li>Hello, world!</li>
<li>Welcome!</li>
</ul>
If I run the input against the regex "/(?:(?:(?<=^# )(.*)$)+)/m", I get the following result:
Array
(
[0] => Array
(
[0] => 42
)
[1] => Array
(
[0] => 126
)
[2] => Array
(
[0] => Hello, world!
)
[3] => Array
(
[0] => Welcome!
)
)
This is fine and dandy, but it doesn't distinguish between the two different lists. I need a way to either make the quantifier return a concatenated string of all the occurrences, or, ideally, an array of all the occurrences.
Ideally, this should be my output:
Array
(
[0] => Array
(
[0] => 42
[1] => 126
)
[1] => Array
(
[0] => Hello, world!
[1] => Welcome!
)
)
Is there any way of achieving this, and if not, is there a close alternative?
If you want to do this with regular expressions, you'll need two. Use the regex ^(#.*\r?\n)+ to match each list and add tags around it. Within each list (as matched by the first regex), search-and-replace ^#.* with <li>$0</li> to add tags around each list item. Both regexes require ^ to match at line breaks (/m flag in PHP).
In PHP you can use preg_replace_callback and preg_replace to achieve this in just a few lines of code.
$result = preg_replace_callback('/^(#.*\r?\n)+/m', 'replacelist', $subject);
function replacelist($groups) {
return "<ul>\n" .
preg_replace('/^#.*/m', ' <li>$0</li>', $groups[0])
. "</ul>\n";
}
I'd say don't try to do it all in a single regex - instead, first use a regex to match sets of consecutive lines that all begin with # signs and wrap those lines with a <ul></ul> pair. Then use a second regex (or not even a regex at all - you could just split on line breaks) to match each individual line and convert it to <li></li> format.
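The same two-regex idea can also produce the nested array the question asks for rather than HTML. This is a rough sketch under the assumption that list items always start with # in the first column; `$blocks`, `$items` and `$lists` are names chosen for the example.

```php
<?php
$input = "I can haz a listz0rs!\n# 42\n# 126\n"
       . "I can haz another list plox?\n# Hello, world!\n# Welcome!\n";

// First regex: each run of consecutive hash lines is one list.
// \R matches any line break; \z covers a final line without one.
preg_match_all('/(?:^#.*(?:\R|\z))+/m', $input, $blocks);

$lists = array();
foreach ($blocks[0] as $block) {
    // Second regex: capture the text after the "#" on every line of the block.
    preg_match_all('/^#[ \t]*(.*?)[ \t\r]*$/m', $block, $items);
    $lists[] = $items[1];
}

print_r($lists);
```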
If it was me I would:
explode("\n", $input) into an array where 1 key = line
foreach through that array
whenever you get a line that doesn't start with a #, that's when you add your closing/opening ul tags
Add a little more to deal with unexpected input (like two non hash lines in a row) and you're good.
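The steps above can be sketched roughly as follows. The `$inList` flag is an addition for this example: it tracks whether a <ul> is currently open, which also covers two non-hash lines in a row.

```php
<?php
$input = "I can haz a listz0rs!\n# 42\n# 126\n"
       . "I can haz another list plox?\n# Hello, world!\n# Welcome!";

$html   = '';
$inList = false; // are we currently inside a <ul>?

foreach (explode("\n", $input) as $line) {
    $isItem = (strncmp($line, '#', 1) === 0);

    if ($isItem && !$inList) {
        $html .= "<ul>\n";   // first hash line opens a list
        $inList = true;
    } elseif (!$isItem && $inList) {
        $html .= "</ul>\n";  // first non-hash line closes it
        $inList = false;
    }

    $html .= $isItem
        ? '<li>' . trim(substr($line, 1)) . "</li>\n"
        : $line . "\n";
}

if ($inList) {
    $html .= "</ul>\n"; // input ended while a list was still open
}

echo $html;
```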
Looks like Syntax Error has already explained what I'm doing. But here goes the link to a working example.
With structured content like this, I would not do this as a regex. How about another approach?
$your_text = <<<END
I can haz a listz0rs!
# 42
# 126
I can haz another list plox?
# Hello, world!
# Welcome!
END;
function printUnorderedList($temp) {
if (count($temp)>0) {
print "<ul>\n\t<li>" .implode("</li>\n\t<li>", $temp) . "</li>\n</ul>\n";
}
}
$lines = explode("\n", $your_text);
$temp = array();
foreach($lines as $line) {
if (substr($line, 0, 1) == '#') {
$temp[] = trim(substr($line,1));
} else {
printUnorderedList($temp);
$temp = array();
echo $line . "\n";
}
}
printUnorderedList($temp);
You could avoid regex altogether, and simply try a simpler approach by having it read the file, line by line (an array of lines), and every time it encounters a non-hash-started line, it starts a new list. Like so:
// You can get this by using file('filename') or
// just doing an explode("\n", $input)
$lines = array(
'I can haz a listz0rs!',
'# 42',
'# 126',
'I can haz another list plox?',
'# Hello, world!',
'# Welcome!'
);
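$lists = array();
$curlist = array();
foreach ($lines as $line) {
if ($line !== '' && $line[0] == '#') {
$curlist[] = $line;
} elseif ($curlist) {
// a non-hash line ends the current list
$lists[] = $curlist;
$curlist = array();
}
}
// store the final list if the input ended on a hash line
if ($curlist) {
$lists[] = $curlist;
}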
A little clean-up may be in order, but hopefully it helps.
(After reading the new answers: this is basically an in-depth explanation of Syntax Error's answer.)
EDIT: You may want it to strip off the # at the beginning of each line too.