I would like to explode this kind of string in PHP:
$foo = "foo.txt da\ code.txt bar.txt";
And I want to explode it to have:
["foo.txt", "da\ code.txt", "bar.txt"]
I know I can use preg_split but I don't see what to put as regular expression.
Could someone help me?
You could split on any whitespace that a positive lookbehind shows is preceded by a letter.
<?php
$foo = "foo.txt da\ code.txt bar.txt";
print_r(preg_split("/(?<=[a-zA-Z])\s/", $foo));
You could also use a negative lookbehind for the general case, splitting on any whitespace that is not preceded by a backslash:
<?php
$foo = "foo.txt da\ code.txt bar.txt something.mp3 other# 9asdf";
print_r(preg_split('/(?<!\\\\)\s/', $foo));
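As a rough sketch of the negative-lookbehind idea, the split can be wrapped in a small helper. The name split_on_unescaped_spaces is made up for illustration, and it assumes a single backslash is the only escape you need to honour (an escaped backslash directly before a space is not handled).

```php
<?php
// Hypothetical helper: split on runs of whitespace that are not preceded by a backslash.
function split_on_unescaped_spaces($input)
{
    // In a single-quoted PHP string, four backslashes become two,
    // which the regex engine reads as one literal backslash.
    return preg_split('/(?<!\\\\)\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);
}

$foo = 'foo.txt da\ code.txt bar.txt';
print_r(split_on_unescaped_spaces($foo));
```

The escaping backslash stays in the result ("da\ code.txt"), matching the output asked for in the question.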
Here's a funky, non-regex "parser". All kinds of fun. What was the world like before regular expressions? I mean, it must have been work. ;)
<?php
$foo = "foo.txt da\ code.txt bar.txt";
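$foos = array('');
$char = 0;
$index = 0;
$lookback = '';
while ($char < strlen($foo)) {
    // Last five characters up to and including the current one (fewer near the start).
    // substr() replaces the {} offset syntax, which was removed in PHP 8.
    $lookback = substr($foo, max(0, $char - 4), min(5, $char + 1));
    // A ".txt " ending marks a file boundary, so start a new entry.
    if ($lookback == '.txt ') $foos[++$index] = '';
    $foos[$index] .= $foo[$char++];
}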
print_r(array_map('trim', $foos));
?>
http://codepad.org/XMfLemeg
You could use preg_split(), but it is slow. If an escaped space is your only problem, it would be a lot faster to do it as simply as this:
$array1 = explode(' ', $list);
$array2 = [];
$appendNext = false;
foreach($array1 as $elem)
{
if ($appendNext)
{
array_push($array2, array_pop($array2) . ' ' . $elem);
}
else
{
$array2[] = $elem;
}
$appendNext = (substr($elem, -1) === '\\');
}
var_dump($array2);
If you really want to do it via regex here is a working solution:
print_r(preg_split("/(?<!\\\)\s/", $foo));
http://codepad.org/ngbDaxA3
but it will be slower than above
What is the most efficient pattern to replace dots in dot-separated string to an array-like string e.g x.y.z -> x[y][z]
Here is my current code, but I guess there should be a shorter method using regexp.
function convert($input)
{
if (strpos($input, '.') === false) {
return $input;
}
$input = str_replace_first('.', '[', $input);
$input = str_replace('.', '][', $input);
return $input . ']';
}
In your particular case "an array-like string" can be easily obtained using preg_replace function:
$input = "x.d.dsaf.d2.d";
print_r(preg_replace("/\.([^.]+)/", "[$1]", $input)); // "x[d][dsaf][d2][d]"
From what I understand of your question, "x.y.z" is a string, and "x[y][z]" should be one too, right?
If that is the case, you may want to give the following code snippet a try:
<?php
$dotSeparatedString = "x.y.z";
$arrayLikeString = "";
//HERE IS THE REGEX YOU ASKED FOR...
$arrayLikeString = str_replace(".", "", preg_replace("#(\.[a-z0-9]*[^.])#", "[$1]", $dotSeparatedString));
var_dump($arrayLikeString); //DUMPS: 'x[y][z]'
Hope it helps you, though....
You could use a fairly simple preg_replace_callback() whose callback returns a different replacement for the first occurrence of . than for the later occurrences.
$in = "x.y.z";
function cb($matches) {
static $first = true;
if (!$first)
return '][';
$first = false;
return '[';
}
$out = preg_replace_callback('/(\.)/', 'cb', $in) . ((strpos($in, '.') !== false) ? ']' : '');
var_dump($out);
The ternary append handles the case where there is no . to replace.
Already answered, but you could simply explode on the period delimiter and then reconstruct the string.
$in = 'x.y.z';
$array = explode('.', $in);
$out = '';
foreach ($array as $key => $part){
$out .= ($key) ? '[' . $part . ']' : $part;
}
echo $out;
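The explode-and-rebuild idea can also be written without a loop. This is only a sketch; the function name dots_to_brackets is hypothetical, and strings without any dot are assumed to pass through unchanged.

```php
<?php
// Hypothetical helper: "x.y.z" -> "x[y][z]"; strings without dots are returned as-is.
function dots_to_brackets($input)
{
    $parts = explode('.', $input);
    // The first segment stays bare; the rest are wrapped in brackets.
    $first = array_shift($parts);
    if (!$parts) {
        return $first;
    }
    return $first . '[' . implode('][', $parts) . ']';
}

echo dots_to_brackets('x.y.z'), "\n"; // x[y][z]
echo dots_to_brackets('x'), "\n";     // x
```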
$str1 = "HEADINGLEY";
$str2 = "HDNGLY";
How can I search string 1 to see if it contains all the characters from string 2, in the order they appear in string 2, and return a true or false value?
Any help appreciated :D
If this helps, this is the code I have to date:
echo preg_match('/.*H.*D.*N.*G.*L.*Y.*/', $str1, $matches);
returns 0
I guess I'd use preg_match
http://us.php.net/manual/en/function.preg-match.php
...and turning $str2 into a regular expression using str_split to get an array and implode to turn it back into a string with ".*" as glue between characters.
http://us.php.net/manual/en/function.str-split.php
http://us.php.net/manual/en/function.implode.php
Update: I should have suggested str_split instead of explode, and here is the code that will get you a regular expression from $str2:
$str2pat = "/.*" . implode(".*", str_split($str2)) . ".*/";
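Putting that together, a minimal sketch might look like the following. The function name contains_in_order is an assumption, the strings are assumed to be single-byte, and preg_quote is added so that characters such as . or * in $str2 are matched literally.

```php
<?php
// Hypothetical helper: true if every character of $needle appears in $haystack in the same order.
function contains_in_order($haystack, $needle)
{
    if ($needle === '') {
        return true; // nothing to look for
    }
    // Quote each character so regex metacharacters are matched literally.
    $quoted = array_map(function ($char) {
        return preg_quote($char, '/');
    }, str_split($needle));
    // The "s" modifier lets ".*" span newlines as well.
    $pattern = '/' . implode('.*', $quoted) . '/s';
    return preg_match($pattern, $haystack) === 1;
}

var_dump(contains_in_order('HEADINGLEY', 'HDNGLY')); // bool(true)
var_dump(contains_in_order('HEADINGLEY', 'YLG'));    // bool(false)
```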
You could use some regular expression match like .*H.*D.*N.*G.*L.*Y.* and preg_match.
Should be easy enough to add the .* in the second string.
This code is in C# as I don't know PHP but should be easy to understand.
int index1 = 0, index2 = 0;
while (index1 < str1.Length && index2 < str2.Length)
{
if (str1[index1] == str2[index2]) index2++;
index1++;
}
return index2 == str2.Length; // true only if every character of str2 was matched in order
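For anyone who wants the same scan in PHP, a direct port could look roughly like this. The function name is_subsequence is made up here, and the strings are assumed to be single-byte.

```php
<?php
// PHP port of the two-index scan above; the function name is an assumption.
function is_subsequence($str1, $str2)
{
    $index1 = 0;
    $index2 = 0;
    $len1 = strlen($str1);
    $len2 = strlen($str2);
    while ($index1 < $len1 && $index2 < $len2) {
        // Advance in $str2 only when its current character is found in $str1.
        if ($str1[$index1] === $str2[$index2]) {
            $index2++;
        }
        $index1++;
    }
    // If all of $str2 was consumed, its characters appear in $str1 in order.
    return $index2 === $len2;
}

var_dump(is_subsequence('HEADINGLEY', 'HDNGLY')); // bool(true)
```

It makes a single pass over $str1 and needs no regex escaping.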
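Treat each string as an array of characters and then use the array_diff function. Note that this only checks that every character of $str2 occurs somewhere in $str1; it does not check their order.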
http://hu2.php.net/array_diff
$str1 = "HEADINGLEY";
$str2 = "HDNGLY";
$arr1 = str_split($str1, 1);
$arr2 = str_split($str2, 1);
$arr2_uniques = array_diff($arr2, $arr1);
return array() === $arr2_uniques;
Possibly this snippet will do it:
var contains = true;
for (var i=0; i < str2.length; i++)
{
var char = str2.charAt(i);
var indexOfChar = str1.indexOf(char);
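if (indexOfChar < 0)
{
    contains = false;
    break; // stop at the first missing character
}
// Resume the search just after the match so that order and repeated letters are respected.
str1 = str1.substr(indexOfChar + 1);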
}
You can use the preg_match function, like this:
$split_str2 = str_split($str2);
// Create pattern from str2
$pattern = "/" . implode(".*", $split_str2) . "/";
// Find
$result = preg_match($pattern, $str1);
You can probably use a regular expression, but be wary of strings that contain regex-specific characters. This solution does it programmatically.
function contains_all_characters($str1, $str2)
{
$i1 = strlen($str1) - 1;
$i2 = strlen($str2) - 1;
while ($i2 >= 0)
{
if ($i1 == -1) return false;
if ($str1[$i1--] == $str2[$i2]) $i2--;
}
return true;
}
You can try:
$str = 'HEADINGLEY';
if (preg_match('/[^H]*H[^D]*D[^N]*N[^G]*G[^L]*L[^Y]*Y/', $str, $m))
var_dump($m[0]);
Update: Even better is to build regex like this and then use it:
$str = 'HEADINGLEY';
$pattern = 'HDNGLY';
$regex = '#' . preg_replace('#(.)#', '[^$1]*$1', $pattern) . '#';
if (preg_match($regex, $str, $m))
var_dump($m[0]);
OUTPUT:
string(10) "HEADINGLEY"
I have a flat-file database whose data is separated by delimiters.
I allow people to use the delimiter in their input but I make sure to escape it with a \ beforehand.
The problem is my explode() function still attempts to split the escaped delimiters, so how do I tell it to ignore them?
Use preg_split instead. By using a regex you can match a delimiter only if it is not preceded by a backslash.
Edit:
preg_split('~(?<!\\\)' . preg_quote($delimeter, '~') . '~', $text);
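Here is a small sketch of how that call might be used end to end. The helper name split_escaped is hypothetical, and it also strips the escaping backslash from each field afterwards. As with the one-liner, a delimiter preceded by an escaped backslash (\\|) is not handled.

```php
<?php
// Hypothetical helper: split on unescaped delimiters, then drop the escaping backslashes.
function split_escaped($text, $delimiter)
{
    // Negative lookbehind: the delimiter must not be preceded by a backslash.
    $pattern = '~(?<!\\\\)' . preg_quote($delimiter, '~') . '~';
    $parts = preg_split($pattern, $text);
    return array_map(function ($part) use ($delimiter) {
        // Turn an escaped delimiter back into a plain one inside each field.
        return str_replace('\\' . $delimiter, $delimiter, $part);
    }, $parts);
}

print_r(split_escaped('name\|with pipe|second|third', '|'));
```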
None of the solutions here correctly handle any number of escape characters, or they leave them in the output. Here's an alternative:
function separate($string, $separator = '|', $escape = '\\') {
if (strlen($separator) != 1 || strlen($escape) != 1) {
trigger_error(__FUNCTION__ . ' requires delimiters to be single characters.', E_USER_WARNING);
return;
}
$segments = [];
$string = (string) $string;
do {
$segment = '';
do {
$segment_length = strcspn($string, "$separator$escape");
if ($segment_length) {
$segment .= substr($string, 0, $segment_length);
}
if (strlen($string) <= $segment_length) {
$string = null;
break;
}
if ($escaped = $string[$segment_length] == $escape) {
$segment .= (string) substr($string, ++$segment_length, 1);
}
$string = (string) substr($string, ++$segment_length);
} while ($escaped);
$segments[] = $segment;
} while ($string !== null);
return $segments;
}
This will process a raw string like foo\|ba\r\\|baz| into foo|bar\, baz, and an empty string.
If you want to retain the escape character in the output, you will have to modify the function.
Note: this will have unpredictable behaviour if you're using mb function overloading.
Input Data
key1=val1;key2=val2start\;val2end;key3=val3\\;key4=val4\\\;key5=val5\\\\;key6=val6
REGEX
/(.*?[^\\](\\\\)*?);/
Example
<?php
$data="key1=val1;key2=val2start\\;val2end;key3=val3\\\\;key4=val4\\\\\\;key5=val5\\\\\\\\;key6=val6";
$regex='/(.*?[^\\\\](\\\\\\\\)*?);/';
preg_match_all($regex, $data.';', $matches);
print_r($matches[1]);
Output
Array
(
[0] => key1=val1
[1] => key2=val2start\;val2end
[2] => key3=val3\\
[3] => key4=val4\\\;key5=val5\\\\
[4] => key6=val6
)
You will find this solution more useful than using regex for large strings. I employ a stream to allow usage of fgetcsv, which is optimized for this sort of thing.
<?php
function escaped_explode($string,$delimit,$escape=NULL,$enclosure=NULL,$max_line_length=0){
$r=[];
$stream = fopen('php://memory','r+');
fwrite($stream, $string);
rewind($stream);
while (($data = fgetcsv($stream,$max_line_length,$delimit,$enclosure,$escape)) !== FALSE)
$r=array_merge($r,$data);
fclose($stream);
return $r;
}
?>
Usage:
$pipelined_values = escaped_explode($source,'|','\\');
This is convenient also because you have the option of using enclosures, such as quotes, instead of only escape characters. This is nice if you run into parsing someone's blobs of JSON values, or other syntax, as you can both enclose and escape.
$source= <<<JSON
'{ "key":"val", "n":0}',
'{ "key":"val", "n":1, "name": "French du\'Name" }',
'{ "key":"val", "n":2}'
JSON;
This can be interpreted as follows:
<?php
$objects=[];
$raw= escaped_explode($source, ',', '\\', "'");
foreach($raw as $r)
$objects[] = json_decode($r);
?>
Basically I want to turn a string like this:
<code> &lt;div&gt; blabla &lt;/div&gt; </code>
into this:
&lt;code&gt; <div> blabla </div> &lt;/code&gt;
How can I do it?
The use case (because some people were curious):
A page like this with a list of allowed HTML tags and examples. For example, <code> is an allowed tag, and this would be the sample:
<code><?php echo "Hello World!"; ?></code>
I wanted a reverse function because there are many such tags with samples, and I store them all in an array that I iterate over in one loop instead of handling each one individually...
My version using regular expressions:
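$string = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';
$new_string = preg_replace_callback(
    '/(.*?)(<.*?>|$)/s',
    // The /e modifier was removed in PHP 7, so a callback does the work instead.
    function ($m) { return html_entity_decode($m[1]) . htmlentities($m[2]); },
    $string
);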
It tries to match every tag and textnode and then apply htmlentities and html_entity_decode respectively.
There isn't an existing function, but have a look at this.
So far I've only tested it on your example, but this function should work on all htmlentities
function html_entity_invert($string) {
$matches = $store = array();
preg_match_all('/(&(#?\w){2,6};)/', $string, $matches, PREG_SET_ORDER);
foreach ($matches as $i => $match) {
$key = '__STORED_ENTITY_' . $i . '__';
$store[$key] = html_entity_decode($match[0]);
$string = str_replace($match[0], $key, $string);
}
return str_replace(array_keys($store), $store, htmlentities($string));
}
Update:
Thanks to #Mike for taking the time to test my function with other strings. I've updated my regex from /(\&(.+)\;)/ to /(\&([^\&\;]+)\;)/ which should take care of the issue he raised.
I've also added {2,6} to limit the length of each match to reduce the possibility of false positives.
Changed regex from /(\&([^\&\;]+){2,6}\;)/ to /(&([^&;]+){2,6};)/ to remove unnecessary escaping.
Whooa, brainwave! Changed the regex from /(&([^&;]+){2,6};)/ to /(&(#?\w){2,6};)/ to reduce probability of false positives even further!
Replacing alone will not be good enough for you, whether with regular expressions or simple string replacement: if you replace the &lt; and &gt; entities and then the < and > signs (or vice versa), you will end up with only one encoding (all &lt; and &gt;, or all < and > signs).
So if you want to do this, you will have to take one set out first (I chose to replace it with a placeholder), do a replace, then put it back in and do another replace.
$str = "<code> &lt;div&gt; blabla &lt;/div&gt; </code>";
$search = array("&lt;","&gt;",);
//place holder for &lt; and &gt;
$replace = array("[","]");
//first replace to sub out &lt; and &gt; for [ and ] respectively
$str = str_replace($search, $replace, $str);
//second replace to get rid of original < and >
$search = array("<",">");
$replace = array("&lt;","&gt;",);
$str = str_replace($search, $replace, $str);
//third replace to turn [ and ] into < and >
$search = array("[","]");
$replace = array("<",">");
$str = str_replace($search, $replace, $str);
echo $str;
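As an alternative to the placeholder trick (a different technique, not the one above), strtr() with an array can do the swap in one pass. This sketch assumes the input mixes literal tags with &lt; and &gt; entities and that only those two entities matter.

```php
<?php
// Assumed input: literal tags plus &lt;/&gt; entities that should trade places.
$str = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';

// strtr() with an array scans the string once and never re-replaces its own output,
// so no placeholder characters are needed.
$swapped = strtr($str, array(
    '&lt;' => '<',
    '&gt;' => '>',
    '<'    => '&lt;',
    '>'    => '&gt;',
));

echo $swapped, "\n"; // &lt;code&gt; <div> blabla </div> &lt;/code&gt;
```

This also avoids corrupting text that already contains [ or ].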
I think I have a small solution: why not break the HTML tags into an array and then compare and change them if needed?
function invertHTML($str) {
$res = array();
for ($i=0, $j=0; $i < strlen($str); $i++) {
if ($str[$i] == "<") {
if (isset($res[$j]) && strlen($res[$j]) > 0){
$j++;
$res[$j] = '';
} else {
$res[$j] = '';
}
$pos = strpos($str, ">", $i);
$res[$j] .= substr($str, $i, $pos - $i+1);
$i += ($pos - $i);
$j++;
$res[$j] = '';
continue;
}
$res[$j] .= $str[$i];
}
$newString = '';
foreach($res as $html){
$change = html_entity_decode($html);
if($change != $html){
$newString .= $change;
} else {
$newString .= htmlentities($html);
}
}
return $newString;
}
Modified .... with no errors.
So, although other people on here have recommended regular expressions, which may be the absolute right way to go ... I wanted to post this, as it is sufficient for the question you asked.
Assuming that you are always using html'esque code:
$str = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';
xml_parse_into_struct(xml_parser_create(), $str, $nodes);
$xmlArr = array();
foreach($nodes as $node) {
echo htmlentities('<' . $node['tag'] . '>') . html_entity_decode($node['value']) . htmlentities('</' . $node['tag'] . '>');
}
Gives me the following output:
&lt;CODE&gt; <div> blabla </div> &lt;/CODE&gt;
I'm fairly certain that this wouldn't support going backwards again, as the other solutions posted would, in the sense of:
$orig = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';
$modified = '&lt;CODE&gt; <div> blabla </div> &lt;/CODE&gt;';
$modifiedAgain = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';
I'd recommend using a regular expression, e.g. preg_replace():
http://www.php.net/manual/en/function.preg-replace.php
http://www.webcheatsheet.com/php/regular_expressions.php
http://davebrooks.wordpress.com/2009/04/22/php-preg_replace-some-useful-regular-expressions/
Edit: It appears that I haven't fully answered your question. There is no built-in PHP function to do what you want, but you can do find and replace with regular expressions or even simple expressions: str_replace, preg_replace
I am trying to do something similar to hangman where when you guess a letter, it replaces an underscore with what the letter is. I have come up with a way, but it seems very inefficient and I am wondering if there is a better way. Here is what I have -
<?php
$word = 'ball';
$lettersGuessed = array('b','a');
echo str_replace( $lettersGuessed , '_' , $word ); // __ll
echo '<br>';
$wordArray = str_split ( $word );
$finalWord = ''; // initialise before appending to avoid an undefined-variable notice
foreach ( $wordArray as $letterCheck )
{
if ( in_array( $letterCheck, $lettersGuessed ) )
{
$finalWord .= $letterCheck;
} else {
$finalWord .= '_';
}
}
echo $finalWord; // ba__
?>
str_replace does the opposite of what I want. I want what the value of $finalWord is without having to go through a loop to get the result I desire.
If I am following you right you want to do the opposite of the first line:
echo str_replace( $lettersGuessed , '_' , $word ); // __ll
Why not create an array of $opposite = range('a', 'z'); and then use array_diff () against $lettersGuessed, which will give you an array of unguessed letters. It would certainly save a few lines of code. Such as:
$all_letters = range('a', 'z');
$unguessed = array_diff ($all_letters, $lettersGuessed);
echo str_replace( $unguessed , '_' , $word ); // ba__
It's an array; foreach is what you're supposed to be using, and it's lightning fast anyway. I think you are obsessing over something that's not even a problem.
You want to use an array because you can easily tell which indexes contain the letter, which directly corresponds to which positions in the string should change from _ to a letter.
Your foreach loop is a fine way to do it. It won't be slow because your words will never be huge.
You can also create a regex pattern with the guessed letters to replace everything except those letters. Like this:
$word = 'ball';
$lettersGuessed = array('b','a');
$pattern = '/[^' . implode('', $lettersGuessed) . ']/'; // results in '/[^ba]/'
$maskedWord = preg_replace($pattern, '_', $word);
echo $maskedWord;
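A slightly more defensive sketch of the same idea follows. The function name mask_word is hypothetical. It adds preg_quote so that a guess such as ] or - cannot break the character class, and it handles the case where nothing has been guessed yet.

```php
<?php
// Hypothetical helper: show guessed letters and mask everything else with "_".
function mask_word($word, array $lettersGuessed)
{
    if (!$lettersGuessed) {
        // With no guesses the class would be "[^]", which is not a valid pattern here,
        // so mask the whole word directly.
        return str_repeat('_', strlen($word));
    }
    // preg_quote() keeps characters such as "]", "^" or "-" from changing the class.
    $class = preg_quote(implode('', $lettersGuessed), '/');
    return preg_replace('/[^' . $class . ']/', '_', $word);
}

echo mask_word('ball', array('b', 'a')), "\n"; // ba__
echo mask_word('ball', array()), "\n";         // ____
```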
Another way would be to access the string as an array, e.g.
$word = 'ball';
$length = strlen($word);
$mask = str_pad('', $length, '_');
$guessed = 'l';
for($i = 0; $i < $length; $i++) {
if($word[$i] === $guessed) {
$mask[$i] = $guessed;
}
}
echo $mask; // __ll