Get Everything between two characters - php

I'm using PHP. I'm trying to get a Regex pattern to match everything between value=" and " i.e. Line 1 Line 2,...,to Line 4.
value="Line 1
Line 2
Line 3
Line 4"
I've tried /.*?/ but it doesn't seem to work.
I'd appreciate some help.
Thanks.
P.S. I'd just like to add, in response to some comments, that all strings between the first " and last " are acceptable. I'm just trying to find a way to get everything between the very first " and very last " even when there is a " in between. I hope this makes sense. Thanks.

Assuming the desired character is "double quote":
$pat = '/\"([^\"]*?)\"/'; // text between quotes excluding quotes
$value='"Line 1 Line 2 Line 3 Line 4"';
preg_match($pat, $value, $matches);
echo $matches[1]; // $matches[0] is string with the outer quotes

if you just want answer and not want specific regex,then you can use this:
<?php
$str='value="Line 1
Line 2
Line 3
Line 4"';
$need=explode("\"",$str);
var_dump($need[1]);
?>

/.*?/ has the effect to not match the new line characters. If you want to match them too, you need to use a regular expression like /([^"]*)/.
I agree with Josh K that a regular expression is not required in this case (especially if you know there will not be any apices apart the one to delimit the string). You could adopt the solution given by him as well.

If you must use regex:
if (preg_match('!"([^"]+)"!', $value, $m))
echo $m[1];

You need s pattern modifier. Something like: /value="(.*)"/s

I'm not a regex guru, but why not just explode it?
// Say $var contains this value="..." string
$arr = explode('value="');
$mid = explode('"', $arr[1]);
$fin = $mid[0]; // Contains what you're looking for.

The specification isn't clear, but you can try something like this:
/value="[^"]*"/
Explanation:
First, value=" is matched literally
Then, match [^"]*, i.e. anything but ", possibly spanning multiple lines
Lastly, match " literally
This does not allow " to appear between the "real" quotes, not even if it's escaped by e.g. preceding with a backslash.
The […] is a character class. Something like [aeiou] matches one of any of the lowercase vowels. [^…] is a negated character class. [^aeiou] matches one of anything but the lowercase vowels.
References
regular-expressions.info/Examples - Programming Language Constructs - Strings
Has variations on different string patterns (e.g. allowing escaped quotes)
Related questions
Difference between .*? and .* for regex
As much as is practical, negated character class is always a better option than .*?

Related

laravel route with any character until next / [duplicate]

I am looking for a pattern that matches everything until the first occurrence of a specific character, say a ";" - a semicolon.
I wrote this:
/^(.*);/
But it actually matches everything (including the semicolon) until the last occurrence of a semicolon.
You need
/^[^;]*/
The [^;] is a character class, it matches everything but a semicolon.
^ (start of line anchor) is added to the beginning of the regex so only the first match on each line is captured. This may or may not be required, depending on whether possible subsequent matches are desired.
To cite the perlre manpage:
You can specify a character class, by enclosing a list of characters in [] , which will match any character from the list. If the first character after the "[" is "^", the class matches any character not in the list.
This should work in most regex dialects.
Would;
/^(.*?);/
work?
The ? is a lazy operator, so the regex grabs as little as possible before matching the ;.
/^[^;]*/
The [^;] says match anything except a semicolon. The square brackets are a set matching operator, it's essentially, match any character in this set of characters, the ^ at the start makes it an inverse match, so match anything not in this set.
None of the proposed answers did work for me. (e.g. in notepad++)
But
^.*?(?=\;)
did.
Try /[^;]*/
Google regex character classes for details.
sample text:
"this is a test sentence; to prove this regex; that is g;iven below"
If for example we have the sample text above, the regex /(.*?\;)/ will give you everything until the first occurence of semicolon (;), including the semicolon: "this is a test sentence;"
Try /[^;]*/
That's a negating character class.
This was very helpful for me as I was trying to figure out how to match all the characters in an xml tag including attributes. I was running into the "matches everything to the end" problem with:
/<simpleChoice.*>/
but was able to resolve the issue with:
/<simpleChoice[^>]*>/
after reading this post. Thanks all.
this is not a regex solution, but something simple enough for your problem description. Just split your string and get the first item from your array.
$str = "match everything until first ; blah ; blah end ";
$s = explode(";",$str,2);
print $s[0];
output
$ php test.php
match everything until first
This will match up to the first occurrence only in each string and will ignore subsequent occurrences.
/^([^;]*);*/
"/^([^\/]*)\/$/" worked for me, to get only top "folders" from an array like:
a/ <- this
a/b/
c/ <- this
c/d/
/d/e/
f/ <- this
Really kinda sad that no one has given you the correct answer....
In regex, ? makes it non greedy. By default regex will match as much as it can (greedy)
Simply add a ? and it will be non-greedy and match as little as possible!
Good luck, hope that helps.
This works for getting the content from the beginning of a line till the first word,
/^.*?([^\s]+)/gm
I faced a similar problem including all the characters until the first comma after the word entity_id. The solution that worked was this in Bigquery:
SELECT regexp_extract(line_items,r'entity_id*[^,]*')

PHP preg_replace pattern only seems to work if its wrong?

I have a string that looks like this
../Clean_Smarty_Projekt/tpl/templates_c\.
../Clean_Smarty_Projekt/tpl/templates_c\..
I want to replace ../, \. and \.. with a regulare expression.
Before, I did this like this:
$result = str_replace(array("../","\..","\."),"",$str);
And there it (pattern) has to be in this order because changing it makes the output a little buggy. So I decided to use a regular expression.
Now I came up with this pattern
$result = preg_replace('/(\.\.\/)|(\\[\.]{1,2})/',"",$str);
What actually returns only empty strings...
Reason: (\\[\.]{1,2})
In Regex101 its all ok. (Took me a couple of minutes to realize that I don't need the /g in preg_replace)
If I use this pattern in preg_replace I have to do (\\\\[\.]{1,2}) to get it to work. But that's obviously wrong because im not searching for two slashes.
Of course I know the escaping rulse (escaping slashes).
Why doesn't this match correctly ?
I suggest you to use a different php delimiter. Within the / delimiter, you need to use three \\\ or four \\\\ backslashes to match a single backslash.
$string = '../Clean_Smarty_Projekt/tpl/templates_c\.'."\n".'../Clean_Smarty_Projekt/tpl/templates_c\..';
echo preg_replace('~\.\./|\\\.{1,2}~', '', $string)
Output:
Clean_Smarty_Projekt/tpl/templates_c
Clean_Smarty_Projekt/tpl/templates_c

PHP Regex: match text urls until space or end of string

This is the text sample:
$text = "asd dasjfd fdsfsd http://11111.com/asdasd/?s=423%423%2F gfsdf http://22222.com/asdasd/?s=423%423%2F
asdfggasd http://3333333.com/asdasd/?s=423%423%2F";
This is my regex pattern:
preg_match_all( "#http:\/\/(.*?)[\s|\n]#is", $text, $m );
That match the first two urls, but how do I match the last one? I tried adding [\s|\n|$] but that will also only match the first two urls.
Don't try to match \n (there's no line break after all!) and instead use $ (which will match to the end of the string).
Edit:
I'd love to hear why my initial idea doesn't work, so in case you know it, let me know. I'd guess because [] tries to match one character, while end of line isn't one? :)
This one will work:
preg_match_all('#http://(\S+)#is', $text, $m);
Note that you don't have to escape the / due to them not being the delimiting character, but you'd have to escape the \ as you're using double quotes (so the string is parsed). Instead I used single quotes for this.
I'm not familar with PHP, so I don't have the exact syntax, but maybe this will give you something to try. the [] means a character class so |$ will literally look for a $. I think what you'll need is another look ahead so something like this:
#http:\/\/(.*)(?=(\s|$))
I apologize if this is way off, but maybe it will give you another angle to try.
See What is the best regular expression to check if a string is a valid URL?
It has some very long regular expressions that will match all urls.

Regular expression to match single dot but not two dots?

Trying to create a regex pattern for email address check. That will allow a dot (.) but not if there are more than one next to each other.
Should match:
test.test#test.com
Should not match:
test..test#test.com
Now I know there are thousands of examples on internet for e-mail matching, so please don't post me links with complete solutions, I'm trying to learn here.
Actually the part that interests me the most is just the local part:
test.test that should match and test..test that should not match.
Thanks for helping out.
You may allow any number of [^\.] (any character except a dot) and [^\.])\.[^\.] (a dot enclosed by two non-dots) by using a disjunction (the pipe symbol |) between them and putting the whole thing with * (any number of those) between ^ and $ so that the entire string consists of those. Here's the code:
$s1 = "test.test#test.com";
$s2 = "test..test#test.com";
$pattern = '/^([^\.]|([^\.])\.[^\.])*$/';
echo "$s1: ", preg_match($pattern, $s1),"<p>","$s2: ", preg_match($pattern, $s2);
Yields:
test.test#test.com: 1
test..test#test.com: 0
This seams more logical to me:
/[^.]([\.])[^.]/
And it's simple. The look-ahead & look-behinds are indeed useful because they don't capture values. But in this case the capture group is only around the middle dot.
strpos($input,'..') === false
strpos function is more simple, if `$input' has not '..' your test is success.
To answer the question in the title, I'd update the RegExp by Junuxx and allow dots in the beginning and end of the string:
'/^\.?([^\.]|([^\.]\.))*$/'
which is optional . in the beginning followed by any number of non-. or [non-. followed by .].
^([^.]+\.?)+#$
That should do for the what comes before the #, I'll leave the rest for you.
Note that you should optimise it more to avoid other strange character setups, but this seems sufficient in answering what interests you
Don't forget the ^ and $ like I first did :(
Also forgot to slash the . - silly me

Replacing a string using preg_match

I'm having trouble using preg_match to find and replace a string. The string of interest is:
<span style="font-size:0.6em">EXPIRATION DATE: 04/30/2011</span>
I need to target and replace the date, "04/30/2011" with a different date. Can someone throw me a bone a give me the regular expression to match this pattern using preg_match in PHP? I also need it to match in such a way that it only replaces up to the first closing span and not closing span tags later in the code, e.g.:
<span style="font-size:0.6em">EXPIRATION DATE: 04/30/2011</span><span class="hello"></span>
I'm not versed in regex, and although I've spent the last hour trying to learn enough to make this work, I'm utterly failing. Thanks so much!
EDIT: As you can see this has gotten me exhausted. I did mean preg_replace, not preg_match.
If you're after a replacement, consider using preg_replace(), something like
preg_replace('#(\d{2})/(\d{2})/(\d{4})#', '<new date>', $string);
How about this:
$toBeFoundPattern = '/([0-9][0-9])\/([0-9][0-9])\/([0-9][0-9][0-9][0-9])/';
$toBeReplacedPattern = '$2.$1.$3';
$inString = '<span style="font-size:0.6em">EXPIRATION DATE: 04/30/2011</span>';
// Will convert from US date format 04/30/2011 to european format 30.04.2011
echo preg_replace( $toBeFoundPattern, $toBeReplacedPattern, $inString );
and prints
EXPIRATION DATE: 30.04.2011
Patterns always begin and end with identical so called delimiter characters. Often the character / is used.
$1 references the string, which matched the first string matched by ([0-9][0-9]), $2 references be (...) and $3 the four letters matched by the last (...).
[...] matched a single character, which is one of those listed inside the brackets. E.g. [a-z] matches all lower case letters.
To use the special meaning character / inside of a pattern, you need to escape it by \ to make it be the literal slash character.
Update: Using {..} as pointed out below is shorthand for repeated patterns.
Regex should be:
(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d
If you want to only match one instance, this is OK. For multiple instances, use preg_match_all instead. Taken from http://www.regular-expressions.info/regexbuddy/datemmddyyyy.html.
Edit: are you looking to just search and replace inside a PHP script or do you want to do some javascript live replacement?

Categories