Extract words between two words with regex

I have a string:
8 nights you doodle poodle
I want to retrieve everything between nights and poodle, so for the example above the output should be you doodle.
I'm using the regex below. Can someone point out what I am doing wrong?
if (preg_match("nights\s(.*)\spoodle", "8 nights you doodle poodle", $matches1)) {
echo $matches1[0]."<br />";
}

You're close, but you're reading the wrong index of $matches1. $matches1[0] holds the entire text matched by the pattern (here, nights you doodle poodle); the first capture group is in $matches1[1].
Try $matches1[1].
Also, PCRE patterns need delimiters, so enclose your regex in / characters:
if (preg_match("/nights\s(.*)\spoodle/", "8 nights you doodle poodle", $matches1)) {
echo $matches1[1]."<br />";
}
Output
you doodle<br />
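
To make the difference between the two indexes concrete, here is a minimal sketch that prints both for the string from the question. The variable names $subject and $matches are illustrative only.

```php
<?php
$subject = "8 nights you doodle poodle";

// Delimited pattern with one capture group between the two marker words.
if (preg_match('/nights\s(.*)\spoodle/', $subject, $matches)) {
    echo $matches[0], "\n"; // whole match: nights you doodle poodle
    echo $matches[1], "\n"; // first capture group: you doodle
}
```

Note that (.*) is greedy, so if poodle occurred twice in the subject, the capture would run up to the last occurrence; (.*?) would stop at the first.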

You probably want something like this
if (preg_match("/nights\s(.*)\spoodle/", "8 nights you doodle poodle", $matches1)) {
echo $matches1[1]."<br />";
}
Check out rubular.com to test your regular expressions. Here is another relevant question:
Using regex to match string between two strings while excluding strings

Related

preg_match() in php: storing the matches in variable

This is the code:
<?php
$eqn1="0.068683000000003x1+2.046124y1+-0.4153z1=0.486977512";
if(preg_match("/^[0-9]\.[0-9]{1,}x[0-9]$/",$eqn1,$vx1))
{
echo "X1 is:". $vx1[0];
echo "Match found.";
}
else
echo "Match not found.";
?>
OUTPUT:
Match not found.
Here, I'm trying to extract the value of x1 (that is, 0.068683000000003) and store it in the variable $vx1. It always prints "Match not found." What is wrong with my code? If you find any errors, please suggest a fix.
Thanks.

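Your regex is almost right, but you have to remove the ^ and $ anchors. You also have to put parentheses around the group you want to capture.
if(preg_match("/([0-9]\.[0-9]*)x[0-9]/",$eqn1,$vx1))
Together, ^ and $ make the pattern match only if the whole string consists of the enclosed regex, but your string contains more than just this number. With the parentheses in place, the number is available in $vx1[1].
If you want to keep the anchors anyway, you have to expand the regex so that it consumes the rest of the line, something like this:
^.*="([0-9]\.[0-9]*)x[0-9].*$
(This form matches the whole assignment line, $eqn1="...";, as text. To match the value of $eqn1 itself, drop the .*=" prefix: ^([0-9]\.[0-9]*)x[0-9].*$)
See http://regexr.com/3diqc for a working example.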
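
The effect of the anchors can be checked directly. The sketch below runs the original pattern, an unanchored pattern with a capture group, and an anchored pattern that consumes the rest of the string, all against the value of $eqn1 from the question. The variable names $original, $unanchored, $anchored, $m1, and $m2 are illustrative.

```php
<?php
$eqn1 = "0.068683000000003x1+2.046124y1+-0.4153z1=0.486977512";

// Original pattern: ^ and $ require the whole string to be "<number>x<digit>".
$original = preg_match('/^[0-9]\.[0-9]{1,}x[0-9]$/', $eqn1);

// Unanchored pattern with a capture group around the number.
$unanchored = preg_match('/([0-9]\.[0-9]*)x[0-9]/', $eqn1, $m1);

// Anchored pattern that also consumes the rest of the string with .*
$anchored = preg_match('/^([0-9]\.[0-9]*)x[0-9].*$/', $eqn1, $m2);

var_dump($original); // int(0): no match
var_dump($m1[1]);    // the captured coefficient of x1
var_dump($m2[1]);    // the same coefficient, via the anchored pattern
```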

This should solve the problem:
<?php
$eqn1="0.068683000000003x1+2.046124y1+-0.4153z1=0.486977512";
preg_match("/^[0-9]\.[0-9]{1,}/",$eqn1,$vx1);
print_r($vx1);

<?php
$eqn1="0.068683000000003x1+2.046124y1+-0.4153z1=0.486977512";
if(preg_match("/(?<vx1>^[0-9]\.[0-9]{1,})x[0-9]/", $eqn1, $vx1)) {
echo "X1 is:". $vx1['vx1']. "\n";
echo "Match found.";
} else {
echo "Match not found.";
}
var_dump($vx1);
Result:
X1 is:0.068683000000003
Match found.array(3) {
[0]=>
string(19) "0.068683000000003x1"
["vx1"]=>
string(17) "0.068683000000003"
[1]=>
string(17) "0.068683000000003"
}
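The error in the original pattern, if(preg_match("/^[0-9]\.[0-9]{1,}x[0-9]$/",$eqn1,$vx1)), is the trailing $: it requires the string to end right after x1, but $eqn1 continues with +2.046124y1 and so on.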

Regex - the difference in \\n and \n

Sorry to add another "Regex explanation" question to the internet, but I must know the reason for this. I have run this regex through RegexBuddy and Regex101.com, which didn't help.
I came across the following regex ("%4d%[^\\n]") while debugging a time parsing function. Every now and then I would receive an 'invalid date' error but only during the months of January and June. I mocked up some code to recreate exactly what was happening but I can't figure out why removing the one slash fixes it.
<?php
$format = '%Y/%b/%d';
$random_date_strings = array(
'2015/Jan/03',
'1985/Feb/13',
'2001/Mar/25',
'1948/Apr/02',
'1948/May/19',
'2020/Jun/22',
'1867/Jul/09',
'1901/Aug/11',
'1945/Sep/21',
'2000/Oct/31',
'2009/Nov/24',
'2015/Dec/02'
);
$year = null;
$rest_of_string = null;
echo 'Bad Regex:';
echo '<br/><br/>';
foreach ($random_date_strings as $date_string) {
sscanf($date_string, "%4d%[^\\n]", $year, $rest_of_string);
print_data($date_string, $year, $rest_of_string);
}
echo 'Good Regex:';
echo '<br/><br/>';
foreach ($random_date_strings as $date_string) {
sscanf($date_string, "%4d%[^\n]", $year, $rest_of_string);
print_data($date_string, $year, $rest_of_string);
}
function print_data($d, $y, $r) {
echo 'Date string: ' . $d;
echo '<br/>';
echo 'Year: ' . $y;
echo '<br/>';
echo 'Rest of string: ' . $r;
echo '<br/>';
}
?>
Feel free to run this locally but the only two outputs I'm concerned about are the months of June and January. "%4d%[^\\n]" will truncate $rest_of_string to /Ju and /Ja while "%4d%[^\n]" displays the rest of the string as expected (/Jan/03 & /Jun/22).
Here's my interpretation of the faulty regex:
%4d% - Get four digits.
[^\\n] - Look for those digits in between the beginning of the string and a new line.
Can anyone please correct my explanation and/or tell me why removing the slash gives me the result I expect?
I don't care for the HOW...I need the WHY.

As @LucasTrzesniewski pointed out, that's sscanf() syntax; it has nothing to do with regex. The format is explained on the sprintf() page.
In your pattern "%4d%[^\\n]", PHP's double-quoted string parsing turns the two backslashes into a single backslash character, so sscanf() receives a character set containing a literal backslash and the letter n. The correct interpretation of the "faulty" pattern is therefore:
%4d - Read an integer of at most four digits.
%[^\\n] - Match a run of characters that are neither a backslash nor the letter "n"
That's why it matches everything up until the "n" in "Jan" and "Jun".
The correct pattern is "%4d%[^\n]", where the \n translates to a newline character, and its interpretation is:
%4d - Read an integer of at most four digits.
%[^\n] - Match a run of characters that are not a newline
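
The two formats can be compared side by side on one of the dates from the question. This is a minimal sketch; the variable names are illustrative, and it uses the form of sscanf() that returns the parsed values as an array.

```php
<?php
$date = '2015/Jan/03';

// "\\n" is a backslash followed by the letter n, so the set excludes "\" and "n".
[$yearA, $restA] = sscanf($date, "%4d%[^\\n]");

// "\n" is a newline character, so the set excludes only newlines.
[$yearB, $restB] = sscanf($date, "%4d%[^\n]");

var_dump($restA); // stops before the "n" in "Jan"
var_dump($restB); // the rest of the string
```

Writing the format in single quotes would not help here: '%4d%[^\n]' also passes a literal backslash and n to sscanf(), which is the "bad" case again.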

PHP preg_match() in mobile number

How can I write a preg_match() for a mobile number that only accepts:
"+", "63", "+63", or "09" at the start,
and a "-" that can be placed within the number?
The number should contain at most one "+", and only at the beginning.
The "-" can be placed anywhere within the number, but only once.
Length limits for 09, 63, and +63:
With 09, exactly 11 digits are allowed, including the 09.
With 63, exactly 12 digits are allowed, including the 63.
With +63, exactly 13 characters are allowed, including the +63.
Example:
+639164455539
639164455539
09164455539
0916-4455539
Here's my code:
<form name="myform" id="myform" method="post" action="index.php">
<input type="text" id="mobile" name="mobile" placeholder="Input mobile number"> <br /><br /><br />
<?php
$mobile = $_POST['mobile'];
if (isset($_POST['submit_btn'])) {
$submit = $_POST['submit_btn'];
if (preg_match("/^(09|63)[\d]{9}$/m", $mobile)) {
// valid mobile number
echo $mobile;
}
else{
echo "ERROR!";
}
}
?>
<br />
<input type="submit" id="submit_btn" name="submit_btn" value="Submit!">
</form>

Try this (note that it requires a leading + and allows a hyphen only directly after the prefix):
if (preg_match("/^\+(09|63)-?[\d]{9}$/m", $mobile)) {
// valid mobile number
echo $mobile;
}

Try this regular expression:
/^(?:09|\+?63)(?:\d(?:-)?){9,10}$/m
It matches all examples in the updated question
See http://regex101.com/r/bS0tW2
With the added requirement of "at most one hyphen" we get
/^(?:09|\+?63)(?!.*-.*-)(?:\d(?:-)?){9,10}$/m
The negative look ahead says "from this point forward you cannot match two hyphens".
Updated demo:
http://regex101.com/r/pZ4eJ6
Finally - your requirement about number of digits, while not exactly as you stated in your comment, can probably be met with
/^(?:09|\+?639)(?!.*-.*-)(?:\d(?:-)?){9}$/m
Which actually says:
(?:09|\+?639) - start with either 09 or (optional +)639
(?!.*-.*-) - there may not be more than one hyphen in what follows
(?:\d(?:-)?){9} - there must be exactly 9 digits in what follows; don't count a (single) hyphen
Demo at http://regex101.com/r/zQ5lU8
If you really need exactly what you said, you need to make a bigger "or" statement. Something like this:
^((?:09)(?!.*-.*-)(?:\d(?:-)?){9}$|^\+?63(?!.*-.*-)(?:\d(?:-)?){10}$)
This basically splits the expression in two: either "one starting with the international access code", or "one that doesn't". I don't think it is better because I'm pretty sure that your "international" mobile number always starts with +639, so validating that specific sequence is better for detecting a valid mobile number. However, I just read that for the Philippines
Mobile phone area codes are three digits long and always start with
the number 9, although recently new area codes have been issued with 8
as the starting digit, particularly for VOIP phone numbers.
You might want to consider that as you create your validation...
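
As a quick check of the "bigger or" expression above, the sketch below wraps it in a function and runs it against the example numbers from the question plus two invalid inputs. The function name isValidMobile and the two invalid samples are assumptions made for illustration; the pattern itself is copied from the answer, with / delimiters added.

```php
<?php
// Hypothetical helper; the pattern is the two-branch expression from the answer above.
function isValidMobile(string $number): bool
{
    $pattern = '/^((?:09)(?!.*-.*-)(?:\d(?:-)?){9}$|^\+?63(?!.*-.*-)(?:\d(?:-)?){10}$)/';
    return preg_match($pattern, $number) === 1;
}

$samples = [
    '+639164455539',  // +63 followed by 10 digits
    '639164455539',   // 63 followed by 10 digits
    '09164455539',    // 09 followed by 9 digits
    '0916-4455539',   // one hyphen is allowed
    '0916-445-5539',  // two hyphens: rejected by the lookahead
    '0916445553',     // one digit short
];

foreach ($samples as $sample) {
    echo str_pad($sample, 16), isValidMobile($sample) ? 'valid' : 'invalid', "\n";
}
```

One caveat of this pattern: because each digit may be followed by an optional hyphen, a single trailing hyphen (as in 09164455539-) is also accepted.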

Hope this helps. It can probably be made even shorter.
$yournumber="+639164455539";
if (preg_match("/(^\+?63(?!.*-.*-)(?:\d(?:-)?){10}$)|(^09(?!.*-.*-)(?:\d(?:-)?){9}$)/",$yournumber))
{
echo "Valid";
}
else
{
echo "Invalid number";
}

PHP: How to find the beginning and end of a substring in a string?

This is the content of one mysql table field:
Flash LEDs: 0.5W
LED lamps: 5mm
Low Powers: 0.06W, 0.2W
Remarks(1): this is remark1
----------
Accessories: Light Engine
Lifestyle Lights: Ambion, Crane Fun
Office Lights: OL-Deluxe Series
Street Lights: Dolphin
Retrofits: SL-10A, SL-60A
Remarks(2): this is remark2
----------
Infrared Receiver Module: High Data Rate Short Burst
Optical Sensors: Ambient Light Sensor, Proximity Sensor, RGB Color Sensor
Photo Coupler: Transistor
Remarks(3): this is remark3
----------
Display: Dot Matrix
Remarks(4): this is remark4
Now, I want to read the remarks and store them in a variable. Remarks(1), Remarks(2), etc. are fixed. 'this is remark1', etc. come from form input fields, so they are flexible.
Basically what I need is: Read everything between 'Remarks(1):' and '--------' and save it in a variable.
Thanks for your help.

You can use regex:
preg_match_all("~Remarks\(([^)]+)\):([^\n]+)~", $str, $m);
As seen on ideone.
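For each line of the form Remarks(X): Y, the regex puts X in match group 1 and Y in match group 2. Y is captured up to the end of the line and includes the space after the colon, so trim() it if needed.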
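
Here is a minimal sketch of how the matches could be collected into an array keyed by remark number. The shortened sample text and the variable name $remarks are assumptions for illustration; the pattern is the one from the answer above.

```php
<?php
// Shortened version of the field content from the question.
$str = "Flash LEDs: 0.5W\n"
     . "Remarks(1): this is remark1\n"
     . "----------\n"
     . "Display: Dot Matrix\n"
     . "Remarks(4): this is remark4";

preg_match_all('~Remarks\(([^)]+)\):([^\n]+)~', $str, $m);

// Key each remark by its number and trim the space after the colon.
$remarks = array_combine($m[1], array_map('trim', $m[2]));

print_r($remarks);
```

This only captures the remainder of the line that starts with Remarks(X):. A remark that spans several lines before the ---------- separator would need a different pattern.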

This would be a job for regular expressions, which allow you to match on exactly the kinds of rules your requirements express. Here is a tutorial for you.

Use a preg function for this, or use explode() and implode() to get the same result. Don't use substr(); it may not give the correct result.
Example using explode() and implode() for your string:
$sdr = "Remarks(4): this is remark4";
$sdr1 = explode(":",$sdr);
// array_shift() removes and returns the first piece, "Remarks(4)"
$frst = array_shift($sdr1);
// Rejoin the rest with ":" so colons inside the remark survive, then trim the leading space
$secnd = trim(implode(":", $sdr1));
echo "First String - ".$frst;
echo "<br>";
echo "Second String - ".$secnd;
echo "<br>";
Output:
First String - Remarks(4)
Second String - this is remark4

How to show what text it is finding that is similar?

I am experimenting with finding similar text between a string and an online article. I am playing with similar_text() in php that shows the percentage a string matches. But I am trying to figure out how to echo out what similar_text() is finding that is similar. Is there any way to do this?
Here is a sample of what I am trying to do:
$similarText = similar_text($articleContent, $wordArr[$wordNum][1], $p);
//if(strpos($articleContent, $wordArr[$wordNum][1] ) !== false)
if($p > .25)
{
$test =($wordArr[$wordNum][1] - similar_text($articleContent, $wordArr[$wordNum][1]));
echo $test."<br/>";
echo "Percent: $p%"."<br/>";
echo "MATCH NAME<br/>";
print_r($wordArr[$wordNum]);
echo "<br/><br/>";
}
similar_text() gives me the percentage by which the strings match, but I want to see how it works and actually show which text matched which. Something like:
echo $matcher." matches ".$matchee;

Consider posting an example to get a better answer.
<?php
similar_text($string1, $string2, $p);
echo "Percent: $p%";
?>
If you need to see how many characters differ:
<?php echo strlen($string2) - similar_text($string1, $string2); ?>
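
To show what these calls return, here is a small sketch with concrete strings. The sample strings and the variable names $sim, $percent, and $unmatched are chosen purely for illustration.

```php
<?php
// Sample strings chosen for illustration only.
$string1 = "Word";
$string2 = "World";

// Returns the number of matching characters and stores the percentage in $percent.
$sim = similar_text($string1, $string2, $percent);

// Characters of $string2 that were not matched.
$unmatched = strlen($string2) - $sim;

printf("Matching characters: %d\n", $sim);
printf("Percent: %.2f%%\n", $percent);
printf("Unmatched characters in second string: %d\n", $unmatched);
```

The percentage is on a 0 to 100 scale, so a threshold such as $p > .25 tests for 0.25 percent; use 25 for a quarter. similar_text() returns only the count and the percentage, not the matched text itself, so displaying the shared text would need a separate comparison.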
