I'm trying to extract IDs from a possibly huge text. What did I miss?
preg_match_all('/(ID\s\d+)/', "ID 20380843, ID 20675712", $matches);
print_r( $matches[0] );
It only returns:
Array
(
[0] => ID 20380843
)
Instead of:
Array
(
[0] => ID 20380843
[1] => ID 20675712
)
Did you copy that string from your code? Because there is something sneaky happening.
When I copied the code into my editor, it gave me this for the string:
"ID 20380843, ID ?20675712"
As you can see, there is a question mark in the second ID, which makes your expression fail :)
Your problem isn't preg_match_all, it's your source file. There's an invisible Unicode character in the second ID. If you copy/paste the string into this Unicode Converter, you'll see U+200B show up in various forms in the lower boxes:
Unicode U+hex notation
preg_match_all('/(ID\s\d+)/', "ID 20380843, ID U+200B^20675712", $matches);
(emphasis mine)
This is the Unicode zero-width space, which is apparently not included in \s as PHP's PCRE functions define it.
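A minimal sketch of how one might confirm and work around this, assuming the text is UTF-8 and the stray character is U+200B. The sample string and variable names are made up for illustration, and the "\u{200B}" escape needs PHP 7 or later.

```php
<?php
// Hypothetical sample: a zero-width space (U+200B) sits before the second number.
$text = "ID 20380843, ID \u{200B}20675712";

// The original pattern stops at the invisible character, so only one ID matches.
preg_match_all('/ID\s\d+/', $text, $before);

// Strip zero-width spaces first (the /u flag makes PCRE treat the input as UTF-8).
$clean = preg_replace('/\x{200B}/u', '', $text);
preg_match_all('/ID\s\d+/', $clean, $after);

print_r($before[0]); // one match
print_r($after[0]);  // two matches
```

Stripping the character before matching keeps the pattern itself unchanged. Other invisible characters may need the same treatment, so it can be worth inspecting the input bytes when a match fails for no visible reason.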
Use print_r($matches) instead of print_r($matches[0]);
Try:
preg_match_all('/(ID\s\d+)/', "ID 20380843, ID 20675712", $matches);
print_r( $matches );
Related
I have several source files that I'm applying preg_match_all to.
This is what I tried:
$lazy = file_get_contents("Some_Source_code.txt");
if(!preg_match("#method_(.*)\(int var0, int var1, int var2\)#", $lazy, $function_name))
die("nothing here");
preg_match_all("#method_".$function_name[1]."\(.*\){1}#", $lazy, $matches);
print_r($matches);
But the output looks like this:
Array
(
[0] => Array
(
[0] => method_2393(int var0, int var1, int var2)
[1] => method_2393(0, 0, 0)).equals(this.field_1351.getText().toString()))
)
)
OK, what I want is $matches[0][1], but without the trailing text.
How can I make the match stop as soon as it reaches the first closing parenthesis ')', just like the first result does?
I can process the line after I extract it, but how can I do it with regex?
I searched the answers of similar problems but they were too specific.
Modify the regex to:
#method_".$function_name[1]."\([^)]*\){1}#
Where it went wrong:
#method_".$function_name[1]."\(.*\){1}#
Here you used \(.*\), where .* matches anything, including the ), so the match runs on to the last ) on the line.
Changes made:
\([^)]*\) Here [^)]* matches anything other than ), so the match ends at the first occurrence of ).
You can also use lazy matching with .*? instead of .*, which is greedy and consumes as many characters as it can.
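The difference between the three variants can be sketched as follows. The input string is a made-up stand-in for the contents of Some_Source_code.txt, and the preg_quote call is an extra precaution that is not part of the original code.

```php
<?php
// Hypothetical stand-in for the contents of Some_Source_code.txt.
$lazy = "void method_2393(int var0, int var1, int var2) {}\n"
      . "if (String.valueOf(method_2393(0, 0, 0)).equals(this.field_1351.getText().toString()))";

// Escape the captured name before reusing it inside another pattern.
$name = preg_quote('2393', '#');

// Greedy: .* runs to the last ")" on each line.
preg_match_all('#method_' . $name . '\(.*\)#', $lazy, $greedy);

// Negated class: [^)]* cannot cross a ")", so the match ends at the first one.
preg_match_all('#method_' . $name . '\([^)]*\)#', $lazy, $negated);

// Lazy: .*? expands only as far as needed to reach a ")".
preg_match_all('#method_' . $name . '\(.*?\)#', $lazy, $lazyMatch);

print_r($greedy[0]);
print_r($negated[0]);
print_r($lazyMatch[0]);
```

One difference to keep in mind: [^)]* can span line breaks, while . does not match a newline unless the s modifier is set.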
I'm working on a website that uses double-byte Japanese characters, and I need to check that the user has entered single-byte katakana. The site is developed in PHP.
This is the preg_match pattern that I used for the check:
'/[\x{3040}-\x{309F}]/u'
I'm not 100% sure whether the test string I use is a legal $string, because it is typed in by hand (with the backslashes escaped) rather than taken from raw input. I'll remove the answer (or try to update it) if it works out differently.
$string = "\\xe3\\x80\\x85"; // RAW input might still be '\xe3\x80\x85' here
$result = preg_match_all("/\\\\xe3\\\\x8[0-3]\\\\x[8-9a-b][0-9a-f]/u", $string, $matches);
echo $string;
echo '<pre>';
print_r($matches);
echo '</pre>';
This prints out:
\xe3\x80\x85
Array
(
[0] => Array
(
[0] => \xe3\x80\x85
)
)
Thus: 々
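As a different angle on the original question, here is a minimal sketch that assumes "single-byte katakana" means half-width katakana and that the input is UTF-8. Unicode places those characters at U+FF61 to U+FF9F, whereas the range U+3040 to U+309F in the question is the hiragana block. The function name is hypothetical.

```php
<?php
// Assumption: "single-byte katakana" means half-width katakana, which Unicode
// places at U+FF61..U+FF9F (U+FF61..U+FF64 are half-width punctuation marks).
function isHalfWidthKatakana(string $input): bool
{
    // \A and \z anchor the whole string; /u enables UTF-8 mode.
    return preg_match('/\A[\x{FF61}-\x{FF9F}]+\z/u', $input) === 1;
}

var_dump(isHalfWidthKatakana("\u{FF76}\u{FF80}\u{FF76}\u{FF85}")); // half-width katakana
var_dump(isHalfWidthKatakana("\u{30AB}\u{30BF}\u{30AB}\u{30CA}")); // full-width katakana
var_dump(isHalfWidthKatakana("\u{3042}"));                         // hiragana
```

If the site stores text in Shift_JIS or EUC-JP rather than UTF-8, the input would need converting first (for example with mb_convert_encoding) before a pattern like this applies.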
I have a function that parses some content to find a homemade link tag and convert it to a normal link tag.
Possible input:
<p>blabalblahhh <moolinkx pageid="121">text to click</moolinkx> blablabah</p>
Output :
<p>blabalblahhh text to click blablabah</p>
Here is my code:
$regex = '/\<moolinkx pageid="(.{1,})"\>(.{1,})\<\/moolinkx\>/';
preg_match_all( $regex, $string, $matches );
It works perfectly well if there is only one such tag in the string. But as soon as there is a second one, it doesn't work.
Input:
<p>blabalblahhh <moolinkx pageid="121">text to click</moolinkx> blablabah.</p>
<p>Another <moolinkx pageid="128">text to clickclick</moolinkx> again blablablah.</p>
That's what I got when I print_r($matches):
Array
(
[0] => Array
(
[0] => <moolinkx pageid="121">text to click</moolinkx> blablabah.</p><p>Another <moolinkx pageid="128">text to clickclick</moolinkx>
)
[1] => Array
(
[0] => 121">text to click</moolinkx> blablabah.</p><p>Another <moolinkx pageid="128
)
[2] => Array
(
[0] => text to clickclick
)
)
I'm not at ease with regex, so it must be something very trivial... but I can't pinpoint what it is :(
Thank you very much in advance!
NB: This is my first post here, though I've been using this terrific Q&A for ages!
Use negated character classes:
$regex = '/<moolinkx pageid="([^"]+)">([^<]+)<\/moolinkx>/';
Explained demo here: http://regex101.com/r/sI3wK5
You are using a greedy quantifier, which treats everything between the first opening tag and the last closing tag as the content between the tags. Change your regex to:
$regex = '/\<moolinkx pageid="(.+?)"\>(.+?)\<\/moolinkx\>/';
preg_match_all( $regex, $string, $matches );
Notice the .{1,} has changed to .+?. The + means one or more instances, and the ? tells the regex to select the fewest characters it can to fulfil the expression.
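Both suggested patterns can be checked side by side with a short sketch. The input is the two-paragraph sample from the question, joined on one line as it appears in the print_r output. PREG_SET_ORDER is used here only to make the per-match loop easier to read.

```php
<?php
// Sample input from the question, joined on one line as in the print_r output.
$string = '<p>blabalblahhh <moolinkx pageid="121">text to click</moolinkx> blablabah.</p>'
        . '<p>Another <moolinkx pageid="128">text to clickclick</moolinkx> again blablablah.</p>';

// Lazy quantifiers: each group stops at the earliest point that lets the match succeed.
preg_match_all('/<moolinkx pageid="(.+?)">(.+?)<\/moolinkx>/', $string, $lazy, PREG_SET_ORDER);

// Negated classes: the id cannot contain a quote, the link text cannot contain "<".
preg_match_all('/<moolinkx pageid="([^"]+)">([^<]+)<\/moolinkx>/', $string, $negated, PREG_SET_ORDER);

// Each $m holds the full tag in [0], the page id in [1] and the link text in [2].
foreach ($lazy as $m) {
    echo $m[1], ' => ', $m[2], PHP_EOL;
}
```

On this input both patterns find the same two tags. The negated-class version would stop matching if the link text ever contained nested markup, while the lazy version would not.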
OK, not sure if I'm being stupid or it's just Monday.
It's actually quite simple. I have a textbox in which I enter text. A word gets marked with a hash (#), and that word is then saved to the DB as the hashtag for that sentence.
Now, my function looks like this:
public function getHashtag($text)
{
    print_r($text);
    preg_match_all('/(#\w+)/', $text, $hashTag);
    print_r($hashTag);
    die();
    if (isset($hashTag[0][0])) {
        $hashTag = $hashTag[0][0];
        return $hashTag;
    } else {
        return '';
    }
}
The print_r calls and the die() are just debug output.
All I want to achieve is to get the word with the hash. It works great, EXCEPT if someone enters a word in French that contains àèé or other accented characters.
The match then stops at the first special character:
#dfsdfaàèé asda sda sd asd aArray ( [0] => Array ( [0] => #dfsdfa ) [1] => Array ( [0] => #dfsdfa ) )
any ideas? :D
Just use this expression /(#[^\s[:punct:]]+)/.
Reads as "A # plus at least one character that is not white-space or punctuation."
The [:punct:] is one of the POSIX character classes.
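An alternative that is not part of the answer above is to keep the "word character" idea but make it Unicode-aware. A minimal sketch, assuming the input is valid UTF-8; the function name is hypothetical:

```php
<?php
// Assumes $text is valid UTF-8; with invalid input preg_match returns false under /u.
function getHashtagUnicode(string $text): string
{
    // \p{L} = any letter, \p{N} = any number; /u switches PCRE to UTF-8 mode.
    if (preg_match('/#[\p{L}\p{N}_]+/u', $text, $m) === 1) {
        return $m[0];
    }
    return '';
}

echo getHashtagUnicode('#dfsdfaàèé asda sda sd asd a'), PHP_EOL;
```

Compared with excluding whitespace and punctuation, this whitelists letters, digits and the underscore, so symbols such as "€" end the hashtag as well.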
I have a product that is sold to multiple customers. Each customer has its own unique product code derived from my original product code, e.g.
My code: 1245-65
Customer 1: 1245/65
Customer 2: 1245.65
My question: Is there any way to analyse such a string and find what is separating its integers? My goal is to have a settings page where a demo customer code would be entered then all product codes would be derived from that example code. I'm sure PHP can handle this!
EXTRA INFO:
Sorry, I haven't given enough information. There might be a situation where the separator is a string of letters, e.g. 1245ABC65. I hate updating a question like this when so many people have given valid answers :( My fault.
You can use a regular expression to find the separator.
$str = '1245/65';
preg_match("/\d+(.)\d+/", $str, $separator);
$separator = $separator[1];
You may want to look for non-numeric characters using preg_match_all:
preg_match_all('/[^0-9]/', '1245-95', $matches);
print_r($matches);
//Array ( [0] => Array ( [0] => - ) ) in the example
For the updated question, you would write:
$str = '1245ABC65';
preg_match("/\d+([^0-9]+)\d+/", $str, $separator);
echo $separator = $separator[1];
or
preg_match_all('/[^0-9]+/', '1245ABC95', $matches);
print_r($matches);
//Array ( [0] => Array ( [0] => ABC ) ) in the example
Use preg_split with a regular expression that splits on digits, so only the non-digit characters remain.
$separador = preg_split('/\d/', '1234/65', -1, PREG_SPLIT_NO_EMPTY);
$separador = $separador[0];
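Putting the answers together for the settings-page idea, a minimal sketch could look like this. Both function names are hypothetical, and it assumes that every code consists of exactly two digit groups and that your own codes always use "-".

```php
<?php
// Hypothetical helper: returns whatever sits between the two digit runs, or null.
function detectSeparator(string $demoCode): ?string
{
    // \D+ matches one or more non-digits, so "/", "." and "ABC" all work.
    if (preg_match('/^\d+(\D+)\d+$/', $demoCode, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Hypothetical helper: rewrites one of "my" codes (assumed to use "-") for a customer.
function toCustomerCode(string $myCode, string $separator): string
{
    return str_replace('-', $separator, $myCode);
}

$sep = detectSeparator('1245ABC65');           // demo code entered on the settings page
echo toCustomerCode('1245-65', $sep), PHP_EOL; // derived customer code
```

Returning null for input that does not fit the pattern lets the settings page reject a malformed demo code instead of silently guessing a separator.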