Regex for pulling video ID from Rumble URL - php

Argh-- regular expressions make me crazy, I've just spent 20 minutes trying to get this to fly and I'm having no luck. And I know someone here will be able to pop this out in like 2 seconds! :-)
Here's a sample source URL: https://rumble.com/v30sqt-oreo-ice-cream-cake.html
I want to extract the "v30sqt" characters. Actually, I want to extract any characters after "rumble.com/" and before the first dash. It might be alphanumeric, it might be all letters, it might be longer than 6 characters, etc. That's the video ID.
This is for php preg_match.

You can skip the regex and simply use parse_url together with explode and current, like so:
$url = "https://rumble.com/v30sqt-oreo-ice-cream-cake.html";
$parsed_arr = explode("-",ltrim(parse_url($url, PHP_URL_PATH),"/"));
echo current($parsed_arr);
or
echo $parsed_arr[0];

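Try this one; it should work for you:
/(?<=rumble\.com\/)[^-]+/
The lookbehind anchors the match right after "rumble.com/", and [^-]+ stops at the first dash. Note that PHP's preg functions do not accept a g flag; use preg_match_all if you need every match.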

Go for:
<?php
$url = "https://rumble.com/v30sqt-oreo-ice-cream-cake.html";
$regex = '~rumble\.com/(?P<video>[^-]+)~';
if (preg_match($regex, $url, $match)) {
echo $match['video'];
# v30sqt
}
?>
With a demo on ideone.com.
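If you also want to make sure the URL really points at Rumble before trusting the ID, the parse_url and regex ideas above can be combined. This is only a sketch: the function name rumble_video_id is made up for illustration, and accepting just rumble.com and www.rumble.com is my assumption, not something stated in the question.

```php
<?php
// Hypothetical helper (not from the answers above): returns the ID or null.
function rumble_video_id(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($host) || !is_string($path)) {
        return null; // malformed URL, or host/path missing
    }
    // Assumption: only rumble.com and www.rumble.com are accepted.
    if (!preg_match('~^(?:www\.)?rumble\.com$~i', $host)) {
        return null;
    }
    // The ID is everything between the leading slash and the first dash.
    if (preg_match('~^/([^-/]+)-~', $path, $m)) {
        return $m[1];
    }
    return null;
}

echo rumble_video_id("https://rumble.com/v30sqt-oreo-ice-cream-cake.html"), "\n"; // v30sqt
var_dump(rumble_video_id("https://example.com/v30sqt-oreo.html")); // NULL
```

Returning null instead of an empty string makes it easy to tell "no ID found" apart from a real ID.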

Related

URL Regex issue, php

I have a URL regex I use (and have used quite frequently). It does me well for finding various URL formats and http protocols. That said, I wouldn't be writing here if all was dandy in Dandyland.
I've encountered a hiccup that my current regex below is causing.
When searching a string for URLs, if a string consists of something like example...see it will treat it as a URL. There can be any number of periods, however it only pulls the last 3 characters after the last period.
Any ideas how to resolve this?
Example:
$string = "Here's a url, hello.com. But this...shouldn't show.";
$url_regex = "/((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?#)?([a-z0-9-.]*)\.([a-z]{2,3})(\:[0-9]{2,5})?(\/([a-z0-9+\$_\-~#\(\)\%]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:#&#%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/i";
preg_match_all($url_regex, $string, $urls);
return $urls;
The problem was that you had a period among the allowed characters, so the pattern accepted any number of consecutive periods. Also, \b is important when you're searching inline text.
\b((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_-]+(\:[a-z0-9+!*(),;?&=\$_-]+)?#)?([a-z0-9-]*)\.([a-z]+){2,3}(\:[0-9])?(\/([a-z0-9+\$_\-~#\(\)\%]?)+)*\/?(\?[a-z+&\$_-][a-z0-9;:#&#%=+\/\$_-]*)?(#[a-z_-][a-z0-9+\$_-]*)?\b
Edit: Updated the answer to ignore matches like example.c
The following code solves your issue. I have tested it on my end.
$string = "Here's a url, hello.com. But this...shouldn't show.";
$url_regex = "/((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?#)?([a-z0-9-]+?)\.([a-z]{2,3})(\:[0-9]{2,5})?(\/([a-z0-9+\$_\-~#\(\)\%]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:#&#%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/i";
preg_match_all($url_regex, $string, $urls);
Require an explicit scheme (http, https, ftp or file) on the URLs in the string:
$string = "this is my website http://example.com and this is my friend website https://pqr.com etc, this...shouldn't show";
$regex = '/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i';
preg_match_all($regex, $string, $matches);
print_r($matches[0]);
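To see why the dots matter, here is a deliberately simplified sketch that only looks for bare host names (no scheme, port, path or query). The pattern is my own illustration, not one of the regexes above: every label has to start and end with a letter or digit, so there is nothing that could match between two consecutive dots.

```php
<?php
// Simplified, hypothetical pattern: bare host names only (no scheme, path or port).
// Each label must begin and end with a letter or digit, so "this...shouldn't"
// can never match: there is no label between the consecutive dots.
$host_regex = '/\b(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}\b/i';

$string = "Here's a url, hello.com. But this...shouldn't show.";
preg_match_all($host_regex, $string, $hosts);
print_r($hosts[0]); // contains only hello.com
```

The trailing \b keeps the sentence-ending period after hello.com out of the match.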

How to cut out everything from a string except certain part of it in php?

Let's say I have string like this:
Village_name(315|431 K64)
What I want to do is when I paste that into let's say text box, and click a button, all I will be left with is 315|431.
Is there a way of doing this?
Use the below regex and then replace the match with \1.
(\d+\|\d+)|.
It captures the number|number part and matches every remaining character one at a time. Replacing each match with \1 therefore leaves only the number|number part.
In PHP, you may also use this:
(?:\d+\|\d+)(*SKIP)(*F)|.
The substring matching \d+\|\d+ is matched first, and the following (*SKIP)(*F) forces that match to fail and skips past it. The . after the pipe symbol then matches every character except the number|number part, because that part has already been skipped. Replace the matches with an empty string.
I know this question has been answered and an answer has been accepted, but I still want to suggest this one, as you don't really need PHP for this. JavaScript is enough:
var str = 'Village_name(315|431 K64)';
var pattern = /\((\w+\|\w+) /;
var res = str.match(pattern);
document.write(res[1]);
Please try this:-
<?php
$str = 'Village_name(315|431 K64)';
preg_match_all('/(?:\d+\|\d+)/', $str, $matches);
echo "<pre>"; print_r($matches); echo "</pre>"; // print the full match array
foreach ($matches[0] as $match) { // $matches[0] holds the full-pattern matches
echo $match;
}
?>
Output:- http://prntscr.com/74ddg9
Note: explode can work with some adjustment, but only if the format is always exactly what you have given. So go for preg_match_all; it is the more robust choice.
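For reference, here is how the replace-based patterns from the first answer look in PHP, next to a plain preg_match. It is a minimal sketch using the sample string from the question; the variable names are just for illustration.

```php
<?php
$str = 'Village_name(315|431 K64)';

// Approach 1: keep the captured pair, replace every other character with nothing.
$kept = preg_replace('/(\d+\|\d+)|./', '$1', $str);

// Approach 2: skip the pair with (*SKIP)(*F) and delete everything else.
$skipped = preg_replace('/\d+\|\d+(*SKIP)(*F)|./', '', $str);

// Approach 3: just match the pair directly.
$direct = preg_match('/\d+\|\d+/', $str, $m) ? $m[0] : null;

echo $kept, "\n", $skipped, "\n", $direct, "\n"; // each line: 315|431
```

If you only ever need the one pair, approach 3 is probably the easiest to read.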

preg_replace with Regex - find number-sequence in URL

I'm a regex-noobie, so sorry for this "simple" question:
I've got a URL like the following:
http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx
What I'm trying to achieve is getting the number sequence (aka the Job-ID) right before the ".aspx" with preg_replace.
I've already figured out that the regex for finding it could be
(?!.*-).*(?=\.)
Now preg_replace needs the opposite of that regular expression. How can I achieve that? Also worth mentioning:
The URL can have multiple numbers in it. I only need the sequence right before ".aspx". Also, there could be some php attributes behind the ".aspx" like "&mobile=true"
Thank you for your answers!
You can use:
$re = '/[^-.]+(?=\.aspx)/i';
preg_match($re, $input, $matches);
//=> 146370543
This matches a run of characters that are neither hyphens nor dots and that is immediately followed by .aspx, which is checked with the lookahead (?=\.aspx). The matched text ends up in $matches[0].
You can just use preg_match (you don't need preg_replace, since you don't want to change the original string) and capture the number before .aspx. Assuming .aspx is always at the end of the string, the simplest way I could think of is:
<?php
$string = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx";
$regex = '/([0-9]+)\.aspx$/';
preg_match($regex, $string, $results);
print $results[1];
?>
A short explanation:
$results contains an array of matches. The first element holds the text matched by the whole pattern, which is 146370543.aspx in this example. The second element holds the group captured by the parentheses around [0-9]+. Because of the $ anchor, the pattern will not match if anything (such as &mobile=true) follows .aspx; drop the $ in that case.
You can get the opposite by using this regex:
(\D*)\d+(.*)
Working demo
MATCH 1
1. [0-100] `http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-`
2. [109-114] `.aspx`
Even if you just want the number for that url you can use this regex:
(\d+)
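Since the question mentions that something like "&mobile=true" may follow ".aspx", here is a small sketch that covers that case with both preg_match and preg_replace. The URL with the appended parameter is put together from the question's own examples; the patterns are one possible way to write it, not the only one.

```php
<?php
$url = 'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx&mobile=true';

// preg_match: capture the digits that sit between a hyphen and ".aspx".
$id = preg_match('~-(\d+)\.aspx\b~i', $url, $m) ? $m[1] : null;

// preg_replace: match the whole string and keep only the captured digits.
// If the pattern does not match, preg_replace returns the input unchanged.
$same = preg_replace('~^.*-(\d+)\.aspx.*$~i', '$1', $url);

echo $id, "\n", $same, "\n"; // 146370543 on both lines
```

The preg_match version is safer in practice, because a failed match gives you null rather than the untouched URL.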

PHP function to return only parts of string that contain certain characters?

i have a string as follows:
$product_req = "CATEGORY-ACTIVE-8,CATEGORY-ACTIVE-4,ACTIVE-6,ACTIVE-9";
and i need a function that returns only the numbers preceded by "CATEGORY-ACTIVE-" (without the quotes) so in other words it should return: 8,4 and leave everything else out.
Is there any php function that can do this?
Thank you.
Use preg_match_all and take the first capture group:
$input_lines="CATEGORY-ACTIVE-8,CATEGORY-ACTIVE-4,ACTIVE-6,ACTIVE-9";
preg_match_all("/CATEGORY-ACTIVE-(\d+)/", $input_lines, $output_array);
print_r(join(',',$output_array[1]));
output
8,4
Is there any php function that can do this?
Yes you can play around and achieve it with PHP Native functions by writing some code logic. But do it with Regular Expressions (to keep it simple and short).
Using PHP Functions..
<?php
$str = 'CATEGORY-ACTIVE-8,CATEGORY-ACTIVE-4,ACTIVE-6,ACTIVE-9';
$str=explode(',',$str);
$temparr=array();
foreach($str as $v)
{
if(strpos($v,'CATEGORY-ACTIVE-')!==false)
{
$temparr[]=str_replace('CATEGORY-ACTIVE-','',$v);
}
}
echo implode(',',$temparr); //"prints" 8,4
Use a regular expression and implode the result at the end (the preferred way):
<?php
$str = 'CATEGORY-ACTIVE-8,CATEGORY-ACTIVE-4,ACTIVE-6,ACTIVE-9';
preg_match_all('/CATEGORY-ACTIVE-(.*?),/', $str, $matches);
echo implode(',',$matches[1]); //8,4
I'd use a lookaround assertion to accomplish this:
(?<=CATEGORY-ACTIVE-)(\d+)
Code:
$str = 'CATEGORY-ACTIVE-8,CATEGORY-ACTIVE-4,ACTIVE-6,ACTIVE-9';
preg_match_all('/(?<=CATEGORY-ACTIVE-)(\d+)/', $str, $matches);
print_r($matches[1]);
Output:
Array
(
[0] => 8
[1] => 4
)
Yes there is, Feel free to explore the wonderful world of Regex!
http://il1.php.net/preg_match
I recommend you do a bit of reading on this yourself as "getting the answers" when it comes to regex is a sin, You learn nothing from it.
I'm not uber experienced with it myself, but it's one of those things that you must learn 'hands on', theory won't cut it here.
In theory it would look something like this:
$str = "84838493849384938";
preg_match_all('/[8.4]/', $str, $matches);
You can also go play around with regex at this site: http://www.phpliveregex.com/
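Pulling the regex answers above together, here is a small sketch wrapped in a function. The name category_active_ids is made up for illustration, and anchoring each item to a comma or to the start/end of the string is my own assumption about the format; it also picks up an item in the last position, which a pattern that requires a trailing comma would miss.

```php
<?php
// Hypothetical helper: returns the numbers that follow "CATEGORY-ACTIVE-" as integers.
function category_active_ids(string $list): array
{
    // Each item must start the string or follow a comma, and end at a comma or the end.
    preg_match_all('/(?:^|,)CATEGORY-ACTIVE-(\d+)(?=,|$)/', $list, $m);
    return array_map('intval', $m[1]);
}

echo implode(',', category_active_ids('CATEGORY-ACTIVE-8,CATEGORY-ACTIVE-4,ACTIVE-6,ACTIVE-9')), "\n"; // 8,4
echo implode(',', category_active_ids('ACTIVE-6,CATEGORY-ACTIVE-12')), "\n"; // 12
```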

Extracting URLs from a JSON-like string

I need to extract the first URL from some content. The content may be like this:
({items:[{url:"http://cincinnati.ebayclassifieds.com/",name:"Cincinnati"},{url:"http://dayton.ebayclassifieds.com/",name:"Dayton"}],error:null});
or may contain only a link
({items:[{url:"http://portlandor.ebayclassifieds.com/",name:"Portland (OR)"}],error:null});
currently I have :
$pattern = "/\:\[\{url\:\"(.*)\"\,name/";
preg_match_all($pattern, $htmlContent, $matches);
$URL = $matches[1][0];
however it works only if there is a single link so I need a regex which should work for the both cases.
You can use this REGEX:
$pattern = "/url\:\"([^\"]+)\"/";
Worked for me :)
Hopefully this should work for you
<?php
$str = '({items:[{url:"http://cincinnati.ebayclassifieds.com/",name:"Cincinnati"},{url:"http://dayton.ebayclassifieds.com/",name:"Dayton"}],error:null});'; //The string you want to extract the 1st URL from
$match = ""; //Define the match variable
preg_match("%(((ht|f)tp(s?))\://)?(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]+\.(com|edu|gov|mil|net|org|biz|info|name|museum|us|ca|uk)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&\%\$#\=~_\-]+))*%",$str,$match); //I Googled for the best Regular expression for URLs and found the one included in the preg_match
echo $match[0]; //Return the first item in the array (the first URL returned)
?>
This is the website that I found the regular expression on: http://regexlib.com/Search.aspx?k=URL
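As the others have said, json_decode may also work for you, provided the content is first turned into valid JSON (the sample has unquoted keys and is wrapped in "(" and ");").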
That smells like JSON to me. Try using http://php.net/json_decode
Looks like JSON to me, visit http://php.net/manual/en/book.json.php and use json_decode().
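One caveat on the json_decode suggestions: the sample is a JavaScript-style literal with unquoted keys, so json_decode returns null on it as-is. Below is a rough sketch of two ways to get the first URL. The key-quoting step is my own workaround and assumes that no string value contains text like "{word:" or ",word:"; if the source can hand you real JSON, use that instead.

```php
<?php
$content = '({items:[{url:"http://cincinnati.ebayclassifieds.com/",name:"Cincinnati"},{url:"http://dayton.ebayclassifieds.com/",name:"Dayton"}],error:null});';

// Option 1: grab the first url:"..." value directly.
$first = preg_match('/url:"([^"]+)"/', $content, $m) ? $m[1] : null;

// Option 2: turn the literal into JSON, then decode it.
// Strip the surrounding "(" and ");" ...
$json = preg_replace('/^\(|\);?$/', '', trim($content));
// ... and quote the bare keys. Assumes no string value contains "{word:" or ",word:".
$json = preg_replace('/([{,])(\w+):/', '$1"$2":', $json);
$data = json_decode($json, true);
$first_decoded = $data['items'][0]['url'] ?? null;

echo $first, "\n", $first_decoded, "\n";
```

Both options also work when the content holds only a single link, since they simply take the first item.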
