PHP - substr() only if certain character exists?

I'm using the YouTube Data API v3 to grab video titles and IDs to embed videos on a website. I'm just currently having a problem displaying the title in the way that I want it. Some of the video titles have text in brackets at the end, which I don't want to display on the website. I am currently using:
$videoTitle = substr($videoTitle, 0, strpos($videoTitle, '('));
The problem is that the titles that don't include brackets aren't being displayed. I'm not that experienced with PHP so I'm not sure of a way around this.
Any help will be appreciated. Thanks,
Oli.

First check whether or not the string contains the character, then modify it if it does. Otherwise leave it alone.
You can use strpos to check for the existence of the character, since it returns false if it does not exist in the string.
$videoTitle = strpos($videoTitle, '(') === false ? $videoTitle : substr($videoTitle, 0, strpos($videoTitle, '('));
or
if (strpos($videoTitle, '(') !== false)
    $videoTitle = substr($videoTitle, 0, strpos($videoTitle, '('));
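The check-then-cut logic above can be wrapped in a small helper. This is only a minimal sketch: the function name stripBracketSuffix and the sample titles are made up for illustration, and the rtrim() call assumes the space left in front of the bracket is unwanted.

```php
<?php
// Hypothetical helper: drop everything from the first '(' onward, if there is one.
function stripBracketSuffix(string $title): string
{
    $pos = strpos($title, '(');
    if ($pos === false) {
        return $title; // no bracket: leave the title untouched
    }
    // Keep the part before the bracket and trim the trailing space (assumption).
    return rtrim(substr($title, 0, $pos));
}

echo stripBracketSuffix('My Song (Official Video)'), "\n";
echo stripBracketSuffix('My Song'), "\n";
```

Calling strpos() once and storing the result avoids searching the string twice.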

If you split the string at the (, you can use just the first part as the video title, like:
$splitString = explode('(', $videoTitle);
$videoTitle = $splitString[0];
But video titles vary a lot, so no single method will reliably remove the unwanted part.

You can use something like the code below. If you share a link, that would be more helpful.
<?php
$urlarray = explode("(",$videoTitle);
$videoTitle = $urlarray[0];
?>

Related

Create array of specific substrings

We use a custom CMS, built with PHP and MySQL.
I have a customer who embeds youtube videos in the content of the site. That is one string, that he can edit with CKeditor. That all works just fine.
He now wants to have those videos displayed on a different location within the same page.
I do not want to create a separate input field in the system just for this, for multiple reasons.
The solution I need is this:
I want to extract the (multiple) < iframe >youtube blah blah< /iframe > from the content string and create an array of iframe strings. Then I can display them elsewhere on the page.
For not displaying videos in the original content location I can use preg_replace to strip the iframes out of the content string.
I however have no idea how to fetch those substrings and form that new array in PHP.
Hope you have an idea and that my explanation is clear.
EDIT after getting the answer from Michel
The complete code I am using now:
$string = '<iframe>youtube iframe</iframe>Some cool text in between blahblah<iframe>moreyoutube</iframe>';
//catch the iframes
$iframe=array();
$parts=explode('<iframe',$string);
if (count($parts) > 1){ //make sure a string without iframes does not end up in the array
    foreach($parts as $p){
        if( strpos($p,'youtube') !== false ){
            $v=explode('</iframe>',$p);
            $iframe[]= '<iframe'.$v[0].'</iframe>';
        }
    }
}
//strip out iframes
$string = preg_replace('/<iframe(.*?)<\/iframe>/', '', $string);
This will give you a string without iframes, and an array of iframes to display separately.
Thanks to Michel for the answer.

One way of doing it:
Explode the content string on <iframe.
Loop over the resulting array and use strpos to look for the word youtube (to rule out other iframes on the page).
If you find any, put <iframe back in front and </iframe> at the end, and add the result to the array.
$string='<div>blabla</div><iframe src="youtube.org.com.uk.sk"></iframe><div>blahblah</div>';
$iframe=array();
$parts=explode('<iframe',$string);
foreach($parts as $p){
    if( strpos($p,'youtube') !== false ){
        $v=explode('</iframe>',$p);
        $iframe[]= '<iframe'.$v[0].'</iframe>';
    }
}
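The same idea can also be sketched with preg_match_all, which collects each complete iframe in one pass. The function name splitOutYoutubeIframes and the sample markup are hypothetical, and the sketch assumes iframes are not nested and always have a closing tag.

```php
<?php
// Hypothetical helper: collect YouTube iframes and return the content without them.
function splitOutYoutubeIframes(string $html): array
{
    // Non-greedy match of each complete iframe element; "s" lets . span newlines.
    preg_match_all('#<iframe\b.*?</iframe>#is', $html, $m);

    $iframes = [];
    foreach ($m[0] as $tag) {
        // Only keep iframes that mention youtube; other iframes stay in the content.
        if (stripos($tag, 'youtube') !== false) {
            $iframes[] = $tag;
            $html = str_replace($tag, '', $html);
        }
    }
    return [$iframes, $html];
}

[$iframes, $rest] = splitOutYoutubeIframes(
    '<p>a</p><iframe src="https://www.youtube.com/embed/x"></iframe><p>b</p>'
);
```

Here $iframes holds the extracted iframe strings and $rest is the content with those iframes removed.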

PHP website data mining Preg_Match Undefined Offset

I'm working on a PHP project for school. The task is to build a website to grab and analyze data from another website. I have the framework set up, and I am able to grab certain data from the desired site, but I can't seem to get the syntax right for other data that I need to obtain.
For example, the site that I am currently analyzing is a page for a specific item returned from a search of Amazon.com (e.g. search amazon.com for "iPad" and pick the first result). I am able to grab the title of the product's page, but I need to grab the review count and the price, and therein lies the issue. I'm using preg_match to get the title (works fine), but I'm not able to get the reviews or the price. I continue to get the Undefined Offset error, which I've discovered means that nothing is being returned that matches the given criterion. Simply checking to see whether something has been returned will not help me, since I need to obtain these data for my analysis. The spans that I'm trying to mine are unique on the page, so there is only one instance of each.
The Page Source for my product page contains the following snippets of HTML that I need to grab. (The website can, and needs to be able to handle, anything, but for this example, I searched "iPad").
<span id="priceblock_ourprice" class="a-size-medium a-color-price">$397.74</span>
I need the 397.74.
<span id="acrCustomerReviewText" class="a-size-base">1,752 customer reviews</span>
I need the 1,752.
I've tried all combinations of escape characters, wildcards, etc., but I can't seem to get beyond the Undefined Offset error. An example of my code is as follows where $link is the URL, and $f is an empty array in which I want to store the result (Note: There is NOT a space after the '<' in "< span..." It just erased everything up to the "...(.*)..." when I typed it as "< span..." without the space):
preg_match("#\< span id\=\"priceblock\_ourprice\" class\=\"a\-size\-medium a\-color\-price\"\>(.*)\<\/span\>#", file_get_contents($link), $f);
$price=$f[1]; //Offset error occurs on this line
echo $price;
Please help. I've been beating my head against this for the past two days now. I'm hoping I'm just doing something stupid. This is my first experience with preg_match and data mining. Thank you very much in advance for your time and assistance.

Code
As stated by #cabellicar123, you shouldn't use regex with html.
I believe what you are looking for is strpos() and substr(). It should look something like this:
function get_content($string, $begintag, $endtag) {
    if (strpos($string, $begintag) !== False) {
        $location = strpos($string, $begintag) + strlen($begintag);
        $leftover = substr($string, $location);
        $contents = substr($leftover, 0, strpos($leftover, $endtag));
        return $contents;
    }
}
// Usage (Change the variables):
$str = file_get_contents('http://www.amazon.com/OLB3-Official-League-Recreational-Ball/dp/B004KOBRMC/');
$beg = '<b class="priceLarge">$';
$end = '</b>';
echo get_content($str, $beg, $end);
I've provided a working example which would return the price of the object on the page, in this case, the price of a Rawlings baseball.
Explanation
I'll go through the code, line by line, and explain every piece.
function get_content($string, $begintag, $endtag)
$string is the string being searched through (in this case an Amazon page), $begintag is the opening tag of the element being searched for, and $endtag is the closing tag of that element. NOTE: This will only use the first instance of the opening tag; any later instances will be ignored.
if (strpos($string, $begintag) !== False)
Checks if the beginning tag actually exists. Note the !== False; that's because strpos can return 0, which evaluates to False.
$location = strpos($string, $begintag) + strlen($begintag);
strpos() returns the position where the first instance of $begintag starts in $string, so the length of $begintag must be added to it to get the position where $begintag ends.
$leftover = substr($string, $location);
Now that we have the $location of the opening tag, we need to narrow the $string down by setting $leftover to the part of the $string after $location.
$contents = substr($leftover, 0, strpos($leftover, $endtag));
This gets the position of the $endtag in $leftover, and stores everything before that $endtag in $contents.
As for the last few lines of code, they are specific to this example and just need to be changed to fit the circumstances.
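Since the answer advises against regex for HTML, another option is PHP's DOM extension, which can look an element up by its id. The following is a sketch under assumptions: the helper name textById is made up, the ext-dom extension must be enabled, the id is assumed to contain no double quotes, and the sample markup is the two spans quoted in the question rather than a live Amazon page.

```php
<?php
// Hypothetical helper: return the trimmed text of the element with the given id, or null.
function textById(string $html, string $id): ?string
{
    // Real-world pages are rarely valid HTML, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Assumes $id contains no double quotes.
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//*[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

$html = '<span id="priceblock_ourprice" class="a-size-medium a-color-price">$397.74</span>'
      . '<span id="acrCustomerReviewText" class="a-size-base">1,752 customer reviews</span>';

// Strip the currency sign and the thousands separator to get plain numbers.
$price   = ltrim((string) textById($html, 'priceblock_ourprice'), '$');
$reviews = (int) str_replace(',', '', (string) textById($html, 'acrCustomerReviewText'));
```

Because the helper returns null when the id is missing, the caller can tell "not found" apart from an empty value instead of hitting an undefined offset.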

extracting facebook photo id from a LONG url

I have searched this website for how to extract a Facebook photo ID from a URL that starts with photo.php?fbid=, but I have a long URL and don't know how to get the photo ID from it.
Example1 : photo.php?fbid=10151987845617397 (the complete url is stored in the $url variable which is checked using preg_match i believe)
!preg_match("|^http(s)?://(www.)?facebook.com/photo.php(.*)?$|i", $url) || !$pid
the above code fetches facebook id 10151987845617397 and puts it in the variable $pid.
If I have a long url, how can i change the code?
Here is the url
Example2 : https://www.facebook.com/nokia/photos/a.338008237396.161268.36922302396/10151987845617397/?type=1&theater
In the above url 10151987845617397 is the photo id that i need to capture and put it in variable $pid.
what changes do i need to do in the preg_match string?
In other words to get the photoid 10151987845617397 as output in the $pid variable:
For url facebookcom/photo.php?fbid=10151987845617397
The syntax is !preg_match("|^http(s)?://(www.)?facebook.com/photo.php(.*)?$|i", $url) || !$pid
So for url facebookcom/nokia/photos/a.338008237396.161268.36922302396/10151987845617397/?type=1&theater
What would be the syntax
Please help
Thanks

The simple solution and quite readable: Use the entire string as a regex, use () around what you want to match:
// $tmp[1] = "www." or empty
// $tmp[2] = "user" (i.e. nokia)
// $tmp[3] = "photos"
// $tmp[4] = album id (a.338008237396...)
// $tmp[5] = photo id from the long url, as requested
function extract_id_from_album_url($url) {
    preg_match('/https?:\/\/(www\.)?facebook\.com\/([a-zA-Z0-9_\- ]*)\/([a-zA-Z0-9_\- ]*)\/([a-zA-Z0-9_\.\-]*)\/([a-zA-Z0-9_\-]*)(\/\?type=1&theater\/)?/i', $url, $tmp);
    return isset($tmp[5]) ? $tmp[5] : false;
}
Backslashes are needed to ensure that . and / are treated as literal characters (and not as regex syntax or the pattern delimiter). Question marks make the preceding part of the URL optional. Using more regex syntax can make the matching "query" much shorter and more extendable, but also makes it harder to read.
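For comparison, here is a sketch that avoids one long pattern by letting parse_url split the URL first. The function name facebookPhotoId is hypothetical, and the sketch assumes the photo id is either the fbid query parameter or the last all-digit path segment, as in the two example URLs from the question.

```php
<?php
// Hypothetical helper covering both URL shapes from the question.
function facebookPhotoId(string $url): ?string
{
    // Short form: photo.php?fbid=<id>
    parse_str((string) parse_url($url, PHP_URL_QUERY), $query);
    if (isset($query['fbid']) && is_string($query['fbid']) && ctype_digit($query['fbid'])) {
        return $query['fbid'];
    }

    // Long form: .../photos/<album>/<id>/ ; take the last all-digit path segment.
    $path = (string) parse_url($url, PHP_URL_PATH);
    $segments = array_filter(explode('/', $path), 'ctype_digit');
    return $segments ? end($segments) : null;
}

$pid = facebookPhotoId(
    'https://www.facebook.com/nokia/photos/a.338008237396.161268.36922302396/10151987845617397/?type=1&theater'
);
```

The id is returned as a string on purpose, since values this large can overflow integers on 32-bit builds.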

Remove certain part of string in PHP [duplicate]

I've already seen a bunch of questions on this exact subject, but none seem to solve my problem. I want to create a function that will remove everything from a website address, except for the domain name.
For example if the user inputs: http://www.stackoverflow.com/blahblahblah I want to get stackoverflow, and the same way if the user inputs facebook.com/user/bacon I want to get facebook.
Does anyone know of a function or a way where I can remove certain parts of strings? Maybe it'll search for http, and when found it'll remove everything until after the //. Then it'll search for www, and if found it'll remove everything until the first dot. Then it keeps everything until the next dot, where it removes everything behind it? Looking at it now, this might cause problems with sites such as http://www.en.wikipedia.org because I'll be left with only en.
Any ideas (preferably in PHP, but JavaScript is also welcome)?
EDIT 1:
Thanks to great feedback I think I've been able to work out a function that does what I want:
function getdomain($url) {
    $parts = parse_url($url);
    // Prepend a scheme only when none is present, so https:// URLs are left intact
    if(!isset($parts['scheme'])) {
        $url = 'http://'.$url;
    }
    $parts2 = parse_url($url);
    $host = $parts2['host'];
    $remove = explode('.', $host);
    $result = $remove[0];
    if($result == 'www') {
        $result = $remove[1];
    }
    return $result;
}
It's not perfect, at least considering subdomains, but I think it's possible to do something about it. Maybe add a second if statement at the end to check the length of the array. If it's bigger than two, then choose item nr1 instead of item nr0. This obviously gives me trouble with any domain using .co.uk (because that'll be three items long, but I don't want to return co). I'll try to work on it a little bit, and see what I come up with. I'd be glad if some of you PHP gurus out there could take a look as well. I'm not as skilled or as experienced as any of you... :P

Use parse_url to split the URL into the different parts. What you need is the hostname. Then you will want to split it by the dot and get the first part:
$url = 'http://facebook.com/blahblah';
$parts = parse_url($url);
$host = $parts['host']; // facebook.com
$foo = explode('.', $host);
$result = $foo[0]; // facebook
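A slightly more defensive sketch of the same approach follows. The function name firstHostLabel is made up; the sketch adds a scheme when the input has none (parse_url only reports a host when a scheme or leading // is present) and strips a leading www. It does not solve the subdomain or .co.uk cases raised in the question.

```php
<?php
// Hypothetical helper: first host label after an optional "www." prefix.
function firstHostLabel(string $url): ?string
{
    // Add a scheme if the input has none, so parse_url() can find the host.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // not parseable as a URL with a host
    }
    $host = preg_replace('/^www\./i', '', $host);
    return explode('.', $host)[0];
}

echo firstHostLabel('http://www.stackoverflow.com/blahblahblah'), "\n";
echo firstHostLabel('facebook.com/user/bacon'), "\n";
```

For http://www.en.wikipedia.org this still yields en, which is the limitation the question already points out.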

You can use the parse_url function from PHP which returns exactly what you want - see

Use parse_url in PHP to get the host (e.g. domain.com), then replace .com with an empty string.
I am a little rusty on my regular expressions but this should work.
$url='http://www.en.wikipedia.org';
$domain = parse_url($url, PHP_URL_HOST); //Will return www.en.wikipedia.org
$domain = preg_replace('/\.com|\.org/', '', $domain); //Pattern needs delimiters
http://php.net/manual/en/function.parse-url.php
PHP REGEX: Get domain from URL
http://rubular.com/r/MvyPO9ijnQ //Check regular expressions

You're looking for info on regular expressions. It's a bit complicated, so be prepared to read up. In your case, you'll best utilize preg_match and preg_replace: preg_match searches for a match based on your pattern, and preg_replace returns a copy of the string with the matches replaced by your replacement.
preg_match
preg_replace
I'd start with a pattern like this: find .com, .net or .org and delete it and everything after it. Then find the last . and delete it and everything in front of it. Finally, if // exists, delete it and everything in front of it.
if (preg_match("/^http:\/\//i",$url))
    $url = preg_replace("/^http:\/\//i","",$url);
if (preg_match("/www\./i",$url))
    $url = preg_replace("/www\./i","",$url);
if (preg_match("/\.com/i",$url))
    $url = preg_replace("/\.com/i","",$url);
if (preg_match("/\/*$/",$url))
    $url = preg_replace("/\/*$/","",$url);
^ = at the start of the string
i = case insensitive
\ = escape char
$ = the end of the string
This will have to be played around with and tweaked, but it should get you pointed in the right direction.

Javascript:
document.domain.replace(".com","")
PHP:
$url = 'http://google.com/something/something';
$parse = parse_url($url);
echo str_replace(".com","", $parse['host']); //prints google

This is quite a quick method but should do what you want in PHP:
function getDomain( $URL ) {
return explode('.',$URL)[1];
}
I will update it when I get a chance, but basically it splits the URL into pieces at each full stop and then returns the second item, which is the domain when the URL starts with www. A bit more logic would be required for longer domains such as www.abc.xyz.com, or for URLs without the www prefix, but for normal URLs it would suffice.

take facebook page url and store id and slug separately

I'm developing a web app where users enter their facebook page url either in this format:
http://www.facebook.com/pages/Graffiti/119622954518
or
http://www.facebook.com/thefirkinandfox
With php - how do I detect which format automatically, then split (explode?) the parts (the slug and the id or just the slug if the second version).
There is sometimes query data at the end of the url when viewing your own facebook page as an administrator, how do I detect and remove that? I think the answer will be regex of some kind - but I've really only used this to make sure an input is email and still didn't understand it that well... thanks in advance.
Possible entries may or may not include http:// at the beginning... I'd like to account for this...

If you want to use one regexp, try this:
$url = 'www.facebook.com/pages/Graffiti/119622954518';
if(preg_match('#^(https?://)?(www\.)?facebook\.com/((pages/([^/]+)/(\d+))|([^/]+))#', $url, $matches)) {
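    // Unmatched groups before the last matched one are returned as empty strings, so test with !empty()
    $slug = !empty($matches[5]) ? $matches[5] : (!empty($matches[7]) ? $matches[7] : null);
    $id = !empty($matches[6]) ? $matches[6] : null;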
}

Two parts:
^http://www\.facebook\.com/pages/([^/]+)/([^/?]+)(?:\?.*)?$
If the first one doesn't match, use this:
^http://www\.facebook\.com/([^/?]+)(?:\?.*)?$
The "explosion" you mention is simply the values of the capturing groups.
So the code might look something like this:
$subject = "my string";
if (preg_match('#^http://www\.facebook\.com/pages/([^/]+)/([^/?]+)(?:\?.*)?$#', $subject, $groups))
    print($groups[1] . ' ' . $groups[2]);
elseif (preg_match('#^http://www\.facebook\.com/([^/?]+)(?:\?.*)?$#', $subject, $groups))
    print($groups[1]);
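Putting the pieces of this thread together, the two formats, the optional http:// and www. prefixes, and a trailing query string can be handled in one function. This is only a sketch: the name parseFacebookPageUrl and the returned array shape are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns ['slug' => ..., 'id' => ...], or null if the URL does not match.
function parseFacebookPageUrl(string $url): ?array
{
    // Group 1/2: pages/<slug>/<id>   Group 3: plain <slug>
    // [^/?]+ stops at a slash or at the start of a query string.
    $pattern = '#^(?:https?://)?(?:www\.)?facebook\.com/(?:pages/([^/?]+)/(\d+)|([^/?]+))#i';
    if (!preg_match($pattern, $url, $m)) {
        return null;
    }
    if (!empty($m[3])) {
        return ['slug' => $m[3], 'id' => null]; // second format: slug only
    }
    return ['slug' => $m[1], 'id' => $m[2]];     // first format: slug and id
}

$page = parseFacebookPageUrl('http://www.facebook.com/pages/Graffiti/119622954518');
```

The id is kept as a string rather than cast to an integer, so long ids survive on 32-bit builds.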
