I am using the code below to set the browser page title on a Joomla website. The only issue occurs when there is an apostrophe in the title.
$browserpagetitle= 'My site - '.$this->item->title;
$document = JFactory::getDocument();
$document->setTitle($browserpagetitle);
If the item title is Apple's, it shows: My site - Apple's (with the apostrophe rendered as an HTML entity rather than as a plain character).
I have tried:
$browserpagetitle= 'My site - '.$this->item->title;
$document = JFactory::getDocument();
echo html_entity_decode($document->setTitle($browserpagetitle), ENT_QUOTES);
as suggested here, but no luck.
Add
<meta charset="utf-8" />
immediately after your opening <head> tag:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>*page title here*</title>
</head>
Please try the code below:
$browserpagetitle= 'My site - '.$this->item->title;
$document = JFactory::getDocument();
$document->setTitle(htmlspecialchars_decode($browserpagetitle,ENT_QUOTES));
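As a rough illustration of what the decode step does, here is a minimal sketch in plain PHP, outside Joomla. It assumes the stored title carries its apostrophe as the entity `&#039;`; the question does not show the stored value, so `$storedTitle` is hypothetical.

```php
<?php
// Assumption: the item title is stored with the apostrophe encoded as &#039;.
$storedTitle = 'Apple&#039;s';

// ENT_QUOTES makes the decoder convert single-quote entities as well as
// double-quote ones, so the entity becomes a literal apostrophe.
$browserPageTitle = 'My site - ' . htmlspecialchars_decode($storedTitle, ENT_QUOTES);

echo $browserPageTitle, PHP_EOL;
```

The decoded string is what would then be passed to `$document->setTitle()`.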
I have to get data from an .html file on a remote server. With file_get_contents I can retrieve the content, but I have problems when I want to test it.
For example, my .html page can contain 0 or 1. In my .php file I want to do something depending on whether file_get_contents returns 0 or 1, but so far I haven't found how to do it.
Here is my .html:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>test</title>
</head>
<body>
0
</body>
</html>
My PHP code :
$home = file_get_contents('http://192.168.1.XXX/wordpress/read.html');
You can use this PHP library: https://github.com/sunra/php-simple-html-dom-parser
$dom =HtmlDomParser::file_get_html('http://192.168.1.XXX/wordpress/read.html');
$bodyText=$dom->find("body",0)->innertext;
An alternative solution is:
$home = file_get_contents('http://192.168.1.XXX/wordpress/read.html');
$dom = new DOMDocument();
$dom->loadHTML($home);
// Take the first <body> element, if there is one, and read its trimmed text
$body = $dom->getElementsByTagName("body");
if ($body->length > 0) {
    $text = trim($body->item(0)->nodeValue);
}
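To actually branch on the 0 or 1, the body text has to be trimmed before comparing, because the markup puts newlines around the digit. The sketch below uses an inline copy of the page instead of the remote request so it runs on its own; `read_body_flag()` is a hypothetical helper name, not part of any library.

```php
<?php
// Stand-in for the string that file_get_contents() would return.
$home = <<<'HTML'
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>test</title>
</head>
<body>
0
</body>
</html>
HTML;

/**
 * Hypothetical helper: returns the trimmed text of <body>, or null if
 * the document has no <body> element.
 */
function read_body_flag(string $html): ?string
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $body = $dom->getElementsByTagName('body')->item(0);
    return $body === null ? null : trim($body->textContent);
}

$flag = read_body_flag($home);

// Compare as strings: the page content is text, not an integer.
if ($flag === '0') {
    echo 'flag is 0', PHP_EOL;
} elseif ($flag === '1') {
    echo 'flag is 1', PHP_EOL;
} else {
    echo 'unexpected content', PHP_EOL;
}
```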
In PHP, I'm currently making an XPath query, but I need to make it case-insensitive.
I'm using XPath 1.0, which, from what I've read, means I have to use the translate() function, but I'm unsure how to do this.
Here is my test PHP file:
$html = <<<'HTML'
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
<meta NAME="Description" content="Test Case">
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<Link Rel="Canonical" href="http://www.testsite.com/" />
<Title>My Title</Title>
</head>
<Body>
Test Case
</Body>
</html>
HTML;
$domDoc = new DOMDocument();
$domDoc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
// Canonical link
$xpath = new DOMXPath($domDoc);
$canonicalTags = $xpath->query('//link[@rel=\'canonical\']'); // Returns nothing
//some use translate(WhatVariable?, 'ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞŸŽŠŒ', 'abcdefghijklmnopqrstuvwxyzàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿžšœ')
var_dump($canonicalTags);
Any help would be greatly appreciated. Thanks.
Basically, translate() is used to convert the dynamic value you need to compare to all lower-case (or all upper-case). In this case, you want to apply translate() to the rel attribute value and compare the result to the lower-case literal "canonical" (formatted for readability):
//link[
translate(@rel, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') = 'canonical'
]
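Plugged into PHP, the expression could look like the sketch below. It uses a shortened version of the question's markup; the variable names `$upper`, `$lower`, and `$query` are only illustrative.

```php
<?php
$html = '<html><head>'
    . '<Link Rel="Canonical" href="http://www.testsite.com/" />'
    . '</head><body>Test Case</body></html>';

$domDoc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$domDoc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($domDoc);

// translate() maps each upper-case letter to its lower-case counterpart,
// so "Canonical" is compared as "canonical".
$upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$lower = 'abcdefghijklmnopqrstuvwxyz';
$query = "//link[translate(@rel, '$upper', '$lower') = 'canonical']";

$canonicalTags = $xpath->query($query);
foreach ($canonicalTags as $tag) {
    echo $tag->getAttribute('href'), PHP_EOL;
}
```

Only the attribute value needs translate() here: the HTML parser behind loadHTML() already lower-cases element and attribute names, which is why `//link` and `@rel` match `<Link Rel=...>`.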
I have an iOS app for a public library that shares links to Facebook. The links point to a single domain, which contains a relatively simple PHP script that redirects to three different destination domains based on the linked content (catalog items, calendar events, and user-generated lists). I have it set up like this because I'm using iOS universal links and I don't have control over all of the link destinations, so I need a central location for the apple-app-site-association file.
In this PHP script, I'm attempting to set OG tags dynamically based on the type of content that was shared. Here's the script:
<?php
$shareType = $_GET['t'];
$contentId = $_GET['id'];
$base_catalog_url='XXXXXXXXXXXX';
$base_list_url='XXXXXXXXXXXXX';
$base_event_url='XXXXXXXXXXXXXX';
if($shareType=='0'){
$oclc;
if(strlen($contentId)==8){
$oclc = 'ocm'.$contentId;
}
if(strlen($contentId)==9){
$oclc = 'ocn'.$contentId;
}
$url = $base_catalog_url.'searchCatalog?'.http_build_query(array('clientID' =>'sdIPhoneApp','term1'=>$oclc));
$resp = simplexml_load_file($url);
$pageTitle = $resp->HitlistTitleInfo->title;
$isbn = $resp->HitlistTitleInfo->ISBN;
$imageURL = 'http://www.syndetics.com/index.aspx?isbn='.$isbn.'/lc.gif&client=XXXXXXX';
$redirectURL = 'XXXXXXXXXXXX'.$contentId;
error_log($redirectURL);
echo '<html>
<head>
<meta property="og:image" content="'.$imageURL.'" />
<meta property="og:title" content="'.$pageTitle.'" />
<meta name="twitter:card" content="summary" />
<meta name="twitter:site" content="@acpl" />
<meta name="twitter:title" content="'.$pageTitle.'" />
<meta name="twitter:description" content="Allen County Public Library" />
<meta name="twitter:image" content="'.$imageURL.'" />
<meta http-equiv="refresh" content="0;URL='.$redirectURL.'">
</head>
</html>';
}
if($shareType=='1'){
$url = $base_event_url.http_build_query(array('eventid' =>$contentId));
$response = file_get_contents($url);
$json = json_decode($response);
$event = $json[0];
$imageURL = $event->Image;
$pageTitle = $event->Title;
$description = $event->Description;
if(strlen($imageURL)<5){
$imageURL = 'https://XXXXXXXXX/appIcon200.png';
}
$redirectURL = 'XXXXXXXXXXX'.$contentId;
echo '<html>
<head>
<meta property="og:image" content="'.$imageURL.'" />
<meta property="og:title" content="'.$pageTitle.'" />
<meta property="og:description" content="'.$description.'" />
<meta name="twitter:card" content="summary" />
<meta name="twitter:site" content="@acpl" />
<meta name="twitter:title" content="'.$pageTitle.'" />
<meta name="twitter:description" content="'.$description.'" />
<meta name="twitter:text:description" content="'.$description.'" />
<meta name="twitter:image" content="'.$imageURL.'" />
<meta http-equiv="refresh" content="0;URL='.$redirectURL.'">
</head>
</html>';
}
if($shareType=='2'){
$url = $base_list_url.http_build_query(array('listId' =>$contentId,'userKey'=>0));
$response = file_get_contents($url);
$json = json_decode($response);
$imageURL = $json->coverImageURL;
$pageTitle = $json->listName;
$pageTitle = ucwords(strtolower($pageTitle));
$redirectURL = "XXXXXXXXXXXX";
echo '<html>
<head>
<meta property="og:image" content="'.$imageURL.'" />
<meta property="og:title" content="'.$pageTitle.'" />
<meta name="twitter:card" content="summary" />
<meta name="twitter:site" content="@acpl" />
<meta name="twitter:title" content="'.$pageTitle.'" />
<meta name="twitter:description" content="Allen County Public Library" />
<meta name="twitter:image" content="'.$imageURL.'" />
<meta http-equiv="refresh" content="0;URL='.$redirectURL.'">
</head>
</html>';
}
?>
So, based on the type of content that was shared, I fetch a page title and image to provide in the OG tags. The redirection always works, regardless of whether Facebook pulls in the tags, but the tags are utilized only about half the time. You can see this in the iOS app. Tags pulled in successfully:
Tags not pulled in:
It seems to be random whether the tags are displayed for a given item. In the access logs on my server, when the tags are successfully displayed, I see a line like this:
66.220.158.119 - - [09/Sep/2016:09:54:50 -0400] "GET /share.php?t=1&id=76137 HTTP/1.1" 206 3771 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
However, when the tags are not displayed, there's nothing in the access log or the error log. This suggests that Facebook (or the Facebook component in iOS) is not even attempting to read the tags in these cases. Does this mean Facebook mistakenly thinks it has this data cached?
Another interesting tidbit is what happens when I try to debug one of these failed URLs on the Facebook sharing debugger (https://developers.facebook.com/tools/debug/). I'll get an error message along the lines of:
The 'og:image' property should be explicitly provided, even if a value can be inferred from other tags.
And when I click "See what our scraper sees for your URL." I get the response "The document returned no data".
The interesting thing is that when I click "Scrape again", it usually gives the same error for the first few times, then after 3 or 4 attempts it suddenly works and the tags are displayed. My first thought there is that this has to do with how I'm dynamically fetching the content for the tags, but as I noted above, in the cases where the tags aren't displayed, the access log shows that Facebook isn't even requesting anything from my server.
Thanks for your help; this has me pulling my hair out!
UPDATE: Here's an example URL if you'd like to try it out in the Facebook debugger: https://amshare.acpl.lib.in.us/0_930144011
The number after the underscore is the OCLC number of the book, so you could plug in other values there. As I mentioned, after a few scrapes it usually starts working, then later fails to work again, etc.
It could be possible that Facebook caches the share.php file and ignores the GET variables.
You could try rewriting the URL to a "pretty permalink". Put this in your .htaccess file (if you have Apache):
Options +FollowSymLinks
RewriteEngine On
RewriteRule ^share/(.*)/(.*)$ share.php?t=$1&id=$2 [L,NC]
This turns http://your-url.com/share/4/yeah into http://your-url.com/share.php?t=4&id=yeah
The $_GET var looks like this:
Array ( [t] => 4 [id] => yeah )
With this you could solve the problem (if it REALLY is caching). I have had a lot of issues with the Facebook scraper in the past. Sometimes it ignores GET variables, and it caches like hell...
Try adding some headers to your response to prevent caching.
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
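In a PHP script these headers would be sent with header() before any output. A minimal sketch follows; `no_cache_headers()` is a hypothetical helper, and whether the Facebook scraper honours these headers is not something this snippet can confirm.

```php
<?php
/**
 * Hypothetical helper: the anti-caching headers listed above,
 * as ready-to-send header lines.
 */
function no_cache_headers(): array
{
    return [
        'Cache-Control: no-cache, no-store, must-revalidate',
        'Pragma: no-cache',
        'Expires: 0',
    ];
}

// Headers can only be sent before the first byte of output,
// so this belongs at the top of the script, ahead of any echo.
if (!headers_sent()) {
    foreach (no_cache_headers() as $line) {
        header($line);
    }
}
```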
I'm struggling with this. The idea is to remove all <link> tags that have a specific href attribute from a given string (which comes from a buffer and is regular HTML, but sometimes malformed).
I've tried to use the PHP DOM approach, also the SimpleHTMLDOM parser library, so far nothing works for me (the problem is that DOM approach returns only links inside <body> element, but not those in <head> section of the page), so I decided to use regex.
Here is the non-working PHP DOM approach code:
function remove_css_links($string = "", $css_files = array()) {
$css_files = array("http://www.example.com/css/css.css?ver=2.70","style.css?ver=3.8.1");
$xml = new DOMDocument();
$xml->loadHTML($string);
$link_list = $xml->getElementsByTagName('link');
$link_list_length = $link_list->length;
//The cycle
for ($i = 0; $i < $link_list_length; $i++) {
$attributes = $link_list->item($i)->attributes;
$href = $attributes->getNamedItem('href');
if (in_array($href->value, $css_files)) {
//Remove the HTML node
}
}
$string = $xml->saveHTML();
return $string;
}
Here is the regex code, however I know that all of you do not recommend to use it for parsing of HTML, but let's not discuss this here and now:
$html_text = '
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="shortcut icon" href="http://www.example.com/favicon.ico" />
<link rel="alternate" type="application/rss+xml" title="Website » Feed" href="/feed/" />
<link rel=\'stylesheet\' href=\'http://www.example.com/css/css.css?ver=2.70\' type=\'text/css\' media=\'all\' /></head>
<body>...some content...
<link rel=\'stylesheet\' id=\'css\' href=\'style.css?ver=3.8.1\' type=\'text/css\' media=\'all\' />
</body></html>
';
$url = preg_quote("http://www.example.com/css/css.css?ver=2.70");
$pattern = "~<link([^>]+) href=".$url."/?>~";
$link = preg_replace($pattern, "", $html_text);
The problem with the regex is that the href attribute can be anywhere inside the <link> tag, and the pattern I use can match any type of <link> tag. As you can see, I do not want to remove the shortcut icon or the alternate links, or anything whose href differs from the given URL. Note that the <link> tags use different types of quotes, single and/or double.
However, I'm open to suggestions and if it is possible to make the DOM approach work, rather than use regex - it's OK.
OK, so here you are :
<?php
$html_text = '
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="shortcut icon" href="http://www.example.com/favicon.ico" />
<link rel="alternate" type="application/rss+xml" title="Website » Feed" href="/feed/" />
<link rel="stylesheet" href="http://www.example.com/css/css.css?ver=2.70" type="text/css" media="all" /></head>
<body>...some content...
<link rel="stylesheet" id="css" href="style.css?ver=3.8.1" type="text/css" media="all" />
</body></html>
';
$d = new DOMDocument();
@$d->loadHTML($html_text);
$xpath = new DOMXPath($d);
$result = $xpath->query("//link");
foreach ($result as $link)
{
$href = $link->getAttribute("href");
if ($href=="whatyouwanttofilter")
{
$link->parentNode->removeChild($link);
}
}
$output= $d->saveHTML();
echo $output;
?>
Tested and working. Have fun! :-)
The general idea is :
Load your HTML into a DOMDocument
Look for link nodes, using XPath
Loop through the nodes
Depending on the node's href attribute, delete the node (actually, remove the child from its... parent - well, yep, that's the php way... lol)
After doing all the cleaning-up, re-save the HTML and get it back into a string
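The steps above can be folded back into the asker's remove_css_links() function, taking the list of hrefs as a parameter. This is a sketch under a few assumptions: the parameter name `$cssFiles` and the shortened sample markup are illustrative, and saveHTML() may normalise the markup it returns.

```php
<?php
function remove_css_links(string $html, array $cssFiles): string
{
    $doc = new DOMDocument();
    // Collect warnings about malformed markup instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Copy the matches into an array so removing nodes cannot
    // interfere with the iteration.
    $links = iterator_to_array($xpath->query('//link[@href]'));
    foreach ($links as $link) {
        if (in_array($link->getAttribute('href'), $cssFiles, true)) {
            $link->parentNode->removeChild($link);
        }
    }

    return $doc->saveHTML();
}

$html = '<!DOCTYPE html><html><head>'
    . '<link rel="shortcut icon" href="http://www.example.com/favicon.ico">'
    . "<link rel='stylesheet' href='http://www.example.com/css/css.css?ver=2.70' type='text/css'>"
    . '</head><body><p>content</p>'
    . "<link rel='stylesheet' id='css' href='style.css?ver=3.8.1' type='text/css'>"
    . '</body></html>';

$clean = remove_css_links($html, [
    'http://www.example.com/css/css.css?ver=2.70',
    'style.css?ver=3.8.1',
]);

echo $clean;
```

Because the XPath query runs over the whole document, it reaches <link> elements in <head> as well as in <body>, and the quote style of the attributes does not matter.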
I have this link: http://www.geobytes.com/IpLocator.htm?GetLocation&template=php3.txt&IpAddress=
which returns meta tags such as
<meta name="known" content="true">
<meta name="internet" content="EN">
and others. In a PHP page I tried this:
<?php
$tags = get_meta_tags('http://www.geobytes.com/IpLocator.htm?GetLocation&template=php3.txt&IpAddress=');
print $tags['city']; // city name
?>
It does not work and returns a blank page. Why?
Try using:
print_r(
get_meta_tags("http://www.geobytes.com/IpLocator.htm?GetLocation&template=php3.txt&IpAddress=")
);
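One possible cause of a blank page is that the request failed or that the response has no "city" tag, so it may help to check the result before indexing it. The sketch below runs get_meta_tags() against a local temporary file containing only the two tags quoted in the question, as a stand-in for the remote response; the real response may contain more tags.

```php
<?php
// Stand-in for the remote page: a temporary file holding the two
// meta tags quoted in the question (deliberately without "city").
$file = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents(
    $file,
    '<html><head>'
    . '<meta name="known" content="true">'
    . '<meta name="internet" content="EN">'
    . '</head><body></body></html>'
);

$tags = get_meta_tags($file);
unlink($file);

// get_meta_tags() returns false on failure, so check before indexing.
if ($tags === false) {
    echo 'Could not read the meta tags', PHP_EOL;
} elseif (isset($tags['city'])) {
    echo $tags['city'], PHP_EOL;
} else {
    echo 'No "city" tag; available keys: ', implode(', ', array_keys($tags)), PHP_EOL;
}
```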