How to find an image and get path in twig generated content - php

I am new to PHP, Symfony, and Twig. I need to set the og:image meta property dynamically from the post content when it contains an image. My Twig code looks like this:
<div class="content">
{{ d.description|raw }}
</div>
I can find an image URL in the div content, if there is one, and set it as the og:image meta property with JavaScript, but Facebook does not parse JavaScript. Is there a way to do this on the server side?

Facebook indeed reads only the source code of the page, so you won't be able to JavaScript your way out of it; you have to rely on the server side.
You'd have to process the content in the controller with a DOM parser (PHP's own DOMDocument class works: https://www.php.net/manual/en/class.domdocument.php) and search for an img node, or note its absence in case you need to provide a default value.
Once you have the URL from the image's src (user-provided or default), pass it as a variable to the view and echo it inside the appropriate meta tag.
Do not forget to print the full URL of the image, including the protocol and domain, as Facebook now requires it.
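A minimal sketch of the approach described above, assuming the description is an HTML fragment. The function name findOgImage, the base URL, and the default image are hypothetical, and relative paths are simply treated as relative to the site root, which may not match your setup.

```php
<?php
// Hypothetical helper: returns an absolute URL for the first <img> in the
// given HTML fragment, or $default when the fragment has no usable image.
function findOgImage(string $html, string $baseUrl, string $default): string
{
    if (trim($html) === '') {
        return $default; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $img = $doc->getElementsByTagName('img')->item(0);
    $src = $img !== null ? trim($img->getAttribute('src')) : '';
    if ($src === '') {
        return $default;
    }

    if (preg_match('#^https?://#i', $src)) {
        return $src; // already absolute
    }
    if (strpos($src, '//') === 0) {
        // protocol-relative URL: reuse the scheme of the base URL
        return parse_url($baseUrl, PHP_URL_SCHEME) . ':' . $src;
    }

    // Assumption: every other path is relative to the site root.
    return rtrim($baseUrl, '/') . '/' . ltrim($src, '/');
}
```

In the controller you could pass the result to the view under a name of your choice (og_image here is only an example) and print it in the template with <meta property="og:image" content="{{ og_image }}">.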

Related

Browsers are changing HTML markup. How to force all browsers to use native markup?

I have a problem and hope someone can help me.
I use an iframe with src="http://my.own.domain/some/path/file-1".
This URL sends me the content of "http://some-site.com/path1/path2/path3/qwerty.html".
Before sending the content, I pre-process links and resources.
For example, if the CSS link is <link rel="/css/style1">, I add the protocol and host to it, producing something like <link rel="http://some-site.com/css/style1">.
After that, I click on a page element and read the current node's information with JS (the name and attributes of the current node, then the name and attributes of each parent node, going up until I reach the html tag).
I send this data to a PHP script using Ajax.
In PHP I convert it to an XPath selector, and I see that my selector is incorrect:
//html
/body[0]
/div[@id='wrap']
/div[@id='main']
/table[contains(@class, 'content-wrapper')][1]
/tbody[1]
/tr[1]
/td[contains(@class, 'content-wrap')][1]
/div[contains(@class, 'content')][1]
/div[contains(@class, 'node')][1]
/div[contains(@class, 'techs')][1]
/table[1]
/tbody[1]
/tr[4]
/td[contains(@class, 'techs-right')][1]
But the native markup of that page is:
//html
/body[0]
/div[@id='wrap']
/div[@id='main']
/table[contains(@class, 'content-wrapper')][1]
/*/tbody[1] - without this*/
/tr[1]
/td[contains(@class, 'content-wrap')][1]
/div[contains(@class, 'content')][1]
/div[contains(@class, 'node')][1]
/div[contains(@class, 'techs')][1]
/table[1]
/*/tbody[1] - without this*/
/tr[4]
/td[contains(@class, 'techs-right')][1]
It seems the browser is modifying incorrect markup to make it correct, but this is a problem for me. How can I turn this off?
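Browsers add the missing tbody element while parsing a table, and as far as I know this cannot be switched off from the page. One possible workaround, sketched below, is to drop the tbody steps from the selector on the PHP side before using it. The function name stripImplicitTbody is hypothetical, and the sketch assumes the raw markup contains no tbody elements of its own.

```php
<?php
// Hypothetical helper: removes every "/tbody" or "/tbody[n]" step from an
// XPath built from the browser's DOM, so that it matches raw markup in
// which the tables have no explicit <tbody>.
function stripImplicitTbody(string $xpath): string
{
    // Match a tbody step only when it is a complete step, that is,
    // followed by another "/" or by the end of the expression.
    return preg_replace('#/tbody(?:\[\d+\])?(?=/|$)#', '', $xpath);
}
```

If some tables in the source really do contain a tbody, this would break their selectors, so it would need to be applied more selectively.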

I want to get data from another website and display it on mine, but with my style.css

So my school has a very annoying way to view my timetable.
You have to click through 5 links to get to it.
This is the link for my class (it updates weekly without changing the link):
https://webuntis.a12.nl/WebUntis/?school=roc%20a12#Timetable?type=1&departmentId=0&id=2147
I want to display the content from that page on my website, but with my own stylesheet.
I don't mean this:
<?php
$homepage = file_get_contents('http://www.example.com/');
echo $homepage;
?>
or an iframe....
I think this can be done better with jQuery and Ajax. You can get jQuery to load the target page, use selectors to strip out what you need, then attach it to your document tree. You should then be able to style it any way you like.
I would recommend using the cURL library: http://www.php.net/manual/en/curl.examples.php
But you will have to extract the part of the page you want to display, because you will get the whole HTML document.
You'd probably read the whole page into a string variable (for example with file_get_contents, as you mentioned) and parse the content. Here you have some possibilities:
Regular expressions
Walking the DOM tree (e.g. using PHP's DOMDocument classes)
After that, you'd most likely replace all the style="..." or class="..." information with your own.
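The DOM-walking option could look roughly like the sketch below. The function name extractFragment and the element id are hypothetical. It assumes the part you want has an id attribute and that the id contains no quote characters.

```php
<?php
// Hypothetical helper: pulls one element (looked up by id) out of a full
// HTML document and strips all style and class attributes from it, so
// your own stylesheet can take over. Returns null if the id is not found.
function extractFragment(string $html, string $id): ?string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // real-world pages rarely validate
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $node = $xpath->query("//*[@id='$id']")->item(0);
    if ($node === null) {
        return null;
    }

    // Select style/class attributes on the element itself and on all of
    // its descendants, then remove each one from its owner element.
    foreach ($xpath->query('.//@style | .//@class', $node) as $attr) {
        $attr->ownerElement->removeAttributeNode($attr);
    }

    return $doc->saveHTML($node);
}
```

You would then echo the returned markup inside your own page and add whatever classes your stylesheet expects.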

Dynamically create elements from unrendered code using jquery and/or php

I want to store html that isn't to be rendered until needed either within a tag that can hold raw html code without rendering it on page load or store it within a php or jquery variable for later use. I then want to be able to insert the html into the DOM on button click and have it render.
I've tried storing it within an xmp tag as that can store html code with the < and > characters without using character codes for them, but when trying to insert it into the DOM, the updated source shows it had been copied but it wouldn't render on screen. Also tried storing it within a code tag, which worked on a desktop browser but not in mobile safari. Since this is a webapp mobile browser compatibility is important.
Anyone know of a good method of doing this?
Try <script> tags with a type of text/plain or text/html:
<script type="text/plain" id="example">
<div class="example">
<h2>Hello</h2>
<p>World</p>
</div>
</script>
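// On button click, read the stored markup and insert it into the page
$(".button").click(function () {
var html = $("#example").text(); // raw, unrendered HTML kept inside the script tag
$("#destination").html(html); // parsed and rendered here
});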
It depends on where you want to generate the content in question. If it's easier for your setup to generate it on the server side, you can use CSS to hide those parts (e.g. display:none) and then either remove the CSS property, or grab the nodes with JavaScript and put them elsewhere with something like this:
$('.target').html($('.hidden_node').html());
If you want to generate the content on the js side, you can build it as a long string and just shove it into the target, or you can use jquery's node generation syntax like:
$('<div />').attr({
'class': 'test' // quoted because class is a reserved word in older browsers
}).appendTo("body");
Or you can use one of the various javascript templating solutions like mustache or handlebars.

Return and embed an image with pure JavaScript, like PHP?

I'm having trouble figuring out if it's possible to embed an html or js document as an image, like so:
<img src="http://blah.com/image.js" />
or
<img src="http://blah.com/image.html" />
The general idea is that when the browser tries to access the file, it would execute the file client-side, get the actual image, and then embed it as usual. I realize this can be done easily with PHP, but I'm looking for a non-server solution.
The problems are that the content type it is transferred with is wrong and, more importantly, I think this violates every cross-domain and sandbox rule, and I don't think there's any way around that.
As long as the document you are linking to can display the binary data this will work.
Follow this article to solve the binary load with javascript, http://emilsblog.lerch.org/2009/07/javascript-hacks-using-xhr-to-load.html
Then you can also include base64 data in img tags like this
<img src="" />
The src attribute must point to a URI that eventually results in actual image data. Perhaps you should consider leaving it blank and then creating a script that generates a data: URI and replaces it into the attribute.

How to extract all image URLs from an HTML source and download them using curl?

I am using curl to get the images from the HTML source code of an external webpage. With View Page Source in Firefox I see img original='imageurl'. But when I select particular images, View Selection Source in Firefox shows img src='imageurl'.
How can I get this type of image using curl?
Currently I am using regex to get the image:
preg_match_all('/<img[^>]+>/i',$output, $result);
print_r($result);
But it doesn't display any image.
I am very confused about what to do here. Anyone have any thoughts?
"I am very confused about what to do here."
The confusion probably comes from using your web browser to view the source of a URL. The source displayed by the browser is often the same data that curl would return, but not always.
In particular, Firefox's View Selection Source feature will not display that selection from the original resource, but often something else. To prevent that, you need to disable JavaScript in Firefox. Documents are often modified by JavaScript, and you want to see the original rather than the modification, because curl cannot run JavaScript; it can only get "the original".
"Anyone have any thoughts?"
Disable javascript in your browser.
Reload the page.
Locate the fragment of the HTML-source-code you're interested in.
Write it down, e.g. into a string.
Request the page with CURL. Output the source.
Locate that string in there. If it's not in there, search the curl result for the string you're interested in and use that instead.
Write a regular expression that is able to obtain what you need from that string.
Use that regular expression in your program then.
Your web browser is reformatting the HTML according to how it parses the page.
When you choose "View Page Source", it shows you the original source code served by the server.
When you select content and choose "View Selection Source", it shows what the browser has parsed into the DOM (what the browser understands) for the selected content.
I am guessing you're using Firefox.
If you are using cURL to process the HTML served by the server, do not look at "View Selection Source"; always refer to "View Page Source".
Ultimately
You should rather refer to the ACTUAL result from cURL.
For example:
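$ch = curl_init($url); // $url: the page you are fetching (placeholder)
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
$content = curl_exec($ch);
header("Content-type: text/plain"); // show the raw source instead of rendering it
echo $content;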
That should echo exactly what cURL received from the server.
NOTE: This is a re-post of https://stackoverflow.com/questions/8754844/can-not-get-images-using-curl
Furthermore
If you want to fetch the actual image behind an <img src=""> tag, you need to pinpoint the IMG tag in the HTML response using preg_match, and then make a separate cURL request to the IMG SRC.
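As a rough sketch of that last step, the example below swaps the regular expression for DOMDocument, which tends to cope better with attribute order and quoting. The function names extractImageUrls and fetchUrl are hypothetical. Reading the original attribute before src is an assumption based on the markup described in the question, where the real URL appears to sit in original until a script copies it into src.

```php
<?php
// Hypothetical helper: collects image URLs from raw HTML as served, with
// no JavaScript run. Prefers the "original" attribute, then "src".
function extractImageUrls(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate invalid markup
    $doc->loadHTML($html);
    libxml_clear_errors();

    $urls = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        foreach (['original', 'src'] as $attr) {
            $value = trim($img->getAttribute($attr));
            if ($value !== '') {
                $urls[] = $value;
                break; // first non-empty attribute wins
            }
        }
    }
    return array_values(array_unique($urls));
}

// Hypothetical helper: fetches a URL with cURL and returns the body,
// or an empty string on failure. Requires the curl extension.
function fetchUrl(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $body = curl_exec($ch);
    return $body === false ? '' : $body;
}
```

A download loop would call fetchUrl() on the page, pass the result to extractImageUrls(), and then call fetchUrl() again for each returned URL, saving each body with file_put_contents(). Relative URLs would first have to be resolved against the page's address.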
