I'm in the process of (slowly) learning how to make my websites more secure. I was checking out D&D Beyond and noticed a few things I've never seen before that I would like to learn more about.
Portions of the source code don't show up when you View the Source.
It's hard to explain. I tried to explain it in a different post, and I got a ton of snarky remarks. I'm telling you, I know what I saw. I would like to know how this is possible and how I can replicate it.
I typically write in PHP/jQuery, so I'd primarily like to learn more using those languages.
Example:
You can create a Character using their Character Builder, then view your Character Sheet. The main portion of your character's stats are enclosed in a very large parent div: ".character_sheet"
If you MANUALLY save your Character Sheet to your Desktop, you can see the HTML for this section. If you inspect this section in Firefox, you can also see the data. However, if you try to CTRL+U while in the browser, the HTML in this section does not appear. It also will not appear if you try to curl/fopen/file_get_contents
Additionally, images are not visible by normal means.
For Example: I am aware of how to disable right-clicking on a website, but if someone wanted to take my images, all they'd have to do is open my source code and look at the image url and save it from there.
On the D&D Beyond site, I can bring up Firefox's web inspector where an Image SHOULD be, take a look at the CSS, and... nothing. No link to an image, where one should be. I don't know how they're getting images to appear without css/html. I'd be very interested to know how this is done.
If anyone has any insight/guesses/etc and can point me in the right direction to learn some more, I'd really appreciate it!
Server-side code such as PHP is always hidden to visitors (unless you have a security vulnerability of some sort).
Client-side code such as HTML, JavaScript and CSS is always visible to the visitor. Even if you can't see it immediately in the DOM, it will be hiding there somewhere.
The most likely scenario is that it is hidden within an embedded .js or .css file, which would look similar to the following:
<script src="scripts.js"></script>
<link rel="stylesheet" type="text/css" href="theme.css">
HTML can be written to the page by JavaScript, in which case it will not appear in View Source (which shows only the markup the server sent), although it does appear in the live DOM shown by the browser's inspector. HTML echoed by PHP, by contrast, does show up in View Source. HTML can also be 'hidden' through the use of <iframe> tags and HTML imports.
JavaScript can be obscured or mangled in a wide variety of ways, so it can be hard to track down. You may see some strange, 'unreadable' code in the page source or .js files, which in turn could be outputting the HTML itself.
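As a rough illustration of why a section can be visible in the inspector yet missing from View Source and from curl/file_get_contents, here is a minimal sketch of a page filling an empty container at runtime. The stats object and the helper names are made up for this example (only the .character_sheet class comes from the question), and this is not a claim about how D&D Beyond actually does it.

```javascript
// Hypothetical data; on a real site this would typically arrive via an AJAX request.
const stats = { name: 'Tordek', strength: 16, dexterity: 12 };

// Escape text so that values cannot inject markup of their own.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the markup as a string from the data.
function renderStats(data) {
  const rows = Object.entries(data)
    .map(([key, value]) => `<li>${escapeHtml(key)}: ${escapeHtml(value)}</li>`)
    .join('');
  return `<ul>${rows}</ul>`;
}

// In a browser, inject the markup into the container that the server sent empty.
// View Source still shows the empty container; the inspector shows the filled one.
if (typeof document !== 'undefined') {
  const sheet = document.querySelector('.character_sheet');
  if (sheet) sheet.innerHTML = renderStats(stats);
}
```

Anything fetched this way still shows up in the browser's network tab, so it is deferred rather than hidden.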
Please consider the points below:
All client-side resources are viewable, although you can make your JavaScript less readable, and it is better to do most of your work on the server side.
If your app is a public website that will be indexed by search engines, you need to know what those search engines like, as some of them do not crawl pages that consist only of JavaScript.
You can display images without <img> tags by using the CSS background-image property.
There are some useful tools for making your code harder to read, such as Closure Compiler Service, JSFuck and JS packers, although it's better to learn these techniques and apply them yourself. Note that some of them will make your code larger.
Finally, there is no such thing as a truly blank page source: it should contain at least a <script> tag. If you do see a genuinely blank page, the server may be refusing to serve it as a top-level page, and it may only work when embedded in an iframe, when specific headers are sent with the request, or under some other condition.
You can make your server and client sides cooperate :) to get a better and more secure result.
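To make the background-image point concrete, here is a small sketch of a script assigning an image at runtime, so that neither the HTML nor the static stylesheet contains the URL. The .portrait class, the image path and the function name are assumptions made for illustration.

```javascript
// Set a background image on an element at runtime via its inline style.
function applyBackgroundImage(element, imageUrl) {
  element.style.backgroundImage = `url("${imageUrl}")`;
  return element;
}

// In a browser: find a (hypothetical) placeholder element and give it an image.
if (typeof document !== 'undefined') {
  const portrait = document.querySelector('.portrait');
  if (portrait) applyBackgroundImage(portrait, '/images/portrait.png');
}
```

The image request is still listed in the browser's network tab, so this only moves the URL out of the obvious places.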
Whenever I make a website, my friend copies the HTML and CSS code from it and uploads the site under his own name. I have blocked right-click on my website, but he can still copy any code with Firebug. Is it possible to block Firebug on my website? Or is there any code that can protect my website?
Although you can't hide your HTML code completely, there are some ways to make a "stealer's" job a bit harder.
Source Code Padding
Really, the oldest trick in the book. It involves adding a ton of white space before the start of your code so that the view source menu appears blank. However, most people will notice the scroll bars and will scroll around to find your code. As pointless and silly as this method is, there are still some who use it.
No Right Click Scripts
These scripts stop users from right-clicking, where the "View Source" function is located.
Cons: Notoriously hard to get working across browsers and to actually work properly.
The right-click menu, or context menu, includes many helpful tools for users, including navigation buttons and the "Bookmark Page" button. Most users don't take kindly to having their browser functionality disabled and are inclined not to revisit such pages.
The View Source function is also available through the top Menu. At the main menu bar at the top of your browser, select View, and then in the sub-menu, you'll see "View Source" or something similar. Also, there are keyboard shortcuts like Ctrl+U that can be used to view source. All this method does is add about a two second delay to someone trying to view your source and it does irritate users who aren't trying to view your source.
"JavaScript Encryption"
This is by far the most popular way to try to hide one's source code. It involves taking your code, using a custom-made function to "encrypt" it somehow, and then putting it in an HTML file along with a function that will decrypt it for the browser. A user is still able to view the source, but it isn't understandable.
Cons: Your website is only usable for users with JavaScript enabled. This rules out search engines, users who've chosen to disable JavaScript, and users using a textual browser (such as the blind) that doesn't have JavaScript capabilities. Remember, JavaScript is a luxury, not a necessity on the web.
You have to include a means of decrypting the page so the browser can display it. Someone who understands JavaScript can easily decrypt the page.
Many browsers provide alternative ways around this. Some allow you to save the page, decrypted, for easy viewing later. Others, like Firefox, include tools like the DOM Inspector, which allows you to easily view and copy the markup of the page, decrypted.
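The weakness described above can be shown in a few lines. This is a deliberately toy scheme (a fixed character shift, with names invented for this sketch), not any particular product's method, but the same reasoning applies to fancier ones: the decoder has to ship with the page.

```javascript
const SHIFT = 3; // arbitrary offset chosen for this example

// "Encrypt" markup by shifting every UTF-16 code unit by a fixed amount.
function encode(html) {
  return html
    .split('')
    .map((ch) => String.fromCharCode(ch.charCodeAt(0) + SHIFT))
    .join('');
}

// The page must include this function so the browser can render the content,
// which means any visitor can call it too.
function decode(scrambled) {
  return scrambled
    .split('')
    .map((ch) => String.fromCharCode(ch.charCodeAt(0) - SHIFT))
    .join('');
}

const scrambled = encode('<p>Hello</p>'); // unreadable in View Source
const recovered = decode(scrambled);      // original markup again
```

Pasting the page's own decode function into the browser console is enough to get the original markup back.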
HTML Protection Software
There are some less than honest people who want to sell you software to quickly and conveniently "protect" your source code. This type of software generally employs the above methods, in varying ways, to hide your source code. Many people think that if they are buying it, it must work. It doesn't. As we've seen, the above methods are all easily circumvented, and all this software does is implement these horribly flawed methods for you and take your money. Don't fall for them. I've yet to see a single one that works, and none ever will.
Isn't there Any Hope?
The bottom line is that browsers need to see the unencrypted, plain text source code to create a webpage. For that reason, it's impossible to hide your HTML source code. If the browser can read it, which it needs to be able to do to render a webpage, then so can a user. That's the bottom line.
But My Page Was Stolen!
A lot of people look for this after having their website pirated. I know it's cruel that in a few minutes someone can steal hours of your work, but hiding your source code can't help you. Contacting the person in question and asking them to take it down solves many cases. Otherwise, contacting the web host or the person's ISP and explaining the situation is a good course of action. I can't give you legal advice, but if you feel that your copyrights are being infringed, you can contact a lawyer. But hiding (or "encrypting") your source won't do much of anything at all.
The Bottom Line
Unfortunately, the short answer to this question is, you can't. There have been various methods put forth, but all of these are easily circumvented. In the end, the only sure fire way to make sure no one can steal your source code is to never put it on the Internet at all.
Hope that helps
Source
I'm developing a WordPress site, which I'm fairly new to. I'm not sure if this is a stupid question or not, but I haven't been able to find any decent Google results on it. Anyway, is there a way to find out what PHP function is generating a piece of HTML code using a browser code inspector like Chrome's? Thanks!
No.
Once the data arrives at the browser, all the PHP code has already been processed, and you can't tell which part of the PHP generated which part of the HTML.
No, not without modifying the PHP code to enable some kind of debugging. Chrome can only give you information about the HTML document received on the client side (you), whereas PHP code is executed on the server.
You kind of can:
Download a copy of the theme and plugins folder
Open the page on your site that you want to find the function for.
Find a div/class that is specific to that section, e.g. <article>
Open a text editor like Notepad++ (one that will allow you to search through multiple files at once)
Use the find feature of your chosen text editor and search for the div/class
The result will show you a list of files where that term appears.
Look through those files for the function you are looking for (it might take a few goes)
The above is a bit of a roundabout way of doing it, but other than looking through each file separately, I think it is your next best option.
I am working on an applet that allows the user to input a URL to a news article or other webpage (in Japanese) and view the contents of that page within an iFrame in my page. The idea is that once the content is loaded into the page, the user can highlight words using their cursor, which stores the selected text in an array (for translating/adding to a personal dictionary of terms) and surrounds the text in a red box (div) according to a stylesheet defined on my domain. To do this, I use cURL to retrieve the HTML of the external page and dump it into the source of the iFrame.
However, I keep running into major formatting problems with the retrieved HTML. The big problem is preserving style sheets, and to fix this, I've used DOMDocument to add tags to the section of the retrieved HTML. This works for some pages/URLs, but there are still lots of style problems with the output HTML for many others. For example, div layers crash into each other, alignments are off, and backgrounds are missing. This is made a bit more problematic as I need to embed the output HTML into a new <div> in order to make the onClick JavaScript function for passing text selections in the embedded content work, which means the resulting source ends up looking like this:
<div onclick="parent.selectionFunction()" id ="studyContentn">
<!-- HTML of output from cURL, including doctype declarations and <html>,<head> tags -->
</div>
It seems like a lot of the formatting issues I keep running into are largely arbitrary. I've tried using PHP Tidy to clean up the HTML output, but that also works for some pages and not for many others. I've got a slight suspicion it may have to do with CDATA declarations that get parsed oddly when working with DOMDocument, but I am not certain.
Is there a way I can guarantee that HTML output from cURL will be rendered correctly and faithfully in all instances? Or is there perhaps a better way of going about doing this? I've tried a bunch of different ways of approaching this issue, and each gets closer to a solution but brings its own new problems as well.
Thanks -- let me know if I can clarify anything.
If I understand correctly, you are trying to pull the HTML of a complete web page and display it under your domain, inside your own HTML. This is always going to be tricky: a lot of JavaScript will break, relative URLs will be wrong and, as you mentioned, styles will break as well. You're probably also changing the dimensions that the page is displayed in. These can all be worked around, but you're going to be fighting an uphill battle with each new site, or whenever a current site changes its design.
I'd probably take a different approach to the problem. You might want to write a browser plugin as the interface to the external web site instead. Then your applet can sit on top of the functional and tested (hopefully) site. Then you can focus on what you need to do for your applet rather than a never ending list of fiddly html issues.
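If you do stay with the fetch-and-embed approach, the relative-URL problem mentioned above is one of the more tractable ones. Below is a minimal sketch in JavaScript (the question uses PHP, so treat it as an illustration of the idea rather than drop-in code); the function name is made up, and the regular expression is a simplification that a real HTML parser would handle more robustly.

```javascript
// Rewrite href/src attribute values so they resolve against the original page URL
// instead of the domain the copied HTML is now served from.
function absolutizeUrls(html, pageUrl) {
  return html.replace(
    /\b(href|src)=(["'])(.*?)\2/gi,
    (match, attr, quote, value) => {
      try {
        return `${attr}=${quote}${new URL(value, pageUrl).href}${quote}`;
      } catch {
        return match; // leave values the URL parser rejects untouched
      }
    }
  );
}
```

This does nothing for URLs inside CSS files or ones built by scripts at runtime, which is part of why the approach stays fragile.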
I am trying to do a similar thing. It is very difficult to preserve the formatting, and the JS scripts in the web page complicate things further. I finally gave up on the idea of displaying the original format completely, and used a workaround instead:
Select only the headers, links, lists and paragraphs which you are interested in.
Add the domain path of your own site to links.
You may wrap the headers, links and other items in your own class.
Display it
In your case you want to select text and store it, which is another topic. What I did was parse the HTML in two levels, after which it is easy to do the selection. Keep in mind that IE and Firefox/Chrome need to be dealt with separately.
I have a problem with PHP and JavaScript/CSS.
I have a database with a table that holds descriptions of articles. I want to echo those descriptions from the database. Unfortunately, many of them include JavaScript or CSS (followed by the article text), so when I use echo, it shows all of that code (and then the text). Is there any way to hide the JavaScript/CSS part and show only the text? For example, with str_replace and a regular expression? If so, can somebody show me what it should look like?
Thanks for the help, and let me know if you need more info (code etc.)
Use HTML Purifier: it will remove the scripts, CSS and any harmful content from your articles. Since this is a CPU-intensive operation, it's better to run an article through HTML Purifier once before saving it to the database than to run it each time you display the article.
If you're trying to remove tags from a user's post, you can call strip_tags. This will get rid of CSS links, script tags, etc. Be aware that strip_tags removes only the tags themselves, so the text between <script> or <style> tags is left behind and has to be removed separately. It will not get rid of the style attribute, but if you get rid of div, span, p, etc. that won't matter: there will be no tag for it to reside on.
As others have stated, it is generally better to sanitize your input (data from the user before it goes into the DB) than to sanitize your output.
If you're simply trying to hide the JS and CSS from users, you can use Packer with base 62 encoding to obfuscate the JavaScript from less-savvy users. The JS will still work but will look like gibberish. Be aware that more knowledgeable users can deobfuscate the code, so any critical security risks in the JS still exist. Don't think any JS that accesses your databases directly will be safe; instead, remove database access from the JavaScript for security. If the JS just does fancy things like move elements around the page, it's probably fine to just obfuscate it.
Only consider this if YOU have complete control and awareness of all JS included with the articles. If this is something your anonymous or otherwise not 120% trusted users can upload, you need to kill that functionality and use HTML Purifier to remove any JS they might add. It is not safe to output user-entered JS, for you or your users.
For the CSS, I'm not sure why you want to hide it, and CSS can't be obfuscated quite like JS can. The styles will still be in plain English; the best you can do is butcher the class/id names and the whitespace. Outputting CSS that YOU generated isn't a real security risk though, and even if people reverse engineer it I wouldn't be that afraid.
Again, if this is something anonymous/non trusted users can ADD to your site on their own, you don't want this at all, so remove the ability to upload CSS with an article using the HTML Purifier Darhazer mentioned.
You can try the following regex to remove the script and css:
"<script[\d\D]*?>[\d\D]*?</script>"
"<style[\d\D]*?>[\d\D]*?</style>"
It should help, but it cannot remove every script. Inline event handlers such as onclick="javascript:alert(1)", for example, are left in place.
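For reference, here is how those two patterns behave when applied, sketched in JavaScript (in PHP the same patterns would go through preg_replace with delimiters added). The function name is invented for this example, and the second half shows the limitation mentioned above.

```javascript
// Remove whole <script>...</script> and <style>...</style> blocks, contents included.
// The "i" flag makes the match case-insensitive; "g" removes every occurrence.
function stripScriptAndStyle(html) {
  return html
    .replace(/<script[\d\D]*?>[\d\D]*?<\/script>/gi, '')
    .replace(/<style[\d\D]*?>[\d\D]*?<\/style>/gi, '');
}

const article =
  '<p onclick="alert(1)">Hi</p>' +
  '<script type="text/javascript">alert(2)</script>' +
  '<style>p{color:red}</style>';

// The blocks are gone, but the inline onclick handler survives.
const cleaned = stripScriptAndStyle(article); // '<p onclick="alert(1)">Hi</p>'
```

Because of cases like the surviving onclick, a regex pass is fine for tidying your own content but is not a substitute for a real sanitizer on untrusted input.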
I have a client who wants me to do CSS coding only, but doesn't want to give me the php files.
Right now, I just have access to the live website (with no CSS).
It is entirely made with tables and I want to use divs instead
I'm not sure if it is possible to do the coding
I thought about copying and pasting the generated HTML code from each page
Will this cause possible problems with the end result?
Yes, this will cause huge problems: you'll do an awesome job, client will have trouble integrating it with their site, client will abandon your awesome work.
IMO, you should let the client know that you'll do the best you can with what they have given you, but you would be able to save them a lot of work and do a better job if you could have access to the source code.
If you know that you can't make the client happy with what they have given you, though, it would be doing everyone a disservice for you to try.
If you absolutely can't convince them to give you access to the source, then this client sounds stupid:
He has a layout which is table based.
He wants you to magically make it look better with CSS, without having access to the source.
"#Phoenix I don't see any classes or IDs." - there are no classes or ids to hook into.
You might be able to do it if you used some CSS3 selectors to, for example, select the 3rd td inside a td inside the 2nd table to apply styles to ;)
But, that won't help if you have to support older browsers, which makes this impossible at the moment without doing something differently.
I don't have full knowledge of your situation, but here's what I would probably do (if I couldn't convince them to give me access to the source):
Open the live site.
Copy the HTML source code.
Paste it into a new local file.
Add this into the <head> section: <base href="http://the-clients-site.com/" />.
This will let all the assets on the page load from the client's actual site.
Now, you have something to work with.
You have to keep track of ALL changes you make to the file.
The first change should be adding your own blank style tag.
Then, you can add id and class to whichever elements you feel need it.
You should try to avoid moving around elements, unless it's absolutely required. Those changes are a whole lot harder to explain to someone. I know from experience.
You should be able to style the page properly now.
Then, you deliver the completed page, and the documented list of changes you had to make to the HTML (add id, here add class there).
The client should then be able to integrate the changes into his site.
Well, at a bare minimum they'll need to modify their PHP to reference your CSS. More importantly, you need to be able to hook your CSS up to elements. Do the tables/rows/etc. have IDs or classes attached?
If they are clever and have some good separation between code and presentation (using a templating engine or similar) then you can probably just edit the template / css.
If they won't let you edit the PHP and you come up with a new awesome layout, they will have a nightmare job trying to integrate it and probably won't bother.
I don't see the problem. You can style tables just as easily as divs. You don't have to know how the wall is built to know how to paint it, which is pretty much all you've been hired to do. Only problem I could see would be if they haven't added any classes or ids to the elements yet. After all, what the browser/client sees is the only thing that needs styling, and since you can see everything that the browser sees, you can see everything that needs styling.
If they have added classes/ids, then just take a copy of a page and style it in a testing area. Once it looks nice, take a copy of another page and make sure it looks nice with the same CSS, adding to the CSS if there are any unstyled elements that didn't exist on the first page. Then move on to another page, and another, repeating the process until you are satisfied that every page, within reason, would look nice with it.
If they haven't added classes/ids, tell them they need to in some capacity before you can work on it, perhaps provide some guidance on the issue.
I'm actually doing this right now for SO.
I'm working on a userscript that provides an alternate "clean" stylesheet for the StackExchange network. I have no access to the SO engine. I am using the Chrome Inspector to look at how the elements are set up. I recommend the same. (Although it is a little different, since I'm modifying the original CSS file.)
You can easily identify what you want to style with the Inspector and then work from there. I would suggest that you ask your client for a list of classes and IDs though. (I got that in the form of an existing stylesheet, you can go about it in a different way, if that suits you and your client.)
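For anyone curious what the userscript side of this looks like, here is a minimal sketch of injecting a stylesheet into a page you don't control. The function name and the CSS rule are placeholders for this example, not taken from my actual script.

```javascript
// Create a <style> element holding the given CSS and append it to the page head.
// The document is passed in as a parameter so the function is easy to exercise in isolation.
function injectCss(doc, cssText) {
  const style = doc.createElement('style');
  style.textContent = cssText;
  doc.head.appendChild(style);
  return style;
}

// In a browser (for example from a userscript), restyle the existing table cells.
if (typeof document !== 'undefined') {
  injectCss(document, 'table td { padding: 4px 8px; }');
}
```

This lets you try out CSS against the live markup without touching any server-side files, which is handy when the client won't share them.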