I've just discovered that my website (html/php) is vulnerable to XSS attacks.
Is there any way to sanitize my data besides manually adding htmlspecialchars to each individual variable that I send to the webpage (and probably missing a few, thereby leaving it still open to attack)?
No, there is no shortcut. Data escaping always needs to happen on a case-by-case basis; not only with regard to HTML, but for any other textual format as well (SQL, JSON, CSV, what have you). The "trick" is to use tools which do not require you to think about this much and hence leave you fewer chances to "miss" something. If you're just echoing strings into other strings, you're working at the bare-metal level and you do need a lot of conscious effort to escape everything. The generally accepted alternative is to use a templating language which implicitly escapes everything.
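To make "case by case" concrete, here is a minimal sketch in plain PHP of the same untrusted string being prepared for three different output formats. The sample value and variable names are made up for illustration; only standard PHP functions are used.

```php
<?php
// Untrusted value; the same string needs different treatment per output format.
$name = 'Tom "<b>O\'Neil</b>" & co';

// HTML text or attribute context: convert markup-significant characters to entities.
$safe = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
$html = '<p title="' . $safe . '">' . $safe . '</p>';

// JSON context: let json_encode() build the string literal.
// The JSON_HEX_* flags also keep <, >, &, ' and " out of the encoded output.
$json = json_encode(['name' => $name], JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

// CSV context: let fputcsv() do the quoting instead of joining fields by hand.
$stream = fopen('php://memory', 'w+');
fputcsv($stream, [$name, 'second field'], ',', '"', '\\');
rewind($stream);
$csv = stream_get_contents($stream);
fclose($stream);

echo $html, "\n", $json, "\n", $csv;
```

Every call site above has to pick the right function by hand, which is exactly the effort that a templating language with implicit escaping takes off your hands.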
For example, Twig:
The PHP language is verbose and becomes ridiculously verbose when it
comes to output escaping:
<?php echo $var ?>
<?php echo htmlspecialchars($var, ENT_QUOTES, 'UTF-8') ?>
In comparison, Twig has a very concise syntax, which makes
templates more readable:
{{ var }}
{{ var|escape }}
{{ var|e }} {# shortcut to escape a variable #}
To be on the safe side, you can enable automatic output escaping globally or for a block of code:
{% autoescape true %}
{{ var }}
{{ var|raw }} {# var won't be escaped #}
{{ var|escape }} {# var won't be double-escaped #}
{% endautoescape %}
This still lets you shoot yourself in the foot, but is a lot better.
One step up still is PHPTAL:
<div class="item" tal:repeat="value values">
    <div class="title">
        <span tal:condition="value/hasDate" tal:replace="value/getDate"/>
        <a tal:attributes="href value/getUrl" tal:content="value/getTitle"/>
    </div>
    <div id="content" tal:content="value/getContent"/>
</div>
It requires you to write valid HTML simply to compile the template, and the template engine is fully aware of HTML-syntax and will process all user data at the level of a DOM, instead of a string soup. This relegates HTML to a pure serialisation format (which it should be anyway) which is produced by a serialiser whose only job it is to turn an object oriented data structure into text. There's no way to mess up that syntax through bad escaping.
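The "DOM instead of string soup" idea can be tried without PHPTAL. The following is only a sketch using PHP's bundled DOMDocument class, not how PHPTAL itself is implemented; the sample title and URL are invented for illustration.

```php
<?php
// Hostile input that would break the markup if it were concatenated into a string.
$title = '"><script>alert(1)</script>';
$url   = 'https://example.com/?a=1&b=2';

$doc = new DOMDocument('1.0', 'UTF-8');

// Build nodes instead of strings; user data only ever becomes text or attribute values.
$div = $doc->createElement('div');
$div->setAttribute('class', 'title');

$link = $doc->createElement('a');
$link->setAttribute('href', $url);
$link->appendChild($doc->createTextNode($title));

$div->appendChild($link);
$doc->appendChild($div);

// The serialiser is the only place where the data structure is turned into HTML text.
$out = $doc->saveHTML($div);
echo $out, "\n";
```

The script tag in $title ends up as entity-encoded text inside the link, because the serialiser knows it is writing a text node, not markup.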
I'm currently working on a code review for a friend and I found an XSS vulnerability I'd like to understand properly:
Let's say I have a variable foo.bar with the input <h1>test</h1>
I have now figured out this pattern:
{{foo.bar}} -> no XSS
{% trans with { '%var%': foo.bar } %} My "%var%" {% endtrans %} -> XSS
{% trans with { '%var%': foo.bar | e('html') } %} My "%var%" {% endtrans %} -> no XSS
I thought I'd run a regex pattern through his whole code to find other potential places with bad encoding of HTML characters, but I did not quite understand when Twig encodes HTML tags and when it does not. I do understand the "e" (escape) filter, which encodes my variable value into HTML entities, but why does {{foo.bar}} encode the characters while {% trans with ... does not?
I would search with this pattern for Coding mistakes in Twig:
Regex:
'\{%(.){0,2}[trans](.){0,2}[with].*'
-> Searching for "{%[space?]trans[space?] with"
as I guess every time he missed the |e('html') there might be an issue. Am I on the right track? Am I missing something?
I hope I can find more clarification on this topic here :)
Twig always escapes, but "trans with" is part of Symfony and not Twig. It is not autoescaped because the value is passed to a tag, and the tag may output it, but that is not a certainty, which is why they refuse to autoescape it.
I personally always use the |trans() filter instead, so by default you know you are safe; you can of course still use |raw if needed.
https://symfony.com/doc/current/translation/templates.html
Using the translation tags or filters has the same effect, but with one subtle difference: automatic output escaping is only applied to translations using a filter. In other words, if you need to be sure that your translated message is not output escaped, you must apply the raw filter after the translation filter.
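For the audit itself, note that in a regex [trans] is a character class matching a single letter, not the word "trans". Here is a rough sketch of a scanner in PHP that finds {% trans with ... %} tags whose arguments contain no |e or |escape filter. findUnescapedTransTags() is a hypothetical helper written for this answer, not part of Twig or Symfony, and the check is only a heuristic.

```php
<?php
/**
 * Returns every `{% trans with ... %}` opening tag in $template whose
 * arguments contain no `|e` / `|escape` filter.
 * Hypothetical helper for illustration; not part of Twig or Symfony.
 */
function findUnescapedTransTags(string $template): array
{
    // `{%` or `{%-`, optional whitespace, `trans`, whitespace, `with`, then the
    // arguments (matched lazily) up to the closing `%}` or `-%}`.
    preg_match_all('/\{%-?\s*trans\s+with\b(.*?)-?%\}/s', $template, $matches, PREG_SET_ORDER);

    $suspects = [];
    foreach ($matches as $match) {
        // Heuristic: flag the tag when no escape filter appears in its arguments.
        if (!preg_match('/\|\s*(?:e|escape)\b/', $match[1])) {
            $suspects[] = trim($match[0]);
        }
    }

    return $suspects;
}

$template = <<<'TWIG'
{{ foo.bar }}
{% trans with { '%var%': foo.bar } %} My "%var%" {% endtrans %}
{% trans with { '%var%': foo.bar | e('html') } %} My "%var%" {% endtrans %}
TWIG;

print_r(findUnescapedTransTags($template));
```

With the three lines from the question, only the second one is reported. A tag that passes several variables and escapes just one of them would slip through, so treat the result as a list of places to review rather than as proof that everything else is fine.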
I need to render raw HTML on a page using Twig. The issue I'm having is that when I have two consecutive HTML elements separated by white-space, that white-space gets removed.
How can I preserve that space?
I'm rendering HTML string as so:
{% set _html = entity.html %}
{{ _html|raw }}
For example:
<p>start <span class="some-class">one</span> <span class="some-class">two</span> stop</p>
Is rendered as:
<p>start <span class="some-class">one</span><span class="some-class">two</span> stop</p>
I'm sure that Twig's raw filter is sanitizing my data and is therefore causing my issue.
How I see it:
As cale_b recommended, I will be using the following CSS hack to add a space before any element that lacks my spacing:
.monograph {
    * + span:before {
        content: ' ';
    }
}
As answered here and also in my case, the issue was that I had a {% spaceless %} tag that wrapped my content. So |raw was working correctly, but it was the spaceless tag that was stripping the spaces.
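To see why the two spans end up glued together, here is a small PHP sketch of roughly what spaceless does: it drops whitespace that sits between a closing > and the next <. The function name spaceless() is just a stand-in for illustration; the input is the example from the question.

```php
<?php
// Stand-in for the behaviour of Twig's spaceless: remove whitespace between tags only.
function spaceless(string $html): string
{
    return trim(preg_replace('/>\s+</', '><', $html));
}

$html = '<p>start <span class="some-class">one</span> <span class="some-class">two</span> stop</p>';

// The space between </span> and <span ...> is removed; spaces next to plain text stay.
echo spaceless($html), "\n";
```

The spaces around "start" and "stop" survive because they are adjacent to text, while the single space between the two span elements is between two tags and is therefore removed.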
I started to use Twig as a template engine and I rather like it.
The only thing I don't know how to disable is the optimized HTML that it renders (newest version of Twig).
Twig seems to remove all unused white space and line breaks.
In production mode this is quite useful if you have a page that should rank highly in Google.
But during development it is not really useful.
So my question: How do you disable this?
If you use the spaceless tag, Twig removes whitespace between HTML tags, not whitespace within HTML tags or whitespace in plain text:
{% spaceless %}
<div>
<strong>foo</strong>
</div>
{% endspaceless %}
output will be <div><strong>foo</strong></div>
For more information on whitespace control, read the dedicated section of the documentation and learn how you can also use the whitespace control modifier on your tags.
Is your Twig version 1.18.1?
In my Laravel app I allow users to store some text from a text area. When outputting the text I would like to escape the text retrieved from the DB, but also convert any line breaks from the text into <p> tags. I have a function nl2p() that works well for this, but it gets escaped when I place it inside the triple brackets defeating the purpose: {{{ nl2p($bio) }}}
I tried doing something like this:
<?php $formatted_bio = {{{ $user->bio }}}; ?>
<h2>{{ nl2p($formatted_bio) }}</h2>
but data can't be echoed into a variable like that. Any creative solutions out there I may have overlooked?
Try using the e() helper function Laravel provides. It is basically what Blade calls under the hood when you do the triple braces.
So you'd have:
<h2>{{ nl2p(e($user->bio)) }}</h2>
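The order matters: escape the user's text first, then add your own markup. Below is a self-contained sketch of that idea in plain PHP. Both functions are assumptions for illustration: e() here is a stand-in so the code runs outside Laravel, and nl2p() is a guess at what such a helper might look like, since the original was not posted.

```php
<?php
// Stand-in for Laravel's e() helper so the sketch runs outside the framework.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Hypothetical nl2p(): wraps each non-empty line in <p>...</p>.
function nl2p(string $text): string
{
    $lines = preg_split('/\R+/', trim($text));

    return implode("\n", array_map(fn (string $line): string => '<p>' . $line . '</p>', $lines));
}

$bio = "I <3 PHP\n<script>alert(1)</script>";

// Escape first, then add markup: the <p> tags survive, the user's tags do not.
echo nl2p(e($bio)), "\n";
```

The result must then be printed without a second round of escaping. In Laravel 4 that is the double-brace form shown above; from Laravel 5 onwards double braces escape and {!! !!} is the raw form.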
I use YAML text files for storing small paragraphs of text for my Silex/Twig website:
use Symfony\Component\Yaml\Yaml;
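$loader = Yaml::parse(file_get_contents('/path/to/file.yml')); // read the file first: recent versions of Yaml::parse() only accept a YAML string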
My files look like:
block_2:
    id: 2
    title: "Title"
    body: |
        Lore ipsum <strong>legend</strong>
        Lore ipsum dolorem etc.
In my Twig file I display the variable:
<p>{{ block.body }}</p>
Now the output is not as expected, as line breaks and HTML tags are not properly parsed. Instead the text is rendered in the browser as:
Lore ipsum <strong>legend</strong> Lore ipsum dolorem etc.
How do I properly parse html and line breaks?
By default, twig escapes all input of the templates. This makes your templates very safe.
In some cases, however, it is safe to output the raw input. Note that the escape filter (<p>{{ block.body|escape('html') }}</p>) and the autoescape tag with the 'html' strategy ({% autoescape 'html' %}<p>{{ block.body }}</p>{% endautoescape %}) do not mark the content as safe HTML: both escape it, which is what already happens by default. Raw output can be produced in 3 ways:
Using the raw filter: <p>{{ block.body|raw }}</p> (useful if you are 200% sure it'll be safe)
Using the autoescape tag to switch escaping off for a section: {% autoescape false %}<p>{{ block.body }}</p>{% endautoescape %} (useful when doing it for multiple inputs in the same section)
By disabling auto escaping altogether for all templates (not recommended)
See also HTML Escaping in the documentation
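To see what is happening to the body text, here is a small plain-PHP sketch comparing the escaped output with raw output plus line-break conversion. It only imitates the effect of escaping with htmlspecialchars() and is not Twig's own code; Twig also ships an nl2br filter that plays the role of PHP's nl2br() here.

```php
<?php
// The `body` value from the YAML example, as a PHP string.
$body = "Lore ipsum <strong>legend</strong>\nLore ipsum dolorem etc.\n";

// HTML escaping: the tags become visible text, and the newline is left alone
// (a browser collapses it into a single space).
$escaped = htmlspecialchars($body, ENT_QUOTES, 'UTF-8');

// Trusted content only: keep the markup and turn newlines into <br /> tags.
$rendered = nl2br(trim($body));

echo $escaped, "\n", $rendered, "\n";
```

The first form is what the browser showed in the question; the second is what was expected, and it is only appropriate when the YAML file is under your own control.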