Smarter word wrap with Smarty? - php

I'm trying to find a way to wrap a long headline after a specific number of words, based on the total character count of the headline. My purpose is to make the bottom line of the wrapped text longer than the top line to increase readability.
I'd like to use Smarty to find the character count of the headline, then decide how long to make the first line based on the default font size and the width of the containing element. But I'm not a coder and don't know the best way to make arrays, foreach loops, iteration counts, and other stuff that's probably necessary to pull this off.
I'm basically trying to:
Find the total character count of the headline using {$item.name|count_characters:true}
If the total character count is between 60 and 100 characters, add a br tag at the end of the first word that ends past 30 characters.

I believe you can do this with register_modifier(). Basically, you write a PHP function that inserts the tag, then register it as a modifier. After you've done that, use it in Smarty like you would any other modifier, i.e.:
{$variable|break_title}
In general, it's better not to do complex formatting within Smarty templates. Things are cleanest the closer your templates are to vanilla HTML.
Possible implementation:
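function break_title($title) {
    // Double quotes are required so that \n is a real newline, not a literal backslash-n
    return wordwrap($title, 59, "<br />\n");
}
/* later */
// register_modifier() is the Smarty 2 API; in Smarty 3 and later the equivalent is
// $smarty->registerPlugin('modifier', 'break_title', 'break_title');
$smarty->register_modifier('break_title', 'break_title');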
If you want to take font size into account, you can set a global configuration variable indicating the number of characters to break after.
EDIT:
As the commenter mentions, if there is an existing PHP function that does what you want, you can use it as a modifier without registering it:
{$variable|wordwrap:59:"<br />\n"}
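Note that wordwrap() breaks before the limit, while the question asks for a single break after the first word that ends past 30 characters, and only for headlines of 60 to 100 characters. A minimal sketch of that rule follows. The function name break_headline() and its parameters are made up for illustration, and it counts bytes, so it assumes single-byte text (the mb_* functions would be needed for multibyte headlines).

```php
<?php
// Hypothetical helper, not part of Smarty. Thresholds default to the
// numbers given in the question (60, 100 and 30 characters).
function break_headline($title, $minLen = 60, $maxLen = 100, $breakAfter = 30)
{
    $length = strlen($title);

    // Headlines outside the 60-100 range are left untouched.
    if ($length < $minLen || $length > $maxLen) {
        return $title;
    }

    // Find the first space that follows a word ending past $breakAfter
    // characters (the word's last character sits at offset >= $breakAfter).
    $pos = strpos($title, ' ', $breakAfter + 1);
    if ($pos === false) {
        return $title; // no later word boundary, nothing to break
    }

    // Replace that space with the line break.
    return substr($title, 0, $pos) . "<br />\n" . substr($title, $pos + 1);
}
```

It could be registered the same way as break_title above and used as {$item.name|break_headline}.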

Related

PHP Data Scraping

I would like to scrape some data from a website using PHP using
preg_match("/ /i/s", $contents, $matches);
The website I am trying to get data from looks like this
https://www.spareroom.co.uk/flatshare/?search_id=592135669&
I would like to scrape the line that says;
Showing 1-17 of 17 results
I want to use (.*?) to get the total number of properties (in this case 17) for a website to show this information separately.
How can I use preg_match when the data I am scraping varies according to the amount of properties available?
I look forward to any assistance.
David
Going by the example page it looks like this line appears once on a page. If it appears multiple times you may want preg_match_all to return multiple results. Another tricky thing about doing this is changes that get made to a web page from time to time. So here is a solution that will work right now, but you can also tweak things to account for changes in the web page (something I can't tell from a single example):
preg_match( "#<.*?>\s*(\d+)\s*<.*?>\s+results#i", $page, $results );
So I use the i flag to make it case insensitive. That way if they capitalize "results" or something it won't break.
<.*?>
Keep in mind that you are going to be getting the HTML code which has tags you can't see from the front. In this case there are strong tags around the total. But maybe they will change this to a different tag in the future? So I just used open/close angle brackets with wildcards for the contents. Oh and the question mark is so it's not greedy and stops at the closest angle bracket.
\s*
This looks for 0 or more spaces. Right now there is a single space between the strong tag and the total. What if they remove that space or add more? This should cover you in both cases.
(\d+)
The parentheses are what capture content into the $results array. Inside them it says 1 or more digits, so only numbers.
\s*
Like earlier, looking for 0 or more space characters.
<.*?>
This is to match the closing strong tag but takes account that they could later use a different closing tag.
\s+results
This is looking for one or more spaces before the word results. We know there has to be at least one, but they could make changes in the future that will put more spaces in there (even though the webpage will only display one).
$results will have two elements. The first one will be the entire matched phrase, and the second will contain just the captured part (between the parentheses).
There are a million variations you can do to account for variations in the HTML, but this is one that maybe can get you started and you could tweak.
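Put together, it might look something like the sketch below. The sample markup in the usage note is an assumption based on the description above (strong tags around the total), not a copy of the real page, and extract_total() is just an illustrative name.

```php
<?php
// Hypothetical helper: returns the total as an integer, or null when the
// pattern does not match (for example after the page layout has changed).
function extract_total($page)
{
    if (preg_match('#<.*?>\s*(\d+)\s*<.*?>\s+results#i', $page, $results)) {
        // $results[0] is the whole match, $results[1] is the captured number.
        return (int) $results[1];
    }
    return null;
}
```

Usage, with assumed markup: extract_total('Showing 1-17 of <strong>17</strong> results') gives the integer 17. In a real script $page would be the downloaded HTML, for example from file_get_contents().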

preg_replace function for multiple matches in PHP

I am trying to get the base64 data of an inserted image from the string produced by the Summernote JS editor.
I managed to get this:
preg_replace('#<img\ssrc="data:image/([\w]+);base64,([a-zA-Z0-9+/-_=]+)"\sstyle="width:\s[0-9]+px;">#',"IMAGE REMOVED FROM CHROME\r\n", $content,-1, $counter);
as well as a few variations, because the positions of the style, data-filename and src attributes keep changing depending on the browser.
1) Is there an easier way to do this?
For example, if I know all the components of the img tag (src, style and data-filename), can I match the string in one go? I can write out all the variations, but if I run $content = preg_replace 10 times for 10 different variations, isn't that extremely slow just to find one match? And it gets slower the longer my $content string is.
2) I need to pull out the base64 string to save it as an image. How can I use the regex above together with fopen, fwrite and fclose?
Thanks in advance.
You might use something like this:
<img(\s+src="data:image/([\w]+);base64,([\w+/-_=]+)")?(\s+style="width:\s*([\d]+)px;")?\s*>
You can look at this working example.
This way, it covers the absence of src and/or style attributes.
It's up to you to eventually refine it to:
change the 2 attribute groups to non-capturing, depending on how you find them useful or not
also make their contents partially optional, or even optionally richer (e.g. style might include other properties)
Some additional points:
to cover more cases, I replaced \s with \s+ between attributes and, conversely, with \s* inside style
I simplified the [a-zA-Z0-9+/-_=]+ and [0-9]+ expressions, which become [\w+/-_=]+ and [\d]+
I added capturing parentheses around this last one (from your question, it seems you need the width when available)
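For the second part of the question (saving the image), one option is preg_replace_callback(), which lets you decode and write each match while replacing it. The sketch below is only one way to do it and rests on some assumptions: it uses a looser pattern than the one above ([^>]* skips any other attributes, so their order does not matter), the function name and the image_N file naming are invented, and file_put_contents() stands in for the fopen/fwrite/fclose sequence.

```php
<?php
// Hypothetical helper: saves every inline base64 image found in $content
// into $dir and returns array(cleaned content, list of saved paths).
function extract_inline_images($content, $dir)
{
    $saved = array();
    // Group 1: image type (png, jpeg, ...), group 2: the base64 payload.
    $pattern = '#<img\b[^>]*?\bsrc="data:image/(\w+);base64,([A-Za-z0-9+/=]+)"[^>]*>#';

    $cleaned = preg_replace_callback($pattern, function ($m) use (&$saved, $dir) {
        $bytes = base64_decode($m[2], true);
        if ($bytes === false) {
            return $m[0]; // not valid base64, leave the tag as it is
        }
        $path = $dir . '/image_' . count($saved) . '.' . $m[1];
        file_put_contents($path, $bytes); // open, write and close in one call
        $saved[] = $path;
        return "IMAGE REMOVED\r\n";
    }, $content);

    if ($cleaned === null) {
        // preg_* functions return null on errors such as hitting backtrack limits.
        throw new RuntimeException('Regex error while scanning content');
    }
    return array($cleaned, $saved);
}
```

A single pass like this also avoids running ten separate preg_replace calls for ten attribute orders.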

Options for list within a fixed width, HTML PHP

Do you know of a good solution for displaying text entries of variable length in a list with a fixed width?
I have an unordered list with a set number of items. Each item contains a title. The title can vary not only in number of characters but also in character set (Korean, Japanese, Roman, etc.).
One option seems to be cutting the text down with PHP and adding "..." at the end, but since character widths vary, the exact cutoff point varies too. Another would be to make the items fixed width and hide the overflow, but this seems inelegant (because characters might get cut off right through the middle...).
Do you know of a good tutorial or solution for something like this?
Using CSS: text-overflow: ellipsis. Doesn't work in all browsers though. More information: http://www.quirksmode.org/css/textoverflow.html.
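If you do go with the PHP option from the question, mb_strimwidth() (from the mbstring extension) is worth a look: it trims by display width rather than character count, counting East Asian wide characters as two columns. A rough sketch, where truncate_title() and the default of 20 columns are just example choices. With a proportional font the result is still only an approximation of the rendered width.

```php
<?php
// Hypothetical helper; requires the mbstring extension.
// $maxWidth is measured in display columns and includes the "..." marker.
function truncate_title($title, $maxWidth = 20)
{
    // Returns $title unchanged when it already fits within $maxWidth.
    return mb_strimwidth($title, 0, $maxWidth, '...', 'UTF-8');
}
```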

How to contain text from breaking elements?

This is a very simple question, but embarrassingly enough I am not sure how to implement this :\
I have div elements with text boxes inside them for users to write comments, replies, etc. Very standard concept. The div that contains the text has a certain width. If someone were to write something without using any spaces, instead of breaking onto a new line when it hits the right edge of the div, the text just keeps going and breaks out of the element.
I just want to know how to make it break onto a new line when it hits the edge, whether or not the user typed any spaces.
Thanks in advance.
There is a word-wrap CSS property that will force long words to wrap:
word-wrap: break-word;
You might also want to look into overflow.
Hmmmm...This is an interesting problem. I'm thinking what I would do would be to implement a JavaScript function that would "read" the length of lines and if a single block of writing is longer than the max length, I would grab a substring of (max length - 1), append a dash (-) to the end of it, add in a line break, and proceed from there.
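The same idea can also be applied on the server before the comment is printed. Below is a rough PHP version of that approach (PHP rather than JavaScript is my substitution here, and break_long_words() is an invented name). It counts bytes, so it assumes single-byte text.

```php
<?php
// Hypothetical helper: splits any space-free run longer than $maxLen into
// pieces of ($maxLen - 1) characters plus a dash, joined by newlines.
function break_long_words($text, $maxLen)
{
    if ($maxLen < 2) {
        throw new InvalidArgumentException('$maxLen must be at least 2');
    }

    $words = explode(' ', $text);
    foreach ($words as $i => $word) {
        $pieces = array();
        while (strlen($word) > $maxLen) {
            // Take (max length - 1) characters and append a dash.
            $pieces[] = substr($word, 0, $maxLen - 1) . '-';
            $word = substr($word, $maxLen - 1);
        }
        $pieces[] = $word; // whatever is left fits on one line
        $words[$i] = implode("\n", $pieces);
    }
    return implode(' ', $words);
}
```

When printing into HTML, the result would typically go through nl2br() so the newlines become br tags.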

How to get sentences from the website html

Hello, I want to extract all sentences from an HTML document. How can I do that? There are many conditions: first we need to strip the tags, then we need to identify sentences, which may end with . or ? or !. Email addresses and website addresses may also have a . in them. How do we write a script like this?
It's called programming ;). Start by dividing the task in simpler sub-tasks and implement those. For example, in your case, I'd design the program like this:
Download and parse the HTML document
Extract all text content (pay special attention to <script> and <style> elements)
Merge the text content to one long string
Solve the problem of finding sentences in a string (likely, just parse until you find a stop character in ".!?" and then start a new sentence)
Discard false positives (Like empty sentences, number-only sentences etc.)
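The steps above can be sketched roughly as follows. This is a simplification under a few assumptions: the HTML is already in a string (no download step), regular expressions and strip_tags() stand in for a real HTML parser, and a sentence only ends at . ! or ? when whitespace follows, which is what keeps email and website addresses in one piece. extract_sentences() is an illustrative name.

```php
<?php
// Hypothetical helper: returns an array of sentences found in $html.
function extract_sentences($html)
{
    // Drop <script> and <style> elements together with their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html);

    // Put a space before every tag so neighbouring text is not glued
    // together, then strip the tags and decode entities.
    $text = strip_tags(str_replace('<', ' <', $html));
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');

    // Merge everything into one long string with single spaces.
    $text = trim(preg_replace('/\s+/', ' ', $text));

    // Split after a stop character only when whitespace follows it.
    $parts = preg_split('/(?<=[.!?])\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // Discard false positives: anything without a single letter.
    $sentences = array();
    foreach ($parts as $part) {
        if (preg_match('/[A-Za-z]/', $part) === 1) {
            $sentences[] = $part;
        }
    }
    return $sentences;
}
```

Abbreviations such as "e.g. " would still be split, so this is a starting point rather than a complete solution.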
First you should strip certain tags which are inline formatting elements, like:
I <b>strongly</b> agree.
But you should leave in block-level elements, like DIV and P, because they are even stronger delimiters than . ? and !
Then you have to process the content of these block-level elements. Typically there are navigation links containing a single word, which you might want to filter out later, so it is not the right choice to strip away the block structure of the document.
At this point you can safely use the regex pattern to identify blocks:
>([^<]+)<
When you have your blocks, you can filter out the short ones (navigation elements) and split the big ones (paragraphs of text) using your sentence delimiters.
There are interesting questions about when a full stop character signals the end of a sentence and when it is just a decimal point, but I leave that to you. :)
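A small sketch of the block step described above, under some assumptions: the list of inline tags is only an example and far from complete, the three-word minimum for a "real" block is an arbitrary threshold, and extract_text_blocks() is an invented name.

```php
<?php
// Hypothetical helper: returns the text blocks that look like real content.
function extract_text_blocks($html, $minWords = 3)
{
    // Remove inline formatting tags but keep the text inside them.
    $html = preg_replace('#</?(b|i|em|strong|u|span|a)\b[^>]*>#i', '', $html);

    // Every run of text between two remaining tags is one block.
    preg_match_all('/>([^<]+)</', $html, $matches);

    $blocks = array();
    foreach ($matches[1] as $chunk) {
        $chunk = trim(preg_replace('/\s+/', ' ', $chunk));
        // Filter out short blocks such as one-word navigation links.
        if ($chunk !== '' && str_word_count($chunk) >= $minWords) {
            $blocks[] = $chunk;
        }
    }
    return $blocks;
}
```

Each returned block can then be split into sentences with whatever delimiter rule you settle on.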
