I need to count words in a string using PHP or JavaScript (preferably PHP). The problem is that the count needs to match the one Microsoft Word reports, because that is where people write their original texts, so it is their frame of reference.
PHP has a word counting function (http://php.net/manual/en/function.str-word-count.php) but that is not 100% the same as far as I know.
Any pointers?
The real problem here is that you're trying to develop a solution without really understanding the exact requirements. This isn't a coding problem so much as a problem with the specs.
The crux of the issue is that your word-counting algorithm is different to Word's word-counting algorithm - potentially for good reason, since there are various edge-cases to consider with no obvious answers. Thus your question should really be "What algorithm does Word use to calculate word count?" And if you think about this for a bit, you already know the answer - it's closed-source, proprietary software so no-one can know for sure. And even if you do work it out, this isn't a public interface so it can easily be changed in the next version.
Basically, I think it's fundamentally a bad idea to design your software so that it functions identically to something that you cannot fully understand. Personally, I would concentrate on just developing a sane word-count of your own, documenting the algorithm behind it and justifying why it's a reasonable method of counting words (pointing out that there is no One True Way).
If you must conform to Word's attempt for some short-sighted business reason, then the number one task is to work out what methodology they use to the point where you can write down an algorithm on paper. But this won't be easy, will be very hard to verify completely and is liable to change without notice... :-/
This is a bit of a minefield, as MS Word's counts are considered wrong and unreliable by professionals who depend on word counts: journalists, translators, and lawyers, who are often involved in legal procedures where motions and submissions must be less than a specific number of words.
Having said that, this article:
http://dotnetperls.com/word-count
describes a pretty good regex algorithm implemented in C#, which should be fairly easy to translate into PHP.
I think the article's small inaccuracies are based on two factors. First, MS Word misses words not contained in "regular paragraphs", so words in footnotes, text boxes and tables may or may not be counted. Second, I think the EVIL smart quotes feature messing with hyphens may affect the results, so it may be worth changing all the en-dash and em-dash characters back to the normal minus sign.
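The dash normalisation suggested above could be sketched in JavaScript roughly like this. It is not Word's algorithm (which is not public), only one documented way of counting; the function name countWordsNormalized and the exact definition of a "word" are assumptions made for illustration.

```javascript
// Hypothetical helper: one possible word definition, NOT Word's own algorithm.
function countWordsNormalized(text) {
  // Map en dash (U+2013) and em dash (U+2014) back to the plain hyphen-minus.
  const normalized = text.replace(/[\u2013\u2014]/g, '-');
  // Assumed definition: a word is a run of letters/digits, optionally joined
  // by hyphens or apostrophes (so "well-known" and "don't" count once each).
  const matches = normalized.match(/[\p{L}\p{N}]+(?:['\u2019-][\p{L}\p{N}]+)*/gu);
  return matches ? matches.length : 0;
}
```

Without the normalisation step, a smart-hyphenated word such as "well–known" would be counted as two words by this pattern.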
The following JS code gives a word count of 67. OpenOffice gives the same number.
str = "I need to count words in a string using PHP or Javascript (preferably PHP). The problem is that the counting needs to be the same as it works in Microsoft Word, because that is where the people assemble their original texts in so that is their reference frame. PHP has a word counting function (http://php.net/manual/en/function.str-word-count.php) but that is not 100% the same as far as I know.";
wordCount = str.split(/\s+/g).length;
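One caveat with splitting on whitespace: an empty string still yields a count of 1, and leading or trailing whitespace produces empty tokens that inflate the result. A guarded variant might look like the sketch below; countBySplit is a made-up name.

```javascript
// Hypothetical helper: whitespace-based count that ignores padding.
function countBySplit(str) {
  // Remove leading/trailing whitespace so split() creates no empty tokens.
  const trimmed = str.trim();
  // An empty (or whitespace-only) string contains no words.
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}
```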
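function countWords( $text )
{
    // Strip everything except letters, digits and whitespace (Unicode-aware).
    $text = preg_replace('![^ \pL\pN\s]+!u', '', $text);
    // Collapse runs of whitespace into single spaces and trim the ends.
    $text = trim( preg_replace('!\s+!u', ' ', $text) );
    // An empty string contains no words; explode() would otherwise report 1.
    return $text === '' ? 0 : count( explode(' ', $text) );
}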
You can use this code as a starting point (note that it counts characters rather than words):
<title>Untitled Document</title>
<script type="text/javascript" src="mootools.svn.js"></script>
<script type="text/javascript">
window.addEvent('domready', function()
{
$('myInput').addEvent('keyup', function()
{
max_chars = 20; // matches the maxlength of the input
current_value = $('myInput').value;
current_length = current_value.length;
remaining_chars = max_chars-current_length;
$('counter_number').innerHTML = remaining_chars;
if(remaining_chars<=5)
{
$('counter_number').setStyle('color', '#990000');
} else {
$('counter_number').setStyle('color', '#666666');
}
});
});
</script>
<style type="text/css">
body{
font-family:"Lucida Grande", "Lucida Sans Unicode", Verdana, Arial, Helvetica, sans-serif;
font-size:12px;
color:#000000;
}
a:link, a:visited{color:#0066CC;}
label{display:block;}
.counter{
font-family:Georgia, "Times New Roman", Times, serif;
font-size:16px;
font-weight:bold;
color:#666666
}
</style>
</head>
<body>
<label for="myInput">Write something here:</label>
<input type="text" id="myInput" maxlength="20" />
<span id="counter_number" class="counter">20</span>
Remaining chars
and download the mootools library...
I'm formatting a fraction with MathJax and am having a problem displaying it properly.
$disp = '<h1>$${{10 \over 9 }} of 99 $$</h1><br>';
echo $disp;
For some reason, I cannot get a space before and after the word 'of'. Any pointers are greatly appreciated. Thanks in advance.
This is better handled as
$disp = '<h1>$${10 \over 9}\text{ of }99$$</h1><br>';
as the accepted answer does not get the font or spacing for "of" correct.
It also seems that you may be using <H1> simply to get a larger size. If so, that is bad practice, as <H1> is a structural element indicating a top-level heading (not a layout element for a larger size). Unless this expression really is a top-level heading, you should not use <H1> for it. For example, people using assistive technology like screen readers often are given a list of the headings so they can quickly jump to the important starting points of your page, so if you make all your expressions be headings, that will complicate their already difficult task of navigating your page.
Layout should be controlled by CSS, so you could use a <div> with a class around your display math if you want to size it. Or you could use one of the TeX macros like \Large or \LARGE to make the math larger from within the expression. But don't use a heading indicator unless it really is the start of a new section of your page.
Here are some examples:
.dmath {
font-size: 200%;
}
<script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>
Bad:
<h1>$${10 \over 9} of 9$$</h1>
Better using CSS and <code>\text{}</code>:
<div class="dmath">
$${10 \over 9}\text{ of }9$$
</div>
Better using <code>\LARGE</code> and <code>\text{}</code>:
$$\LARGE {10 \over 9}\text{ of }9$$
<br><br><br><br>
Usually, a backslash followed by a space (\ ) keeps the space between letters.
$disp = '<h1>$${{10 \over 9 }}\ of\ 99 $$</h1><br>';
Reference - Spacing in math mode
If you're reading this you probably noticed that the CSS property text-transform:capitalize; does not convert THIS into This. Instead, the non-initial characters remain capitalized, so the transformation has no effect in this case. So how can we achieve this result?
I've seen this asked often and most answers are quick to promote using javascript to accomplish this. This will work, but it is unnecessary if you are writing or customizing a template/theme for a PHP CMS like Wordpress, Drupal, or Joomla.
To some degree you can achieve this with CSS using the pseudo-element ::first-letter, which should work all the way back to IE 5.5 :-( (older versions of IE only understand the single-colon form :first-letter).
NOTE: this is very dependent on your HTML structure and will not work in all cases, but it can be useful from time to time.
.progTitle {
text-transform: lowercase;
}
.progTitle::first-letter {
text-transform: uppercase;
}
<p class="progTitle">THIS IS SOME TEST TEXT IN UPPERCASE THAT WILL WORK. </p>
<p class="progTitle">this is some test text in lowercase that will work. </p>
<p class="progTitle"><i class="fa fa-bars"></i> THIS WILL NOT WORK </p>
The bad news is that there is no such thing as text-transform : title-case which would guarantee the result to be title cased. The good news is that there IS a way to do it, which doesn't require javascript (as is often suggested for this situation). If you are writing a theme for a CMS you can use strtolower() and ucwords() to convert the relevant text to title case.
BEFORE (THIS DOESN'T WORK):
<style>
.title-case{ text-transform:capitalize; }
</style>
<span class="title-case">ORIGINAL TEXT</span>
AFTER:
<?php echo ucwords( strtolower('ORIGINAL TEXT') ); ?>
If you are writing a theme, you'll probably be working with variables instead of text strings, but the function and the concept work the same way. Here's an example using the native Wordpress function get_the_title() to return the page title as a variable:
<?php
$title = get_the_title();
$title = strtolower($title);
$title = ucwords($title);
?>
<h1>
<?php echo $title; ?>
</h1>
Hope this helps someone. Happy coding.
The best way to do this is to have a class or element for the particular text and use this CSS rule:
.my_text {
text-transform: capitalize;
}
<p class="my_text">hello stackoverflow!!</p>
h1 {
text-transform: capitalize;
}
<h1>hello stackoverflow!!</h1>
Here is a working example in a Joomla 1.5.22 website running Virtuemart 1. The purpose is to take a string which is originally UPPERCASE, and convert it to Proper Case.
UPPERCASE:
<?php echo $list[$i]->name; ?>
Proper Case:
<?php echo ucwords( strtolower($list[$i]->name) ); ?>
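If the text is only available on the client side, the same lowercase-then-capitalise idea can be sketched in JavaScript. This is an illustration under assumptions, not part of the Joomla example: toProperCase is a made-up name, and like ucwords() it only capitalises characters that follow whitespace or start the string.

```javascript
// Hypothetical helper mirroring ucwords(strtolower($text)) from the PHP example.
function toProperCase(text) {
  // Lowercase everything first, then uppercase the first non-space
  // character at the start of the string and after each whitespace character.
  return text.toLowerCase().replace(/(^|\s)\S/g, (match) => match.toUpperCase());
}
```

For example, toProperCase('ORIGINAL TEXT') gives 'Original Text'.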
This can be achieved with one simple rule:
text-transform: capitalize;
You just write text-transform: none;
I am building a WordPress plugin which generates an HTML table and sends it to a Gravity Forms HTML block via shortcode.
My problem is that cell contents can contain:
23.24
1,234.665
123.4
etc...
Notice the differing number of decimal places.
Is there a non-hack & best practice way of aligning this column data by decimal point? In this case, Aligning right will not work.
Inserting 0s is not acceptable because this indicates a degree of accuracy which is not there.
As you can see, I have attempted to use align="char" char="." inside the td elements with no luck.
Any help with this would be much appreciated.
Many thanks.
Is there a way of using printf("%8.3f",d1) or similar without actually printing to the screen? e.g. structuring the variable d1 for later use but not actually printing it?
There is no direct way to do this. HTML 4.01 has align=char, but without any browser support. CSS 2.0 had a counterpart, using the text-align property, with equal lack of support, so it was dropped from CSS 2.1. CSS3 drafts have a nice system for such alignment, but it is marked as being in danger of being cut from the spec if there are no (correct) implementations.
As a workaround, you could right-pad the values with something invisible (blank) so that when the values are aligned to the right, the decimal markers get aligned. There are several ways to try to achieve this:
1) Use digit 0 but set a style on it, making it invisible, e.g.
123.4<span class=s>00</span>
with
.s { visibility: hidden; }
2) Use FIGURE SPACE U+2007, defined to have the same width as digits (when digits are of equal width), e.g.
123.4&#x2007;&#x2007;
For this to work, you need to set the font so that it contains U+2007. According to http://www.fileformat.info/info/unicode/char/2007/fontsupport.htm even Arial contains it, but I’m afraid this might not apply to old versions of Arial still in use.
3) Use a no-break space and set its width to the desired number of digits, using the ch unit (defined to have the width of digit 0), though this unit is relatively new and not supported by old browsers. Example:
123.4<span class=d2>&nbsp;</span>
with
.d2 { width: 2ch; display: inline-block; }
I would probably use the first method. As a matter of principle, it has the drawback that when CSS is off, the data contains zeroes, which imply wrong information about accuracy, whereas in other methods, switching CSS off “only” messes up the layout.
(It’s probably obvious that digits must be of equal advance width, so that you can align numeric data at all. This just means that the font used for the values must have that property. Most fonts will do in this respect, but e.g. Georgia, Constantia, and Corbel won’t.)
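The figure-space padding from method 2 could be generated rather than typed by hand. The sketch below assumes the values are already formatted strings that use "." as the decimal marker; padToDecimal is a made-up name, and using PUNCTUATION SPACE (U+2008) to stand in for a missing decimal point is my own assumption, not something the methods above prescribe.

```javascript
const FIGURE_SPACE = '\u2007';      // as wide as a digit in fonts with equal-width digits
const PUNCTUATION_SPACE = '\u2008'; // roughly as wide as a period

// Hypothetical helper: right-pad each value so that, when the column is
// right-aligned, the decimal points line up.
function padToDecimal(values) {
  // Number of digits after the decimal point (0 if there is no point).
  const fractionLength = (v) => {
    const i = v.indexOf('.');
    return i === -1 ? 0 : v.length - i - 1;
  };
  const longest = Math.max(0, ...values.map(fractionLength));
  return values.map((v) => {
    // A value without any decimal point also needs room for the point itself.
    const pointPad = longest > 0 && !v.includes('.') ? PUNCTUATION_SPACE : '';
    return v + pointPad + FIGURE_SPACE.repeat(longest - fractionLength(v));
  });
}
```

As with the hand-written version, this only lines up if the font has digits of equal advance width and contains the space characters used.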
I wrote a jQuery plugin that solves this. It's found here: https://github.com/ndp/align-column
Using your raw HTML table, it will align a column by decimal points:
$('table').alignColumn(3);
It does this by adding another column, but does its best to not corrupt the other spacing. There's also a reference to a different solution on the Github page.
Would it be acceptable to put the value into two columns?
Use sprintf() to convert the value into a string, and then put the bits up to the decimal point in the left column (but right aligned), and the decimal places in the second column.
See http://jsfiddle.net/alnitak/p4BhB/, but ignore the JS bit...
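The split itself is simple; a minimal JavaScript sketch, assuming the value has already been formatted as a string with "." as the decimal marker (splitAtDecimal is a made-up name):

```javascript
// Hypothetical helper: split a formatted number into the part that goes in
// the left (right-aligned) cell and the part that goes in the right
// (left-aligned) cell.
function splitAtDecimal(value) {
  const i = value.indexOf('.');
  // Keep the decimal point with the fractional part; values without a
  // point get an empty right-hand cell.
  return i === -1 ? [value, ''] : [value.slice(0, i), value.slice(i)];
}
```

For '1,234.665' this gives ['1,234', '.665'], so the two cells meet exactly at the decimal point.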
The thing is, you've gotta ensure that they all have the same number of digits after the decimal.
Once you do that, use text-align. All it will take is a: style='text-align: right'
Better still, you could use a css class instead of inline styles. Your markup would look like this:
<tr><td>Item 1</td><td>15</td><td class='price'>£123.25</td></tr>
Then in your stylesheet:
td.price{
text-align: right;
}
With PHP, you can format a number as a string with number_format. You don't have to echo it or print it, just wrap your variable in that function. For example:
$table .= "<td class='price'>£" . $price . "</td></tr>";
becomes:
$table .= "<td class='price'>£" . number_format($price,3) . "</td></tr>";
It might be overkill but I needed the same thing and just solved with a length of the output and adding whitespace based on that length.
I.e.:
if (strlen($meal_retail) == 5) {
echo " ";
}
elseif (strlen($meal_retail) == 6) {
echo " ";
}
This lined up my decimals correctly with a bit of extra work, and I'm sure an array could clean the above code up even more.
Additionally, I've been normalizing my numbers with:
echo money_format('%i',$meal_retail) (formats it as a money amount with two decimals; note that money_format() was removed in PHP 8)
Just wanted to provide my solution as I was looking at this page before coming up with my own resolution.
This is my solution, hope it helps!
<style type="text/css">
td{
font-family: arial;
}
.f{
width: 10px;
color: white;
-moz-user-select: none;
}
</style>
<table>
<tr><td><span class="f">00</span>1.1<span class="f">00</span></td></tr>
<tr><td><span class="f">0</span>12.34<span class="f">0</span></td></tr>
<tr><td>123.456</td></tr>
</table>
with this, you can't see the zeros and can't select them!
I have used JavaScript for this; I hope this will help.
</tr>
</table>
</body>
var highetlen = 0; // length of the longest integer part seen so far
for(var i=0; i<numarray.length; i++){
var n = numarray[i].toString();
var res= n.split(".");
n = res[0];
if(highetlen < n.length){
highetlen = n.length;
}
}
for(var j=0; j<numarray.length; j++){
var s = numarray[j].toString();
var res= s.split(".");
s = res[0];
if(highetlen > s.length){
var finallevel = highetlen - s.length;
var finalhigh = "";
for(var k=0;k<finallevel;k++){
finalhigh = finalhigh+ ' ';
}
numarray[j] = finalhigh + numarray[j];
}
var nadiss = document.getElementById("nadis");
nadiss.innerHTML += "<tr><td>" + numarray[j] + "</td></tr>";
}
Possible Duplicate:
text-overflow:ellipsis in Firefox 4?
I have the same issue mentioned in Truncating long strings with CSS: feasible yet?. It's been nearly two years since that post, and Firefox still ignores the text-overflow: ellipsis; property.
My current solution is to truncate long strings in PHP like so:
if(strlen($some_string) > 30)
$some_string = substr($some_string,0,30)."...";
That more or less works, but it doesn't look as nice or as accurate as text-overflow: ellipsis; in browsers that support it. The actual width of thirty characters varies since I'm not using a monospace font. The XML fix and jQuery plugins posted in the other thread appear to no longer work in Firefox either.
Is there currently a way to do this in CSS that is browser independent? If not, is there a way to measure the width of a string given a font and font size in PHP so that I might more accurately place my ellipsis?
This answer might be useful for getting your output truncated to the nearest word; then simply append an &hellip; (…) HTML entity onto the end to get your final output.
As you've noticed, there isn't sufficiently wide browser support for the CSS solution yet, and you've still got to worry about old browsers too.
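Truncating to the nearest word and appending an ellipsis could look roughly like the sketch below. It is an assumption-laden illustration (truncateAtWord is a made-up name, and it limits by character count, so it does not solve the variable-width-font problem from the question):

```javascript
// Hypothetical helper: cut text to at most maxLength characters without
// splitting a word, then append an ellipsis character (U+2026).
function truncateAtWord(text, maxLength) {
  if (text.length <= maxLength) return text; // nothing to do
  const cut = text.slice(0, maxLength);
  // If the cut already falls exactly on a word boundary, keep it as is.
  const endsOnBoundary = /\s/.test(text.charAt(maxLength));
  const lastSpace = cut.lastIndexOf(' ');
  // Otherwise back up to the last space; if there is none, keep the hard cut.
  const head = endsOnBoundary || lastSpace <= 0 ? cut : cut.slice(0, lastSpace);
  return head + '\u2026';
}
```

When the result is written into HTML, the ellipsis character can be emitted as the &hellip; entity instead.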
It is a shame that not all browsers handle the same CSS features. However, you could always do something like this using JavaScript (with help from jQuery).
Here's an example of how such a thing might look: http://jsfiddle.net/VFucm/
The basic idea is to turn your string into an array of words, like so:
var words = full.split(/\s+/g);
Loop through them and take the first N (in this case I chose 24) and push them into another array:
for (var i = 0; i < 24 && i < words.length; i++) {
short.push(words[i]);
}
Throw them back into the HTML element they came from:
$('.snip').html(short.join(" ") + ' <span class="expand">...</span>');
... here I added a "link" to expand the shortened text. It's made to look and act like a link using CSS. I also provided a function to replace the shortened text with the full text again:
$('.expand').click(function() {
$('.snip').html(full);
});
If a user types in a long line without any spaces or white space, it will break formatting by going wider than the current element. Something like:
HAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHA.............................................................................................................................................
I've tried just using wordwrap() in PHP, but the problem with that is if there is a link or some other valid HTML, it breaks.
There seem to be a few options in CSS, but none of them work in all browsers. See word-wrap in IE.
How do you solve this problem?
in CSS3:
word-wrap:break-word
I was trying to solve the same problem and I found the solution here:
http://perishablepress.com/press/2010/06/01/wrapping-content/
Solution: adding to the container the following CSS properties
div {
white-space: pre; /* CSS 2.0 */
white-space: pre-wrap; /* CSS 2.1 */
white-space: pre-line; /* CSS 3.0 */
white-space: -pre-wrap; /* Opera 4-6 */
white-space: -o-pre-wrap; /* Opera 7 */
white-space: -moz-pre-wrap; /* Mozilla */
white-space: -hp-pre-wrap; /* HP Printers */
word-wrap: break-word; /* IE 5+ */
}
The idea is using them all so you get better cross-browser compatibility
Hope this helps
I like to use the overflow: auto CSS property/value pairing. This will render the parent object the way you'd expect it to appear. If the text within the parent is too wide, scrollbars appear within the object itself. This will keep the structure the way you want it to look and provide the viewer with the ability to scroll over to see more.
Edit: the nice thing about overflow: auto compared to overflow: scroll is that with auto, the scrollbars will only appear when overflowing content exists. With scroll, the scrollbars are always visible.
I haven't personally used it, but Hyphenator looks promising.
Also see related (possibly duplicate) questions:
word wrap in css / js
Who has solved the long-word-breaks-my-div problem? (hint: not stackoverflow)
I'm surprised that nobody has mentioned one of my favorite solutions to this problem, the <wbr> (optional line-break) tag. It's fairly well-supported in browsers and essentially tells the browser that it can insert a line-break if it's necessary. There's also the related zero-width space character, with the same meaning.
For the use case mentioned, displaying user comments on a web page, I would assume that there is already some output formatting to prevent injection attacks, etc. So it's simple to add these <wbr> tags every N characters in words that are too long, or in links.
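A rough sketch of the "every N characters" idea, working on plain text before it is escaped for output. It inserts the zero-width space (U+200B) mentioned above rather than a <wbr> tag, so that it stays safe to escape afterwards; insertBreakOpportunities is a made-up name, and it makes no attempt to skip over HTML tags or links.

```javascript
// Hypothetical helper: add a zero-width space after every maxRun characters
// inside any run of non-whitespace that is longer than maxRun.
function insertBreakOpportunities(text, maxRun) {
  // Runs of non-whitespace longer than maxRun are the "too long" words.
  const longRun = new RegExp('\\S{' + (maxRun + 1) + ',}', 'g');
  // Inside such a run, match maxRun characters that are followed by more.
  const chunk = new RegExp('\\S{' + maxRun + '}(?=\\S)', 'g');
  return text.replace(longRun, (word) => word.replace(chunk, '$&\u200B'));
}
```

Words that already fit within maxRun characters are left untouched, so normal prose is not affected.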
This is especially useful when you need control over the format of the output, which CSS doesn't always let you do.
I would put the post in a div that would have a fixed width setting overflow to scroll (or to hide completely depending on the content).
so like:
#post{
width: 500px;
overflow: scroll;
}
But that's just me.
EDIT: As cLFlaVA points out... it is better to use auto than scroll. I do agree with him.
There is no "perfect" HTML/CSS solution.
The solutions either hide the overflow (ie scrolling or just hidden) or expand to fit. There is no magic.
Q: How can you fit a 100cm wide object into a space only 99cm wide?
A: You can't.
You can read about break-word
EDIT
Please check out this solution
How to apply a line wrap/continuation style and code formatting with css
or
How to prevent long words from breaking my div?
I dodge the problem by not having my right sidebar fixed like that :P
Here's what I do in ASP.NET:
Split the text field on spaces to get all the words
Iterate the words looking for words that are longer than a certain amount
Insert every x characters (e.g. every 25 characters.)
I looked at other CSS based ways of doing this, but didn't find anything that worked cross-browser.
based on Jon's suggestion the code that I created:
public static string WrapWords(string text, int maxLength)
{
string[] words = text.Split(' ');
for (int i = 0; i < words.Length; i++)
{
if (words[i].Length > maxLength) //long word
{
words[i] = words[i].Insert(maxLength, " ");
//still long ?
words[i]=WrapWords(words[i], maxLength);
}
}
text = string.Join(" ", words);
return (text);
}
I didn't want to add libraries to my pages just for word breaking.
Then I wrote a simple function which I provide below, hope it helps people.
(It is breaking by 15 characters, and applying "&shy;" between, but you can change it easily in the code)
//the function:
BreakLargeWords = function (str)
{
BreakLargeWord = function (word)
{
var brokenWords = [];
var wpatt = /\w{15}|\w/igm;
while (wmatch = wpatt.exec(word))
{
var brokenWord = wmatch[0];
brokenWords.push(brokenWord);
if (brokenWord.length >= 15) brokenWords.push("&shy;");
}
return brokenWords.join("");
}
var match;
var word = "";
var words = [];
var patt = /\W/igm;
var prevPos = 0;
while (match = patt.exec(str))
{
var pos = match.index;
var len = pos - prevPos;
word = str.substr(prevPos, len);
if (word.length > 15) word = BreakLargeWord(word);
words.push(word);
words.push(match[0]);
prevPos = pos + 1;
}
word = str.substr(prevPos);
if (word.length > 15) word = BreakLargeWord(word);
words.push(word);
var text = words.join("");
return text;
}
//how to use
var bigText = "Why is this text this big? Lets do a wrap <b>here</b>! aaaaaaaaaaaaa-bbbbb-eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee";
var goodText = BreakLargeWords(bigText);
Add the zero-width space (&#8203;) to the string and it will wrap.
Here is a JavaScript example:
let longWordWithOutSpace = 'pneumonoultramicroscopicsilicovolcanoconiosis';
// add &#8203; between every character to make it wrap
longWordWithOutSpace.split('').join('&#8203;');
I did not want to make my code more complex with JavaScript.
My development environment was Blazor and the UI was for smartphones.
The code had a list of file names, and some of them were very long names without spaces or any other character that could help with wrapping.
For me this works:
https://developer.mozilla.org/en-US/docs/Web/CSS/overflow-wrap
overflow-wrap: anywhere;
"overflow-wrap: normal;" does not work because it needs spaces in the string to wrap.
"overflow-wrap: break-word;" did not work for me, maybe because the text was not a word or for some other reason. I am not sure!
I have posted a solution which uses JavaScript and a simple Regular Expression to break long word so that it can be wrapped without breaking your website layout.
Wrap long lines using CSS and JavaScript
I know that this is a really old problem, but since I had the same problem I searched for an easy solution.
As mentioned in the first post I decided to use the php-function wordwrap.
See the following code example for information ;-)
<?php
$v = "reallyreallyreallylonglinkreallyreallyreallylonglinkreallyreallyreallylonglinkreallyreallyreallylonglinkreallyreallyreallylonglinkreallyreallyreallylonglink";
$v2 = wordwrap($v, 12, "<br/>", true);
?>
<html>
<head>
<title>test</title>
</head>
<body>
<table width="300" border="1">
<tr height="30">
<td colspan="3" align="center" valign="top">test</td>
</tr>
<tr>
<td width="100"><?php echo $v2; ?></td>
<td width="100"> </td>
<td width="100"> </td>
</tr>
</table>
</body>
</html>
Hope this helps.