Is there a built-in PHP function, or a simple (efficient!) way to combine built-in functions, to give a string sentence case ("Sentence one. Sentence two.")?
PHP has similar built-in functions, but none that I can find fit my purposes:
ucfirst(strtolower("SENTENCE ONE. AND HERE'S TWO.")) returns "Sentence one. and here's two."; ucwords(strtolower("SENTENCE ONE. AND HERE'S TWO.")) returns "Sentence One. And Here's Two."
function sentence_case($str) {
$cap = true;
$ret='';
for($x = 0; $x < strlen($str); $x++){
$letter = substr($str, $x, 1);
if($letter == "." || $letter == "!" || $letter == "?"){
$cap = true;
}elseif($letter != " " && $cap == true){
$letter = strtoupper($letter);
$cap = false;
}
$ret .= $letter;
}
return $ret;
}
This will preserve existing proper noun capitals, acronyms and abbreviations.
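For comparison, the same idea can probably be expressed by combining built-ins. The following is only a minimal sketch: the name sentence_case_regex is made up for illustration, it assumes sentences end in ".", "!" or "?", and it only handles ASCII letters. Like the loop above, it leaves existing capitals alone.

```php
<?php
// Hypothetical helper (not a built-in): upper-case the first letter at the
// start of the string and after each ".", "!" or "?". ASCII letters only.
function sentence_case_regex($str)
{
    return preg_replace_callback(
        '/(^|[.!?])(\s*)([a-z])/',
        function ($m) {
            // $m[1] = terminator (empty at string start), $m[2] = whitespace,
            // $m[3] = the lower-case letter that starts the sentence.
            return $m[1] . $m[2] . strtoupper($m[3]);
        },
        $str
    );
}

echo sentence_case_regex("sentence one. and here's two! is NASA third?");
// Sentence one. And here's two! Is NASA third?
```

Lower-case the input with strtolower first if you do not want to keep existing capitals.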
You could split the string on ".", then ucfirst each sentence. Not the most elegant solution, but it works.
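$sentences = explode(".", $paragraph);
foreach ($sentences as &$sentence) {
    // ucfirst alone would hit the space that follows each "."; skip it first.
    $trimmed = ltrim($sentence);
    $lead = substr($sentence, 0, strlen($sentence) - strlen($trimmed));
    $sentence = $lead . ucfirst(strtolower($trimmed));
}
unset($sentence);
// implode restores the periods without appending an extra one at the end.
$text = implode(".", $sentences);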
Try this:
function sentenceCase($s){
$str = strtolower($s);
$cap = true;
$ret = ''; // initialise the result to avoid an undefined-variable notice
for($x = 0; $x < strlen($str); $x++){
$letter = substr($str, $x, 1);
if($letter == "." || $letter == "!" || $letter == "?"){
$cap = true;
}elseif($letter != " " && $cap == true){
$letter = strtoupper($letter);
$cap = false;
}
$ret .= $letter;
}
return $ret;
}
Taken from php.net. Works with more than just periods as sentence endings.
I came up with this solution using preg_split. It splits sentences on "." boundaries that are followed by one or more whitespace characters.
It is still pretty efficient, but arguably less so than its explode counterpart.
<?php
$str = "SENTENCE ONE. AND HERE'S TWO.";
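// -1 means "no limit"; passing null is deprecated in recent PHP versions.
$sentences = preg_split('/(\.\s+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
// A closure with array_map replaces create_function (removed in PHP 8) and call-time pass-by-reference (removed in PHP 5.4).
$sentences = array_map(function ($val) { return ucfirst(strtolower($val)); }, $sentences);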
$str = implode('', $sentences);
echo $str; // Sentence one. And here's two.
This version also treats line breaks, not only punctuation, as the start of a new sentence.
function sentenceCase($text){
$cap = true; $newText = '';
for($x = 0; $x < strlen($text); $x++){
$letter = substr($text, $x, 1);
if($letter == '.' || $letter == '!' || $letter == '?' || $letter == "\n"){
$cap = true;
} elseif($letter != ' ' && $cap == true){
$letter = strtoupper($letter);
$cap = false;
}
$newText .= $letter;
}
return $newText;
}
Here is my function that makes the first character of the first word of a sentence uppercase:
function sentenceCase($str)
{
$cap = true;
$ret = '';
for ($x = 0; $x < strlen($str); $x++) {
$letter = substr($str, $x, 1);
if ($letter == "." || $letter == "!" || $letter == "?") {
$cap = true;
} elseif ($letter != " " && $cap == true) {
$letter = strtoupper($letter);
$cap = false;
}
$ret .= $letter;
}
return $ret;
}
It converts "sample sentence" into "Sample sentence". The problem is, it doesn't capitalize UTF-8 characters. See this example.
What am I doing wrong?
The most straightforward way to make your code UTF-8 aware is to use mbstring functions instead of the plain dumb ones in the three cases where the latter appear:
function sentenceCase($str)
{
$cap = true;
$ret = '';
for ($x = 0; $x < mb_strlen($str); $x++) { // mb_strlen instead
$letter = mb_substr($str, $x, 1); // mb_substr instead
if ($letter == "." || $letter == "!" || $letter == "?") {
$cap = true;
} elseif ($letter != " " && $cap == true) {
$letter = mb_strtoupper($letter); // mb_strtoupper instead
$cap = false;
}
$ret .= $letter;
}
return $ret;
}
You can then configure mbstring to work with UTF-8 strings and you are ready to go:
mb_internal_encoding('UTF-8');
echo sentenceCase ("üias skdfnsknka");
Bonus solution
Specifically for UTF-8 you can also use a regular expression, which will result in less code:
$str = "üias skdfnsknka";
echo preg_replace_callback(
'/((?:^|[!.?])\s*)(\p{Ll})/u',
function($match) { return $match[1].mb_strtoupper($match[2], 'UTF-8'); },
$str);
I use PHP's preg_match to check whether a string starts and ends with specific given words,
for example:
$first_word = 't'; // I want to force 'this'
$last_word = 'ne'; // I want to force 'done'
$str = 'this function can be done';
if(preg_match('/^' . $first_word . '(.*)' . $last_word .'$/' , $str))
{
echo 'true';
}
The problem is that I want to match whole words at the start and end, not just the first or last characters.
Use \b as a word boundary in the search:
$first_word = 't'; // I want to force 'this'
$last_word = 'ne'; // I want to force 'done'
$str = 'this function can be done';
if(preg_match('/^' . $first_word . '\b(.*)\b' . $last_word .'$/' , $str))
{
echo 'true';
}
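To illustrate the effect of \b, here is a small sketch. The helper name starts_and_ends_with_words is invented here, and preg_quote is added on the assumption that the words might contain regex metacharacters.

```php
<?php
// Hypothetical helper: true only when $str begins with the whole word
// $first and ends with the whole word $last.
function starts_and_ends_with_words($first, $last, $str)
{
    $pattern = '/^' . preg_quote($first, '/') . '\b.*\b' . preg_quote($last, '/') . '$/s';
    return preg_match($pattern, $str) === 1;
}

var_dump(starts_and_ends_with_words('this', 'done', 'this function can be done')); // bool(true)
var_dump(starts_and_ends_with_words('t', 'ne', 'this function can be done'));      // bool(false)
```

With 't' and 'ne' the match fails because no word boundary follows the "t" in "this", which is the behaviour the question asks for.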
I would go about this in a slightly different way:
$firstword = 't';
$lastword = 'ne';
$string = 'this function can be done';
$words = explode(' ', $string);
if (preg_match("/^{$firstword}/i", reset($words)) && preg_match("/{$lastword}$/i", end($words)))
{
echo 'true';
}
==========================================
Here's another way to achieve the same thing
$firstword = 'this';
$lastword = 'done';
$string = 'this can be done';
$words = explode(' ', $string);
if (reset($words) === $firstword && end($words) === $lastword)
{
echo 'true';
}
This will always echo true, because we know that firstword and lastword are correct. Try changing them to something else and it will no longer echo true.
I wrote a function that finds the words starting with a given string; it does not use any regex.
You can write one for the end of the word in the same way. I did not add it here because it is long...
<?php
function StartSearch($start, $sentence)
{
$data = explode(" ", $sentence);
$ret = array();
foreach ($data as $val)
{
$flag = false; // reset for every word
for($i = 0, $j = 0; $i < strlen($val) && $j < strlen($start); $i++) // both bounds must hold
{
if ($i == 0 && $val[$i] != $start[$j]) // square brackets: the {} offset syntax was removed in PHP 8
break;
if ($flag && $val[$i] != $start[$j])
break;
if ($val[$i] == $start[$j])
{
$flag = true;
$j++;
}
}
if ($j == strlen($start))
{
$ret[] = $val;
}
}
return $ret;
}
$str = 'this function can be done'; // sample input; $str was undefined here
print_r(StartSearch("th", $str));
?>
I have a simple task to do with PHP, but since I'm not familiar with regular expressions, I have no clue how to go about it.
What I want is actually very simple.
Let's say I have these variables:
$Email = 'john#example.com'; // output : ****#example.com
$Email2 = 'janedoe#example.com'; // output : *******#example.com
$Email3 = 'johndoe2012#example.com'; // output : ***********#example.com
$Phone = '0821212121'; // output : 082121**** << REPLACE LAST FOUR DIGIT WITH *
How can I do this with PHP? Thanks.
You'll need a specific function for each. For mails:
function hide_mail($email) {
$mail_segments = explode("#", $email);
$mail_segments[0] = str_repeat("*", strlen($mail_segments[0]));
return implode("#", $mail_segments);
}
echo hide_mail("example#gmail.com");
For phone numbers
function hide_phone($phone) {
return substr($phone, 0, -4) . "****";
}
echo hide_phone("1234567890");
And see? Not a single regular expression used. These functions don't check for validity though. You'll need to determine what kind of string is what, and call the appropriate function.
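As a rough illustration of "determine what kind of string is what", a dispatcher might look like the sketch below. The name hide_value and its detection rules are assumptions made for this example, not validation; the two helpers are repeated so the snippet runs on its own.

```php
<?php
// Masking helpers as described above ("#" is the separator used in this thread).
function hide_mail($email)
{
    $segments = explode("#", $email);
    $segments[0] = str_repeat("*", strlen($segments[0]));
    return implode("#", $segments);
}

function hide_phone($phone)
{
    return substr($phone, 0, -4) . "****";
}

// Hypothetical dispatcher: picks a helper based on simple, assumed rules.
function hide_value($value)
{
    if (strpos($value, "#") !== false) {
        return hide_mail($value);   // contains the separator: treat as e-mail
    }
    if (ctype_digit($value) && strlen($value) >= 4) {
        return hide_phone($value);  // digits only: treat as phone number
    }
    return $value;                  // unknown kind: leave unchanged
}

echo hide_value('john#example.com'), "\n"; // ****#example.com
echo hide_value('0821212121'), "\n";       // 082121****
```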
For e-mails, this function preserves first letter:
function hideEmail($email)
{
$parts = explode('#', $email);
return substr($parts[0], 0, min(1, strlen($parts[0])-1)) . str_repeat('*', max(1, strlen($parts[0]) - 1)) . '#' . $parts[1];
}
hideEmail('hello#domain.com'); // h****#domain.com
hideEmail('hi#domain.com'); // h*#domain.com
hideEmail('h#domain.com'); // *#domain.com
I tried for a single-regex solution but don't think it's possible due to the variable-length asterisks. Perhaps something like this:
function anonymiseString($str)
{
if(is_numeric($str))
{
$str = preg_replace('/^(\d*?)\d{4}$/', '$1****', $str); // the subject argument was missing
}
elseif(($until = strpos($str, '#')) !== false)
{
$str = str_repeat('*', $until) . substr($str, $until); // keep the '#' and the domain
}
return $str;
}
I wrote a function to do this and it works fine for me. I hope it helps.
function ofuscaEmail($email, $domain_ = false){
$seg = explode('#', $email);
$user = '';
$domain = '';
if (strlen($seg[0]) > 3) {
$sub_seg = str_split($seg[0]);
$user .= $sub_seg[0].$sub_seg[1];
for ($i=2; $i < count($sub_seg)-1; $i++) {
if ($sub_seg[$i] == '.') {
$user .= '.';
}else if($sub_seg[$i] == '_'){
$user .= '_';
}else{
$user .= '*';
}
}
$user .= $sub_seg[count($sub_seg)-1];
}else{
$sub_seg = str_split($seg[0]);
$user .= $sub_seg[0];
for ($i=1; $i < count($sub_seg); $i++) {
$user .= ($sub_seg[$i] == '.') ? '.' : '*';
}
}
$sub_seg2 = str_split($seg[1]);
$domain .= $sub_seg2[0];
for ($i=1; $i < count($sub_seg2)-2; $i++) {
$domain .= ($sub_seg2[$i] == '.') ? '.' : '*';
}
$domain .= $sub_seg2[count($sub_seg2)-2].$sub_seg2[count($sub_seg2)-1];
return ($domain_ == false) ? $user.'#'.$seg[1] : $user.'#'.$domain ;
}
Output: a******#gmail.com
$email = str_replace(substr($old_email, 1, strlen(explode("#", $old_email)[0])-1), "**********", $old_email);
This is a quick fix for the question above.
It ensures that only the first character of the email address and the domain show up.
You can increase or reduce the number of asterisks depending
Is there a way to do this without writing my own function?
For example:
$text = 'Test <span><a>something</a> something else</span>.';
$text = cutText($text, 2, null, 20, true);
//result: Test <span><a>something</a></span>
I need to make this function indestructible
My problem is similar to this thread, but I need a better solution. I would like to keep nested tags untouched.
So far my algorithm is:
function cutText($content, $max_words, $max_chars, $max_word_len, $html = false) {
$len = strlen($content);
$res = '';
$word_count = 0;
$word_started = false;
$current_word = '';
$current_word_len = 0;
if ($max_chars == null) {
$max_chars = $len;
}
$inHtml = false;
$openedTags = array();
for ($i = 0; $i<$max_chars;$i++) {
if ($content[$i] == '<' && $html) {
$inHtml = true;
}
if ($inHtml) {
$max_chars++;
}
if ($html && !$inHtml) {
if ($content[$i] != ' ' && !$word_started) {
$word_started = true;
$word_count++;
}
$current_word .= $content[$i];
$current_word_len++;
if ($current_word_len == $max_word_len) {
$current_word .= '- ';
}
if (($content[$i] == ' ') && $word_started) {
$word_started = false;
$res .= $current_word;
$current_word = '';
$current_word_len = 0;
if ($word_count == $max_words) {
return $res;
}
}
}
if ($content[$i] == '<' && $html) {
$inHtml = true;
}
}
return $res;
}
But of course it won't work. I thought about remembering opened tags and closing them if they were not closed but maybe there is a better way?
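The "remember opened tags and close them" idea could be sketched roughly as follows. This is only an illustration under assumptions: the name close_open_tags is hypothetical, the markup is assumed to be simple and well nested, and the list of void elements is deliberately short.

```php
<?php
// Hypothetical helper: append closing tags for elements left open in $html.
// Assumes simple, well-nested markup; a few void elements are skipped.
function close_open_tags($html)
{
    $void = array('br', 'hr', 'img', 'input', 'meta', 'link');
    $stack = array();
    preg_match_all('#<(/?)([a-z][a-z0-9]*)\b[^>]*?(/?)>#i', $html, $tags, PREG_SET_ORDER);
    foreach ($tags as $t) {
        $name = strtolower($t[2]);
        $selfClosing = isset($t[3]) && $t[3] === '/';
        if ($selfClosing || in_array($name, $void)) {
            continue;                   // nothing to track for these tags
        }
        if ($t[1] === '/') {
            if (end($stack) === $name) {
                array_pop($stack);      // matching close: forget the open tag
            }
        } else {
            $stack[] = $name;           // remember the opened tag
        }
    }
    while ($stack) {
        $html .= '</' . array_pop($stack) . '>'; // close innermost first
    }
    return $html;
}

echo close_open_tags('Test <span><a>something');
// Test <span><a>something</a></span>
```

A regex scan like this will be confused by ">" inside attribute values or by comments, so for untrusted input a real HTML parser is likely the safer choice.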
This works perfectly for me:
function trimContent ($str, $trimAtIndex) {
$beginTags = array();
$endTags = array();
for($i = 0; $i < strlen($str); $i++) {
if( $str[$i] == '<' )
$beginTags[] = $i;
else if($str[$i] == '>')
$endTags[] = $i;
}
foreach($beginTags as $k=>$index) {
// Trying to trim in between tags. Trim after the last tag
if( ( $trimAtIndex >= $index ) && ($trimAtIndex <= $endTags[$k]) ) {
$trimAtIndex = $endTags[$k] + 1; // +1 so that substr() keeps the closing '>'
}
}
return substr($str, 0, $trimAtIndex);
}
Try something like this
function cutText($inputText, $start, $length) {
$temp = $inputText;
$res = array();
while (strpos($temp, '>')) {
$ts = strpos($temp, '<');
$te = strpos($temp, '>');
if ($ts > 0) $res[] = substr($temp, 0, $ts);
$res[] = substr($temp, $ts, $te - $ts + 1);
$temp = substr($temp, $te + 1, strlen($temp) - $te);
}
if ($temp != '') $res[] = $temp;
$pointer = 0;
$end = $start + $length - 1;
foreach ($res as &$part) {
if (substr($part, 0, 1) != '<') {
$l = strlen($part);
$p1 = $pointer;
$p2 = $pointer + $l - 1;
$partx = "";
if ($start <= $p1 && $end >= $p2) $partx = "";
else {
if ($start > $p1 && $start <= $p2) $partx .= substr($part, 0, $start-$pointer);
if ($end >= $p1 && $end < $p2) $partx .= substr($part, $end-$pointer+1, $l-$end+$pointer);
if ($partx == "") $partx = $part;
}
$part = $partx;
$pointer += $l;
}
}
return join('', $res);
}
Parameters:
$inputText - input text
$start - position of first character
$length - how many characters we want to remove
Example #1 - Removing first 3 characters
$text = 'Test <span><a>something</a> something else</span>.';
$text = cutText($text, 0, 3);
var_dump($text);
Output (removed "Tes")
string(47) "t <span><a>something</a> something else</span>."
Removing first 10 characters
$text = cutText($text, 0, 10);
Output (removed "Test somet")
string(40) "<span><a>hing</a> something else</span>."
Example 2 - Removing inner characters - "es" from "Test "
$text = cutText($text, 1, 2);
Output
string(48) "Tt <span><a>something</a> something else</span>."
Removing "thing something el"
$text = cutText($text, 9, 18);
Output
string(32) "Test <span><a>some</a>se</span>."
Hope this helps.
Well, maybe this is not the best solution but it's everything I can do at the moment.
Ok I solved this thing.
I divided this in 2 parts.
First cutting text without destroying html:
function cutHtml($content, $max_words, $max_chars, $max_word_len) {
$len = strlen($content);
$res = '';
$word_count = 0;
$word_started = false;
$current_word = '';
$current_word_len = 0;
$letter_count = 0; // used below; was never initialised
if ($max_chars == null) {
$max_chars = $len;
}
$inHtml = false;
$openedTags = array();
$i = 0;
while ($i < $max_chars) {
//skip any html tags
if ($content[$i] == '<') {
$inHtml = true;
while (true) {
$res .= $content[$i];
$i++;
while($content[$i] == ' ') { $res .= $content[$i]; $i++; }
//skip any values
if ($content[$i] == "'") {
$res .= $content[$i];
$i++;
while(!($content[$i] == "'" && $content[$i-1] != "\\")) {
$res .= $content[$i];
$i++;
}
}
//skip any values
if ($content[$i] == '"') {
$res .= $content[$i];
$i++;
while(!($content[$i] == '"' && $content[$i-1] != "\\")) {
$res .= $content[$i];
$i++;
}
}
if ($content[$i] == '>') { $res .= $content[$i]; $i++; break;}
}
$inHtml = false;
}
if (!$inHtml) {
while($content[$i] == ' ') { $res .= $content[$i]; $letter_count++; $i++; } //skip spaces
$word_started = false;
$current_word = '';
$current_word_len = 0;
while (!in_array($content[$i], array(' ', '<', '.', ','))) {
if (!$word_started) {
$word_started = true;
$word_count++;
}
$current_word .= $content[$i];
$current_word_len++;
if ($current_word_len == $max_word_len) {
$current_word .= '-';
$current_word_len = 0;
}
$i++;
}
if ($letter_count > $max_chars) {
return $res;
}
if ($word_count < $max_words) {
$res .= $current_word;
$letter_count += strlen($current_word);
}
if ($word_count == $max_words) {
$res .= $current_word;
$letter_count += strlen($current_word);
return $res;
}
}
}
return $res;
}
And next thing is closing unclosed tags:
function cleanTags(&$html) {
$count = strlen($html);
$i = -1;
$openedTags = array();
while(true) {
$i++;
if ($i >= $count) break;
if ($html[$i] == '<') {
$tag = '';
$closeTag = '';
$reading = false;
//reading whole tag
while($html[$i] != '>') {
$i++;
while($html[$i] == ' ') $i++; //skip any spaces (need to be idiot proof)
if (!$reading && $html[$i] == '/') { //closing tag
$i++;
while($html[$i] == ' ') $i++; //skip any spaces
$closeTag = '';
while($html[$i] != ' ' && $html[$i] != '>') { //start reading first actuall string
$reading = true;
$html[$i] = strtolower($html[$i]); //tags to lowercase
$closeTag .= $html[$i];
$i++;
}
$c = count($openedTags);
if ($c > 0 && $openedTags[$c-1] == $closeTag) array_pop($openedTags);
}
if (!$reading) //read only tag
while($html[$i] != ' ' && $html[$i] != '>') { //start reading first actuall string
$reading = true;
$html[$i] = strtolower($html[$i]); //tags to lowercase
$tag .= $html[$i];
$i++;
}
//skip any values
if ($html[$i] == "'") {
$i++;
while(!($html[$i] == "'" && $html[$i-1] != "\\")) {
$i++;
}
}
//skip any values
if ($html[$i] == '"') {
$i++;
while(!($html[$i] == '"' && $html[$i-1] != "\\")) {
$i++;
}
}
if ($reading && $html[$i] == '/') { //self closed tag
$tag = '';
break;
}
}
if (!empty($tag)) $openedTags[] = $tag;
}
}
while (count($openedTags) > 0) {
$tag = array_pop($openedTags);
$html .= "</$tag>";
}
}
It's not idiot proof, but TinyMCE will clean up the rest, so further cleaning is not necessary.
It may be a little long, but I don't think it will eat a lot of resources, and it should be faster than regex.
I have a set of articles, and I want to style the first letter of each article (with CSS).
The articles usually start with a paragraph, like:
<p> bla bla </p>
So how could I wrap the first letter of this text in a <span> tag?
Unless you need to do something extremely fancy, there's also the :first-letter CSS selector.
<?php
$str = '<p> bla bla </p>';
$search = '_^<p> *([\w])(.+) *</p>$_i';
$replacement = '<p><span>$1</span>$2</p>';
$new = preg_replace( $search, $replacement, $str );
echo $new."\n";
You can do this in all CSS.
CSS supports "Pseudo-Elements" where you can choose the first letter / first word and format it differently from the rest of the document.
http://www.w3schools.com/CSS/CSS_pseudo_elements.asp
There's a compatibility chart; some of these may not work in IE 6
http://kimblim.dk/css-tests/selectors/
You could add the span with PHP, but it might not be as clean:
$s = " la la ";
$strip = trim(strip_tags($s));
$t = explode(' ', $strip);
$first = $t[0];
// then replace first character with span around it
$replace = preg_replace('/^(.)/', '<span>$1</span>', $first);
// then replace the first occurrence of that word in the string
$s = preg_replace('/' . preg_quote($first, '/') . '/', $replace, $s, 1);
echo $s;
//not tested
I've not found a versatile method yet, but a traditional code implementation (that may be slower) works:
function pos_first_letter($haystack) {
$ret = false;
if (!empty($haystack)) {
$l = strlen($haystack);
$t = false;
for ($i=0; $i < $l; $i++) {
if (!$t && ($haystack[$i] == '<') ) $t = true;
elseif ($t && ($haystack[$i] == '>')) $t = false;
elseif (!$t && !ctype_space($haystack[$i])) {
$ret = $i;
break;
}
}
}
return $ret;
}
Then call:
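$i = pos_first_letter( $your_string );
$output = $your_string; // fall back to the unchanged string when no letter is found
if ($i !== false) {
$output = substr($your_string, 0, $i);
$output .= '<span>' . substr($your_string, $i, 1) . '</span>';
$output .= substr($your_string, $i+1);
}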