I'm having a problem converting multiple hex colors into spans. My current code only changes one color. Any idea how to make it work for multiple colors?
function convertHexToSpan($name)
{
$name = preg_replace('/\*#([a-f\d]{6})(.*)\*[a-f\d]+/', "<span style='color:$1'>$2</span>", $name);
return $name;
}
$text = "#ff6600Hello #ff0000world";
$newText = convertHexToSpan($text);
The output should be "<span style='color:#ff6600'>Hello</span><span style='color:#ff0000'>world</span>"
Updating your Regular Expression will get you most of the way there, but we have to make some assumptions that differ slightly from your original question.
If you use the following as the expression:
/(#[a-f\d]{6})([^ ]+)/
preg_replace already replaces every match in the string, so the pattern does not need to look for a second hex code; I removed that part. The first group captures the # and the six hex digits, and the second group captures every following character that is not a space.
Note: I am assuming that you are trying to break on word boundaries, but will need to modify if that is not the case. I am also assuming you want to preserve the space between the words after conversion, but your example shows no space.
To remove the space between words, modify the regex so that it also matches any trailing spaces, which the replacement then drops. Use * rather than + so that the last word, which has no trailing space, still matches:
/(#[a-f\d]{6})([^ ]+)( )*/
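Putting the pieces together, here is a minimal sketch of the whole function. It assumes every color is a 6-digit hex code immediately followed by a word, and it adds the i flag (an assumption, not part of the original pattern) so that uppercase hex digits match too. The trailing spaces are optional so that the final word is still converted.

```php
<?php
// Sketch only: assumes "#rrggbb" is always directly followed by the word to color.
function convertHexToSpan(string $name): string
{
    // Group 1: "#" plus six hex digits. Group 2: the word up to the next space.
    // " *" consumes any trailing spaces so they are dropped from the output.
    return preg_replace(
        '/(#[a-f\d]{6})([^ ]+) */i',
        "<span style='color:$1'>$2</span>",
        $name
    );
}

$text = "#ff6600Hello #ff0000world";
echo convertHexToSpan($text), "\n";
// <span style='color:#ff6600'>Hello</span><span style='color:#ff0000'>world</span>
```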
I've been trying to write something that would remove some parts of the words that are declared as unwanted from users' posts. This is what I came up with:
$badWords = array("damn", "hell", "fool"); //we declare an array that will contain all the words we don't want
$txtlower = strtolower($text); //we lowercase the entire text
foreach ($badWords as $word) { //iterate through the array. $word is each bad word respectively
if (strpos($txtlower, $word) !== false) { //check if the lowercased text contains any bad words (since we lowercased the entire text, it will also lowercase and thus detect all upper or mixed case types of any bad word the user has typed)
$wordIndex = strpos($txtlower, $word); //get the index of the bad word in the lowercased text. This index will be the same in the original text
$wordLength = strlen($word); //get the length of the bad word. Now we get back to the original text, i.e. $text
$typedWord = substr($text, $wordIndex, $wordLength); //this is the original bad word that the user has typed, with the case type intact
$replacePart = substr($typedWord, 1, 3); //take three characters starting at the 2nd one, i.e. the 2nd through 4th characters of the bad word
$text = str_replace($replacePart, "...", $text); //replace the $replacePart part with the dots, BUT in the original text, not the lowercased text (important, otherwise it would submit the entire post as lowercase)
}
}
($text is the text the user types in the text box and then submits as a post)
Now this works 99% of the time. It removes both the upper and lowercase versions of the words, as well as any mixed type (for example DAmn or fOoL).
The only case where it doesn't work is if the same unwanted word appears more than once in the text. Then it will fix only the first instance of it. So
Damn, is this DAMn
will become
D..., is this DAMn
Is there a way to do this, or perhaps some regex solution that would include removing just one part of the word instead of the entire thing?
Thanks!
Your code can be simplified.
$badWords = ["damn","hell","fool"];
$filteredText = preg_replace_callback(
"(".implode("|",array_map('preg_quote',$badWords)).")i",
function($match) {
return $match[0][0] // first letter left as-is
.str_repeat(".",strlen($match[0])-1); // as many dots as there are letters left
},
$text
);
However please note that word filters like this are an exercise in futility. You cannot be a..ured that innocent words, even a simple greeting like h...o, will be left alone. Sure, you can use word boundaries (\b) to only match whole words.
But then there's the issue of people finding bypa..es. I'm sure you've seen them around many forums. Character substitutions can pa$$ right through your filter. Inserting spaces as seen here is another way.
My personal favourite is the "Zero Width Space" character, which allows me to type an otherwise filtered word with no apparent difference, defeating the filter entirely.
Humans are creative. Stop them from doing what they want, and they will find ways around it. It is generally a much better use of time to just say "don't use bad language" in your community's rules, and enlist human moderators to handle the (relatively) rare cases of it occurring.
I hope this helps. You can find more information about this problem in this informative video by Tom Scott.
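For reference, the word-boundary variant mentioned in this answer might look like the sketch below. The function name maskBadWords is hypothetical, the arrow functions assume PHP 7.4 or later, and indexing $m[0][0] assumes single-byte (ASCII) words.

```php
<?php
// Hypothetical helper: same idea as the answer's callback, plus \b anchors
// so that only whole words are masked.
function maskBadWords(string $text, array $badWords): string
{
    // Quote each word for use inside a "/"-delimited pattern.
    $quoted = array_map(fn($w) => preg_quote($w, '/'), $badWords);
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/i';

    return preg_replace_callback(
        $pattern,
        // Keep the first letter as typed, replace the rest with dots.
        fn($m) => $m[0][0] . str_repeat('.', strlen($m[0]) - 1),
        $text
    );
}

echo maskBadWords("Damn, is this DAMn? hello", ["damn", "hell", "fool"]), "\n";
// D..., is this D...? hello
```

Every occurrence is masked because preg_replace_callback visits all matches, and "hello" is left alone because "hell" is not followed by a word boundary there.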
I have a string that is:
<p><img src="../filemanager/image.png?1476187745382"/></p> some text ...
I would like to remove the question mark and everything that follows it after a .png or .jpg, up to the end of the URL. The goal is to remove the added timestamp ?1476187745382 but not the "/></p> some text ...
Keep in mind that the timestamp will change, and whatever comes after the image's closing > will also be different.
I have looked at different solutions, but they all remove either the exact occurrence or everything after a certain character, which is not what I need to do.
This is what I have looked at:
PHP remove characters after last occurrence of a character in a string
Remove portion of a string after a certain character
Can someone point me to the right direction?
Not always needed, but a regex will do it:
$string = preg_replace('/\?[\d]{13}/', '', $string);
If the timestamp is not always 13 digits then replace the {13} with just a +.
$path = "../filemanager/image.png?1476187745382";
$subpath = explode('?',$path)[0];
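Note that the explode approach only works when the variable holds the path alone; applied to the full HTML string it would also drop everything after the question mark. As a rough sketch, a regex that is tied to the image extension and accepts any number of digits could look like this (the jpeg alternative and the i flag are assumptions beyond what the question states):

```php
<?php
$string = '<p><img src="../filemanager/image.png?1476187745382"/></p> some text ...';

// Group 1 keeps the extension; the "?digits" that follows it is dropped.
$clean = preg_replace('/(\.(?:png|jpe?g))\?\d+/i', '$1', $string);

echo $clean, "\n";
// <p><img src="../filemanager/image.png"/></p> some text ...
```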
I want to use PHP to clean up some titles by capitalizing each word, including those following a slash. However, I do not want to capitalize the words 'and', 'of', and 'the'.
Here are two example strings:
accounting technology/technician and bookkeeping
orthopedic surgery of the spine
Should correct to:
Accounting Technology/Technician and Bookkeeping
Orthopedic Surgery of the Spine
Here's what I currently have. I'm not sure how to combine the implosion with the preg_replace_callback.
// Will capitalize all words, including those following a slash
$major = implode('/', array_map('ucwords',explode('/',$major)));
// Is supposed to selectively capitalize words in the string
$major = preg_replace_callback("/[a-zA-Z]+/",'ucfirst_some',$major);
function ucfirst_some($match) {
$exclude = array('and','of','the');
if ( in_array(strtolower($match[0]),$exclude) ) return $match[0];
return ucfirst($match[0]);
}
Right now it capitalizes all words in the string, including the ones I don't want it to.
Well, I was going to try a recursive call to ucfirst_some(), but your code appears to work just fine without the first line, i.e.:
<?php
$major = 'accounting technology/technician and bookkeeping';
$major = preg_replace_callback("/[a-zA-Z]+/",'ucfirst_some',$major);
echo ucfirst($major);
function ucfirst_some($match) {
$exclude = array('and','of','the');
if ( in_array(strtolower($match[0]),$exclude) ) return $match[0];
return ucfirst($match[0]);
}
Prints the desired Accounting Technology/Technician and Bookkeeping.
Your regular expression matches strings of letters already, you don't seem to need to worry about the slashes at all. Just be aware that a number or symbol [like a hyphen] in the middle of a word will cause the capitalization as well.
Also, disregard the people harping on you about your $exclude array not being complete enough, you can always add in more words as you come across them. Or just Google for a list.
It should be noted that there is no single, agreed-upon "correct" way to determine what should or should not be capitalized in this way.
You also want to make sure that words like an and the are capitalized when they appear at the start of a title or sentence.
Note: I cannot think of any terms like this that start with of or and, but it is easier to handle that case before odd data creeps into your program.
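A minimal sketch of that idea, reusing the exclusion list from the question and wrapping it in a hypothetical titleCase() helper: excluded words stay lowercase, except that the very first character of the string is always capitalized.

```php
<?php
// Hypothetical helper; the exclusion list is the one from the question.
function titleCase(string $title): string
{
    $exclude = ['and', 'of', 'the'];

    $result = preg_replace_callback('/[a-zA-Z]+/', function ($m) use ($exclude) {
        // Leave excluded words untouched, capitalize everything else.
        return in_array(strtolower($m[0]), $exclude, true) ? $m[0] : ucfirst($m[0]);
    }, $title);

    // The first word is capitalized even if it is in the exclusion list.
    return ucfirst($result);
}

echo titleCase('accounting technology/technician and bookkeeping'), "\n";
// Accounting Technology/Technician and Bookkeeping
echo titleCase('the spine of the matter'), "\n";
// The Spine of the Matter
```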
There is a code snippet that I have used before at
http://codesnippets.joyent.com/posts/show/716
It is referenced in the comments section of the php.net page for ucwords:
http://php.net/manual/en/function.ucwords.php#84920
I was trying to split a string on non-alphanumeric characters or, simply put, I want to split it into words. The approach that immediately came to mind is to use regular expressions.
Example:
$string = 'php_php-php php';
$splitArr = preg_split('/[^a-z0-9]/i', $string);
But there are two problems that I see with this approach.
It is not a native php function, and is totally dependent on the PCRE Library running on server.
An equally important problem is that what if I have punctuation in a word
Example:
$string = "U.S.A-men's-vote";
$splitArr = preg_split('/[^a-z0-9]/i', $string);
Now this will split the string as [{U}{S}{A}{men}{s}{vote}]
But I want it as [{U.S.A}{men's}{vote}]
So my question is that:
How can we split them according to words?
Is there a possibility to do it with php native function or in some other way where we are not dependent?
Regards
Sounds like a case for str_word_count() using the oft forgotten 1 or 2 value for the second argument, and with a 3rd argument to include hyphens, full stops and apostrophes (or whatever other characters you wish to treat as word-parts) as part of a word; followed by an array_walk() to trim those characters from the beginning or end of the resultant array values, so you only include them when they're actually embedded in the "word"
Either you have PHP installed (then you also have PCRE), or you don't. So your first point is a non-issue.
Then, if you want to exclude punctuation from your splitting delimiters, you need to add them to your character class:
preg_split('/[^a-z0-9.\']+/i', $string);
If you want to treat punctuation characters differently depending on context (say, make a dot only be a delimiter if followed by whitespace), you can do that, too:
preg_split('/\.\s+|[^a-z0-9.\']+/i', $string);
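As a quick check, here are both patterns applied to sample input. The second sample sentence is made up for illustration; the first is the string from the question.

```php
<?php
// Dots and apostrophes are part of the allowed class, so only "-" splits here.
$words = preg_split('/[^a-z0-9.\']+/i', "U.S.A-men's-vote");
print_r($words);
// ["U.S.A", "men's", "vote"]

// A dot followed by whitespace is a delimiter; a dot inside a word is kept.
$sentence = "I live in the U.S.A. Men's vote";
$words2 = preg_split('/\.\s+|[^a-z0-9.\']+/i', $sentence);
print_r($words2);
// ["I", "live", "in", "the", "U.S.A", "Men's", "vote"]
```

If the input can start or end with a delimiter, passing -1 and PREG_SPLIT_NO_EMPTY as the third and fourth arguments avoids empty elements in the result.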
As per my comment, you might want to try (add as many separators as needed)
$splitArr = preg_split('/[\s,!\?;:-]+|[\.]\s+/', $string, -1, PREG_SPLIT_NO_EMPTY);
You'd then have to handle the case of a "quoted" word (it's not so easy to do in a regular expression, because 'is" "this' quoted? And how?).
So I think it's best to keep ' and " within words (so that "it's" is a single word, and "they 'll" is two words) and then deal with those cases separately. For example a regexp would have some trouble in correctly handling
they 're 'just friends'. Or that's what they say.
while having "'re" and a sequence of words of which the first is left-quoted and the last is right-quoted, the first not being a known sequence ('s, 're, 'll, 'd ...) may be handled at application level.
This is not a php-problem, but a logical one.
Words can be joined by a -. Abbreviations can look like short sentences.
You can match your example directly by creating a solution that fits only this particular phrase, but you can't get a solution for all possible phrases. That would require content recognition based on neural computing.
Let's say I have an HTML document.
How can I remove everything from the document except the text?
I want to remove the HTML tags.
I want to remove any special characters.
I want to remove everything except letters
and extract the text.
Thanks
You can use strip_tags and preg_replace to accomplish this:
function clean($in)
{
// Remove HTML
$out = strip_tags($in);
// Filter all other characters
return preg_replace("/[^a-z]+/i", "", $out);
}
[^a-z] matches any character other than a to z, the + sign specifies that it should match a sequence of one or more such characters, and the /i modifier makes the search case-insensitive. All matched characters are replaced with an empty string, leaving only the letters.
If you want to keep spaces you can use [^a-z ] instead and if you want to keep numbers as well [^a-z0-9 ]. This allows you to whitelist all allowed characters and discard the rest.
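The whitelist variants can be sketched with one function that takes the allowed characters as a parameter. The $allowed parameter is hypothetical and is inserted into the character class as-is, so the caller must pass a valid class body such as 'a-z0-9 '.

```php
<?php
// Sketch: $allowed is a hypothetical parameter holding the body of the
// character class, e.g. 'a-z', 'a-z ' or 'a-z0-9 '.
function clean(string $in, string $allowed = 'a-z'): string
{
    // Remove HTML tags first.
    $out = strip_tags($in);
    // Then drop every run of characters that is not whitelisted.
    return preg_replace('/[^' . $allowed . ']+/i', '', $out);
}

$html = '<p>Hello World 42!</p>';
echo clean($html), "\n";             // HelloWorld
echo clean($html, 'a-z0-9 '), "\n";  // Hello World 42
```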
Use strip_tags() to get rid of HTML first, then use Emil H's regex.
Prepend
$in = preg_replace("/<[^>]*>/", "", $in);
to Emil H's solution so that the tags get stripped. Otherwise, "<p>Hello World</p>" will come out as "pHelloWorldp".