I've got a string:
$string = "Something here 2014 another text here";
I need to detect the position of the first four-digit number that begins with "20".
So the result for the example would be offset 15 in $string (counting from 0).
Since you have commented with code you tried, I now feel comfortable answering your question properly :) Thank you for trying first!
Your attempt:
preg_match('/20\d\d/', "Something here 2014 another text here",
$matches, PREG_OFFSET_CAPTURE);
... is absolutely correct. However, as you correctly pointed out, it would also match inside 20140 (and indeed inside 12014 too).
To fix this behaviour, you can add word boundaries, because digits count as word characters. Your regex becomes:
'/\b20\d\d\b/'
This will ensure that there are no numbers (or letters, for that matter) immediately before or after your target four-digit number :)
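Putting the pieces together, here is a minimal sketch of reading the position back out of the match. It assumes the sample string from the question; the variable names $year and $offset are only illustrative.

```php
<?php
$string = "Something here 2014 another text here";

// With PREG_OFFSET_CAPTURE each match is a [text, byte offset] pair.
if (preg_match('/\b20\d\d\b/', $string, $matches, PREG_OFFSET_CAPTURE)) {
    [$year, $offset] = $matches[0];
    echo "$year found at offset $offset\n"; // prints "2014 found at offset 15"
}
```

The offset is counted in bytes from 0, so for multibyte UTF-8 text it may differ from the character position.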
What about...
$needle = "20";
$pos = strpos($string , $needle);
EDIT:
As requested, a way to get the four-character string from this:
$date = substr($string, $pos, 4);
Is there a way to find out which characters do not match in a preg_match call?
For example:
preg_match('/^[a-z]*$/i', 'Hello World!');
Is there some function to find the incorrect characters, in this case the space and "!"?
Thanks for your replies, but the problem with your examples is that you don't anchor the beginning and the end of the string. Your examples work when the string is contained in another one, not when the string must be exactly what I defined in the pattern.
For example, if I had to validate the Italian fiscal code of a subject, which is a string formatted like this:
XXX XXX YY X YY X YYY X (X = letter, Y = number - without spaces)
which pattern is:
'/^[A-Z]{6}[0-9]{2}[A-Z]{1}[0-9]{2}[A-Z]{1}[0-9]{3}[A-Z]{1}$/i'
I must validate that the string matches exactly what I defined in the pattern.
If I use your code and get 1 (only 1) character wrong, the whole string is returned as an error.
http://eval.in/9178
The problem with the inverted pattern shows up in a complex pattern that contains AND or OR conditions.
What I want to know is why preg_match fails, not only whether it fails.
Have you tried something like this?
$nonMatchingCharacters = preg_replace('/[a-z]/i', '', $wholeString);
That should strip out the 'legal' characters, leaving only the ones that you want to mention in your validation error message.
You could also do other treatments like...
$nonMatchingCharactersArray = array_unique(str_split($nonMatchingCharacters));
...if you want an array of unique, non-matching characters, and not just a string with bits stripped out of it.
This will show you the space and the !
preg_match_all('/[^a-z]/i', 'Hello World!', $matches);
var_dump($matches);
http://eval.in/9132
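If the position of each offending character matters as well, the same call can report offsets. A small sketch, assuming the 'Hello World!' input from the question; the $invalid array is just an illustrative name.

```php
<?php
$input = 'Hello World!';

// Collect every character outside a-z (case-insensitive) with its byte offset.
preg_match_all('/[^a-z]/i', $input, $matches, PREG_OFFSET_CAPTURE);

$invalid = [];
foreach ($matches[0] as [$char, $offset]) {
    $invalid[$offset] = $char;
}

var_dump($invalid); // offset 5 holds the space, offset 11 holds "!"
```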
Just remove everything that matches with preg_replace, then split into an array what remains.
<?php
$str = preg_replace('/([0-9]{2}[a-z]*)/i', '', '03Hello 02World!');
$characters = str_split($str);
var_dump($characters);
http://eval.in/9152
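For a fixed-layout string such as the fiscal code pattern in the question, one option is to check each position against the class it should belong to and report the offsets that fail. This is only a sketch: the function name invalidPositions and the layout string are made up for illustration, and it only checks ASCII letters and digits, not any further rules a real fiscal code may have.

```php
<?php
// Hypothetical helper: returns [offset => character] for every position
// that does not fit the expected layout.
function invalidPositions(string $code): array
{
    // Layout taken from the pattern in the question:
    // 6 letters, 2 digits, 1 letter, 2 digits, 1 letter, 3 digits, 1 letter.
    $layout = 'LLLLLLDDLDDLDDDL';
    $errors = [];
    $length = max(strlen($code), strlen($layout));

    for ($i = 0; $i < $length; $i++) {
        $char = $code[$i] ?? '';   // '' means the input is too short here
        $kind = $layout[$i] ?? ''; // '' means the input is too long here
        $ok = ($kind === 'L' && ctype_alpha($char))
           || ($kind === 'D' && ctype_digit($char));
        if (!$ok) {
            $errors[$i] = $char;
        }
    }

    return $errors;
}

var_dump(invalidPositions('RSSMRA8XT10A562S')); // only offset 7 ("X") is reported
```

An empty array means every position fits, so a single wrong character no longer flags the whole string.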
I am asking a question that could probably be answered with a little googling, but I have not found anything that does this, so I am asking here. Please don't downvote or close it; it may be useful to others as well:
My problem: I need to find a portion of a string and replace it. That portion changes every time, so I need to use a regexp.
$url = "www.google.com/test=2";
replace the 'test=2' with 'test=1'
$result = "www.google.com/test=1"
The thing is, the slug can contain any number between 1 and 20: test=\d{1,20} is what I have. I tried preg_replace and substr_replace, but neither did what I need.
I don't see what's wrong with /test=\d+/. I think your problem is that you're forgetting the delimiters (as in your example: http://codepad.org/ZZNNr1bH; you have to wrap the pattern in delimiters like so: http://codepad.org/9TEkPYJo)
<?php
$url = "www.google.com/test=2";
$result = preg_replace("/test=\d+/", "test=1", $url);
//                      ^        ^  delimiters
var_dump($result);
?>
Update:
That said, like Ashley mentioned, \d{1,20} doesn't mean "one to twenty" but rather "any digit character repeated 1 to 20 times".
If you only want numbers from 0 to 20, use the following regex:
/test=([0-9]|1[0-9]|20)\b/
Basically meaning (a digit from 0-9 OR the digit 1 FOLLOWED by any digit from 0-9 OR the number 20). The trailing \b matters: without it, test=15 would match only as test=1, because the first alternative succeeds and nothing forces the regex to consume the 5.
It could also be shortened to
/test=(1?\d|20)\b/
Meaning (an optional 1 followed by a digit from 0-9 OR the number 20)
Also, \d{1,20} won't match the numbers between 1 and 20, but 0 and 99,999,999,999,999,999,999 which is not exactly what I think you're after.
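When the allowed range is numeric, it can be easier to capture the digits and compare them as an integer than to encode the range in the regex. A rough sketch, assuming the 1 to 20 range from the question; setTestValue is a made-up helper name.

```php
<?php
// Hypothetical helper: replaces test=N with test=$new only when N is 1..20.
function setTestValue(string $url, int $new): string
{
    return preg_replace_callback(
        '/test=(\d+)\b/',
        function (array $m) use ($new): string {
            $n = (int) $m[1];
            // Out-of-range values are put back unchanged.
            return ($n >= 1 && $n <= 20) ? "test=$new" : $m[0];
        },
        $url
    );
}

echo setTestValue("www.google.com/test=2", 1), "\n";  // www.google.com/test=1
echo setTestValue("www.google.com/test=25", 1), "\n"; // www.google.com/test=25
```

Changing the range later only means touching the comparison, not the pattern.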
$url = "www.google.com/test=2";
$replacement = 1;
echo preg_replace('/(.*test)=\d+/', '$1=' . $replacement, $url);
If I understand the question correctly, this should do.
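$url = "website.com/test=1/asdasd/....";
$check = "test=1";
$newval = "test=2";
if (strpos($url, $check) !== false)
    $result = str_replace($check, $newval, $url);
If the value is contained in the string, it is replaced. Note that $check is a literal string, not a regex, so str_replace is the right tool here; preg_replace would require delimiters around the pattern.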
I found some partial help but cannot seem to fully accomplish what I need. I need to be able to do the following:
I need a regular expression that replaces any 1 to 3 character words between two words longer than 3 characters with a match-any expression:
For example:
walk to the beach ==> walk(.*)beach
If the 1 to 3 character word is not preceded by a word that's longer than 3 characters then I want to translate that 1 to 3 letter word to '<word> ?'
For example:
on the beach ==> on ?the ?beach
The simpler the rule the better (of course, if there's a more complicated alternative that performs better, I'll take that as well, as I anticipate heavy usage eventually).
This will be used in a PHP context most likely with preg_replace. Thus, if you can put it in that context then even better!
By the way so far I have got the following:
$string = preg_replace('/\s+/', '(.*)', $string);
$string = preg_replace('/\b(\w{1,3})(\.*)\b/', '${1} ?', $string);
but that results in:
walk to the beach ==> 'walk(.*)to ?beach'
which is not what I want. 'on the beach' seems to translate correctly.
I think you will need two replacements for that. Let's start with the first requirement:
$str = preg_replace('/(\w{4,})(?: \w{1,3})* (?=\w{4,})/', '$1(.*)', $str);
Of course, you need to replace those \w (which match letters, digits and underscores) with a character class of what you actually want to treat as a word character.
The second one is a bit tougher, because matches cannot overlap and lookbehinds cannot be of variable length. So we have to run this multiple times in a loop:
do
{
$str = preg_replace('/^\w{0,3}(?: \w{0,3})* (?!\?)/', '$0?', $str, -1, $count);
} while($count);
Here we match everything from the beginning of the string, as long as it's only up-to-3-letter words separated by spaces, plus one trailing space (only if it is not already followed by a ?). Then we put all of that back in place, and append a ?.
Update:
After all the talk in the comments, here is an updated solution.
After running the first line, we can assume that the only words of up to 3 letters left will be at the beginning or at the end of the string. All others will have been collapsed to (.*). Since you want to append ? to all the spaces between those, you do not even need a loop (in fact, these are the only spaces left):
$str = preg_replace('/ /', ' ?', $str);
(Do this right after my first line of code.)
This would give the following two results (in combination with the first line):
let us walk on the beach now go => let ?us ?walk(.*)beach ?now ?go
let us walk on the beach there now go => let ?us ?walk(.*)beach(.*)there ?now ?go
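The two steps can be wrapped into one function, as sketched here. The name toLoosePattern is hypothetical, and the sketch keeps \w as the word character class, which you may want to narrow as noted earlier.

```php
<?php
// Hypothetical wrapper around the two replacements from this answer.
function toLoosePattern(string $str): string
{
    // Step 1: collapse short words between two long words into (.*).
    $str = preg_replace('/(\w{4,})(?: \w{1,3})* (?=\w{4,})/', '$1(.*)', $str);

    // Step 2: make every remaining space optional.
    return preg_replace('/ /', ' ?', $str);
}

echo toLoosePattern('walk to the beach'), "\n"; // walk(.*)beach
echo toLoosePattern('on the beach'), "\n";      // on ?the ?beach
```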
I'm reluctant to ask, but I can't figure out PHP's preg_replace and how to ignore certain bits of the string.
$string = '2012042410000102';
$string needs to look like _0424_102
The digits shown are variable and always changing, and 2012 changes every year.
what I've tried:
^\d{4}[^\d{4}]10000[^\d{3}]$
^\d{4}[^\d]{4}10000[^\d]{3}$
Any help would be appreciated. I know it's a noob question but easy points for whoever helps.
Thanks
Your first regex is looking for:
The start of the string
Four digits (the year)
Any single character that is not a digit nor { or }
The number 10000
Any single character that is not a digit nor { or }
The end of the string
Your second regex is looking for:
The start of the string
Four digits (the year)
Any four characters that are not digits
The number 10000
Any three characters that are not digits
The end of the string
The regex you're looking for is:
^\d{4}(\d{4})10000(\d{3})$
And the replacement should be:
_$1_$2
This regex looks for:
The start of the string
Four digits (the year)
Capture four digits (the month and day)
The number 10000
Capture three digits (the 102 at the end in your example)
The end of the string
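As a complete call, the regex and replacement above could be used roughly like this, assuming the sample value from the question:

```php
<?php
$string = '2012042410000102';

// $1 is the month and day, $2 is the trailing three digits.
$result = preg_replace('/^\d{4}(\d{4})10000(\d{3})$/', '_$1_$2', $string);

echo $result, "\n"; // _0424_102
```

If the input does not have this exact shape, preg_replace finds no match and returns the string unchanged.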
Try the following:
^\d{4}|10000(?=\d{3}$)
This will match either the first four digits in a string, or the string '10000' if there are three digits after '10000' before the end of the string.
You would use it like this:
preg_replace('/^\d{4}|10000(?=\d{3}$)/', '_', $string);
http://codepad.org/itTgEGo4
Just use simple string functions:
$string = '2012042410000102';
$new = '_'.str_replace('10000', '_', substr($string, 4));
http://codepad.org/elRSlCIP
If they're always in the same character locations, regular expressions seem unnecessary. You could use substrings to get the parts you want, like
sprintf('_%s_%s', substr($string,4,4), substr($string,13))
or
'_' . substr($string,4,4) . '_' . substr($string,13)
I have a textarea on page with UTF8 encoding.
How can I count all the sentences with PHP?
Update:
A sentence starts with a capital letter and ends with a dot, question mark, or exclamation mark.
From PHP's point of view, a <textarea> is simply another <input>, so it will be available through $_GET or $_POST as normal when the form is submitted.
Sentence counting in itself is quite complicated. You could count the number of periods (.) in the text, but this would fail with abbreviations such as "e.g.". You could count the periods followed by a space and then a capital letter, but this would fail for abbreviations followed by proper nouns, and also for people who don't capitalize the beginning of their sentences. You could decide on an average sentence length (say 70 characters) and approximate sentences = characters/70. None of these solutions is perfect (or even good, in my opinion).
UPDATE: Following your updated question, the following should be helpful:
<?php
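preg_match_all("/(^|[.!?])\s*\p{Lu}/u", $_POST['textarea'], $matches); // \p{Lu} with /u also covers non-ASCII capitals
$count = count($matches[0]); // $matches[0] holds the full matches; count($matches) would count the groups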
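Under the definition from the question's update (starts with a capital letter, ends with a dot, question mark, or exclamation mark), a self-contained version might look like the sketch below. countSentences is a made-up name, and the abbreviation problems described above still apply.

```php
<?php
// Hypothetical helper: counts runs that start with an uppercase letter
// and end with one or more of . ! ?
function countSentences(string $text): int
{
    // \p{Lu} = uppercase letter in any script; the u flag treats the input as UTF-8.
    return (int) preg_match_all('/\p{Lu}[^.!?]*[.!?]+/u', $text);
}

echo countSentences("Hello world. How are you? fine thanks. Great!"), "\n"; // 3
```

The lowercase fragment "fine thanks." is skipped because it does not start with a capital letter.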
As Nobody was saying already, it depends on how you define a sentence. Is it a ? Is it a linebreak? Is it a capital?
I think it's really hard to define "a sentence", because for every definition you can think of 100 exceptions to that rule.
Anyway, if you come up with a definition, you could then count the occurrences of that in your textarea, such as the number of linebreaks, the number of dots, or the number of capital letters. Or combine all of those into one definition. So basically, just take the contents of your textarea and run some function on it. :-)
That's the best that can be answered to this question imo.
Edit: After your edit, my answer is:
function starts_with_upper($str) {
$chr = mb_substr ($str, 0, 1, "UTF-8");
return mb_strtolower($chr, "UTF-8") != $chr;
}
// Get the sentences, split on dots, that start with a capital letter.
$total = 0;
$sentences = explode('.', rtrim($text, '.'));
for ($i = 0; $i < count($sentences); $i++) {
    $sentence = ltrim($sentences[$i]); // drop the space left after the previous dot
    if (starts_with_upper($sentence)) {
        $total++;
    }
}
echo "You have " . $total . " sentences ending in a dot.";
If you treat a sentence as a run of words with a dot at the end, you can count the dots in your text.
If you use new lines instead, count the \n characters.