Regex to match any single alphabet in string php - php

I hate this code. It takes the zip code from HTML pages and verifies whether the zip code is from the USA or Canada.
<?php
$contents = file_get_contents('853755.html');
preg_match_all('/<td id="u_zipCode">(.*?)<\/td>/s', $contents, $matches);
$data = implode("|",$matches[0]);
$string = str_replace(' ', '', $data);
if(preg_match('/^[1-9][0-9]*$/',$string)){
echo "Canada";
}
else
{
echo "USA";
}
?>
I tried very hard to find a solution, but I can't find one. I even tried a regex requiring that the string contain only numbers and not a single letter, but this doesn't seem to work in any way. The zip codes are given in this format:
USA: 97365
USA: 97365-97366
Canada: j8n7s1
Canada: N2L5Y6
Kindly help me with this solution. Thanks

Since there can only be one element with a particular ID, you don't need to use preg_match_all(), you can use preg_match().
preg_match('/<td id="u_zipCode">(.*?)<\/td>/s', $contents, $match);
Then you don't need to use implode(). The part that matches the capture group will be in element 1 of the match array.
$string = str_replace(' ', '', $match[1]);
if(preg_match('/^\d{5}(-\d{4})?$/',$string)) {
echo "USA";
} else {
echo "Canada";
}
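As a rough sketch of the same idea, both formats can also be checked explicitly, so that anything matching neither pattern is not silently reported as Canada. The function name classifyPostalCode is hypothetical, and allowing five digits after the hyphen is an assumption made only because the question lists "97365-97366" as a US value:

```php
<?php
// Hypothetical helper: classify a postal code by its shape.
function classifyPostalCode(string $raw): string
{
    // Drop all whitespace, then normalise case.
    $code = strtoupper(preg_replace('/\s+/', '', $raw));

    // USA: five digits, optionally a hyphen and 4 or 5 more digits
    // (5 is accepted only because the question lists "97365-97366").
    if (preg_match('/^\d{5}(-\d{4,5})?$/', $code)) {
        return 'USA';
    }

    // Canada: letter-digit-letter-digit-letter-digit, e.g. N2L5Y6.
    if (preg_match('/^[A-Z]\d[A-Z]\d[A-Z]\d$/', $code)) {
        return 'Canada';
    }

    return 'unknown';
}

foreach (['97365', '97365-97366', 'j8n7s1', 'N2L 5Y6'] as $zip) {
    echo $zip, ' => ', classifyPostalCode($zip), "\n";
}
```

Returning 'unknown' for everything else makes it easier to spot pages where the td was missing or contained something unexpected.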

Related

PHP extract one part of a string

I have to extract the email from the following string:
$string = 'other_text_here to=<my.email#domain.fr> other_text_here <my.email#domain.fr> other_text_here';
The server sends me logs in this kind of format. How can I get the email into a variable without "to=<" and ">"?
Update: I've updated the question; it seems the email can appear many times in the string, and the regular expression won't work well with that.
You can try with a more restrictive Regex.
$string = 'other_text_here to=<my.email#domain.fr> other_text_here';
preg_match('/to=<([A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4})>/i', $string, $matches);
echo $matches[1];
A simple regular expression should be able to do it:
$string = 'other_text_here to=<my.email#domain.fr> other_text_here';
preg_match( "/\<(.*)\>/", $string, $r );
$email = $r[1];
When you echo $email, you get "my.email#domain.fr"
Try this:
<?php
$str = "The day is <tag> beautiful </tag> isn't it? ";
preg_match("'<tag>(.*?)</tag>'si", $str, $match);
$output = array_pop($match);
echo $output;
?>
output:
beautiful
A regular expression would be easy if you are certain that < and > aren't used anywhere else in the string:
$emails = array();
if (preg_match_all('/<(.*?)>/', $string, $matches)) {
    $emails = $matches[1]; // Keep only the captured groups, not the full "<...>" matches
}
// $emails is now an array of emails if any exist in the string
The parentheses tell it to capture for the $matches array. The .* picks up any characters and the ? tells it to not be greedy, so the > isn't picked up with it.
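To illustrate the difference the ? makes, here is a small sketch using the string from the updated question, which contains the address twice. The variable names $greedy and $lazy are only for illustration:

```php
<?php
$string = 'other_text_here to=<my.email#domain.fr> other_text_here <my.email#domain.fr> other_text_here';

// Greedy: .* runs on to the LAST ">" in the string.
preg_match('/<(.*)>/', $string, $greedy);

// Lazy: .*? stops at the FIRST ">" after each "<".
preg_match_all('/<(.*?)>/', $string, $lazy);

echo $greedy[1], "\n"; // one long capture spanning both addresses
print_r($lazy[1]);     // two separate captures, one per address
```

With the greedy pattern the capture swallows everything between the first < and the last >, which is why it stops working once the address appears more than once.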

Extracting a string between an "id_" and a ".html"

I want to extract the id from this Youku video:
http://v.youku.com/v_show/id_XNTU2NzQyNzQ0.html?f=19275195&ev=3
The id is the random string of letters between id_ and .html
How can I accomplish that?
Use this
$input = 'http://v.youku.com/v_show/id_XNTU2NzQyNzQ0.html?f=19275195&ev=3';
preg_match('~id_(.*?)\.html~', $input, $output);
echo $output[1];
Output
XNTU2NzQyNzQ0
You can try below code:
<?php
$varStr = 'http://v.youku.com/v_show/id_XNTU2NzQyNzQ0.html?f=19275195&ev=3';
$filename = basename($varStr);
preg_match_all('/id_(.*)\.html/', $filename, $match);
echo $match[1][0];
?>
Just for the sake of using named results in your REGEX, I would recommend doing something like this. Everyone else's work just fine, I've just added the named grouping as well as a non-greedy approach by ignoring periods
<?php
$regex = '/id_(?P<video_id>[^.]*)\./';
if(preg_match($regex, "http://v.youku.com/v_show/id_XNTU2NzQyNzQ0.html?f=19275195&ev=3", $matches)) {
echo $matches['video_id'];
}

Preg Replace with mentions

I have some problems with preg_replace.
I want to turn the mentions into links, but the name isn't a username.
So the names contain spaces. I found a good solution, but I don't know how to do it.
Essentially, I want preg_replace to replace the words that are between # and ,
For example:
#John Doeh, #Jenna Diamond, #Sir Duck Norman
and replace to
VAL
How do I do it?
I think that you want it like:
John Doeh
For this try:
$myString="#John Doeh, #Jenna Diamond, #Sir Duck Norman";
foreach(explode(',',$myString) as $str)
{
if (preg_match("/\\s/", $str)) {
$val=str_replace("#","",trim($str));
echo "<a href='user.php?name=".$val."'>".$val."</a>";
// there are spaces
}
}
Based on my assumption you want to remove strings which start with #Some Name, in a text like: #Some Name, this is a message.
Then replace that to an href, like: First_Name
If that is the case then the following regex will do:
$str = '#First_Name, say something';
echo preg_replace ( '/#([[:alnum:]\-_ ]+),.*/', '$1', $str );
Will output:
First_Name
I also added support for numbers, underscores and dashes. Are those valid in a name as well? Are any other characters valid in a #User Name? Those are things that are important to know.
Two methods:
<?php
// preg_replace method
$string = '#John Doeh, #Jenna Diamond, #Sir Duck Norman';
$result = preg_replace('/#([\w\s]+),?/', '$1', $string);
echo $result . "<br>\n";
// explode method
$arr = explode(',', $string);
$result2 = '';
foreach($arr as $name){
$name = trim($name, '# ');
$result2 .= ''.$name.' ';
}
echo $result2;
?>
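If the goal is to wrap each mention in a link, a callback is one way to build the markup while encoding the name. This is only a sketch: the user.php?name= URL is borrowed from the first answer, and the pattern assumes names consist of word characters separated by single spaces and end at a comma or at the end of the string:

```php
<?php
$string = '#John Doeh, #Jenna Diamond, #Sir Duck Norman';

// Each mention: "#", then one or more words separated by single spaces.
$result = preg_replace_callback(
    '/#(\w+(?: \w+)*)/',
    function (array $m): string {
        $name = $m[1];
        // Encode the name for the query string and escape it for HTML.
        return '<a href="user.php?name=' . rawurlencode($name) . '">'
            . htmlspecialchars($name) . '</a>';
    },
    $string
);

echo $result, "\n";
```

Note that in free text without a trailing comma the pattern would keep consuming the words that follow the name, so a delimiter of some kind is still needed.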

How to convert a string with numbers and spaces into an int

I have a small problem. I am trying to convert a string like "1 234" to a number: 1234
I can't get there. The string is scraped from a website. Is it possible that the character there is not a space? I've tried methods like str_replace and preg_split on the space, and nothing worked. Also, (int)$abc takes only the first digit (1).
If anyone has an idea, I'd be grateful! Thank you!
This is how I would handle it...
<?php
$string = "Here! is some text, and numbers 12 345, and symbols !£$%^&";
$new_string = preg_replace("/[^0-9]/", "", $string);
echo $new_string; // Outputs 12345
?>
intval(preg_replace('/[^0-9]/', '', $input))
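One possible reason str_replace on a plain space does nothing is that scraped pages often use a non-breaking space as the thousands separator. The sketch below assumes that is the case here (the question does not confirm it) and shows why stripping every non-digit still works:

```php
<?php
// Assumption: the scraped value contains a UTF-8 non-breaking space (U+00A0).
$scraped = "1\u{00A0}234";

$withStrReplace = str_replace(' ', '', $scraped);         // the NBSP is not a plain space, so it survives
$digitsOnly     = preg_replace('/[^0-9]/', '', $scraped); // every non-digit byte is removed

var_dump((int) $withStrReplace); // the cast stops at the first non-digit
var_dump((int) $digitsOnly);
```

This also matches the symptom in the question, where the cast returned only the first digit.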
Scraping websites always requires specific code, you know how you receive the input - and you write code that is required to make it usable.
That is why first answer is still str_replace.
$iInt = (int)str_replace(array(" ", ".", ","), "", $iInt);
$str = "1 234";
$int = intval(str_replace(' ', '', $str)); //1234
I've just run into the same issue; however, the answers provided didn't cover all the different cases I had...
So I made this function (the idea popped into my mind thanks to Dan):
function customCastStringToNumber($stringContainingNumbers, $decimalSeparator = ".", $thousandsSeparator = " "){
$numericValues = $matches = $result = array();
$regExp = null;
$decimalSeparator = preg_quote($decimalSeparator);
$regExp = "/[^0-9$decimalSeparator]/";
preg_match_all("/[0-9]([0-9$thousandsSeparator]*)[0-9]($decimalSeparator)?([0-9]*)/", $stringContainingNumbers, $matches);
if(!empty($matches))
$matches = $matches[0];
foreach($matches as $match):
$numericValues[] = (float)str_replace(",", ".", preg_replace($regExp, "", $match));
endforeach;
$result = $numericValues;
if(count($numericValues) === 1)
$result = $numericValues[0];
return $result;
}
So, basically, this function extracts all the numbers contained inside a string, no matter how much text there is, identifies the decimal separator and returns every extracted number as a float.
One can specify what decimal separator is used in one's country with the $decimalSeparator parameter.
Use this code for removing any other characters like .,:"'\/, !##$%^&*(), a-z, A-Z :
$string = "This string involves numbers like 12 3435 and 12.356 and other symbols like !## then the output will be just an integer number!";
$output = intval(preg_replace('/[^0-9]/', '', $string));
var_dump($output);

Capitalize First Letter Of Each Word except for URLs

Can someone tell me please how to do this:
Input:
hello http://DOMAIN.com/asdakdjk.php?asd=231&adsj=23 u.s. nicely done!
Result:
Hello http://DOMAIN.com/asdakdjk.php?asd=231&adsj=23 U.S. Nicely Done!
Including words separated by '.', if possible, such as U.S.
Thanks
try this:
<?php
function capitalizeNonURLs($input)
{
preg_match('#(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)#', $input, $matches);
$url = $matches[1];
$temp = ucwords($input);
$output = str_ireplace($url, $url, $temp);
return $output;
}
$str = "hello http://domain.com/asdakdjk.php?asd=231&adsj=23 u.s. nicely done!";
echo capitalizeNonURLs($str);
Keep in mind that this function does not handle abbreviations (it won't change usa to USA). Country codes can be handled in several different ways. One is to make a hashmap of country codes and replace them or use regular expression for that as well.
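Dotted abbreviations such as u.s. can be covered by capitalizing the letter after each period as well. The following is a minimal sketch, not the answer's method: the function name capitalizeExceptUrls is hypothetical, and it assumes words are separated by single spaces and that URLs start with http:// or https://:

```php
<?php
// Hypothetical helper: capitalise every word except tokens that look like URLs.
function capitalizeExceptUrls(string $input): string
{
    $words = explode(' ', $input);
    foreach ($words as $i => $word) {
        if (preg_match('#^https?://#i', $word)) {
            continue; // leave URLs exactly as they are
        }
        // Upper-case a letter at the start of the token or right after a ".".
        $words[$i] = preg_replace_callback(
            '/(^|\.)([a-z])/',
            function (array $m): string {
                return $m[1] . strtoupper($m[2]);
            },
            $word
        );
    }
    return implode(' ', $words);
}

echo capitalizeExceptUrls('hello http://DOMAIN.com/asdakdjk.php?asd=231&adsj=23 u.s. nicely done!'), "\n";
```

One trade-off: a bare domain written without a scheme, such as example.com, would also have the letter after its period capitalized.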
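To leave URLs untouched:
$strarray = explode(' ', $str);
for ($i = 0; $i < count($strarray); $i++) {
    // Skip anything that starts with "http"; capitalize everything else.
    if (substr($strarray[$i], 0, 4) != 'http') {
        $strarray[$i] = ucfirst($strarray[$i]);
    }
}
$new_str = implode(' ', $strarray); // rejoin with the same separator used by explode()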
