Extract educational qualifications from someone's title via regex - php

I need to write a regular expression that extracts people's educational qualifications from a particular file.
For example:
B.Tech B.Com M.S S.S.C
I have been trying the following code:
if(strlen($data<=6))
{
$regexp="/^[A-Za-z\.]$/";
if(preg_match($regexp,$data))
{
echo 'Education = ' . $data .'<br />';
}
}
But I only get the last dot and the character after it.

How about /.*/? It would match your whole string, which is an educational qualification (as you wanted).
PS: you didn't explain the structure of the whole string or any additional criteria, so you get such a generic answer.

Try \b(([\w]+[.][\w]+)([.][\w])*). With this you can extract B.Tech, B.Com, M.S and so on from any string.
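A minimal sketch of how that pattern could be applied with preg_match_all, assuming the qualifications are dot-separated tokens like those in the question. The sample string and variable names are made up for illustration. Note that the question's own pattern, /^[A-Za-z\.]$/, has no quantifier, so it can only match a string that is a single character long.

```php
<?php
// Hypothetical input line; only the dotted tokens should be picked up.
$data = 'Ravi Kumar B.Tech B.Com M.S S.S.C';

// Pattern from the answer above, wrapped in delimiters.
$pattern = '/\b(([\w]+[.][\w]+)([.][\w])*)/';

preg_match_all($pattern, $data, $matches);

// $matches[0] holds the full text of every match.
foreach ($matches[0] as $qualification) {
    echo 'Education = ' . $qualification . "\n";
}
```

The pattern accepts any dotted token, so decimals such as 3.14 or domain names would be picked up as well; a whitelist of known qualifications may be needed on real data.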

Related

PHP - Search a string for specific values, return related text

I'm using a PDF reader for PHP to load a big .pdf file; it stores each page as a separate, huge string in a big array.
This results in an output like this:
"Official certificate Surname: Doe First Name: John Date of birth:
10th of June, 1970 Place of Birth etc etc..."
How do I search for the specific text "Surname" and then select whatever text comes after it, up to "First Name", and return it as $var_surname?
The syntax used in the .pdf file will always be the same, so I have no problem using such absolute conditions for searching for the text.
I genuinely don't know where to start. Sorry if this question feels vague, let me know if more information is required.
if(preg_match('/Surname:[\s]+([\w]+)[\s]+First/i', $input, $matches)){
echo $matches[1];
}
This will echo Doe.
You could use strpos() to find the position of the string "Surname" (adding its length gives the position where it ends). Then you could use strpos() again to find the position where the string "First Name" starts. Once you know both positions, you can cut out the text between them with substr() and store it as $var_surname. Hope this helps.
It would be better to figure out the pattern first and then write some methods for it; after that, just pass the string to those methods. Based on the information given, this is my best possible answer. You will certainly need PHP's built-in string functions.
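The position-based approach could be sketched as follows, assuming the labels always appear literally as "Surname:" and "First Name:". The helper name extractBetween is hypothetical, not part of PHP or of the PDF reader.

```php
<?php
// Hypothetical helper: returns the text between two literal labels,
// or null if either label is missing.
function extractBetween(string $text, string $startLabel, string $endLabel): ?string
{
    $start = strpos($text, $startLabel);
    if ($start === false) {
        return null;
    }
    // Move past the start label itself.
    $start += strlen($startLabel);

    // Search for the end label only after the start label.
    $end = strpos($text, $endLabel, $start);
    if ($end === false) {
        return null;
    }

    return trim(substr($text, $start, $end - $start));
}

$page = 'Official certificate Surname: Doe First Name: John Date of birth: 10th of June, 1970';
$var_surname = extractBetween($page, 'Surname:', 'First Name:');
echo $var_surname; // Doe
```

Unlike the [\w]+ regex, this also keeps surnames that contain spaces or hyphens.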

preg_replace limit issue, handling array values

I've been working with the Sphider search engine for an internal website; we need to be able to quickly search for contact details in exported .htm(l) files.
$fulltxt = ereg_replace("[_A-Za-z0-9-]+(\.[_A-Za-z0-9-]+)*#[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*(\.[A-Za-z]{2,3})", "\\0", $fulltxt);
I am replacing e-mail addresses with a convenient mailto: link so users can open Outlook straight from the search results.
However,
while (preg_match("/[^\>](".$change.")[^\<]/i", " ".$fulltxt." ", $regs)) {
$fulltxt = preg_replace("/".$regs[1]."/i", "<b>".$regs[1]."</b>", $fulltxt);
}
It wraps every match in the search results in bold tags, which results in the tags being included in Outlook's 'To...' field. It looks something like this in HTML (thanks Yuriy):
<b>name</b>.surname#domain
I have tried adding a value to the 'limit' parameter:
while (preg_match("/[^\>](".$change.")[^\<]/i", " ".$fulltxt." ", $regs)) {
$fulltxt = preg_replace("/".$regs[1]."/i", "<b>".$regs[1]."</b>", $fulltxt, 1);
}
Supposedly this should solve my problem by replacing only the first occurrence (the name, since the pattern is name, phone number, e-mail and we always search by name). Instead, it makes the search so slow that I get a timeout message from the server. I've tried various solutions but have been out of luck.
Any ideas? Am I doing something wrong?
Thanks.
(*Original heavily edited).
Did I understand you right that something like this happens?
<b>email#domain</b>
Why don't you put the tags into the search results first, and only then apply the "mailto:" anchors to the e-mails? The added <b>'s would be easy to filter out in the pattern in that second step.
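A likely cause of the timeout, offered as an assumption rather than a confirmed diagnosis: with a limit of 1, preg_replace keeps re-wrapping the first occurrence, while preg_match keeps finding a later, still unwrapped one, so the while loop may never end. If only the first occurrence needs highlighting, a single call without the loop could be enough. The search term and sample text below are invented.

```php
<?php
// Hypothetical search term and result text.
$change  = 'john';
$fulltxt = 'John Smith 555-0100 john.smith@example.com';

// preg_quote() keeps characters in the search term from being read as regex syntax.
$pattern = '/' . preg_quote($change, '/') . '/i';

// One call, limit 1: only the first occurrence is wrapped, no loop needed.
$highlighted = preg_replace($pattern, '<b>$0</b>', $fulltxt, 1);

echo $highlighted; // <b>John</b> Smith 555-0100 john.smith@example.com
```

Using $0 in the replacement keeps the original capitalisation of the matched text.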

Replace Specific Full Links Between href=" " Using PHP

I have tried searching through related answers but can't quite find something that is suitable for my specific needs. I have quite a few affiliate links within 1,000s of articles on one of my wordpress sites - which all start with the same url format and sub-domain structure:
http://affiliateprogram.affiliates.com/
However, after the initial url format, the query string appended changes for each individual url in order to send visitors to specific pages on the destination site.
I am looking for something that will scan a string of html code (the article body) for all href links that include the specific domain above and then replace THE WHOLE LINK (whatever the query string appended) with another standard link of my choice.
href="http://affiliateprogram.affiliates.com/?random=query_string&page=destination"
gets replaced with
href="http://www.mylink.com"
I would ideally like to do this via php as I have a basic grasp, but if you have any other suggestions I would appreciate all input.
Thanks in advance.
<?php
$html = 'href="http://affiliateprogram.affiliates.com/?random=query_string&page=destination"';
// Dots are escaped so they match literally; [^"]* also covers a link with nothing after the slash.
echo preg_replace('#http://affiliateprogram\.affiliates\.com/[^"]*#i', 'http://www.mylink.com', $html);
?>
http://ideone.com/qaEEM
Use a regular expression such as:
href="(https?:\/\/affiliateprogram.affiliates.com\/[^"]*)"
$data =<<<EOT
bar
foo
<a name="zz" href="http://affiliateprogram.affiliates.com/?query=random&page=destination&string">baz</a>
EOT;
echo (
preg_replace (
'#href="(https?://affiliateprogram.affiliates.com/[^"]*)"#i',
'href="http://www.mylink.com"',
$data
)
);
output
bar
foo
<a name="zz" href="http://www.mylink.com">baz</a>
$a = '<a class="***" href="http://affiliateprogram.affiliates.com/?random=query_string&page=destination" attr="***">';
$b = preg_replace("/<a([^>]*)href=\"http:\/\/affiliateprogram\.affiliates\.com\/[^\"]*\"([^>]*)>/", "<a\\1href=\"http://www.mylink.com/\"\\2>", $a);
var_dump($b); // <a class="***" href="http://www.mylink.com/" attr="***">
That's quite simple, as you only need a single placeholder for the querystring. .*? would normally do, but you can make it more specific by matching anything that's not a double quote:
$html =
preg_replace('~ href="http://affiliateprogram\.affiliates\.com/[^"]*"~i',
' href="http://www.mylink.com"', $html);
People will probably come around and recommend a long-winded DOMDocument approach, but that's likely overkill for such a task.
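The regex answers above could be wrapped in a small reusable function, sketched here under the assumption that every href uses double quotes. The function name replaceAffiliateHrefs and the sample HTML are hypothetical.

```php
<?php
// Hypothetical helper: swaps every double-quoted href on the given host for $newUrl.
function replaceAffiliateHrefs(string $html, string $host, string $newUrl): string
{
    // preg_quote() escapes the dots in the host so they match literally.
    $pattern = '~href="https?://' . preg_quote($host, '~') . '/[^"]*"~i';

    // A callback avoids "$1"-style sequences in $newUrl being read as backreferences.
    return preg_replace_callback(
        $pattern,
        function () use ($newUrl) {
            return 'href="' . $newUrl . '"';
        },
        $html
    );
}

$html = '<a href="http://affiliateprogram.affiliates.com/?random=query_string&page=destination">deal</a>'
      . ' <a href="http://example.org/page">other</a>';

echo replaceAffiliateHrefs($html, 'affiliateprogram.affiliates.com', 'http://www.mylink.com');
```

Links on other hosts are left untouched, which makes it reasonably safe to run over whole article bodies.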

php extract UK postal code and validate it

I have some text blocks like
John+and+Co-Accountants-Hove-BN31GE-2959519
I need a function to extract the postcode "BN31GE". A text block may not contain a postcode at all, as in the example below, so the function must also check that the extracted text is a valid postcode.
John+and+Co-Accountants-Hove-2959519
The UK Government Data Standard for postcodes is:
((GIR 0AA)|((([A-PR-UWYZ][0-9][0-9]?)|(([A-PR-UWYZ][A-HK-Y][0-9][0-9]?)|(([A-PR-UWYZ][0-9][A-HJKSTUW])|([A-PR-UWYZ][A-HK-Y][0-9][ABEHMNPRVWXY])))) [0-9][ABD-HJLNP-UW-Z]{2}))
Edit: I had the above in some (personal) code with a reference to a now non-existent UK government web page. The appropriate British Standard is BS7666 and information on this is currently available here. That lists a slightly different regex.
The code below extracts a valid UK postcode. $getPostcode holds the postcode if one is found and stays an empty string otherwise.
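A minimal sketch of how the regex quoted above might be applied to the question's text blocks. It assumes the parts are separated by "-" and makes the space inside the postcode optional, since "BN31GE" has none. The function name extractUkPostcode is hypothetical.

```php
<?php
// Hypothetical helper: returns the first part of the block that looks like
// a UK postcode, or null if there is none.
function extractUkPostcode(string $block): ?string
{
    // The pattern quoted above, anchored, with the inner space made optional.
    $pattern = '/^(GIR ?0AA|(?:[A-PR-UWYZ][0-9][0-9]?'
             . '|[A-PR-UWYZ][A-HK-Y][0-9][0-9]?'
             . '|[A-PR-UWYZ][0-9][A-HJKSTUW]'
             . '|[A-PR-UWYZ][A-HK-Y][0-9][ABEHMNPRVWXY])'
             . ' ?[0-9][ABD-HJLNP-UW-Z]{2})$/';

    // Check each "-"-separated part on its own.
    foreach (explode('-', strtoupper($block)) as $part) {
        if (preg_match($pattern, $part)) {
            return $part;
        }
    }

    return null;
}

var_dump(extractUkPostcode('John+and+Co-Accountants-Hove-BN31GE-2959519')); // string(6) "BN31GE"
var_dump(extractUkPostcode('John+and+Co-Accountants-Hove-2959519'));        // NULL
```

This only checks the format; it cannot tell whether a well-formed postcode is actually in use.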
<?php
$getPostcode="";
$str="John+and+Co-Accountants-Hove-BN31GE-2959519";
$getArray = explode("-",$str);
if(is_array($getArray) && count($getArray)>0) {
foreach($getArray as $key=>$val) {
if(preg_match("/^(([A-PR-UW-Z]{1}[A-IK-Y]?)([0-9]?[A-HJKS-UW]?[ABEHMNPRVWXY]?|[0-9]?[0-9]?))\s?([0-9]{1}[ABD-HJLNP-UW-Z]{2})$/i",strtoupper($val),$postcode)) {
$getPostcode = $postcode[0];
}
}
}
print"<pre>";
print_r($getPostcode);
?>
Use a regex, for example with the preg_grep function.
I don't know the format of English postcodes, but you could go with something like:
(-[a-zA-Z0-9]+-)+
This matches
"-Accountants-"
"-BN31GE-"
You can then always take the second value, or you can enhance your regex to match English postcodes exactly, maybe something like
([A-Z0-9]{6})

php search and replace

I am trying to merge database fields into a document (RTF) using PHP,
e.g. if I have a document that starts
Dear Sir,
Customer Name: [customer_name], Date of order: [order_date]
After retrieving the appropriate database record I can use a simple search and replace to insert the database field into the right place.
So far so good.
I would however like to have a little more control over the data before it is replaced. For example I may wish to Title Case it, or convert a delimited string into a list with carriage returns.
I would therefore like to be able to add extra formatting commands to the field to be replaced. e.g.
Dear Sir,
Customer Name: [customer_name, TC], Date of order: [order_date, Y/M/D]
There may be more than one formatting command per field.
Is there a way that I can now search for these strings? The format of the strings is not set in stone, so if I have to change the format then I can.
Any suggestions appreciated.
You could use a templating system like Smarty, that might make your life easier, as you can do {$customer_name|ucwords} or actually put PHP code in your email template.
Try a RegEx and preg_replace_callback:
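function replace_param($matches)
{
    // $matches[1] is the text between the brackets, e.g. "customer_name, TC"
    $parts = array_map('trim', explode(',', $matches[1]));
    // $parts now contains an array like: customer_name, TC, SE, YMD
    $field = array_shift($parts); // field name; what remains are the formatting commands
    $text = $matches[0]; // placeholder: keeps the field as-is until the substitutions are written
    // look up $field, apply the commands in $parts, store the result in $text and:
    return $text;
}
// preg_replace_callback returns the new string, so assign it back
$rtf = preg_replace_callback('/\[([^\]]+)\]/', 'replace_param', $rtf);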
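To make the callback idea concrete, here is a sketch that fills in the substitution step, under stated assumptions: the record is an associative array, "TC" means Title Case, and "Y/M/D" means a year/month/day date. The function name mergeFields and the sample data are invented for illustration.

```php
<?php
// Hypothetical merge function: replaces [field, CMD, ...] placeholders with formatted values.
function mergeFields(string $template, array $record): string
{
    return preg_replace_callback('/\[([^\]]+)\]/', function (array $m) use ($record) {
        // Split "customer_name, TC" into the field name and its commands.
        $parts = array_map('trim', explode(',', $m[1]));
        $field = array_shift($parts);

        if (!array_key_exists($field, $record)) {
            return $m[0]; // leave unknown placeholders untouched
        }

        $value = (string) $record[$field];
        foreach ($parts as $command) {
            if ($command === 'TC') {
                $value = ucwords(strtolower($value));       // Title Case
            } elseif ($command === 'Y/M/D') {
                $value = date('Y/m/d', strtotime($value));  // reformat a date string
            }
        }

        return $value;
    }, $template);
}

$template = 'Customer Name: [customer_name, TC], Date of order: [order_date, Y/M/D]';
$record   = ['customer_name' => 'jANE doe', 'order_date' => '2012-03-09'];

echo mergeFields($template, $record);
// Customer Name: Jane Doe, Date of order: 2012/03/09
```

Because commands are applied in order, a field can carry more than one of them, as the question requires.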
You can use explode on it to separate them into array values.
For Example:
$customer_name = 'customer_name, TC';
$get_fields = explode(',', $customer_name);
foreach($get_fields as $value)
{
$new_val = trim($value);
// Now do whatever you want to these in here.
}
Sorry if I'm not understanding you.
