How can I find a specific word in an external page using PHP?
(DOM, preg_match, or something else?)
For example, the source code of foo.com contains:
span name="abcd"
I want to check in PHP whether the word abcd appears in foo.com.
if(preg_match('/span\s+name\=\"abcd\"/i', $str)) echo 'exists!';
To check whether a string of characters exists:
<?php
$term = 'abcd';
if ( preg_match("/$term/", $str) ) {
// yes it does
}
?>
To check whether that string exists as a word in its own right (i.e., is not in the middle of a larger word), use word boundary matchers:
<?php
$term = 'abcd';
if ( preg_match("/\b$term\b/", $str) ) {
// yes it does
}
?>
For a case-insensitive search, add the i flag after the last slash in the regex:
<?php
$term = 'abcd';
if ( preg_match("/\b$term\b/i", $str) ) {
// yes it does
}
?>
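The snippets above interpolate $term straight into the pattern, which misbehaves if the term contains regex metacharacters such as "." or "/". Below is a minimal sketch that escapes the term with preg_quote() first. The helper name containsWord and its parameters are made up for illustration.

```php
<?php
// Hypothetical helper: true if $term occurs in $text as a whole word.
function containsWord(string $text, string $term, bool $ignoreCase = false): bool
{
    // Escape regex metacharacters (and the "/" delimiter) in the search term.
    $quoted = preg_quote($term, '/');
    $flags = $ignoreCase ? 'i' : '';
    return preg_match('/\b' . $quoted . '\b/' . $flags, $text) === 1;
}

var_dump(containsWord('span name="abcd"', 'abcd'));       // bool(true)
var_dump(containsWord('span name="xabcdx"', 'abcd'));     // bool(false)
var_dump(containsWord('span name="ABCD"', 'abcd', true)); // bool(true)
```

Note that \b only behaves as expected when the term starts and ends with a word character.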
$v = file_get_contents("http://foo.com");
echo substr_count($v, 'abcd'); // number of occurrences
// or count only occurrences surrounded by spaces
echo substr_count($v, ' abcd ');
Here are a few other ways to find a specific word:
<?php
$str = 'span name="abcd"';
if (strstr($str, "abcd")) echo "Found: strstr\n";
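if (strpos($str, "abcd") !== false) echo "Found: strpos\n"; // strpos() returns 0 for a match at the start
if (preg_match("/abcd/", $str)) echo "Found: preg_match\n"; // ereg() was removed in PHP 7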
if (substr_count($str, 'abcd')) echo "Found: substr_count\n";
?>
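One thing to watch with strpos(): it returns the integer position of the match, and a match at the very start of the string is position 0, which PHP treats as false in a plain if. The sketch below shows the difference using a made-up sample string. str_contains() exists only in PHP 8.0 and later, so it is guarded here.

```php
<?php
$str = 'abcd is at the very start';

// strpos() returns int 0 here, which is falsy, so a bare if () would miss it.
$pos = strpos($str, 'abcd');
var_dump($pos);           // int(0)
var_dump($pos !== false); // bool(true)

// PHP 8.0+ offers str_contains(), which returns a plain boolean.
if (function_exists('str_contains')) {
    var_dump(str_contains($str, 'abcd')); // bool(true)
}
```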
$name = 'foo.php';
// read the whole file into a string
$contents = file_get_contents($name);
// escape the search word for use inside a regular expression
$pattern = preg_quote('abcd', '/');
// finalise the regular expression, matching the whole line
$pattern = "/^.*$pattern.*\$/m";
// search, and store all matching occurrences in $matches
if(preg_match_all($pattern, $contents, $matches)){
echo implode("\n", $matches[0]);
}
else{
echo "word not found";
}
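The same line-matching idea can be tried without touching the file system. This is a rough sketch that works on a string already in memory; the helper name linesContaining and the sample text are assumptions for illustration.

```php
<?php
// Hypothetical helper: return every line of $contents that contains $word.
function linesContaining(string $contents, string $word): array
{
    $quoted = preg_quote($word, '/');
    // "m" makes ^ and $ match at line boundaries; "." never crosses a newline.
    preg_match_all('/^.*' . $quoted . '.*$/m', $contents, $matches);
    return $matches[0];
}

$sample = "first abcd line\nnothing here\nabcd again";
print_r(linesContaining($sample, 'abcd'));
```

With the sample above, the returned array holds the first and third lines.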
I am using a WordPress plugin named Acronyms (https://wordpress.org/plugins/acronyms/). This plugin replaces acronyms with their descriptions, using PHP's preg_replace function.
The issue is that it also replaces acronyms inside <pre> tags, which I use to present source code.
Could you modify this expression so that it won't replace acronyms contained inside <pre> tags (not only directly inside, but at any nesting depth)? Is that possible?
The PHP code is:
$text = preg_replace(
"|(?!<[^<>]*?)(?<![?.&])\b$acronym\b(?!:)(?![^<>]*?>)|msU"
, "<acronym title=\"$fulltext\">$acronym</acronym>"
, $text
);
You can use a PCRE SKIP/FAIL regex trick (also works in PHP) to tell the regex engine to only match something if it is not inside some delimiters:
(?s)<pre[^<]*>.*?<\/pre>(*SKIP)(*F)|\b$acronym\b
This means: skip all substrings starting with <pre> and ending with </pre>, and only then match $acronym as a whole word.
See demo on regex101.com
Here is a sample PHP demo:
<?php
$acronym = "ASCII";
$fulltext = "American Standard Code for Information Interchange";
$re = "/(?s)<pre[^<]*>.*?<\\/pre>(*SKIP)(*F)|\\b$acronym\\b/";
$str = "<pre>ASCII\nSometext\nMoretext</pre>More text \nASCII\nMore text<pre>More\nlines\nASCII\nlines</pre>";
$subst = "<acronym title=\"$fulltext\">$acronym</acronym>";
$result = preg_replace($re, $subst, $str);
echo $result;
Output:
<pre>ASCII
Sometext
Moretext</pre>More text 
<acronym title="American Standard Code for Information Interchange">ASCII</acronym>
More text<pre>More
lines
ASCII
lines</pre>
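If the acronym or its description can contain special characters, the SKIP/FAIL pattern can be wrapped in a small function that escapes both. This is only a sketch: the name wrapAcronym is hypothetical, and unlike the plugin's original pattern it does not protect text inside tag attributes.

```php
<?php
// Hypothetical helper: wrap whole-word occurrences of $acronym in <acronym>
// tags, leaving anything between <pre ...> and </pre> untouched.
function wrapAcronym(string $html, string $acronym, string $fullText): string
{
    $word = preg_quote($acronym, '~');
    // Left branch: consume a whole <pre> block, then force a failure and skip past it.
    // Right branch: the acronym as a whole word.
    $pattern = '~<pre\b[^>]*>.*?</pre>(*SKIP)(*F)|\b' . $word . '\b~s';
    $tag = '<acronym title="' . htmlspecialchars($fullText, ENT_QUOTES) . '">'
         . htmlspecialchars($acronym) . '</acronym>';
    // A callback avoids "$1"-style sequences in $tag being read as backreferences.
    return preg_replace_callback($pattern, function (array $m) use ($tag) {
        return $tag;
    }, $html);
}

echo wrapAcronym(
    '<pre>ASCII</pre> uses ASCII',
    'ASCII',
    'American Standard Code for Information Interchange'
);
```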
It is also possible to use preg_split to keep the code blocks as separate parts, replace only the non-code parts, and then join everything back into a complete string:
function replace($s) {
return str_replace('"', '&quot;', $s); // placeholder: do something with `$s`
}
$text = 'Your text goes here...';
$parts = preg_split('#(<\/?[-:\w]+(?:\s[^<>]+?)?>)#', $text, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE); // -1 means no limit; passing null is deprecated since PHP 8.1
$text = "";
$x = 0;
foreach ($parts as $v) {
if (trim($v) === "") {
$text .= $v;
continue;
}
if ($v[0] === '<' && substr($v, -1) === '>') {
if (preg_match('#^<(\/)?(?:code|pre)(?:\s[^<>]+?)?>$#', $v, $m)) {
$x = isset($m[1]) && $m[1] === '/' ? 0 : 1;
}
$text .= $v; // this is a HTML tag…
} else {
$text .= !$x ? replace($v) : $v; // process or skip…
}
}
echo $text; // the processed text (use return instead if this code is placed inside a function)
Taken from here.
How can I solve this problem?
Write a PHP program that finds the word ending with a given suffix in a text.
The suffix is separated from the text by a pipe.
For example: suffix|SOME_TEXT;
input: text|lorem ips llfaa Loremipsumtext.
output: Loremipsumtext
This is my code, but the logic may be wrong:
$mystring = fgets(STDIN);
$find = explode('|', $mystring);
$pos = strpos($find, $mystring);
if ($pos === false) {
echo "The string '$find' was not found in the string '$mystring'.";
}
else {
echo "The string '$find' was found in the string '$mystring',";
echo " and exists at position $pos.";
}
explode() returns an array, so you need to use $find[0] for the suffix, and $find[1] for the text. So it should be:
$suffix = $find[0];
$text = $find[1];
$pos = strpos($text, $suffix);
if ($pos === false) {
echo "The string '$suffix' was not found in '$text'.";
} else {
echo "The string '$suffix' was found in '$text', ";
echo " and exists at position $pos.";
}
However, this returns the position of the suffix, not the word containing it. It also doesn't check that the suffix is at the end of the word; it will find it anywhere in the word. If you want to match words rather than just strings, a regular expression is a better method.
$suffix = $find[0];
$regexp = '/\b[a-z]*' . preg_quote($suffix, '/') . '\b/i'; // escape the suffix so it is matched literally
$text = $find[1];
$found = preg_match($regexp, $text, $match);
if ($found) {
echo "The suffix '$suffix' was found in '$text',";
echo " and exists in the word '$match[0]'.";
} else {
echo "The suffix '$suffix' was not found in '$text'.";
}
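To collect every matching word rather than only the first, preg_match_all() can be used in the same way. The sketch below assumes the input has already been split on the pipe. The helper name wordsEndingWith is made up, and it uses \w* instead of [a-z]* so that digits and underscores also count as part of a word.

```php
<?php
// Hypothetical helper: all words in $text that end with $suffix (case-insensitive).
function wordsEndingWith(string $suffix, string $text): array
{
    $pattern = '/\b\w*' . preg_quote($suffix, '/') . '\b/i';
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

// fgets() keeps the trailing newline, so trim the line before splitting it.
$line = trim("text|lorem ips llfaa Loremipsumtext.\n");
[$suffix, $text] = explode('|', $line, 2);
print_r(wordsEndingWith($suffix, $text));
```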
Objective: strings with ' should match the string without it.
Example:
$first_string = "alex ern o'brian";
$second_string = "alex-ern o brian";
$pattern = array("/(-|\.| )/", "/(')/");
$replace = array(' ', '(\s|)');
$first_string = preg_replace($pattern, $replace, $first_string);
$second_string = preg_replace($pattern, $replace, $second_string);
$first_string_split = preg_split("/(-|\.| )/", $first_string);
$first_string_split[] = $first_string;
$second_string_split = preg_split("/(-|\.| )/", $second_string);
$second_string_split[] = $second_string;
$first_string = array_slice($first_string_split, -1)[0];
$second_string = array_slice($second_string_split, -1)[0];
if(in_array($first_string, $second_string_split) || in_array($second_string, $first_string_split))
{
echo 'true';
} else {
echo 'false';
}
I think this is what you are looking for.
Solution 1: try this code snippet.
Regex: (\s|) matches either a whitespace character or nothing.
<?php
ini_set('display_errors', 1);
$string = "o'brian";
$string=str_replace("'", "(\s|)",$string);
$list = array("o'neal", "o brian", "obrian");
$result=array();
foreach($list as $value)
{
if(preg_match("/$string/", $value))
{
$result[]=$value;
}
}
print_r($result);
Solution 2:
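Regex: [a-z]+ matches runs of one or more letters from a to z.
$string="o'brian";
preg_match_all("/[a-z]+/", $string, $matches); // $matches[0] is array("o", "brian")
$string1="o brian";
$string2="obrian";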
if(preg_match("/".implode(" ", $matches[0])."/", $string1))
{
echo "matched";
}
if( preg_match("/".implode("", $matches[0])."/", $string2))
{
echo "matched";
}
I'm not sure if I got your question right, but this should do it:
(?<=\w)'(?=\w)
It matches every ' character that is both preceded and followed by a word character. The word character \w is equal to [a-zA-Z0-9_].
Here is a live example to test the regex
Here is a live PHP example
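One possible way to put that lookaround regex to work, sketched under the assumption that a whole name should match whether the apostrophe is present, replaced by whitespace, or dropped. The helper name apostropheTolerantPattern is hypothetical.

```php
<?php
// Hypothetical helper: build a regex in which every apostrophe sitting between
// two word characters may also appear as whitespace or be missing entirely.
function apostropheTolerantPattern(string $name): string
{
    // Split on the apostrophes matched by the lookaround regex.
    $parts = preg_split("/(?<=\\w)'(?=\\w)/", $name);
    $quoted = array_map(function (string $part): string {
        return preg_quote($part, '/');
    }, $parts);
    // Rejoin the pieces with "an apostrophe, a whitespace character, or nothing".
    return '/^' . implode("['\\s]?", $quoted) . '$/i';
}

$pattern = apostropheTolerantPattern("o'brian");
foreach (["o'neal", "o brian", "obrian", "o'brian"] as $candidate) {
    echo $candidate, ': ', preg_match($pattern, $candidate) ? 'match' : 'no match', "\n";
}
```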
I need to extract information from a string: the part that starts after 'type-' and ends before '-id'.
IDlocationTagID-type-area-id-492
That is the string. I need to extract the values area and 492 from it:
after 'type-' and before '-id', and after 'id-'.
You can use preg_match, for example:
preg_match("/type-(\w+)-id-(\d+)/", $input_line, $output_array);
To test the pattern, you can use this service:
http://www.phpliveregex.com/
P.S. If preg_match feels too heavy, there is an alternative solution:
$str = 'IDlocationTagID-type-area-id-492';
$itr = new ArrayIterator(explode('-', $str));
foreach($itr as $key => $value) {
if($value === 'type') {
$itr->next();
var_dump($itr->current());
}
if($value === 'id') {
$itr->next();
var_dump($itr->current());
}
}
This does what you want using two explode calls.
$str = 'IDlocationTagID-type-area-id-492';
echo explode("-id", explode("type-", $str)[1])[0]; //area
echo trim(explode("-id", explode("type-", $str)[1])[1], '-'); //492
A slightly simpler way:
echo explode("type-", explode("-id-", $str)[0])[1]; // area
echo explode("-id-", $str)[1]; // 492
Using Regular Expression:
preg_match("/type-(.*)-id-(.*)/", $str, $output_array);
print_r($output_array);
echo $area = $output_array[1]; // area
echo $fnt = $output_array[2]; // 492
You can use explode to get the values:
$a = "IDlocationTagID-type-area-id-492";
$data = explode("-",$a);
echo "Area ".$data[2]." Id ".$data[4];
$matches = null;
$returnValue = preg_match('/type-(.*?)-id/', $yourString, $matches);
echo($matches[1]);
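The regex answers above can be folded into one function that also handles the no-match case. This is a sketch using named capture groups. The function name parseLocationTag and the returned array shape are assumptions, not part of the original question.

```php
<?php
// Hypothetical helper: pull the "type" and "id" values out of a tag string,
// or return null when the string does not have the expected shape.
function parseLocationTag(string $input): ?array
{
    // [^-]+ stops at the next hyphen, so "type" cannot swallow "-id-...".
    if (preg_match('/type-(?<type>[^-]+)-id-(?<id>\d+)/', $input, $m) !== 1) {
        return null;
    }
    return ['type' => $m['type'], 'id' => (int) $m['id']];
}

var_dump(parseLocationTag('IDlocationTagID-type-area-id-492'));
var_dump(parseLocationTag('no tag here')); // NULL
```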
I have the following code in PHP, which searches a file for lines containing a string:
<?php
echo "Results for: ";
echo ($_POST['query']);
echo "<br><br>";
$file = 'completed.db';
$searchfor = ($_POST['query']);
header('Content-Type: text/plain');
$contents = file_get_contents($file);
$pattern = preg_quote($searchfor, '/');
$pattern = "/^.*$pattern.*\$/m";
if(preg_match_all($pattern, $contents, $matches)){
echo "Found matches:\n";
echo implode("\n", $matches[0]);
}
else{
echo "No matches found";
}
?>
The line echo implode("\n", $matches[0]); echoes the array items, separated by spaces. If I wanted to separate the items with a different string, say $entry = '<br>', how would I do it?
For example, if $matches was
one
two
three
Then, the command should echo:
one<br>two<br>three<br>
Just add the break tag to the separator string:
echo implode("\n<br>", $matches[0]);
And echo one more at the end, since implode only puts the separator between items:
echo "<br>";
The following code simply uses implode with a different first argument, but also adds the separator to the end (as per the specification):
$entry = '<br>';
echo implode($entry, $matches[0]).$entry;
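If the matched lines are going to be rendered as HTML, it may also be worth escaping each item before appending the separator. This is a minimal sketch under that assumption. The helper name joinWithTrailing is made up.

```php
<?php
// Hypothetical helper: append $separator after every item (including the last),
// escaping each item so it is safe to render as HTML.
function joinWithTrailing(array $items, string $separator): string
{
    $out = '';
    foreach ($items as $item) {
        $out .= htmlspecialchars($item) . $separator;
    }
    return $out;
}

echo joinWithTrailing(['one', 'two', 'three'], '<br>'); // one<br>two<br>three<br>
```

Unlike implode() followed by a trailing separator, this returns an empty string for an empty array.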