PHP, Database: Preserving and searching strings with whitespace - php

I have the following scenario:
I am storing user input that MAY contain whitespace. The problem is that when I do a SELECT from the database (MySQL), it does not find a match. If I strip the whitespace, the search works, but the strings look messy if I store them that way.
How can I properly store strings (with whitespace) in the database AND write SELECT statements that will find those stored strings?
Thanks for the help.
UPDATE:
Here is what I am working with now:
$filename = "This is a string with white space";
//being stored in the db like this
$trim_string = preg_replace("/\s+/", "", $filename);
//search db as Thisisastringwithwhitespace
Still finds no match though?

If the user knows the entire string, I would go with Blake's suggestion of using TRIM() in the MySQL query. If you want partial search and want it to be fast, I would create one secondary table for the words and another for the position where each word appears in the strings, something like this:
Word( id, string )
WordInString( id, source_string_id, word_id, position )
The search would be in 2 steps:
1. get the word ids
2. get from WordInString the source strings with the greatest number of word hits, and show those options to the end user.
Hope it helps !
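The two steps above could be sketched in plain PHP like this. It is only an illustration under assumptions: indexWords() and rankByWordHits() are hypothetical helper names, and the ranking runs over an in-memory array rather than the Word / WordInString tables.

```php
<?php
// Hypothetical helper: split a string into (word, position) pairs, i.e. the
// data that would be stored in the Word and WordInString tables.
function indexWords(string $source): array
{
    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($source), -1, PREG_SPLIT_NO_EMPTY);
    $rows = [];
    foreach ($words as $position => $word) {
        $rows[] = ['word' => strtolower($word), 'position' => $position];
    }
    return $rows;
}

// Hypothetical helper: count how many distinct query words each source
// string contains and return source_id => hit count, best match first.
function rankByWordHits(array $sources, string $query): array
{
    $queryWords = array_unique(array_column(indexWords($query), 'word'));
    $hits = [];
    foreach ($sources as $id => $text) {
        $sourceWords = array_column(indexWords($text), 'word');
        $count = count(array_intersect($queryWords, $sourceWords));
        if ($count > 0) {
            $hits[$id] = $count;
        }
    }
    arsort($hits); // highest hit count first, keys preserved
    return $hits;
}
```

In a real setup the counting would be done by a GROUP BY over WordInString; the PHP version just shows the idea.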

Related

PHP How to check if a String is contained in a text from the database using php

I'm creating a paraphrasing system, where a user inputs text and the system paraphrases for them.
My database looks like this:
KeyWord: dainty
Synonyms1: choice; delicious; tasty; juicy; luscious; palatable; savoury
Synonyms2: ethereal; beautiful; fragile; charming; petite; frail; elegant
where KeyWord (varchar), Synonyms1 (text), and Synonyms2 (text) are database columns. The example above is one row of a database with 3 fields and their values.
This is how it works: if the system finds, for example, a word like tasty, it can be replaced by any of the semicolon-separated words from either Synonyms1 or Synonyms2, or by the keyword, because they are all synonyms.
Let me explain how the word search works. The system first searches for the word in the KeyWord column; if the word is not found, it goes further and searches for the word in the Synonyms1 column, and so on.
My problem is checking for the user's specific word in the Synonyms1 or Synonyms2 columns. When I use the LIKE clause, the generic way of searching the database, the system does not search for a full word; instead, it searches for characters. For example, let's assume the writer's text is: "Benson has an ice cube". The system assumes that ice was found in choice. I don't want that; I want to search for a full word.
If anyone has understood me, please help to solve this.
If I understand your question, you want to search for ice in columns Synonyms1 and Synonyms2 but make sure you do not inadvertently match a word such as choice.
If you have ever read or heard anything on the subject of database normalization, you will realize that your database does not even meet the requirements for 1NF (first normal form), because it has columns that consist of repeating values, which, as you have found out, makes searching inefficient and difficult. But let's move on:
A synonym column might just contain one word, so it might look like:
ethereal
Or:
ethereal; beautiful; fragile; charming; petite; frail; elegant
Thus the word you are looking for might be:
the entire column value
preceded by nothing and followed by a ;
preceded by a space and followed by a ;
preceded by a space and followed by nothing
So if your version of MySQL does not support regular expressions, and you are looking for, say, the word ice in column Synonyms2, the WHERE clause should be:
WHERE (
Synonyms2 = 'ice'
OR
Synonyms2 like 'ice;%'
OR
Synonyms2 like '% ice;%'
OR
Synonyms2 like '% ice'
)
If you are running MySQL 8+, then:
WHERE regexp_like(Synonyms2, '( |^)ice(;|$)')
This states that ice must be preceded by either a space or the start of the string and followed by either a ; or the end of the string.
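If you would rather do the whole-word check on the PHP side after fetching a row, a minimal sketch could look like this. synonymListContains() is a hypothetical helper name, and lowercasing both sides is my own assumption about how you want to compare words.

```php
<?php
// Hypothetical helper: true only when $word is a complete entry in a
// semicolon-separated list such as the Synonyms1 / Synonyms2 columns.
function synonymListContains(string $list, string $word): bool
{
    // Split on ';' and strip the surrounding spaces from each entry.
    $entries = array_map('trim', explode(';', $list));
    // Compare case-insensitively (an assumption, drop strtolower if unwanted).
    return in_array(strtolower($word), array_map('strtolower', $entries), true);
}
```

With the sample row, 'ice' is not reported as found in "choice; delicious; tasty", while 'tasty' is.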

How to extract one value from imploded array in MySQL row

I'm using implode to insert few values into one row in MySQL database.
implode(' ', $_POST['tag']);
Assuming that I have a table named product with a column named tags, holding 3 different values that were inserted like this:
usb adapter charger
I have tried using the LIKE operator (%), but that didn't work.
$sql = "SELECT * FROM product WHERE tags='%usb%'";
How can I extract only one value from the imploded array using WHERE in mysql query?
I agree with the comments about redesigning the database. At first read, it seems that LIKE would definitely get the result you want, but after reading @Patrick Q's pan - panther example, it makes a lot of sense that LIKE is not really a good solution. There are ways to get exactly the tag string you're looking for, but they may hurt performance and make the query longer and more complex. The following demonstrates what such a query would look like with your current tags data:
MySQL query:
SELECT tags,
SUBSTRING_INDEX(SUBSTRING_INDEX(tags,' ',FIND_IN_SET('usb',REPLACE(tags,' ',','))),' ',-1) v
FROM mytable
HAVING v = 'usb';
As you can see, several functions are needed just to get the exact string from the data cell. Since your example data is separated by spaces and FIND_IN_SET identifies values separated by commas, REPLACE is applied to the tags column first to replace spaces with commas. SUBSTRING_INDEX is then used twice to extract the string at the position returned by FIND_IN_SET. Finally, HAVING keeps only the tag you're looking for.
Further demo here : https://www.db-fiddle.com/f/joDa7MNcQL2RakTgBa7qBM/3
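For comparison, here is a rough PHP-side sketch of the same exact-tag idea. TAG_QUERY and hasTag() are names I made up for illustration; the query text is a shorter variant that only filters rows with FIND_IN_SET, and it is meant to be run through a prepared statement with the tag bound to the ? placeholder.

```php
<?php
// Assumed query text (not from the answer above): FIND_IN_SET needs commas,
// so the spaces in tags are replaced first. Bind the tag to the placeholder.
const TAG_QUERY = "SELECT * FROM product WHERE FIND_IN_SET(?, REPLACE(tags, ' ', ',')) > 0";

// Hypothetical helper: exact-match check on a space-separated tags string,
// useful for verifying rows that are already loaded in PHP.
function hasTag(string $tags, string $tag): bool
{
    // Same split that implode(' ', ...) produced, tolerant of extra spaces.
    $list = preg_split('/\s+/', trim($tags), -1, PREG_SPLIT_NO_EMPTY);
    return in_array($tag, $list, true);
}
```

Unlike LIKE '%pan%', hasTag('panther', 'pan') is false, so the pan - panther problem does not occur.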

php Insert to db where another column contains some words

I have a new question because I couldn't find an answer anywhere.
I have a db table which contains 4 columns. My bot inserts an array into one column. Now I have to fill the other columns.
My filled column contains site links. Example: www.dizipub.com/person-of-interest-1-sezon-2-bolum-izle
I need to take the "person-of-interest" part and write it to another column as something like "Person of Interest", and also "1-sezon-2-bolum" as "Sezon 1 - Bölüm 1".
I couldn't find how to do it with PHP rather than SQL. I need to do it with the bot. Can someone help me with this, please?
There is a column named bolumlink where I put the links. As I said, I need to take some words from these links. For instance:
The dizi column needs to be filled with "Pretty Little Liars" in the first 9 rows.
It can be done with an SQL UPDATE using LIKE, which lets you select rows by pattern using wildcards:
% matches any number of characters, even zero characters.
_ matches exactly one character.
update your_table set dizi = 'Pretty Little Liars' where bolumlink like '%pretty-little-liars%'
NOTE:
Updating your database using LIKE without a LIMIT or conditions on unique columns can be dangerous. This code might affect the whole table if an empty string is passed.
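Since the question asks for a PHP-side approach, here is a rough sketch of two pieces the bot could use. Both function names are hypothetical, the 'bolum' array key is an assumed column name (only dizi and bolumlink are named in the question), and the slug layout is assumed to always be "<title>-<n>-sezon-<n>-bolum".

```php
<?php
// Hypothetical helper: derive the show title and episode label from a link.
function parseEpisodeLink(string $link): ?array
{
    // Take the last path segment, e.g. "person-of-interest-1-sezon-2-bolum-izle".
    $slug = basename(rtrim($link, '/'));
    if (!preg_match('/^(.+)-(\d+)-sezon-(\d+)-bolum/', $slug, $m)) {
        return null; // link does not follow the assumed layout
    }
    return [
        'dizi'  => ucwords(str_replace('-', ' ', $m[1])),
        'bolum' => sprintf('Sezon %d - Bölüm %d', $m[2], $m[3]), // assumed column
    ];
}

// Hypothetical helper: build a "contains" pattern for LIKE, refusing the
// empty input that would otherwise match every row.
function likeContains(string $term): string
{
    if (trim($term) === '') {
        throw new InvalidArgumentException('Refusing to build a LIKE pattern from an empty term.');
    }
    // Escape backslash and the LIKE wildcards so they are matched literally.
    return '%' . addcslashes($term, '\\%_') . '%';
}
```

Note that ucwords() capitalizes every word, so the title comes out as "Person Of Interest"; small words like "of" would need extra handling if that matters.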

How do I find records when data entry has been inconsistent?

A group of people have been inconsistently entering data for a while.
Some people will enter this:
101mxeGte - TS 200-10
And other people will enter this
101mxeGte-TS-200-10
The sad thing is, those are supposed to be identical records.
They will also search inconsistently. If a record was entered one way, some people will search the other way.
Now, I know all about how you can fix data entry for the future, but that's NOT what I am asking about. I want to know how it is possible to:
Leave the data alone, but...
Search for the right thing.
Am I asking for the impossible here?
The best thing I found so far was a suggestion to simply muck about with the existing data, using the REPLACE function in MySQL.
I am uncomfortable with this option, as it means it will certainly actively piss off half of the users. The unfocused angst of all is less than the active ire of half.
The problem is that it has to go both ways:
Entering spaces in the query has to find both space and not-space entries,
and NOT entering spaces ALSO has to find both space and not-space entries.
Thanks for any help you can offer!
The "ideal" solution is pretty straightforward:
1. Decide what is the canonical way of representing a record
2. When someone saves a record, canonicalize it before saving
3. When someone searches for a record, canonicalize the input before searching for it
You could also write a small program to convert all existing data to the canonical form (you will have the code for it anyway, as "canonicalize" in steps 2 and 3 require that you write code that does so).
Edit: some specific information on how to canonicalize
With the sample data you give, the algorithm might be:
1. Replace all spaces with hyphens
2. Replace all runs of one or more hyphens with a single hyphen (a regex would be easiest for this -- actually, a regex can do both steps in one go)
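As a minimal sketch of that algorithm in PHP (canonicalize() is a hypothetical name, and trimming the ends first is my own addition):

```php
<?php
// Hypothetical helper: turn every run of whitespace and/or hyphens into a
// single hyphen, which covers both steps with one regex.
function canonicalize(string $value): string
{
    return preg_replace('/[\s-]+/', '-', trim($value));
}
```

Both sample entries, "101mxeGte - TS 200-10" and "101mxeGte-TS-200-10", come out as the same string, so the same function can be applied when saving and when searching.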
Is there any practical problem with this approach?
Trim whitespaces from BOTH the existing data and the input of the search. That way the intended record(s) will always be returned. Hope your data size is small, though, because it's going to perform pretty poorly.
Edit: by "existing data" I meant "the query of existing data". My answer was based on assumption that the actual data could not be touched (which might not be correct).
If it were up to me, I'd have the data in the database updated with REPLACE, and on future searches remove all spaces from the input before comparing it with the given row.
Presumably your users enter the search terms (or record details, when creating a record) in an HTML form, which then goes to a PHP script. It looks like your data can always be written in a way that contains no spaces, so why don't you do this:
Run a query that strips spaces from the existing data
Add code in the PHP script(s) that receives the form(s), so that it strips spaces from submitted data - whether that data is to be used for search or for writing new data.
Edit: I guess you would also need to change some spaces to hyphens. Shouldn't be too hard to write logic to accomplish that.
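Something like this.
pseudo code (assumes $pdo is an existing PDO connection):
$myinput = '101mxeGte-TS-200-10';
// Strip spaces and hyphens from both the column and the input, then compare.
// The input is bound as a parameter instead of being pasted into the SQL.
$stmt = $pdo->prepare(" SELECT * FROM table1
WHERE REPLACE(REPLACE(f1, ' ', ''),'-','')
= REPLACE(REPLACE(?, ' ', ''),'-','') ");
$stmt->execute([$myinput]);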
Alternatively you might write your own function to trim the data so it can be compared.
DELIMITER $$
-- VARCHAR needs a length in MySQL; 255 is an assumed value
CREATE FUNCTION myTrim(AStr VARCHAR(255)) RETURNS VARCHAR(255) DETERMINISTIC
BEGIN
DECLARE Result VARCHAR(255);
SET Result = REPLACE(AStr, ' ','');
SET Result = ......
.....
RETURN Result;
END$$
DELIMITER ;
And then use this in your select
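// $pdo is assumed to be an existing PDO connection; the input is bound as a parameter
$stmt = $pdo->prepare(" SELECT * FROM table1
WHERE myTrim(f1) = myTrim(?) ");
$stmt->execute([$myinput]);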
Have you ever heard of SQL's LIKE?
http://dev.mysql.com/doc/refman/4.1/en/string-comparison-functions.html
There's also regex:
http://dev.mysql.com/doc/refman/4.1/en/regexp.html#operator_regexp
101mxeGte - TS 200-10
101mxeGte-TS-200-10
How about this?
SELECT 'justalnums' REGEXP '101mxeGte[[:blank:]]*(\-[[:blank:]]*)?TS[[:blank:]-]*200[[:blank:]-]*10'
Digits can be represented by [0-9] and letters as [a-z] or [A-Z] or [a-zA-Z].
Append a + to match one or more of them. Parentheses allow you to group and even capture what is inside them and reuse it later in a replace or something else.
RLIKE is the same as REGEXP.
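If you don't want to hand-write such a pattern for every search, it could be generated from whatever the user typed. This is only a sketch: separatorTolerantPattern() is a hypothetical name, the ^ and $ anchors are my addition, and preg_quote() escapes for PHP's PCRE, which I am assuming is close enough to MySQL's regex syntax for plain alphanumeric pieces. It is also a bit looser than the pattern above, since it lets any mix of blanks and hyphens (or none) sit between the pieces.

```php
<?php
// Hypothetical helper: build a regex that matches the term no matter which
// mix of blanks and hyphens separates its pieces.
function separatorTolerantPattern(string $term): string
{
    // Split the user's input on runs of whitespace and/or hyphens.
    $pieces = preg_split('/[\s-]+/', trim($term), -1, PREG_SPLIT_NO_EMPTY);
    // Escape each piece so it is matched literally.
    $quoted = array_map(static fn(string $p): string => preg_quote($p, '/'), $pieces);
    // Allow any run of blanks/hyphens (including none) between pieces.
    return '^' . implode('[[:blank:]-]*', $quoted) . '$';
}
```

The resulting string would then be bound as the parameter of a WHERE f1 REGEXP ? condition, so both ways of typing the search find both ways of storing the record.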

How to replace all instances of a particular value in a mysql database with another?

I'm looking for a MySQL equivalent of what str_replace is for PHP. I want to replace all instances of one word with another, I want to run a query that will replace all "apples" with "oranges".
The reason why:
UPDATE fruits SET name='oranges' WHERE name='apples';
isn't going to work for my situation is that I often have multiple words in a table row separated by commas, like "apples, pears, pineapples". In this case I want just apples to be replaced by oranges, and pears and pineapples to stay intact.
Is there any way to do this?
You have a database design problem, as Ignacio has pointed out. Instead of storing separate pieces of information in a single column, that column should become a separate table with one piece of information per row. For instance, if that "fruits" field is in a table called "hats", you would have one table for "hats" with a column "hat_id" but no information about fruits, and a second table "hat_fruits" with two columns, "hat_id" and "fruit_name". In your example, the given hat would have three rows in "hat_fruits", one for each fruit.
Once you implement this design (if you have control of the database design), you can go back to using the simple UPDATE command you originally had. In addition, you will be able to index by fruit type, search more easily, use less disk space, validate fruit names, and not have any arbitrary limit on the number of fruits that fit into the database.
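To migrate existing rows into that layout, the comma-separated value has to be split into one row per fruit. A small sketch, where toHatFruitRows() is a hypothetical helper and the array keys mirror the "hat_fruits" columns described above:

```php
<?php
// Hypothetical helper: turn one comma-separated fruits value into the rows
// that would be inserted into the hat_fruits table.
function toHatFruitRows(int $hatId, string $fruits): array
{
    $rows = [];
    foreach (explode(',', $fruits) as $fruit) {
        $fruit = trim($fruit);
        if ($fruit !== '') { // skip empty entries such as a trailing comma
            $rows[] = ['hat_id' => $hatId, 'fruit_name' => $fruit];
        }
    }
    return $rows;
}
```

For the example value "apples, pears, pineapples" this yields three rows, one per fruit.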
That said, if you absolutely cannot fix the database structure, you might try something like this:
REPLACE(REPLACE(CONCAT(',', fruits, ','), ', ', ','), ',apples,', ',oranges,')
This monstrosity first wraps the fruits field in commas so that it begins and ends with one, then removes the space after each comma. This should give you a string in which fruit names are unambiguously delimited by commas. Finally, it replaces ,apples, (note the delimiters) with ,oranges,.
After that, of course, you ought to strip off the beginning and ending commas and put back the spaces after the commas (that's left as an exercise for the reader).
Update: Okay, I couldn't resist looking it up:
REPLACE(TRIM(',' FROM REPLACE(REPLACE(CONCAT(',', fruits, ','), ', ', ','), ',apples,', ',oranges,')), ',', ', ')
Note that this isn't tested and I'm not a MySQL expert anyway — I don't know if MySQL has function nesting issues or anything like that.
PS: Don't tell anyone I was the one who showed you this!
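For what it's worth, the same whole-item replacement is much easier to read on the PHP side. A sketch, assuming the list is always comma-separated (replaceListItem() is a hypothetical helper name):

```php
<?php
// Hypothetical helper: replace only entries that are exactly $from in a
// comma-separated list, leaving longer words that contain $from untouched.
function replaceListItem(string $list, string $from, string $to): string
{
    // Split on commas and strip the spaces around each entry.
    $items = array_map('trim', explode(',', $list));
    $items = array_map(
        static fn(string $item): string => $item === $from ? $to : $item,
        $items
    );
    // Put the list back together in the original "a, b, c" style.
    return implode(', ', $items);
}
```

A plain str_replace('apples', 'oranges', ...) on the same string would also rewrite pineapples, which is exactly what the comma delimiters in the SQL version guard against.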
Not reliably. There is REPLACE(), but that will only work until you decide to add pineapples to your menu.
Putting your database in First Normal Form and then using UPDATE is the only reliable solution.
I think you want to use REPLACE():
REPLACE(str,from_str,to_str)
Returns the string str with all occurrences of the string from_str replaced by the string to_str. REPLACE() performs a case-sensitive match when searching for from_str.
The statement below will replace all occurrences of 'apples' with 'oranges' in the 'Name' column for all the rows in the 'Fruits' table.
UPDATE fruits SET Name=REPLACE(Name,'apples','oranges')
