    • CommentAuthormichael-e
    • CommentTimeJan 3rd 2008 edited
     # 1
    I have a strange problem in a live installation of Vanilla (which is not public). I hope someone else has had this issue before and can point me in the right direction.

    The forum in question is running perfectly with UTF-8 (as explained by Max_B - thank you for that!). To investigate a small but strange problem I now have three installations, namely the live forum (running on my client's webspace) and two test installations (locally, on my OS X machine), the latter being copies of the files and the database:

    (1) live forum running on:
    * Apache 2.0
    * PHP 5.2.3
    * MySQL 5.0.32
    (2) test forum #1 running on:
    * Apache 1.3
    * PHP 5.2.2
    * MySQL 5.0.24a
    (3) test forum #2 running on:
    * Apache 2.0
    * PHP 5.2.3
    * MySQL 5.0.45

    In the test forums everything works perfectly, but in the live forum the search function does not work if I use special German characters in the search string. So in short:

    * any UTF-8 characters are stored and delivered fine, but:
    * searching for phrases using special (German) characters gives no results

    Here's an example: Let's assume you have an entry with the German word "präsentieren" in the title. In this case

    * searching for "prasentieren" (please note there is "a" instead of "ä") will find the article
    * searching for "präsentieren" (the original phrase) will find nothing

    The weird thing about it all is that everything's fine in the test installations. So there must be some incompatibility and/or misconfiguration on the live server, I guess. I did a lot of tests, but could not find the problem.

    The problem with the search function (on the live web server) even exists with a completely fresh copy of Vanilla, without any extensions (which is installation #4, of course).

    I hope that somebody has an idea where to go from here...
    • CommentAuthorMax_B
    • CommentTimeJan 5th 2008
     # 2
    I tested searching präsentieren on this (Lussumo) forum with success.

    Besides the server software versions, did you also check the default PHP options (magic_quotes and the like) and .htaccess?
    • CommentAuthormichael-e
    • CommentTimeJan 5th 2008
     # 3
    There is no .htaccess file at all. Regarding PHP options I just did a quick check, because I do not really know where to look. Do you know of any server or PHP parameters which could influence the handling of GET data?

    I am asking this because I've seen that searching the database works fine with PHPMyAdmin! So maybe the problem is in the handling of GET data on the server?
    • CommentAuthormichael-e
    • CommentTimeJan 5th 2008
     # 4
    I have now compared the PHP core configurations. There are no significant differences, besides:

    1. "allow_url_include" is ON on the live server
    2. "enable_dl" is OFF on the live server
    3. "register_argc_argv" is ON on the live server
    4. "short_open_tag" is ON on the live server
    5. the live server uses the suhosin patch (which AFAIK is meant to increase safety)
    • CommentAuthorMax_B
    • CommentTimeJan 6th 2008 edited
     # 5
    Well, this Suhosin patch looks like a candidate; you may want to try the suhosin.simulation flag.
    Nevertheless, if phpMyAdmin is ok, you may want to double check your live forum's utf-8 configuration.
    • CommentAuthormichael-e
    • CommentTimeJan 6th 2008 edited
     # 6
    Max_B, thank you very much for your help. I tried and switched the local value of suhosin.simulation to ON, but nothing changed.

    The Hosting Company told me to take a deeper look in the Vanilla software. Haha - I did tell them it works perfectly elsewhere (like in my local installations or in this forum here)! I double checked the Vanilla side, of course. (One of the installations even is a brand new one.)

    I do not know what to do now. I would not like to switch to a different hosting company, because I have many clients' websites there... If you have any new ideas, please post them here. I appreciate any help on the topic.
    • CommentAuthorMax_B
    • CommentTimeJan 6th 2008
     # 7
    I'm not sure if you switched it on the right side:
    You said that the Suhosin patch is on the live server. You must test the simulation flag on that one, not on the local one. If you can't because it's hosted, you may try to install this patch locally and see if it is the culprit.

    I see no other clue right now, besides the MySQL version. The MySQL docs say something about searching with the LIKE operator (Vanilla uses LIKE for search):
    Per the SQL standard, LIKE performs matching on a per-character basis, thus it can produce results different from the = comparison operator:
    mysql> SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci;
    +-----------------------------------------+
    | 'ä' LIKE 'ae' COLLATE latin1_german2_ci |
    +-----------------------------------------+
    |                                       0 |
    +-----------------------------------------+
    mysql> SELECT 'ä' = 'ae' COLLATE latin1_german2_ci;
    +--------------------------------------+
    | 'ä' = 'ae' COLLATE latin1_german2_ci |
    +--------------------------------------+
    |                                    1 |
    +--------------------------------------+

    Be sure to test phpMyAdmin search with a LIKE operator.
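
A toy model of the behaviour quoted above, written in Python rather than SQL so it can be run anywhere. It is only a sketch: the `EXPANSIONS` table and both helper functions are made up for illustration and are not how MySQL implements collations. It assumes a phonebook-style German collation in which "ä" weighs the same as "ae".

```python
# Toy model only: hypothetical helpers, not MySQL's implementation.
EXPANSIONS = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def weight(text):
    """Map a string to a comparison key under the assumed phonebook-style rules."""
    return "".join(EXPANSIONS.get(ch, ch) for ch in text.lower())


def equals(a, b):
    """Model of '=': compare the keys of the whole strings."""
    return weight(a) == weight(b)


def like_no_wildcards(value, pattern):
    """Model of LIKE without wildcards: match one character at a time."""
    if len(value) != len(pattern):
        return False
    return all(v.lower() == p.lower() for v, p in zip(value, pattern))


print(equals("ä", "ae"))             # True
print(like_no_wildcards("ä", "ae"))  # False
```

The point of the model is only that a whole-string comparison can treat one character as equal to two, while a per-character match cannot.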
    • CommentAuthormichael-e
    • CommentTimeJan 7th 2008
     # 8
    Yes, I switched on the live server. There's a local and a global value for all the parameters, and by adding
    php_value suhosin.simulation ON

    to the .htaccess file I switched the local value to "ON". I can not disable Suhosin completely (because it's on a hosted webspace), but shouldn't it be enough to change the local value for suhosin.simulation?

    OK, this did not work.

    To be extra sure, I thought about installing the Suhosin patch on one of my development (OS X) servers. But this is rather tricky stuff, too tricky for me... I'm rather a coder than a programmer.

    I tried searching for my user family name (which includes the German letter "ö") with PHPMyAdmin using the LIKE operator:

    SELECT * 
    FROM `db1083101-test`.`LUM_User`
    WHERE (
    `LastName` LIKE '%ö%'
    )
    LIMIT 0 , 30

    and also
    SELECT * 
    FROM `db1083101-test`.`LUM_User`
    WHERE (
    `LastName` LIKE '%ö%' COLLATE utf8_general_ci
    )
    LIMIT 0 , 30

    Both work fine.
    • CommentAuthorMax_B
    • CommentTimeJan 7th 2008
     # 9
    Also, to be extra sure, check register_argc_argv: it implies copying the request arguments to the system environment, which might not be utf-8 capable and could possibly mess them up.

    The fact that searching is ok for phpMyAdmin and not Vanilla is the only element we have. Check if both use GET or POST (vanilla may be POST while PMA GET) and set up some testing around this.
    • CommentAuthorkeith_
    • CommentTimeJan 7th 2008
     # 10
    additionally there are some character set settings within the apache 2 conf files. sorry, I'm on my way to bed, so if anyone believes this could be the issue i need to research that again, but i once had a problem with the same perl-generated pages looking different on two identical (so i thought) servers: one was delivered as UTF and one as ISO...
    • CommentAuthorWallPhone
    • CommentTimeJan 7th 2008
     # 11
    I'm suspecting something in MySQL is causing this, since it matches on a but not ä.

    Curious if it would be possible to point a local Vanilla installation to the remote DB server to verify this. (It might be inaccessible from the outside...)

    Another tool that might help is to upload this file to your installation, and while logged in as administrator visit it and turn on the debug mode (only the administrator roles who toggle this can view it). Then you will be able to see and test the various SQL queries that Vanilla generates at the bottom of the page.
    • CommentAuthormichael-e
    • CommentTimeJan 9th 2008 edited
     # 12
    Thank you all! Meanwhile I did some more testing on the subject. Here are the results:

    • changing the PHP parameter register_argc_argv does nothing to the phenomenon
    • there is no difference using GET or POST parameters. (I verified this using a manually edited search page.)
    • the problem is not in the database. As WallPhone suggested, I opened the live database for external connections and connected my development server to this database. Everything worked perfectly. So the problem must be in the Apache/PHP configuration on the live server WORKING TOGETHER with Vanilla. (Everything is OK with PHPMyAdmin, but this Software has been pre-installed by the hosting company somewhere out of the webspace; I have no file access to PHPMyAdmin.)
    • I verified that the problem is in Apache/PHP using Vanilla's debug mode. After hitting "search" the German character is still OK in the search field, but obviously it has been garbled on its way to the database query!

    I have written another e-mail to the hosting company. Unfortunately I have no access to the Apache 2 configuration files; I can only switch some PHP configuration options via .htaccess. I already tried a lot of PHP options without any success. Maybe the problem is in the Apache config (meaning keith_ is on the right track)?
    • CommentAuthorMax_B
    • CommentTimeJan 9th 2008
     # 13
    http://httpd.apache.org/docs/2.2/mod/core.html#adddefaultcharset

    This should not be necessary because Vanilla does output a utf-8 charset header, but I have read that some installations override the charset somehow.
    • CommentAuthormichael-e
    • CommentTimeJan 9th 2008
     # 14
    I already checked the HTTP headers. The character set is included correctly in the header section:

    Content-Type: text/html; charset=utf-8

    (So it does not change anything to use the AddDefaultCharset utf-8 directive.)
    • CommentAuthormichael-e
    • CommentTimeJan 9th 2008
     # 15
    Having checked the Apache configuration options once again, I am rather sure that the problem must be either in a faulty compilation or in PHP's configuration. I hope that the hosting company will have something interesting to say (once they have checked Vanilla's debug information).

    Nevertheless I will try and evaluate any ideas in the meantime!
    • CommentAuthorPere
    • CommentTimeJan 10th 2008 edited
     # 16
    I have almost exactly the same problem. The difference is that it's not working even in my testing server. EVERYTHING, both in the testing and the production server, is UTF-8. I can search the "body" column of table "comment" using the "search" function for that column in phpmyadmin, for example for 'á' (spanish), and it will find results both for 'a' and 'á'. However, if I make the same search inside Vanilla, if I search for 'á', it will return both results for 'á' and 'a', but if I search for 'a', it will return results with 'a' but omit 'á'. If I search for "ñ", it will find nothing, but if I search for "n", it will return both "n" and "ñ".

    Again, MySQL is OK.

    As I've never been able to make this work, I was thinking it was a Vanilla issue and that I should "filter" every "special" character before sending it to be searched in the database. But after seeing that you have made it work, and after trying to search directly inside MySQL, there must definitely be something misconfigured somewhere...
    • CommentAuthorMax_B
    • CommentTimeJan 10th 2008
     # 17
    I think you (both) did, but have you checked the PHP default_charset value? It should be empty.
    Also, just to keep searching, what are your PHP mbstring settings?
    • CommentAuthormichael-e
    • CommentTimeJan 10th 2008
     # 18
    I have found and solved the problem!!!

    In line 35 of /library/Framework/Framework.Class.SqlSearch.php the user query is converted with PHP's 'strtolower' function.
    $this->UserQuery = strtolower(trim($this->UserQuery));

    This function relies on the current locale setting. This means that in, e.g., the default “C” locale, multibyte characters such as "ä" will not be converted properly! Thus the database query will include garbled characters.

    On the server in question (Apache/2.0.54 (Debian GNU/Linux) PHP/5.2.3 with Suhosin-Patch DAV/2) the 'locales' seem not to work for the 'strtolower' function! Even adding
    setlocale (LC_ALL, 'de_DE@euro', 'de_DE', 'de', 'ge');

    (immediately before the function call) had no effect. I assume that Debian is not set up correctly. I found out that it needs a package named "locales" in order to properly support locales. Maybe they did not compile this package? I do not know this for sure. Anyway.

    I solved the problem by simply removing the mentioned line in the source code. (In my eyes there is no need for it, because MySQL itself behaves case-insensitively.)

    I wrote the above to the hosting company as well. I will be back here if they have anything interesting to say!
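
One plausible way to picture the failure, sketched in Python because the PHP behaviour depends on the PHP version and on which locales are installed. Assumption (not confirmed in this thread): the locale in effect is a single-byte one such as ISO-8859-1, so each byte of the UTF-8 string is lowercased on its own. `bytewise_lower_latin1` is a hypothetical helper that imitates this; it is not PHP's implementation.

```python
def bytewise_lower_latin1(data: bytes) -> bytes:
    """Lowercase each byte as if it were one ISO-8859-1 character (assumed locale)."""
    return data.decode("latin-1").lower().encode("latin-1")


query = "präsentieren".encode("utf-8")
mangled = bytewise_lower_latin1(query)

print(query)    # b'pr\xc3\xa4sentieren'
print(mangled)  # b'pr\xe3\xa4sentieren'

# Check whether the result is still valid UTF-8.
try:
    mangled.decode("utf-8")
    still_valid = True
except UnicodeDecodeError:
    still_valid = False
print(still_valid)  # False
```

Under that assumption the lead byte 0xC3 reads as "Ã" and is lowered to 0xE3, so the byte pair no longer encodes "ä" and the mangled search string cannot match the stored title. An all-ASCII, already lowercase string such as "prasentieren" passes through unchanged.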
    • CommentAuthorPere
    • CommentTimeJan 10th 2008
     # 19
    Megazillions of thanks. You're the man, Michael-e!!

    I was struggling a lot with this. Smoke was going out of my head. I thought it had something to do with character encoding, misconfiguration on the server and the like.

    The solution provided works like a charm.

    Thanks a lot!!!!! :D
    • CommentAuthormichael-e
    • CommentTimeJan 10th 2008
     # 20
    @ Pere: Thank you. I was struggling a lot with this, too...

    Many thanks to Max_B and WallPhone!
     # 21
    Michael-e you are a survivor.. thanks a lot, it was my big headache.. :=)
    also thanks to everyone who tried to solve that problem.
     # 22
    Do you guys feel this change (removing the line) should be made in the release source too? Is it likely to have any ill effects?
    • CommentAuthormichael-e
    • CommentTimeJan 10th 2008
     # 23
    Thank you guys a lot!

    @minisweeper: I am not an expert on MySQL. But in my installations (on the live server and on my development servers) I have not seen any negative side effects. MySQL seems to behave case-insensitively with the generated search strings. Even if this were not the case, does it make any sense to change the case of the search string, thus changing the user's "search intention"? So all in all: Yes, as far as I can see, it would make sense to change the release source, too. (I would be proud to have contributed at least a very small piece to Vanilla...)

    I have received final information from the hosting company, and I did some final research and thinking:

    On my (Debian) live server there is a bunch of locales installed:

    de_DE ISO-8859-1
    de_DE.UTF-8 UTF-8
    de_DE@euro ISO-8859-15
    en_US ISO-8859-1
    pl_PL ISO-8859-2
    nl_NL@euro ISO-8859-15
    fr_FR@euro ISO-8859-15
    cs_CZ ISO-8859-2
    tr_TR ISO-8859-9
    sv_SE ISO-8859-1
    ru_RU KOI8-R
    pt_PT ISO-8859-1
    es_ES ISO-8859-1

    So in my case de_DE.UTF-8 would be the right choice. If I include an additional line saying

    setlocale (LC_CTYPE, 'de_DE.UTF-8');

    immediately before the line in question, searching will work even if I leave everything else intact. But setlocale should be used with caution (see PHP documentation), and it makes things even more complicated (as you would need to include the right locale for your language...)

    Another possibility would be to configure the server's locale settings (which can be done via SSH, if you have SSH access) - but on my live server these changes have no effect on PHP. I have not found out if the latter is typical for Debian servers, or if it was simply my fault, doing something wrong and/or insufficient... Anyway, this would make Vanilla's installation much more complicated!

    So as long as I do not know about any negative effects, I will stick to my solution, as proposed above.
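
If lowercasing the search string were still wanted, the locale-independent route is to lowercase decoded text rather than raw bytes. A minimal Python sketch of the idea follows; `normalise_query` is a hypothetical name, and it assumes the request bytes are UTF-8 as in the forum described above. In PHP the analogous tool would be the mbstring extension's mb_strtolower with an explicit 'UTF-8' encoding, assuming mbstring is available on the host.

```python
def normalise_query(raw: bytes) -> bytes:
    """Trim and lowercase a UTF-8 search string without breaking its encoding."""
    text = raw.decode("utf-8")  # bytes -> Unicode text
    return text.strip().lower().encode("utf-8")


result = normalise_query(" PRÄSENTIEREN ".encode("utf-8"))
print(result.decode("utf-8"))  # präsentieren
```

Because the text is decoded first, "Ä" is handled as one character and no locale setting is involved.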

    Thanks again!
    • CommentAuthorselflearner
    • CommentTimeJan 10th 2008 edited
     # 24
    I have no problem with the line erased. Actually my search results are more logical now.

    but I have another problem (not about the line; I tried both with and without it).
    when I search with comments selected, I get character codes displayed instead of the letters themselves. I realised that some of my users' data is written to the database differently:

    for the letter ğ , its written as Ä�
    but for some of my users its written as & # 287; (I have put spaces so it will be shown by you)

    search results are finding the letters which are written like � and they displayed normally..
    never find any letter written as & # 287;

    the interesting point is that when I check the discussions table, I see all the characters are like � , not unicodes. so there is no problem if I search within titles.

    so. I have two problems... and two questions. :=)

    -is it normal that some users' writings are encoded differently in my database?
    -is it normal that all the letters are converted to unicode or something else in my database, or should I see all the characters normally?

    and my problem is how to solve all the mess... ehr.
    I think if my comments were also encoded the way titles are, it would solve the problem.

    thanks in advance.
    • CommentAuthorMax_B
    • CommentTimeJan 11th 2008
     # 25
    @selflearner: it is likely that your database is not utf-8 clean. It is even possible that part of it is clean and the rest "weird", if you changed the Vanilla setting without cleaning up the MySQL side. See http://lussumo.com/docs/doku.php?id=vanilla:administrators:encodings for complete information.
    • CommentAuthorkeith_
    • CommentTimeJan 11th 2008
     # 26
    hey,
    i wasn't really satisfied with the feed publishers that were online here. if the user was online they were presenting headlines from categories he was not supposed to see, they only showed the first comments of a thread, never the latest, etc. etc.

    this is my good oldschool standalone version, presenting only the 20 newest entries (and only one per discussion). maybe someone would like to make it vanilla-like,...
    :)

    http://salmacis.com/feed_forum-vanillaforall.php.txt

    [sorry now i posted this two times in 2 different threads]
     # 27
    @Max_B - one more time, thank you very much.. I dumped all the SQL, erased my old database, recreated it as utf8_general_ci .. but the worst part was correcting all the data. I couldn't find any way to do that, so I did find and replace on all characters and imported them into my new database. now everything is ok.. ooops, did I say everything???

    no, of course not,
    now there is a problem (which I should probably investigate before I ask): I use FCKeditor, and if I choose Enable Visual editing toolbar, all the characters are written as unicode (like & # 287;) into my database. as a result, my search results are still not finding those characters.
    if I disable Enable Visual editing toolbar, everything is ok: all characters are written as they are, and all my search results are correct.

    PS. I will search about this issue, but if you know any way to keep my characters normal while using FCKeditor please tell me.. maybe someone can suggest another editor (I really need bold, italic, lists, and importantly embedding Youtube videos)

    thank you again and again.
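
Why the search misses those comments can be shown with a few lines of Python (used here only for illustration; nothing below is Vanilla or FCKeditor code, and the sample strings are made up). A comment stored with the numeric entity contains the ASCII characters of the entity, not the letter ğ itself, so a substring search for ğ cannot match until the entity is decoded.

```python
import html

stored_raw = "doğru"          # comment saved as real UTF-8 text
stored_entity = "do&#287;ru"  # same word saved as a numeric entity

needle = "ğ"
print(needle in stored_raw)                    # True
print(needle in stored_entity)                 # False
print(needle in html.unescape(stored_entity))  # True
```

Note that `html.unescape` decodes every entity, including `&lt;` and `&amp;`, so it is a diagnostic aid here rather than something to run blindly over stored HTML.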
    • CommentAuthorMax_B
    • CommentTimeJan 11th 2008
     # 28
    I'm looking for client side xhtml editor myself and the contenders are:
    TinyMCE almost as big as FCKeditor
    NicEdit small and minimal
    Apple FancyToolbar CSS3/webkit requirement so reserved for intranet/private pages.
    jTagEditor light (if you have jQuery already) and sufficient but not wysiwyg

    I have not done any testing yet.