Ticket #543 (closed bug: invalid)

Opened 3 months ago

Last modified 3 months ago

Bug in jLocale, when an accent is in utf-8 format.

Reported by: brunto Assigned to: Julien
Priority: normal Milestone:
Component: jelix:core:jLocale Version: 1.0.3
Severity: normal Keywords:
Cc: Php version:
Review: review- Hosting Provider:
Documentation needed: 0 Blocking:

Description

In Jlocale when an accent like 'é' is written é The display screen is only '&'

Exemple in a locale file exemple.UTF-8.properties

titre_page = Titre momentané

Display : Titre momentan&

But, if é instead é the display is correct.

Attachments

543-beta.diff (2.2 kB) - added by Julien on 04/18/08 19:59:33.
beta patch, need some more test, but if you have comments.…
543-jLocale-utf8-entities-fix-and-other-improvements.diff (8.7 kB) - added by Julien on 04/22/08 20:08:50.
good patch ;)
543.updated_unit_tests.diff (4.1 kB) - added by laurentj on 04/24/08 23:56:45.
more tests to check syntax in properties file

Change History

04/15/08 08:57:44 changed by laurentj

  • milestone set to Jelix 1.0.4.

04/16/08 01:13:42 changed by Julien

  • status changed from new to assigned.

04/18/08 19:57:35 changed by Julien

Hello,

fixed it, works fine.

Because I did a complete rewrite of jBundle::_loadResources(), I need to make more tests (didn't test with latin1 strings for example)

I also added a new feature, that makes possible to have multi-line rendered strings like this (end with 2 \) :

mystring = This a string that will be rendered\\
on two lines, even in HTML if you use nl2br jtpl plugin

This is cool, because you can have a big bunch of text in only one localized string, and don't have to create multiple strings just to have rendered line breaks.

Of course, it's still possible to write a string on multiple lines like before :

mystring = This a string that will be rendered\
on one line, even if it's declared on 2 lines in the property file

I also kept the spaces at the end of line, because I think it could be useful sometimes.

Last, I think that performance is improved, but didn't make any test.

This patch is not final, but I you have comments, they're welcome.

04/18/08 19:59:33 changed by Julien

  • attachment 543-beta.diff added.

beta patch, need some more test, but if you have comments....

04/22/08 11:21:14 changed by laurentj

I don't think that loading the properties file in an array is better than reading line after line, in term of performance and memory consumption.

04/22/08 11:52:00 changed by Julien

I have no real idea about performance gain or loss, I think I can do tests for that.

But with file_get_contents, the file is not kept open while we parse it. I think it could be nice.

04/22/08 12:53:01 changed by Julien

I made some tests :

launched 1000 times theses scripts in cli :

<?php

$time = microtime();

$strings = array();

foreach(explode("\n",file_get_contents('errors.UTF-8.properties')) as $linenumber=>$line){
    $strings[$linenumber] = $line;
}

echo memory_get_usage(),"\t",microtime()-$time,"\n";
?>
<?php

$time = microtime();

$strings = array();

$fp = fopen('errors.UTF-8.properties','r');

$linenumber = 0;

while (!feof($fp)) {
    if($line=fgets($fp)){
        $strings[$linenumber] = $line;
        $linenumber++;
    }
}

fclose($fp);

echo memory_get_usage(),"\t",microtime()-$time,"\n";
?>
<?php

$time = microtime();

$strings = array();

foreach(file('errors.UTF-8.properties') as $linenumber=>$line){
    $strings[$linenumber] = $line;
}

echo memory_get_usage(),"\t",microtime()-$time,"\n";
?>

the average results are :

75688 0,00026674 file_get_contents()

71112 0,00034062 fgets()

75364 0,00025454 file()

so file_get_contents uses more memory but is faster (ok, we're talking about microseconds :)). Maybe file() is better.

fopen(), fgets(), fclose() use less memory, and need a little more time but nothing problematic.

So OK, if the fact that the file is kept open while parsing it is not a problem (I think it's ok because we open it read-only in fact), I'll switch back to fgets()

04/22/08 14:39:27 changed by laurentj

Ok, let's use file function.

04/22/08 20:08:50 changed by Julien

  • attachment 543-jLocale-utf8-entities-fix-and-other-improvements.diff added.

good patch ;)

04/22/08 20:12:40 changed by Julien

  • review set to review?.

Here's the final patch.

It's working fine.

As I said before, it's now possible to end a line with \\ to get an \n in the string. So I'll update the documentation when the patch is landed in 1.0.4.

04/24/08 23:55:34 changed by laurentj

  • review changed from review? to review-.
  • Careful, many extra whitespace at end of some lines. Remove them.
  • Your new reader don't allow same syntax as before.
    • a "#" in a value begin a comment, except when a \ is before. We cannot do that in your version.
    • we cannot specify only a whitespace in the value, by using "\w"
    • html entities shouldn't be converted. The value shouldn't be modified. So no html_entity_decode.
  • When I tried to run unit tests : "Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 13 bytes) in lib/jelix/core/jSelector.class.php on line 483"

04/24/08 23:56:45 changed by laurentj

  • attachment 543.updated_unit_tests.diff added.

more tests to check syntax in properties file

04/24/08 23:58:00 changed by laurentj

I added some few tests in unit-tests which check more syntaxic features. Your patch should pass all this test without modification. Include this patch in your patch.

04/30/08 13:25:21 changed by Julien

Hello,

ok, I need to take a deeper look at complete syntax rules. Is there some kind of specs about the syntax of properties file somewhere ? (yes I could find these by deep analysing all the regexp in the current version, but ... ;) )

I think your new unit tests will also help in that way, thanks.

More to come in the next days.

04/30/08 22:41:40 changed by laurentj

  • status changed from assigned to closed.
  • resolution set to invalid.
  • milestone deleted.

In fact, I just realize that this ticket is invalid. jLocale works perfectly :-) In fact, this is the documentation which is uncomplete.

In fact, if we want to use a # in a property, we have to add a \ before it.

So, in the example of the ticket should be :

  titre_page = Titre momentan&\#233; 

I'm going to update the documentation about all the syntax.

04/30/08 22:50:07 changed by laurentj

I commit also unit tests attached to this ticket, with a minor correction in this tests.

05/01/08 10:45:00 changed by Julien

Ok I didn't know about that spec ;)

I'll open a new enhancement ticket, because I think that it's quite useful to be able to have line breaks in strings : #569

Download in other formats: Comma-delimited Text Tab-delimited Text RSS Feed