developer.jelix.org is not used any more and exists only for history. Post new tickets on the Github account.

Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#543 closed bug (invalid)

Bug in jLocale, when an accent is in utf-8 format.

Reported by: brunto
Owned by: Julien
Priority: normal
Milestone:
Component: jelix:core:jLocale
Version: 1.0.3
Severity: normal
Keywords:
Cc:
Blocked By:
Blocking:
Documentation needed: no
Hosting Provider:
PHP version:

Description

In jLocale, when an accented character like 'é' is written as the numeric entity &#233;, only '&' is displayed.

Example in a locale file exemple.UTF-8.properties:

titre_page = Titre momentan&#233;

Displayed: Titre momentan&

But if é is written instead of &#233;, the display is correct.

Attachments (3)

543-beta.diff (2.2 KB) - added by Julien 12 years ago.
beta patch, need some more test, but if you have comments.…
543-jLocale-utf8-entities-fix-and-other-improvements.diff (8.7 KB) - added by Julien 12 years ago.
good patch ;)
543.updated_unit_tests.diff (4.1 KB) - added by laurentj 12 years ago.
more tests to check syntax in properties file


Change History (17)

comment:1 Changed 12 years ago by laurentj

  • Milestone set to Jelix 1.0.4

comment:2 Changed 12 years ago by Julien

  • Status changed from new to assigned

comment:3 Changed 12 years ago by Julien

Hello,

fixed it, works fine.

Because I did a complete rewrite of jBundle::_loadResources(), I need to run more tests (I didn't test with latin1 strings, for example).

I also added a new feature that makes it possible to have multi-line rendered strings, like this (the line ends with two \):

mystring = This a string that will be rendered\\
on two lines, even in HTML if you use nl2br jtpl plugin

This is cool, because you can put a big chunk of text in a single localized string, and you don't have to create multiple strings just to get rendered line breaks.

Of course, it's still possible to write a string across multiple lines as before:

mystring = This a string that will be rendered\
on one line, even if it's declared on 2 lines in the property file
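
The two line-ending conventions described in this comment can be sketched as follows. This is only an illustration, not the code from the attached patch: the function name `joinContinuationLines` is hypothetical, and the sketch assumes that a trailing `\\` means "continue and insert a line break" while a single trailing `\` means "continue on the same rendered line".

```php
<?php
// Hypothetical helper, not part of Jelix: joins the physical lines of a
// properties file into logical lines (assumed semantics, see the lead-in).
function joinContinuationLines(array $lines): array
{
    $logical = [];
    $buffer = '';
    $continuing = false;
    foreach ($lines as $line) {
        // Strip only the line ending, so trailing spaces are preserved.
        $line = rtrim($line, "\r\n");
        if (substr($line, -2) === '\\\\') {
            // Two trailing backslashes: continue and insert a real line break.
            $buffer .= substr($line, 0, -2) . "\n";
            $continuing = true;
        } elseif (substr($line, -1) === '\\') {
            // One trailing backslash: continue on the same rendered line.
            $buffer .= substr($line, 0, -1);
            $continuing = true;
        } else {
            // No marker: the logical line is complete.
            $logical[] = $buffer . $line;
            $buffer = '';
            $continuing = false;
        }
    }
    if ($continuing) {
        // The input ended on a continuation marker: keep what was collected.
        $logical[] = $buffer;
    }
    return $logical;
}
```

With this sketch, a line ending in `\\` followed by a second line yields one string containing a "\n", which a plugin such as nl2br could then render as a line break.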

I also kept the spaces at the end of lines, because I think they could be useful sometimes.

Lastly, I think that performance is improved, but I didn't run any tests.

This patch is not final, but if you have comments, they're welcome.

Changed 12 years ago by Julien

beta patch, need some more test, but if you have comments....

comment:4 Changed 12 years ago by laurentj

I don't think that loading the properties file into an array is better than reading it line by line, in terms of performance and memory consumption.

comment:5 Changed 12 years ago by Julien

I have no real idea about the performance gain or loss; I can run tests for that.

But with file_get_contents, the file is not kept open while we parse it. I think that could be nice.

comment:6 Changed 12 years ago by Julien

I ran some tests.

Each of these scripts was launched 1000 times from the CLI:

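<?php
// Variant 1: read the whole file at once, then split it into lines.
// microtime(true) returns float seconds, so the subtraction below is valid.
$time = microtime(true);
$strings = array();
foreach (explode("\n", file_get_contents('errors.UTF-8.properties')) as $linenumber => $line) {
    $strings[$linenumber] = $line;
}
echo memory_get_usage(), "\t", microtime(true) - $time, "\n";
?>

<?php
// Variant 2: keep the file open and read it line by line.
$time = microtime(true);
$strings = array();
$fp = fopen('errors.UTF-8.properties', 'r');
$linenumber = 0;
// fgets() returns false at end of file; compare strictly so a line "0" is kept.
while (($line = fgets($fp)) !== false) {
    $strings[$linenumber] = $line;
    $linenumber++;
}
fclose($fp);
echo memory_get_usage(), "\t", microtime(true) - $time, "\n";
?>

<?php
// Variant 3: let file() return the lines directly as an array.
$time = microtime(true);
$strings = array();
foreach (file('errors.UTF-8.properties') as $linenumber => $line) {
    $strings[$linenumber] = $line;
}
echo memory_get_usage(), "\t", microtime(true) - $time, "\n";
?>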

The average results are:

Method               Memory (bytes)  Time (s)
file_get_contents()  75688           0.00026674
fgets()              71112           0.00034062
file()               75364           0.00025454

So file_get_contents() uses more memory than fgets() but is faster (OK, we're talking about microseconds :)). Maybe file() is better.

fopen(), fgets() and fclose() use less memory and need a little more time, but nothing problematic.

So OK, if keeping the file open while parsing it is not a problem (I think it's fine, since we open it read-only), I'll switch back to fgets().
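
One detail to keep in mind when swapping between these three approaches: they do not return identical arrays. The sketch below wraps each approach in a function and shows the normalization needed to compare them. The function names are illustrative only and are not part of Jelix or of the patch.

```php
<?php
// Illustrative wrappers (hypothetical names) around the three reading
// strategies used in the benchmark scripts.

function linesViaFileGetContents(string $path): array
{
    // explode() removes the "\n" separators. A file ending with "\n"
    // produces one extra empty element, which is dropped here.
    $lines = explode("\n", file_get_contents($path));
    if (end($lines) === '') {
        array_pop($lines);
    }
    return $lines;
}

function linesViaFgets(string $path): array
{
    $lines = [];
    $fp = fopen($path, 'r');
    while (($line = fgets($fp)) !== false) {
        $lines[] = $line; // fgets() keeps the trailing "\n"
    }
    fclose($fp);
    return $lines;
}

function linesViaFile(string $path): array
{
    return file($path); // file() keeps the trailing "\n" as well
}

// Removes line endings only, so trailing spaces in values survive.
function stripLineEndings(array $lines): array
{
    return array_map(function ($line) {
        return rtrim($line, "\r\n");
    }, $lines);
}
```

A parser built on fgets() or file() therefore has to strip the line ending itself, whereas one built on explode() has to ignore the final empty element.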

comment:7 Changed 12 years ago by laurentj

OK, let's use the file() function.

Changed 12 years ago by Julien

good patch ;)

comment:8 Changed 12 years ago by Julien

  • review set to review?

Here's the final patch.

It's working fine.

As I said before, it's now possible to end a line with \\ to get a \n in the string. I'll update the documentation once the patch has landed in 1.0.4.

comment:9 Changed 12 years ago by laurentj

  • review changed from review? to review-
  • Careful: there is a lot of extra whitespace at the end of some lines. Remove it.
  • Your new reader doesn't allow the same syntax as before:
    • a "#" in a value begins a comment, except when it is preceded by a \. That is not possible in your version.
    • we can no longer specify a value made only of whitespace by using "\w".
    • HTML entities shouldn't be converted. The value shouldn't be modified, so no html_entity_decode.
  • When I tried to run the unit tests: "Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 13 bytes) in lib/jelix/core/jSelector.class.php on line 483"
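
The syntax rules listed in this review can be sketched as follows. This is an illustration under stated assumptions, not the actual jBundle code: the function name `parsePropertyValue` is hypothetical, and the sketch assumes that `\w` stands for a single space and that `\#` yields a literal `#`.

```php
<?php
// Hypothetical sketch (not Jelix code) of the value rules from the review:
//  - an unescaped "#" starts a comment,
//  - "\#" is a literal "#",
//  - "\w" is assumed to stand for one space,
//  - HTML entities are left untouched (no html_entity_decode()).
function parsePropertyValue(string $raw): string
{
    // Cut the value at the first "#" that is not preceded by a backslash.
    $parts = preg_split('/(?<!\\\\)#/', $raw, 2);
    // Trim before expanding "\w", so an explicit whitespace value survives.
    $value = trim($parts[0]);
    // Expand the two escape sequences covered by this sketch.
    return strtr($value, ['\\#' => '#', '\\w' => ' ']);
}
```

Under these assumptions, the value from the ticket description is cut right after the `&`, because the `#` of the entity is read as the start of a comment.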

Changed 12 years ago by laurentj

more tests to check syntax in properties file

comment:10 Changed 12 years ago by laurentj

I added a few tests to the unit tests to check more syntax features. Your patch should pass all of these tests without modifying them. Include this patch in yours.

comment:11 Changed 12 years ago by Julien

Hello,

OK, I need to take a deeper look at the complete syntax rules. Is there some kind of spec for the syntax of properties files somewhere? (Yes, I could work it out by analysing all the regexps in the current version in depth, but ... ;) )

I think your new unit tests will also help in that way, thanks.

More to come in the next days.

comment:12 Changed 12 years ago by laurentj

  • Milestone Jelix 1.0.4 deleted
  • Resolution set to invalid
  • Status changed from assigned to closed

In fact, I just realized that this ticket is invalid. jLocale works perfectly :-) It is the documentation that is incomplete.

If we want to use a # in a property, we have to put a \ before it; otherwise the # starts a comment.

So the example in the ticket should be:

  titre_page = Titre momentan&\#233; 

I'm going to update the documentation to cover the whole syntax.
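
For strings that are generated rather than typed by hand, the escaping rule above could be applied automatically. The helper below is a minimal sketch with a hypothetical name (`escapeHashForProperties`); it is not part of Jelix and only handles the `#` character.

```php
<?php
// Hypothetical helper (not part of Jelix): prepares a value for a
// .properties file by putting a backslash before every "#" that is
// not already preceded by one.
function escapeHashForProperties(string $value): string
{
    // '\\\\#' is the PHP literal for the replacement "\\#", which
    // preg_replace() turns into a single backslash followed by "#".
    return preg_replace('/(?<!\\\\)#/', '\\\\#', $value);
}

// Usage: the value from the ticket, ready to be written to the file.
echo 'titre_page = ', escapeHashForProperties('Titre momentan&#233;'), "\n";
```

Values that are already escaped are left unchanged, so the helper can be applied more than once without doubling the backslashes.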

comment:13 Changed 12 years ago by laurentj

I have also committed the unit tests attached to this ticket, with a minor correction to these tests.

comment:14 Changed 12 years ago by Julien

Ok I didn't know about that spec ;)

I'll open a new enhancement ticket, because I think that it's quite useful to be able to have line breaks in strings : #569
