Teen Programmers Unite

International text support

Posted by Mike_L [send private reply] at January 29, 2003, 05:46:17 PM

この伝言板は日本語の文字がコンパチですか? [Translation: Is this message board compatible with Japanese characters?]
(Above text is JIS encoded)

Posted by AnyoneEB [send private reply] at January 29, 2003, 10:02:10 PM

How about some HTML support so you can use &? I think this should be on the suggestions board.

Posted by ItinitI [send private reply] at January 30, 2003, 08:52:17 AM

Yes, JIS would be good! I think Vlad also pointed out that Cyrillic characters don't seem to show up either.

Posted by Psion [send private reply] at January 30, 2003, 09:11:07 AM

Special syntax for international characters, maybe. HTML support in general, dream on. =D

I'm not familiar with this stuff in general, so if anyone would like to propose a reasonable extension to the markup system in use in the new system (http://tpu.tpu.org/) to support international characters, I'm all ears. Perhaps a way to start a block where only the types of HTML codes Mike_L used above are allowed?

Posted by ItinitI [send private reply] at January 30, 2003, 10:31:36 AM

JIS is Japanese Industrial Standard encoding [There is also EUC-JP, Extended Unix Code, as well as ISO-2022-JP.]. I don't think it was HTML he used, though I could be wrong, since only the numbers and chars displayed.

Posted by Psion [send private reply] at January 30, 2003, 03:38:32 PM

He used HTML character escape codes. HTML code is not allowed on these forums, so you just saw the code, not the characters.

Posted by CViper [send private reply] at January 30, 2003, 04:00:33 PM

Why not allow these &#xxxx; codes? As far as I'm aware, you can't really mess anything up with them. Other candidates would be ä å ü ö etc. Currently these show up as "????"

Posted by Mike_L [send private reply] at January 30, 2003, 06:32:15 PM

I just typed Japanese text into the form in Mozilla. Mozilla must have converted it like that. I'm using Windows XP and the Microsoft Global IME set to JIS mode. Mozilla has the curious ability to properly display international characters in HTML when they're encoded as HTML entities, à la &#12345;

It would be fun for the TPU system to support International text. We might get some non-English speakers interested in TPU.

Posted by ItinitI [send private reply] at January 30, 2003, 10:36:49 PM

I use Mozilla all the time to input JIS, and it shows correctly. I think it's just not enabled here [For obvious reasons, since everyone here speaks the common tongue, English].

Posted by buzgub [send private reply] at January 30, 2003, 11:16:45 PM

I would have thought that it would be safe to allow any HTML entities through. To allow them, I think we'd just have to make the system stop encoding & signs as &amp;. The problem is that literal & signs would then need to be marked by the user. How often do people use them here? I think it's almost a non-issue. If it is an issue, perhaps we could complement the code tags with "entity tags", perhaps like this:

[* *]

I'm not sure if that's an ideal solution, though.

I assume that the codes used by Mike are just Unicode entities. Perhaps we could check each entity, see whether it is a Unicode code point that belongs to some foreign alphabet or other, and if so let it through untouched. I think that's probably the best solution, but keeping up with the valid Unicode entities could present problems.
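
A minimal sketch of that last idea, in Python rather than whatever the TPU software is actually written in. The function name `escape_keeping_letters` is made up for illustration, and treating "belongs to some alphabet" as "has a Unicode letter category" is an assumption, not something anyone in the thread specified. Everything is escaped first, and only decimal references that point at letters are restored:

```python
import html
import re
import unicodedata

# Matches a decimal reference after escaping, e.g. "&amp;#26085;".
_ESCAPED_REF = re.compile(r"&amp;#([0-9]{1,7});")


def escape_keeping_letters(text):
    """Escape HTML, but let numeric references to letters through.

    Hypothetical helper: the "letters only" policy is an assumption.
    """
    escaped = html.escape(text, quote=True)

    def restore(match):
        code_point = int(match.group(1))
        if code_point > 0x10FFFF:
            # Not a valid Unicode code point; leave it escaped.
            return match.group(0)
        # Categories starting with "L" are letters (Lu, Ll, Lo, ...).
        if unicodedata.category(chr(code_point)).startswith("L"):
            return "&#%d;" % code_point
        return match.group(0)

    return _ESCAPED_REF.sub(restore, escaped)


print(escape_keeping_letters("<b>&#26085;</b> & &#60;"))
```

Because the category check goes through Python's `unicodedata` tables, nobody has to maintain a list of valid entities by hand. References to punctuation or symbols, such as `&#60;` for `<`, stay escaped under this policy.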

Posted by CViper [send private reply] at January 31, 2003, 07:52:04 AM

Well, I see another little problem: most people won't start typing text as a sequence of &#<5digitCode>;'s just because it's possible. I don't know about JIS, and how it is typed, but I don't think you write text and automatically get the &#-stuff.

So it would only be really useful if the software running this site would recognize "special-characters" in a post and then turn them into those &#-things.
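
As a rough illustration of that server-side conversion, here is a Python sketch. It assumes the post has already been decoded into a Unicode string, and `to_numeric_refs` is a hypothetical name. It relies on the standard `xmlcharrefreplace` error handler, which replaces every character that cannot be encoded as ASCII with a decimal `&#...;` reference:

```python
def to_numeric_refs(text):
    """Turn every non-ASCII character into a decimal &#...; reference.

    ASCII characters, including "&", pass through unchanged, so any
    HTML escaping still has to happen separately.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_numeric_refs("ä & 日"))  # prints: &#228; & &#26085;
```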

Posted by buzgub [send private reply] at January 31, 2003, 08:36:31 AM

My understanding of what Mike_L said is that Mozilla did do that conversion automatically. I'm assuming that IE does the same thing, because I suspect it's the only standard way to ship such characters around as part of a form.

Posted by CViper [send private reply] at February 01, 2003, 06:08:18 AM

I don't know about JIS, but &*uml;-characters don't seem to get posted that way - as said, they display as '?' in the forums.

Posted by ItinitI [send private reply] at February 01, 2003, 10:44:17 AM

Hohoho, I'm confused!! It could be that only JIS doesn't show up, but EUC and 2022 do, or something like that. I dunno.

Posted by Mike_L [send private reply] at February 03, 2003, 02:32:14 PM

Whatever you type into an HTML form gets HTML-ized by the browser during submission. When you type the & character, it is automatically converted to the HTML entity &amp;, just like the conversion of JIS characters to &#12345;-type entities. All that is required is for these codes to be accepted as they are instead of being run through the html_tidy() function or whatever is used.

Now I realize that Adam is busy with the new version of the TPU site, so I'm not going to worry about this until he's ready to look at it.

Posted by regretfuldaydreamer [send private reply] at February 03, 2003, 05:15:08 PM

There's an old saying around here:

If you want something done, do it yourself.

Look on the bright side, if you do it then Smerdy won't be able to say that no one does anything around here.

Posted by ItinitI [send private reply] at February 03, 2003, 09:13:08 PM

LOL!

 
Copyright TPU 2002. See the Credits and About TPU for more information.