TOTO.net

Be alive

Japanese: I Need to Work Harder!

by tonny.xu on Oct.11, 2006, under Be alive, Foreign Languages

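Lately I've been having fewer conversations in Japanese. This year's Japanese Language Proficiency Test is coming up soon... but I'm still not well prepared.

My Japanese for everyday conversation and everyday writing is still lacking in many ways.

I still can't recognize all the Japanese kanji, either. Pronunciation, on'yomi verbs, kun'yomi nouns... it's tough.

There are less than 2 months until this year's exam, so I have to work harder... to improve my own ability, and to improve at work.

Of the four of us, including Ling-san, I'm the only one whose Japanese is poor. How embarrassing~~~~

I will definitely get better from now on!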


 

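The Ten Devil's Rules (鬼十則)

1. Work is something you should create yourself, not something to be handed to you!

2. Work means taking the initiative and acting first, not doing things passively!

3. Tackle big jobs; small jobs make you small!

4. Aim for difficult jobs; progress lies in accomplishing them.

5. Once you take a job on, never let go, even if it kills you, until the job is done...

6. Drag those around you along; between dragging and being dragged, a world of difference opens up over time.

7. Have a plan; with a long-term plan come patience, ingenuity, and the right kind of effort and hope.

8. Have confidence; without it, your work has no force, no tenacity, and not even depth.

9. Keep your mind running at full speed at all times, pay attention in every direction, and never leave the slightest opening! That is what service is.

10. Do not fear friction! Friction is the mother of progress and the fertilizer of drive. Otherwise you will become servile and irresolute.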


 

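These are the "Ten Devil's Rules" that Ling-san made me memorize. They are the famous words of Hideo Yoshida (吉田秀雄), a former president of Dentsu, the largest advertising company in Japan. Amazing~~~~. From now on I will study them well, memorize them, and put them into practice.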


Nine A.M. to One A.M. (朝九朝一)

by tonny.xu on Oct.10, 2006, under Be alive

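Let the waves churn and muddy this clear river as they will; that is still only as far as the eye can see.

What you do not know is that beneath this clear water lies an ambition that swells to the heavens.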

 

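Only after getting home each day do I realize that today is already almost over.

I always write down what happened "today" at one in the morning. Thank God, I get more sleep than Xinghua.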


Web Internationalization [I18N]: Part V

by tonny.xu on Feb.09, 2006, under Get Busy Dying

Sorry for the long break; here we are again.

Having discussed how characters are transferred over the Internet, we can now look at where scrambled characters can creep in at each step.

  • Step 1: Input via an input method
    Almost all input methods now support multiple code pages. But if you run your system on a particular code page (character encoding) such as Arabic, and then type Japanese or Chinese characters into an editor without Unicode support, you will see scrambled characters. If you save the file and publish it as an HTML page, you will find the scrambled text in your browser as well.
  • Step 2: Stored in a text file.
    A text file can carry a signature that indicates which character encoding it uses. When you open a text file in a hex editor such as UltraEdit or 010 Editor, you can see the difference between an ANSI text file and a Unicode text file. An ANSI text file is parsed using the system default code page (character encoding). A Unicode text file starts in one of 3 ways:
    A. It starts with 0xFE 0xFF: a Unicode text file in big-endian byte order.
    B. It starts with 0xFF 0xFE: a Unicode text file in little-endian byte order.
    C. It starts with 0xEF 0xBB 0xBF: a Unicode text file in UTF-8.

    If none of the above signatures is found at the beginning of a text file, the editor will try to interpret the characters using the system-specified character encoding.
    Here comes the problem: if you store a text file containing Japanese characters in ANSI format and then open it on a Chinese system, you will see scrambled characters.

  • Step 3: Transferred over HTTP
    The HTTP response carries a header (Content-Type, with its charset parameter) that indicates the character encoding used in the HTML file. Here comes the problem: if you declare an HTML file as Shift-JIS (a Japanese character encoding, one of the ANSI code pages) but actually use GB2312 inside the file, the page will be scrambled in the client browser. Consistency between the HTML file and the HTTP header is very important here.
  • Step 4: Find the correct interpreter according to the character encoding specified by the HTTP protocol.
    As mentioned above, if the character encoding specified in the HTTP header is not consistent with the encoding really used in the HTML file, the browser will use the wrong code page to map the characters. That is the cause of scrambled characters.
  • Step 5: Mapping from character code to font glyph.
    As mentioned in the 2 steps above, the system tries to find the correct glyphs in the font according to the character code. Unicode unifies East Asian characters (Chinese, Japanese, and Korean), so a given character has the same code under Unicode. The ANSI code pages, however, each have their own character codes. In 2 different code pages the same code stands for 2 different characters, so if you choose the Japanese code page to interpret Chinese characters, the result is obviously scrambled text.
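
The signature check described in Step 2 can be sketched in a few lines of Python. This is only an illustration, not how any particular editor works: the names `SIGNATURES`, `sniff_bom`, and `decode_text` are made up for this example, and the `cp1252` fallback merely stands in for "the system default code page".

```python
import codecs

# The three signatures from Step 2, paired with the encoding they announce.
# (Hypothetical table name; the byte values come from the codecs module.)
SIGNATURES = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def sniff_bom(data):
    """Return (encoding, signature_length), or (None, 0) if no signature."""
    for bom, encoding in SIGNATURES:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0


def decode_text(data, fallback="cp1252"):
    """Decode file bytes: trust the signature, else use the fallback code page.

    The fallback plays the role of the "system default" encoding; if it is
    the wrong one for the file, the result is scrambled text.
    """
    encoding, skip = sniff_bom(data)
    if encoding is None:
        return data.decode(fallback, errors="replace")
    return data[skip:].decode(encoding)


if __name__ == "__main__":
    with_bom = codecs.BOM_UTF16_LE + "日本語".encode("utf-16-le")
    print(sniff_bom(with_bom))
    print(decode_text(with_bom))
```

A file without any signature falls through to `fallback`, which is exactly the situation where a Japanese ANSI file opened on a Chinese system goes wrong.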

So far we have found out how characters are input, transferred, interpreted, and displayed to the user, and for each step we discussed how scrambled characters arise. From this analysis we can draw a conclusion:

To avoid scrambled characters, we must use Unicode, and use it throughout the entire process of authoring the HTML file and transporting it.
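
A small Python sketch of that conclusion: the same text survives when the writer and the reader agree on the encoding, and gets scrambled when they do not. The helper name `write_then_read` is invented for this example, and the choice of Shift-JIS and GB2312 simply mirrors the code pages discussed above.

```python
def write_then_read(text, write_encoding, read_encoding):
    """Encode text with one encoding and decode the bytes with another.

    errors="replace" keeps the demo from raising when the bytes are not
    valid in the reader's code page; invalid bytes become U+FFFD.
    """
    raw = text.encode(write_encoding)
    return raw.decode(read_encoding, errors="replace")


if __name__ == "__main__":
    text = "日本語"
    # Japanese code page on the writing side, Chinese code page on the
    # reading side: the bytes are interpreted as different characters.
    print(write_then_read(text, "shift_jis", "gb2312"))
    # Unicode (UTF-8) used on both sides: the text round-trips unchanged.
    print(write_then_read(text, "utf-8", "utf-8"))
```

Any pair of matching encodings would round-trip here; the practical advantage of Unicode is that one encoding covers Japanese, Chinese, and everything else at once, so there is never a reason for the two sides to differ.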


Comments as "I Was Here" Graffiti!

by tonny.xu on Jan.08, 2006, under Get Busy Dying

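Blogs have finally caught on. Just look at who is on Sina Blog these days: the "celebrities" who used to hide behind a veil of mystery are opening up one after another to talk with you at close range. All kinds of people keep starting new blogs, and before blogging has even run its course, podcasting has already arrived.

This afternoon I was wandering around online, reading friends' blogs and leaving comments one by one, when a thought suddenly struck me: how is this any different from carving "TOTO was here" on someone's front door? Oh, the difference is that they actually enjoy it and take pride in it! On Sina or CSDN, once a post is pinned to the top, a thousand people or more will carve their words on your door! Heh, I suppose that counts as proof that someone has looked at the door. After all, hasn't nearly every brick of the Great Wall been carved with someone's mark? Which also proves that the Great Wall is better known than those blogs!

And isn't everyone's motive simply to make themselves known to others?

Am I not the same? Are you not the same?

Hahahaha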


The Most Anticipated Movies of 2006

by tonny.xu on Jan.08, 2006, under Movies

Here we are!

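2005 is behind us. A turbulent 2005 left plenty of memories, but none that made me happy when it comes to movies. On one hand, the domestic market was weak and so-called blockbusters filled everyone's view; artists like Zhang Yimou still set sky-high box-office records with commercial films, but something worth savoring was missing. Hong Kong cinema slowly began returning to thrilling, intense action films. Donnie Yen's "SPL: Sha Po Lang" at the end of the year was pretty good, but unfortunately it was cut for the mainland market. On the other hand, after the few huge-budget, huge-return films of 2003 and 2004, the European and American market had basically no new films worth savoring.

In 2006, everything is different.

My movie year should be counted from December, because Christmas and the summer are always the prime seasons in Europe and America, while our own market has not yet managed to add a prime Spring Festival season. Judging from the domestic situation so far, "Perhaps Love" (《如果爱》), which really moved me, should count as a highlight of the domestic market in 2005: Jacky Cheung's singing is as perfect as ever, and the performances of Takeshi Kaneshiro and Zhou Xun are commendable. Not bad!

Looking at Europe and America, there are many films worth looking forward to in 2006: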

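  • January 12, 2006, domestic release: "King Kong". Peter Jackson of "The Lord of the Rings" proves his worth once again. Really looking forward to it.
  • January 25, 2006: "Huo Yuanjia" (《霍元甲》), financed by and starring Jet Li, opens nationwide. Heh, a classic heroic figure... worth seeing, though it will almost be Chinese New Year by then.
  • March 8, 2006 release: "The Chronicles of Narnia: The Lion, the Witch and the Wardrobe", another milestone in the fantasy genre. How could one not look forward to a film that witnesses history like this?
  • "Ice Age 2": an absolute animated blockbuster. It opens in the US on March 31; let's wait until May.
  • "Munich" probably won't get a public release here, but it is absolutely worth looking forward to. "Mossad" has long been a byword for secret agents, just like the KGB, and this is such a sensitive historical event. With the peace process under way in Israel, a film like this is absolutely worth looking forward to!
  • "Transporteur 2" is a film in which the French learn Chinese kung fu; worth a look. It is expected to be imported in the first quarter.
  • May 5, 2006: "Mission: Impossible III". Tom Cruise's charm hasn't faded, and the trailers released so far already have many people craning their necks for him. Having seen the first two, let's just keep going with the third :) Reportedly it may get a simultaneous worldwide premiere.
  • May 19, 2006: "The Da Vinci Code" opens in the US; whether it will be released domestically is still unknown. Another major work from Tom Hanks after "The Terminal".
  • May 26, 2006: "X-Men III" opens in the US. The first two were good; worth looking forward to.
  • "Garfield 2" opens in the US on June 23. I really like Garfield, but the domestic situation is unknown.
  • "Superman Returns" opens in the US on June 30; details unknown.
  • "Pirates of the Caribbean: Dead Man's Chest": haha, the pirate sequel, again with Orlando Bloom, and Johnny Depp too, hahaha...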

The rest are still being checked and awaited...


Boiled 2005 (水煮2005)

by tonny.xu on Dec.30, 2005, under Get Busy Dying

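2005 is about to end. A lot happened this year, and a year-end summary has to be done after all. Thinking back over the past year, the word "shuizhu" (水煮, "water-boiled") suddenly came to mind. Heh, come to think of it, this really was a water-boiled 2005.

With so-called shuizhu, it is only the fish slices that get boiled in water; what actually smells so good is the oil, heaps and heaps of oil. The cost is high and the price is steep too, but... is it worth it? If I were paying for it myself, twice a year or so would be about enough. And a year like this one: one such year in a lifetime is about enough too.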

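January: The ChatCafe project for Japan ended. I was a bit tired but still very happy, because I had gained a deep understanding of Flash. Even though I don't follow the latest Flash developments now, my basic understanding is sound, and I only need to look at how the APIs are used to get up to speed quickly. FCS in particular directly shaped the nature of my later work at NECST (Hangzhou). But that same month the company decided to develop its own VideoChat. I was never optimistic about this project from the start, heh, but it wasn't my decision and it wasn't my money, so I got on with it. The lesson arrived very quickly. The VideoChat project is what truly wore me out, body and mind: I knew the prospects were poor, and I knew that so few people could not build something of product quality in so little time. Sigh...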

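February: The VideoChat project continued, and I had slipped into a run-down state that even I could feel. I couldn't fall asleep at night and couldn't get up in the morning. I would go to bed at 11 p.m. and not fall asleep until 3 a.m.! Heh, during that time Zou often gave me a morning call; I was touched, thank you. It was in this period that the idea of leaving took root. Over Chinese New Year I saw the situation at my family's Internet cafe, and the fierce competition made me decide to make a push and see whether I could raise some capital to expand it. Only at a larger scale can an Internet cafe make money.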

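March: Back after the holiday, I found that my New Year state of eating, drinking, and sleeping well had vanished! So depressing. I fell into an even more indescribable restlessness, and life was a mess! Fuck it! I started contacting friends to see whether I could raise money for the Internet cafe, heh, and even fantasized about the prosperity that would follow; I did plenty of daydreaming. I started making trips to Hangzhou. On March 12 it even snowed in Hangzhou, and I passed by "Lingering Snow on the Broken Bridge" (断桥残雪); I remember that day. The next day I took a few photos that I think are pretty good, a highlight of this year.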

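April: I decided to leave NetSpeed. The Internet cafe plan went on, but after I resigned on April 15 the fundraising fell through. Heh, Jack Ma says that nowadays, if he just says he needs money, he can raise 30 million US dollars within 3 days; sadly, after 2 months I still hadn't raised 1 million. If I had known, I would have asked Jack Ma for funding :) After the fundraising failed I decided to go traveling. I had been working for almost three years and, apart from two short trips organized by the company, had never really traveled. So one evening I planned the itinerary: from Hangzhou to Shanghai, to Wuxi, to Jishou in Hunan, to Fenghuang, to Guilin, to Longji, to Yangshuo... Actually carrying out this travel plan was truly a joy! (I forgot: when I resigned I made myself a vow: "I will never let nobody and nothing turn me into no cripple." I learned it from the movie "Ray"; it moved me deeply.)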

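May: Traveling. I reached Fenghuang at the end of April and stayed until May 3, then set off for Guilin and stayed a day. I met Eric, a middle-aged American and a nice guy; we traveled together to Longji and Yangshuo. During my 8 days in Fenghuang, apart from photographing the scenery and meeting other travelers, I mostly just zoned out, walking and zoning out... In Yangshuo, although I found the Li River beautiful and Xingping truly beautiful, I never felt body and mind completely merging into the scenery. I suddenly felt that I needed someone by my side to really enjoy it. I returned to Nanjing mid-month and fell ill. Damn, it had been years since I'd had an injection in the backside, and this time I got one! I began to feel I needed to find a job. At the end of May I prepared a new resume, planning to cast a wide net around Hangzhou; once there, I found that the IT industry was not even as good as Nanjing's. The housing was comfortable, the living environment was better than Nanjing's, and the urban development really impressed me, but the sky-high housing prices made me sigh. In two days I lined up three companies: Alibaba, NEC, and TATA. Damn, Alibaba said it usually takes three rounds of interviews, gave me 5 rounds, took 7 working days, and still didn't want me. Heh... Alibaba, you will regret it. The day after my first day of interviews at Alibaba, I went job hunting in Binjiang, Hangzhou, a place in the middle of nowhere. TATA in the morning, NEC in the afternoon. In the morning, after 5 minutes of talking with TATA's Indian interviewer, I felt as if I were short of oxygen, and my confidence in my English collapsed completely! That is supposed to be English? I gave up of my own accord. In the afternoon I went to NEC; after the written test I talked with Ling Chen for nearly 2 hours, heh, quite pleasantly. I received the offer the next day, but to wait for news from Alibaba I held out and didn't accept Ling Chen's offer right away. At the end of the month I finally couldn't help asking Alibaba, who said they didn't want me, even though the HR girl had promised to give me an answer either way. After a brief think, I went to NEC.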

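June: I gave up my apartment in Nanjing, had a few farewell dinners with my Nanjing friends, and headed straight for Hangzhou. The theme: love and work; the subtitle: going home. After settling in, I officially started work on June 15. Off to work!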

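July: I moved to Binjiang, living in a villager's house rented by Lao Pan. The conditions weren't great, but the excitement of the new job hadn't worn off. Everything felt different. I ate well and slept well, and my eyes opened by themselves at 7 a.m. During this time, at a lecture given by NECST's CTO at Zhejiang University, I ran into her by chance; she struck me as heavenly, and I have been unable to pull myself out since. Sigh, I am an emotional man after all and still can't carry myself with ease.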

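August: On the 1st, she joined the company too!!!! Only on the evening of the second day did I realize that she was the one I had run into at the lecture!! I was overjoyed, but my brain stopped obeying me from then on. The whole of August was hope, failure, hope again, failure again...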

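September: The first real project began, running from August 28 to delivery on September 20. Damn, that was not work fit for a human! Countless hours of overtime, and two things fell behind: my Japanese study was interrupted, and my relationship with her hit a low point (the feeling of hitting a low before anything had even begun). I felt very, very, very bad.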

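October: After delivery I finally rested for about a week. I went home once during the holiday and then watched DVDs until I lost track of day and night~ She said she had plans, and I didn't keep trying, so I can only slap myself twice. Next came preparation for a new version. Development of the new version started on October 10, and this time experience played an important role; heh, I got a taste of both the hardships and the ease of being a leader. I barely spoke to her all October! I should slap myself twice more.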

November: No progress with her; the project went smoothly; I was about to give up on Japanese.

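December: Feel Extremely Blue! The project continued to go smoothly. Communication with the Japanese side stayed at a normal level. At one point I solved a problem for some of the guys in Department 2 and felt a great sense of achievement; at that moment I felt so happy. It had been a long, long, long time since I'd had that feeling. I wanted to ask her out for Christmas; heh, it fell through. Christmas alone, my first Christmas in Hangzhou, spent by myself at home. She was off being happy. The New Year is almost here, heh... and so a year has passed...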

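Blogging is in full swing; across the country, it seems that anyone online who doesn't keep a blog isn't really on the Internet. I use my blog as a running diary.

Boiled 2005, my boiled 2005, so damn unpleasant! Feel Extremely Blue!

No more boiling for another year!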


Web Internationalization [I18N]: Part IV

by tonny.xu on Dec.20, 2005, under Get Busy Living

Having discussed the most basic concepts of character encoding in the previous 3 parts, I'd now like to discuss how a word you type through an input method ends up correctly shown on the screen.

First, let's take a look at a picture from my "Web I18N" presentation.

Figure: How a character typed through an input method is shown correctly on your screen

As shown above, a character you type is shown correctly on your screen after roughly 5 steps.

  • Step 1: Type it via your input method.
    The input method links two kinds of mapping: mapping the typed English letters to characters in your own language, because the keyboard usually supports only the 26 English letters, and mapping those characters to their encoded codes.
  • Step 2: The text editor accepts what you typed in your own language and stores it with the correct text file encoding. A text file has an encoding mark at the beginning of the file. Usually the first 2 bytes are "0xFE 0xFF" if you store the text as "Unicode (Big Endian)" and "0xFF 0xFE" for "Unicode (Little Endian)", while the first 3 bytes are "0xEF 0xBB 0xBF" for UTF-8.
  • Step 3: The text file is transferred from the server to the client browser, and the server and client exchange language information. Here the HTTP protocol plays an important role. Taking a deep look at the HTTP protocol will help you a lot.
  • Step 4: After the browser receives the HTML file, it maps the character codes inside it to the font.
    Which code page (and therefore which glyphs) to use is indicated by the HTTP response or by an HTML tag that contains the charset information.
  • Step 5: The browser invokes the system API to render the HTML with the correct characters in your own language. At this point the end user sees the right characters and can read them. (Maybe he cannot actually read them, but the browser thinks it has shown the right characters.)
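
Steps 3 and 4 hinge on the charset that travels with the HTTP response. As a rough sketch (not a description of what any real browser does), the Python below pulls the charset out of a Content-Type header value and uses it to decode the body. The function names and the `iso-8859-1` default are assumptions made for this example; the parsing itself relies on the standard library's `email.message.Message.get_content_charset`.

```python
from email.message import Message


def charset_from_content_type(header_value, default="iso-8859-1"):
    """Extract the charset parameter from a Content-Type header value.

    Returns the lowercased charset, or `default` (an assumption of this
    sketch) when the header does not name one.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset() or default


def decode_body(body, content_type):
    """Decode response bytes with the charset the server declared."""
    charset = charset_from_content_type(content_type)
    return body.decode(charset, errors="replace")


if __name__ == "__main__":
    body = "你好".encode("gb2312")
    # Header and body agree: the text comes out intact.
    print(decode_body(body, "text/html; charset=GB2312"))
    # Header claims Shift_JIS but the body is GB2312: scrambled output.
    print(decode_body(body, "text/html; charset=Shift_JIS"))
```

The second call shows why the declared charset and the real encoding of the file have to match: the decoder can only trust what the header says.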

By now you should understand the full workflow of a character, from being typed to being shown to the end user.

Next, we will see how to avoid the traps that may lie in this flow, and give you a complete solution.


Web Internationalization [I18N]: Part III

by tonny.xu on Dec.03, 2005, under Get Busy Living

Previously, in Part II, we discussed the basic terms. With these basic concepts in hand, we can discuss the complex part: character encoding.

//Today, I finally resolved my last few points of confusion about character encoding and can continue writing this article. :p

//Continued @2005.12.18

Now, let's see what the character encoding is.

Character encoding was created at the very beginning of computer science. The early days of computing brought the first famous character encoding: ASCII. ASCII is the abbreviation of American Standard Code for Information Interchange. It defines how the computer recognizes the 26 English letters, along with digits, punctuation, and some control codes.

That's the exact point!

Character encoding means giving each character a unique number that every one of us accepts and every computer accepts. "Accept" means that when the computer reads that code in binary, it knows which character it represents.
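
As a quick illustration of "a unique number per character", Python's built-in `ord` and `chr` convert between a character and its number. Note the assumption: Python reports Unicode code points, which coincide with ASCII for English letters. The helper name `describe` is made up for this example.

```python
def describe(ch):
    """Return (decimal code, hex code, binary code) for one character."""
    code = ord(ch)                      # character -> number
    return code, hex(code), format(code, "b")


if __name__ == "__main__":
    for ch in ["A", "a", "0", "中", "あ"]:
        print(ch, describe(ch))
    # number -> character: the mapping works in both directions
    print(chr(65))
```

Both sides "accepting" the code simply means that `chr(ord(ch))` gives back the same character on every machine.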

After computers spread to the whole world, people whose native language is not English had to face the fact that they could not add their own unique numbers to ASCII, because ASCII uses 7 bits to mark its characters, which its designers believed would be enough for American users. Even when ASCII is extended to 8 bits, that is, 1 byte, it can still hold only 256 characters. How could people outside America use their own characters on a computer? That is the right question, and it brought an answer: double bytes.

Yeah, double bytes, the root of scrambled text (messed-up code).

Yet they are also the key to resolving scrambled text.
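
A short Python sketch of both halves of that story: a Chinese or Japanese character does not fit in ASCII, a double-byte code page stores it in 2 bytes, and the very same 2 bytes mean something else under a different double-byte code page. The helper names are invented for this example.

```python
def fits_in_ascii(ch):
    """True if the character has a code in 7-bit ASCII."""
    try:
        ch.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


def byte_length(ch, encoding):
    """Number of bytes the character occupies in the given encoding."""
    return len(ch.encode(encoding))


if __name__ == "__main__":
    print(fits_in_ascii("A"), fits_in_ascii("中"))
    print(byte_length("中", "gb2312"), byte_length("あ", "shift_jis"))
    # The same two bytes read with the Japanese code page instead of the
    # Chinese one no longer give the original character.
    raw = "中".encode("gb2312")
    print(raw.decode("shift_jis", errors="replace"))
```

Double bytes solve the capacity problem, but because every region defined its own table, the bytes alone do not say which table to use.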


Web Internationalization [I18N]: Part II

by tonny.xu on Nov.25, 2005, under Get Busy Living

Today, I'd like to talk about "character encodings".

Character encodings are as old as computer science itself. The ASCII table is one of the most famous and most popular character encodings.

So, what does character encoding mean? A character encoding is an organization of numeric codes that represent the characters of a character set in memory.

There are many character encodings in this world because many people have tried to express their own languages or characters on a computer.

Before we dig deep into character encoding, we need to understand some basic concepts.

  • Character: According to the glossary of the Unicode Standard [Unicode Standard 4.0], a character is the smallest component of a written language that has semantic value.
  • Phoneme: A phoneme is a minimally distinct sound in the context of a particular spoken language. We can also say that the phoneme is the unit of aural rendering. In some scripts characters relate closely to phonemes, while in others they relate closely to meaning. There is no one-to-one correspondence between characters and phonemes.
  • Glyph: Glyphs are defined by ISO/IEC 9541-1 [ISO/IEC 9541-1] as "a recognizable abstract graphic symbol which is independent of a specific design". A glyph is usually also referred to as the unit of visual rendering. There is no one-to-one correspondence between characters and glyphs.
  • Unit of input: In keyboard input, it is NOT ALWAYS the case that keystrokes and input characters correspond one-to-one. Only a few languages, like English, map keystrokes to characters one-to-one; many other languages use far more complex writing systems. It is impossible to fit all their characters on a keyboard, so they must rely on some kind of input method that transforms a keystroke sequence into a character sequence.
  • Unit of collation: String comparison is used in sorting and searching, which are based on collation units rather than characters. Collation units do not have a one-to-one relation with characters. For example, in traditional Spanish sorting, the character sequences 'ch' and 'll' are treated as atomic collation units.
  • Unit of storage: All information is kept in physical storage, a basic principle of CS. As usual, we are dealing with bits and bytes, and this is the most complex part. A frequent error in specifications and implementations is equating characters with units of physical storage. The mapping between the two is our subject, usually called the character encoding.
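
The last point, that a character is not the same thing as a unit of storage, is easy to see in Python. The sketch below counts how many bytes the same 3 characters occupy under several encodings; the function name `storage_sizes` and the particular list of encodings are choices made for this example only.

```python
def storage_sizes(text, encodings=("utf-8", "utf-16-le", "utf-32-le", "shift_jis")):
    """Map each encoding name to the number of bytes `text` occupies in it."""
    return {enc: len(text.encode(enc)) for enc in encodings}


if __name__ == "__main__":
    text = "日本語"
    print(len(text))             # number of characters
    print(storage_sizes(text))   # number of bytes, per encoding
```

The character count stays the same while the byte count changes with the encoding, which is why code that treats "one byte" as "one character" breaks as soon as the text leaves ASCII.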

The above terms are the basic concepts for understanding character encoding.

This is the end of Part II.


Web Internationalization [I18N]: Part I

by tonny.xu on Nov.22, 2005, under Get Busy Living

These days, I'm preparing for the first paper contest at my company. Some may consider this kind of contest too naive to enter, but you know what? I think this kind of contest will drive most of the newer employees to improve their skills in a particular field.

As a member of the company's Technical Center, I'd like to write something about internationalization (a.k.a. I18N) on the Web. Web I18N has been discussed for a long time, and many people have already drawn conclusions or written guides to it. I'd like to look through them and build a similar conclusion, one that many experts have already recommended. For me, it's a chance to let everybody know how I did the research and how I build a nice presentation.

Some time ago, I had considered writing about AJAX, which is becoming more and more popular on the Internet, but 2 other guys had already decided to write on that topic. So I chose to give it up and find myself another topic, which is this one - Web I18N.

In my paper, I'd like to cover these areas:

  • Character Encodings
  • Character Escaping
  • Unicode
  • Normalization
  • How to build a Web I18N site based on J2EE technologies
  • How to build a Web I18N site based on ASP.NET technologies
  • How to build a Web I18N site based on AJAX technologies
  • Finally, some demos

Ladies and Gentlemen, it's Show Time!

