How many bytes do Chinese characters, letters, and digits occupy in Java?

待我称王封你为后i 2021-09-28 00:40

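1. Key point:

The number of bytes depends on the encoding. ASCII letters and digits take 1 byte each in UTF-8, GBK, and GB2312. A Chinese character takes 2 bytes in GBK and GB2312. In UTF-8 the count is not fixed: a Chinese character in the Basic Multilingual Plane (which covers the commonly used ones) takes 3 bytes, and one in the supplementary planes, such as U+20001, takes 4 bytes.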
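A minimal sketch for checking these byte counts yourself, one character at a time. The class name `ByteLengths` and the method `byteLength` are hypothetical names introduced for illustration; the sketch also assumes the JDK in use ships the GBK charset, as standard desktop and server JDKs do.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
final class ByteLengths {

    // Number of bytes needed to encode `text` in the named charset.
    static int byteLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        // A letter, a digit, and the Chinese character 名 (written as an
        // escape so the result does not depend on the source file encoding).
        String[] samples = {"a", "7", "\u540D"};
        String[] charsets = {"UTF-8", "GBK"};
        for (String sample : samples) {
            for (String cs : charsets) {
                System.out.println(sample + " in " + cs + ": "
                        + byteLength(sample, cs) + " byte(s)");
            }
        }
    }
}
```

Passing a `Charset` object to `getBytes` instead of a charset name avoids the checked `UnsupportedEncodingException`, which is why this sketch needs no `throws` clause.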

2. Without further ado, here is the program:

    import java.io.UnsupportedEncodingException;
    import org.junit.Test; // assumes JUnit 4; with JUnit 5 use org.junit.jupiter.api.Test

    public class EncodingLengthTest { // class name is illustrative

        @Test
        public void test1() throws UnsupportedEncodingException {
            // A common Chinese character in the Basic Multilingual Plane (U+540D).
            String a = "名";
            System.out.println("UTF-8 encoded length: " + a.getBytes("UTF-8").length);
            System.out.println("GBK encoded length: " + a.getBytes("GBK").length);
            System.out.println("GB2312 encoded length: " + a.getBytes("GB2312").length);
            System.out.println("==========================================");
            // Note: this is the 7-character ASCII literal "0x20001", not the code point
            // U+20001, so it shows the cost of letters and digits: 1 byte each.
            String c = "0x20001";
            System.out.println("UTF-8 encoded length: " + c.getBytes("UTF-8").length);
            System.out.println("GBK encoded length: " + c.getBytes("GBK").length);
            System.out.println("GB2312 encoded length: " + c.getBytes("GB2312").length);
            System.out.println("==========================================");
            // U+20001 lies in a supplementary plane, so Java stores it as a
            // surrogate pair (two char values).
            char[] arr = Character.toChars(0x20001);
            String s = new String(arr);
            System.out.println("char array length:" + arr.length);
            System.out.println("content:| " + s + " |");
            System.out.println("String length:" + s.length());
            System.out.println("UTF-8 encoded length: " + s.getBytes("UTF-8").length);
            System.out.println("GBK encoded length: " + s.getBytes("GBK").length);
            System.out.println("GB2312 encoded length: " + s.getBytes("GB2312").length);
            System.out.println("==========================================");
        }
    }

3. Result:

    UTF-8 encoded length: 3
    GBK encoded length: 2
    GB2312 encoded length: 2
    ==========================================
    UTF-8 encoded length: 7
    GBK encoded length: 7
    GB2312 encoded length: 7
    ==========================================
    char array length:2
    content:| ? |
    String length:2
    UTF-8 encoded length: 4
    GBK encoded length: 1
    GB2312 encoded length: 1
    ==========================================
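The last group deserves a note: a GBK or GB2312 length of 1 does not mean the character fits in one byte. Neither charset can represent U+20001, and `String.getBytes` silently substitutes the charset's default replacement bytes for unmappable characters. The `?` in the content line may have a similar cause (a console encoding or font that cannot show the character), although the source does not say. The sketch below makes the substitution visible; the class and method names are hypothetical, and it assumes the JDK provides the GBK and GB18030 charsets.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
final class SupplementaryCheck {

    // Build a String holding a single Unicode code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // True if the charset can represent all of `text` without substitution.
    static boolean canEncode(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    // Number of bytes produced when encoding `text` in the named charset.
    static int byteLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        String s = fromCodePoint(0x20001);

        // length() counts UTF-16 code units; codePointCount counts characters.
        System.out.println("char count: " + s.length());
        System.out.println("code point count: " + s.codePointCount(0, s.length()));

        // GBK cannot represent this character, so getBytes substitutes
        // the encoder's replacement bytes instead of failing.
        System.out.println("GBK can encode: " + canEncode(s, "GBK"));
        byte[] gbk = s.getBytes(Charset.forName("GBK"));
        System.out.println("GBK first byte: 0x" + Integer.toHexString(gbk[0] & 0xFF));

        // GB18030 covers all of Unicode and can encode the character.
        System.out.println("GB18030 can encode: " + canEncode(s, "GB18030"));
        System.out.println("GB18030 byte length: " + byteLength(s, "GB18030"));
    }
}
```

Calling `canEncode` before trusting a byte count is a cheap way to tell a real encoding from a substituted one.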

For background on Unicode encoding, see this blog post: https://blog.csdn.net/hezh1994/article/details/78899683
