Comparing the Efficiency of Five String Concatenation Methods

╰+哭是因爲堅強的太久メ 2024-03-27 18:21

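Strings are usually concatenated with "+", but "+" does not hold up when processing large volumes of data. Java offers the following five ways to concatenate strings. Each has its strengths and weaknesses, so pick the one that fits the job.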

  1. The plus sign "+"
  2. The String concat() method
  3. The StringUtils.join() method
  4. The StringBuffer append() method
  5. The StringBuilder append() method
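To make the five options concrete, here is a minimal sketch that builds the same string each way. The class name ConcatWays and its method names are made up for illustration. StringUtils.join() comes from Apache Commons Lang, so to keep the sketch free of external dependencies it substitutes the JDK's String.join(); that is a stand-in, not the method measured in this article.

```java
import java.util.Arrays;
import java.util.List;

class ConcatWays {
    // 1. The "+" operator.
    static String viaPlus(List<String> parts) {
        String result = "";
        for (String part : parts) {
            result = result + part;
        }
        return result;
    }

    // 2. String.concat().
    static String viaConcat(List<String> parts) {
        String result = "";
        for (String part : parts) {
            result = result.concat(part);
        }
        return result;
    }

    // 3. Assumption: JDK String.join() used in place of StringUtils.join(parts, "").
    static String viaJoin(List<String> parts) {
        return String.join("", parts);
    }

    // 4. StringBuffer.append().
    static String viaStringBuffer(List<String> parts) {
        StringBuffer buffer = new StringBuffer();
        for (String part : parts) {
            buffer.append(part);
        }
        return buffer.toString();
    }

    // 5. StringBuilder.append().
    static String viaStringBuilder(List<String> parts) {
        StringBuilder builder = new StringBuilder();
        for (String part : parts) {
            builder.append(part);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "c");
        // Every line prints "abc".
        System.out.println(viaPlus(parts));
        System.out.println(viaConcat(parts));
        System.out.println(viaJoin(parts));
        System.out.println(viaStringBuffer(parts));
        System.out.println(viaStringBuilder(parts));
    }
}
```

All five produce the same result; they differ only in how much work is done along the way.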

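A simple test program measured the time cost for runs from 100 up to 900,000 iterations, summarized in the table below:

[Image: table of time costs]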

From this we can see:

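  1. Method 1, "+" concatenation, and method 2, String concat(), suit small amounts of data. The code is concise, and "+" matches the way we normally write and read code.
  2. Method 3, StringUtils.join(), suits turning an ArrayList into a string. Even 900,000 entries take only 68 ms, and it saves writing the loop that reads the ArrayList.
  3. Method 4, StringBuffer append(), and method 5, StringBuilder append(), are essentially the same: both inherit from AbstractStringBuilder (StringBuffer additionally synchronizes its methods). They are the most efficient, and are the best choice for bulk data processing.
  4. Method 1, "+" concatenation, and method 2, String concat(), are expensive in both time and space (see the analysis at the end of this article) and should not be used for bulk data processing.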

Source code, for reference

    package cnblogs.twzheng.lab2;

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.commons.lang3.StringUtils;

    /**
     * @author Tan Wenzheng
     */
    public class TestString {

        // Number of concatenations per test; change this value for each run.
        private static final int max = 100;

        public void testPlus() {
            System.out.println(">>> testPlus() <<<");
            String str = "";
            long start = System.currentTimeMillis();
            for (int i = 0; i < max; i++) {
                str = str + "a";
            }
            long end = System.currentTimeMillis();
            long cost = end - start;
            System.out.println(" {str + \"a\"} cost=" + cost + " ms");
        }

        public void testConcat() {
            System.out.println(">>> testConcat() <<<");
            String str = "";
            long start = System.currentTimeMillis();
            for (int i = 0; i < max; i++) {
                str = str.concat("a");
            }
            long end = System.currentTimeMillis();
            long cost = end - start;
            System.out.println(" {str.concat(\"a\")} cost=" + cost + " ms");
        }

        public void testJoin() {
            System.out.println(">>> testJoin() <<<");
            long start = System.currentTimeMillis();
            List<String> list = new ArrayList<String>();
            for (int i = 0; i < max; i++) {
                list.add("a");
            }
            // cost1 covers filling the list; cost covers only the join itself.
            long end1 = System.currentTimeMillis();
            long cost1 = end1 - start;
            StringUtils.join(list, "");
            long end = System.currentTimeMillis();
            long cost = end - end1;
            System.out.println(" {list.add(\"a\")} cost1=" + cost1 + " ms");
            System.out.println(" {StringUtils.join(list, \"\")} cost=" + cost + " ms");
        }

        public void testStringBuffer() {
            System.out.println(">>> testStringBuffer() <<<");
            long start = System.currentTimeMillis();
            StringBuffer strBuffer = new StringBuffer();
            for (int i = 0; i < max; i++) {
                strBuffer.append("a");
            }
            strBuffer.toString();
            long end = System.currentTimeMillis();
            long cost = end - start;
            System.out.println(" {strBuffer.append(\"a\")} cost=" + cost + " ms");
        }

        public void testStringBuilder() {
            System.out.println(">>> testStringBuilder() <<<");
            long start = System.currentTimeMillis();
            StringBuilder strBuilder = new StringBuilder();
            for (int i = 0; i < max; i++) {
                strBuilder.append("a");
            }
            strBuilder.toString();
            long end = System.currentTimeMillis();
            long cost = end - start;
            System.out.println(" {strBuilder.append(\"a\")} cost=" + cost + " ms");
        }

        // Entry point: runs each test once, in the order the results are listed.
        public static void main(String[] args) {
            TestString test = new TestString();
            test.testPlus();
            test.testConcat();
            test.testJoin();
            test.testStringBuffer();
            test.testStringBuilder();
        }
    }

Test results:

  1. 100 iterations, private static final int max = 100;

    >>> testPlus() <<<
    {str + "a"} cost=0 ms
    >>> testConcat() <<<
    {str.concat("a")} cost=0 ms
    >>> testJoin() <<<
    {list.add("a")} cost1=0 ms
    {StringUtils.join(list, "")} cost=20 ms
    >>> testStringBuffer() <<<
    {strBuffer.append("a")} cost=0 ms
    >>> testStringBuilder() <<<
    {strBuilder.append("a")} cost=0 ms

  2. 1,000 iterations, private static final int max = 1000;

    >>> testPlus() <<<
    {str + "a"} cost=10 ms
    >>> testConcat() <<<
    {str.concat("a")} cost=0 ms
    >>> testJoin() <<<
    {list.add("a")} cost1=0 ms
    {StringUtils.join(list, "")} cost=20 ms
    >>> testStringBuffer() <<<
    {strBuffer.append("a")} cost=0 ms
    >>> testStringBuilder() <<<
    {strBuilder.append("a")} cost=0 ms

  3. 10,000 iterations, private static final int max = 10000;

    >>> testPlus() <<<
    {str + "a"} cost=150 ms
    >>> testConcat() <<<
    {str.concat("a")} cost=70 ms
    >>> testJoin() <<<
    {list.add("a")} cost1=0 ms
    {StringUtils.join(list, "")} cost=30 ms
    >>> testStringBuffer() <<<
    {strBuffer.append("a")} cost=0 ms
    >>> testStringBuilder() <<<
    {strBuilder.append("a")} cost=0 ms

  4. 100,000 iterations, private static final int max = 100000;

    >>> testPlus() <<<
    {str + "a"} cost=4198 ms
    >>> testConcat() <<<
    {str.concat("a")} cost=1862 ms
    >>> testJoin() <<<
    {list.add("a")} cost1=21 ms
    {StringUtils.join(list, "")} cost=49 ms
    >>> testStringBuffer() <<<
    {strBuffer.append("a")} cost=10 ms
    >>> testStringBuilder() <<<
    {strBuilder.append("a")} cost=10 ms

  5. 200,000 iterations, private static final int max = 200000;

    >>> testPlus() <<<
    {str + "a"} cost=17196 ms
    >>> testConcat() <<<
    {str.concat("a")} cost=7653 ms
    >>> testJoin() <<<
    {list.add("a")} cost1=20 ms
    {StringUtils.join(list, "")} cost=51 ms
    >>> testStringBuffer() <<<
    {strBuffer.append("a")} cost=20 ms
    >>> testStringBuilder() <<<
    {strBuilder.append("a")} cost=16 ms

  6. 500,000 iterations, private static final int max = 500000;

    >>> testPlus() <<<
    {str + "a"} cost=124693 ms
    >>> testConcat() <<<
    {str.concat("a")} cost=49439 ms
    >>> testJoin() <<<
    {list.add("a")} cost1=21 ms
    {StringUtils.join(list, "")} cost=50 ms
    >>> testStringBuffer() <<<
    {strBuffer.append("a")} cost=20 ms
    >>> testStringBuilder() <<<
    {strBuilder.append("a")} cost=10 ms

  7. 900,000 iterations, private static final int max = 900000;

    >>> testPlus() <<<
    {str + "a"} cost=456739 ms
    >>> testConcat() <<<
    {str.concat("a")} cost=186252 ms
    >>> testJoin() <<<
    {list.add("a")} cost1=20 ms
    {StringUtils.join(list, "")} cost=68 ms
    >>> testStringBuffer() <<<
    {strBuffer.append("a")} cost=30 ms
    >>> testStringBuilder() <<<
    {strBuilder.append("a")} cost=24 ms

A look at the source code, with a brief analysis

The source code for String concat(), StringBuffer, and StringBuilder can all be found in the Java library, and is worth studying when you have the time.

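1. Each call to concat() is in effect one array copy. The copy itself is a fast in-memory operation, but every call copies the entire string built so far, and the final return statement creates a new String object. In a loop, that repeated copying and allocation is what limits the speed of concat.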

    public String concat(String str) {
        int otherLen = str.length();
        if (otherLen == 0) {
            return this;
        }
        int len = value.length;
        char buf[] = Arrays.copyOf(value, len + otherLen);
        str.getChars(buf, len);
        return new String(buf, true);
    }
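The two return paths in concat() can be observed from the outside. The following is a small sketch (the class and method names are invented for illustration) that checks by reference comparison whether concat() handed back the receiver or a new object.

```java
class ConcatIdentity {
    // True when concat() returned the receiver itself rather than a new String.
    static boolean returnsSameObject(String base, String suffix) {
        return base.concat(suffix) == base;
    }

    public static void main(String[] args) {
        String base = "abc";
        // Empty argument: concat() takes the early "return this" path.
        System.out.println(returnsSameObject(base, ""));  // prints true
        // Non-empty argument: concat() allocates and returns a new String.
        System.out.println(returnsSameObject(base, "d")); // prints false
    }
}
```

In a loop that appends a non-empty string every time, the second path is always taken, so each iteration allocates a fresh String.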

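2. The append methods of StringBuffer and StringBuilder are both inherited from AbstractStringBuilder. The whole logic only grows the character array and copies into it, and no new String object is created along the way, which is why it is fast. Once concatenation is finished, the program calls strBuffer.toString() to obtain the final string.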

    /**
     * Appends the specified string to this character sequence.
     * <p>
     * The characters of the {@code String} argument are appended, in
     * order, increasing the length of this sequence by the length of the
     * argument. If {@code str} is {@code null}, then the four
     * characters {@code "null"} are appended.
     * <p>
     * Let <i>n</i> be the length of this character sequence just prior to
     * execution of the {@code append} method. Then the character at
     * index <i>k</i> in the new character sequence is equal to the character
     * at index <i>k</i> in the old character sequence, if <i>k</i> is less
     * than <i>n</i>; otherwise, it is equal to the character at index
     * <i>k-n</i> in the argument {@code str}.
     *
     * @param str a string.
     * @return a reference to this object.
     */
    public AbstractStringBuilder append(String str) {
        if (str == null) str = "null";
        int len = str.length();
        ensureCapacityInternal(count + len);
        str.getChars(0, len, value, count);
        count += len;
        return this;
    }

    /**
     * This method has the same contract as ensureCapacity, but is
     * never synchronized.
     */
    private void ensureCapacityInternal(int minimumCapacity) {
        // overflow-conscious code
        if (minimumCapacity - value.length > 0)
            expandCapacity(minimumCapacity);
    }

    /**
     * This implements the expansion semantics of ensureCapacity with no
     * size check or synchronization.
     */
    void expandCapacity(int minimumCapacity) {
        int newCapacity = value.length * 2 + 2;
        if (newCapacity - minimumCapacity < 0)
            newCapacity = minimumCapacity;
        if (newCapacity < 0) {
            if (minimumCapacity < 0) // overflow
                throw new OutOfMemoryError();
            newCapacity = Integer.MAX_VALUE;
        }
        value = Arrays.copyOf(value, newCapacity);
    }
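Because the capacity roughly doubles on each expansion, the array is reallocated only rarely. The sketch below (class and method names are hypothetical) counts how often the value reported by the public capacity() method changes while appending one character at a time. The exact growth rule is an implementation detail that may differ between JDK versions, so the sketch prints the count rather than assuming a specific value.

```java
class CapacityGrowth {
    // Appends "a" the given number of times and counts capacity changes.
    static int countExpansions(int appends) {
        StringBuilder builder = new StringBuilder();
        int expansions = 0;
        int lastCapacity = builder.capacity();
        for (int i = 0; i < appends; i++) {
            builder.append("a");
            if (builder.capacity() != lastCapacity) {
                expansions++;
                lastCapacity = builder.capacity();
            }
        }
        return expansions;
    }

    public static void main(String[] args) {
        // A new StringBuilder starts with the documented default capacity of 16.
        System.out.println("initial capacity: " + new StringBuilder().capacity());
        System.out.println("expansions for 900000 appends: " + countExpansions(900000));
    }
}
```

With a doubling rule like the one in expandCapacity, the number of expansions grows with the logarithm of the number of appends, not with the number of appends itself.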

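3. The string plus sign "+": although the compiler optimizes it to append through StringBuilder's append method, every loop iteration creates a new StringBuilder object and calls toString to convert it back into a string, so the overhead is large.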

Note: executing one string "+" is equivalent to str = new StringBuilder(str).append("a").toString();
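The note above can be turned into a side-by-side comparison. This is an illustrative sketch with made-up names: the first method writes out the per-iteration expansion from the note by hand, and the second reuses one StringBuilder for the whole loop.

```java
class PlusDesugared {
    // The expansion described in the note: a fresh StringBuilder and a
    // fresh String on every iteration.
    static String perIterationBuilder(int n) {
        String str = "";
        for (int i = 0; i < n; i++) {
            str = new StringBuilder(str).append("a").toString();
        }
        return str;
    }

    // One StringBuilder for the whole loop and a single toString() at the end.
    static String singleBuilder(int n) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < n; i++) {
            builder.append("a");
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Both approaches build the same string; only the amount of work differs.
        System.out.println(perIterationBuilder(1000).equals(singleBuilder(1000))); // prints true
    }
}
```

Moving the builder outside the loop removes the per-iteration object creation and the repeated toString() call.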

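4. The beginning of this article tallied the time cost. With the analysis above in mind, consider the space cost as well. We often talk about trading space for time, so is this the reverse, trading time for space? It is not: here the time is spent on repeated, unnecessary work (creating new objects and calling toString). For bulk data processing, "+" and concat should never be used, because both their time and space costs are high.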
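A rough back-of-the-envelope model shows how large the gap is. The sketch below counts characters written, under two simplifying assumptions of mine: rebuilding the string on every step writes the whole string each time, and a growing buffer starts at capacity 16 and expands to 2 * capacity + 2 as in the expandCapacity source shown earlier. It ignores object allocation, the final toString(), and garbage collection. The names are hypothetical.

```java
class CopyCost {
    // Characters written when every step rebuilds the whole string:
    // step i produces a string of length i, so the total is 1 + 2 + ... + n.
    static long rebuildEveryStep(long n) {
        return n * (n + 1) / 2;
    }

    // Characters written with one growing buffer: each append writes one
    // character, and each expansion copies the characters already stored.
    static long growingBuffer(long n) {
        long capacity = 16;
        long written = 0;
        for (long count = 0; count < n; count++) {
            if (count + 1 > capacity) {
                written += count;            // copy existing contents into the larger array
                capacity = capacity * 2 + 2; // assumed growth rule
            }
            written += 1;                    // the appended character itself
        }
        return written;
    }

    public static void main(String[] args) {
        long n = 900000;
        System.out.println("rebuild every step: " + rebuildEveryStep(n));
        System.out.println("growing buffer:     " + growingBuffer(n));
    }
}
```

The first count grows with the square of the number of iterations while the second grows roughly linearly, which is consistent with the shape of the measured timings above.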
