ETL异构数据源Datax_使用数据分片提升同步速度_05

悠悠 2022-10-09 03:04 325阅读 0赞

文章目录

              1. 构建json,添加数据分片
              1. Mysql数据清除
              1. 数据分片前后对比
1. 构建json,添加数据分片

在这里插入图片描述

  1. {
  2. "job": {
  3. "setting": {
  4. "speed": {
  5. "channel": 3
  6. },
  7. "errorLimit": {
  8. "record": 0,
  9. "percentage": 0.02
  10. }
  11. },
  12. "content": [
  13. {
  14. "reader": {
  15. "name": "oraclereader",
  16. "parameter": {
  17. "column": [
  18. "IDNO",
  19. "COL1",
  20. "COL2",
  21. "COL3",
  22. "DT",
  23. "COL5",
  24. "COL6",
  25. "COL7",
  26. "COL8",
  27. "COL9",
  28. "COL10"
  29. ],
  30. splitPk:"IDNO",
  31. "connection": [
  32. {
  33. "jdbcUrl": [
  34. "jdbc:oracle:thin:@192.xxx.xxx.xxx:1521:orcl"
  35. ],
  36. "table": [
  37. "TEST.OTBS1"
  38. ]
  39. }
  40. ],
  41. "username": "username",
  42. "password": "password"
  43. }
  44. },
  45. "writer": {
  46. "name": "mysqlwriter",
  47. "parameter": {
  48. "column": [
  49. "IDNO",
  50. "COL1",
  51. "COL2",
  52. "COL3",
  53. "DT",
  54. "COL5",
  55. "COL6",
  56. "COL7",
  57. "COL8",
  58. "COL9",
  59. "COL10"
  60. ],
  61. "connection": [
  62. {
  63. "jdbcUrl": "jdbc:mysql://127.0.0.1:3306/datax?autoReconnect=true&useUnicode=true&characterEncoding=utf8&zeroDateTimeBehavior=CONVERT_TO_NULL&useSSL=false&serverTimezone=CTT&nullCatalogMeansCurrent=true",
  64. "table": [
  65. "otbs1"
  66. ]
  67. }
  68. ],
  69. "username": "root",
  70. "password": "123456"
  71. }
  72. }
  73. }
  74. ]
  75. }
  76. }
2. Mysql数据清除

清除mysql otbs1表数据

  1. truncate table otbs1;
3. 数据分片前后对比

数据分片前

  1. 2021-06-23 12:28:12.390 [job-0] INFO StandAloneJobContainerCommunicator - Total 1048576 records, 69143488 bytes | Speed 1.65MB/s, 26214 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 26.038s | All Task WaitReaderTime 8.483s | Percentage 100.00%
  2. 2021-06-23 12:28:12.402 [job-0] INFO JobContainer -
  3. 任务启动时刻 : 2021-06-23 12:27:31
  4. 任务结束时刻 : 2021-06-23 12:28:12
  5. 任务总计耗时 : 41s
  6. 任务平均流量 : 1.65MB/s
  7. 记录写入速度 : 26214rec/s
  8. 读出记录总数 : 1048576
  9. 读写失败总数 : 0
  10. channel并发3个未生效

数据分片后

  1. 2021-06-23 12:59:01.629 [job-0] INFO JobContainer -
  2. 任务启动时刻 : 2021-06-23 12:58:29
  3. 任务结束时刻 : 2021-06-23 12:59:01
  4. 任务总计耗时 : 31s
  5. 任务平均流量 : 2.20MB/s
  6. 记录写入速度 : 34952rec/s
  7. 读出记录总数 : 1048576
  8. 读写失败总数 : 0

速度相比数据分片前提升了10s

同步日志,相比数据分片前做了数据分片处理,并发3个channel处理16个任务。
channel并发3个未生效
在这里插入图片描述

  1. 2021-06-23 12:58:31.020 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
  2. 2021-06-23 12:58:31.020 [job-0] INFO JobContainer - DataX Reader.Job [oraclereader] do prepare work .
  3. 2021-06-23 12:58:31.020 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
  4. 2021-06-23 12:58:31.021 [job-0] INFO JobContainer - jobContainer starts to do split ...
  5. 2021-06-23 12:58:31.021 [job-0] INFO JobContainer - Job set Channel-Number to 3 channels.
  6. 2021-06-23 12:58:31.113 [job-0] INFO SingleTableSplitUtil - split pk [sql=SELECT * FROM ( SELECT IDNO FROM DBTEST.OTBS1 SAMPLE (0.1) WHERE (IDNO IS NOT NULL) ORDER BY DBMS_RANDOM.VALUE) WHERE ROWNUM <= 15 ORDER by IDNO ASC] is running...
  7. 2021-06-23 12:58:31.389 [job-0] INFO SingleTableSplitUtil - After split(), allQuerySql=[
  8. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (42075 <= IDNO AND IDNO < 77408)
  9. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (77408 <= IDNO AND IDNO < 187833)
  10. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (187833 <= IDNO AND IDNO < 263631)
  11. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (263631 <= IDNO AND IDNO < 349253)
  12. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (349253 <= IDNO AND IDNO < 364994)
  13. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (364994 <= IDNO AND IDNO < 434398)
  14. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (434398 <= IDNO AND IDNO < 437250)
  15. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (437250 <= IDNO AND IDNO < 516705)
  16. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (516705 <= IDNO AND IDNO < 555961)
  17. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (555961 <= IDNO AND IDNO < 578695)
  18. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (578695 <= IDNO AND IDNO < 638120)
  19. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (638120 <= IDNO AND IDNO < 655685)
  20. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (655685 <= IDNO AND IDNO < 859873)
  21. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where (859873 <= IDNO AND IDNO <= 962533)
  22. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where ((IDNO < 42075) OR (962533 < IDNO))
  23. select IDNO,COL1,COL2,COL3,DT,COL5,COL6,COL7,COL8,COL9,COL10 from TEST.OTBS1 where IDNO IS NULL
  24. ].
  25. 2021-06-23 12:58:31.390 [job-0] INFO JobContainer - DataX Reader.Job [oraclereader] splits to [16] tasks.
  26. 2021-06-23 12:58:31.394 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] splits to [16] tasks.
  27. 2021-06-23 12:58:31.431 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
  28. 2021-06-23 12:58:31.460 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
  29. 2021-06-23 12:58:31.463 [job-0] INFO JobContainer - Running by standalone Mode.
  30. 2021-06-23 12:58:31.487 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [3] channels for [16] tasks.
  31. 2021-06-23 12:58:31.508 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
  32. 2021-06-23 12:58:31.508 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.

发表评论

表情:
评论列表 (有 0 条评论,325人围观)

还没有评论,来说两句吧...

相关阅读