《手摸手带你学ClickHouse》之导入导出数据

古城微笑少年丶 2023-10-01 17:30 75阅读 0赞

本文同步发表于我的微信公众号,扫一扫文章底部的二维码或在微信搜索 chaodev 即可关注。

文章目录

      • 1、导入数据
      • 2、查询数据效率
      • 3、导出数据

前两篇文章《手摸手带你学ClickHouse》之安装部署、《手摸手带你学ClickHouse》之访问接口,详细介绍了ClickHouse单机的安装部署、ClickHouse监听端口、访问接口、设置允许其他ip访问以及可视化的客户端。本文主要内容为数据的导入导出。

1、导入数据

测试数据如下
在这里插入图片描述

解压

  1. xz -d hits_v1.tsv.xz
  2. xz -d visits_v1.tsv.xz

解压后
在这里插入图片描述

创建数据库
在这里插入图片描述

创建表

  1. CREATE TABLE tutorial.hits_v1
  2. (
  3. `WatchID` UInt64,
  4. `JavaEnable` UInt8,
  5. `Title` String,
  6. `GoodEvent` Int16,
  7. `EventTime` DateTime,
  8. `EventDate` Date,
  9. `CounterID` UInt32,
  10. `ClientIP` UInt32,
  11. `ClientIP6` FixedString(16),
  12. `RegionID` UInt32,
  13. `UserID` UInt64,
  14. `CounterClass` Int8,
  15. `OS` UInt8,
  16. `UserAgent` UInt8,
  17. `URL` String,
  18. `Referer` String,
  19. `URLDomain` String,
  20. `RefererDomain` String,
  21. `Refresh` UInt8,
  22. `IsRobot` UInt8,
  23. `RefererCategories` Array(UInt16),
  24. `URLCategories` Array(UInt16),
  25. `URLRegions` Array(UInt32),
  26. `RefererRegions` Array(UInt32),
  27. `ResolutionWidth` UInt16,
  28. `ResolutionHeight` UInt16,
  29. `ResolutionDepth` UInt8,
  30. `FlashMajor` UInt8,
  31. `FlashMinor` UInt8,
  32. `FlashMinor2` String,
  33. `NetMajor` UInt8,
  34. `NetMinor` UInt8,
  35. `UserAgentMajor` UInt16,
  36. `UserAgentMinor` FixedString(2),
  37. `CookieEnable` UInt8,
  38. `JavascriptEnable` UInt8,
  39. `IsMobile` UInt8,
  40. `MobilePhone` UInt8,
  41. `MobilePhoneModel` String,
  42. `Params` String,
  43. `IPNetworkID` UInt32,
  44. `TraficSourceID` Int8,
  45. `SearchEngineID` UInt16,
  46. `SearchPhrase` String,
  47. `AdvEngineID` UInt8,
  48. `IsArtifical` UInt8,
  49. `WindowClientWidth` UInt16,
  50. `WindowClientHeight` UInt16,
  51. `ClientTimeZone` Int16,
  52. `ClientEventTime` DateTime,
  53. `SilverlightVersion1` UInt8,
  54. `SilverlightVersion2` UInt8,
  55. `SilverlightVersion3` UInt32,
  56. `SilverlightVersion4` UInt16,
  57. `PageCharset` String,
  58. `CodeVersion` UInt32,
  59. `IsLink` UInt8,
  60. `IsDownload` UInt8,
  61. `IsNotBounce` UInt8,
  62. `FUniqID` UInt64,
  63. `HID` UInt32,
  64. `IsOldCounter` UInt8,
  65. `IsEvent` UInt8,
  66. `IsParameter` UInt8,
  67. `DontCountHits` UInt8,
  68. `WithHash` UInt8,
  69. `HitColor` FixedString(1),
  70. `UTCEventTime` DateTime,
  71. `Age` UInt8,
  72. `Sex` UInt8,
  73. `Income` UInt8,
  74. `Interests` UInt16,
  75. `Robotness` UInt8,
  76. `GeneralInterests` Array(UInt16),
  77. `RemoteIP` UInt32,
  78. `RemoteIP6` FixedString(16),
  79. `WindowName` Int32,
  80. `OpenerName` Int32,
  81. `HistoryLength` Int16,
  82. `BrowserLanguage` FixedString(2),
  83. `BrowserCountry` FixedString(2),
  84. `SocialNetwork` String,
  85. `SocialAction` String,
  86. `HTTPError` UInt16,
  87. `SendTiming` Int32,
  88. `DNSTiming` Int32,
  89. `ConnectTiming` Int32,
  90. `ResponseStartTiming` Int32,
  91. `ResponseEndTiming` Int32,
  92. `FetchTiming` Int32,
  93. `RedirectTiming` Int32,
  94. `DOMInteractiveTiming` Int32,
  95. `DOMContentLoadedTiming` Int32,
  96. `DOMCompleteTiming` Int32,
  97. `LoadEventStartTiming` Int32,
  98. `LoadEventEndTiming` Int32,
  99. `NSToDOMContentLoadedTiming` Int32,
  100. `FirstPaintTiming` Int32,
  101. `RedirectCount` Int8,
  102. `SocialSourceNetworkID` UInt8,
  103. `SocialSourcePage` String,
  104. `ParamPrice` Int64,
  105. `ParamOrderID` String,
  106. `ParamCurrency` FixedString(3),
  107. `ParamCurrencyID` UInt16,
  108. `GoalsReached` Array(UInt32),
  109. `OpenstatServiceName` String,
  110. `OpenstatCampaignID` String,
  111. `OpenstatAdID` String,
  112. `OpenstatSourceID` String,
  113. `UTMSource` String,
  114. `UTMMedium` String,
  115. `UTMCampaign` String,
  116. `UTMContent` String,
  117. `UTMTerm` String,
  118. `FromTag` String,
  119. `HasGCLID` UInt8,
  120. `RefererHash` UInt64,
  121. `URLHash` UInt64,
  122. `CLID` UInt32,
  123. `YCLID` UInt64,
  124. `ShareService` String,
  125. `ShareURL` String,
  126. `ShareTitle` String,
  127. `ParsedParams` Nested(
  128. Key1 String,
  129. Key2 String,
  130. Key3 String,
  131. Key4 String,
  132. Key5 String,
  133. ValueDouble Float64),
  134. `IslandID` FixedString(16),
  135. `RequestNum` UInt32,
  136. `RequestTry` UInt8
  137. )
  138. ENGINE = MergeTree()
  139. PARTITION BY toYYYYMM(EventDate)
  140. ORDER BY (CounterID, EventDate, intHash32(UserID))
  141. SAMPLE BY intHash32(UserID)
  142. SETTINGS index_granularity = 8192;
  143. CREATE TABLE tutorial.visits_v1
  144. (
  145. `CounterID` UInt32,
  146. `StartDate` Date,
  147. `Sign` Int8,
  148. `IsNew` UInt8,
  149. `VisitID` UInt64,
  150. `UserID` UInt64,
  151. `StartTime` DateTime,
  152. `Duration` UInt32,
  153. `UTCStartTime` DateTime,
  154. `PageViews` Int32,
  155. `Hits` Int32,
  156. `IsBounce` UInt8,
  157. `Referer` String,
  158. `StartURL` String,
  159. `RefererDomain` String,
  160. `StartURLDomain` String,
  161. `EndURL` String,
  162. `LinkURL` String,
  163. `IsDownload` UInt8,
  164. `TraficSourceID` Int8,
  165. `SearchEngineID` UInt16,
  166. `SearchPhrase` String,
  167. `AdvEngineID` UInt8,
  168. `PlaceID` Int32,
  169. `RefererCategories` Array(UInt16),
  170. `URLCategories` Array(UInt16),
  171. `URLRegions` Array(UInt32),
  172. `RefererRegions` Array(UInt32),
  173. `IsYandex` UInt8,
  174. `GoalReachesDepth` Int32,
  175. `GoalReachesURL` Int32,
  176. `GoalReachesAny` Int32,
  177. `SocialSourceNetworkID` UInt8,
  178. `SocialSourcePage` String,
  179. `MobilePhoneModel` String,
  180. `ClientEventTime` DateTime,
  181. `RegionID` UInt32,
  182. `ClientIP` UInt32,
  183. `ClientIP6` FixedString(16),
  184. `RemoteIP` UInt32,
  185. `RemoteIP6` FixedString(16),
  186. `IPNetworkID` UInt32,
  187. `SilverlightVersion3` UInt32,
  188. `CodeVersion` UInt32,
  189. `ResolutionWidth` UInt16,
  190. `ResolutionHeight` UInt16,
  191. `UserAgentMajor` UInt16,
  192. `UserAgentMinor` UInt16,
  193. `WindowClientWidth` UInt16,
  194. `WindowClientHeight` UInt16,
  195. `SilverlightVersion2` UInt8,
  196. `SilverlightVersion4` UInt16,
  197. `FlashVersion3` UInt16,
  198. `FlashVersion4` UInt16,
  199. `ClientTimeZone` Int16,
  200. `OS` UInt8,
  201. `UserAgent` UInt8,
  202. `ResolutionDepth` UInt8,
  203. `FlashMajor` UInt8,
  204. `FlashMinor` UInt8,
  205. `NetMajor` UInt8,
  206. `NetMinor` UInt8,
  207. `MobilePhone` UInt8,
  208. `SilverlightVersion1` UInt8,
  209. `Age` UInt8,
  210. `Sex` UInt8,
  211. `Income` UInt8,
  212. `JavaEnable` UInt8,
  213. `CookieEnable` UInt8,
  214. `JavascriptEnable` UInt8,
  215. `IsMobile` UInt8,
  216. `BrowserLanguage` UInt16,
  217. `BrowserCountry` UInt16,
  218. `Interests` UInt16,
  219. `Robotness` UInt8,
  220. `GeneralInterests` Array(UInt16),
  221. `Params` Array(String),
  222. `Goals` Nested(
  223. ID UInt32,
  224. Serial UInt32,
  225. EventTime DateTime,
  226. Price Int64,
  227. OrderID String,
  228. CurrencyID UInt32),
  229. `WatchIDs` Array(UInt64),
  230. `ParamSumPrice` Int64,
  231. `ParamCurrency` FixedString(3),
  232. `ParamCurrencyID` UInt16,
  233. `ClickLogID` UInt64,
  234. `ClickEventID` Int32,
  235. `ClickGoodEvent` Int32,
  236. `ClickEventTime` DateTime,
  237. `ClickPriorityID` Int32,
  238. `ClickPhraseID` Int32,
  239. `ClickPageID` Int32,
  240. `ClickPlaceID` Int32,
  241. `ClickTypeID` Int32,
  242. `ClickResourceID` Int32,
  243. `ClickCost` UInt32,
  244. `ClickClientIP` UInt32,
  245. `ClickDomainID` UInt32,
  246. `ClickURL` String,
  247. `ClickAttempt` UInt8,
  248. `ClickOrderID` UInt32,
  249. `ClickBannerID` UInt32,
  250. `ClickMarketCategoryID` UInt32,
  251. `ClickMarketPP` UInt32,
  252. `ClickMarketCategoryName` String,
  253. `ClickMarketPPName` String,
  254. `ClickAWAPSCampaignName` String,
  255. `ClickPageName` String,
  256. `ClickTargetType` UInt16,
  257. `ClickTargetPhraseID` UInt64,
  258. `ClickContextType` UInt8,
  259. `ClickSelectType` Int8,
  260. `ClickOptions` String,
  261. `ClickGroupBannerID` Int32,
  262. `OpenstatServiceName` String,
  263. `OpenstatCampaignID` String,
  264. `OpenstatAdID` String,
  265. `OpenstatSourceID` String,
  266. `UTMSource` String,
  267. `UTMMedium` String,
  268. `UTMCampaign` String,
  269. `UTMContent` String,
  270. `UTMTerm` String,
  271. `FromTag` String,
  272. `HasGCLID` UInt8,
  273. `FirstVisit` DateTime,
  274. `PredLastVisit` Date,
  275. `LastVisit` Date,
  276. `TotalVisits` UInt32,
  277. `TraficSource` Nested(
  278. ID Int8,
  279. SearchEngineID UInt16,
  280. AdvEngineID UInt8,
  281. PlaceID UInt16,
  282. SocialSourceNetworkID UInt8,
  283. Domain String,
  284. SearchPhrase String,
  285. SocialSourcePage String),
  286. `Attendance` FixedString(16),
  287. `CLID` UInt32,
  288. `YCLID` UInt64,
  289. `NormalizedRefererHash` UInt64,
  290. `SearchPhraseHash` UInt64,
  291. `RefererDomainHash` UInt64,
  292. `NormalizedStartURLHash` UInt64,
  293. `StartURLDomainHash` UInt64,
  294. `NormalizedEndURLHash` UInt64,
  295. `TopLevelDomain` UInt64,
  296. `URLScheme` UInt64,
  297. `OpenstatServiceNameHash` UInt64,
  298. `OpenstatCampaignIDHash` UInt64,
  299. `OpenstatAdIDHash` UInt64,
  300. `OpenstatSourceIDHash` UInt64,
  301. `UTMSourceHash` UInt64,
  302. `UTMMediumHash` UInt64,
  303. `UTMCampaignHash` UInt64,
  304. `UTMContentHash` UInt64,
  305. `UTMTermHash` UInt64,
  306. `FromHash` UInt64,
  307. `WebVisorEnabled` UInt8,
  308. `WebVisorActivity` UInt32,
  309. `ParsedParams` Nested(
  310. Key1 String,
  311. Key2 String,
  312. Key3 String,
  313. Key4 String,
  314. Key5 String,
  315. ValueDouble Float64),
  316. `Market` Nested(
  317. Type UInt8,
  318. GoalID UInt32,
  319. OrderID String,
  320. OrderPrice Int64,
  321. PP UInt32,
  322. DirectPlaceID UInt32,
  323. DirectOrderID UInt32,
  324. DirectBannerID UInt32,
  325. GoodID String,
  326. GoodName String,
  327. GoodQuantity Int32,
  328. GoodPrice Int64),
  329. `IslandID` FixedString(16)
  330. )
  331. ENGINE = CollapsingMergeTree(Sign)
  332. PARTITION BY toYYYYMM(StartDate)
  333. ORDER BY (CounterID, StartDate, intHash32(UserID), VisitID)
  334. SAMPLE BY intHash32(UserID)
  335. SETTINGS index_granularity = 8192;

在这里插入图片描述

导入数据

  1. clickhouse-client --query "INSERT INTO tutorial.hits_v1 FORMAT TSV" --max_insert_block_size=100000 < hits_v1.tsv

7.3G数据,用时190秒

  1. clickhouse-client --query "INSERT INTO tutorial.visits_v1 FORMAT TSV" --max_insert_block_size=100000 < visits_v1.tsv

2.5G数据,用时96秒

FORMAT TSV :导入的数据格式为TSV。
max_insert_block_size : 一次导入10万条记录。

2、查询数据效率

在这里插入图片描述

在这里插入图片描述

单表限制条件:
在这里插入图片描述

求和:
在这里插入图片描述

占用空间情况

在这里插入图片描述

表关联查询:
在这里插入图片描述

在这里插入图片描述

需要注意的是,关联查询时小表在后(大表 join 小表)

3、导出数据

  1. clickhouse-client --query="select * from dbtest1.tb_merge_tree FORMAT CSV">/data/ClickHouse/tb_merge_tree.csv

在这里插入图片描述

后续将继续更新该系列,大佬超手摸手带你学ClickHouse,敬请关注!!!

推荐阅读:
《手摸手带你学ClickHouse》之安装部署
《手摸手带你学ClickHouse》之访问接口


觉得有帮助点个赞吧!!!
原创不易,转载请注明出处。

微信扫一扫下方二维码即可关注我的公众号
201911051045512.jpg

发表评论

表情:
评论列表 (有 0 条评论,75人围观)

还没有评论,来说两句吧...

相关阅读