A Study on Improvement of Data Analysis Process through Smart Compression

Authors

  • Cheolhee YOON
  • Jang Mook KANG
  • Jung Joong KIM

Abstract

Smart Compression applies the most appropriate compression algorithm to each column, taking into account both the achievable compression ratio and the decompression speed. The compression algorithms used are LZ4, RLE, PFOR, Delta encoding on top of PFOR, and Dictionary encoding. The vector-based compression used in this paper provides 4 to 6 times higher compression efficiency than commonly used compression algorithms. To maximize input and output performance, we run a CBM (Column Buffer Manager), which replicates the contents of the disk in main memory. In addition, decompression is performed not in main memory but in vector mode in the CPU cache, immediately before the data is processed. The approach is therefore efficient in terms of improving the data analysis process. Smart Compression was developed for use with the large-capacity manufacturing data of smart factories.

Keywords: Smart compression, algorithm, data analysis
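
The per-column selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`rle_encode`, `delta_encode`, `dict_encode`, `estimate_bits`, `choose_encoding`) and the bit-cost model are assumptions made for illustration, it handles only non-empty integer columns, and it covers only RLE, Delta, and Dictionary encoding (LZ4 and PFOR are omitted).

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encoding: a list of (value, run_length) pairs."""
    return [(v, len(list(g))) for v, g in groupby(values)]


def rle_decode(pairs):
    return [v for v, n in pairs for _ in range(n)]


def delta_encode(values):
    """Delta encoding: the first value, then successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def dict_encode(values):
    """Dictionary encoding: sorted distinct values plus one code per row."""
    dictionary = sorted(set(values))
    index = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [index[v] for v in values]


def dict_decode(dictionary, codes):
    return [dictionary[c] for c in codes]


def bits_for(n):
    """Bits needed to store a non-negative integer n (at least 1)."""
    return max(1, n.bit_length())


def estimate_bits(values, value_bits=64):
    """Rough size estimate per encoding (hypothetical cost model).

    Assumes a non-empty column of integers that each fit in value_bits.
    """
    runs = rle_encode(values)
    deltas = delta_encode(values)[1:]
    distinct = len(set(values))
    max_run = max(n for _, n in runs)
    max_delta = max((abs(d) for d in deltas), default=0)
    return {
        # one full value plus a run length per run
        "rle": len(runs) * (value_bits + bits_for(max_run)),
        # one full value, then a sign bit plus magnitude per difference
        "delta": value_bits + len(deltas) * (1 + bits_for(max_delta)),
        # the dictionary itself plus one fixed-width code per row
        "dictionary": distinct * value_bits + len(values) * bits_for(distinct - 1),
    }


def choose_encoding(values):
    """Pick the encoding with the smallest estimated size for one column."""
    costs = estimate_bits(values)
    return min(costs, key=costs.get)


# Example columns (made-up data, not from the paper).
columns = {
    "machine_state": [0] * 1000,                         # long runs
    "timestamp": list(range(1_000_000, 1_001_000)),      # steadily increasing
    "part_code": [7, 300, 7, 42, 300, 42, 7, 300] * 125, # few distinct values
}
for name, values in columns.items():
    print(name, choose_encoding(values))
```

Under this cost model, a constant column favors RLE, a steadily increasing column favors Delta encoding, and a column with few distinct but non-adjacent values favors Dictionary encoding. A real system would also weigh decompression speed, as the abstract notes.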

Published

2019-12-12

Section

Articles