- Direct Known Subclasses:
- TestMultiColumnScannerWithAlgoGZAndNoDataEncoding, TestMultiColumnScannerWithAlgoGZAndUseDataEncoding, TestMultiColumnScannerWithNoneAndNoDataEncoding, TestMultiColumnScannerWithNoneAndUseDataEncoding
public abstract class TestMultiColumnScanner
extends Object
Tests optimized scanning of multiple columns.
The large test was split into several subclass unit tests because, with the ROWCOL bloom type,
the test checks the row-col bloom filter every time it switches from one column to another, in
order to save HDFS seeks. This is CPU-intensive (~45s for each case), so the ROWCOL case was
moved into a separate LargeTests class to avoid timeout failures.
To be clear: TestMultiColumnScanner flushes 10 HFiles (NUM_FLUSHES=10), and the table holds
~1000 cells (rows=20, ts=6, qualifiers=8, total=20*6*8=960). Each full table scan checks the
ROWCOL bloom filter 20 (rows) * 8 (columns) * 10 (HFiles) = 1600 times. In addition, the test
scans the full table 6*2^8=1536 times, so in total there are 1600*1536=2457600 bloom filter
checks. (See HBASE-21520)
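
The arithmetic above can be reproduced with a small stand-alone sketch. It is not part of the HBase test code: the class name BloomCheckEstimate and every constant name except NUM_FLUSHES are hypothetical and chosen only for illustration. The values are the ones quoted in the description.

```java
/**
 * Back-of-the-envelope count of ROWCOL bloom filter checks.
 * Illustrative only: apart from NUM_FLUSHES, the names are assumptions,
 * and the values are the ones quoted in the class description.
 */
final class BloomCheckEstimate {
  static final int NUM_ROWS = 20;       // rows=20
  static final int NUM_TIMESTAMPS = 6;  // ts=6
  static final int NUM_QUALIFIERS = 8;  // qualifiers=8
  static final int NUM_FLUSHES = 10;    // one HFile per flush

  /** Cells written to the table: rows * timestamps * qualifiers. */
  static long totalCells() {
    return (long) NUM_ROWS * NUM_TIMESTAMPS * NUM_QUALIFIERS;
  }

  /** Bloom filter checks in one full table scan: rows * columns * HFiles. */
  static long checksPerScan() {
    return (long) NUM_ROWS * NUM_QUALIFIERS * NUM_FLUSHES;
  }

  /** Number of full table scans, taken as 6 * 2^8 per the description. */
  static long fullScans() {
    return (long) NUM_TIMESTAMPS * (1L << NUM_QUALIFIERS);
  }

  /** Total bloom filter checks across all scans. */
  static long totalChecks() {
    return checksPerScan() * fullScans();
  }

  public static void main(String[] args) {
    System.out.println("cells           = " + totalCells());
    System.out.println("checks per scan = " + checksPerScan());
    System.out.println("full scans      = " + fullScans());
    System.out.println("total checks    = " + totalChecks());
  }
}
```

The exact cell count is 960, which the description rounds to ~1000. The product of checks per scan and number of scans gives the 2457600 figure.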