Oracle Database In-Memory Column Store: Architecture, Population, and Workload Tuning

Oracle Database In-Memory (DBIM) adds a columnar representation of data alongside the existing row-based buffer cache. The key architectural insight is that both formats coexist — OLTP operations continue using the row-store (buffer cache) for optimal single-row performance, while analytical scans use the column store for optimal aggregation performance. The optimizer decides which format to use per operation.

The Dual-Format Architecture

Traditional databases force a choice: row format for OLTP, column format for analytics. Oracle’s approach avoids this by maintaining both simultaneously:

                    ┌───────────────────────────────────┐
                    │         SGA                        │
                    │  ┌─────────────┐  ┌─────────────┐ │
  DML (INSERT/      │  │ Buffer Cache│  │ In-Memory   │ │
  UPDATE/DELETE) ──►│  │ (Row Store) │  │ Column Store│ │
                    │  └──────┬──────┘  └──────┬──────┘ │
  OLTP Queries ─────►  Row format              │        │
  Analytics ─────────────────────────► Columnar format  │
                    └──────────────────────────────────-─┘

Both formats read from the same datafiles. DML is applied to the buffer cache (row store); the In-Memory Column Store is refreshed from redo logs by the IMCO (In-Memory Coordinator) background process. The column store is therefore always slightly behind the row store, but for analytical queries the IMCO applies pending DML before returning results.

In-Memory Compression Units (IMCUs)

The unit of columnar storage is the IMCU (In-Memory Compression Unit), which stores approximately 1 million rows of a single extent in columnar format. Each IMCU contains:

Column vectors: arrays of column values, sorted within the IMCU.
Storage indexes: per-column MIN/MAX metadata used to skip entire IMCUs that cannot satisfy a predicate.
Dictionary compression: repeated values are stored once; column vectors store dictionary offsets.

The storage index is the in-memory equivalent of Exadata’s storage index — it allows the scan to skip entire IMCUs without touching the column data, providing sub-millisecond predicate evaluation on hundreds of millions of rows.

Enabling and Populating the Column Store

Configuration

-- Enable In-Memory (requires SGA sizing for IM_AREA)
ALTER SYSTEM SET INMEMORY_SIZE = 10G SCOPE=SPFILE;
-- Requires restart: INMEMORY_SIZE is not dynamically resizable (pre-21c)

-- In 21c+, resize dynamically up to SGA_TARGET bounds
ALTER SYSTEM SET INMEMORY_SIZE = 20G;

Marking Objects for Population

-- Populate an entire table in the column store
ALTER TABLE sales INMEMORY;

-- Priority levels: NONE, LOW, MEDIUM, HIGH, CRITICAL
-- CRITICAL: populated immediately at database startup
ALTER TABLE sales INMEMORY PRIORITY CRITICAL;

-- Populate only specific columns (projection — reduces memory footprint)
ALTER TABLE sales INMEMORY NO INMEMORY (comments, blob_column);

-- Compress level: MEMCOMPRESS FOR QUERY LOW (default) or DML/CAPACITY HIGH
ALTER TABLE sales INMEMORY MEMCOMPRESS FOR QUERY HIGH;

Population Triggers

Population is lazy by default — the column store is filled when the object is first scanned. With PRIORITY CRITICAL, population starts at database open time (before any queries).

-- Monitor population progress
SELECT segment_name,
       partition_name,
       populate_status,
       ROUND(bytes_not_populated / 1024 / 1024, 0) AS mb_remaining,
       ROUND(inmemory_size / 1024 / 1024, 0)        AS mb_in_memory
FROM v$im_segments
ORDER BY segment_name, partition_name;

SEGMENT_NAME  PARTITION_NAME  POPULATE_STATUS  MB_REMAINING  MB_IN_MEMORY
------------- --------------- ---------------- ------------- -------------
SALES         P_2026_01       COMPLETED                    0          450
SALES         P_2026_02       STARTED                    220          180
SALES         P_2026_03       NOT STARTED               1200            0

SIMD-Accelerated Scan Processing

The columnar format enables SIMD (Single Instruction Multiple Data) CPU instructions to process multiple column values in a single clock cycle. Oracle uses AVX-512 instructions on modern Intel processors, evaluating up to 16 predicates per CPU clock cycle on columnar data.

Verifying SIMD Utilisation

-- Check if SIMD is active (from V$MYSTAT or V$SYS_OPTIMIZER_ENV)
SELECT name, value
FROM v$sysstat
WHERE name IN (
    'IM scan rows',
    'IM scan rows optimized',
    'IM scan bytes uncompressed',
    'IM scan CUs columns accessed',
    'IM scan CUs pruned',       -- Storage index eliminations
    'IM scan CUs no cleanout',
    'IM scan rows projected'
);

The ratio of IM scan CUs pruned to IM scan CUs columns accessed indicates storage index effectiveness — a high ratio means most of the column store is being skipped.

Query Execution with In-Memory

Verifying Column Store Is Used

-- Force the optimizer to use the column store (for testing)
SELECT /*+ INMEMORY(sales) */ SUM(amount), region
FROM sales
WHERE sale_date >= DATE '2026-01-01'
GROUP BY region;

-- Check execution plan for "TABLE ACCESS INMEMORY FULL"
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));

---------------------------------------------------------------------
| Id | Operation                    | Name  | Rows  | Bytes | Cost |
---------------------------------------------------------------------
|  0 | SELECT STATEMENT             |       |       |       |      |
|  1 |  HASH GROUP BY               |       |   20  |   260 |  150 |
|* 2 |   TABLE ACCESS INMEMORY FULL | SALES |  5M   |  65M  |  148 |
---------------------------------------------------------------------

TABLE ACCESS INMEMORY FULL confirms the column store is being used.

When the Optimizer Prefers Row Store

The optimizer may prefer the buffer cache (row store) over the column store when:

The query accesses only a small number of rows (index access is more efficient).
The table or partition is not fully populated yet.
Stale statistics do not reflect the actual rowcount.

Use hints INMEMORY and NO_INMEMORY to control this during testing:

-- Force column store
SELECT /*+ INMEMORY(t) FULL(t) */ ...

-- Force row store (for comparison benchmarking)
SELECT /*+ NO_INMEMORY(t) */ ...

In-Memory Expressions (IME)

Frequently computed expressions can be materialised in the column store — the expression result is stored as a virtual column in the IMCU, eliminating repeated computation:

-- Frequently computed: net_amount = quantity * unit_price * (1 - discount)
-- Materialize as an In-Memory Expression
ALTER TABLE sales ADD (net_amount AS (quantity * unit_price * (1 - discount)));

ALTER TABLE sales MODIFY (net_amount INMEMORY VIRTUAL COLUMN);

The database detects frequently computed expressions automatically (from V$IM_EXPRESSIONS) and can populate them without DDL:

-- View candidate IMEs detected by the system
SELECT * FROM v$im_expressions;

-- Manually populate detected IME
EXEC DBMS_INMEMORY_ADMIN.IME_POPULATE_EXPRESSIONS;

Sizing the In-Memory Area

Sizing is empirical. Start with the compressed columnar size estimate:

-- Estimate compressed columnar size for candidate objects
SELECT segment_name,
       SUM(bytes) / 1024 / 1024 AS segment_mb,
       -- Columnar with QUERY LOW compression: typically 2–4x smaller than row format
       SUM(bytes) / 1024 / 1024 / 3 AS estimated_imcs_mb
FROM dba_segments
WHERE owner = 'SCHEMA1'
  AND segment_name IN ('SALES', 'ORDERS', 'PRODUCTS')
GROUP BY segment_name;

The actual compression ratio depends on data distributions. Use POPULATE_STATUS = 'COMPLETED' rows in V$IM_SEGMENTS for actual measured sizes.

Monitor memory pressure using V$INMEMORY_AREA:

SELECT pool, alloc_bytes / 1024 / 1024 AS alloc_mb,
       used_bytes  / 1024 / 1024 AS used_mb,
       ROUND(used_bytes / alloc_bytes * 100, 1) AS pct_used
FROM v$inmemory_area;

When pct_used > 90%, the column store is evicting segments to make room — less-recently used segments are evicted first. Increase INMEMORY_SIZE or reduce the set of objects marked INMEMORY.

Combining In-Memory with Exadata

On Exadata, the In-Memory Column Store and Smart Scan are complementary. For very large tables that exceed the In-Memory area, Exadata Smart Scan handles the overflow efficiently. The optimizer automatically uses the column store for in-memory segments and Smart Scan for segments not yet populated or evicted.

This combination is particularly powerful for mixed OLTP/analytics workloads: OLTP uses the row-store path; small analytical queries use the fully in-memory column store; large analytical scans use Smart Scan on Exadata storage cells — all on the same database, automatically.

The Dual-Format Architecture#

In-Memory Compression Units (IMCUs)#

Enabling and Populating the Column Store#

Configuration#

Marking Objects for Population#

Population Triggers#

SIMD-Accelerated Scan Processing#

Verifying SIMD Utilisation#

Query Execution with In-Memory#

Verifying Column Store Is Used#

When the Optimizer Prefers Row Store#

In-Memory Expressions (IME)#

Sizing the In-Memory Area#

Combining In-Memory with Exadata#