Scaling Thousands of Concurrent Data Grid Rows with Cell-Based Virtualization in React

As web applications grow in complexity, the amount of data they need to manage also increases. This data is often presented in data grids, which allow users to view, edit, and manipulate rows of data. However, rendering thousands of rows in a data grid can cripple performance and lead to an unusable interface. In this article, we'll explore a technique called "cell-based virtualization" to smoothly handle tens of thousands of concurrent data grid rows in a React-based web application.

The Problem: Rendering Large Data Sets

A naive data grid implementation might load all the data and render all the rows and cells at once. This works fine for small data sets, but as the number of rows grows, the interface slows to a crawl.
Some major performance issues include:

  • The browser needing to lay out and render tens of thousands of DOM elements
  • Heavy memory usage to store all the row and cell data
  • Expensive data processing for features like sorting, filtering, and aggregation

For example, a grid with 50,000 rows and 5 columns would need to render 250,000 cell elements on top of the row markup, JavaScript memory overhead per record, event handlers, and more. This taxes the browser and leads to severe lag when scrolling and interacting.

Virtualization Basics

Virtualization techniques render only the small subset of rows currently visible, usually with some overscan (a few buffer rows) on either side. As the user scrolls, rows are seamlessly rendered or removed as needed.
This helps performance by:

  • Reducing the number of DOM elements the browser must manage and redraw
  • Lowering memory usage by avoiding unused data
  • Decreasing the data processing load for grid features

However, traditional virtualization operates at the row level – entire rows are rendered or removed as a block. This still requires layout and rendering of hundreds or thousands of cells at a time.

Introducing Cell Virtualization

Cell virtualization takes things to the next level by only rendering the currently visible cells, rather than full rows. As the user scrolls, individual cells are rendered precisely where they need to be in the viewport.
For example, given a grid with 10,000 rows and 5 columns, traditional row virtualization might render 100 full rows with 500 visible cells. In contrast, cell virtualization could render just the 150 cells currently visible rather than full rows.
This further optimizes:

  • The number of DOM elements, by avoiding unused markup
  • Memory usage, by skipping unused data
  • Scroll smoothness, by reducing paint areas
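The comparison above can be sanity-checked with a quick count. This tiny helper (an illustration only, not part of any grid library) mirrors the example's numbers: row virtualization mounts every cell of each rendered row, while cell virtualization mounts only the cells actually in view.

```javascript
// Compare how many cell elements each strategy keeps mounted.
function renderedCellCounts(renderedRows, columns, visibleCells) {
  return {
    rowVirtualization: renderedRows * columns, // every cell of each rendered row
    cellVirtualization: visibleCells,          // only cells in the viewport
  };
}
```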

A cell-virtualized data grid also intelligently reuses existing cell elements as much as possible. As you scroll, most new cells can be efficiently swapped into place rather than created from scratch. This caching avoids expensive DOM placement and reflows.
Here is a simplified version of cell reuse logic in React:


```jsx
// Cache of rendered cell elements, keyed by "row:col"
const cellCache = {};

function renderCell(row, col, data) {
  const key = `${row}:${col}`;
  const cached = cellCache[key];
  if (cached && cached.props.data === data) {
    // Reuse the existing element; its data is unchanged
    return cached;
  }
  // Render a new cell (<Cell> stands in for your cell component)
  const element = <Cell key={key} row={row} col={col} data={data} />;
  // Save to cache for reuse on the next scroll update
  cellCache[key] = element;
  return element;
}
```

By reusing elements, the grid avoids wastefully recreating identical markup. This optimization becomes significant at scale, across tens of thousands of records.

Scaling to 100,000 Rows

As a real-world test case, we implemented a React data grid using cell-based virtualization with the following parameters:

  • 100,000 rows
  • 5 columns
  • 1,000 pixel height
  • Dynamic data (filtering, sorting)

Even with this much raw data, cell virtualization provided a smooth 60 FPS scrolling experience. Memory usage remained reasonable for such a large dataset, and DOM elements were optimized by reusing cells.
Some numbers from Chrome DevTools:

  • 1,000 visible rows – Only rows in viewport were rendered
  • 5,000 visible cells – Individual cells rendered as needed
  • 60 FPS – Consistently smooth scrolling
  • 50 MB memory – Estimated footprint without virtualization
  • 1.5 MB memory – Actual usage with cell virtualization

Compare this to a naive rendering approach which would have likely crashed the browser!

Implementation Details

Here is a high-level outline of how cell-based virtualization can be implemented:
Determine visible row range

  • Listen to scroll events
  • Calculate first and last visible row index based on row heights
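This calculation can be sketched as a single pure function, assuming fixed-height rows (variable heights would need a prefix-sum lookup instead). The `rowHeight` and `overscan` parameters are assumptions of this sketch, not part of any particular library:

```javascript
// Visible row window for a given scroll offset, assuming every row
// has the same fixed height. overscan adds buffer rows on each side
// so fast scrolling doesn't flash empty space.
function getVisibleRowRange(scrollTop, viewportHeight, rowHeight, rowCount, overscan = 3) {
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1;
  return {
    topRow: Math.max(0, first - overscan),
    bottomRow: Math.min(rowCount - 1, last + overscan),
  };
}
```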

Calculate visible cell range

  • Determine horizontal position
  • Iterate through visible row range
  • Identify first and last visible cell in each column

Here is sample logic in React:


```jsx
function updateVisibleCells() {
  // Vertical window: rows currently intersecting the viewport
  let topRow = getTopVisibleRowIndex();
  let bottomRow = getBottomVisibleRowIndex();
  for (let row = topRow; row <= bottomRow; row++) {
    // Horizontal window: this row's cells intersecting the viewport
    let topCell = getTopVisibleCellIndex(row);
    let bottomCell = getBottomVisibleCellIndex(row);
    for (let cell = topCell; cell <= bottomCell; cell++) {
      // Render (or reuse) this cell
    }
  }
}
```

Populate container

  • Reuse existing cell elements
  • Create missing cells
  • Position elements correctly
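Positioning each cell typically means absolute positioning inside the scroll container. A minimal sketch, assuming fixed row heights and column widths (a real grid would look these up per row and column):

```javascript
// Inline style that places a cell at its absolute grid coordinates.
// Using transform (rather than top/left) keeps repositioning on the
// compositor and avoids layout work during scrolling.
function getCellStyle(row, col, rowHeight, colWidth) {
  return {
    position: 'absolute',
    transform: `translate(${col * colWidth}px, ${row * rowHeight}px)`,
    width: `${colWidth}px`,
    height: `${rowHeight}px`,
  };
}
```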

Smooth scrolling

  • Debounce scroll handler
  • Request animation frame
  • Throttle data calls
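The scroll handling above can be sketched as a frame-based throttle. The `schedule` parameter is injectable here purely so the sketch is testable outside a browser; it defaults to requestAnimationFrame:

```javascript
// Collapse bursts of calls into at most one scheduled update per frame.
function frameThrottle(update, schedule = (cb) => requestAnimationFrame(cb)) {
  let ticking = false;
  return (...args) => {
    if (ticking) return; // an update is already scheduled for this frame
    ticking = true;
    schedule(() => {
      ticking = false;
      update(...args);
    });
  };
}
```

A typical hookup would be `container.addEventListener('scroll', frameThrottle(updateVisibleCells))`.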

By following virtualization best practices, the grid stays responsive even with 100K concurrent records!

Next Steps

Cell-based data grid virtualization opens the door to managing large, real-time datasets in web UIs. Some ideas for taking things further:
  • Incremental loading – Fetch additional data as the user scrolls down
  • Remote data – Integrate with large cloud data sources
  • Column virtualization – Only render visible columns
  • Dynamic columns – Reordering, resizing, etc.
  • Immutable data – For snapshots, time-travel debugging, etc.
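For the incremental-loading idea, the trigger can be as simple as checking how close the viewport is to the end of the loaded rows. The `threshold` value here is an assumption of this sketch:

```javascript
// True when the user is within `threshold` rows of the end of the
// loaded data, meaning the next page should be fetched.
function shouldLoadMore(lastVisibleRow, loadedRowCount, threshold = 50) {
  return loadedRowCount - lastVisibleRow <= threshold;
}
```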
If you need to visualize or interact with huge numbers of records, give cell virtualization a try! The right virtualization technique can make the difference between an unusable, laggy interface and a buttery-smooth user experience.
