Skip to contents

Execution speed is rarely the primary consideration when selecting statistical software—correctness, interpretability, and ease-of-use usually take precedence. However, computational efficiency becomes relevant when working with large datasets, conducting simulation studies, or iterating through model specifications during exploratory analysis.

This article documents the computational performance of summata relative to established alternatives. The benchmarks presented here are intended as a reference for users whose workflows involve performance-sensitive operations, and as a record of the design tradeoffs inherent in different implementation approaches.


Methodology

All benchmarks were conducted using the microbenchmark package under the following conditions:

  • Iterations: 5–20 per benchmark, adjusted for computational intensity
  • Dataset sizes: 500 to 10,000 observations
  • Data structure: Simulated clinical trial data with continuous, categorical, and time-to-event variables
  • Predictors: 14 variables for screening benchmarks

Datasets were generated using a fixed random seed to ensure reproducibility. Timing measurements exclude package loading and data generation. All packages were tested using default parameters unless otherwise noted.

Two summata configurations are benchmarked throughout: the default configuration, which uses profile likelihood confidence intervals for GLM models and includes full formatting (QC statistics, sample sizes, reference rows); and a minimal configuration (summata_minimal), which uses Wald CIs and disables optional output features. This distinction is important because profile likelihood CIs dominate GLM execution time, and the minimal configuration provides a way to measure summata’s formatting overhead in isolation. Note that finalfit and broom also use profile likelihood CIs by default for GLM models.


Descriptive Tables

Descriptive summary tables represent a common first step in data analysis. The following packages provide comparable functionality with differing implementation strategies.

Package Function Implementation Notes
summata desctable() data.table operations
arsenal tableby() Formula-based interface
tableone CreateTableOne() Matrix-based computation
finalfit summary_factorlist() tidyverse ecosystem
gtsummary tbl_summary() gt table framework

Dataset Size summata arsenal tableone finalfit gtsummary
n = 1,000 42 ms 64 ms 46 ms 429 ms 2,901 ms
n = 5,000 57 ms 77 ms 79 ms 442 ms 2,929 ms
n = 10,000 73 ms 98 ms 126 ms 464 ms 3,001 ms

The observed timing differences reflect underlying implementation choices. Packages built on data.table or base R matrix operations (summata, tableone, arsenal) exhibit lower overhead than those employing more extensive formatting pipelines (gtsummary). The gtsummary package prioritizes output flexibility and gt integration, which introduces additional computational cost.


Survival Tables

Survival probability tables summarize Kaplan-Meier estimates at specified time points.

Package Function Notes
summata survtable() Formatted output
manual survival::survfit() Raw computation
gtsummary tbl_survfit() gt integration

Dataset Size summata gtsummary manual
n = 1,000 21 ms 266 ms 6 ms
n = 5,000 35 ms 271 ms 11 ms
n = 10,000 52 ms 274 ms 14 ms

Direct survfit() computation provides a baseline for the minimum time required. The difference between raw computation and formatted output reflects the cost of table construction and presentation logic.


Regression Output

The following benchmarks compare functions that extract and format regression coefficients. Each package produces tables suitable for publication, though with varying levels of default formatting. Compared functions are as follows:

Package Function Notes
summata fit() Profile likelihood CIs, QC stats, counts, and reference rows
summata_minimal fit(..., conf_method = "wald", show_n = FALSE, show_events = FALSE, reference_rows = FALSE, keep_qc_stats = FALSE) Wald CIs, reduced output
finalfit glmuni() + fit2df() Profile likelihood CIs (default)
broom tidy() Profile likelihood CIs via confint() dispatch
gtsummary tbl_regression() gt formatting

Logistic Regression

Dataset Size summata_minimal summata finalfit broom gtsummary
n = 500 18 ms 174 ms 147 ms 150 ms 1,344 ms
n = 1,000 22 ms 234 ms 212 ms 214 ms 1,399 ms
n = 5,000 37 ms 840 ms 749 ms 936 ms 2,153 ms
n = 10,000 45 ms 1,532 ms 1,562 ms 1,564 ms 2,756 ms

The default summata configuration uses profile likelihood confidence intervals for GLM models, as do finalfit and broom::tidy(). The three packages show comparable performance for logistic regression because profile likelihood profiling dominates execution time for all of them. The summata_minimal configuration uses Wald CIs instead, skipping the profiling step entirely, and achieves the fastest extraction times at all sample sizes. At large n, profiling cost grows with the number of IRLS iterations, causing all profile-based packages to converge toward similar timings.

Linear Regression

Dataset Size summata_minimal summata finalfit broom gtsummary
n = 500 20 ms 35 ms 6 ms 6 ms 1,179 ms
n = 1,000 21 ms 36 ms 7 ms 6 ms 1,193 ms
n = 5,000 27 ms 43 ms 9 ms 9 ms 1,192 ms
n = 10,000 38 ms 49 ms 13 ms 12 ms 1,230 ms

For linear models, broom::tidy() and finalfit achieve faster coefficient extraction due to lower formatting overhead. All three packages use exact t-distribution CIs for lm objects (via confint.lm()), so the timing difference reflects formatting features (reference rows, QC statistics) rather than CI computation.

Poisson Regression

Dataset Size summata_minimal summata finalfit broom gtsummary
n = 500 20 ms 155 ms 135 ms 144 ms 1,293 ms
n = 1,000 24 ms 201 ms 181 ms 184 ms 1,325 ms
n = 5,000 34 ms 595 ms 613 ms 612 ms 1,868 ms
n = 10,000 47 ms 1,351 ms 1,409 ms 1,402 ms 2,577 ms

Poisson regression shows the same profile likelihood pattern as logistic regression: the default summata, finalfit, and broom all use profile CIs and show comparable performance. The summata_minimal configuration with Wald CIs is consistently the fastest option.

Cox Regression

Dataset Size summata_minimal summata finalfit broom gtsummary
n = 500 17 ms 34 ms 7 ms 12 ms 1,149 ms
n = 1,000 21 ms 38 ms 9 ms 14 ms 1,161 ms
n = 5,000 40 ms 61 ms 25 ms 30 ms 1,227 ms

Cox models use Wald CIs regardless of the conf_method setting (the standard approach in survival analysis), so the timing difference between summata and summata_minimal reflects formatting overhead only. finalfit and broom achieve faster extraction with less formatting.


Mixed-Effects Models

Mixed-effects models present a useful comparison case because the underlying model fitting (via lme4) dominates execution time regardless of the wrapper package.

Package Function Notes
summata fit(..., model_type = "lmer") Unified interface
summata_minimal fit(..., model_type = "lmer", conf_method = "wald", show_n = FALSE, show_events = FALSE, reference_rows = FALSE, keep_qc_stats = FALSE) Reduced output
finalfit lmmixed() + fit2df() Two-step process
broom.mixed tidy() Minimal extraction
gtsummary tbl_regression() gt formatting

Dataset Size summata_minimal summata finalfit broom gtsummary
n = 500 35 ms 58 ms 26 ms 31 ms 1,141 ms
n = 1,000 37 ms 60 ms 28 ms 34 ms 1,133 ms
n = 5,000 52 ms 76 ms 43 ms 47 ms 1,185 ms

The relatively narrow spread among summata, finalfit, and broom reflects the dominance of model fitting time. Differences in wrapper overhead become proportionally less significant as the underlying computation grows.


Univariable Screening

Univariable screening—fitting separate models for each predictor—provides a test case for operations involving many repeated model fits.

Package Function Notes
summata uniscreen() Parallel-capable
summata_minimal uniscreen(..., conf_method = "wald", show_n = FALSE, show_events = FALSE, reference_rows = FALSE) Wald CIs, reduced output
finalfit glmuni() + fit2df() Sequential
broom Loop + tidy() Manual implementation
arsenal modelsum() Formula interface
gtsummary tbl_uvregression() gt formatting

Screening 14 predictors:

Dataset Size summata_minimal summata finalfit broom arsenal gtsummary
n = 500 117 ms 319 ms 360 ms 440 ms 877 ms 12,963 ms
n = 1,000 134 ms 351 ms 477 ms 558 ms 1,128 ms 13,025 ms
n = 5,000 196 ms 624 ms 1,763 ms 1,799 ms 3,401 ms 14,001 ms

The performance gap between summata (default) and summata_minimal is amplified during univariable screening because profile likelihood profiling is repeated for each of the 14 predictor models. All profile-based packages (summata default, finalfit, broom) show comparable performance, as profiling dominates their execution time. With Wald CIs, summata_minimal is the fastest option at all sample sizes, outperforming the next-fastest alternative by 2.6–9.0× due to data.table vectorization and parallel model fitting.


Complete Workflow

The combined univariable screening and multivariable modeling workflow represents a common analytical pattern in statistical research.

Package Approach Notes
summata fullfit() Single function
summata_minimal fullfit(..., conf_method = "wald", show_n = FALSE, show_events = FALSE, reference_rows = FALSE) Wald CIs, reduced output
finalfit finalfit() Single function
manual Loop + glm() + broom::tidy() + rbind() Custom
gtsummary tbl_uvregression() + tbl_regression() + tbl_merge() Multi-step

Dataset Size summata_minimal summata finalfit manual gtsummary
n = 500 123 ms 450 ms 207 ms 407 ms 9,655 ms
n = 1,000 136 ms 549 ms 200 ms 541 ms 9,726 ms
n = 5,000 196 ms 1,479 ms 209 ms 1,889 ms 11,092 ms

The default summata and finalfit show comparable performance for GLM workflows because both use profile likelihood CIs. The difference between them reflects summata’s additional features (QC statistics, reference rows, complete-case sample sizes) versus finalfit’s inclusion of a descriptive statistics table. The summata_minimal configuration with Wald CIs is the fastest single-function option at small to moderate sample sizes, completing the combined analysis in roughly 60–70% of the time finalfit requires at n = 500–1,000.


Forest Plots

Forest plot generation combines data extraction with graphical rendering.

Package Function Notes
summata coxforest() Integrated table and plot
survminer ggforest() Survival-focused
manual Custom ggplot2 Maximum flexibility

Dataset Size summata survminer manual
n = 500 203 ms 345 ms 57 ms
n = 1,000 198 ms 338 ms 54 ms
n = 5,000 203 ms 329 ms 53 ms

The manual approach produces only the graphical element, while summata and survminer generate integrated displays with coefficient tables. The relatively constant timing across dataset sizes indicates that plot rendering, rather than data processing, dominates execution time. Also, there are significant cosmetic differences between the three graphical outputs, which predominates other factors when selecting a plotting function.


Relative Performance

The following figures summarize timing ratios across benchmarks. Values greater than 1 indicate the comparison package requires more time than the baseline.

Relative to summata (default, profile likelihood CIs)

Relative to summata_minimal (Wald CIs, no QC stats)

Summary of Ratios

Ratios relative to summata (default):

Benchmark gtsummary finalfit arsenal
Descriptive Tables 41–70× 6–10× 1.4–1.5×
Survival Tables 5–12×
Logistic Regression 6–8× 0.8–1.0×
Poisson Regression 7–8× 0.9–1.0×
Linear Regression 25–34× 0.2–0.3×
Cox Regression 20–34× 0.2–0.4×
Mixed-Effects 16–20× 0.5–0.6×
Univariable Screening 22–41× 1.1–2.8× 2.8–5.5×
Complete Workflow 8–21× 0.1–0.5×

Ratios relative to summata_minimal (Wald CIs):

Benchmark gtsummary finalfit summata (default)
Logistic Regression 58–74× 8–35× 10–34×
Poisson Regression 55–63× 7–30× 8–29×
Linear Regression 32–59× 0.3× 1.3–1.7×
Cox Regression 31–68× 0.4–0.6× 1.5–2.0×
Mixed-Effects 23–33× 0.8–0.9× 1.5–1.7×
Univariable Screening 71–111× 3.1–9.0× 2.6–3.2×
Complete Workflow 57–78× 1.1–1.7× 3.7–7.5×

For GLM models (logistic and Poisson), summata_minimal outperforms all alternatives by a wide margin: 8–35× faster than finalfit, 7–30× faster than broom. This is because summata_minimal is the only configuration that uses Wald CIs — finalfit, broom, and the default summata all use profile likelihood CIs, which accounts for their comparable timings.

For linear, Cox, and mixed-effects models, where all packages use the same CI method (exact t-distribution for lm, Wald for Cox and mixed-effects), the timing gap between summata and summata_minimal is narrow (1.3–2.0×) and reflects formatting overhead only.


Scaling Characteristics

The relationship between dataset size and execution time provides insight into algorithmic complexity. Near-linear scaling (execution time proportional to n) indicates efficient implementation, while superlinear scaling may suggest operations with O(n²) complexity, such as repeated rbind() calls or element-wise data frame construction.

Observed scaling factors for summata (ratio of time at n = 10,000 to time at n = 1,000):

Operation Scaling Factor Expected for O(n)
Descriptive tables 1.7× 10×
Logistic regression 6.5× 10×
Univariable screening 1.8× 10×

The sublinear scaling reflects that fixed overhead (package loading, object construction, profile likelihood profiling) constitutes a significant fraction of total time at smaller dataset sizes. Logistic regression shows nearer-to-linear scaling because profile likelihood profiling cost scales with the number of IRLS iterations, which grows with n.


Implementation Notes

The performance characteristics documented here reflect specific implementation choices:

summata: Built on data.table for data manipulation, with coefficient extraction optimized for common model classes. Default configuration uses profile likelihood CIs for GLM/negbin models (matching finalfit and broom). The conf_method = "wald" option skips profiling entirely, producing a configuration faster than any alternative tested.

gtsummary: Prioritizes output flexibility through the gt table framework. The additional abstraction layers enable extensive customization but increase computational overhead.

finalfit: Balances functionality and performance with a tidyverse-compatible interface. Uses profile likelihood CIs by default for GLM models (confint_type = "profile"). The finalfit() function is particularly optimized for the combined workflow.

arsenal: Uses formula-based syntax familiar to SAS users. Performance varies by operation type.

broom: Provides minimal coefficient extraction with limited formatting. Uses profile likelihood CIs for GLM models via stats::confint() dispatch. Suitable as a building block for custom pipelines.


Effect of Output Options

By default, summata regression functions compute profile likelihood confidence intervals (for GLM and negative binomial models), sample sizes, event counts, QC statistics, and reference rows for categorical variables. These features produce more complete and accurate output for publication but add computational overhead. For performance-sensitive applications, these options can be disabled.

The summata_minimal configuration shown in the benchmarks represents:

fit(data, outcome, predictors, 
    conf_method = "wald",
    show_n = FALSE, 
    show_events = FALSE, 
    reference_rows = FALSE,
    keep_qc_stats = FALSE)

The conf_method parameter can also be set globally for an entire session:

options(summata.conf_method = "wald")

The impact of each option varies by model type:

Option GLM/negbin models Linear/Cox/mixed models
conf_method = "wald" Large effect (eliminates profile likelihood profiling) Minimal effect (Wald already used for Cox/mixed; exact t is fast for lm)
keep_qc_stats = FALSE Moderate effect (skips C-statistic, Hosmer-Lemeshow) Small effect
show_n/show_events = FALSE Small effect Small effect
reference_rows = FALSE Small effect Small effect

For logistic and Poisson models at n = 1,000, the minimal configuration is approximately 10× faster than the default (22 ms vs. 234 ms for logistic), with the majority of the difference attributable to conf_method. For linear and Cox models, the difference is roughly 1.5–2×, reflecting formatting overhead only.

The choice between configurations depends on the use case:

  • Publication tables: Default settings provide profile likelihood CIs and complete output ready for manuscripts
  • Simulation studies: conf_method = "wald" reduces per-iteration overhead substantially for GLM models
  • Exploratory analysis: Either setting is appropriate; conf_method = "wald" is recommended when iterating through many model specifications

Practical Considerations

The timing differences documented here range from negligible (tens of milliseconds) to substantial (several seconds). The practical significance depends on context:

  • Interactive analysis: Differences under 500 ms are generally imperceptible
  • Batch processing: Cumulative differences matter when processing many datasets
  • Simulation studies: Per-iteration overhead compounds across thousands of replicates
  • Teaching and demonstration: Faster feedback loops improve the interactive experience

Package selection should primarily reflect functional requirements, syntax preferences, and ecosystem compatibility. Performance considerations become relevant only when computational constraints are binding.


Reproducibility

The benchmark script is available in the package repository at inst/benchmarks/benchmarks.R. Execution produces:

  • Individual PNG figures for each benchmark category
  • Summary figures (benchmark_speedup.png, benchmark_speedup_minimal.png)
  • CSV files with detailed timing data

Results will vary across systems due to differences in hardware, R version, and package versions.


Session Information

This benchmark was run under the following conditions:

R version 4.5.2 (2025-10-31)
Platform: x86_64-unknown-linux-gnu

Matrix products: default
BLAS/LAPACK: /usr/lib/libopenblasp-r0.3.30.so;  LAPACK version 3.12.0

Void Linux x86_64
Linux 6.12.63_1
Intel(R) Core(TM) i5-4670K (4) @ 3.80 GHz
NVIDIA GeForce GTX 970 [Discrete]