Aims and Scope


BenchCouncil Transactions on Benchmarks, Standards and Evaluations (TBench) is an open-access, multi-disciplinary journal dedicated to advancing the science and engineering of evaluation.

We extend a warm invitation to researchers from diverse disciplines to submit their work, with a particular emphasis on interdisciplinary research. Whether the research pertains to computer science, artificial intelligence (AI), medicine, education, finance, business, psychology, or other disciplines, all relevant contributions are valued and welcome.

At TBench, we place great importance on the reproducibility of research. We strongly encourage authors to ensure that their articles are appropriately prepared for open-source or artifact evaluation prior to submission.

Areas of interest include, but are not limited to:

  • 1. Evaluation theory and methodology
    • Formal specification of evaluation requirements
    • Development of evaluation models
    • Design and implementation of evaluation systems
    • Analysis of evaluation risk
    • Cost modeling for evaluations
    • Accuracy modeling for evaluations
    • Evaluation traceability
    • Identification and establishment of evaluation conditions
    • Equivalent evaluation conditions
    • Design of experiments
    • Statistical analysis techniques for evaluations
    • Methodologies and techniques for eliminating confounding factors in evaluations
    • Analytical modeling techniques and validation of models
    • Simulation and emulation-based modeling techniques and validation of models
    • Development of methodologies, metrics, abstractions, and algorithms specifically tailored for evaluations
  • 2. The engineering of evaluation
    • Benchmark design and implementation
    • Benchmark traceability
    • Establishing least equivalent evaluation conditions
    • Index design and implementation
    • Scale design and implementation
    • Evaluation standard design and implementation
    • Evaluation and benchmark practice
    • Tools for evaluations
    • Real-world evaluation systems
    • Testbeds
  • 3. Data sets
    • Explicit or implicit problem definition deduced from the data set
    • Detailed descriptions of research or industry datasets, including the methods used to collect the data and technical analyses supporting the quality of the measurements
    • Analyses or meta-analyses of existing data
    • Systems, technologies, and techniques that advance data sharing and reuse to support reproducible research
    • Tools that generate large-scale data while preserving their original characteristics
    • Evaluating the rigor and quality of the experiments used to generate the data and the completeness of the data description
  • 4. Benchmarking
    • Summary and review of state-of-the-art and state-of-the-practice
    • Searching and summarizing industry best practices
    • Evaluation and optimization of industry practice
    • Retrospective of industry practice
    • Characterizing and optimizing real-world applications and systems
    • Evaluations of state-of-the-art solutions in real-world settings
  • 5. Measurement and testing
    • Workload characterization
    • Instrumentation, sampling, tracing, and profiling of large-scale, real-world applications and systems
    • Collection and analysis of measurement and testing data that yield new insights
    • Measurement and testing-based modeling (e.g., workloads, scaling behavior, and assessment of performance bottlenecks)
    • Methods and tools to monitor and visualize measurement and testing data
    • Systems and algorithms that build on measurement and testing-based findings
    • Reappraisal of previous empirical measurements and measurement-based conclusions
    • Reappraisal of previous empirical testing and testing-based conclusions