BenchCouncil: International Open Benchmark Council

BenchCouncil’s View On Benchmarking AI and Other Emerging Workloads (technical report; slides presented by Prof. Jianfeng Zhan at the BenchCouncil SC BoF). This paper outlines BenchCouncil’s view on the challenges, rules, and vision of benchmarking modern workloads.

BenchCouncil Top Level Projects

BigDataBench: A Scalable Big Data Benchmark Suite [HPCA'14]

The current version, BigDataBench 5.0, provides 13 representative real-world data sets and 25 benchmarks. The benchmarks cover six workload types: online services, offline analytics, graph analytics, data warehouse, NoSQL, and streaming, drawn from three important application domains: Internet services (including search engines, social networks, and e-commerce), recognition sciences, and medical sciences. The suite includes micro benchmarks, each of which implements a single data motif; component benchmarks, which combine data motifs; and end-to-end application benchmarks, which combine component benchmarks. Because data sets have a great impact on workload behavior and performance (CGO'18), the suite covers the whole spectrum of data types, including structured, semi-structured, and unstructured data; the current data sources are text, graph, table, and image data. Using real data sets as seeds, the data generator BDGS produces synthetic data by scaling the seed data while preserving the characteristics of the raw data.
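
As a rough illustration of the seed-and-scale idea behind BDGS, the sketch below grows a synthetic text corpus to several times the size of a seed corpus while preserving the seed's word-frequency distribution. The function name and sampling scheme are illustrative assumptions, not the actual BDGS implementation.

```python
# Minimal sketch of seed-based data scaling in the spirit of BDGS
# (illustrative only; not the actual BDGS implementation).
import random
from collections import Counter

def scale_text_corpus(seed_lines, scale_factor, seed=42):
    """Generate a synthetic corpus scale_factor times the seed size,
    sampling words according to the seed's empirical word frequencies
    so that basic data characteristics are preserved."""
    rng = random.Random(seed)
    words = [w for line in seed_lines for w in line.split()]
    freq = Counter(words)
    vocab, weights = zip(*freq.items())
    avg_len = max(1, len(words) // max(1, len(seed_lines)))
    synthetic = []
    for _ in range(len(seed_lines) * scale_factor):
        synthetic.append(" ".join(rng.choices(vocab, weights=weights, k=avg_len)))
    return synthetic

seed_corpus = ["the quick brown fox", "the lazy dog sleeps"]
print(scale_text_corpus(seed_corpus, scale_factor=3)[:2])
```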

AIBench: An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite [TR-2020, TR-2019, Bench18, Specification]

The current version, AIBench 1.0, proposes an agile domain-specific benchmarking methodology. Together with seventeen industry partners, AIBench identifies ten important end-to-end application scenarios, from which sixteen representative AI tasks are distilled as AI component benchmarks, and proposes permutations of essential AI and non-AI component benchmarks as end-to-end benchmarks. An end-to-end benchmark is a distillation of the essential attributes of an industry-scale application. In addition, AIBench provides a highly extensible, configurable, and flexible benchmark framework, on the basis of which we propose a guideline for building end-to-end benchmarks and present the first end-to-end Internet service AI benchmark. The sixteen component benchmarks are classification, image generation, text-to-text translation, image-to-text, image-to-image, speech-to-text, face embedding, 3D face recognition, object detection, video prediction, image compression, recommendation, 3D object reconstruction, text summarization, spatial transformer, and learning to rank. The fourteen micro benchmarks are Convolution, Fully Connected, ReLU, Sigmoid, Tanh, MaxPooling, AvgPooling, CosineNorm, BatchNorm, Dropout, Element-wise Multiply, Softmax, Data Arrangement, and Memcpy. The first end-to-end benchmark models E-commerce search intelligence. The benchmarks are implemented not only on mainstream deep learning frameworks such as TensorFlow and PyTorch, but also on traditional programming models such as Pthreads, to enable apples-to-apples comparisons.
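
For a flavor of what such a micro benchmark measures, the sketch below times the Convolution operator in PyTorch; the layer shapes, batch size, and iteration count are illustrative assumptions rather than AIBench's actual configuration.

```python
# Minimal sketch of a Convolution micro benchmark in PyTorch
# (shapes and iteration counts are illustrative assumptions,
# not AIBench's actual configuration).
import time
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
conv = nn.Conv2d(3, 64, kernel_size=3, padding=1).to(device)
x = torch.randn(32, 3, 224, 224, device=device)

with torch.no_grad():
    for _ in range(5):  # warm-up iterations, excluded from timing
        conv(x)
    if device == "cuda":
        torch.cuda.synchronize()
    iters = 100
    start = time.perf_counter()
    for _ in range(iters):
        conv(x)
    if device == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU kernels to finish
    elapsed = time.perf_counter() - start

print(f"Convolution: {elapsed / iters * 1e3:.2f} ms/iteration on {device}")
```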

HPC AI500: A Benchmark Suite for HPC AI Systems [Bench18, Specification]

HPC AI500 provides 3 representative scientific data sets and 7 benchmarks. The benchmarks cover 3 workload types: extreme weather analysis, high energy physics, and cosmology. The suite consists of 3 micro benchmarks and 4 component benchmarks. The micro benchmarks are implemented on two software stacks, CUDA and MKL; the component benchmarks are implemented on two software stacks, TensorFlow and PyTorch.

AIoTBench: Towards Comprehensive Benchmarking of Mobile and Embedded Device Intelligence [Bench18, Specification]

AIoTBench provides 3 representative real-world data sets and 12 benchmarks. The benchmarks cover 3 application domains: image recognition, speech recognition, and natural language processing. The suite consists of 9 micro benchmarks and 3 component benchmarks, covers different platforms, including Android devices and Raspberry Pi, and covers different development tools, including TensorFlow and Caffe2.
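
As a rough sketch of how on-device inference latency might be measured on such platforms, the example below uses the TensorFlow Lite interpreter; the model file name and measurement loop are assumptions, not AIoTBench's actual harness.

```python
# Minimal sketch of measuring on-device inference latency with
# TensorFlow Lite (model path and input assumptions are hypothetical;
# AIoTBench's actual harness may differ).
import time
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="mobilenet_v2.tflite")  # hypothetical model file
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# Random tensor as a stand-in for a real image (assumes a float32 model).
image = np.random.rand(*inp["shape"]).astype(np.float32)

latencies = []
for _ in range(50):
    interpreter.set_tensor(inp["index"], image)
    start = time.perf_counter()
    interpreter.invoke()
    latencies.append(time.perf_counter() - start)

print(f"median latency: {sorted(latencies)[len(latencies) // 2] * 1e3:.2f} ms")
```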

Edge AIBench: Towards Comprehensive End-to-end Edge Computing Benchmarking [Bench18, Specification]

Edge AIBench provides 5 representative real-world data sets and 16 benchmarks. The benchmarks cover 4 application scenarios: ICU Patient Monitor, Surveillance Camera, Smart Home, and Autonomous Vehicle. The suite consists of 8 micro benchmarks and 8 component benchmarks. Moreover, it provides an edge computing AI testbed combined with federated learning.
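
To make the federated learning aspect concrete, the sketch below implements federated averaging (FedAvg), the canonical aggregation step in federated learning; it illustrates the idea only and is not Edge AIBench's testbed code.

```python
# Minimal sketch of federated averaging (FedAvg): edge clients train
# locally and the server averages their weights, weighted by local
# data set size. Illustrative only; not Edge AIBench's testbed code.
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Average per-client model weights, weighting each client by the
    size of its local data set."""
    total = sum(client_sizes)
    avg = [np.zeros_like(w) for w in client_weights[0]]
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            avg[i] += w * (size / total)
    return avg

# Three simulated edge clients, each holding a two-layer model.
clients = [[np.random.rand(4, 4), np.random.rand(4)] for _ in range(3)]
sizes = [100, 250, 50]  # local data set sizes (illustrative)
global_model = fed_avg(clients, sizes)
print(global_model[1])
```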


BenchCouncil Incubator Projects

The Incubator Project is the entry path into BenchCouncil for projects and codebases wishing to become part of BenchCouncil's efforts. All code donations from external organisations and existing external projects wishing to join BenchCouncil enter through the Incubator. The BenchCouncil Incubator has two primary goals: ensuring that all donations are in accordance with BenchCouncil legal standards, and developing new communities that adhere to BenchCouncil's guiding principles. For more regarding the BenchCouncil Incubator, see the Incubator website.

A Benchmark Suite for Medical AI

BenchCPU

EChip

A Benchmark Suite for Smart Grid


Other Benchmarking Proposals

BenchCouncil conferences are open to everyone who would like to contribute benchmarking proposals at any time.

We have received 8 benchmarking proposals.