SC 19 BoF: BenchCouncil AI Benchmark

--- Towards a Comprehensive AI Benchmark Suite for HPC, Datacenter, Edge, and IoT

As diverse communities pay increasing attention to innovative AI and machine learning algorithms, architectures, and systems, the pressure to benchmark them rises. However, the complexity, diversity, and rapid evolution of AI workloads and systems, together with frequently changing workloads, pose great challenges for AI benchmarking. The aim of this BoF is to discuss how to build a comprehensive AI benchmark suite across different communities, with an emphasis on data and workload distributions among HPC, data center, Edge, and IoT.

(1) HPC AI500: A Benchmark Suite for HPC AI Systems. [Homepage], [Paper]

(2) AIBench: An Industry Standard Internet Service AI Benchmark Suite. [Homepage], [Technical Report], [Bench18]

(3) AIoTBench: Towards Comprehensive Benchmarking Mobile and Embedded device Intelligence. [Homepage], [Paper]

(4) Edge AIBench: Towards Comprehensive End-to-end Edge Computing Benchmarking. [Homepage], [Paper]

(5) FakeBench: a benchmark for detecting fake images and videos.

The source code of the BenchCouncil AI Benchmarks is hosted on BenchHub, the BenchCouncil code management system. The hands-on tutorials use the BenchCouncil Testbed as their platform, which is the world's first large-scale and open artificial intelligence testbed.

Location and Date

We will give a BoF forum on BenchCouncil AI Benchmarks at SC19 in Denver, Colorado, USA.

November 19, 2019 (Tuesday), 05:15 pm - 06:45 pm (An hour and a half)

ROOM: 503-504

Organizers and Presenters

Session Leader: Prof. Jianfeng Zhan, ICT, Chinese Academy of Sciences, and University of Chinese Academy of Sciences
Session Co-Leader: Prof. Xiaoyi Lu, Ohio State University, USA
Presenter: Prof. Geoffrey Fox, Indiana University, USA
Presenter: Prof. Tony Hey, Rutherford Appleton Laboratory, STFC
Presenter: Tianshu Hao, ICT, Chinese Academy of Sciences, and University of Chinese Academy of Sciences


BenchCouncil AI Benchmarks is a comprehensive AI benchmark suite for HPC, Datacenter, Edge, and IoT. It is a multidisciplinary research and engineering effort, spanning systems, architecture, data management, and machine learning, from both industry and academia.

HPC AI500 is a benchmark suite for evaluating HPC systems that run scientific DL workloads. Each workload in HPC AI500 is based on a real scientific DL application, and together they cover the most representative scientific fields, namely climate analysis, cosmology, high energy physics, gravitational wave physics, and computational biology. Currently, we choose 18 scientific DL benchmarks (for details, see Specification) with respect to application scenarios, datasets, and software stack. Furthermore, we propose a set of metrics for comprehensively evaluating HPC systems, considering accuracy and performance as well as power and cost. In addition, we provide a scalable reference implementation of HPC AI500.

AIBench is the first industry-scale AI benchmark suite, developed jointly with seventeen industry partners. We present a highly extensible, configurable, and flexible benchmark framework that contains multiple loosely coupled modules, such as data input, prominent AI problem domains, online inference, offline training, and automatic deployment tool modules. We analyze typical AI application scenarios from the three most important Internet service domains, and then abstract and identify sixteen prominent AI problem domains: classification, image generation, text-to-text translation, image-to-text, image-to-image, speech-to-text, face embedding, 3D face recognition, object detection, video prediction, image compression, recommendation, 3D object reconstruction, text summarization, spatial transformer, and learning to rank. We implement sixteen component benchmarks for those AI problem domains, and further profile and implement twelve fundamental units of computation across the component benchmarks as micro benchmarks.

Edge AIBench is a benchmark suite for end-to-end edge computing. It includes four typical application scenarios: ICU Patient Monitor, Surveillance Camera, Smart Home, and Autonomous Vehicle, which together reflect the complexity of edge computing AI scenarios. Table 1 shows the component benchmarks of Edge AIBench. In addition, Edge AIBench provides an end-to-end application benchmarking framework that covers data collection, training, validation, inference, and other parts, built on a general three-layer edge computing framework.

AIoTBench is a comprehensive benchmark suite to evaluate the AI ability of mobile and embedded devices. Our benchmark 1) covers different application domains, e.g. image recognition, speech recognition and natural language processing; 2) covers different platforms, including Android devices and Raspberry Pi; 3) covers different development tools, including TensorFlow and Caffe2; 4) offers both end-to-end application workloads and micro workloads.


Time | Agenda | Presenter | Resources
17:15-17:42 | BenchCouncil's view on AI Benchmarking | Jianfeng Zhan | Slides, TR
17:42-17:50 | Hands-on tutorials on AI testbed, benchmarks, and AI challenges | Tianshu Hao | TBD
17:50-18:05 | BDEC perspectives on AI Benchmarking | Geoffrey Fox | TBD
18:05-18:20 | Benchmarking AI for Sciences | Tony Hey | TBD
18:20-18:35 | Benchmarking AI Processors and Systems on HPC and Edge Computing Architectures | Xiaoyi Lu | TBD
18:35-18:45 | Discussion and closing remarks | Jianfeng Zhan | TBD


BenchCouncil’s View on Benchmarking AI and Other Emerging Workloads. [PDF]
Jianfeng Zhan, Lei Wang, Wanling Gao, and Rui Ren. arXiv preprint arXiv:1912.00572.

AIBench: An Industry Standard Internet Service AI Benchmark Suite. [PDF]
Wanling Gao, Fei Tang, Lei Wang, Jianfeng Zhan, Chunxin Lan, Chunjie Luo, Yunyou Huang, Chen Zheng, Jiahui Dai, Zheng Cao, Daoyi Zheng, Haoning Tang, Kunlin Zhan, Biao Wang, Defei Kong, Tong Wu, Minghe Yu, Chongkang Tan, Huan Li, Xinhui Tian, Yatao Li, Gang Lu, Junchao Shao, Zhenyu Wang, Xiaoyu Wang, and Hainan Ye. Technical Report, 2019.

AIBench: Towards Scalable and Comprehensive Datacenter AI Benchmarking. [PDF]
Wanling Gao, Chunjie Luo, Lei Wang, Xingwang Xiong, Jianan Chen, Tianshu Hao, Zihan Jiang, Fanda Fan, Mengjia Du, Yunyou Huang, Fan Zhang, Xu Wen, Chen Zheng, Xiwen He, Jiahui Dai, Hainan Ye, Zheng Cao, Zhen Jia, Kent Zhan, Haoning Tang, Daoyi Zheng, Biwei Xie, Wei Li, Xiaoyu Wang, and Jianfeng Zhan. 2018 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench18).

HPC AI500: A Benchmark Suite for HPC AI Systems. [PDF]
Zihan Jiang, Wanling Gao, Lei Wang, Xingwang Xiong, Yuchen Zhang, Xu Wen, Chunjie Luo, Hainan Ye, Xiaoyi Lu, Yunquan Zhang, Shengzhong Feng, Kenli Li, Weijia Xu, and Jianfeng Zhan. 2018 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench18).

AIoTBench: Towards Comprehensive Benchmarking Mobile and Embedded device Intelligence. [PDF]
Chunjie Luo, Fan Zhang, Cheng Huang, Xingwang Xiong, Jianan Chen, Lei Wang, Wanling Gao, Hainan Ye, Tong Wu, Runsong Zhou, and Jianfeng Zhan. 2018 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench18).

Edge AIBench: Towards Comprehensive End-to-end Edge Computing Benchmarking. [PDF]
Tianshu Hao, Yunyou Huang, Xu Wen, Wanling Gao, Fan Zhang, Chen Zheng, Lei Wang, Hainan Ye, Kai Hwang, Zujie Ren, and Jianfeng Zhan. 2018 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench18).

DCMix: Generating mixed workloads for the cloud data center. [PDF]
Xingwang Xiong, Lei Wang, Wanling Gao, Rui Ren, Ke Liu, Chen Zheng, Yu Wen, and Yi Liang. 2018 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench18).

Data Motifs: A Lens Towards Fully Understanding Big Data and AI Workloads. [PDF]
Wanling Gao, Jianfeng Zhan, Lei Wang, Chunjie Luo, Daoyi Zheng, Fei Tang, Biwei Xie, Chen Zheng, Xu Wen, Xiwen He, Hainan Ye and Rui Ren. The 27th International Conference on Parallel Architectures and Compilation Techniques (PACT 2018).

BigDataBench: a Big Data Benchmark Suite from Internet Services. [PDF]
Lei Wang, Jianfeng Zhan, Chunjie Luo, Yuqing Zhu, Qiang Yang, Yongqiang He, Wanling Gao, Zhen Jia, Yingjie Shi, Shujie Zhang, Cheng Zhen, Gang Lu, Kent Zhan, Xiaona Li, and Bizhu Qiu. The 20th IEEE International Symposium On High Performance Computer Architecture (HPCA-2014), February 15-19, 2014, Orlando, Florida, USA.

BigDataBench: a Scalable and Unified Big Data and AI Benchmark Suite. [PDF]
Wanling Gao, Jianfeng Zhan, Lei Wang, Chunjie Luo, Daoyi Zheng, Rui Ren, Chen Zheng, Gang Lu, Jingwei Li, Zheng Cao, Shujie Zhang, and Haoning Tang. Technical Report, arXiv preprint arXiv:1802.08254, January 27, 2018.

Understanding Big Data Analytics Workloads on Modern Processors. [PDF]
Zhen Jia, Jianfeng Zhan, Lei Wang, Chunjie Luo, Wanling Gao, Yi Jin, Rui Han and Lixin Zhang. IEEE Transactions on Parallel and Distributed Systems, 28(6), 1797-1810, 2017.

Characterizing data analysis workloads in data centers. [PDF]
Zhen Jia, Lei Wang, Jianfeng Zhan, Lixin Zhang, Chunjie Luo. 2013 IEEE International Symposium on Workload Characterization (IISWC 2013) (Best paper award).


Jianfeng Zhan

Jianfeng Zhan is a Full Professor at the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. He has supervised over 80 graduate students (both MS and Ph.D.), post-docs, and engineers. His research interests cover a wide spectrum in the areas of high performance and distributed systems. He has made strong and effective efforts to transfer his academic research into advanced technology that impacts general-purpose production systems. Since its publication in HPCA 2014, BigDataBench has been widely used in both academia and industry around the world. He has transferred more than 40 OS and distributed system patents to top companies. He founded BenchCouncil, a multidisciplinary international benchmark council, and served as an associate editor of TPDS.

Xiaoyi Lu

Xiaoyi Lu is a Research Assistant Professor of the Department of Computer Science and Engineering at the Ohio State University, USA. His current research interests include high performance interconnects and protocols, Big Data Analytics, Parallel Computing Models, Virtualization, Cloud Computing, and Deep Learning system software. He has published more than 100 papers in major International conferences, workshops, and journals with multiple Best (Student) Paper Awards or Nominations. He has delivered more than 100 invited talks, tutorials, and presentations worldwide. He has been actively involved in various professional activities in academic journals and conferences. He is a member of IEEE and ACM. More details about Dr. Lu are available at∼luxi.

Geoffrey Fox

Geoffrey Fox received a Ph.D. in Theoretical Physics from Cambridge University and is now distinguished professor of Informatics and Computing, and Physics at Indiana University where he is director of the Digital Science Center, Chair of Department of Intelligent Systems Engineering and Director of the Data Science program at the School of Informatics, Computing, and Engineering. He currently works in applying computer science from infrastructure to analytics in Biology, Pathology, Sensor Clouds, Earthquake and Ice-sheet Science, Image processing, Deep Learning, Manufacturing, Network Science and Particle Physics. The infrastructure work is built around Software Defined Systems on Clouds and Clusters. The analytics focuses on scalable parallelism.

Tony Hey

Tony Hey has a doctorate in particle physics from the University of Oxford. After a career in physics that included research positions at Caltech, MIT and CERN, and a professorship at the University of Southampton, he became interested in parallel computing and moved into computer science. His group was one of the first to build and explore the development of parallel software for message-passing distributed memory computers. He was one of the authors of the first draft of the MPI message-passing standard. Tony led the U.K.’s eScience initiative in 2001 before joining Microsoft in 2005 as Vice-President for Technical Computing. He returned to work in the UK in 2015 as Chief Data Scientist at the Rutherford Appleton Laboratory and leads the ‘Scientific Machine Learning’ group. Tony is a fellow of the Association for Computing Machinery, the American Association for the Advancement of Science, and the Royal Academy of Engineering.

Tianshu Hao

Tianshu Hao received the B.S. degree from Nankai University, Tianjin, China, in 2015. She is currently pursuing a Ph.D. degree at ICT, CAS. Her research interests focus on big data, edge computing, IoT, and AI benchmarking.

Related Links