The First International Workshop on Evaluatology (Evaluatology 2024)

Program

Bench 2024 and Open & AI 2024
Time	Schedule	Presenter
12.4 Morning Plenary Session (Chair: Dr. Fanda Fan, Simin Chen, Jiayuan Ge), NanHua Hall, 3rd floor.
9:00-9:10	Opening Remarks	Prof. Jianfeng Zhan (Founding Chair of BenchCouncil), Prof. Geoffrey Fox (BenchCouncil Steering Committee, ACM/IEEE Fellow)
9:10-9:15	Top Open Source Contributions: A Century Ranking List	Prof. Weiwei Lin (South China University of Technology)
9:15-9:55	Keynote: Benchmarking AI for Science	Prof. Geoffrey Fox (ACM/IEEE Fellow)
9:55-10:00	Inauguration ceremony for Evaluatology Research Center: Civil Aviation Flight University of China, East China Normal University, Institute of Computing Technology, Chinese Academy of Sciences, Institute of Software, Chinese Academy of Sciences, Guangxi Normal University, BenchCouncil Research Beijing, The Second Affiliated Hospital of Guilin Medical University	Prof. Geoffrey Fox (BenchCouncil Steering Committee), Prof. Tilmann Rabl, Hajdi Cenan, Davor Runje, Prof. Jianfeng Zhan
10:00-10:05	Top Open Source AI Contributions: A Century Ranking List	TBD
10:05-10:45	Keynote: Evaluatology in Aviation	Prof. Weiping Li (Civil Aviation Flight University of China)
10:45-10:55	Tea Break
10:55-11:00	Grant Open Source Achievement Certificate (Project: EasyGraph, Contributor: Prof. Chen Yang from Fudan University)	Prof. Geoffrey Fox (BenchCouncil Steering Committee Member) & Hajdi Cenan (European AI Expert)
11:00-11:40	Keynote: Introducing FastAgency - the fastest way to bring AutoGen workflows to production	Hajdi Cenan, Davor Runje (European AI Experts)
11:40-12:10	Open Source Evaluatology: Towards a Global Standard for Contribution Evaluation	Prof. Wei Wang (East China Normal University)
12:10-12:15	Release of Open Source National and Regional Rankings	Prof. Wei Wang (East China Normal University)
12.4 Afternoon Plenary Session (Chair: Dr. Fanda Fan, Simin Chen, Jiayuan Ge), NanHua Hall, 3rd floor.
14:00-14:40	Keynote: Challenges in Modern Benchmarking	Prof. Tilmann Rabl (University of Potsdam)
14:40-14:45	Top LLM Contributions: A Century Ranking List	Prof. Saiqin Long (Jinan University)
14:45-15:15	Invited Talk	Prof. Quanshi Zhang (Shanghai Jiao Tong University)
15:15-15:45	Invited Talk: Evaluatology: The Science and Engineering of Evaluation	Dr. Wanling Gao (ICT, CAS)
15:45-15:50	Top Open Source Education Contributions: A Century Ranking List	Dr. Jianhua Yang (Guangdong Polytechnic Normal University)
15:50-16:00	BenchCouncil Standards Working Group Guidelines	Prof. Jianfeng Zhan (Founding Chair of BenchCouncil)
16:00-16:02	Inauguration ceremony for the Open Source Standards Working Group	Prof. Geoffrey Fox (BenchCouncil Steering Committee), Hajdi Cenan (AIrt)
16:02-16:20	Methodology, Approach, and Vision of the Open Source Standards Working Group	Prof. Aoying Zhou, Prof. Wei Wang (East China Normal University)
16:20-16:25	Top Medical AI Contributions: A Century Ranking List	TBD
16:25-16:27	Top Open Source Contributions: A Century Ranking List	Prof. Geoffrey Fox (BenchCouncil Steering Committee), Hajdi Cenan (AIrt)
16:27-16:45	Methodology, Approach, and Vision of the Low Altitude Economy Standards Working Group	Prof. Weiping Li (Civil Aviation Flight University of China); Dr. Xinguo Chen (Institute of Software, Chinese Academy of Sciences)
16:45-16:50	Top Financial AI Contributions: A Century Ranking List	TBD
16:50-17:10	Current Status and Technical Practices of AI Computing Power	Shihai Zhu (CTO of Beijing An-link)
17:10-17:12	Inauguration ceremony for the LLM Standards Working Group	Prof. Geoffrey Fox (BenchCouncil Steering Committee), Hajdi Cenan (AIrt)
17:12-17:30	Methodology, Approach, and Vision of the LLM Standards Working Group	Prof. Jianfeng Zhan, Dr. Wanling Gao, Dr. Chunjie Luo
17:30-17:40	Ranking List for Top Contributions in Open Source Subfields: LLM/Unmanned Aerial Vehicles	TBD
17:40-18:00	Invited Talk: The Principle of Evaluation	Prof. Jianmin Tang (Hangzhou Dianzi University)
18:00-18:15	Ranking List for Top Contributions in Open Source Subfields: Security/RISC-V/Embodied AI	TBD
12.5 Morning
Session I Bench Paper Session, JuYi Hall, 3rd floor.
9:00-9:40	Invited Talk	Prof. Weining Qian (East China Normal University)
9:40-10:00	LWMEval: Evaluating Large-Scale Neural Networks for Six-Hour Weather Nowcasting	Chaochong Zhang (Sun Yat-Sen University)
10:00-10:20	DNN-schedule: A Predictive Scheduler for Minimizing Interference of Co-located DNN Workload	Jiamin Lu (University of Science and Technology of China)
10:20-10:40	Benchmarking Distributed Transactional Database Systems	Hailin He (East China Normal University)
10:40-11:00	CaloBench: A Benchmark Study of Generative Models for Calorimeter Showers	Geoffrey Fox (University of Virginia)
11:00-11:20	StockNNEval: Evaluating Neural Network Methods for Predicting Stock Trend	Zikai Liao (Sun Yat-Sen University)
11:20-11:40	StellarTop: An Integrated Multi-Topic Dataset on GitHub Repositories	Zhiwei Zhu (East China Normal University)
11:40-12:00	Evaluating Kernel Anti-Exploitation Capabilities: An Evaluatology-based Scalable and General Framework	Simin Chen (Zhongguancun Lab)
12:00-12:20	Benchmarking Edge Computing System for Autonomous Vehicle via CAV Motifs	Yifan Wang (ICT, CAS)
Session II SimAI: High-Precision Simulator for LLM Cluster Training, NanHu Hall, 3rd floor.
9:00-9:30	SimAI: A Simulator for Large-Scale Cluster Training	Alibaba
9:30-9:50	AICB Communication Benchmark Practice	Alibaba
9:50-10:30	SimAI-Analytical Simulation Practice	Alibaba
10:30-11:00	Tea Break
11:00-11:30	SimAI-Simulation Full-Stack Simulation Practice	Alibaba
11:30-12:00	SimAI-Physical CPU-NCCL Physical Traffic Practice Interactive Exchange	Alibaba
Session III IC Paper Session, NanShan Hall, 3rd floor.
9:00-9:20	Invited Talk: Trisolaris Computing Constellation – Space Computing Infrastructure	Luqi Gong (Zhejiang Lab)
9:20-9:40	Parallel Computing on RTEMS Operating System	Zeyu Liang (Northeastern University)
9:40-10:00	GRAC: a method for cancer drug response prediction based on graph residual attention and contrastive learning similarity	Na Luo (Northeast Normal University)
10:00-10:20	Construction and Application of a Semantic Linked Network for Space Weather Data Based on Metadata	Ci-Feng Wang (National Space Science Center, CAS)
10:20-10:40	Personalized Exercise Recommendations: Federated Learning with Hierarchical Attention for School-Specific Needs	Ye Zhang (Northeast Normal University)
10:40-11:00	Patent Information Extraction Based on Teacher Student Model - A Case Study of Zinc Battery Patent Dataset	Lingchen Cai (Sichuan University)
11:00-11:20	Artificial Intelligence Modelling Paths for Reasoning Argumentation Methods for Criminal Evidence	Xiaohan Shao (Zhejiang University)
11:20-11:40	Parallel Decomposition Method for Deep Learning Models Based on Improved Dual Population Genetic Algorithm	Zi Han (Shandong Normal University)
11:40-12:00	Integrating CNNs and Transformers for Mid-Price Prediction in High-Frequency Trading	Yuqing Tang (Xi'an Jiaotong-Liverpool University)
12:00-12:20	Hierarchical Recurrent Network for Active Stereo Matching	Yuan Liu (Zhejiang Lab)
12.5 Afternoon
Session I: Preparatory meeting and forum discussions for the Open Source Standards Working Group, Prof. Aoying Zhou (East China Normal University), Prof. Wei Wang (East China Normal University), JuYi Hall, 3rd floor.
Session II: Preparatory meeting and forum discussions for the Low Altitude Economy Standards Working Group, Prof. Weiping Li (Civil Aviation Flight University of China); Dr. Xinguo Chen (Institute of Software, Chinese Academy of Sciences), NanHu Hall, 3rd floor.
Session III: Preparatory meeting and forum discussions for the LLM Standards Working Group, Prof. Jianfeng Zhan (Founding Chair of BenchCouncil), NanShan Hall, 3rd floor.
12.6 Morning
Session I: Evaluatology Paper Session 1 - Evaluatology Foundations and Frameworks, JuYi Hall, 3rd floor.
9:00-9:20	Open Source Evaluatology: Theoretical Framework and Practical Pathways for Systematic Evaluation of Open Source Ecosystem	Fanyu Han (East China Normal University)
9:20-9:40	Constructing Benchmarks for Open Source Ecosystems: A Stakeholder Needs-Driven Approach	Zhen Zhang (Hubei University)
9:40-10:00	Open Source Informetrics: Theoretical Framework and Practical Path of Open Source Ecosystem	Zehua Lou (East China Normal University)
10:00-10:20	Evaluatology's Perspective on AI Evaluation in Critical Scenarios: From Tail Quality to The Landscape	Zhengxin Yang (ICT, CAS)
10:20-10:40	Evaluating Long-Term Usage Patterns of Open Source Datasets: A Citation Network Approach	Jiaheng Peng (East China Normal University)
10:40-11:00	Evaluating Large Language Models on the Edge: A Use Case of Evaluatology	Zhikun Dong (ICT, CAS)
11:00-11:20	A Benchmark Dataset and Evaluation of Collaboration Network in Open Source Software Community	Fan Huang (East China Normal University)
11:20-11:40	The Theory of Computational Evaluatology	Hedong Yan (ICT, CAS)
Session II: Evaluatology Paper Session 2 - Benchmarkology and Performance Evaluation, NanHu Hall, 3rd floor.
9:00-9:20	Evaluating the Performance of Complex Textual Tasks Generated by Large Language Models	Fenglin Bi (East China Normal University)
9:20-9:40	A Framework for Evaluating Cultural Bias and Historical Misconceptions in LLM Outputs	Moon-Kuen Mak (Chinese Academy of Sciences)
9:40-10:00	Patrick Star: A Comprehensive Benchmark for Multi-Modal Image Editing	Di Cheng (Beijing Institute Of Fashion Technology)
10:00-10:20	AICB: a Benchmark Suite for Evaluating the Communication Subsystem of LLM Training Clusters	Gang Lu (Alibaba Cloud)
10:20-10:40	A Performance Evaluation Method for Recommendation Model Training on Heterogeneous NPUs	Qiang Liu (Tencent)
10:40-11:00	Design and Practice of Performance Evaluation System for High Performance General-purpose CPU	Weijun Zhong (China Electronics Standardization Institute)
11:00-11:20	A Context-Driven Benchmark for Evaluating Task Management Capabilities of Digital Assistants	Jiachen Du (Tsinghua University)
11:20-11:40	FewNovelBench: A Benchmark for Few-Shot Learning with Many Novel Classes	Miao Wang (Intelligent Game and Decision Lab, Academy of Military Science)
12.6 Afternoon
Session I: Evaluatology Paper Session 3 - Evaluatology Applications Across Multi-Disciplines, JuYi Hall, 3rd floor.
14:00-14:20	Missing materials data imputation workflow towards improving the prediction performance of machine learning	Yue Liu (Shanghai University)
14:20-14:40	Research on intelligent traffic surveillance video compression quality assessment method	Xiangnan Zhao (National Institute of Metrology)
14:40-15:00	Research on Multidimensional Evaluation Technology of Teachers' Digital Literacy Based on Large Language Models	Di Fan (Northeastern University)
15:00-15:20	Knuth Test: Enhancing Assessment Accuracy in Introductory Computer Science Education	Yu Du (ICT, CAS)
15:20-15:40	MixSchedSim: A Simulator for Mixed Workload Scheduling in Heterogeneous Computing Environments	Fei Tang (Inspur Data Co.,Ltd.)
15:40-16:00	BigTensorDB-Coupled Artificial Intelligence for Science: A Retrosynthetic Analysis Case Study	Xueya Zhang (University of Chinese Academy of Sciences)
16:00-16:20	An Experimental study on Evaluating Senior High School Gifted Talented Students’ Academic Literacy	Ping Lei
16:20-16:40	Real-World Drug Clinical Research Based on Artificial Intelligence	Kunqian Yu (Chinese Academy of Sciences)
16:40-17:00	CodeAgent - Collaborative Agents for Software Engineering	Daniel Tang, Interdisciplinary Security and Trust Centre (SnT), University of Luxembourg (UL)
Session II: TBench Paper Session, NanHu Hall, 3rd floor.
14:00-14:20	Could bibliometrics reveal top science and technology achievements and researchers? The case for evaluatology-based science and technology evaluation	Wanling Gao (University of Chinese Academy of Sciences)
14:20-14:40	BinCodex: A comprehensive and multi-level dataset for evaluating binary code similarity detection techniques	Peihua Zhang (Tencent)
14:40-15:00	An approach to workload generation for modern data centers: A view from Alibaba trace	Yi Liang (Beijing University of Technology)
15:00-15:20	TensorTable: Extending PyTorch for mixed relational and linear algebra pipelines	Xu Wen (Huawei)
15:20-15:40	Evaluation of mechanical properties of natural fiber based polymer composite	Tarikur Jaman Pramanik (Khulna University of Engineering & Technology)
15:40-16:00	Enhanced deep learning based decision support system for kidney tumour detection	Taha ETEM (Çankırı Karatekin University)
16:00-16:20	Analyzing the impact of opportunistic maintenance optimization on manufacturing industries in Bangladesh: An empirical study	Md. Ariful Alam (Bangladesh Army University of Science and Technology)