Conference on Evaluation Science and Engineering & Conference on Open Source and Artificial Intelligence |
Time |
Agenda |
Speaker |
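Dec. 4, morning: Main Forum (hosts: Dr. Fanda Fan, Simin Chen, Jiayuan Ge), Nanhua Hall, 3rd floor |
9:00-9:10 |
Opening ceremony |
Prof. Jianfeng Zhan (BenchCouncil Founding Chair), Prof. Geoffrey Fox (BenchCouncil Steering Committee, ACM/IEEE Fellow) |
9:10-9:15 |
Release of the Open Source Contribution Century Ranking |
Prof. Weiwei Lin (South China University of Technology) |
9:15-9:55 |
Keynote: Benchmarking AI for Science |
Prof. Geoffrey Fox (ACM/IEEE Fellow) |
9:55-10:00 |
Evaluatology Research Center plaque-awarding ceremony: Civil Aviation Flight University of China; East China Normal University; Institute of Computing Technology, Chinese Academy of Sciences; Institute of Software, Chinese Academy of Sciences; Guangxi Normal University; Beijing Benchi Kangshe Research Institute (北京本尺康舍研究院) |
Prof. Geoffrey Fox (BenchCouncil Steering Committee), Prof. Tilmann Rabl, Hajdi Cenan, Davor Runje, Prof. Jianfeng Zhan |
10:00-10:05 |
Release of the Open Source AI Contribution Century Ranking |
TBD |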
10:05-10:45 |
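Keynote: Evaluatology in Aviation |
Prof. Weiping Li (Chief Scientist, Civil Aviation Flight University of China) |
10:45-10:55 |
Tea break |
10:55-11:00 |
Presentation of the open source contribution certificate (project: EasyGraph; contributor: Yang Chen, professor at Fudan University) |
Prof. Geoffrey Fox (BenchCouncil Steering Committee Member), Hajdi Cenan (European AI expert) |
11:00-11:40 |
Keynote: Introducing FastAgency - the fastest way to bring AutoGen workflows to production |
Hajdi Cenan & Davor Runje (European AI experts) |
11:40-12:10 |
Open Source Evaluatology: Towards a Global Standard for Contribution Evaluation |
Prof. Wei Wang (East China Normal University) |
12:10-12:15 |
Release of the open source country and region rankings |
Prof. Wei Wang (East China Normal University) |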
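Dec. 4, afternoon: Main Forum (hosts: Dr. Fanda Fan, Simin Chen, Jiayuan Ge), Nanhua Hall, 3rd floor |
14:00-14:40 |
Keynote: Challenges in Modern Benchmarking |
Prof. Tilmann Rabl (University of Potsdam) |
14:40-14:45 |
Release of the Open Source Large Model Contribution Century Ranking |
Prof. Saiqin Long (Jinan University) |
14:45-15:15 |
Invited talk |
Prof. Quanshi Zhang (Shanghai Jiao Tong University) |
15:15-15:45 |
Invited talk: Evaluatology: The Science and Engineering of Evaluation |
Associate Researcher Wanling Gao (Institute of Computing Technology, Chinese Academy of Sciences) |
15:45-15:50 |
Release of the Open Source UAV Contribution Ranking |
Assoc. Prof. Jianhua Yang (Guangdong Polytechnic Normal University) |
15:50-16:00 |
Detailed rules for BenchCouncil standards working groups |
Prof. Jianfeng Zhan (BenchCouncil Founding Chair) |
16:00-16:02 |
Open Source Standards Working Group plaque-awarding ceremony |
Prof. Geoffrey Fox (BenchCouncil Steering Committee), Hajdi Cenan |
16:02-16:20 |
Open Source Standards Working Group: methods, approach, and vision |
Prof. Aoying Zhou, Prof. Wei Wang (East China Normal University) |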
16:20-16:25 |
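Release of the Open Source Medical AI Contribution Century Ranking |
TBD |
16:25-16:27 |
Low-Altitude Economy Standards Working Group plaque-awarding ceremony |
Prof. Geoffrey Fox (BenchCouncil Steering Committee), Hajdi Cenan |
16:27-16:45 |
Low-Altitude Economy Standards Working Group: methods, approach, and vision |
Prof. Weiping Li (Chief Scientist, Civil Aviation Flight University of China); Senior Engineer Xinguo Chen (Institute of Software, Chinese Academy of Sciences) |
16:45-16:50 |
Release of the Open Source Financial AI Contribution Century Ranking |
TBD |
16:50-17:10 |
AI computing power: current state and technical practice |
Shihai Zhu (CTO of Beijing Anliantong, 北京安联通) |
17:10-17:12 |
Large Model Standards Working Group plaque-awarding ceremony |
Prof. Geoffrey Fox (BenchCouncil Steering Committee), Hajdi Cenan |
17:12-17:30 |
Large Model Standards Working Group: methods, approach, and vision |
Prof. Jianfeng Zhan, Associate Researcher Wanling Gao, Associate Researcher Chunjie Luo (Institute of Computing Technology, Chinese Academy of Sciences) |
17:30-17:40 |
Invited talk: Research on the Fundamental Theory of Evaluation Science |
Prof. Jianmin Tang (Hangzhou Dianzi University) |
17:40-18:00 |
Release of the open source domain-specific contribution rankings: Security / RISC-V / Embodied Intelligence |
TBD |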
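Dec. 5, morning |
Session I: Bench conference paper presentations, Juyi Hall, 3rd floor |
9:00-9:40 |
Invited talk |
Weining Qian (East China Normal University) |
9:40-10:00 |
LWMEval: Evaluating Large-Scale Neural Networks for Six-Hour Weather Nowcasting |
Chaochong Zhang (Sun Yat-sen University) |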
10:00-10:20 |
DNN-schedule: A Predictive Scheduler for Minimizing Interference of Co-located DNN Workload |
Jiamin Lu (University of Science and Technology of China) |
10:20-10:40 |
Benchmarking Distributed Transactional Database Systems |
Hailin He (East China Normal University) |
10:40-11:00 |
CaloBench: A Benchmark Study of Generative Models for Calorimeter Showers |
Geoffrey Fox (University of Virginia) |
11:00-11:20 |
StockNNEval: Evaluating Neural Network Methods for Predicting Stock Trend |
Zikai Liao (Sun Yat-sen University) |
11:20-11:40 |
StellarTop: An Integrated Multi-Topic Dataset on GitHub Repositories |
Zhiwei Zhu (East China Normal University) |
11:40-12:00 |
Evaluating Kernel Anti-Exploitation Capabilities: A Evaluatology-based Scalable and General Framework |
Simin Chen (Zhongguancun Laboratory) |
12:00-12:20 |
Benchmarking Edge Computing System for Autonomous Vehicle via CAV Motifs |
Yifan Wang (Institute of Computing Technology, Chinese Academy of Sciences) |
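Session II: SimAI Forum: a high-precision simulator for large-model cluster training, Nanhu Hall, 3rd floor |
9:00-9:30 |
SimAI: a simulator for large-scale cluster training |
Alibaba technical expert |
9:30-9:50 |
AICB communication benchmark in practice |
Alibaba technical expert |
9:50-10:30 |
SimAI-Analytical simulation in practice |
Alibaba technical expert |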
10:30-11:00 |
Tea break |
|
11:00-11:30 |
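SimAI-Simulation full-stack simulation in practice |
Alibaba technical expert |
11:30-12:00 |
SimAI-Physical CPU-NCCL physical traffic generation in practice; open discussion |
All participants |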
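Session III: IC conference paper presentations, Nanshan Hall, 3rd floor |
9:00-9:20 |
Invited talk: Three-Body Computing Constellation: Space Computing Infrastructure |
Luqi Gong (Zhejiang Lab) |
9:20-9:40 |
Parallel Computing on RTEMS Operating System |
Zeyu Liang (Northeastern University, China) |
9:40-10:00 |
GRAC: a method for cancer drug response prediction based on graph residual attention and contrastive learning similarity |
Na Luo (Northeast Normal University) |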
10:00-10:20 |
Construction and Application of a Semantic Linked Network for Space Weather Data Based on Metadata |
Ci-Feng Wang (National Space Science Center) |
10:20-10:40 |
Personalized Exercise Recommendations: Federated Learning with Hierarchical Attention for School-Specific Needs |
Ye Zhang (Northeast Normal University) |
10:40-11:00 |
Patent Information Extraction Based on Teacher Student Model - A Case Study of Zinc Battery Patent Dataset |
Lingchen Cai (Sichuan University) |
11:00-11:20 |
Artificial Intelligence Modelling Paths for Reasoning Argumentation Methods for Criminal Evidence |
Xiaohan Shao (Zhejiang University) |
11:20-11:40 |
Parallel Decomposition Method for Deep Learning Models Based on Improved Dual Population Genetic Algorithm |
Zi Han (Shandong Normal University) |
11:40-12:00 |
Integrating CNNs and Transformers for Mid-Price Prediction in High-Frequency Trading |
Yuqing Tang (Xi'an Jiaotong-Liverpool University) |
12:00-12:20 |
Hierarchical Recurrent Network for Active Stereo Matching |
Yuan Liu (Zhejiang Lab) |
12:20-12:40 |
FewNovelBench: A Benchmark for Few-Shot Learning with Many Novel Classes |
Zhipeng Lin (Academy of Military Sciences, Chinese People's Liberation Army) |
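Dec. 5, afternoon |
Session I: Open Source Contribution Standards Working Group preparatory meeting and discussion (Prof. Aoying Zhou, East China Normal University; Wei Wang, East China Normal University), Juyi Hall, 3rd floor |
Session II: Low-Altitude Economy Standards Working Group preparatory meeting and discussion (Prof. Weiping Li, Chief Scientist, Civil Aviation Flight University of China; Xinguo Chen, Institute of Software, Chinese Academy of Sciences), Nanhu Hall, 3rd floor |
Session III: Large Model Standards Working Group preparatory meeting and discussion (Prof. Jianfeng Zhan, Chair of the International Open Benchmark Council, BenchCouncil), Nanshan Hall, 3rd floor |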
Dec. 6, morning |
Session I: Evaluatology Forum Talks 1 - Evaluatology Foundations and Frameworks, Juyi Hall, 3rd floor |
9:00-9:20 |
Open Source Evaluatology: Theoretical Framework and Practical Pathways for Systematic Evaluation of Open Source Ecosystem |
Fanyu Han (East China Normal University) |
9:20-9:40 |
Constructing Benchmarks for Open Source Ecosystems: A Stakeholder Needs-Driven Approach |
Zhen Zhang (Hubei University) |
9:40-10:00 |
Open Source Informetrics: Theoretical Framework and Practical Path of Open Source Ecosystem |
Zehua Lou (East China Normal University) |
10:00-10:20 |
Evaluatology's Perspective on AI Evaluation in Critical Scenarios: From Tail Quality to The Landscape |
Zhengxin Yang (Institute of Computing Technology, Chinese Academy of Sciences) |
10:20-10:40 |
Evaluating Long-Term Usage Patterns of Open Source Datasets: A Citation Network Approach |
Jiaheng Peng (East China Normal University) |
10:40-11:00 |
Evaluating Large Language Models on the Edge: A Use Case of Evaluatology |
Zhikun Dong (Institute of Computing Technology, Chinese Academy of Sciences) |
11:00-11:20 |
A Benchmark Dataset and Evaluation of Collaboration Network in Open Source Software Community |
Fan Huang (East China Normal University) |
11:20-11:40 |
The Theory of Computational Evaluatology |
Hedong Yan (Institute of Computing Technology, Chinese Academy of Sciences) |
Session II: Evaluatology Forum Talks 2 - Benchmarkology and Performance Evaluation, Nanhu Hall, 3rd floor |
9:00-9:20 |
Evaluating the Performance of Complex Textual Tasks Generated by Large Language Models |
Fenglin Bi (East China Normal University) |
9:20-9:40 |
A Framework for Evaluating Cultural Bias and Historical Misconceptions in LLM Outputs |
Moon-Kuen Mak (Chinese Academy of Sciences) |
9:40-10:00 |
Patrick Star: A Comprehensive Benchmark for Multi-Modal Image Editing |
Di Cheng (Beijing Institute of Fashion Technology) |
10:00-10:20 |
AICB: a Benchmark Suite for Evaluating the Communication Subsystem of LLM Training Clusters |
Gang Lu (Alibaba) |
10:20-10:40 |
A Performance Evaluation Method for Recommendation Model Training on Heterogeneous NPUs |
Qiang Liu (Tencent) |
10:40-11:00 |
Design and Practice of Performance Evaluation System for High Performance General-purpose CPU |
Weijun Zhong (China Electronics Standardization Institute) |
11:00-11:20 |
A Context-Driven Benchmark for Evaluating Task Management Capabilities of Digital Assistants |
Jiachen Du (Tsinghua University) |
Dec. 6, afternoon |
Session I: Evaluatology Forum Talks 3 - Evaluatology Applications Across Multi-Disciplines, Juyi Hall, 3rd floor |
14:00-14:20 |
Missing materials data imputation workflow towards improving the prediction performance of machine learning |
Yue Liu (Shanghai University) |
14:20-14:40 |
Research on intelligent traffic surveillance video compression quality assessment method |
Xiangnan Zhao (National Institute of Metrology, China) |
14:40-15:00 |
Research on Multidimensional Evaluation Technology of Teachers' Digital Literacy Based on Large Language Models |
Di Fan (Northeastern University, China) |
15:00-15:20 |
Knuth Test: Enhancing Assessment Accuracy in Introductory Computer Science Education |
Yu Du (Institute of Computing Technology, Chinese Academy of Sciences) |
15:20-15:40 |
MixSchedSim: A Simulator for Mixed Workload Scheduling in Heterogeneous Computing Environments |
Fei Tang (Inspur) |
15:40-16:00 |
BigTensorDB-Coupled Artificial Intelligence for Science: A Retrosynthetic Analysis Case Study |
Xueya Zhang (University of Chinese Academy of Sciences) |
16:00-16:20 |
An Experimental study on Evaluating Senior High School Gifted Talented Students’ Academic Literacy |
Ping Lei |
16:20-16:40 |
Real-World Drug Clinical Research Based on Artificial Intelligence |
Kunqian Yu (Chinese Academy of Sciences) |
16:40-17:00 |
CodeAgent - Collaborative Agents for Software Engineering |
Daniel Tang (University of Luxembourg) |
Session II: Tbench Forum, Nanhu Hall, 3rd floor |
14:00-14:20 |
Could bibliometrics reveal top science and technology achievements and researchers? The case for evaluatology-based science and technology evaluation |
Wanling Gao (University of Chinese Academy of Sciences) |
14:20-14:40 |
BinCodex: A comprehensive and multi-level dataset for evaluating binary code similarity detection techniques |
Peihua Zhang (Tencent) |
14:40-15:00 |
An approach to workload generation for modern data centers: A view from Alibaba trace |
Yi Liang (Beijing University of Technology) |
15:00-15:20 |
TensorTable: Extending PyTorch for mixed relational and linear algebra pipelines |
Xu Wen (Huawei) |
15:20-15:40 |
Evaluation of mechanical properties of natural fiber based polymer composite |
Tarikur Jaman Pramanik (Khulna University of Engineering & Technology) |
15:40-16:00 |
Enhanced deep learning based decision support system for kidney tumour detection |
Taha Etem (Çankırı Karatekin University) |
16:00-16:20 |
Analyzing the impact of opportunistic maintenance optimization on manufacturing industries in Bangladesh: An empirical study |
Md. Ariful Alam (Bangladesh Army University of Science and Technology) |