Commit 88df39c1, authored by Bill Cai

Update README

Parent 2d0de04d
+17 −0
@@ -49,6 +49,23 @@ To the best of our knowledge, this is the largest competition-validated dataset

Both datasets are reusable beyond microservice RCA: time-aligned Metrics+Logs+Traces support multimodal time-series anomaly detection; RCA100's causal-chain labels and entity-relation graphs supply ground truth for causal discovery and graph neural methods; AIOps2025's per-modality key-evidence labels are rare material for studying expert–agent reasoning-path divergence and agent self-evaluation.

## Citation

If you use AgenticOpsEval in your research, please cite the accompanying paper ([arXiv:2606.29193](https://arxiv.org/abs/2606.29193)):

```bibtex
@misc{cai2026agenticopseval,
  title        = {A Multi-Dataset Benchmark for Evaluating LLM Agents in Microservice Failure Diagnosis},
  author       = {Cai, Yuanhong and Nie, Xiaohui and Yin, Kanglin and Pei, Changhua and Sun, Yongqian and Zhang, Shenglin and Liu, Haibin and Liu, Guiyang and Wen, Xidao and Situ, Fang and Pei, Dan},
  year         = {2026},
  eprint       = {2606.29193},
  archivePrefix= {arXiv},
  primaryClass = {cs.SE},
  doi          = {10.48550/arXiv.2606.29193},
  url          = {https://arxiv.org/abs/2606.29193}
}
```

## License

- **AIOps2025**: CC BY-NC 4.0