init (3efe07ae) · Commit · aiops-live-benchmark / AIOpsChallenge2025-Eval

README.md

+1 −93

Original line no.	Diff line no.	Diff line
		# AIOpsChallenge2025-Eval



		## Getting started

		To make it easy for you to get started with GitLab, here's a list of recommended next steps.

		Already a pro? Just edit this README.md and make it your own. Want to make it easy? [Use the template at the bottom](#editing-this-readme)!

		## Add your files

		- [ ] [Create](https://docs.gitlab.com/ee/user/project/repository/web_editor.html#create-a-file) or [upload](https://docs.gitlab.com/ee/user/project/repository/web_editor.html#upload-a-file) files
		- [ ] [Add files using the command line](https://docs.gitlab.com/ee/gitlab-basics/add-file.html#add-a-file-using-the-command-line) or push an existing Git repository with the following command:

		```
		cd existing_repo
		git remote add origin http://www.aiops.cn/gitlab/aiops-live-benchmark/AIOpsChallenge2025Eval.git
		git branch -M main
		git push -uf origin main
		```

		## Integrate with your tools

		- [ ] [Set up project integrations](http://www.aiops.cn/gitlab/aiops-live-benchmark/AIOpsChallenge2025Eval/-/settings/integrations)

		## Collaborate with your team

		- [ ] [Invite team members and collaborators](https://docs.gitlab.com/ee/user/project/members/)
		- [ ] [Create a new merge request](https://docs.gitlab.com/ee/user/project/merge_requests/creating_merge_requests.html)
		- [ ] [Automatically close issues from merge requests](https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#closing-issues-automatically)
		- [ ] [Enable merge request approvals](https://docs.gitlab.com/ee/user/project/merge_requests/approvals/)
		- [ ] [Set auto-merge](https://docs.gitlab.com/ee/user/project/merge_requests/merge_when_pipeline_succeeds.html)

		## Test and Deploy

		Use the built-in continuous integration in GitLab.

		- [ ] [Get started with GitLab CI/CD](https://docs.gitlab.com/ee/ci/quick_start/index.html)
		- [ ] [Analyze your code for known vulnerabilities with Static Application Security Testing (SAST)](https://docs.gitlab.com/ee/user/application_security/sast/)
		- [ ] [Deploy to Kubernetes, Amazon EC2, or Amazon ECS using Auto Deploy](https://docs.gitlab.com/ee/topics/autodevops/requirements.html)
		- [ ] [Use pull-based deployments for improved Kubernetes management](https://docs.gitlab.com/ee/user/clusters/agent/)
		- [ ] [Set up protected environments](https://docs.gitlab.com/ee/ci/environments/protected_environments.html)

		***

		# Editing this README

		When you're ready to make this README your own, just edit this file and use the handy template below (or feel free to structure it however you want - this is just a starting point!). Thanks to [makeareadme.com](https://www.makeareadme.com/) for this template.

		## Suggestions for a good README

		Every project is different, so consider which of these sections apply to yours. The sections used in the template are suggestions for most open source projects. Also keep in mind that while a README can be too long and detailed, too long is better than too short. If you think your README is too long, consider utilizing another form of documentation rather than cutting out information.

		## Name
		Choose a self-explaining name for your project.

		## Description
		Let people know what your project can do specifically. Provide context and add a link to any reference visitors might be unfamiliar with. A list of Features or a Background subsection can also be added here. If there are alternatives to your project, this is a good place to list differentiating factors.

		## Badges
		On some READMEs, you may see small images that convey metadata, such as whether or not all the tests are passing for the project. You can use Shields to add some to your README. Many services also have instructions for adding a badge.

		## Visuals
		Depending on what you are making, it can be a good idea to include screenshots or even a video (you'll frequently see GIFs rather than actual videos). Tools like ttygif can help, but check out Asciinema for a more sophisticated method.

		## Installation
		Within a particular ecosystem, there may be a common way of installing things, such as using Yarn, NuGet, or Homebrew. However, consider the possibility that whoever is reading your README is a novice and would like more guidance. Listing specific steps helps remove ambiguity and gets people to using your project as quickly as possible. If it only runs in a specific context like a particular programming language version or operating system or has dependencies that have to be installed manually, also add a Requirements subsection.

		## Usage
		Use examples liberally, and show the expected output if you can. It's helpful to have inline the smallest example of usage that you can demonstrate, while providing links to more sophisticated examples if they are too long to reasonably include in the README.

		## Support
		Tell people where they can go to for help. It can be any combination of an issue tracker, a chat room, an email address, etc.

		## Roadmap
		If you have ideas for releases in the future, it is a good idea to list them in the README.

		## Contributing
		State if you are open to contributions and what your requirements are for accepting them.

		For people who want to make changes to your project, it's helpful to have some documentation on how to get started. Perhaps there is a script that they should run or some environment variables that they need to set. Make these steps explicit. These instructions could also be useful to your future self.

		You can also document commands to lint the code or run tests. These steps help to ensure high code quality and reduce the likelihood that the changes inadvertently break something. Having instructions for running tests is especially helpful if it requires external setup, such as starting a Selenium server for testing in a browser.

		## Authors and acknowledgment
		Show your appreciation to those who have contributed to the project.

		## License
		For open source projects, say how it is licensed.

		## Project status
		If you have run out of energy or time for your project, put a note at the top of the README saying that development has slowed down or stopped completely. Someone may choose to fork your project or volunteer to step in as a maintainer or owner, allowing your project to keep going. You can also make an explicit request for maintainers.
		# aiops-2025-eval
		No newline at end of file

client/README.md

0 → 100644

+57 −0

Original line no.	Diff line no.	Diff line
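		# Submission Script Guide

		This guide explains how to submit answers to the judge server during the competition. The script submits a single answer file and returns a submission ID on success.

		## Requirements

		Before you begin, make sure Python 3 is installed on your system.

		## Script overview

		The script accepts a JSON Lines file (`*.jsonl`) in which each line is a standalone JSON object representing the answer to one question.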
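
As a minimal sketch of the format described above, the following writes a list of answer dictionaries to a JSON Lines file, one object per line, and reads it back. The helper names `write_jsonl` and `read_jsonl` are hypothetical and are not part of `submit.py`; the field names are borrowed from the example later in this guide.

```python
import json
import os
import tempfile


def write_jsonl(answers, path):
    """Write one JSON object per line (hypothetical helper, not part of submit.py)."""
    with open(path, "w", encoding="utf-8") as f:
        for answer in answers:
            # ensure_ascii=False keeps any non-ASCII text readable in the file
            f.write(json.dumps(answer, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dictionaries, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Two placeholder answers; real answers would carry root causes and a reasoning trace.
answers = [
    {"anomaly type": "pod kill", "root cause": [], "reasoning trace": [], "path length": 0},
    {"anomaly type": "pod failure", "root cause": [], "reasoning trace": [], "path length": 0},
]
path = os.path.join(tempfile.mkdtemp(), "result.jsonl")
write_jsonl(answers, path)
roundtrip = read_jsonl(path)
```

Because `json.dumps` escapes newlines inside strings, each answer stays on a single physical line, which is what the JSON Lines format requires.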

		## Command-line submission

		To use the script from the command line, change to the directory containing the script and run it with Python:

		```bash
		python submit.py [-h] [-s SERVER] [-c CONTEST] [-k TICKET] [result_path]
		```

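		* `[result_path]`: Path of the result file to submit. If omitted, defaults to `result.jsonl` in the current directory.
		* `-s, --server`: URL of the judge server. If omitted, the `JUDGE_SERVER` variable defined in the script is used.
		* `-c, --contest`: Contest identifier. If omitted, the `CONTEST` variable defined in the script is used.
		* `-k, --ticket`: Team identifier. If omitted, the `TICKET` variable defined in the script is used.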
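
The options above could be modeled with `argparse` roughly as follows. This is only an illustrative sketch: the actual parser in `submit.py` is not shown in this diff, the function name `build_parser` is hypothetical, and the three default values below are placeholders rather than the script's real settings.

```python
import argparse

# Placeholder defaults (assumptions); the real values are defined in submit.py.
JUDGE_SERVER = "http://judge.example.invalid"
CONTEST = "YOUR_CONTEST_ID"
TICKET = "YOUR_TEAM_TICKET"


def build_parser():
    """Build a parser mirroring the usage line above (illustrative sketch only)."""
    parser = argparse.ArgumentParser(prog="submit.py")
    # Optional positional argument: falls back to result.jsonl when omitted
    parser.add_argument("result_path", nargs="?", default="result.jsonl",
                        help="path of the result file to submit")
    parser.add_argument("-s", "--server", default=JUDGE_SERVER, help="judge server URL")
    parser.add_argument("-c", "--contest", default=CONTEST, help="contest identifier")
    parser.add_argument("-k", "--ticket", default=TICKET, help="team identifier")
    return parser


# Override only the contest and the result path; server and ticket keep their defaults.
args = build_parser().parse_args(["-c", "1771009908746010681", "answers.jsonl"])
defaults = build_parser().parse_args([])
```

Any option left off the command line falls back to the corresponding module-level variable, which matches the behavior described in the list above.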

		## Programmatic submission

		You can also import the `submit` function into your own Python code to submit data programmatically.


		1. Import the function:
		Make sure the submission script is in your project directory or on your Python path. Import the `submit` function as follows:

		```python3
		from submit import submit
		```

		2. Call the `submit` function:
		Prepare your submission data as a list of dictionaries, each representing the answer to one question. Then call `submit`:

		```python3
		data = [
		    {'anomaly type': '', 'root cause': [{'location': '', 'reason': ''}], 'reasoning trace': [{'step': 0, 'action': '', 'observation': ''}], 'path length': 0},
		    # Add more items as needed
		]

		submission_id = submit(data, judge_server='http://judge.aiops-challenge.com', contest='YOUR_CONTEST_ID', ticket='YOUR_TEAM_TICKET')
		if submission_id:
		    print("Submission succeeded! Submission ID:", submission_id)
		else:
		    print("Submission failed")
		```

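		In this example, replace `YOUR_CONTEST_ID` with the ID of the contest you are entering and `YOUR_TEAM_TICKET` with your team ID.
		* The contest ID appears in the contest URL. For example, the URL for "Track 1 (Qwen1.5-14B): Retrieval-Augmented Operations Knowledge Q&A Challenge" is https://competition.aiops-challenge.com/home/competition/1771009908746010681 , so the contest ID is 1771009908746010681.
		* The team ID becomes available after you join the contest and form a team. Find it on the contest details page under Team -> Team ID; it is a numeric string.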
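
If you prefer not to copy the contest ID by hand, a small helper can take it from the competition URL. The sketch below assumes the ID is always the last path segment and consists only of digits, as in the example URL above; the function name `contest_id_from_url` is hypothetical and not part of `submit.py`.

```python
from urllib.parse import urlparse


def contest_id_from_url(url):
    """Return the last path segment of a competition URL (hypothetical helper)."""
    path = urlparse(url.strip()).path
    contest_id = path.rstrip("/").rsplit("/", 1)[-1]
    # Assumption: contest IDs are purely numeric, as in the example URL
    if not contest_id.isdigit():
        raise ValueError(f"no numeric contest ID found in {url!r}")
    return contest_id


contest = contest_id_from_url(
    "https://competition.aiops-challenge.com/home/competition/1771009908746010681"
)
```

The resulting string can be passed as the `contest` argument of `submit` or as the value of `-c` on the command line.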

client/answer.json

0 → 100644

+211 −0

Original line no.	Diff line no.	Diff line
		[
		{
		"anomaly type": "pod kill",
		"root cause": [
		{
		"location": "paymentservice-0",
		"reason": "Critical pod in the service chain with recent restarts"
		},
		{
		"location": "paymentservice-1",
		"reason": "Critical pod in the service chain with recent restarts"
		},
		{
		"location": "paymentservice-2",
		"reason": "Critical pod in the service chain with recent restarts"
		},
		{
		"location": "aiops-k8s-03",
		"reason": "Node hosting paymentservice pods"
		},
		{
		"location": "aiops-k8s-08",
		"reason": "Node hosting paymentservice pods"
		}
		],
		"reasoning trace": [
		{
		"step": 0,
		"action": "classifier Args: {'start_time': '2025-05-05 10:11:31', 'end_time': '2025-05-05 10:29:31'}",
		"observation": "Main failure type: pod kill"
		},
		{
		"step": 1,
		"action": "match_sop Args: {'query': 'SOP of pod kill', 'threshold': 0.55}",
		"observation": "Matched SOP 1: SOP of Pod Kill\nSimilarity Score: 1.0000\nSOP Content:\n1. Time series anomaly detection of all services in terms of metric request and response. 2. Time series anomaly detection of all pods in terms of metric pod_processes and pod_network_transmit_packets. 3. The result is combination of the previous steps.\nMatched SOP 2: SOP of Pod Failure\nSimilarity Score: 0.8032\nSOP Content:\n1. 2. The result is combination of the previous steps.\n"
		},
		{
		"step": 2,
		"action": "generate_and_run_sop_code Args: {'sop_str': '1. Time series anomaly detection of all services in terms of metric request and response. 2. Time series anomaly detection of all pods in terms of metric pod_processes and pod_network_transmit_packets. 3. The result is combination of the previous steps.', 'fault_info': 'Fault between 2025-05-05T10:11:31Z and 2025-05-05T10:29:31Z. Anomaly type: pod kill'}",
		"observation": "{'service_request_anomalies': '', 'service_response_anomalies': '', 'pod_processes_anomalies': '', 'pod_network_anomalies': '', 'anomaly_type': 'pod kill'}"
		},
		{
		"step": 3,
		"action": "time_series_anomaly_detection Args: {'start_time': '2025-05-05T10:11:31Z', 'end_time': '2025-05-05T10:29:31Z', 'metric_name': 'pod_processes', 'instance': 'all'}",
		"observation": ""
		},
		{
		"step": 4,
		"action": "time_series_anomaly_detection Args: {'start_time': '2025-05-05T10:11:31Z', 'end_time': '2025-05-05T10:29:31Z', 'metric_name': 'pod_network_transmit_packets', 'instance': 'all'}",
		"observation": ""
		},
		{
		"step": 5,
		"action": "trace_anomaly_detection Args: {'start_time': '2025-05-05T10:11:31Z', 'end_time': '2025-05-05T10:29:31Z'}",
		"observation": "No trace data found"
		},
		{
		"step": 6,
		"action": "log_anomaly_detection Args: {'start_time': '2025-05-05T10:11:31Z', 'end_time': '2025-05-05T10:29:31Z'}",
		"observation": "No error logs detected during the specified time period"
		},
		{
		"step": 7,
		"action": "JudgeRootCause Args: {'all_anomalous_information': 'Fault between 2025-05-05T10:11:31Z and 2025-05-05T10:29:31Z. Anomaly type: pod kill. No specific metrics or logs anomalies detected.'}",
		"observation": "Based on the pod kill fault and service call relationships, the most likely root cause is the failure of paymentservice pods (paymentservice-0, paymentservice-1, paymentservice-2) since they are critical in the chain: `checkoutservice -> paymentservice -> shippingservice -> emailservice`. No other services show systemic pod failures during the fault window. The paymentservice pods had recent restarts (19h ago), indicating potential instability. Since the anomaly propagated downstream (shippingservice, emailservice), the root cause is likely the paymentservice itself or its node dependencies (aiops-k8s-03, aiops-k8s-08, aiops-k8s-01)."
		}
		],
		"token count": 43914,
		"path length": 8
		},
		{
		"anomaly type": "pod failure",
		"root cause": [
		{
		"location": "adservice-0",
		"reason": "high restart count (8) during fault period"
		},
		{
		"location": "adservice-1",
		"reason": "high restart count (12) during fault period"
		},
		{
		"location": "adservice-2",
		"reason": "high restart count (5) during fault period"
		},
		{
		"location": "adservice",
		"reason": "service-level instability causing pod restarts"
		}
		],
		"reasoning trace": [
		{
		"step": 0,
		"action": "classifier Args: {'start_time': '2025-05-05 19:11:33', 'end_time': '2025-05-05 19:35:33'}",
		"observation": "Main failure type: pod cpu"
		},
		{
		"step": 1,
		"action": "match_sop Args: {'query': 'SOP of CPU issue', 'threshold': 0.55}",
		"observation": "Matched SOP 1: SOP of Node CPU\nSimilarity Score: 0.7777\nSOP Content:\n1. Time series anomaly detection: Detect the all nodes in terms of metric node_cpu_usage_rate. 2. The result is combination of the previous steps.\nMatched SOP 2: SOP of Jvm CPU\nSimilarity Score: 0.7378\nSOP Content:\n1. Time series anomaly detection: Detect all pods in terms of metric pod_cpu_usage. 2. Time series anomaly detection: Detect all nodes in terms of metric node_cpu_usage_rate. 3. The result is combination of the previous steps.\n"
		},
		{
		"step": 2,
		"action": "generate_and_run_sop_code Args: {'sop_str': '1. Time series anomaly detection: Detect all pods in terms of metric pod_cpu_usage. 2. Time series anomaly detection: Detect all nodes in terms of metric node_cpu_usage_rate. 3. The result is combination of the previous steps.', 'fault_info': 'Fault between 2025-05-05T11:11:33Z and 2025-05-05T11:35:33Z. Anomaly type: pod cpu'}",
		"observation": "{'pod_cpu_anomalies': '', 'node_cpu_anomalies': ''}"
		},
		{
		"step": 3,
		"action": "generate_and_run_sop_code Args: {'sop_str': '1. Time series anomaly detection: Detect the all nodes in terms of metric node_cpu_usage_rate. 2. The result is combination of the previous steps.', 'fault_info': 'Fault between 2025-05-05T11:11:33Z and 2025-05-05T11:35:33Z. Anomaly type: pod cpu'}",
		"observation": ""
		},
		{
		"step": 4,
		"action": "time_series_anomaly_detection Args: {'start_time': '2025-05-05T11:11:33Z', 'end_time': '2025-05-05T11:35:33Z', 'metric_name': 'pod_cpu_usage', 'instance': 'all'}",
		"observation": ""
		},
		{
		"step": 5,
		"action": "query_system_information Args: {'query': 'What is the correct metric name for pod CPU usage?'}",
		"observation": "\"pod_cpu_usage\""
		},
		{
		"step": 6,
		"action": "get_metric_values_offline Args: {'start_time': '2025-05-05T11:11:33Z', 'end_time': '2025-05-05T11:35:33Z', 'metric_name': 'pod_cpu_usage', 'instance': 'all'}",
		"observation": "Invalid instance name."
		},
		{
		"step": 7,
		"action": "query_system_information Args: {'query': 'What are the pod names in the system?'}",
		"observation": "['adservice-0', 'adservice-1', 'adservice-2', 'cartservice-0', 'cartservice-1', 'cartservice-2', 'checkoutservice-0', 'checkoutservice-1', 'checkoutservice-2', 'currencyservice-0', 'currencyservice-1', 'currencyservice-2', 'emailservice-0', 'emailservice-1', 'emailservice-2', 'example-ant-29107680-n6c8f', 'frontend-0', 'frontend-1', 'frontend-2', 'paymentservice-0', 'paymentservice-1', 'paymentservice-2', 'productcatalogservice-0', 'productcatalogservice-1', 'productcatalogservice-2', 'recommendationservice-0', 'recommendationservice-1', 'recommendationservice-2', 'redis-cart-0', 'shippingservice-0', 'shippingservice-1', 'shippingservice-2']"
		},
		{
		"step": 8,
		"action": "get_metric_values_offline Args: {'start_time': '2025-05-05T11:11:33Z', 'end_time': '2025-05-05T11:35:33Z', 'metric_name': 'pod_cpu_usage', 'instance': 'adservice-0'}",
		"observation": "[0.09 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01\n 0.02 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01]"
		},
		{
		"step": 9,
		"action": "get_metric_values_offline Args: {'start_time': '2025-05-05T11:11:33Z', 'end_time': '2025-05-05T11:35:33Z', 'metric_name': 'pod_cpu_usage', 'instance': 'adservice-1'}",
		"observation": "[0.06 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0.01 0. 0. 0. 0. 0. 0. 0. 0. 0. ]"
		},
		{
		"step": 10,
		"action": "get_metric_values_offline Args: {'start_time': '2025-05-05T11:11:33Z', 'end_time': '2025-05-05T11:35:33Z', 'metric_name': 'pod_cpu_usage', 'instance': 'adservice-2'}",
		"observation": "[0.04 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. ]"
		},
		{
		"step": 11,
		"action": "get_metric_values_offline Args: {'start_time': '2025-05-05T11:11:33Z', 'end_time': '2025-05-05T11:35:33Z', 'metric_name': 'pod_cpu_usage', 'instance': 'cartservice-0'}",
		"observation": "[0.02 0.01 0.01 0.02 0.01 0.01 0.01 0.01 0.01 0.02 0.01 0.01 0.01 0.01\n 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.02 0.01]"
		},
		{
		"step": 12,
		"action": "get_metric_values_offline Args: {'start_time': '2025-05-05T11:11:33Z', 'end_time': '2025-05-05T11:35:33Z', 'metric_name': 'pod_cpu_usage', 'instance': 'cartservice-1'}",
		"observation": "[0. 0. 0. 0. 0.01 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. ]"
		},
		{
		"step": 13,
		"action": "get_metric_values_offline Args: {'start_time': '2025-05-05T11:11:33Z', 'end_time': '2025-05-05T11:35:33Z', 'metric_name': 'pod_cpu_usage', 'instance': 'cartservice-2'}",
		"observation": "[0.02 0.02 0.02 0.03 0.02 0.03 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.03\n 0.03 0.02 0.02 0.02 0.02 0.02 0.02 0.03 0.02 0.02]"
		},
		{
		"step": 14,
		"action": "trace_anomaly_detection Args: {'start_time': '2025-05-05T11:11:33Z', 'end_time': '2025-05-05T11:35:33Z'}",
		"observation": "No anomalies detected during the specified time period"
		},
		{
		"step": 15,
		"action": "log_anomaly_detection Args: {'start_time': '2025-05-05T11:11:33Z', 'end_time': '2025-05-05T11:35:33Z'}",
		"observation": "No logs found for the specified time period"
		},
		{
		"step": 16,
		"action": "classifier Args: {'start_time': '2025-05-05 19:11:33', 'end_time': '2025-05-05 19:35:33'}",
		"observation": "Main failure type: pod failure"
		},
		{
		"step": 17,
		"action": "match_sop Args: {'query': 'SOP of pod failure', 'threshold': 0.55}",
		"observation": "Matched SOP 1: SOP of Pod Failure\nSimilarity Score: 1.0000\nSOP Content:\n1. 2. The result is combination of the previous steps.\nMatched SOP 2: SOP of Pod Failure - Pod Level\nSimilarity Score: 0.9328\nSOP Content:\n1. Time series anomaly detection: Detect the pods of anomalous service in terms of metric error_ratio and metric error_ratio. Each service has three pods, for example, adservice's pods are adservice-0,adservice-1,adservice-2. 2. The result is combination of the previous steps.\n"
		},
		{
		"step": 18,
		"action": "generate_and_run_sop_code Args: {'sop_str': \"1. Time series anomaly detection: Detect the pods of anomalous service in terms of metric error_ratio and metric error_ratio. Each service has three pods, for example, adservice's pods are adservice-0,adservice-1,adservice-2. 2. The result is combination of the previous steps.\", 'fault_info': 'Fault between 2025-05-05T11:11:33Z and 2025-05-05T11:35:33Z. Anomaly type: pod failure'}",
		"observation": "Error executing code: no types given"
		},
		{
		"step": 19,
		"action": "query_system_information Args: {'query': 'What are the correct metric names for pod error ratios?'}",
		"observation": "[\"client_error_ratio\", \"error_ratio\", \"server_error_ratio\"]"
		},
		{
		"step": 20,
		"action": "generate_and_run_sop_code Args: {'sop_str': \"1. Time series anomaly detection: Detect the pods of anomalous service in terms of metric client_error_ratio and metric server_error_ratio. Each service has three pods, for example, adservice's pods are adservice-0,adservice-1,adservice-2. 2. The result is combination of the previous steps.\", 'fault_info': 'Fault between 2025-05-05T11:11:33Z and 2025-05-05T11:35:33Z. Anomaly type: pod failure'}",
		"observation": "{'fault_time_range': '2025-05-05T11:11:33Z to 2025-05-05T11:35:33Z', 'anomaly_type': 'pod failure', 'affected_pods': {}}"
		},
		{
		"step": 21,
		"action": "get_logs_offline Args: {'start_time': '2025-05-05T11:11:33Z', 'end_time': '2025-05-05T11:35:33Z'}",
		"observation": "Empty DataFrame\nColumns: []\nIndex: []"
		},
		{
		"step": 22,
		"action": "JudgeRootCause Args: {'all_anomalous_information': 'Fault occurred between 2025-05-05T11:11:33Z and 2025-05-05T11:35:33Z. Classifier initially indicated pod cpu issue, then pod failure. No anomalies found in metrics, traces, or logs. All pod CPU usage metrics appear normal.'}",
		"observation": "Based on the information:\n\n1. No CPU/metric anomalies found, suggesting the issue isn't resource-related.\n2. The adservice pods show higher restart counts (8,12,5) compared to others (<4), indicating instability.\n3. These restarts occurred recently (4m-63m ago), overlapping with the fault window.\n4. Since all adservice pods are affected, the root cause likely lies within the adservice itself rather than specific pods or nodes.\n\nConclusion: The root cause is the adservice, as its instability (frequent pod restarts) could disrupt frontend operations without leaving metric traces. The issue propagates as frontend depends on adservice."
		}
		],
		"token count": 147141,
		"path length": 23
		}
		]
		No newline at end of file
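
In both sample entries above, `path length` equals the number of items in `reasoning trace`, and the `step` values run from 0 upward without gaps. The following sketch checks an entry against those patterns before submission. The rules are inferred from this sample only and are not confirmed judge requirements; the function name `check_answer` is hypothetical.

```python
def check_answer(answer):
    """Return a list of problems found in one answer dict (empty list means none found)."""
    problems = []
    for key in ("anomaly type", "root cause", "reasoning trace", "path length"):
        if key not in answer:
            problems.append(f"missing key: {key}")
    trace = answer.get("reasoning trace", [])
    steps = [item.get("step") for item in trace]
    # Pattern seen in the sample: steps are numbered 0, 1, 2, ... in order
    if steps != list(range(len(trace))):
        problems.append("steps are not numbered 0..n-1 in order")
    # Pattern seen in the sample: path length equals the number of trace steps
    if "path length" in answer and answer["path length"] != len(trace):
        problems.append("path length does not match number of trace steps")
    for cause in answer.get("root cause", []):
        if not {"location", "reason"} <= set(cause):
            problems.append("root cause entry lacks location or reason")
    return problems


good = {
    "anomaly type": "pod kill",
    "root cause": [{"location": "paymentservice-0", "reason": "recent restarts"}],
    "reasoning trace": [
        {"step": 0, "action": "classifier", "observation": "Main failure type: pod kill"},
        {"step": 1, "action": "match_sop", "observation": ""},
    ],
    "path length": 2,
}
bad = dict(good, **{"path length": 5})
good_problems = check_answer(good)
bad_problems = check_answer(bad)
```

Returning a list of messages rather than raising on the first problem makes it easier to report every issue in a large answer file at once.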

client/submit.py

0 → 100644

+133 −0

File added.

The preview exceeds the size limit; changes are collapsed.

server/README.md

0 → 100644

+182 −0

File added.

The preview exceeds the size limit; changes are collapsed.