add link to KDDCup99 notebook and fix typo (2e2df284) · Commit · AIOps-NanKai / model / DAGMM

README.md

+11 −6

		@@ -65,8 +65,13 @@ model.restore("./fitted_model")
		```

		## Jupyter Notebook Example
		-You can use [jupyter notebook example](Example_DAGMM.ipynb).
		+The following Jupyter notebook examples use the DAGMM model.
		+- [Simple DAGMM Example notebook](Example_DAGMM.ipynb) :
		+This example uses random samples from a mixture of Gaussians.
		+If you want to learn the basic usage, this notebook is recommended.
		+- [KDDCup99 10% Data Evaluation](KDDCup99.ipynb) :
		+Performance evaluation of anomaly detection on the KDDCup99 10% data
		+under the same conditions as the original paper (requires pandas).

		# Notes
		## GMM Implementation
		@@ -86,9 +91,9 @@ elements of covariance matrix for more numerical stability
		(this is the same as the TensorFlow GMM implementation,
		and [the author of another DAGMM implementation](https://github.com/danieltan07/dagmm) also points it out)

		-## Parameter of GMM Covariance ($\lambda_2$)
		-Default value of $\lambda_2$ is set to 0.0001 (0.005 in original paper).
		-When $\lambda_2$ is 0.005, covariances of GMM becomes too large to detect
		-anomaly points. But perhaps it depends on data and preprocessing
		-(for example a method of normalization). Recommended control $\lambda_2$
		+## Parameter of GMM Covariance (lambda_2)
		+The default value of lambda_2 is set to 0.0001 (0.005 in the original paper).
		+When lambda_2 is 0.005, the covariances of the GMM become too large to detect
		+anomalous points. However, this may depend on the data distribution and the preprocessing method
		+(for example, the normalization method). We recommend tuning lambda_2
		when the performance metrics are not good.
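
The lambda_2 note in this hunk can be illustrated with a minimal NumPy sketch. It assumes the covariance penalty from the original DAGMM paper, P(Sigma) = sum over components k and dimensions j of 1 / Sigma_k[j, j], which enters the loss as lambda_2 * P(Sigma). The function name `covariance_diag_penalty` and the toy covariance matrices are hypothetical and are not taken from this repository.

```python
import numpy as np


def covariance_diag_penalty(covariances):
    """Return sum_k sum_j 1 / Sigma_k[j, j] for covariances of shape (K, D, D)."""
    # Extract the diagonal of every component's covariance matrix: shape (K, D).
    diagonals = np.diagonal(covariances, axis1=1, axis2=2)
    return float(np.sum(1.0 / diagonals))


# Hypothetical toy covariances: K=2 components in D=2 dimensions.
tight = np.stack([np.eye(2) * 0.01, np.eye(2) * 0.02])
wide = np.stack([np.eye(2) * 1.0, np.eye(2) * 2.0])

# Compare the weighted penalty for the repository default (0.0001)
# and the value quoted from the original paper (0.005).
for lambda_2 in (0.0001, 0.005):
    for name, cov in (("tight", tight), ("wide", wide)):
        term = lambda_2 * covariance_diag_penalty(cov)
        print(f"lambda_2={lambda_2}: {name} covariances add {term:.6f} to the loss")
```

Because the penalty shrinks as the diagonal entries grow, a larger lambda_2 gives the optimizer a stronger incentive to inflate the covariances, which is consistent with the behavior the README reports for 0.005.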

README_ja.md

+9 −4

		@@ -66,8 +66,13 @@ model.restore("./fitted_model")
		```

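		## Jupyter Notebook Samples
		-We have prepared a [runnable sample](./Example_DAGMM_ja.ipynb) as a Jupyter notebook.
		+We have prepared the following Jupyter notebook samples.
		+- [DAGMM usage example](Example_DAGMM_ja.ipynb) :
		+This sample shows the result of applying DAGMM to a Gaussian mixture distribution.
		+If you want to learn the usage quickly, start with this sample.
		+- [Anomaly detection evaluation on KDDCup99 10% data](KDDCup99_ja.ipynb) :
		+This sample runs anomaly detection on the KDDCup99 10% data under the same conditions as the paper
		+and evaluates its accuracy (requires pandas).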

		# Notes
		## About the Gaussian Mixture Model (GMM) Implementation
		@@ -87,11 +92,11 @@ We have prepared a [runnable sample](./Example_DAGMM_ja.ipynb) as a Jupyter notebook
		[the author of another DAGMM implementation](https://github.com/danieltan07/dagmm) also
		mentions the same issue)

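		-## About the Covariance Parameter $\lambda_2$
		-The default value of $\lambda_2$, the parameter that controls the diagonal elements of the covariance,
		+## About the Covariance Parameter λ2
		+The default value of λ2, the parameter that controls the diagonal elements of the covariance,
		is set to 0.0001 (the paper recommends 0.005).
		This is because, with 0.005, the covariances became too large and large clusters
		tended to be selected. However, this probably also depends on the characteristics of the data and
		on the preprocessing procedure (for example, the method used to normalize the data).
		-If the intended accuracy is not obtained, adjusting the value of $\lambda_2$
		+If the intended accuracy is not obtained, adjusting the value of λ2
		is recommended.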