Commit 71a59620 authored Jun 09, 2018 by Toshihiro Nakae

update parameter changes

Parent 43eaaad5

README.md

+8 −1

Old line number	New line number	Diff line
		@@ -81,7 +81,14 @@ Instead, this implementation uses the Cholesky decomposition of the covariance matrix.
		In ``DAGMM.fit()``, the triangular matrix from the Cholesky decomposition of the
		covariance matrix is computed and stored, and it is then used in ``DAGMM.predict()``.

		-In addition, a small perturbation (1e-3) is added to the diagonal
		+In addition, a small perturbation (1e-6) is added to the diagonal
		elements of the covariance matrix for better numerical stability
		(the TensorFlow GMM implementation does the same,
		and [the author of another DAGMM implementation](https://github.com/danieltan07/dagmm) also points this out).

		+## Parameter of GMM Covariance ($\lambda_2$)
		+The default value of $\lambda_2$ is set to 0.0001 (0.005 in the original paper).
		+When $\lambda_2$ is 0.005, the GMM covariances become too large to detect
		+anomalous points. This may, however, depend on the data and preprocessing
		+(for example, the normalization method). Adjusting $\lambda_2$ is recommended
		+when the performance metrics are poor.
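
Note on the hunk above: the Cholesky step with the diagonal perturbation can be sketched in plain NumPy as follows. This is a minimal illustration under stated assumptions, not the repository's TensorFlow code. The helper names `cholesky_with_jitter` and `mahalanobis_sq` are hypothetical; only the perturbation value 1e-6 comes from the diff.

```python
import numpy as np


def cholesky_with_jitter(cov, eps=1e-6):
    """Return lower-triangular L such that L @ L.T == cov + eps * I.

    Hypothetical helper: eps mirrors the 1e-6 perturbation from the README.
    """
    cov = np.asarray(cov, dtype=np.float64)
    # Adding eps to the diagonal keeps every pivot strictly positive
    # when cov is positive semi-definite.
    return np.linalg.cholesky(cov + eps * np.eye(cov.shape[0]))


def mahalanobis_sq(z, mu, L):
    """Squared Mahalanobis distance computed from the stored factor L."""
    # Solve L y = (z - mu) instead of inverting the covariance matrix.
    y = np.linalg.solve(L, np.asarray(z) - np.asarray(mu))
    return float(y @ y)


# A singular covariance matrix: a plain Cholesky factorization may fail here,
# while the perturbed matrix is positive definite.
singular_cov = np.ones((2, 2))
L = cholesky_with_jitter(singular_cov, eps=1e-6)
```

Computing the factor once (as in ``DAGMM.fit()``) and reusing it for every distance evaluation (as in ``DAGMM.predict()``) avoids an explicit matrix inverse at prediction time.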

README_ja.md

+10 −1

Old line number	New line number	Diff line
		@@ -81,8 +81,17 @@ An [example run](./Example_DAGMM_ja.ipynb) is provided as a Jupyter notebook
		In ``DAGMM.fit()``, the Cholesky decomposition of the covariance matrix is computed, and the
		resulting triangular matrix is used in ``DAGMM.predict()``.

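		-In addition, a small value (1e-3) is added beforehand to the diagonal elements of the covariance matrix
		+In addition, a small value (1e-6) is added beforehand to the diagonal elements of the covariance matrix
		so that the Cholesky decomposition can be computed stably.
		(The TensorFlow GMM has similar logic, and
		[the author of another DAGMM implementation](https://github.com/danieltan07/dagmm)
		mentions the same issue.)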

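		+## About the covariance parameter $\lambda_2$
		+The default value of $\lambda_2$, the parameter that controls the diagonal elements of the covariance,
		+is set to 0.0001 (the paper recommends 0.005).
		+This is because, with 0.005, the covariances became too large and large clusters
		+tended to be selected. However, this probably also depends on the characteristics of the data and on
		+the preprocessing procedure (for example, how the data is normalized).
		+If the intended accuracy is not obtained, adjusting the value of $\lambda_2$
		+is recommended.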
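
Note on the $\lambda_2$ section above: the sketch below shows how $\lambda_2$ could weight a covariance penalty in the training objective. It assumes the penalty form from the DAGMM paper (the sum of reciprocals of the diagonal covariance entries over all mixture components); the function names, the `lambda1` argument, and the toy values are hypothetical and are not taken from this repository.

```python
import numpy as np


def covariance_penalty(covs):
    """Sum of 1 / diagonal entry over all K components (assumed paper form)."""
    covs = np.asarray(covs, dtype=np.float64)    # shape (K, D, D)
    diag = np.diagonal(covs, axis1=1, axis2=2)   # shape (K, D)
    return float(np.sum(1.0 / diag))


def total_loss(recon_error, energy, covs, lambda1, lambda2=0.0001):
    """Hypothetical objective: reconstruction + energy term + covariance term.

    lambda2 defaults to 0.0001, the default stated in the README diff.
    """
    return recon_error + lambda1 * energy + lambda2 * covariance_penalty(covs)


# Toy example with K=2 components in D=2 dimensions.
covs = np.array([np.diag([1.0, 2.0]), np.diag([4.0, 0.5])])
penalty = covariance_penalty(covs)

# The penalty term grows linearly with lambda2, so a larger lambda2 rewards
# larger diagonal entries more strongly during training.
loss_default = total_loss(1.0, 2.0, covs, lambda1=0.1, lambda2=0.0001)
loss_paper = total_loss(1.0, 2.0, covs, lambda1=0.1, lambda2=0.005)
```

Because the penalty shrinks as the diagonal entries grow, a larger $\lambda_2$ pushes the optimizer toward wider covariances, which matches the behavior the README describes for 0.005.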