[add] GTree based Decision Tree #367

Merged
merged 9 commits into from
Oct 24, 2023
ElleryQu
Contributor

Pull Request

What problem does this PR solve?

Issue Number: Fixed #212

Possible side effects?

  • Performance:

test:

No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
[2023-10-13 18:32:00.502] [info] [thread_pool.cc:30] Create a fixed thread pool with size 19
Accuracy in SKlearn: 0.96
Accuracy in SPU: 0.96
.
----------------------------------------------------------------------
Ran 1 test in 78.026s

OK

emulation:

Running time in SPU: 77.37s
Accuracy in SKlearn: 0.96
Accuracy in SPU: 0.96
[2023-10-13 18:35:38,091] Shutdown multiprocess cluster...

  • Backward compatibility:

@github-actions

github-actions bot commented Oct 13, 2023

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

@ElleryQu
Contributor Author

I have read the CLA Document and I hereby sign the CLA

sml/tree/tree.py (outdated, resolved)
sml/tree/tree.py (outdated, resolved)
sml/tree/emulations/tree_emul.py (resolved)
sml/tree/tests/tree_test.py (resolved)
sml/tree/tree.py (resolved)
sml/tree/tree.py (outdated, resolved)
sml/tree/tree.py (outdated, resolved)
@deadlywing
Contributor

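@ElleryQu Sorry, I have been busy lately, so this review was delayed for a while.
The overall implementation looks OK; I have a few minor suggestions.

Also, please fix the code formatting before your next push:

  1. Format Bazel files with buildifier
  2. Format Python files with black and isort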

@ElleryQu
Contributor Author

Sure, thanks.

@ElleryQu
Contributor Author

Hello, the issues raised above have been addressed.

  1. For the feature explosion issue: changed the data to X[:, ::3], y[:]
  2. Added a descriptive passage to the DecisionTreeBuilder class. The original paper gives no complexity analysis, so a rough estimate is included here.
  3. Formatted the code according to the previous review results, and fixed other miscellaneous items.
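
For readers unfamiliar with the slice in item 1, here is a minimal sketch of what X[:, ::3], y[:] selects. It assumes an Iris-shaped feature matrix (150 samples, 4 features); the synthetic placeholder data and the names X_sub and y_sub are illustrative and are not taken from the PR.

```python
import numpy as np

# Placeholder standing in for an Iris-shaped dataset (150 samples, 4 features).
# Assumption: the values are synthetic; the real test loads its own data.
rng = np.random.default_rng(0)
X = rng.random((150, 4))
y = rng.integers(0, 3, size=150)

# A column step of 3 keeps every third feature: columns 0 and 3 of the 4.
X_sub = X[:, ::3]
# y[:] keeps every label unchanged.
y_sub = y[:]

print(X_sub.shape)
print(y_sub.shape)
```

With four input features, the step of 3 leaves two feature columns and all samples, so any per-feature expansion downstream has fewer columns to work with.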

sml/tree/tests/tree_test.py (outdated, resolved)
sml/tree/tree.py (outdated, resolved)
sml/tree/tree.py (outdated, resolved)
sml/tree/emulations/tree_emul.py (outdated, resolved)
@deadlywing
Contributor

The Python formatting still has some issues.

@deadlywing
Contributor

The +1-1 in the test file has not been changed yet.

@ElleryQu
Contributor Author

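Hello, the +1-1 is there to ensure that the result of jax.equal is an additive (arithmetic) share on SPU. If it is changed to +0, the interpreter optimizes it away.

Without the +1-1, the result of jax.equal is a boolean share, and an error occurs on assignment. The review shows a minimal failing example. This error reduces classification accuracy on iris_data to 66%.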
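
The dtype side of this argument can be sketched in plain JAX. This is only an illustration on ordinary arrays: it does not run on SPU, and whether the dtype difference corresponds to a boolean share versus an arithmetic share there is the commenter's claim, not something this sketch verifies. The array values and variable names are made up for the example.

```python
import jax.numpy as jnp

a = jnp.array([1, 2, 3])
b = jnp.array([1, 0, 3])

# jnp.equal returns an array with a boolean dtype.
mask_bool = jnp.equal(a, b)

# Adding and then subtracting 1 promotes the result to an integer dtype
# while leaving the 0/1 values themselves unchanged.
mask_int = jnp.equal(a, b) + 1 - 1

print(mask_bool.dtype)
print(mask_int.dtype)
print(mask_int)
```

An explicit cast such as mask_bool.astype(jnp.int32) yields the same values in plain JAX; how SPU treats either form is outside what this sketch shows.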

@deadlywing
Contributor

Hi, I tried this locally and did not see any drop in accuracy. Could you post the exact run settings and results?

@ElleryQu
Contributor Author

I could not reproduce it, but I did run into this problem before ( ̄  ̄|| )

Removed the +1-1.

Contributor

@deadlywing deadlywing left a comment

LGTM

@deadlywing deadlywing merged commit 7e51ddb into secretflow:main Oct 24, 2023
6 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Oct 24, 2023
Development

Successfully merging this pull request may close these issues.

Implement basic decision tree model functionality using SPU
2 participants