Why is the German word "stark" always recognized as an ADV without a sentiment value? #8

Open
Diapolo opened this issue May 13, 2024 · 2 comments

Comments

@Diapolo

Diapolo commented May 13, 2024

Hey @Liebeck,

Thanks for this extension. I'm currently digging into spaCy for a university project on sentiment analysis. I've just taken my first steps and have a working spaCy installation using 'de_core_news_lg' as the model and 'spacy_sentiws' as an extension.

See this sample:
"Ich bin stark. Die Digitalisierung begegnet uns überall – und hat die Art, wie wir arbeiten und leben, stark verändert."

Ich, None, PRON
bin, None, AUX
stark, None, ADV
., None, PUNCT
Die, None, DET
Digitalisierung, None, NOUN
begegnet, None, VERB
uns, None, PRON
überall, None, ADV
–, None, PUNCT
und, None, CCONJ
hat, None, VERB
die, None, DET
Art, None, NOUN
,, None, PUNCT
wie, None, SCONJ
wir, None, PRON
arbeiten, None, VERB
und, None, CCONJ
leben, None, VERB
,, None, PUNCT
stark, None, ADV
verändert, None, VERB
., None, PUNCT

Why is "stark" always recognized as an adverb, and why doesn't it get a SentiWS value at all? If I look into the SentiWS data, it should get a value of 0.0040 (stark|ADJX 0.0040).

Thanks,
Dia

@Liebeck
Owner

Liebeck commented May 13, 2024

@Diapolo POS tagging is done through spaCy, not through my SentiWS wrapper. Therefore, "stark" with the POS tag ADV does not have any entry in SentiWS. Have a look at the implementation https://github.com/Liebeck/spacy-sentiws/blob/master/spacy_sentiws/senti_ws_wrapper.py#L25
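
To illustrate why the lookup comes back empty, here is a minimal sketch. It assumes the lexicon is keyed on the word together with its POS tag, and that the SentiWS tag ADJX corresponds to spaCy's ADJ. The names `SENTIWS_TO_SPACY`, `build_lexicon`, and `lookup` are hypothetical and are not taken from spacy_sentiws; the only data used is the `stark|ADJX 0.0040` entry quoted in the question.

```python
from typing import Dict, Iterable, Optional, Tuple

# Assumed mapping from SentiWS POS tags to spaCy's coarse POS tags.
# Illustrative only; the real wrapper may map tags differently.
SENTIWS_TO_SPACY = {"NN": "NOUN", "VVINF": "VERB", "ADJX": "ADJ", "ADV": "ADV"}

Lexicon = Dict[Tuple[str, str], float]


def build_lexicon(lines: Iterable[str]) -> Lexicon:
    """Parse entries of the form 'word|POS weight' into a (word, POS)-keyed dict."""
    lexicon: Lexicon = {}
    for line in lines:
        head, weight = line.split()[:2]
        word, sentiws_pos = head.split("|")
        lexicon[(word, SENTIWS_TO_SPACY[sentiws_pos])] = float(weight)
    return lexicon


def lookup(lexicon: Lexicon, word: str, spacy_pos: str) -> Optional[float]:
    """Return the sentiment weight for (word, POS), or None if there is no entry."""
    return lexicon.get((word, spacy_pos))


lexicon = build_lexicon(["stark|ADJX 0.0040"])

print(lookup(lexicon, "stark", "ADV"))  # None: no entry for the tag spaCy assigned
print(lookup(lexicon, "stark", "ADJ"))  # 0.004: the entry exists only under ADJ
```

Under these assumptions, the value is only found when the tagger labels "stark" as ADJ, which matches the `None` values in the output above.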

@Diapolo
Author

Diapolo commented May 15, 2024

Thanks, I also had a look at that file. So it seems that spaCy or its German training data isn't accurate here? Would you call this a "bug"?
