-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
End to end testing of various posts, sites, and media #655
base: beta-master
Are you sure you want to change the base?
Conversation
Test for https://mstdn.social/@BBC/105203076554056414 needs PR #187 to be merged in before it will pass
…ugin into jmafoster1/136-e2e-testing
|
||
const MediaType = { | ||
video: "video", | ||
image: "image", | ||
none: "none", | ||
}; | ||
|
||
const LinkResult = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These icons will be removed in PR #640 - for the future, the contents of sourceTypes
could be used instead?
verification-plugin/src/components/NavItems/Assistant/AssistantScrapeResults/AssistantSCResults.jsx
Line 48 in 609f971
const sourceTypes = { |
@@ -265,6 +265,7 @@ const AssistantCredSignals = () => { | |||
onChange={handleChange(newsFramingTitle)} | |||
disabled={newsFramingLoading || newsFramingFail} | |||
disableGutters | |||
data-testid="topic-accordion" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These will be changing from Accordion to CustomTabPanel for genre, topic, persuasion and subjectivity.
verification-plugin/src/components/NavItems/Assistant/AssistantScrapeResults/AssistantTextResult.jsx
Line 244 in deeec8e
<CustomTabPanel value={textTabIndex} index={0}> |
tests/e2e/assistant.spec.js
Outdated
"Questioning the reputation", | ||
], | ||
// TODO: Maybe try to test _which_ sentences are subjective? | ||
subjectivity: ["Subjective sentences detected (4/60)"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will change to remove the number of sentences in translations v0.84
https://github.com/AFP-Medialab/InVID-Translations/blob/df4c03af3ec6f5d27511ce3e7c91302a25c7ed8f/components/NavItems/tools/Assistant.tsv#L180
services: [MediaServices.metadata, MediaServices.magnifier, MediaServices.forensic, MediaServices.ocr], | ||
namedEntities: { | ||
// There should definitely be some named entities in here, e.g. Andreje Babiše, but maybe the service doesn't | ||
// work in Czech? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question - @ianroberts does the named entity only work well in English or can it work for other languages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current named entity recogniser is a combination of our GATE-based "TwitIE" for English with the standard named entity pipelines from spaCy for German, Greek, Spanish, French, Italian and Portuguese.
tests/e2e/assistant.spec.js
Outdated
"Security, Defense and Well-being", | ||
"Health and Safety" | ||
], | ||
genre: ["Satire"], // I don't think so (Seems to be the default when it doesn't know?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ianroberts The genre classifier has three classes: Opinionated News, Objective Reporting and Satire. Does it always select one? Or is there ever "No genre detected" for example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LesyaR can you answer this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ianroberts @rosannamilner Thank you for the question. The model is trained in such a way that there is always one class (with the highest probability) returned. This corresponds to how the dataset the model was trained on is annotated - there is exactly one gold-standard class per input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @LesyaR. @jmafoster1 There isn't something to change here so this ready to go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work getting all this done! There will be things that will change, so we'll have to see how cumbersome this becomes I think
…ugin into jmafoster1/136-e2e-testing
…ugin into jmafoster1/136-e2e-testing
Hi @jmafoster1 are we waiting for @ianroberts approval? I can merge this if it is ready |
I've just pinged @LesyaR again about that last question on the genre classifier, but I'm happy for you to go ahead as I guess that's not a blocker. |
Ah sorry, I had to make the tests pass again for the latest PRs, but I did that now, so yes, please feel free to go ahead and merge. I was also wondering whether we should make it a github action to run these tests before PRs get merged. What do you think? |
Yes @jmafoster1, it is a good idea to add it in a GitHub action. You can go ahead and add it |
Great, I'll try and get that done tomorrow and we can put it in as part of this PR. |
b808bcc
to
2c1c1f5
Compare
Quick update on this: The tests seem to just time out, get cancelled, and provide no further information. I stripped it down to just one test as I thought maybe there were too many for a single worker to run in an hour (my local machine runs 10 at once), but it's still doing the same. It did occur to me that perhaps the tests were struggling to connect to the backend, as I have to have this running to run my tests locally. Is there a good way we can connect our frontend tests to some version of the backend or is it better to mock it? Personally, I'd be against mocking, but it would make the tests more self-contained, speed them up, and help to reduce flakiness. |
@jmafoster1 I will discuss this with @ttramb when he is back (next week), there may be a way to connect it to the back-end. |
@jmafoster1 could you elaborate why you are against mocking? We discussed mocking the APIs with Bertrand today and we think mocking is a good idea going forward. We can use mswjs for that. I have created an issue #678 to track adding mocking. Partners do not always provide us with all the responses their APIs can return and it will be useful to have a place to mock these API responses. + Their servers also experience downtime more often than we think. However, it will require a good amount of work to mock everything correctly. |
I'm not keen on mocking here because then they're not true end to end tests of the whole system. I was sort of hoping that these would serve as integration tests between the backend and the frontend. For example, if something changed in the backed with how data was passed to the front end, the tests wouldn't pick it up. I appreciate that proper integration tests will be extremely difficult to do as a github action though, and is likely to be flaky. Perhaps if we could pass some option to turn mocking on or off, then we could run tests with mocking as a github action and periodically run them locally without mocking to make sure nothing has broken? Another possible (and complementary) option would be to have a separate repository which just contains a mapping of URLs and minimal scrape results. We could make that a subrepo of both the backend and the frontend and assert that the backend returns objects with at least the minimal scrape results (i.e. the expected datatype containing at least certain images, links, videos, etc. for each article with the possibility of extras) and then mock the backend to return those minimal scraped objects to test the frontend. By using the same test data in this way, we effectively mock the integration so that we can still pick up problems in both the backend and the frontend, although anything that involves going off and scraping third-party sites will of course probably be a bit flaky. Perhaps I am overthinking this, though, and it's not necessary. (I have come from a software testing research background, so am compelled to test everything as thoroughly as possible!) |
Issue 136
Additional playwright end to end tests for the links from the test document. I'm afraid I haven't been with the project long enough to know correct behaviour, so this will need checking. All the tests have passed for me, but can be a little flakey (e.g. if one of the third party services times out or fails). I think it would be good to try to make this a github action, but only if we can make it less flakey!