End to end testing of various posts, sites, and media #655

Open: wants to merge 19 commits into beta-master from jmafoster1/136-e2e-testing

Conversation

@jmafoster1 (Collaborator) commented Nov 14, 2024

Issue 136

Additional Playwright end-to-end tests for the links from the test document. I'm afraid I haven't been with the project long enough to know the correct behaviour, so this will need checking. All the tests have passed for me, but they can be a little flaky (e.g. if one of the third-party services times out or fails). I think it would be good to turn this into a GitHub Action, but only if we can make it less flaky!
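Flakiness can be dampened somewhat with retries, generous timeouts, and limited parallelism. A minimal sketch of Playwright-style settings; the values are illustrative, not the project's actual configuration:

```javascript
// Illustrative Playwright-style settings (example values only, not the
// project's real config). Retries paper over transient third-party
// failures; a single CI worker avoids hammering the backend.
const config = {
  retries: 2,                        // re-run a failing test up to twice
  timeout: 60000,                    // per-test timeout in milliseconds
  workers: process.env.CI ? 1 : 10,  // serial on CI, parallel locally
};

module.exports = config;
```

In a real setup this would live in `playwright.config.js`, typically wrapped in `defineConfig` from `@playwright/test`.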


const MediaType = {
video: "video",
image: "image",
none: "none",
};

const LinkResult = {
Collaborator:
These icons will be removed in PR #640 - for the future, the contents of sourceTypes could be used instead?

@@ -265,6 +265,7 @@ const AssistantCredSignals = () => {
onChange={handleChange(newsFramingTitle)}
disabled={newsFramingLoading || newsFramingFail}
disableGutters
data-testid="topic-accordion"
Collaborator:
These will be changing from Accordion to CustomTabPanel for genre, topic, persuasion and subjectivity.

"Questioning the reputation",
],
// TODO: Maybe try to test _which_ sentences are subjective?
subjectivity: ["Subjective sentences detected (4/60)"]
services: [MediaServices.metadata, MediaServices.magnifier, MediaServices.forensic, MediaServices.ocr],
namedEntities: {
// There should definitely be some named entities in here, e.g. Andreje Babiše, but maybe the service doesn't
// work in Czech?

@rosannamilner (Collaborator) commented Nov 18, 2024:

Good question - @ianroberts, does the named entity service only work well in English, or can it work for other languages?

@ianroberts (Contributor) replied:
The current named entity recogniser is a combination of our GATE-based "TwitIE" for English with the standard named entity pipelines from spaCy for German, Greek, Spanish, French, Italian and Portuguese.

"Security, Defense and Well-being",
"Health and Safety"
],
genre: ["Satire"], // I don't think so (Seems to be the default when it doesn't know?)
@rosannamilner (Collaborator) commented Nov 18, 2024:
@ianroberts The genre classifier has three classes: Opinionated News, Objective Reporting and Satire. Does it always select one, or is there ever a "No genre detected", for example?

@ianroberts (Contributor) replied:
@LesyaR can you answer this one?

@LesyaR replied:
@ianroberts @rosannamilner Thank you for the question. The model is trained in such a way that there is always one class (with the highest probability) returned. This corresponds to how the dataset the model was trained on is annotated - there is exactly one gold-standard class per input.

@rosannamilner (Collaborator) replied:
Thanks @LesyaR. @jmafoster1 There isn't anything to change here, so this is ready to go.

@rosannamilner (Collaborator) left a review:
Great work getting all this done! There will be things that change, so we'll have to see how cumbersome this becomes.

@Sallaa (Contributor) commented Nov 27, 2024:

Hi @jmafoster1, are we waiting for @ianroberts' approval? I can merge this if it is ready.

@ianroberts (Contributor) commented:

I've just pinged @LesyaR again about that last question on the genre classifier, but I'm happy for you to go ahead as I guess that's not a blocker.

@jmafoster1 (Collaborator, Author) commented:

Ah sorry, I had to make the tests pass again for the latest PRs, but I've done that now, so yes, please feel free to go ahead and merge. I was also wondering whether we should add a GitHub Action to run these tests before PRs get merged. What do you think?

@Sallaa (Contributor) commented Nov 27, 2024:

Yes @jmafoster1, it is a good idea to add it as a GitHub Action. You can go ahead and add it.

@jmafoster1 (Collaborator, Author) commented:

Great, I'll try and get that done tomorrow and we can put it in as part of this PR.

@jmafoster1 force-pushed the jmafoster1/136-e2e-testing branch from b808bcc to 2c1c1f5 on November 27, 2024 at 14:15.
@jmafoster1 (Collaborator, Author) commented:

Quick update on this: the tests seem to just time out, get cancelled, and provide no further information. I stripped it down to a single test, thinking maybe there were too many for one worker to run in an hour (my local machine runs 10 at once), but it still does the same. It did occur to me that the tests may be struggling to connect to the backend, since I need that running to run the tests locally. Is there a good way to connect our frontend tests to some version of the backend, or is it better to mock it? Personally, I'd be against mocking, although it would make the tests more self-contained, speed them up, and help reduce flakiness.
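For the Action itself, one common pattern is to start the backend as a job-level service container so the e2e tests have something to connect to. A sketch under the assumption that the backend is available as a Docker image; the image name, port, and environment variable are placeholders, not the project's real ones:

```yaml
# Hypothetical workflow sketch: run Playwright against a containerised
# backend. Image name, port, and BACKEND_URL are placeholders.
name: e2e
on: [pull_request]
jobs:
  e2e:
    runs-on: ubuntu-latest
    timeout-minutes: 60
    services:
      backend:
        image: ghcr.io/example/backend:latest   # placeholder image
        ports:
          - 8080:8080
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npx playwright test
        env:
          BACKEND_URL: http://localhost:8080    # placeholder variable
```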

@Sallaa (Contributor) commented Nov 29, 2024:

@jmafoster1 I will discuss this with @ttramb when he is back (next week), there may be a way to connect it to the back-end.

@Sallaa (Contributor) commented Dec 5, 2024:

> Quick update on this: The tests seem to just time out, get cancelled, and provide no further information. […]

@jmafoster1 could you elaborate on why you are against mocking? We discussed mocking the APIs with Bertrand today and we think mocking is a good idea going forward. We can use mswjs for that. I have created issue #678 to track adding mocking. Partners do not always provide us with all the responses their APIs can return, and it will be useful to have a place to mock these API responses. Their servers also experience downtime more often than we think. However, it will require a good amount of work to mock everything correctly.

@jmafoster1 (Collaborator, Author) commented:

I'm not keen on mocking here because then these aren't true end-to-end tests of the whole system. I was rather hoping that these would serve as integration tests between the backend and the frontend. For example, if something changed in the backend in how data is passed to the frontend, mocked tests wouldn't pick it up.

I appreciate that proper integration tests will be extremely difficult to run as a GitHub Action, though, and are likely to be flaky. Perhaps if we could pass an option to turn mocking on or off, we could run the mocked tests as a GitHub Action and periodically run them locally without mocking to make sure nothing has broken?
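The on/off switch could be as simple as keying the test setup off an environment variable. A sketch where the variable name and both URLs are hypothetical:

```javascript
// Hypothetical toggle: CI sets E2E_MOCK=1 to run against mocked APIs,
// while periodic local runs leave it unset and talk to a real backend.
function resolveBackend(env) {
  const mocked = env.E2E_MOCK === "1";
  return {
    mocked,
    // Placeholder URLs: a local mock server vs. a configurable live one.
    baseUrl: mocked
      ? "http://localhost:9090"
      : env.BACKEND_URL || "http://localhost:8080",
  };
}
```

The Playwright setup would then read `resolveBackend(process.env)` once and either install the mock handlers or point the tests at the live service.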

Another possible (and complementary) option would be to have a separate repository containing just a mapping of URLs to minimal scrape results. We could make that a submodule of both the backend and the frontend, assert that the backend returns objects containing at least the minimal scrape results (i.e. the expected datatype containing at least certain images, links, videos, etc. for each article, with the possibility of extras), and then mock the backend to return those minimal scraped objects to test the frontend. By using the same test data in this way, we effectively mock the integration so that we can still pick up problems in both the backend and the frontend, although anything that involves going off and scraping third-party sites will probably still be a bit flaky. Perhaps I am overthinking this, though, and it's not necessary. (I come from a software-testing research background, so I'm compelled to test everything as thoroughly as possible!)
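The shared-fixture idea could be sketched as follows; the fixture shape, URLs, and function name are invented for illustration:

```javascript
// Hypothetical shared fixture: the minimum each scrape must contain.
// Both repos would pull this from the shared submodule.
const minimalFixtures = {
  "https://example.org/article": {
    images: ["https://example.org/hero.jpg"],
    links: ["https://example.org/related"],
    videos: [],
  },
};

// True iff the response contains at least everything the fixture
// requires; extra images/links/videos are allowed.
function containsAtLeast(response, minimal) {
  return Object.entries(minimal).every(
    ([key, required]) =>
      Array.isArray(response[key]) &&
      required.every((item) => response[key].includes(item))
  );
}
```

A backend test would assert `containsAtLeast(realScrapeResult, fixture)`, while the frontend tests would feed the same fixture straight into the mocked backend, so the two sides stay in agreement about the data shape.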

5 participants