Update testing suite to pytest #655

eachanjohnson · 2024-07-31T09:03:01Z

In running tests for PR #654, I ran into an issue with testing on my python3.11 platform. The nose package itself was running into import errors, and on further digging does not seem to be maintained any longer.

So I've lightly rewritten tests/test_style.py and tests/test_umi_tools.py to use pytest.

The tests now run for python3.11, so I expanded the range of Python versions in GitHub actions. Unfortunately, umi_tools doesn't compile under python3.12, which I think is a known issue #627.

Also, not all the tests run successfully, which I guess is related to the non-deterministic issue #362 and PRs #550, #98.

This may not be the kind of contribution you're looking for, but I ended up doing the work so I thought I'd hand it over 😄

Remove python3.12 test

Remove nose

Switch script tests to pytest

TomSmithCGAT · 2024-08-01T10:14:06Z

Thank you for this @eachanjohnson! You're right that nose is no longer maintained. Moving into a different testing framework has been on the to-do list for a while now. I really appreciate you taking the time to do this for us 🙏

I've confimed the 18 failed tests with python 3.11 outside of the github actions too. Given past experiences and looking into a few of the failures, I'm assuming this is due to changes in how dictionaries work between py3.10-3.11, but I can't see anything in the release notes relating to this. @IanSudbery - Do you have any idea what's going on here?

IanSudbery · 2024-08-05T17:25:35Z

So I had a quick look at one test. There seems to be a couple of things going on here. One is the sort order has gotten out of whack -

Taking the first failed test:

umi_tools dedup --paired --log=test.log --filter-umi --umi-whitelist=tests/umi_whitelist.tsv --umi-whitelist-paired=tests/umi_whitelist.tsv --out-sam
--random-seed=123456789 -I tests/whitelist_umi_input.bam

There are 2866 differences:

$ diff -y --suppress-common-lines test.out tests/whitelist_umi_output.sam | wc -l
2232

However, if we sort there are only 3:

$ diff -y --suppress-common-lines <(sort test.out) <(sort tests/whitelist_umi_output.sam) | wc -l
3

My guess is that this must be a change in how pysam (and possibly htslib/samtools) sorts bam files.

There are still three lines different though:

$ diff -y --suppress-common-lines <(sort test.out) <(sort tests/whitelist_umi_output.sam)
NS500105:308:HCGGHAFXY:1:11203:7544:2472_TACGAACCGTGTACTG     <
NS500105:308:HCGGHAFXY:1:21106:2617:15679_GTTCACCTAGCTCTAG    <
NS500105:308:HCGGHAFXY:1:21106:2617:15679_GTTCACCTAGCTCTAG    <

Three extra reads that arn't in the reference.

IanSudbery · 2024-08-07T15:28:37Z

The sort order issue can probably be sorted by add more sort directives in the tests. I believe the remaining issues are caused by changes in which way ties are being broken when a pseudo-random choice between two equivalent UMIs is being made. At the moment, I think we just choose whichever comes out top from sorted, which in turn will depend on what order they came out of the original dictionary.

One way in which this might have changed is that the default hashing algorithm for str changed between 3.10 and 3.11.

@TomSmithCGAT This leaves us in the somewhat difficult position of trying to decide whether to implement different tests for different versions, or finally chase down dealing with getting something predictable out of extracting things from a dict.

One alternative would be to explicitly check whether the top choice is a tie, and use an explict random choice to pick.

I don't know how much slow down this would give.

TomSmithCGAT · 2024-08-07T20:23:22Z

I think we can just merge #550 and be done with the non-deterministic issues, right?

For some reason, I thought that was waiting on you running some benchmarks, but I see that was done a full year ago 😱. If you're still happy with the changes in the code in #550, I'll resolve the conflicts and update the test files so we can merge that PR

IanSudbery · 2024-08-08T10:37:36Z

@TomSmithCGAT If you had time to do that, that would be great.

TomSmithCGAT · 2024-10-01T12:59:02Z

Hi @eachanjohnson. I finally got around to merging #550, which should resolve the non-deterministic issue causing issues here. I tried to make push commits to this branch so it will pass the tests but I got a permission denied error. It's just a few steps:

Pull master branch in your fork
Merge master into switch-to-pytest branch
That'll give a conflict error for .github/workflows/tests.yml. You can resolve that in favour of your branch and remove the line with export PYTHONHASHSEED=0 under the run directive, since that's no longer needed.

After that, you should see that the tests pass regardless of whether PYTHONHASHSEED is set. Hopefully that'll resolve the issues with python 3.10/3.11 too. Let's see..

eachanjohnson · 2024-10-01T21:54:14Z

Thanks, @TomSmithCGAT! Feeling ambitious, I added python3.12 to the tests, which also passed 🚀

IanSudbery · 2024-10-02T10:40:13Z

@eachanjohnson Thats for doing that. @TomSmithCGAT Are we good to merge?

TomSmithCGAT · 2024-10-03T08:36:13Z

Yes. All good. I'll do that now.

Thank you again @eachanjohnson 🙏

eaoljo added 3 commits July 31, 2024 09:35

Change tester from nose to pytest

0182827

Remove python3.12 test

Remove nose

5b6183c

Remove nose

Remove assert parentheses

0ed9cfa

Switch script tests to pytest

IanSudbery mentioned this pull request Aug 9, 2024

Update Python versions for tests #658

Closed

eachanjohnson added 2 commits October 1, 2024 22:47

Pull upstream changes

93328da

Add python3.12 support

7f46854

TomSmithCGAT merged commit 6a5577f into CGATOxford:master Oct 3, 2024
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update testing suite to pytest #655

Update testing suite to pytest #655

eachanjohnson commented Jul 31, 2024

TomSmithCGAT commented Aug 1, 2024

IanSudbery commented Aug 5, 2024

IanSudbery commented Aug 7, 2024

TomSmithCGAT commented Aug 7, 2024

IanSudbery commented Aug 8, 2024

TomSmithCGAT commented Oct 1, 2024

eachanjohnson commented Oct 1, 2024

IanSudbery commented Oct 2, 2024

TomSmithCGAT commented Oct 3, 2024

Update testing suite to pytest #655

Update testing suite to pytest #655

Conversation

eachanjohnson commented Jul 31, 2024

TomSmithCGAT commented Aug 1, 2024

IanSudbery commented Aug 5, 2024

IanSudbery commented Aug 7, 2024

TomSmithCGAT commented Aug 7, 2024

IanSudbery commented Aug 8, 2024

TomSmithCGAT commented Oct 1, 2024

eachanjohnson commented Oct 1, 2024

IanSudbery commented Oct 2, 2024

TomSmithCGAT commented Oct 3, 2024