Move to scanpy 1.9.3 and other improvements #117

Merged
merged 49 commits into from
Feb 9, 2024
01075fb
Use mito as bool not changing to category
pcm32 Sep 21, 2022
14783d6
Trials scanpy 1.9.1 for the mnn - numba issue.
pcm32 Nov 30, 2022
b563fac
Fix 1.9.1 hgv not finding base on log1p
pcm32 Nov 30, 2022
d35b6bd
Please black
pcm32 Nov 30, 2022
aec9085
Try log fix regardless of option
pcm32 Nov 30, 2022
3ad791b
Check direct installation
pcm32 Nov 30, 2022
11f5701
Set base explicitly to avoid it being dropped
pcm32 Nov 30, 2022
ffb75cd
Please black
pcm32 Nov 30, 2022
8cb3e20
Add igrahp and reinstate leidenalg
pcm32 Nov 30, 2022
8366105
Also louvain is needed
pcm32 Nov 30, 2022
890403c
Pin back h5py
pcm32 Nov 30, 2022
bce29d7
Passing all tests locally
pcm32 Nov 30, 2022
5556459
Keep todo
pcm32 Mar 26, 2023
8fcdd12
Try github actions with mamba
pcm32 Mar 26, 2023
26adc07
Black formatting
pcm32 Mar 26, 2023
1f16bf8
Python versions, better mamba
pcm32 Mar 26, 2023
952807a
Avoid treating versions as numbers
pcm32 Mar 26, 2023
4e512fc
Actions changes
pcm32 Mar 26, 2023
e691193
Black with no options
pcm32 Mar 26, 2023
f69b515
Black fixes
pcm32 Mar 26, 2023
c77c853
Check co structure
pcm32 Mar 26, 2023
5f186be
Why do we get extra files?
pcm32 Mar 26, 2023
3c79a09
Black manually
pcm32 Mar 26, 2023
68ec5d1
Make sure env is activated
pcm32 Mar 26, 2023
1eb1f96
pytest fix
pcm32 Mar 26, 2023
d14b079
Try with original extra dir for pytest
pcm32 Mar 26, 2023
c40f073
Type
pcm32 Mar 26, 2023
c2ff95b
missing extra dir
pcm32 Mar 26, 2023
a16c045
Use importlib.metadata instead
pcm32 Mar 26, 2023
042f099
pip install before tests
pcm32 Mar 26, 2023
351188a
impose pin on scipy for mnnpy
pcm32 Mar 27, 2023
6e5f0fb
Avoid python 3.10
pcm32 Mar 27, 2023
cc57c3b
Allow single group in one to one marker comp
pcm32 Apr 20, 2023
0e2cbdd
Rerun automatic tests
irisdianauy Oct 31, 2023
69d8322
Try pinning bknn below 1.6.0
pcm32 Jan 20, 2024
631c42d
Pin sklearn for bbknn
pcm32 Jan 21, 2024
5f00501
Fix package name
pcm32 Jan 21, 2024
ee96a41
Pin numba for mnnpy
pcm32 Jan 21, 2024
985d860
Downgrade numba even more for mnnpy
pcm32 Jan 21, 2024
7caa7b1
Pin numpy for mmnpy
pcm32 Jan 21, 2024
3fd13f0
Further pin numpy
pcm32 Jan 21, 2024
949e410
Further pin numpy
pcm32 Jan 21, 2024
bfb0b87
More stringent pinning based on 2022 scanpy-scripts latest container
pcm32 Jan 21, 2024
04404a8
More pinning for mnnpy
pcm32 Jan 21, 2024
af7b84c
More mnnpy pinning
pcm32 Jan 21, 2024
fb18bed
Commented `mnn_batch_correction` test as it fails with scanpy 1.9.1
anilthanki Feb 9, 2024
b50ca8b
adds warning message for `mnn_correct`
anilthanki Feb 9, 2024
a9b6910
Reformat _mnn.py
anilthanki Feb 9, 2024
c1d98f6
Reformat _mnn.py
anilthanki Feb 9, 2024
60 changes: 27 additions & 33 deletions .github/workflows/python-package.yml
@@ -2,54 +2,48 @@ name: Python package

on: [pull_request]

defaults:
run:
# for conda env activation
shell: bash -l {0}

jobs:
build:

runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.7, 3.8]
python-version: ["3.8", "3.9"]

steps:

- uses: actions/checkout@v2
with:
path: scanpy-scripts

- uses: psf/black@stable
with:
options: '--check --verbose --include="\.pyi?$" .'

- uses: actions/checkout@v2
with:
repository: theislab/scanpy
path: scanpy
ref: 1.8.1

- name: Setup BATS
uses: mig4/setup-bats@v1
with:
bats-version: 1.2.1

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
- name: Setup mamba
uses: mamba-org/provision-with-micromamba@main
with:
python-version: ${{ matrix.python-version }}

- name: Install dependencies
environment-file: test-env.yaml
cache-downloads: true
channels: conda-forge, bioconda, defaults
extra-specs: |
python=${{ matrix.python-version }}

- name: Run black manually
run: |
pushd scanpy
patch -p1 < ../scanpy-scripts/scrublet.patch
popd
black --check --verbose ./

sudo apt-get install libhdf5-dev
pip install -U setuptools>=40.1 wheel 'cmake<3.20' pytest
pip install $(pwd)/scanpy-scripts
python -m pip install $(pwd)/scanpy --no-deps --ignore-installed -vv
# - name: Install dependencies
# run: |
# sudo apt-get install libhdf5-dev
# pip install -U setuptools>=40.1 wheel 'cmake<3.20' pytest
# pip install $(pwd)/scanpy-scripts
# # python -m pip install $(pwd)/scanpy --no-deps --ignore-installed -vv

- name: Run unit tests
run: pytest --doctest-modules -v ./scanpy-scripts
run: |
# needed for __version__ to be available
pip install . --no-deps --ignore-installed
pytest --doctest-modules -v ./

- name: Test with bats
run: |
./scanpy-scripts/scanpy-scripts-tests.bats
./scanpy-scripts-tests.bats
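
The quoted matrix entries (`"3.8"`, `"3.9"`) line up with the commit "Avoid treating versions as numbers". A minimal sketch of the underlying problem, assuming the unquoted value ends up parsed as a float somewhere along the way; the helper name `as_number_then_text` is hypothetical and this is not the YAML loader that GitHub Actions uses:

```python
def as_number_then_text(version: str) -> str:
    """Round-trip a version string through float, as a numeric parse would."""
    return str(float(version))


# Versions with a trailing zero lose it once treated as numbers.
for raw in ["3.8", "3.9", "3.10"]:
    print(raw, "->", as_number_then_text(raw))
```

Keeping the versions as strings sidesteps this whenever a release such as 3.10 is added to the matrix.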
3 changes: 3 additions & 0 deletions .gitignore
@@ -7,3 +7,6 @@
*.pyc
/.*history
/.*swp
data
compressed
uncompressed
16 changes: 13 additions & 3 deletions README.md
@@ -4,12 +4,22 @@ A command-line interface for functions of the Scanpy suite, to facilitate flexib

## Install

The recommended way of installing scanpy-scripts is via conda:

```bash
conda install scanpy-scripts
# or
pip3 install scanpy-scripts
```

pip installation is also possible, however the version of mnnpy is not patched as in the conda version, and so the `integrate` command will not work.

```bash
pip install scanpy-scripts
```

For development installation, we suggest following the github actions python-package.yml file.

Currently, tests run on python 3.9, so those are the recommended versions if not installing via conda. BKNN doesn't currently install on Python 3.10 due to a skip in Bioconda.

## Test installation

There is an example script included:
@@ -22,7 +32,7 @@ This requires the [bats](https://github.com/sstephenson/bats) testing framework

## Commands

Available commands are described below. Each has usage instructions available via --help, consult function documentation in scanpy for further details.
Available commands are described below. Each has usage instructions available via `--help`, consult function documentation in scanpy for further details.

```
Usage: scanpy-cli [OPTIONS] COMMAND [ARGS]...
23 changes: 12 additions & 11 deletions scanpy-scripts-tests.bats
@@ -653,17 +653,18 @@ setup() {
}

# Do MNN batch correction, using clustering as batch (just for test purposes)

@test "Run MNN batch integration using clustering as batch" {
if [ "$resume" = 'true' ] && [ -f "$mnn_obj" ]; then
skip "$mnn_obj exists and resume is set to 'true'"
fi

run rm -f $mnn_obj && eval "$scanpy integrate mnn $mnn_opt $louvain_obj $mnn_obj"

[ "$status" -eq 0 ]
[ -f "$mnn_obj" ]
}
# Commented as it fails with scanpy 1.9.1
#
# @test "Run MNN batch integration using clustering as batch" {
# if [ "$resume" = 'true' ] && [ -f "$mnn_obj" ]; then
# skip "$mnn_obj exists and resume is set to 'true'"
# fi
#
# run rm -f $mnn_obj && eval "$scanpy integrate mnn $mnn_opt $louvain_obj $mnn_obj"
#
# [ "$status" -eq 0 ]
# [ -f "$mnn_obj" ]
#}

# Do ComBat batch correction, using clustering as batch (just for test purposes)

4 changes: 2 additions & 2 deletions scanpy_scripts/__init__.py
@@ -1,9 +1,9 @@
"""
Provides version, author and exports
"""
import pkg_resources
import importlib.metadata

__version__ = pkg_resources.get_distribution("scanpy-scripts").version
__version__ = importlib.metadata.version("scanpy-scripts")

__author__ = ", ".join(
[
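
The move from `pkg_resources` to `importlib.metadata` only works when the distribution is actually installed, which is presumably why the workflow runs `pip install .` before `pytest`. A small sketch of the lookup with a fallback for the uninstalled case; the helper `installed_version` and its `default` argument are illustrative assumptions, not part of this PR:

```python
import importlib.metadata


def installed_version(dist_name: str, default: str = "unknown") -> str:
    """Return the installed version of a distribution, or `default` if absent."""
    try:
        return importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        # Raised when no metadata for dist_name is found in the environment.
        return default


# Prints the real version when scanpy-scripts is installed, else "unknown".
print(installed_version("scanpy-scripts"))
```

The PR itself lets the lookup fail loudly rather than falling back, which is a reasonable choice for a package that is always installed before use.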
4 changes: 3 additions & 1 deletion scanpy_scripts/cmd_utils.py
@@ -6,10 +6,11 @@
import pandas as pd
import scanpy as sc
import scanpy.external as sce

from .cmd_options import CMD_OPTIONS
from .lib._paga import plot_paga
from .obj_utils import _save_matrix
from .lib._scrublet import plot_scrublet
from .obj_utils import _save_matrix


def make_subcmd(cmd_name, func, cmd_desc, arg_desc, opt_set=None):
@@ -313,6 +314,7 @@ def plot_function(
showfig = True
if output_fig:
import os

import matplotlib.pyplot as plt

sc.settings.figdir = os.path.dirname(output_fig) or "."
37 changes: 34 additions & 3 deletions scanpy_scripts/lib/_diffexp.py
@@ -2,9 +2,11 @@
scanpy diffexp
"""

import logging
import math

import pandas as pd
import scanpy as sc
import logging


def diffexp(
@@ -22,6 +24,15 @@ def diffexp(
):
"""
Wrapper function for sc.tl.rank_genes_groups.

Test that we can load a single group.
>>> import os
>>> from pathlib import Path
>>> adata = sc.datasets.krumsiek11()
>>> tbl = diffexp(adata, groupby='cell_type', groups='Mo', reference='progenitor')
>>> # get the size of the data frame
>>> tbl.shape
(11, 8)
"""
if adata.raw is None:
use_raw = False
@@ -51,6 +62,11 @@ def diffexp(
"Singlet groups removed before passing to rank_genes_groups()"
)

# avoid issue when groups is a single group as a string simplified by click
# https://github.com/ebi-gene-expression-group/scanpy-scripts/issues/123
if groups != "all" and isinstance(groups, str):
groups = [groups]

sc.tl.rank_genes_groups(
adata,
use_raw=use_raw,
@@ -64,17 +80,32 @@ def diffexp(
de_tbl = extract_de_table(adata.uns[diff_key])

if isinstance(filter_params, dict):
key_filtered = diff_key + "_filtered"
sc.tl.filter_rank_genes_groups(
adata,
key=diff_key,
key_added=diff_key + "_filtered",
key_added=key_filtered,
use_raw=use_raw,
**filter_params,
)

de_tbl = extract_de_table(adata.uns[diff_key + "_filtered"])
# there are non strings on recarray object at this point, in
# adata.uns['rank_genes_groups_filtered']['names']
# for instance:
# adata.uns['rank_genes_groups_filtered']['names'][0]
# (nan, nan, 'NKG7', nan, nan, 'PPBP')
# this now upsets h5py > 3.0
de_tbl = extract_de_table(adata.uns[key_filtered])
de_tbl = de_tbl.loc[de_tbl.genes.astype(str) != "nan", :]

# change nan for strings in adata.uns['rank_genes_groups_filtered']['names']
# TODO on scanpy updates, check if this is not done within scanpy so that we can remove this
for row in range(0, len(adata.uns[key_filtered]["names"])):
for col in range(0, len(adata.uns[key_filtered]["names"][row])):
element = adata.uns[key_filtered]["names"][row][col]
if isinstance(element, float) and math.isnan(element):
adata.uns[key_filtered]["names"][row][col] = "nan"

if save:
de_tbl.to_csv(save, sep="\t", header=True, index=False)

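
The two fixes in `_diffexp.py` (wrapping a lone group name in a list, and swapping float NaN for the string `"nan"` before writing) can be sketched in isolation. This is a simplified illustration: plain Python lists stand in for the NumPy recarray in `adata.uns`, and the helper names `as_group_list` and `nan_to_str` are hypothetical.

```python
import math


def as_group_list(groups):
    """Wrap a lone group name in a list; leave "all" and sequences untouched."""
    if groups != "all" and isinstance(groups, str):
        return [groups]
    return groups


def nan_to_str(rows):
    """Return a copy of rows with float NaN entries replaced by the string "nan"."""
    return [
        ["nan" if isinstance(x, float) and math.isnan(x) else x for x in row]
        for row in rows
    ]


print(as_group_list("Mo"))
print(nan_to_str([[float("nan"), "NKG7"], ["PPBP", float("nan")]]))
```

Unlike the loop in the diff, which rewrites the array in place, this sketch builds a new list, so it shows the logic rather than the exact mutation.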
2 changes: 1 addition & 1 deletion scanpy_scripts/lib/_filter.py
@@ -37,7 +37,7 @@ def filter_anndata(
k_mito = gene_names.str.startswith("MT-")
if k_mito.sum() > 0:
adata.var["mito"] = k_mito
adata.var["mito"] = adata.var["mito"].astype("category")
# adata.var["mito"] = adata.var["mito"].astype("category")
else:
logging.warning(
"No MT genes found, skip calculating "
Expand Down
1 change: 1 addition & 0 deletions scanpy_scripts/lib/_louvain.py
@@ -3,6 +3,7 @@
"""

import scanpy as sc

from ..obj_utils import write_obs


9 changes: 7 additions & 2 deletions scanpy_scripts/lib/_mnn.py
@@ -2,9 +2,10 @@
scanpy external mnn
"""

import scanpy.external as sce
import numpy as np
import click
import numpy as np
import scanpy.external as sce
import logging

# Wrapper for mnn allowing use of non-standard slot

@@ -16,6 +17,10 @@ def mnn_correct(adata, key=None, key_added=None, var_subset=None, layer=None, **

# mnn will use .X, so we need to put other layers there for processing

logging.warning(
"Use mnn_correct at your own risk, environment installation seems faulty for this module."
)

if layer:
adata.layers["X_backup"] = adata.X
adata.X = adata.layers[layer]
10 changes: 9 additions & 1 deletion scanpy_scripts/lib/_norm.py
@@ -3,6 +3,7 @@
"""

import scanpy as sc
import math


def normalize(adata, log_transform=True, **kwargs):
@@ -12,6 +13,13 @@ def normalize(adata, log_transform=True, **kwargs):
"""
sc.pp.normalize_total(adata, **kwargs)
if log_transform:
sc.pp.log1p(adata)
# Natural logarithm is the default by scanpy, if base is not set
base = math.e
Member:

I don't understand this. If the natural (Napierian) logarithm is the default in Scanpy, why are you defining it?

Is this related to the Scanpy version bump?

Member Author:

It is mostly to set it explicitly in `adata.uns`. There seems to be an issue whereby calling `sc.pp.log1p` without an explicit base doesn't record the base in `uns`, so later down the line you have no way of knowing which base the log was taken in.
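
A rough sketch of that guard, using a plain dict in place of `adata.uns` so it runs without scanpy or anndata. The helper name `record_log1p_base` is hypothetical, and treating a stored `None` the same as a missing key is an assumption based on the note in the diff that `None` gets dropped on write or read:

```python
import math


def record_log1p_base(uns: dict, base: float = math.e) -> dict:
    """Store a concrete log base under uns["log1p"]["base"] if none is recorded."""
    meta = uns.get("log1p")
    # Only touch the entry when log1p has been recorded but its base has not.
    if meta is not None and meta.get("base") is None:
        meta["base"] = base
    return uns


print(record_log1p_base({"log1p": {}}))
print(record_log1p_base({"log1p": {"base": 2}}))
```

An existing base is left alone, so the guard is safe to run more than once.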

sc.pp.log1p(adata, base=base)
# scanpy is not setting base in uns['log1p'] keys, but later on asking for it
if "log1p" in adata.uns_keys() and "base" not in adata.uns["log1p"]:
# Note that setting base to None doesn't solve the problem at other modules that check for base later on
# as adata.uns["log1p"]["base"] = None gets dropped at either anndata write or read.
adata.uns["log1p"]["base"] = base

return adata
23 changes: 12 additions & 11 deletions setup.py
@@ -1,11 +1,11 @@
from setuptools import setup, find_packages
from setuptools import find_packages, setup

with open("README.md", "r") as fh:
long_description = fh.read()

setup(
name="scanpy-scripts",
version="1.1.6",
version="1.1.9",
author="nh3",
author_email="[email protected]",
description="Scripts for using scanpy from the command line",
@@ -35,23 +35,24 @@
]
),
install_requires=[
"packaging",
"anndata",
"scipy",
"matplotlib",
"pandas",
"h5py<3.0.0",
"scanpy==1.8.1",
# "packaging",
# "anndata",
# "scipy",
# "matplotlib",
# "pandas",
# "h5py<3.0.0",
"scanpy==1.9.3",
"louvain",
"igraph",
"leidenalg",
"loompy",
"Click<8",
"umap-learn",
# "umap-learn",
"harmonypy>=0.0.5",
"bbknn>=1.5.0",
"mnnpy>=0.1.9.5",
"scrublet",
"scikit-misc",
# "scikit-misc",
"fa2",
],
)