Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

search sys.path when checking if module is builtin / binary #283

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

atleypnorth
Copy link

Hi

Not sure of the process to submit PR, so hope this is enough info etc. If not then please let me know!

I came across an issue when we make use of PYTHONPATH to pick up modules and these modules contain shared libraries

Example code

import sys

sys.path.insert(0,'path_to_numpy')
sys.path.insert(0,'path_to_dill')

import dill
import numpy

def f():
	ui = numpy.uint64(543)
	print(type(ui))

	with open('d1', 'wb') as f:
	   dill.dump(ui, f)

f()

Output

/usr/local/python/3.6.2/bin/python3 d1.py
<class 'numpy.uint64'>
/u1/morrpat/workspace/libs/1.14.3-cp36/numpy/core/multiarray.cpython-36m-x86_64-linux-gnu.so
Traceback (most recent call last):
File "d1.py", line 16, in
f()
File "d1.py", line 14, in f
dill.dump(ui, f)
File "/u1/morrpat/workspace/libs/0.2.7.1/dill/dill.py", line 274, in dump
pik.dump(obj)
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 409, in dump
self.save(obj)
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 521, in save
self.save_reduce(obj=obj, *rv)
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 609, in save_reduce
save(func)
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 476, in save
f(self, obj) # Call unbound method with explicit self
File "/u1/morrpat/workspace/libs/0.2.7.1/dill/dill.py", line 1033, in save_builtin_method
pickler.save_reduce(_get_attr, (module, obj.name), obj=obj)
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 610, in save_reduce
save(args)
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 476, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 736, in save_tuple
save(element)
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 476, in save
f(self, obj) # Call unbound method with explicit self
File "/u1/morrpat/workspace/libs/0.2.7.1/dill/dill.py", line 1239, in save_module
state=_main_dict)
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 634, in save_reduce
save(state)
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 476, in save
f(self, obj) # Call unbound method with explicit self
File "/u1/morrpat/workspace/libs/0.2.7.1/dill/dill.py", line 871, in save_module_dict
StockPickler.save_dict(pickler, obj)
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 821, in save_dict
self._batch_setitems(obj.items())
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 847, in _batch_setitems
save(v)
File "/usr/local/python/3.6.2/lib/python3.6/pickle.py", line 496, in save
rv = reduce(self.proto)
TypeError: can't pickle PyCapsule objects

The fix has it check against all the paths in sys.path as well if it hasnt already determined it is a builtin module.

@matsjoyce
Copy link
Contributor

This will mark all modules as builtin as for them to be found means that they are accessible via sys.path. Since numpy acts weird we could add a check for numpy but this fix does a bit too much ATM.

@atleypnorth
Copy link
Author

good point, let me think more on it.

Also will find another binary package and see if I can get the same issue as numpy shows since that may give more of a clue how fix it correctly.

@matsjoyce
Copy link
Contributor

Most packages don't have the problems numpy has, as it does all kinds of weird magic in creating its types (most of the, don't seem to pickle properly). dill has a number of special functions to deal with them, but it seems a few more are needed...

@mmckerns
Copy link
Member

@atleypnorth: Thanks for taking initiative on this. I agree with @matsjoyce. I think this PR has potential, however some of the changes should come in other forms. For example, issues with PyCapsule objects should be addressed directly. numpy does some "magic" behind the scenes things, and those should be dealt with specifically directly... and to be honest dill does not fully handle all of the numpy magic... so that's what should get patched, especially if it's an issue that stems from how dill handles numpy and not from other modules.

@atleypnorth
Copy link
Author

happy to help ! Will get a bit more time next week to dig into and hopefully can come up with something

@atleypnorth
Copy link
Author

Just going back to the first comment that the change would mark all modules as builtin - isn't this already the case for anything that for example comes out of the virtual environment?

For example if am running using the python in my virtual env /myhome/venv/bin/python then prefix is going to be /myhome/venv/ and this causes all modules under /myhome/venv/lib/python3.6/site-packages to marked as builtins.

@matsjoyce
Copy link
Contributor

Hmm, I guess it would. Maybe we need to replace it with a completely new test. @mmckerns We could try instead to serialize the module dict, and if an exception is thrown fallback to serialization by reference? That should solve @atleypnorth's problem too?

@mmckerns
Copy link
Member

mmckerns commented Oct 2, 2018

@matsjoyce: I don't think I have an issue with trying to serialize the module dict, and in failing... serializing by any of the other means (i.e. serialization variants) in dill. If there's a serialization variant that works... even if it's not the preferred variant by the user... I don't think I have an issue with either having a "whatever works" mode or special cases trying and then trying a variant upon failure. I'd have to think more if this potentially causes issues... I can't think of anything right away, aside from potentially causing some "unexpected" behavior.

@Piyush271986
Copy link

@atleypnorth : is there a plan to merge this PR . we are kind of stuck due to this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants