Include docstrings, maybe a schema? #10
Comments
@haesleinhuepf ... curious how these thoughts strike you? Playing around with plugin autogeneration and general code deduplication in pyclesperanto is definitely something I'd like to work on, but the solutions there will depend very much on the structure in this submodule.
Indeed, that would be the first choice on my side.
Hey @tlambert03, your suggestion is amazing! We could copy-paste the information from the current pyclesperanto docstrings, curate it here, and in the future auto-generate the docstrings (in Python, Java, C++, HTML/website) from this repository. Side topic: what about the "kernels" that have no OpenCL code directly associated with them? Do you think we could maintain their documentation here as well, or should it live somewhere else? I'm talking about filters like top-hat, for example.
that's the dream! You can tell me how easy/hard that will be for Java or C++ ... but for python and HTML, it should be no sweat. And then we all know we're in sync :)
good question. I'd hope we can find a place for them here too. It'd be pretty trivial to do that with the schema idea:

    version: 2.0.0.10
    operations:
      - name: top-hat
        description: Applies a top-hat filter for background subtraction to the input image.
        tier: 2
        parameters:
          - name: input
            ...
        comprises:
          - minimum_box
          - maximum_box
          - add_images_weighted

and with doxygen, there's probably some sort of stub concept we can use to the same effect. I'm not that familiar with doxygen though. @StRigaud do you have thoughts here? Does anyone have strong feelings for or against an independent schema vs strictly putting it in the cl code itself?
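To make the "auto-generate the docstrings from this repository" idea a bit more concrete, here is a minimal Python sketch that renders a docstring from one schema entry. It assumes the entry has already been loaded into a dict; the `render_docstring` helper and the parameter description are hypothetical, since the example above elides the parameter details.

```python
# Hypothetical schema entry mirroring the top-hat example in this thread.
# The parameter description is a placeholder: the original elides it.
operation = {
    "name": "top-hat",
    "description": "Applies a top-hat filter for background subtraction to the input image.",
    "tier": 2,
    "parameters": [
        {"name": "input", "description": "placeholder description"},
    ],
    "comprises": ["minimum_box", "maximum_box", "add_images_weighted"],
}


def render_docstring(op):
    """Render a numpydoc-style docstring from one schema entry (a sketch)."""
    lines = [op["description"], ""]
    if op.get("parameters"):
        lines += ["Parameters", "----------"]
        for param in op["parameters"]:
            lines.append(param["name"])
            lines.append("    " + param.get("description", ""))
        lines.append("")
    if op.get("comprises"):
        # Composite operations have no kernel of their own, so list their parts.
        lines.append("Composed of: " + ", ".join(op["comprises"]))
    return "\n".join(lines).rstrip()


docstring = render_docstring(operation)
print(docstring)
```

The same entry could feed an HTML template or a Java/C++ generator; only the rendering function would differ.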
I think it's also pretty straightforward in Java. I was thinking of translating the current code-generation machinery from Java to Python anyway. Just another code translation project ... this clEsperanto project turns out to involve translating a lot of code. I hope it will spare us at least that amount of code translation afterwards. 🤣 Joking aside: this massive refactoring gives me the chance to clean up some messy parts. Thanks for being with me. And thanks for your patience. Minor comment: can we leave out the version from the schema you suggested? I think git tags / versions can do the job better.
In the sense that we don't need to hard-code it, sure. We can pull the actual value from version control, but I do think the actual "schema instance" (i.e. the yml or json file "product" that other repos/users use) should have a version string in it.
If this concept of a schema outside of the CL code itself (but in this repo) is at least moderately attractive, I can start putting together a slightly more concrete example... which we can then populate more fully once we like the pattern. Sound good?
We basically need two files: one to define the schema itself (I would probably use JSON Schema unless you have any objections... it has good support and validators for all of the languages at play here), and then the actual schema "instance", like the toy examples I put above. That too could be in JSON, but yaml obviously has very attractive human-readability. Ultimately, if we regret putting this info outside of the CL code itself, I don't think it will be that hard to convert (and at that point all of the info will be nicely in one place anyway).
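As a rough illustration of the two-file idea, the sketch below pairs a JSON Schema style definition with an instance and checks one against the other. Everything here is an assumption: the field names follow the toy example in this thread, and `validate` is a deliberately tiny stand-in that only understands `type`, `required`, `properties` and `items`. A real setup would use a full JSON Schema validator.

```python
import json

# Hypothetical schema definition (file 1). Field names follow the toy
# example in this thread; nothing here is a settled format.
SCHEMA = {
    "type": "object",
    "required": ["version", "operations"],
    "properties": {
        "version": {"type": "string"},
        "operations": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["name", "description"],
                "properties": {
                    "name": {"type": "string"},
                    "description": {"type": "string"},
                    "tier": {"type": "integer"},
                },
            },
        },
    },
}

_TYPES = {"object": dict, "array": list, "string": str, "integer": int}


def validate(instance, schema, path="$"):
    """Toy validator: handles only type/required/properties/items."""
    expected = schema.get("type")
    if expected and not isinstance(instance, _TYPES[expected]):
        return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required key '{key}'")
        for key, sub in schema.get("properties", {}).items():
            if key in instance:
                errors += validate(instance[key], sub, f"{path}.{key}")
    if isinstance(instance, list) and "items" in schema:
        for i, item in enumerate(instance):
            errors += validate(item, schema["items"], f"{path}[{i}]")
    return errors


# Hypothetical schema instance (file 2), written as JSON here.
instance = json.loads("""
{
  "version": "2.0.0.10",
  "operations": [
    {"name": "top-hat",
     "description": "Applies a top-hat filter for background subtraction to the input image.",
     "tier": 2}
  ]
}
""")

errors = validate(instance, SCHEMA)
broken = validate({"operations": [{"name": "top-hat"}]}, SCHEMA)
print(errors)
print(broken)
```

Keeping the version string inside the instance, as argued above, means a consumer can check it without access to the git history.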
I had some thoughts on that too. If you have time, I'd be curious to hear what your strategy is before you write it all out?
It looks good to me. 👍 This will also help us define the filter names and parameters across all the languages (Python, C++, Fiji, Java...).
I'm just adding an idea here: how about aiming for compatibility with the Common Workflow Language?
Got two questions from #12:
For the last, I assume yes but prefer to ask. Mainly for the description field.
that was a super-preliminary way to indicate that this "input" parameter also serves as a kernel output. was going to raise that as a thing to discuss... Happy to represent that concept differently!
yep, the easiest way to indicate that in YAML is a pipe character:

    operation:
      description: |
        any long string
        with line breaks, etc...

Though, there are alternate syntaxes that can also be used. See: https://yaml-multiline.info/
Love that there is now a central kernels repo. I think it would be great to put as much of the universal information here as possible.
I know it's a lot of work... but it would be great if the kernels here could be documented with a good (parseable) format ... I guess doxygen is the de facto C standard? (not sure there). And then that info wouldn't need to be duplicated (for instance, pyclesperanto.absolute and clij2 Absolute). Adding docstrings here could be a first step.
a related idea (maybe an alternative?) would be to have a very easily parseable schema that does not require doxygen to parse. The downside there is that we'd need to make sure the kernels were in sync with the schema (we could write tests for that ... but it's possible that the logic for those tests would be as complicated as simply using doxygen to parse in the first place). An example schema:
(this sort of thing would also make it much easier to create templates that let you convert scripts between languages very easily... as well as facilitate autogeneration of most of the .py files in pyclesperanto)
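For the "tests to keep the kernels in sync with the schema" part, a minimal sketch might look like the following. It assumes kernels are declared as `__kernel void <name>(` in the .cl sources and that the schema lists operation names; the `sync_report` helper and the two inline kernel strings are made-up stand-ins, not actual files from this repo.

```python
import re

# Matches OpenCL kernel declarations such as "__kernel void absolute(".
KERNEL_RE = re.compile(r"__kernel\s+void\s+(\w+)\s*\(")


def kernel_names(cl_source):
    """Return the set of kernel function names declared in one .cl source."""
    return set(KERNEL_RE.findall(cl_source))


def sync_report(cl_sources, schema_names):
    """Compare kernels found in the sources with names listed in the schema."""
    found = set()
    for source in cl_sources:
        found |= kernel_names(source)
    documented = set(schema_names)
    return {
        "undocumented": sorted(found - documented),    # kernel without schema entry
        "missing_kernel": sorted(documented - found),  # schema entry without kernel
    }


# Made-up stand-ins for the contents of .cl files.
cl_sources = [
    "__kernel void absolute(__global float* src, __global float* dst) { }",
    "__kernel void minimum_box(__global float* src, __global float* dst) { }",
]
schema_names = ["absolute", "maximum_box"]

report = sync_report(cl_sources, schema_names)
print(report)
```

A check like this only compares names. Comparing parameters would take more parsing, which is where the logic could start to rival simply using doxygen. Composite operations without their own kernel would also need to be excluded from the `missing_kernel` side.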