Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ACCWaitDirective and async keyword on kernels, enter data, parallel (fixed #1664) #2070

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

svalat
Copy link

@svalat svalat commented Mar 14, 2023

While working on CROCO I had to make a try to reach performance of their manual ACC version by using the async/wait ACC semantic so I made the patch which by the way might solve issue #1664.

Done:

  • Add ACCWaitDirective
  • Patch ACCKernelDriective, ACCParallelDirective and AccEnterDataDir to add support for async keyword
  • Patch ACCKernelsTrans to support the async option.

Questions:

  • I made a commit for each, don't know if you prefer to split in a PR for each.
  • In the Trans, I currently forbid nesting async with different streams, don't know if it is too strict or not ?
  • We could also have introduced an intermediate class ACCAsync to inherit from for the ACCKernelDirective, ACCParallelDirective and ACCEnterDataDir which might help for latter tree analysis.

@codecov
Copy link

codecov bot commented Mar 14, 2023

Codecov Report

Patch coverage: 96.82% and project coverage change: -0.01% ⚠️

Comparison is base (5a77078) 99.85% compared to head (701f728) 99.84%.
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2070      +/-   ##
==========================================
- Coverage   99.85%   99.84%   -0.01%     
==========================================
  Files         339      340       +1     
  Lines       46027    46141     +114     
==========================================
+ Hits        45958    46068     +110     
- Misses         69       73       +4     
Files Changed Coverage Δ
src/psyclone/domain/lfric/__init__.py 100.00% <ø> (ø)
src/psyclone/psyir/nodes/__init__.py 100.00% <ø> (ø)
src/psyclone/transformations.py 99.00% <87.87%> (-0.56%) ⬇️
src/psyclone/domain/lfric/lfric_loop_bounds.py 100.00% <100.00%> (ø)
src/psyclone/f2pygen.py 99.04% <100.00%> (ø)
src/psyclone/psyir/nodes/acc_directives.py 100.00% <100.00%> (ø)
src/psyclone/psyir/nodes/acc_mixins.py 100.00% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sergisiso
Copy link
Collaborator

Thanks @svalat

I made a commit for each, don't know if you prefer to split in a PR for each.

No, a single PR is good if they all target to close a single issue.

In the Trans, I currently forbid nesting async with different streams, don't know if it is too strict or not ?

It is fine if its strict, it probably means less testing to consider, we can always expand it later.

We could also have introduced an intermediate class ACCAsync to inherit from for the ACCKernelDirective, ACCParallelDirective and ACCEnterDataDir which might help for latter tree analysis.

I will let the reviewer have the final say on this when looking at the code.

Currently we require that each PR has 100% coverage reported by CodeCov. Once you have this you can select multiple reviewers (in the top right corner of the github PR - for this PR you can mark myself and @arporter) and add the "Ready for Review" label. It is also good to write a message tagging the requested reviewers when adding the "Ready for Review" just so that we get a notification.

@svalat svalat force-pushed the feature/acc-wait-directive branch 2 times, most recently from 4c44e57 to 1fd09e5 Compare March 16, 2023 10:18
@arporter
Copy link
Member

Hi @svalat, if you could fix the conflict then I can take this for review (assuming it's ready?). Thanks :-)

@svalat svalat force-pushed the feature/acc-wait-directive branch from 609a8a5 to 3e34205 Compare June 20, 2023 20:07
@svalat
Copy link
Author

svalat commented Jun 21, 2023

Ok, it is fixed and pass all you tests.

I had to remove one of the check I added in on function (no inclusion of async(x) in async(y) with x!=y). In practice there is no way to reproduce it because it is already forbidden to put a parallel in a parallel or kernel.

@svalat
Copy link
Author

svalat commented Jun 21, 2023

OK, it is now ready for review.

@arporter arporter self-assigned this Jun 29, 2023
Copy link
Member

@arporter arporter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for this @svalat. I'm going to pause my review here as there's quite a lot of duplication of the functionality related to support async (see inline). I think this is best handled by introducing a new mixin class (e.g. ACCAsyncMixin) and having those classes that need it inherit from it. Are you OK to do that or would you prefer one of us to take it on?

src/psyclone/psyir/nodes/acc_directives.py Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_directives.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_directives.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_directives.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_directives.py Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_directives.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_directives.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_directives.py Show resolved Hide resolved
@svalat
Copy link
Author

svalat commented Jun 29, 2023

Ok, I will have a look next week, I can do it myself. Thanks for the review.

@svalat svalat force-pushed the feature/acc-wait-directive branch from 18fe9a6 to f3b8881 Compare July 13, 2023 19:19
@arporter
Copy link
Member

Hi @svalat, is this ready for another look now?

@svalat svalat force-pushed the feature/acc-wait-directive branch from a60ca57 to f3b8881 Compare July 17, 2023 19:56
@svalat
Copy link
Author

svalat commented Jul 17, 2023

Ok, I will have a look next week, I can do it myself. Thanks for the review.

Yes, we might just need to rebase or merge master in to fix the link issue in doc reported by git action once you agree.

Copy link
Member

@arporter arporter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good now, thanks for restructuring.
Main things are to fix the pylint warnings and use a Reference rather than a Signature as argument to the queue number.
All tests and examples are OK with compilation.
Coverage is fine.
Ref. guide builds without new warnings.
Please could you extend the example show in the Kernels transformation docstring to show the use of the async clause.


:param async_queue: Enable async support and attach it to the given queue.
Can use False to disable, True to enable on default stream.
Int to attach to the given stream ID or use a variable Signature
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want 'Reference' rather than 'Signature' - Signatures are used to simplify the process of comparing complicated references when doing dependence analysis.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.
Notice : it complexify a bit the tests because now require to inject Reference(Symbol(("xxxx))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! I see now why you used Signature. We're at the boundary between the 'old' and 'new' way of doing things here. In the new way, an ACCDirective will have one or more ACCClause's as children and the code is generated using the PSyIR backend. In the old way, we have a gen_code method which makes use of begin_string. The initial ACCDirective implementation isn't on yet (#2157) although it's nearly there. What you have will only work if the queue is a scalar variable - if it is an array or structure access then we'll only generate the root name of the access. I think the best solution for now is to use the FortranWriter to ensure that a correct string is produced:

from psyclone.psyir.backend import FortranWriter
text = FortranWriter()(self._wait_queue)

This then needs marking with a TODO for #1872 to say that it needs re-implementing with a Clause.

src/psyclone/psyir/nodes/acc_directives.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_directives.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_directives.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_directives.py Show resolved Hide resolved
src/psyclone/transformations.py Show resolved Hide resolved
@@ -2283,9 +2283,14 @@ def apply(self, sched, options=None):
current = current.parent
posn = sched.children.index(current)

# handle async option
if options == None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document the 'async_queue' option in the docstring. Please also make it clear that if it is set then the same value must also be supplied to any subsequent kernel transformation to ensure synchronisation.


# Create a directive containing the nodes in node_list and insert it.
directive = ACCKernelsDirective(
parent=parent, children=[node.detach() for node in node_list],
default_present=default_present)
default_present=default_present, async_queue=async_queue)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document this in the docstring and make it clear that it needs to be the same value as supplied to any preceding enter data directive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. "If the async_queue option is supplied then the value must match that used with any preceding enter data directive."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still outstanding?

@arporter
Copy link
Member

(By the way, our working practice is that the reviewer resolves conversations/comments once they are happy that the change has been made - otherwise we (I) lose track!)

svalat added a commit to svalat/PSyclone that referenced this pull request Sep 8, 2023
@svalat
Copy link
Author

svalat commented Sep 8, 2023

All fixed. Sorry I made the mistake to start to answer to each one before remembering is sends an email for each.

@arporter
Copy link
Member

arporter commented Sep 8, 2023

All fixed. Sorry I made the mistake to start to answer to each one before remembering is sends an email for each.

No that's fine - I quite like it :-)

src/psyclone/psyir/nodes/acc_directives.py Show resolved Hide resolved
:param wait_queue: Which ACC async group to wait. None to wait all.
:type wait_queue: None/int/Signature
:param wait_queue: Which ACC async stream to wait. None to wait all.
:type wait_queue: Optional[:py:class:psyclone.psyir.nodes.Reference, int]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're missing some backticks: ":py:class:`psyclone.psyir.nodes.Reference`"

an integer.
:type async_queue: None/str/int
:type async_queue: Optional[bool|:py:class:psyclone.psyir.nodes.Reference|int]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backticks around the "psyclone.psyir.nodes.Reference" please.


def __init__(self, children=None, parent=None, default_present=True, async_queue=False):
'''
Constructor of the class.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the confusion but this docstring should be in the class rather than the constructor (that seems to be the standard way to do it. You can then drop the "Constructor of the class" line too.

src/psyclone/psyir/nodes/acc_directives.py Show resolved Hide resolved
'''
Constructor of the class.

:param children: the PSyIR nodes to be enclosed in the Kernels region \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Kernels/ACC parallel/

src/psyclone/psyir/nodes/acc_directives.py Show resolved Hide resolved
:returns: Define which queue to wait.
:rtype: None or str or int
:returns: The queue to wait on.
:rtype: Optional[int, :py:class:psyclone.psyir.nodes.Reference]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backticks please.

Setter to assign a specific wait queue to wait for.

:param wait_queue: The wait queue to expect, or None for all.
:type wait_queue: Optional[int, :py:class:psyclone.psyir.nodes.Reference]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backticks please.


:param async_queue: Enable async support and attach it to the given queue.
Can use False to disable, True to enable on default stream.
Int to attach to the given stream ID or use a variable Signature
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! I see now why you used Signature. We're at the boundary between the 'old' and 'new' way of doing things here. In the new way, an ACCDirective will have one or more ACCClause's as children and the code is generated using the PSyIR backend. In the old way, we have a gen_code method which makes use of begin_string. The initial ACCDirective implementation isn't on yet (#2157) although it's nearly there. What you have will only work if the queue is a scalar variable - if it is an array or structure access then we'll only generate the root name of the access. I think the best solution for now is to use the FortranWriter to ensure that a correct string is produced:

from psyclone.psyir.backend import FortranWriter
text = FortranWriter()(self._wait_queue)

This then needs marking with a TODO for #1872 to say that it needs re-implementing with a Clause.

src/psyclone/psyir/nodes/acc_mixins.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_mixins.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_mixins.py Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_mixins.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_mixins.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_mixins.py Outdated Show resolved Hide resolved
src/psyclone/psyir/nodes/acc_mixins.py Show resolved Hide resolved
:param async_queue: Make the directive asynchonous and attached to the given
stream identified by an ID or by a variable name pointing to
an integer.
:type async_queue: bool | :py:class:psyclone.psyir.nodes.Reference | int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backticks please.

src/psyclone/tests/psyir/nodes/acc_directives_test.py Outdated Show resolved Hide resolved
src/psyclone/transformations.py Show resolved Hide resolved
Copy link
Member

@arporter arporter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nearly, nearly there. Mostly tidying required (particularly for pycodestyle and pylint). There's also a bit of an issue with handling the Reference argument for the asynch. queue but that's fairly easy to fix for now. Please take a look through the 'unresolved' comments as there are some that still need looking at. Thanks :-)
All tests are OK with compilation. (I can't run the integration tests as this PR is from a fork.)

svalat added a commit to svalat/PSyclone that referenced this pull request Sep 11, 2023
svalat added a commit to svalat/PSyclone that referenced this pull request Sep 11, 2023
@svalat
Copy link
Author

svalat commented Sep 12, 2023

All fixed, and I fixed style.

@@ -1143,8 +1092,8 @@ def begin_string(self):

# handle specifying groups
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is still outstanding?

# -----------------------------------------------------------------------------
# BSD 3-Clause License
#
# Copyright (c) 2021-2023, Science and Technology Facilities Council.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My pylint (2.15.8) is still unhappy:

************* Module psyclone.psyir.nodes.acc_mixins
psyir/nodes/acc_mixins.py:111:16: C0415: Import outside toplevel (psyclone.psyir.backend.fortran.FortranWriter) (import-outside-toplevel)
psyir/nodes/acc_mixins.py:128:8: R1705: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it (no-else-return)

Please could you take a look?


# Create a directive containing the nodes in node_list and insert it.
directive = ACCKernelsDirective(
parent=parent, children=[node.detach() for node in node_list],
default_present=default_present)
default_present=default_present, async_queue=async_queue)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still outstanding?

:param async_queue: The async queue to expect in parents.
:type async_queue: \
Optional[bool,int,:py:class: psyclone.core.Reference]
'''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this routine - I think it's good.
Please could you document the :raises: here.

'''

directive_cls = (ACCParallelDirective, ACCKernelsDirective)
for dir in sched.walk(directive_cls):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pylint complains that dir is overriding a Python intrinsic. Please could you rename it?

# check type
if (async_queue is not None and
not isinstance(async_queue, (int, Reference))):
raise TypeError(f"Invalid async_queue value, expect Reference or "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeCov is saying that there's some missed coverage in this new routine (https://github.com/stfc/PSyclone/pull/2070/checks?check_run_id=16784743166). Please could you add one or more tests to cover it?

src/psyclone/tests/psyir/nodes/acc_directives_test.py Outdated Show resolved Hide resolved
Copy link
Member

@arporter arporter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making those changes.
Functionally, I think this is done now. However, CodeCov is reporting that some of the new lines in your new checking routine aren't covered (see inline) and the Ref Guide is reporting a lot of new errors when building. Apart from that, there are just a few minor things to tidy up. Do feel free to comment on each conversation so that we can keep track of what you've done - I don't mind the emails!
(To build the docs, you may need to cd PSyclone; pip install -e .[doc]

'''
Class representing the !$ACC PARALLEL directive of OpenACC
in the PSyIR. By default it includes the 'DEFAULT(PRESENT)' clause which
means this node must either come after an EnterDataDirective or within
a DataDirective.

:param children: the PSyIR nodes to be enclosed in the ACC parallel region\
and which are therefore children of this node.
:type children: List[:py:class:`psyclone.psyir.nodes.Node`]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please could you un-indent these lines (i.e. line them up with L305) as the build of the Reference guide is now giving a lot of new errors. (cd PSyclone/doc/reference_guide; make html - there should be just 17 warnings.)

@@ -758,15 +836,21 @@ class ACCUpdateDirective(ACCStandaloneDirective):
clause on the update directive (this instructs the
directive to silently ignore any variables that are not
on the device).
:param async_queue: Make the directive asynchonous and attached to the \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not your fault but please could you fix the indentation of the continued lines L836-838 as they are also causing errors in the Ref. Guide.

elif isinstance(self.async_queue, int):
result = f" async({self.async_queue})"
elif isinstance(self.async_queue, Reference):
from psyclone.psyir.backend.fortran import FortranWriter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please could you disable the pylint warning about this import by adding:

# pylint: disable=import-outside-toplevel

on the line immediately before the import. Please could you also add:

# TODO #1872 - this will be removed once a suitable Clause is implemented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants