(Closes #2499) scalarization transformation implementation #2563

LonelyCat124 · 2024-04-30T14:52:12Z

First implementation is here. I have unit tests for the private routines. I still need to test it properly with code examples that do the full transformation, but in theory it's implemented here for now.

codecov · 2024-04-30T15:14:46Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.86%. Comparing base (6f618c9) to head (ed75417).
Report is 351 commits behind head on master.

@@           Coverage Diff           @@
##           master    #2563   +/-   ##
=======================================
  Coverage   99.86%   99.86%           
=======================================
  Files         357      355    -2     
  Lines       48332    48402   +70     
=======================================
+ Hits        48266    48339   +73     
+ Misses         66       63    -3

LonelyCat124 · 2024-04-30T15:20:16Z

@hiker Quick question on how the ordering of statements goes with the dependency system - if you have an assignment such as a = a + 1 does the a access on the RHS appear before the a on the LHS? I assume so (and that makes sense) but I just wanted to double check as I have a superfluous (and unreachable) if statement if so.
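To make the ordering in question concrete, here is a minimal sketch that does not use PSyclone at all. It uses Python's own `ast` module on the statement `a = a + 1` and lists the right-hand-side reads before the left-hand-side writes, which is the ordering the question assumes. The function name `accesses_rhs_then_lhs` is hypothetical, and whether PSyclone's dependency tooling orders accesses the same way is exactly what is being asked.

```python
import ast


def accesses_rhs_then_lhs(statement):
    """List (name, kind) pairs for one assignment: RHS reads, then LHS writes.

    Toy illustration only; this is not PSyclone's access-ordering code.
    """
    node = ast.parse(statement).body[0]
    if not isinstance(node, ast.Assign):
        raise ValueError("expected a single assignment statement")
    # Names appearing in the value expression are read when it is evaluated.
    reads = [(n.id, "read") for n in ast.walk(node.value)
             if isinstance(n, ast.Name)]
    # Names appearing as assignment targets are written afterwards.
    writes = [(n.id, "write") for target in node.targets
              for n in ast.walk(target)
              if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)]
    return reads + writes


order = accesses_rhs_then_lhs("a = a + 1")
```

With this ordering, a check that looks for "a write whose next access is a read inside the same assignment" can never fire, because the read is always recorded first.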

LonelyCat124 · 2024-05-01T12:49:52Z

@sergisiso this is ready for a first look.
I didn't add anything to user or dev guide for now, let me know if you think I should.

I'm waiting for hiker to clarify my question before I remove the code at lines 123 in the transformation, but I expect to remove those lines as I think they're unreachable.

sergisiso

Good initial implementation @LonelyCat124. I am just asking whether the code can be slightly refactored to be more readable, and I am wondering whether some of the logic should live inside the next_access methods. I am happy to discuss this one.

sergisiso · 2024-05-16T09:15:07Z

src/psyclone/psyir/transformations/scalarization_trans.py

+
+class ScalarizationTrans(LoopTrans):
+
+    def _find_potential_scalarizable_array_symbols(self, node, var_accesses):


I found this method hard to understand until I read the apply method. I think its name is too generic and it does too many things, so I suggest breaking it up (see name suggestions below).

I also notice that they all iterate over the potential var_accesses and select/filter the ones that pass the conditions. Could we use Python's filter functionality? If the apply method were something like this:

    var_accesses = VariablesAccessInfo(nodes=node.loop_body)
    var_accesses = filter(self._is_local_array, var_accesses)
    var_accesses = filter(self._have_same_unmodified_index, var_accesses)
    var_accesses = filter(self._first_access_is_write, var_accesses)
    var_accesses = filter(self._not_used_after_loop, var_accesses)

It would be very readable (so much so that I think it doesn't need inline comments) and it makes each method simpler (no output array, no iteration, ...). Also, the methods don't use self or node (except the last one, but see the next comment), so should they be static methods or even just inner functions of apply?

Could you explore whether this would work?

It's doable I think. I've moved them to be static methods (I don't like inner functions). They do still need var_accesses (and node in the final function's case), so there is a lambda in the filter calls, but it seems to work. I will need to rewrite the tests for this and maybe expand the apply testing.
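For reference, the chained-filter shape discussed above can be sketched without PSyclone. Everything here is a toy: `ToyAccess`, its fields and the predicate functions are hypothetical stand-ins, not the real `VariablesAccessInfo` API. The last predicate needs extra context, so it is bound with a lambda, as mentioned in the reply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyAccess:
    """Hypothetical summary of how one symbol is used in a loop body."""
    name: str
    local_array: bool
    same_unmodified_index: bool
    first_access_is_write: bool


def is_local_array(access):
    return access.local_array


def has_same_unmodified_index(access):
    return access.same_unmodified_index


def first_access_is_write(access):
    return access.first_access_is_write


def not_used_after_loop(access, names_used_after_loop):
    # Needs extra context beyond the access itself, hence the lambda below.
    return access.name not in names_used_after_loop


def scalarizable_names(accesses, names_used_after_loop):
    """Apply each condition in turn; only accesses passing all of them remain."""
    accesses = filter(is_local_array, accesses)
    accesses = filter(has_same_unmodified_index, accesses)
    accesses = filter(first_access_is_write, accesses)
    accesses = filter(
        lambda acc: not_used_after_loop(acc, names_used_after_loop), accesses)
    return [acc.name for acc in accesses]


candidates = [
    ToyAccess("tmp", True, True, True),
    ToyAccess("out", True, True, True),    # used after the loop
    ToyAccess("acc", True, True, False),   # read before it is written
    ToyAccess("x", False, True, True),     # not a local array
]
names = scalarizable_names(candidates, names_used_after_loop={"out"})
```

Because `filter` is lazy, nothing is evaluated until the final list comprehension, and each predicate only sees the accesses that passed the earlier ones.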

sergisiso · 2024-05-16T09:21:55Z

src/psyclone/psyir/transformations/scalarization_trans.py

+from psyclone.psyir.transformations.loop_trans import LoopTrans
+
+
+class ScalarizationTrans(LoopTrans):


When you add the docstring, can you bring what you wrote in apply here and add a simple example, like the one in src/psyclone/psyir/transformations/hoist_trans.py? Then we just need to list the autoclass under the available transformations section of doc/user_guide/transformations.rst, and it can be tested with python -m doctest src/psyclone/psyir/transformations/scalarization_trans.py. The apply docstring can then be a one-liner.
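As a rough sketch of the docstring pattern being requested (a class-level docstring carrying a runnable example), here is a toy class checked with the standard `doctest` module. `ToyScalarizationTrans` and its string-replacing `apply` are hypothetical and unrelated to the real transformation; only the docstring layout is the point.

```python
import doctest


class ToyScalarizationTrans:
    """Toy stand-in showing a class docstring with a doctest example.

    The description of the transformation would live here, followed by
    a short example:

    >>> trans = ToyScalarizationTrans()
    >>> trans.apply(["tmp(i) = a(i)", "b(i) = tmp(i)"], "tmp(i)", "tmp_scalar")
    ['tmp_scalar = a(i)', 'b(i) = tmp_scalar']
    """

    def apply(self, lines, array_ref, scalar_name):
        """Replace each textual use of array_ref with scalar_name."""
        return [line.replace(array_ref, scalar_name) for line in lines]


def run_docstring_examples(obj):
    """Run the examples in obj's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        obj.__doc__, {obj.__name__: obj}, obj.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


failed, attempted = run_docstring_examples(ToyScalarizationTrans)
```

Running `python -m doctest` on a file does the equivalent for every docstring in that file, so a failing example shows up as a test failure.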

sergisiso · 2024-05-16T09:44:24Z

src/psyclone/psyir/transformations/scalarization_trans.py

+            # Find the last access of each signature
+            last_access = var_accesses[signature].all_accesses[-1].node
+            # Find the next access to this symbol
+            next_access = last_access.next_access()
+            # If we don't use this again then its valid
+            if next_access is None:
+                potential_arrays.append(signature)
+                continue


I was expecting the check for next accesses to be only this. Why do we need the rest? Is it because the next_access implementation is currently missing these? E.g. if we use next_access in other places, such as when deciding where it is safe to place a halo_exchange, wouldn't we need the same logic as below? In that case, shouldn't this all be inside the next_access implementation?

Also, I see that next_access only returns one reference. Was this the right choice, given that there is potentially more than one?

    a = 3
    if condition:
        a = 1
    else:
        a = 1

Also, the a in the inner loop has 2 potential next_accesses:

    do i=1, 10
      a = 1
      do j=1, 10
        a = 2
      enddo
    enddo
    a = 3

I deliberately kept the next_access function simple and tried to avoid the complicated cases (and rely on the dependency analysis tooling) like this, and just assume these are ok.

I can try to make a new issue and PR to improve that functionality if you think its necessary/useful. I feel like there can always be an incredibly complex structure of nested loops that makes this difficult.

But wouldn't this complex logic be for everywhere that we want to prove if a symbol is going to be accessed next or not, not just for this transformation?

I deliberately kept the next_access function simple and tried to avoid the complicated cases (and rely on the dependency analysis tooling)

I am not sure if it's next access or the VariablesAccessInfo tooling itself that needs to provide this.

@hiker and @JulienRemy I am trying to understand this. Is this related to the logic you needed for building the DAG? That is, the VariablesAccessInfo returns ALL following dependencies under a node (but does not look upwards, as we would need in my loop snippet above), while next_access returns just one, in code order?

But what we need (and maybe what next_access should be) is a method that returns all "directly reachable" accesses to the same symbol, which could be more than one and would be equivalent to the arrows leaving a node in the DAG?

I think this is a separate issue (though I'm not sure about the loop case, we might need to gracefully fail somehow for some case, I'll try to remember to put a test in for that).

We otherwise ignore the if case here (which could be improved if next_access gave all options, either through the access info or otherwise) and return False instead of trying to work it out.

@sergisiso I think I could probably do most of this for next_access, but I'm not sure whether there is an easy way to tell if a node is inside an else if, an else or an if, for example in a case like this (this is how we have the if/elseif behaviour in PSyclone):

    a = 1
    if cond1 then
      a = 2
    else
      if cond2 then
        a = 3
        a = 6
      else
        if cond3 then
          a = 4
        end if
      end if
    end if
    a = 5

We would currently find the a=2 case, and the proposal is that we also need to find the a=3, a=4 and a=5 cases. The trick is knowing that a=3 and a=4 are inside an else if case, so that we continue looking for more "next accesses" that aren't inside the same branch of the if/else structure. We also need different behaviour for else if vs else (and all cases are complicated by the possibility that one branch doesn't contain a next access).
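A minimal sketch of the "all directly reachable next accesses" idea for the nested if example above, on a toy statement tree rather than PSyIR. The `Assign` and `If` classes and the function names are hypothetical, loops are not handled, and conditions are assumed not to access the variable. The block is walked backwards, carrying the set of accesses reachable after the current point.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assign:
    """A write to the variable of interest, identified by a label."""
    label: str


@dataclass
class If:
    """An if block; an empty else_body models a missing else branch."""
    then_body: List[object]
    else_body: List[object] = field(default_factory=list)


def _entry(block, after, result):
    """Return the accesses reachable on entering block.

    after holds the accesses reachable once block has finished.
    """
    reachable = after
    for stmt in reversed(block):
        if isinstance(stmt, Assign):
            # Whatever is reachable after this statement is its next access.
            result[stmt.label] = reachable
            reachable = frozenset({stmt.label})
        else:
            # Either branch may run; an empty branch falls straight through.
            reachable = (_entry(stmt.then_body, reachable, result)
                         | _entry(stmt.else_body, reachable, result))
    return reachable


def next_accesses(block):
    """Map each assignment label to the set of possible next accesses."""
    result = {}
    _entry(block, frozenset(), result)
    return result


program = [
    Assign("a=1"),
    If([Assign("a=2")],
       [If([Assign("a=3"), Assign("a=6")],
           [If([Assign("a=4")])])]),
    Assign("a=5"),
]
result = next_accesses(program)
```

For `a=1` this yields `a=2`, `a=3`, `a=4` and `a=5`: the last one is reachable only because the innermost if has no else branch, which is the "one branch doesn't contain a next access" complication.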

sergisiso · 2024-05-16T09:45:43Z

src/psyclone/psyir/transformations/scalarization_trans.py

+            # If we do and the next_access has an ancestor IfBlock
+            # that isn't an ancestor of the loop then its not valid since
+            # we aren't tracking down what the condition-dependent next
+            # use really is.
+            if_ancestor = next_access.ancestor(IfBlock)
+
+            # If abs_position of if_ancestor is > node.abs_position
+            # its not an ancestor of us.
+            if (if_ancestor is not None and
+                    if_ancestor.abs_position > node.abs_position):
+                # Not a valid next_access pattern.
+                continue
+
+            # If next access is the LHS of an assignment, we need to
+            # check that it doesn't also appear on the RHS. If so its
+            # not a valid access
+            # I'm not sure this code is reachable
+#            if (isinstance(next_access.parent, Assignment) and
+#                next_access.parent.lhs is next_access and
+#                (next_access.next_access() is not None and
+#                 next_access.next_access().ancestor(Assignment) is
+#                 next_access.parent)):
+#                continue
+
+            # If next access is the RHS of an assignment then we need to
+            # skip it
+            ancestor_assign = next_access.ancestor(Assignment)
+            if (ancestor_assign is not None and
+                    ancestor_assign.lhs is not next_access):
+                continue
+
+            # If it has an ancestor that is a CodeBlock or Call or Kern
+            # then we can't guarantee anything, so we remove it.
+            if (next_access.ancestor((CodeBlock, Call, Kern))
+                    is not None):
+                continue


I find it hard to see what each of these is checking (without going to the tests). Could you add a short code snippet showing the dependency that each one is looking for, including the commented one?

The commented one is removed because it's not reachable, given how the next_access/variable access info gives results for assignments.
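To summarise what the remaining checks in the excerpt appear to decide, here is a toy restatement that does not use PSyIR. `ToyNextAccess` and its fields are hypothetical, and the sketch assumes that an access surviving every check is accepted, which the excerpt does not show.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyNextAccess:
    """Hypothetical description of the first access after the loop."""
    is_write: bool
    in_if_not_enclosing_loop: bool = False
    in_codeblock_call_or_kern: bool = False


def unused_after_loop(next_access: Optional[ToyNextAccess]) -> bool:
    """Return True if the array's loop values are never needed afterwards."""
    if next_access is None:
        # No later access at all, e.g. the array is only used in the loop.
        return True
    if next_access.in_if_not_enclosing_loop:
        # e.g. "if (cond) then; x = arr(1)": the use is condition-dependent.
        return False
    if not next_access.is_write:
        # e.g. "x = arr(1)": the value written in the loop is read.
        return False
    if next_access.in_codeblock_call_or_kern:
        # e.g. "call sub(arr)": nothing can be guaranteed.
        return False
    # e.g. "arr(1) = 0.0": the old values are overwritten before any read.
    return True


checks = [
    unused_after_loop(None),
    unused_after_loop(ToyNextAccess(is_write=True)),
    unused_after_loop(ToyNextAccess(is_write=False)),
    unused_after_loop(ToyNextAccess(is_write=True, in_if_not_enclosing_loop=True)),
    unused_after_loop(ToyNextAccess(is_write=True, in_codeblock_call_or_kern=True)),
]
```

Each early return corresponds to one `continue` in the excerpt, in the same order.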

LonelyCat124 · 2024-05-20T14:49:25Z

@sergisiso I've addressed most of the comments here. I think the next_access one is a bigger issue than just this PR (and we need to decide whether the variable access tooling should handle this or whether we just do it in next_access, but I think we can do an implementation for this in the meantime anyway).

I need to still expand the tests for the apply routine and add documentation before another review though I think.

sergisiso · 2024-05-28T11:05:22Z

@LonelyCat124 I don't mind the order of these (if you put the associated TODOs), but wouldn't want to go very far with some logic that then will need to be deleted.

LonelyCat124 added 3 commits April 29, 2024 15:53

First code towards #2499

59d9875

forgotten file

ede95c7

First implementation of scalarization_trans and unit tests for the pr…

6ae3ae3

…ivate functions

LonelyCat124 requested a review from sergisiso April 30, 2024 14:52

LonelyCat124 self-assigned this Apr 30, 2024

LonelyCat124 added 2 commits April 30, 2024 15:56

Linting errors

c7cf88c

Linting errors

577d702

Final commits ready for a review

d76bfb6

LonelyCat124 added the ready for review label May 1, 2024

LonelyCat124 added 2 commits May 1, 2024 14:16

Added compile test

2523a6b

linting

fece2bd

sergisiso added under review and removed ready for review labels May 14, 2024

sergisiso requested changes May 16, 2024

View reviewed changes

sergisiso added reviewed with actions and removed under review labels May 16, 2024

Changes to use filter to make the code easier to understand

ed75417
