
code for extracting additional channel based on baseline++ #9

Merged
11 commits merged from nikkna/extract_addchannel into main on Aug 7, 2023

Conversation

nikk-nikaznan (Collaborator)

No description provided.

@nikk-nikaznan requested a review from sfmig on July 31, 2023, 14:03
@sfmig (Collaborator) left a comment

Great job Nik!

I included some comments that I hope are useful. The two main ones are:

(1) I think there is an alternative way of computing the channels that may be a bit clearer and more memory efficient - basically I think we can do away with the batch parameter (I'm not sure if it is entirely correct to use it actually - more on that in the review).

(2) I think there may also be an issue when computing the differences between frames for the motion channel: as it is now, I think they are computed between frames that are delta frames apart in the list of extracted frames, rather than delta frames apart in the actual video.

I suggested having two separate loops: one for computing the mean_blurred_frame (aka the "background") and the max_abs_blurred_frame, and another one to compute, for each frame selected for labelling, the background-subtracted frame and the motion frame.

A smaller comment is that generally it is nice to include a brief description of the PR in the comment above (just saying what it does, sort of giving an overview). But the size of the PR was perfect! I always struggle with that 😅

Let me know if anything is not clear (or if I understood things wrong), happy to discuss!

bboxes labelling/additional_channels_extraction.py: resolved review comments (outdated)

for vid_file, frame_idx in frame_dict.items():
    if not os.path.exists(vid_file):
        print(f"Video path not found: {vid_file}")
Collaborator:

maybe we can add to the message that in that case the video is skipped
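For example, something like this (just a sketch based on the snippet above; the continue assumes we want to move on to the next video):

for vid_file, frame_idx in frame_dict.items():
    if not os.path.exists(vid_file):
        print(f"Video path not found: {vid_file}. Skipping this video.")
        continue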

bboxes labelling/additional_channels_extraction.py: resolved review comments (outdated)
frame_image = (frame_image * 255).astype(np.uint8)
out_fp = os.path.join(args.out_dir, file_name)
Image.fromarray(frame_image).save(out_fp, quality=95)
del frames
Collaborator:

is the del frames bit needed? if so, maybe it'd be nice to add a comment to clarify why?

Collaborator Author:

It helps with the OOM issue.
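Maybe a comment along these lines then (just a sketch)?

# explicitly free the array of frames for this video before moving on,
# otherwise we can run out of memory (OOM) when processing several long videos
del frames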

return None


def get_frames(args):
Collaborator:

I'm a big fan of almost self-explanatory function names, and I feel maybe this one could be a bit clearer about what it does? Maybe something like compute_additional_channels or compute_stacked_inputs?

Collaborator:

I think there is an alternative way of computing the channels that may be a bit clearer and more memory efficient - basically I think we can do away with the batch parameter.

I'm not sure if it is entirely correct to use it actually. If I am interpreting it correctly, we are computing the mean over chunks of 1000 frames (whereas in the original implementation they compute it over the whole video)... is that correct?

I think there may also be an issue when computing the differences between frames for the motion channel: as it is now, I think they are computed between frames that are delta frames apart in the list of extracted frames, rather than delta frames apart in the actual video.

I would suggest having two loops: one for computing the mean_blurred_frame (aka, the "background") and the max_abs_blurred_frame, and another one to compute the background subtracted frame and the motion frame for each frame selected for labelling.

For the first one (computing the mean and the max_abs arrays), I would do the following (in pseudocode):

import cv2
import numpy as np

# initialise capture for this video
cap = cv2.VideoCapture(vid_file)

# get image size
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# initialise array for mean blurred frame
# (note: numpy arrays are indexed (rows, columns), i.e. (height, width))
mean_blurred_frame = np.zeros((height, width))

# initialise array for max_abs_blurred_frame
max_abs_blurred_frame = np.zeros((height, width))

# loop thru frames in the video
frame_counter = 0
while True:
    # read a frame
    success_frame, frame = cap.read()
    if not success_frame:
        print("Can't receive frame. Exiting ...")
        break

    # apply transformations to that frame
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blurred_frame = cv2.GaussianBlur(frame, (5, 5), 0)

    # accumulate sum in mean_blurred_frame
    mean_blurred_frame += blurred_frame

    # accumulate max absolute values in max_abs_blurred_frame
    # (for each element of the array, keep the one with the max absolute value;
    # see https://numpy.org/doc/stable/reference/generated/numpy.maximum.html)
    max_abs_blurred_frame = np.maximum(max_abs_blurred_frame, np.abs(blurred_frame))

    # update frame counter
    frame_counter += 1

# compute the mean
mean_blurred_frame = mean_blurred_frame / frame_counter

Note that in this loop we go through all the frames in the video. We still compute the average across all frames, but we avoid holding a large 3D array in memory by accumulating the running values in mean_blurred_frame and max_abs_blurred_frame.


# detecting motion by finding the differences between frame
# set the delta : frame[i+delta] - frame[i]
for i, frame_offset in enumerate(range(len(frames) - delta)):
Collaborator:

would it be clearer if the loop went from frame f=delta until f=n_frames, and at each iteration we computed the difference between frame f and frame f-delta?

If I understand correctly it should be equivalent, right? (maybe I'm missing something)
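In other words, something like this (a sketch, assuming blurred_frames holds all the frames of the video):

n_frames = len(blurred_frames)
for f in range(delta, n_frames):
    # difference between frame f and the frame delta frames before it
    motion_frame = np.abs(blurred_frames[f] - blurred_frames[f - delta])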

Collaborator:

I would go about this loop slightly differently. This would be the second loop I was referring to earlier.

In pseudocode, I would do the following:

import cv2
import numpy as np
from PIL import Image

frame_dict = read_json_file(args.json_path)

for vid_file, list_extracted_frame_idcs in frame_dict.items():
    # initialise video capture object
    cap = cv2.VideoCapture(vid_file)

    # for every frame extracted for labelling
    for frame_idx in list_extracted_frame_idcs:

        # read the frame from the video
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame_idx)
        success, frame = cap.read()

        if not success:
            # break, or print a warning and skip this frame
            continue

        # transform the frame
        # (since we do these operations quite a bit, we could have them as separate functions)
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        blurred_frame = cv2.GaussianBlur(frame, (5, 5), 0)

        # -------------------------------
        # compute the background subtracted frame
        # (note that mean_blurred_frame and max_abs_blurred_frame were computed in a previous separate loop)
        background_subtracted_frame = (
            ((blurred_frame - mean_blurred_frame) / max_abs_blurred_frame) + 1
        ) / 2

        # -------------------------------
        # compute the motion frame
        # read frame f-delta, the frame delta frames before the current one
        # NOTE: we'd need a way to deal with the case in which frame_idx - delta < 0.
        # For example, we could force it to be at least 0?
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame_idx - delta)
        success_delta, frame_delta = cap.read()

        # transform the frame f-delta
        frame_delta = cv2.cvtColor(frame_delta, cv2.COLOR_BGR2GRAY)
        blurred_frame_delta = cv2.GaussianBlur(frame_delta, (5, 5), 0)

        # compute the motion channel for frame f
        motion_frame = np.abs(blurred_frame - blurred_frame_delta)

        # -------------------------------
        # stack the three channels
        final_frame = np.dstack(
            [
                frame,  # the original grayscaled image
                background_subtracted_frame,
                motion_frame,
            ]
        )

        # save final frame as file
        Image.fromarray(final_frame).save(path, quality=95)

The sections between the dotted lines could be separate functions. Note that we could have a frame selected for labelling with index < delta, and therefore it wouldn't have a frame that is delta frames before it. Maybe in that case we can cap the index? Not sure if there is a better way, let me know your thoughts!
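For example, the capping could look like this (a sketch using the names from the pseudocode above; frame_idx_delta is a made-up name):

# cap the index so we never ask for a negative frame
frame_idx_delta = max(frame_idx - delta, 0)
cap.set(cv2.CAP_PROP_POS_FRAMES, frame_idx_delta)
success_delta, frame_delta = cap.read()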

Collaborator:

OpenCV is a bit odd when you want to fetch a specific frame in a video. Below is a snippet to show how it works (from this notebook):

frame_idx = 25  # to read frame with index 25 (in zero indexed system)

# this sets the index for the frame we will read next
cap.set(cv2.CAP_PROP_POS_FRAMES, frame_idx)

# this reads the frame at idx=25 and updates the index to idx=26
success, frame = cap.read()

# you can check that the index is 26 here
print(cap.get(cv2.CAP_PROP_POS_FRAMES))

# detecting motion by finding the differences between frame
# set the delta : frame[i+delta] - frame[i]
for i, frame_offset in enumerate(range(len(frames) - delta)):
    if (i + (n_frame - batch_size)) in frame_idx:
Collaborator:

As I mentioned above, I think there is an issue here with frame_idx. If I understand correctly, it is the list of frames extracted for labelling. But we want to compute the difference between frames that are separated by delta frames in the original video, not in the subset of extracted frames (right?).

Collaborator Author:

Yes, frame_idx is the list of frames extracted for labelling. The original code does not have that condition there, and saves all the frames. This condition is to make sure we save only the frames we want, as listed in frame_idx. We still compute the difference frame[i+delta] - frame[i]. So, for example:

frame_idx = [270, ..]
if 270 in frame_idx:
    frame_images = np.dstack([
        frames[270],
        blurred_frames[270],
        blurred_frames[270 + delta] - blurred_frames[270],
    ])

Does that make sense?

Collaborator Author:

But I will check again regarding the frame indexing you mentioned above.

Collaborator:

aah ok I see!! Thanks for clarifying (I remember now you mentioned this earlier, sorry I forgot!)
I didn't realise blurred_frames is the array with all the frames, not only the ones selected for labelling.

I think maybe the frame indexing approach is more readable, but happy to discuss!

@sfmig (Collaborator) left a comment

Good stuff!

I made some minor changes, so could you please check everything is still OK?

The changes are mainly:

  • I changed the docstring format from Google to NumPy - we can chat about this if you want
  • I split the larger function computing the stacked inputs into smaller ones

Happy to discuss any thoughts!

"""Function to apply transformation to the frame.
Convert the frame to grayscale and apply Gaussian blurring

Args:
Collaborator:

Is it possible that these docstrings are Google format rather than Numpy format?

See the difference here
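For reference, in NumPy format a docstring like the one above would look roughly like this (a sketch; the function and parameter names are only illustrative):

import cv2


def apply_grayscale_and_blur(frame, kernel_size, sigmax):
    """Convert the frame to grayscale and apply Gaussian blurring.

    Parameters
    ----------
    frame : np.ndarray
        frame to transform, in BGR format
    kernel_size : tuple
        size of the Gaussian kernel, e.g. (5, 5)
    sigmax : float
        standard deviation of the Gaussian kernel in the x direction

    Returns
    -------
    np.ndarray
        grayscaled and blurred frame
    """
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.GaussianBlur(gray, kernel_size, sigmax)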

# compute the mean
mean_blurred_frame = mean_blurred_frame / frame_counter
# save the mean
cv2.imwrite(f"{Path(vid_file).stem}_mean.jpg", mean_blurred_frame)
Collaborator:

nice, fantastic for a sanity check!

I'm thinking that in the future we could save it or not based on a flag - I added it as an enhancement issue #16
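That flag could be something along these lines (just a sketch; the flag name is made up, and it assumes the existing argument parser and the names in the snippet above):

# when defining the CLI arguments
parser.add_argument(
    "--save_background_frame",
    action="store_true",
    help="if passed, save the mean (background) frame as a jpg for sanity checking",
)

# where the mean is currently saved
if args.save_background_frame:
    cv2.imwrite(f"{Path(vid_file).stem}_mean.jpg", mean_blurred_frame)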

bboxes labelling/additional_channels_extraction.py: resolved review comments (outdated)

if not success_delta:
    # Break the loop if no more frames to read
    print(f"Cannot read frame {frame_idx}+{args.delta}. Exiting...")
Collaborator:

I think this is good enough for now - basically we won't label any frame whose motion channel cannot be computed. But we need to make sure we don't label those frames by mistake (since it would be useless) - I added another issue for that: #17

Comment on lines 132 to 134
background_subtracted_frame_delta = (
    ((blurred_frame_delta - mean_blurred_frame) / max_abs_blurred_frame) + 1
) / 2
Collaborator:

since we use this a couple of times, maybe this could also be a separate function that receives the blurred frame, the mean blurred frame and the max abs frame, and outputs the background-subtracted one (with the normalisation, +1 and division by 2)
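A minimal sketch of such a helper (the function name is made up):

import numpy as np


def get_background_subtracted_frame(
    blurred_frame, mean_blurred_frame, max_abs_blurred_frame
):
    """Subtract the mean (background) frame and normalise the result to roughly [0, 1]."""
    return (
        ((blurred_frame - mean_blurred_frame) / max_abs_blurred_frame) + 1
    ) / 2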

(
    mean_blurred_frame,
    max_abs_blurred_frame,
) = compute_mean_and_max_abs_blurred_frame(cap, args.kernel_size, args.sigmax)
Collaborator:

I moved the computation of mean_blurred_frame and max_abs_blurred_frame to their own separate function (to try to make the functions a bit more compact and focused)

import json


def read_json_file(file_path):
Collaborator:

fantastic! ✨
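(For context, this kind of helper can be as small as the following sketch, assuming the JSON maps each video path to a list of extracted frame indices:)

import json


def read_json_file(file_path):
    """Read a JSON file mapping video file paths to lists of frame indices."""
    with open(file_path) as f:
        return json.load(f)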

@nikk-nikaznan merged commit d399c16 into main on Aug 7, 2023
1 check passed
@nikk-nikaznan deleted the nikkna/extract_addchannel branch on August 7, 2023 at 16:19