
I am currently learning how to build and implement a CNN (AlexNet) + LSTM model to predict on video, but I got stuck at the prediction step.

When I try to predict, I get this error:

ValueError: Input 0 is incompatible with layer model_1: expected shape=(None, 10, 384, 384, 3), found shape=(1, 270, 480)

I know my width and height are different, but how do I add the timesteps (10) to the prediction input so that it matches my model?

Here is my code:

import time

import cv2
import numpy as np
from tensorflow import keras

# `classes`, `label_color`, `draw_box` and `draw_caption` come from my
# detection utilities (not shown here).

model_path = 'CCTV_10Frame_SGD_Model_1e4_b16_l21e2_Terbaru.h5'
model = keras.models.load_model(model_path, compile=True)

vid = cv2.VideoCapture('Data16_116.mp4')

font = cv2.FONT_HERSHEY_SIMPLEX
prev_frame_time = 0
total_frame = 0
while vid.isOpened():
    ret, frame = vid.read()
    if not ret:
        break
    total_frame += 1
    draw = frame.copy()
    draw = cv2.cvtColor(draw, cv2.COLOR_BGR2GRAY)

    scale_percent = 25  # percent of original size
    scale = scale_percent / 100
    width = int(frame.shape[1] * scale)
    height = int(frame.shape[0] * scale)
    dim = (width, height)
    frame_set = cv2.resize(draw, dim, interpolation=cv2.INTER_AREA)

    # This call raises the ValueError: the model expects
    # (batch, timesteps, height, width, channels) = (None, 10, 384, 384, 3),
    # but a single resized grayscale frame gives (1, 270, 480).
    boxes, scores, labels = model.predict_on_batch(
        np.expand_dims(frame_set, axis=0))
    boxes /= scale

    start = time.time()
    for i_iterate, (box, score, label) in enumerate(
            zip(boxes[0], scores[0], labels[0])):
        if score < 0.5 or i_iterate > 0:
            break

        fps = 1 / (start - prev_frame_time)
        prev_frame_time = start

        cv2.putText(draw, "%.2f" % fps, (7, 70), font,
                    1, (100, 255, 0), 3, cv2.LINE_AA)

        color = label_color(label)
        b = box.astype(int)
        draw_box(draw, b, color=color)
        caption = "{} {:.3f}".format(classes[label], score)
        draw_caption(draw, b, caption)
        print("=================================")
        print("[INFO] Score : ", score)
        print("[INFO] Label : ", classes[label])

    print("=================================")
    cv2.imshow('Result', draw)
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break

vid.release()
cv2.destroyAllWindows()

I hope anyone with experience in this can help me.

Thank you so much!

1 Answer

In my opinion, what you can do is keep a buffer: for the first 10 frames, just append them; from the 11th frame onward, pop from the start and append to the end.

That way you always have a sliding window of 10 frames from which the model predicts the next frame.
I think the model is trained to predict one frame from the past 10 frames.
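The sliding window described above can be sketched roughly like this. The 10-timestep window and the 384×384×3 frame size are taken from the error message in the question; everything else (synthetic frames instead of `cv2` reads, the commented-out predict call) is just to keep the sketch self-contained:

```python
from collections import deque

import numpy as np

TIMESTEPS = 10          # timesteps the model was trained with
H, W, C = 384, 384, 3   # frame size from the error message

# deque with maxlen drops the oldest frame automatically once full,
# giving the pop-from-start / append-to-end behaviour described above.
buffer = deque(maxlen=TIMESTEPS)

def make_batch(window):
    """Stack the window into the (1, timesteps, H, W, C) shape the model expects."""
    assert len(window) == TIMESTEPS, "need a full window before predicting"
    return np.expand_dims(np.stack(window, axis=0), axis=0)

# Simulate reading 12 frames; in the real loop each frame would be
# cv2.resize(frame, (W, H)) converted to 3 channels first.
for i in range(12):
    frame = np.zeros((H, W, C), dtype=np.float32)
    buffer.append(frame)
    if len(buffer) == TIMESTEPS:
        batch = make_batch(buffer)
        # prediction = model.predict_on_batch(batch)  # model from the question

print(batch.shape)  # (1, 10, 384, 384, 3)
```

Prediction only starts once 10 real frames have been seen; after that, every new frame slides the window forward by one.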

Also, if you want to predict from just the first frame, check what the model's input looked like during training in the single-frame case. It may be something like 9 frames filled with some arbitrary constant value, followed by the 10th frame with the actual values.
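If the training setup really did pad short windows with a constant, the padding could look like the sketch below. `PAD_VALUE` here is a hypothetical placeholder; the real value has to match whatever was used during training:

```python
import numpy as np

TIMESTEPS = 10
H, W, C = 384, 384, 3
PAD_VALUE = -99.0  # hypothetical filler; must match the training setup

def pad_window(frames):
    """Left-pad a short list of frames with constant frames up to TIMESTEPS."""
    n_missing = TIMESTEPS - len(frames)
    pad = [np.full((H, W, C), PAD_VALUE, dtype=np.float32)] * n_missing
    # Constant frames first, then the real frames, stacked into
    # the (1, timesteps, H, W, C) shape the model expects.
    return np.expand_dims(np.stack(pad + list(frames), axis=0), axis=0)

first_frame = np.zeros((H, W, C), dtype=np.float32)
batch = pad_window([first_frame])
print(batch.shape)  # (1, 10, 384, 384, 3)
```

This gives a full 10-timestep input even when only one real frame is available, but it only makes sense if the model saw the same kind of padding during training.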


2 Comments

Yes, but the predict call still asks for the same input shape. My model's input shape is (batch, timesteps, w, h, channels), where timesteps is the number of frames used per prediction, but I still don't understand how to give my test video the same input shape as my model.
For that, you have to look at the model's source and training code. It is highly likely that your model was trained to accept 9 timesteps of a filler value (e.g. every value at those timesteps being -99 or similar) and 1 timestep holding the first frame. If that helped, then kindly upvote.
