Understanding what makes an image memorable or forgettable is an interesting problem in computer vision and cognitive science. Early studies have demonstrated that memorability is an inherent property of an image and have identified many high-level image properties (such as emotions, saliency, object statistics, popularity, and aesthetics) that influence it. This paper examines the relationship between image memorability and two image features, motion and depth, which, to the best of our knowledge, remains unexplored. Experimental analysis reveals that motion and depth cues have a positive influence on image memorability. Further, the paper presents a novel deep learning model, FOD-MemNet, which exploits motion and depth cues along with object features to predict image memorability. Experimental results demonstrate that the proposed FOD-MemNet model outperforms the current state-of-the-art model, achieving a rank correlation of 0.655, which is close to human consistency (ρ = 0.68).
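
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a rank correlation between predicted and ground-truth memorability scores might be computed. It assumes the metric is Spearman's ρ, which the notation above suggests but the abstract does not state explicitly. The function names and the five scores are hypothetical and purely illustrative; they are not taken from the paper or its data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(predicted, ground_truth):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(predicted) != len(ground_truth) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(predicted), average_ranks(ground_truth)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical memorability scores for five images (illustrative values only).
predicted = [0.81, 0.62, 0.75, 0.40, 0.93]
ground_truth = [0.78, 0.70, 0.66, 0.45, 0.90]
rho = spearman_rho(predicted, ground_truth)
```

For these made-up scores, only two images swap adjacent ranks, so `rho` comes out at approximately 0.9. A value of 1 would indicate that the model orders the images exactly as human observers do.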