Object Tracking using Convolutional and Recurrent Neural Network
1 online resource (52 pages) : PDF
University of North Carolina at Charlotte
In this master's thesis, a recurrent neural network based method for visual tracking in videos is introduced that learns to predict the bounding box location of a target object at every frame. Region information and distinctive visual features are obtained by applying a Convolutional Neural Network to each frame of the video. Our Recurrent Neural Network (LSTM) exploits this history of locations along with the high-level visual features learned by the deep neural network. In order to increase tracking accuracy and reduce computation cost, a novel approach is proposed to construct a larger LSTM network, which we call the Sparsely stacked LSTM (S2LSTM). The promise of S2LSTM is to offer a systematic solution for scaling LSTM networks to capture longer and more complex sequences compared to mainstream LSTM designs. S2LSTM is scalable and contains discrete, non-overlapping training stacks, offering a modular design for building complex LSTM networks. S2LSTM offers a discrete training mechanism that significantly helps grow network complexity without retraining the entire network. The key significance of S2LSTM is the addition of a time pooling module across stacked LSTM layers. It reduces the number of time steps propagating from the first LSTM to the second LSTM by filtering out the "intermediate outputs" across the stacked layers. In S2LSTM, the output of each LSTM stack is compared with its respective ground truth and trained as a separate paradigm. At the same time, it is less computationally intensive than a regular stacked LSTM. Our experiments on video data demonstrate that S2LSTM increases tracking overlap accuracy by 15% compared to the baseline ROLO implementation.
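To make the described idea concrete, the following is a minimal sketch (not the thesis code) of how time pooling between two LSTM stacks could look in PyTorch; the module and parameter names (S2LSTMSketch, pool_stride, head1, head2) and the feature dimensions are assumptions for illustration only.

    # Illustrative sketch of the time-pooling idea between stacked LSTMs.
    # All names and dimensions here are hypothetical, not the author's code.
    import torch
    import torch.nn as nn

    class S2LSTMSketch(nn.Module):
        def __init__(self, feat_dim=4096, hidden_dim=512, pool_stride=4):
            super().__init__()
            # First stack sees every frame's CNN feature plus the previous box.
            self.lstm1 = nn.LSTM(feat_dim + 4, hidden_dim, batch_first=True)
            self.head1 = nn.Linear(hidden_dim, 4)   # per-frame box from stack 1
            # Second stack only sees every pool_stride-th output of stack 1,
            # i.e. the "intermediate outputs" in between are filtered out.
            self.pool_stride = pool_stride
            self.lstm2 = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
            self.head2 = nn.Linear(hidden_dim, 4)   # box from stack 2

        def forward(self, cnn_feats, prev_boxes):
            # cnn_feats: (batch, T, feat_dim); prev_boxes: (batch, T, 4)
            x = torch.cat([cnn_feats, prev_boxes], dim=-1)
            h1, _ = self.lstm1(x)                    # (batch, T, hidden_dim)
            boxes1 = self.head1(h1)                  # compared with ground truth
            pooled = h1[:, ::self.pool_stride, :]    # time pooling: keep every k-th step
            h2, _ = self.lstm2(pooled)               # shorter sequence for stack 2
            boxes2 = self.head2(h2)                  # trained against its own targets
            return boxes1, boxes2

Because each stack has its own output head, the stacks can in principle be trained one at a time against ground-truth boxes, which is one way to read the "discrete non-overlapping training stacks" described above.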
Tabkhi, Dr. Hamed
Shin, Dr. Min
Lee, Dr. Minwoo
Shaikh, Dr. Samira
Thesis (M.S.)--University of North Carolina at Charlotte, 2018.
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). For additional information, see http://rightsstatements.org/page/InC/1.0/.
Copyright is held by the author unless otherwise indicated.