TUWHERA Open Theses & Dissertations
AUT University

Information extraction from free text comments in questionnaires

Ramachandran, Kartik
Permanent link
http://hdl.handle.net/10292/11141
Abstract
The last 15 years have seen an explosion in the amount of available information, encoded both in structured forms such as databases and XML files and in free, naturally occurring forms such as HTML pages and Word documents. The availability of this free text has created a need for automated text processing tools that can extract information quickly and effectively.

This research investigated the extraction of information from free-text responses to open-ended questions in questionnaires. It set out to develop a framework for analyzing open-question responses and extracting structured information, in particular the sentiment expressed in each response. That information can then be combined with the closed-question responses to produce a more informative survey report.

Specifically, this research helps in understanding whether a respondent’s answers are positive or negative in nature. It does so through software tools built with the Natural Language Toolkit (NLTK), using data mining and Natural Language Processing techniques, and it helps surveyors (health centers, doctors, data analysts) obtain additional information from surveys. The thesis also discusses existing sentiment analysis solutions, the components of a sentiment analysis system, and approaches to building a Natural Language Processing tool, which may interest future developers of such systems.

This research successfully classified free-text responses as positive or negative. Although more time to fine-tune the application and to perform further training and testing would have been useful, the results obtained are promising. We developed a platform that can be used to generate a custom corpus and that gives interested developers a starting framework for building sentiment analysis tools.
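To make the classification task described in the abstract concrete, here is a minimal sketch of labelling a free-text response as positive or negative. It is not the thesis's implementation. The abstract does not say which classifier or features were used, so the multinomial Naive Bayes model, the regex tokenizer, the function names (`tokenize`, `train`, `classify`), and the six training sentences are all assumptions made for illustration. The sketch uses only the Python standard library so that it runs without downloads; NLTK offers a comparable classifier as `nltk.NaiveBayesClassifier`.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labelled responses standing in for a custom corpus.
# These are invented for illustration and are not data from the thesis.
TRAINING = [
    ("The staff were friendly and very helpful", "positive"),
    ("Excellent care and a quick appointment", "positive"),
    ("The doctor was kind and explained everything clearly", "positive"),
    ("Waiting time was terrible and the staff were rude", "negative"),
    ("Very poor service and nobody listened", "negative"),
    ("The appointment was late and the visit felt rushed", "negative"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per label and responses per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("The nurse was friendly and helpful", model))  # positive
print(classify("Rude staff and a terrible wait", model))      # negative
```

With so few training sentences the labels depend heavily on individual words, so a real survey tool would need a much larger labelled corpus and held-out test data, as the abstract itself notes.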
Keywords
Natural Language Processing; Data mining; Information extraction; NLTK
Date
2018
Item Type
Thesis
Supervisor(s)
Tegginmath, Shoba; Nand, Parma
Degree Name
Master of Computer and Information Sciences
Publisher
Auckland University of Technology

Hosted by Tuwhera, an initiative of the Auckland University of Technology Library

 

 
