AI_Paper_Presentation

Transcript
  • 1. Unsupervised Relation Extraction using Dependency Trees for Automatic Generation of Multiple-Choice Questions
      Naveed Afzal¹, Ruslan Mitkov¹, Atefeh Farzindar²
      ¹University of Wolverhampton, Wolverhampton, UK
      ²NLP Technologies Inc., Canada
  • 2. Outline
      ◦ Introduction
      ◦ Automatic generation of MCQs
      ◦ Our Approach
      ◦ NER
      ◦ Extraction of Candidate Patterns
      ◦ Patterns Ranking
      ◦ Evaluation
      ◦ Results
      ◦ Conclusion
      ◦ Future work
  • 3. Multiple-Choice Questions (MCQs)
      ◦ MCQs are a popular large-scale assessment tool
      ◦ MCQs are straightforward to conduct, instantly provide an effective measure of test-takers' performance, and feed test results back to the learner
      ◦ An MCQ consists of a question, the correct answer and a list of distractors
  • 4. Automatic Generation of MCQs
      ◦ Work in this area does not have a long history
      ◦ Most previous approaches rely on the syntactic structure of a sentence
      ◦ Our approach generates MCQs automatically by employing Information Extraction (IE)
      ◦ Semantic relations lead to better-quality questions
  • 5. Automatic Generation of MCQ
      [Figure: system architecture diagram. Labelled components: Unannotated corpus, Named Entity Recognition, Extraction of Candidate Patterns, Patterns Ranking, Evaluation, Semantic Relations, Rules, Question Generation, Distributional Similarity, Distractors Generation, Output (MCQ).]
  • 6. Our Approach
      ◦ In this paper, we focus only on the extraction of semantic relations
      ◦ Unsupervised dependency-based approach
      ◦ Dependency trees are a suitable basis for acquiring semantic patterns, as they abstract away from the surface structure to represent relations between elements of a sentence
      ◦ Our approach can cover a potentially unrestricted range of semantic relations
  • 7. Our Approach
      ◦ We assume that a semantic relation holds between NEs mentioned in the same sentence
      ◦ Our approach is suitable where a lot of unannotated text is available, as it does not require manually annotated text or seeds
      ◦ Other approaches (supervised and semi-supervised) require seed patterns in order to learn similar patterns exemplified by the seeds
  • 8. Named Entity Recognition (NER)
      GENIA NER is used to recognise the following 5 main Named Entities:

      Entity Type   Precision   Recall   F-score
      Protein       65.82       81.41    72.79
      DNA           65.64       66.76    66.20
      RNA           60.45       68.64    64.29
      Cell Line     56.12       59.60    57.81
      Cell Type     78.51       70.54    74.31
      Overall       67.45       75.78    71.37
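
As a quick consistency check on the table above, the F-score column can be recomputed from the precision and recall columns. The sketch below assumes the reported F-score is the balanced F-score (the harmonic mean of precision and recall); the slide does not state this, and the function name `f_score` is introduced here only for illustration.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall (assumed)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score), copied from the slide's table.
reported = {
    "Protein": (65.82, 81.41, 72.79),
    "DNA": (65.64, 66.76, 66.20),
    "RNA": (60.45, 68.64, 64.29),
    "Cell Line": (56.12, 59.60, 57.81),
    "Cell Type": (78.51, 70.54, 74.31),
    "Overall": (67.45, 75.78, 71.37),
}

for entity, (p, r, f) in reported.items():
    print(f"{entity:<10} computed={f_score(p, r):.2f} reported={f:.2f}")
```

Under that assumption, every recomputed value agrees with the reported one to two decimal places.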
  • 9. Extraction of Candidate Patterns
      ◦ After NER, the next step is the extraction of candidate patterns, which consists of two main stages:
          - the construction of potential patterns from an unannotated domain corpus
          - their relevance ranking
      ◦ The linked chain pattern model is used: it combines pairs of chains in a dependency tree that share a common verb root but no direct descendants
  • 10. Extraction of Candidate Patterns
      ◦ Every NE is treated as a chain in the dependency tree if it is fewer than 5 dependencies away from the verb root and the words linking the NE to the verb root are content words (verbs, nouns, adverbs and adjectives) or prepositions
      ◦ Only those chains in the dependency tree of a sentence that contain NEs are considered
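
The chain-selection rule above can be sketched as a small filter over a dependency tree. This is a minimal illustration, not the authors' implementation: the `Token` class, the `chain_for` function, the part-of-speech tag names and the head-index tree encoding are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Content-word categories plus prepositions, as listed on the slide.
# The tag names themselves are hypothetical.
ALLOWED_LINK_POS = {"VERB", "NOUN", "ADV", "ADJ", "PREP"}
MAX_DEPENDENCIES = 5  # an NE must be fewer than 5 dependencies from the root


@dataclass
class Token:
    word: str
    pos: str
    head: Optional[int]  # index of the governing token; None for the root
    is_ne: bool = False


def chain_for(tokens: List[Token], ne_index: int) -> Optional[List[int]]:
    """Return token indices from the NE up to the verb root, or None if rejected."""
    if not tokens[ne_index].is_ne:
        return None
    chain = [ne_index]
    # Walk up the tree until the root is reached or the chain gets too long.
    while tokens[chain[-1]].head is not None:
        chain.append(tokens[chain[-1]].head)
        if len(chain) - 1 >= MAX_DEPENDENCIES:
            return None
    if len(chain) == 1 or tokens[chain[-1]].pos != "VERB":
        return None
    # Every word strictly between the NE and the root must be an allowed link.
    if any(tokens[i].pos not in ALLOWED_LINK_POS for i in chain[1:-1]):
        return None
    return chain


# Hypothetical sentence: "PROTEIN activates PROTEIN in CELL_TYPE"
sentence = [
    Token("PROTEIN", "NOUN", 1, is_ne=True),
    Token("activate", "VERB", None),
    Token("PROTEIN", "NOUN", 1, is_ne=True),
    Token("in", "PREP", 1),
    Token("CELL_TYPE", "NOUN", 3, is_ne=True),
]
chains = [c for i in range(len(sentence)) if (c := chain_for(sentence, i))]
print(chains)
```

Only tokens marked as NEs produce chains, which mirrors the second bullet: chains without an NE are never considered.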
  • 11. Examples of dependency patterns
      ◦ Frequency 34:
        <NE ID="0" func="SUBJ" Dep="1"> "DNA" </NE> <W ID="1" func="+FMAINV" Dep="none">"contain"</W> <NE ID="2" func="OBJ" Dep="1"> "DNA" </NE>
      ◦ Frequency 32:
        <NE ID="0" func="SUBJ" Dep="1"> "PROTEIN" </NE> <W ID="1" func="+FMAINV" Dep="none">"activate"</W> <NE ID="2" func="OBJ" Dep="1"> "PROTEIN" </NE>
      ◦ Frequency 19:
        <NE ID="0" func="SUBJ" Dep="1"> "PROTEIN" </NE> <W ID="1" func="+FMAINV" Dep="none">"contain"</W> <NE ID="2" func="OBJ" Dep="1"> "PROTEIN" </NE>
      ◦ Frequency 19:
        <NE ID="0" func="SUBJ" Dep="2"> "PROTEIN" </NE> <NE ID="1" func="APP" Dep="0">"PROTEIN" </NE> <W ID="2" func="+FMAINV" Dep="none">"induce"</W>
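
The pattern markup shown on this slide can be read into a simple list of nodes with Python's standard XML parser. The sketch below assumes that `Dep` holds the `ID` of the governing element and that `Dep="none"` marks the verb root; the slide does not spell this out, although the examples are consistent with it. The function name `parse_pattern` and the wrapping `<pattern>` element are introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# One of the patterns from the slide, copied verbatim.
pattern = (
    '<NE ID="0" func="SUBJ" Dep="1"> "PROTEIN" </NE> '
    '<W ID="1" func="+FMAINV" Dep="none">"activate"</W> '
    '<NE ID="2" func="OBJ" Dep="1"> "PROTEIN" </NE>'
)


def parse_pattern(markup: str) -> list:
    """Parse pattern markup into a list of node dictionaries."""
    # Wrap the fragment in a single root element so that it is well-formed XML.
    root = ET.fromstring(f"<pattern>{markup}</pattern>")
    nodes = []
    for element in root:
        dep = element.get("Dep")
        nodes.append({
            "id": int(element.get("ID")),
            "kind": element.tag,  # "NE" for named entities, "W" for words
            "func": element.get("func"),
            "head": None if dep == "none" else int(dep),  # assumed meaning of Dep
            "label": element.text.strip().strip('"'),
        })
    return nodes


for node in parse_pattern(pattern):
    print(node)
```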
  • 12. Patterns Ranking
      ◦ Patterns are ranked according to their significance in the domain corpus
      ◦ A general corpus (the BNC) is used for comparison
      ◦ The ranking measures the strength of association of a pattern with the domain corpus as opposed to the general corpus
  • 13. Patterns Ranking
      ◦ The patterns are ranked using the following ranking methods:
          - Information-theoretic concepts: Information Gain, Information Gain Ratio, Mutual Information, Normalised Mutual Information
          - Statistical tests: Log-likelihood, Chi-Square
          - Meta-ranking
      ◦ The patterns, along with the scores obtained from the above ranking methods, are stored in a database
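
To make the two statistical tests concrete, the sketch below scores a pattern from a 2×2 contingency table of pattern counts in the domain and general corpora. The slides do not give the exact formulas or counting units used by the authors, so this uses the textbook Pearson chi-square and log-likelihood ratio (G²) statistics; the function names and all counts are hypothetical.

```python
import math


def contingency(freq_domain, size_domain, freq_general, size_general):
    """Build the 2x2 table: (a, b) pattern present, (c, d) pattern absent."""
    a = freq_domain                  # pattern occurrences in the domain corpus
    b = freq_general                 # pattern occurrences in the general corpus
    c = size_domain - freq_domain    # other patterns in the domain corpus
    d = size_general - freq_general  # other patterns in the general corpus
    return a, b, c, d


def chi_square(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


def log_likelihood(a, b, c, d):
    """Log-likelihood ratio (G^2) statistic for a 2x2 table."""
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [
        (a + b) * (a + c) / n,
        (a + b) * (b + d) / n,
        (c + d) * (a + c) / n,
        (c + d) * (b + d) / n,
    ]
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


# Hypothetical counts: a pattern seen 34 times among 1,000 domain patterns
# and once among 10,000 general-corpus patterns.
table = contingency(34, 1000, 1, 10000)
print(chi_square(*table), log_likelihood(*table))
```

Both statistics are symmetric: a pattern that is over-represented in the general corpus also scores highly, so a practical ranking would additionally check the direction of the association.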
  • 14. Patterns Evaluation
      ◦ The GENIA EVENT Annotation corpus, which consists of 9,372 sentences, is used for evaluation
      ◦ Patterns are evaluated in two ways:
          - Rank-thresholding: take the top-ranked patterns (e.g. the top 100 or 200)
          - Score-thresholding: consider only patterns whose scores are above a certain threshold
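
The two evaluation schemes differ only in how the candidate set is cut. The sketch below illustrates both, with precision computed against a set of gold patterns. The patterns, scores and gold set are invented for the example, and simple set membership stands in for the matching against GENIA EVENT annotations, whose details the slides do not describe.

```python
def rank_threshold(scored, top_n):
    """Keep the top_n highest-scoring patterns."""
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    return [pattern for pattern, _ in ranked[:top_n]]


def score_threshold(scored, min_score):
    """Keep every pattern whose score is strictly above min_score."""
    return [pattern for pattern, score in scored if score > min_score]


def precision(selected, gold):
    """Fraction of selected patterns that are also in the gold set."""
    if not selected:
        return 0.0
    return sum(1 for pattern in selected if pattern in gold) / len(selected)


# Hypothetical (pattern, score) pairs and gold set, for illustration only.
scored = [
    ("PROTEIN activate PROTEIN", 0.62),
    ("DNA contain DNA", 0.41),
    ("PROTEIN be PROTEIN", 0.09),
    ("CELL_TYPE have DNA", 0.05),
]
gold = {"PROTEIN activate PROTEIN", "DNA contain DNA"}

print(precision(rank_threshold(scored, 3), gold))     # top 3 patterns
print(precision(score_threshold(scored, 0.1), gold))  # scores above 0.1
```

In this toy data, the fixed-size cut admits one low-scoring pattern that is not in the gold set, while the score cut excludes it. This shows how a score threshold can trade coverage for precision.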
  • 15. Results
      ◦ The score-thresholding method performs better than the rank-thresholding method
      ◦ The score-thresholding method is able to achieve 100% precision
      ◦ High precision is important for MCQ generation, as MCQ applications rely on the production of good questions rather than the production of all possible questions
      ◦ A t-test revealed no statistically significant difference between Information Gain (IG) and Information Gain Ratio (IGR)
  • 16. Best performing ranking methods
      ◦ Chi-Square (CHI) and Normalised Mutual Information (NMI)
      [Figure: precision scores (y-axis, 0 to 1) of CHI and NMI at score-threshold values >0.08, >0.09, >0.1, >0.2, >0.3, >0.4 and >0.5 (x-axis).]
  • 17. Conclusion
      ◦ Presented an unsupervised approach to relation extraction (RE) from dependency trees, intended to be deployed in an e-Learning system for the automatic generation of MCQs by employing semantic patterns
      ◦ Explored different ranking methods and found that the CHI and NMI ranking methods obtained higher precision
      ◦ Employed two techniques, rank-thresholding and score-thresholding, and found that score-thresholding performs better
  • 18. Future Work
      ◦ In the future, we plan to employ these semantic relations for automatic MCQ generation, where they will be used to find relations and NEs in educational texts that are important for testing students' familiarity with key facts contained in the texts
  • 19. Questions