Abstract (English)
In recent years, question answering systems, which aim to give users a final answer to their questions, have attracted considerable attention. In these systems, a user submits a question in natural language and receives a short, accurate answer in the shortest possible time. Some question answering systems use a knowledge base as the data source for answering users' questions. These systems face two categories of questions: simple and complex. Most studies have addressed simple questions, which can be answered by finding a single relation in the knowledge base. In the real world, however, questions are often more complex and require finding multiple relations in the knowledge base. Such complex questions are called multi-step (multi-hop) questions, and multi-step knowledge base question answering systems have been developed to answer them. In these systems, finding the answer requires multi-step reasoning from the topic entity of the question to the expected answer in the knowledge base. There are three approaches to answering multi-step questions over a knowledge base: semantic parsing, information retrieval, and deep learning (knowledge graph embedding). Research in this field has focused mainly on English. For Persian, very little research has been done on complex knowledge base question answering, and the existing work relies on semantic parsing methods. This research presents a hybrid approach that combines information retrieval and knowledge graph embedding to answer multi-step Persian questions. Combining these two approaches removes the dependence on syntactic and semantic rules. To this end, the multi-step question answering component obtains the answer to the user's question from the knowledge graph embedding space.
Although the proposed method can answer many multi-step questions, the incompleteness of relations in the knowledge graph can cause an answer not to be found even though the answer entity exists in the knowledge graph. This problem is even more significant for low-resource languages such as Persian. To address it, this research uses a knowledge graph completion component. Because the knowledge graph embedding contains embeddings of both relations and entities, it is used to extract the closest relations that can lead to the answer. In addition, when an entity has no relation in the knowledge graph, the embedding space is used to create a new relation (triple). The proposed method was evaluated on the MetaQA dataset translated into Persian. In the experiments, it achieved 73.66% on the Hits@1 metric for answering multi-step Persian questions over a knowledge graph. This result is a 10.91% improvement over previous Persian studies.
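
For reference, Hits@1 is commonly defined as the fraction of questions whose top-ranked candidate answer is a correct answer. The following is a minimal Python sketch of that metric, assuming each question comes with a ranked list of candidate entities and a set of gold answers. The function name `hits_at_k` and the toy data are illustrative assumptions and are not taken from this research.

```python
from typing import Sequence, Set


def hits_at_k(
    ranked_predictions: Sequence[Sequence[str]],
    gold_answers: Sequence[Set[str]],
    k: int = 1,
) -> float:
    """Fraction of questions with at least one gold answer in the top-k candidates."""
    if len(ranked_predictions) != len(gold_answers):
        raise ValueError("predictions and gold answers must have the same length")
    if not gold_answers:
        return 0.0
    hits = sum(
        1
        for preds, gold in zip(ranked_predictions, gold_answers)
        if any(p in gold for p in preds[:k])
    )
    return hits / len(gold_answers)


# Toy example (hypothetical entities): the first question is answered
# correctly at rank 1, the second is not.
toy_predictions = [["entity_a", "entity_b"], ["entity_c", "entity_d"]]
toy_gold = [{"entity_a"}, {"entity_d"}]
toy_score = hits_at_k(toy_predictions, toy_gold, k=1)
```

With `k=1` only the first candidate of each list is checked, so a correct answer ranked second does not count as a hit.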