Authors & Contributors
Anand Bora & Dr. Nicholas S. Flann
Teaser
Welcome to the arena of intelligent systems. Programmers tend to go crazy when they can't find the exact information they are looking for. No need to worry, the solution is here: Semantic Search for Questions Related to Programming...
Introduction
Semantically aware search engines have gained considerable interest in the last few years, but how that vision will actually be fulfilled is still unclear. Whenever a developer fires a search query for information they are trying to find online, the search engine generates a search space of millions of web pages. What we are trying to do is come up with the one web page that satisfies most of the user's search criteria, and for that we need some sort of intelligent system. The proposed search system gave better results than Google search, as judged by students in a CS tutor lab.
Summary
The proposed system was evaluated in a CS tutor lab, and about 81% of the students (80.77%) preferred its results over those of Google search.
Methods
The proposed system accepts the user's search query as input and parses it in Java. Using Natural Language Processing (NLP), it removes stop words, based on a standard stop-word list, and extracts only those keywords that affect the query results. These extracted keywords are passed to the Google AJAX API to generate the search space of web pages.
Each web page is scored on term frequency, precision, and recall: the higher the score, the better the match. The score function maps each web page to a value between -1 and +1. Finally, a simple hill-climbing algorithm selects the web page with the maximum score as the final result.
Because the system targets queries from beginners and early-stage developers, the results are also filtered on additional parameters. A web page with good tutorials, or with examples that include an algorithm, a flowchart, or a basic sample code snippet, is considered more helpful. On the other hand, solutions from websites like stackoverflow.com are considered less helpful because they are written by experts in a very abstract manner. These sites also give several alternative solutions to the same problem, which may confuse beginners.
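The pipeline described above (stop-word removal, scoring, and hill climbing) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `SearchSketch`, the tiny stop-word list, the definitions of term frequency, precision, and recall, and the way they are combined into a [-1, +1] score are all assumptions, since the page does not give the actual formulas. The call to the search API is left out entirely; page texts are passed in as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class SearchSketch {
    // Tiny illustrative stop-word list (assumption: the real system uses a standard list).
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
        "a", "an", "the", "how", "to", "in", "of", "is", "do", "i",
        "what", "for", "and", "with"));

    /** Lowercases the text, splits on characters outside [a-z0-9+#], and drops stop words. */
    static List<String> extractKeywords(String text) {
        List<String> keywords = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9+#]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                keywords.add(token);
            }
        }
        return keywords;
    }

    /**
     * Hypothetical score in [-1, +1]: the mean of term frequency, precision,
     * and recall (each in [0, 1]) rescaled with 2 * mean - 1.
     */
    static double score(String pageText, List<String> keywords) {
        List<String> terms = extractKeywords(pageText);
        if (terms.isEmpty() || keywords.isEmpty()) {
            return -1.0; // nothing to match against
        }
        Set<String> wanted = new HashSet<>(keywords);
        Set<String> distinct = new HashSet<>(terms);
        Set<String> matched = new HashSet<>(distinct);
        matched.retainAll(wanted);
        long hits = terms.stream().filter(wanted::contains).count();

        double termFrequency = (double) hits / terms.size();         // keyword share of page terms
        double precision = (double) matched.size() / distinct.size(); // distinct page terms that are keywords
        double recall = (double) matched.size() / wanted.size();      // keywords found on the page
        double mean = (termFrequency + precision + recall) / 3.0;
        return 2.0 * mean - 1.0;
    }

    /**
     * Simple (first-improvement) hill climbing over the result list:
     * step to an adjacent page while it scores strictly higher.
     */
    static int hillClimb(double[] scores, int start) {
        int current = start;
        while (true) {
            if (current + 1 < scores.length && scores[current + 1] > scores[current]) {
                current++;
            } else if (current > 0 && scores[current - 1] > scores[current]) {
                current--;
            } else {
                return current; // no better neighbour: a local maximum
            }
        }
    }

    public static void main(String[] args) {
        List<String> keywords = extractKeywords("How to reverse a string in Java");
        String[] pages = {
            "cooking pasta recipes",
            "reverse a string with a loop",
            "java string reverse tutorial with sample code"
        };
        double[] scores = new double[pages.length];
        for (int i = 0; i < pages.length; i++) {
            scores[i] = score(pages[i], keywords);
        }
        System.out.println("Keywords: " + keywords);
        System.out.println("Chosen page: " + pages[hillClimb(scores, 0)]);
    }
}
```

Note that hill climbing only guarantees a local maximum: with neighbours defined as adjacent results, it can stop early if a low-scoring page sits between the start and the best page. Restarting from several positions, or simply scanning all scores, would guarantee the global maximum.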
System flowchart
Results
The proposed search system was evaluated in the CS tutor lab, where students enrolled in the CS-1, CS-2, and CS-3 computer science courses come to resolve programming issues they encounter in their academic assignments. When these students searched online for answers to programming questions, they were asked to use this system alongside the existing search engines and then compare the two. They were asked which system was better, what search query they were trying, and which course it was for. The feedback was collected through a Google Form and stored in a database.
The performance of the proposed search system is shown in three pie charts. The first pie chart shows the participation of students in testing the systems: of all the students asked to test the system, 57.69% were from CS-1, 26.92% from CS-2, and 15.38% from CS-3. As the second pie chart shows, 80.77% of these students preferred the proposed search system, while 19.23% said Google search was better. The third pie chart shows the distribution of the students who preferred the proposed search system: 66.67% were from CS-1, 19.05% from CS-2, and 14.29% from CS-3.
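As a quick arithmetic check, the percentages above can be reproduced from whole-number counts. The counts used here are not stated on this page; they are an assumption, chosen as the smallest whole numbers that match every reported figure (26 participants, 21 of whom preferred the proposed system), and the actual numbers may be multiples of these. The class name `SurveyCheck` is likewise just for illustration.

```java
final class SurveyCheck {
    /** Percentage of part in whole, rounded to two decimal places. */
    static double percent(int part, int whole) {
        return Math.round(10000.0 * part / whole) / 100.0;
    }

    public static void main(String[] args) {
        // Hypothetical counts inferred from the reported percentages.
        int total = 26;
        int[] participants = {15, 7, 4}; // CS-1, CS-2, CS-3
        int preferred = 21;
        int[] preferredByCourse = {14, 4, 3}; // CS-1, CS-2, CS-3

        for (int i = 0; i < participants.length; i++) {
            System.out.printf("CS-%d participation: %.2f%%%n", i + 1, percent(participants[i], total));
        }
        System.out.printf("Preferred proposed system: %.2f%%%n", percent(preferred, total));
        System.out.printf("Preferred Google search: %.2f%%%n", percent(total - preferred, total));
        for (int i = 0; i < preferredByCourse.length; i++) {
            System.out.printf("CS-%d share of preferences: %.2f%%%n",
                i + 1, percent(preferredByCourse[i], preferred));
        }
    }
}
```

Under these assumed counts, 14 of 15 CS-1 participants would have preferred the proposed system, which fits the page's emphasis on beginners.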
Performance of Search Systems
Conclusion & Future Work
As the results show, the proposed system gives promising outcomes. CS-1 is the course for beginners, and its students rated the system's performance highly, which is exactly the goal of the system. Hence, the system can be used in educational institutions, tutor labs, or even for personal purposes. As future work, the system can be made adaptive so that it keeps track of user feedback on a particular search query and its corresponding results; whenever it comes across a similar search query, it can then make better decisions. Also, this implementation focuses only on search queries related to programming; it could be extended to a larger scope without such domain-specific constraints...
Support or Contact
For more information, contact anandbora99@gmail.com and I’ll be happy to assist you.