Mining Software Documentation

Title of the Talk: Mining Software Documentation
Speaker: Dr. Akhila Sri Manasa
Host Faculty: Prof. M V Panduranga Rao
Date: April 25, 2025
Time: 11:00 am to 12:00 pm
Venue: CS-LH3

Abstract:
Software documentation refers to the body of information that describes a project during the different phases of software development. About 70% of the cost and effort of software development is spent on project maintenance, and developers spend more than 50% of their time on project comprehension. The availability of software documentation can significantly reduce the time and effort spent on software development and maintenance. Current trends in software maintenance and evolution indicate a significant rise in the number of developers contributing to projects. This, in turn, accelerates the pace of software development, which leads to increased maintenance overheads and a lack of quality documentation. Despite its importance, documentation faces several challenges, such as incompleteness, inconsistency, insufficiency and incorrectness. These issues prevent developers from using documentation to its full potential. Although approaches have been introduced to help developers comprehend source code, the availability of documentation suitable for comprehending other details of a project remains an unsolved problem. Addressing these documentation issues, which persist despite the widely accepted importance of documentation, requires an understanding of the underlying reasons behind them and motivates the need to comprehend documentation from multiple perspectives beyond the source code.

In this talk, we will discuss the reasons behind the challenges in the current state of documentation through studies that mine documentation-related information from crowd-sourced platforms. We will discuss the analysis of a large body of documentation-related content mined from GitHub to identify artifacts useful for documentation and to organize these artifacts systematically in the form of a taxonomy. We will also discuss the perceptions of various developer categories towards documentation, the documentation practices commonly followed in current open source projects, the process of arriving at a taxonomy of documentation types and sources, and a research prototype that generates documentation-related information by leveraging the proposed taxonomy. Finally, we will discuss possible future explorations in the context of documentation, such as domain-specific taxonomies of documentation and methods to enhance documentation.

Speaker Bio: Akhila Sri Manasa Venigalla graduated with a PhD and an MTech in Computer Science and Engineering from RISHA Lab, IIT Tirupati. Her primary research interests are in the areas of Software Documentation and Software Engineering. Her research interests also include Education Technologies and Human Computer Interaction. She has explored ways to support novice programmers and end user software engineers through software systems for society. As a part of her PhD thesis, she explored ways to improve software documentation and its comprehensibility. She has served as a member of the Program Committee of various software engineering conferences and received IEEE TCLT and ACM SIGSOFT CAPS travel grants during her PhD. She received her Bachelor’s degree in Computer Science and Engineering from JNTUH in 2017, following which she worked at Wells Fargo India Solutions for a year. During her PhD, in addition to exploring ways to support novice programmers, she also worked on raising awareness of the precautions to be taken and the need for collaborative efforts during COVID-19, and on identifying people's changing emotion trends during the pandemic.