11 January 2010

Software Maintenance and Free and Open Source Software: A Synergy for the Twenty-First Century

Author's Note: certain elements (tables of contents/tables/figures) have been removed for better web-log display. You may contact me if you'd like a PDF of the paper for reference, but please remember that plagiarism is bad.

Software Maintenance and Free and Open Source Software:
A Synergy for the Twenty-First Century

Robert C. Murray
University of Maryland University College


Abstract

Free and Open Source Software (FOSS) and software maintenance are fields with interesting commonalities, and these likenesses indicate a synergy between the two topics. FOSS and software maintenance are first put into context and then discussed, as each is a component of the other. To demonstrate the presence of a synergy, the average number of articles returned per year by four different academic search engines is plotted for each topic; the results are presented, discussed, and analyzed, and they indicate that a synergy is present. Suggestions are offered for clearer results in future research.

- - -

Software maintenance has enjoyed a renaissance in the decade from 2000 to 2009. Long considered nothing more than the unglamorous back end of software development, software maintenance had failed to attract focused attention from serious scholarly research (Pigoski, 1997). Over the years, research on the software development life cycle consistently reported that maintenance was the single largest expense in both time and resources related to a piece of software, consuming "...roughly 60 percent of the software life cycle" (Glass, 2003, p.115). This, coupled with the increase in legacy systems kept in use far beyond their originally intended life span, led to an increase in interest in software maintenance (Schneidewind, 1987). This growth in interest began in the mid-1990s, with textbooks like Pigoski's Practical Software Maintenance and the introduction of college courses devoted to software maintenance (Andrews and Lutfiyya, 2000). While textbooks written on the topic could afford to remain technology independent, course work and research on specific software maintenance practices and tools required software on which to work, so that their principles and effectiveness could be properly tested. Free and Open Source Software (FOSS), itself already undergoing an increase in popularity in the mid to late 1990s, was precisely what those interested in software maintenance needed to complete their research. The rising popularity of FOSS and the increased interest in serious academic research in software maintenance have formed a synergy that has produced improvements in these related fields throughout the first decade of the twenty-first century.

Software Maintenance as of the Year 2000

Though this paper is concerned with the relationship between software maintenance and FOSS since the year 2000, it is important that the reader understand the status of software maintenance in the years leading up to 2000.
Schneidewind wrote "The State of Software Maintenance" in 1987, which documented the presence of the maintenance problem as a series of questions and answers; in fact the first question he puts to the reader is "Why Is There a Maintenance Problem?" (Schneidewind, 1987, p.303). Pigoski's 1997 textbook on software maintenance discussed the lack of research into software maintenance as well. In 2000 Andrews and Lutfiyya published a paper on their inaugural semester of teaching an undergraduate-level software maintenance course, presented as a novel experience in their work. In the same year Bennett and Rajlich published "Software Maintenance and Evolution: A Roadmap", wherein they indicate that the body of knowledge for software maintenance is still sorely lacking, and that "...much more empirical knowledge about software maintenance and evolution is needed, including process, organization and human aspects" (Bennett and Rajlich, 2000, pp.85-86).
In the year 2000 software maintenance was beginning to be taught as a discrete subject in colleges, it had its own textbook, and there were papers published on the topic. But as Bennett, Rajlich, Pigoski, and Schneidewind all indicated, there was still much more work to be done. To accomplish the goals these researchers set forth for software maintenance, the discipline would need a tool that was freely available, easily accessed, and at the same time robust and widely used. FOSS would meet that need nicely.

Free and Open Source Software

History and Definition

The movement for FOSS arguably began in 1983, when Richard Stallman announced the GNU's Not Unix (GNU) project, which he went on to outline in his manifesto. Stallman's thinking is that software should be as “free as air” (Stallman, 2007). In the document he considers the commercial sale of software to be a destructive force, and insists that people should pay only for support or for the distribution of software. According to GNU, in a perfect world everyone has the right to freely create and modify the software they are using without the danger of violating any licensing agreements.

Almost a decade after the GNU project was begun, Linus Torvalds (a 21-year-old from Finland) developed Linux, a FOSS Operating System (OS) inspired by Minix, itself a small Unix-like operating system written for teaching (Raymond, 2001). Since 1991 Linux has grown from one man's FOSS hobby project into a multi-million dollar industry that comprises dozens of distributions, all free and legal for the downloading (or installing from a friend's disc). Linux is by no means the only FOSS project in the world, but it could certainly be considered one of the largest, if not the largest.
As stated, FOSS is of course much more than Linux, but all FOSS conforms to the specific criteria outlined in Table 1:

Table 1. Outline of Key Conditions of Open Source Definition (Feller and Fitzgerald, 2000, p.59)

| Condition | Commentary |
| --- | --- |
| The source code must be available to user. | The software distribution must include the source code (i.e., the original programming language), or else the code must be made available by free, public Internet download. |
| The software must be redistributable. | The user of an [Open Source Software] release is given full rights to reproduce and redistribute the software, on any medium, to any party, either gratis or for a fee. |
| The software must be modifiable, and the creation of derivative works must be permitted. | All users are given the right to modify the software or produce derivative works. There is considerable variation among licenses regarding whether or not modifications must also be released publicly under an OSD compliant license. |
| The license must not discriminate against any user, group of users, or field of endeavor. | In an attempt to counter overtly ideological content in software licenses, the OSD precludes any limitations on the possible uses of an [Open Source Software] distribution. |
| The license must apply to all parties to whom the software is distributed. | While some licenses might allow modifications to be released under a noncompliant license, an [Open Source Software] distribution cannot be “relicensed” by the user. |
| The license cannot restrict aggregations of software. | OSD compliant licenses cannot be limited to a particular distribution, nor can they seek to contaminate separately licensed software with which it is aggregated. |

FOSS Research

As the twentieth century came to a close, researchers who had realized the potential of FOSS were calling for more research to be conducted on the paradigm so as to better understand as well as to improve it. Feller and Fitzgerald note this lack of academic inquiry in their 2000 paper "A Framework Analysis of the Open Source Software Development Paradigm", where they state "...we believe that rigorous academic inquiry into [FOSS] is sorely needed." This paper offered a basic framework with which Feller and Fitzgerald hoped to both stimulate and direct further research into FOSS (Feller and Fitzgerald, 2000). Furthermore, Bennett and Rajlich wrote in 2000 that FOSS "may increase in importance" (Bennett and Rajlich, 2000, p.76).
As is demonstrated below in the Graphing the Synergy section, the opening decade of the twenty-first century did enjoy an overall growth in the amount of research conducted on FOSS. As major projects enjoyed greater success and the overall paradigm matured, FOSS drew the attention of major corporate entities like IBM (Capek, Frank, Gerdt, and Shields, 2005). This corporate attention was coupled with the additional academic research that Feller and Fitzgerald had called for. Perhaps ironically, FOSS became a significant component in software maintenance research, as FOSS was found to be an excellent tool for successfully conducting this research.

Software Maintenance in FOSS Projects

Software maintenance is considered an unglamorous task by programmers working on commercial software products, and it has been even more neglected in the FOSS community. Members of that community (the widespread developers and user base of FOSS) are very likely to file problem requests, known in the community as "bug reports", but are much less likely to take the initiative to work on a solution to these bugs (Mockus, Fielding, and Herbsleb, 2000). Bug reports are the reality of software maintenance as it is found in FOSS, and are addressed by "massively parallel debugging" (Godfrey and Tu, 2000, p.3). This is an application of "Linus's Law", which states that "given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix obvious to someone" (Raymond, 2001, p. 30). While Linus's Law apparently contradicts the community's reluctance to fix the bugs its members report, it is important to note the scale of most popular FOSS projects, where thousands of individuals will file bug reports and hundreds will subsequently attend to these bugs (Mockus et al., 2000).
The adaptive and evolutionary components as defined in software maintenance are present and attended to in FOSS, but they have been absorbed into the shortened need-implement-use life cycle that is indicative of FOSS projects (Capek et al., 2005). Indeed the author found in his literature review that this mutation of software maintenance in FOSS projects can be attributed to the "rapidly moving codebase" that is found in the continual release cycle of popular FOSS projects (Andrews and Lutfiyya, 2000).
Some scholarly research has been conducted in the arena of software maintenance on FOSS projects. Koponen and Hotti detailed a software maintenance process framework for FOSS in 2005, and in the same year Yu, Schach, Chen, Heller, and Offutt conducted a detailed study on the maintainability of FOSS OSes. The former wished to provide a formal maintenance process for FOSS analogous to those found in various ISO/IEC standards (Koponen and Hotti, 2005). The latter research was conducted as a rebuttal to a challenge of prior research by the authors, but was also used to highlight a potential danger to the future maintainability of Linux relative to other FOSS operating systems, namely FreeBSD, NetBSD, and OpenBSD (Yu, Schach, Chen, Heller, and Offutt, 2005).

FOSS - The Right Tools

Eric Raymond likened FOSS development to a bazaar, "...a great babbling bazaar of differing agendas and approaches...out of which a coherent and stable system could seemingly emerge only by a succession of miracles" (Raymond, 2001, pp.21-22). By the start of the twenty-first century, this bazaar had brought forth a wealth of software, all of it free, with the creators and maintainers actively interested in having their work built upon and studied by others. This openness made FOSS ideal for academic projects from undergraduate coursework to graduate studies and ongoing academic research (Andrews and Lutfiyya, 2000). Another factor working in the favor of FOSS was the absence of non-disclosure agreements (NDAs) and other proprietary practices favored by commercial software creators (Hassan, Godfrey, and Holt, 2001). Because of this fundamental difference, and the ever-present label of free, these same commercial software creators viewed (as some continue to view) FOSS as nothing but competing software offered at no cost; the actuality is that the free in FOSS is meant in the context of free speech, and not simply zero-cost freeware (Feller and Fitzgerald, 2000). In fact, there are several forms of software licensing, of which FOSS is only one.

FOSS in Software Maintenance Research

When software maintenance finally began to receive the serious academic attention it was due, FOSS, as has been demonstrated, was recognized by software maintenance researchers as the best option to be the object of their research.
Throughout the first decade of the twenty-first century many FOSS projects have been used as research subjects by investigators examining new theories or processes pertaining to software maintenance. While some like Rajlich and Gosavi (2004), or Hill, Pollock, and Vijay-Shanker (2007) used smaller-scope projects like Drawlets and the Eclipse integrated development environment (respectively), much software maintenance research has gravitated toward the higher-profile projects like Linux and the OpenOffice.org suite of productivity applications. The work by software maintenance researchers using these FOSS projects has been mutually beneficial, providing the researchers with needed insight and test data on their theories and providing the FOSS community with valuable insight and improvements to their projects.
The synergy was being realized as early as 2000, when Tran, Godfrey, Lee, and Holt published "Architectural Repair of Open Source Software", which addressed the specific issue of architectural drift and the method they had developed for repairing this drift. One of the FOSS projects used to test the method was Linux, specifically the Linux kernel (Tran, Godfrey, Lee, and Holt, 2000). Their research demonstrated the initial effectiveness of their theory, and also presented the Linux development community with an opportunity to have an easier time with future work on their project. The work has continued, as indicated by the publication dates above. In 2008 Abd-El-Hafiz, Shawsky, and El-Sedeek published "Recovery of Object-Oriented Design Patterns Using Static and Dynamic Analyses", where their object of study included several of the individual applications that comprise the OpenOffice.org suite. All four studies that have been discussed as using FOSS to conduct software maintenance research have indicated initially positive results, with the need for future work to both confirm and extend the findings. These statements indicate that there is a synergy at work.

Graphing the Synergy

Concept

In performing the initial research for this assignment, it became apparent that while there was not enough to be located in the literature pertaining to software maintenance of FOSS projects to properly complete a scholarly paper on the topic, FOSS projects appeared with regularity in software maintenance research papers. In light of this insight, and with the evidence already presented, it was determined that a brief and informal piece of research should be conducted to determine if such a synergy exists.

Methodology

The methodology for conducting this experiment was basic. Four prominent academic search engines were selected by the author, chosen for their familiarity and for the likelihood of returning results for the search criteria indicated. Four search engines were chosen so that any unforeseen bias present in any one academic search engine might be mitigated by the absence of such in the other three. The academic search engines and the method of presenting the search criteria to them are listed in Table 2 below.
There were two sets of search criteria. Each set paired a year, from 2000 to 2009, with one of two exact phrases that were processed by the search engines: "Software Maintenance" and "Open Source". The number of returned documents for each phrase in each year for each search engine was entered into a spreadsheet, averaged across the four search engines, and then plotted on a line graph. The data collected can be viewed in Tables 3a and 3b below, and the line graphs for the averages computed for both search phrases are presented in Figure 1 and Figure 2, below. One final note is that even though 2009 was not yet over at the time of the research, the year was included to give as complete a picture as possible of the decade to date.
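The averaging step described above was done in a spreadsheet; the following is only a minimal sketch of the same arithmetic in Python, not the author's original tooling. The counts are those recorded for the phrase "Software Maintenance", and the names maintenance_hits and yearly_averages are hypothetical, introduced here for illustration.

```python
# Documents returned per year for the exact phrase "Software Maintenance".
# Tuple order: Google Scholar, IEEE Computer Society, ACM Digital Library, ProQuest.
maintenance_hits = {
    2000: (56, 9, 16, 58),
    2001: (62, 10, 30, 48),
    2002: (53, 8, 38, 55),
    2003: (57, 7, 20, 62),
    2004: (50, 15, 41, 102),
    2005: (49, 15, 63, 116),
    2006: (52, 18, 63, 79),
    2007: (50, 8, 76, 59),
    2008: (66, 6, 72, 28),
    2009: (26, 8, 59, 43),
}


def yearly_averages(hits_by_year):
    """Return {year: arithmetic mean of the per-engine counts for that year}."""
    return {
        year: sum(counts) / len(counts)
        for year, counts in hits_by_year.items()
    }


maintenance_average = yearly_averages(maintenance_hits)

# One line per year; these are the values that would be plotted on the line graph.
for year in sorted(maintenance_average):
    print(year, maintenance_average[year])
```

The same function can be applied unchanged to the "Open Source" counts; an unweighted mean gives each search engine equal influence regardless of how many documents it indexes.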

Results

In comparing the two line graphs one can see that from 2000 to the middle of the decade there is a shared upward trend in published material for both terms (the "software maintenance" average peaks in 2005 and the "open source" average in 2006), with a general downward trend for both in the years that follow. The results differ between the search terms by an order of magnitude, with the average number of results across all years for "software maintenance" at 43.83, while results for "open source" were an average of 626.
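The summary figures quoted above can be rechecked from the yearly averages. The following is a small sketch under the assumption that the yearly averages are exactly those reported in Tables 3a and 3b; the variable names are hypothetical.

```python
from statistics import mean

# Yearly averages across the four search engines, as reported in Tables 3a and 3b.
maintenance_avg = {
    2000: 34.75, 2001: 37.5, 2002: 38.5, 2003: 36.5, 2004: 52.0,
    2005: 60.75, 2006: 53.0, 2007: 48.25, 2008: 43.0, 2009: 34.0,
}
open_source_avg = {
    2000: 377.75, 2001: 339.5, 2002: 357.5, 2003: 463.75, 2004: 661.75,
    2005: 873.0, 2006: 968.0, 2007: 777.75, 2008: 861.5, 2009: 579.5,
}

# Mean of the yearly averages over the whole decade, for each phrase.
maintenance_mean = mean(list(maintenance_avg.values()))
open_source_mean = mean(list(open_source_avg.values()))

# How many times more results "open source" returned, on average.
ratio = open_source_mean / maintenance_mean

# Year in which each yearly average is highest.
peak_maintenance = max(maintenance_avg, key=maintenance_avg.get)
peak_open_source = max(open_source_avg, key=open_source_avg.get)

print(f"software maintenance: mean {maintenance_mean:.3f}, peak year {peak_maintenance}")
print(f"open source:          mean {open_source_mean:.3f}, peak year {peak_open_source}")
print(f"ratio of means: {ratio:.1f}")
```

The first mean is printed to three decimal places; the paper rounds it to 43.83.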

Table 2. Academic Search Engines Used to Generate Results

| Search Engine | Method of Criteria Presentation |
| --- | --- |
| Google Scholar | Advanced Scholar Search using the exact phrase in the title, indicating each year, searching only articles in the Engineering, Computer Science, and Mathematics fields. |
| IEEE Computer Society | Advanced Search using the phrase without quotes appearing in the title for each year. The “exact phrase” option was available but not used: searching that way returned 100 results for each year, and was considered to be a less valid indicator for this research. |
| ACM Digital Library | Advanced Search for the exact phrase in the Abstract field, considered a more valid indicator in this instance than the title field. |
| ProQuest | Searched for the phrase in the citation and abstract, specifying a date range from 01/01 of the year to 12/31 of the year. |



Table 3a. Data Collected from Academic Search Engines for Search Term “Software Maintenance”.

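| Year | Google Scholar | IEEE Computer Society | ACM Digital Library | ProQuest | Maintenance Average |
| --- | --- | --- | --- | --- | --- |
| 2000 | 56 | 9 | 16 | 58 | 34.75 |
| 2001 | 62 | 10 | 30 | 48 | 37.5 |
| 2002 | 53 | 8 | 38 | 55 | 38.5 |
| 2003 | 57 | 7 | 20 | 62 | 36.5 |
| 2004 | 50 | 15 | 41 | 102 | 52 |
| 2005 | 49 | 15 | 63 | 116 | 60.75 |
| 2006 | 52 | 18 | 63 | 79 | 53 |
| 2007 | 50 | 8 | 76 | 59 | 48.25 |
| 2008 | 66 | 6 | 72 | 28 | 43 |
| 2009 | 26 | 8 | 59 | 43 | 34 |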


Table 3b. Data Collected from Academic Search Engines for Search Term “Open Source”.

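| Year | Google Scholar | IEEE Computer Society | ACM Digital Library | ProQuest | Open Source Average |
| --- | --- | --- | --- | --- | --- |
| 2000 | 146 | 10 | 25 | 1330 | 377.75 |
| 2001 | 243 | 11 | 46 | 1058 | 339.5 |
| 2002 | 320 | 18 | 70 | 1022 | 357.5 |
| 2003 | 422 | 11 | 86 | 1336 | 463.75 |
| 2004 | 458 | 39 | 99 | 2051 | 661.75 |
| 2005 | 561 | 65 | 200 | 2666 | 873 |
| 2006 | 626 | 55 | 229 | 2962 | 968 |
| 2007 | 552 | 58 | 247 | 2254 | 777.75 |
| 2008 | 484 | 68 | 323 | 2571 | 861.5 |
| 2009 | 286 | 58 | 235 | 1739 | 579.5 |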


Figure 1. Average Number of Published Material for Search Term “Software Maintenance”, from 2000 - 2009



Figure 2. Average Number of Published Material for Search Term “Open Source”, from 2000 - 2009

Analysis

Thoughts on Research

It was expected that the search terms would share a general trend, but the shared arc from 2000 to 2009 was an unexpected result that perhaps provides more correlative evidence for the presence of the synergy. Whether the downward trend from 2006 onward is indicative of this proposed synergy between these two fields, or of a reduction in information technology research in general, or of an even broader downturn in the amount of traditionally published literature is beyond the scope of this paper, but may warrant further investigation by interested parties. Another possible cause for the reduction in research into software maintenance may be the 2006 publication of the ISO/IEC 14764 and IEEE Std 14764 standard on software maintenance (2006).
It is encouraging to note that, some 20 years after a time when nothing was being published (Schneidewind, 1987), the searches for 2005 returned an average of more than 60 documents relating to software maintenance. There was nevertheless still an order of magnitude of difference between the two sets of search criteria. FOSS was an area of particular note at the end of the twentieth century, and continues to be very popular at the beginning of the twenty-first century. Any synergy between these two topics would mean that the celebrity of FOSS can only be good for software maintenance.
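The shared arc discussed above could be put on a numerical footing with a correlation coefficient. This calculation was not part of the original research; the following is only a sketch, assuming the yearly averages from Tables 3a and 3b, with a hypothetical helper named pearson. A positive coefficient would describe how closely the two series move together; it would not by itself show that either field drives the other.

```python
from math import sqrt

# Yearly averages for 2000 through 2009, in order, from Tables 3a and 3b.
maintenance_avg = [34.75, 37.5, 38.5, 36.5, 52.0, 60.75, 53.0, 48.25, 43.0, 34.0]
open_source_avg = [377.75, 339.5, 357.5, 463.75, 661.75, 873.0, 968.0, 777.75, 861.5, 579.5]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(maintenance_avg, open_source_avg)
print(f"Pearson r between the two yearly series: {r:.2f}")
```

With only ten data points per series, any coefficient obtained this way should be read as descriptive rather than as a statistical test.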

Threats to Research

The research conducted was informal and demonstrative. As indicated in Table 2, the specific methods of criteria entry into the different academic search engines were not uniform across all four, and it may have been better to exclude the Institute of Electrical and Electronics Engineers (IEEE) Computer Society results rather than make the indicated accommodation. The author felt, however, that the prestige of IEEE publications would lend credence to the research.
Another threat would be the search results themselves. The number of documents returned was recorded, but these results were not vetted for quality or academic nature. It is entirely possible that results may have included unrelated materials or non-scholarly works. It is hoped that the average of the four search engines' results would help mitigate such confounding returns; however, future research conducted in such a manner should take measures -- perhaps random auditing of search results and a larger number of academic search engines -- to prevent superfluous results from entering the data pool.

Conclusions

The author's interest in free and open source software led him to conduct a literature review of software maintenance as it pertains to FOSS. Initially he sought to contrast and compare the status of FOSS software maintenance in the year 2000 to the present day, but instead discovered that while FOSS is an important component of software maintenance research, software maintenance as traditionally defined does not play quite so large a role in the typical FOSS life cycle. Instead, it was further hypothesized that there exists a synergy between FOSS and software maintenance, and that the two have enjoyed a mutually beneficial relationship throughout the past decade.
An informal research method was developed, searching for the terms "open source" and "software maintenance" appearing in published materials in each of the years from 2000 to 2009 in four different academic search engines. The results from the four search engines were averaged by year for each phrase and then graphed to determine whether a synergy existed. The shared arc in results across this decade points to such a synergy, but future work may wish to delve more deeply into individual papers on these topics and to develop more stringent and farther-reaching searches.
Ultimately, FOSS and software maintenance have each benefited from the presence of the other, and as both are fields that are maturing, they will continue to benefit from each other well into the twenty-first century.

References

Abd-El-Hafiz, S.K., Shawsky, D.M., El-Sedeek, A.-L. (2008). Recovery of object-oriented design patterns using static and dynamic analyses. International Journal of Computers and Applications 30(3), pp.220-233.


Andrews, J.H., Lutfiyya, H.L. (2000). Experience report: a software maintenance project course. 13th Conference on Software Engineering Education & Training, 2000. Proceedings, pp.132-139.

Bennett, K.H., Rajlich, V.T. (2000). Software maintenance and evolution: a roadmap. Proceedings of the Conference on The Future of Software Engineering, pp.73-87.

Capek, P.G., Frank, S.P., Gerdt, S., Shields, D. (2005). A history of IBM's open-source involvement and strategy. IBM Systems Journal 44(2), pp.249-257.

Feller, J., Fitzgerald, B. (2000). A framework analysis of the open source software development paradigm. Proceedings of the twenty first international conference on Information systems, pp.58-69.


Glass, R. (2003). Facts and fallacies of software engineering. Boston, MA: Pearson Education, Inc.


Godfrey, M.W., Tu, Q. (2000). Evolution in open source software: a case study. Proceedings of the International Conference on Software Maintenance, pp. 131-142.

Hassan, A.E., Godfrey, M.W., Holt, R.C. (2001). Software engineering research in the bazaar. Proceedings of the 2nd Workshop on Open Source Software Engineering at the 24th International Conference on Software Engineering.

Hill, E., Pollock, L., Vijay-Shanker, K. (2007). Exploring the neighborhood with dora to expedite software maintenance. Proceedings of the twenty-second IEEE/ACM international conference on Automated software engineering, pp.14-23.

ISO/IEC 14764 (2006) and IEEE 14764 (2006). International Standard: Software Engineering – Software Life Cycle Processes – Software Maintenance. New York: International Organization for Standardization and Institute of Electrical and Electronics Engineers.

Koponen, T., Hotti, V. (2005). Open source software maintenance framework. ACM SIGSOFT Software Engineering Notes 30(4), pp.1-5.

Mockus, A., Fielding, R.T., Herbsleb, J. (2000). A Case Study of Open Source Software Development: The Apache Server. Proceedings of the 2000 International Conference on Software Engineering, pp. 263-272.


Pigoski, T.M. (1997). Practical software maintenance. New York, NY: Wiley Computer Publishing.

Rajlich, V., Gosavi, P. (2004). Incremental change in object-oriented programming. IEEE Software 21(4), pp. 62-69.


Raymond, E.S. (2001). The cathedral and the bazaar: musings on linux and open source by an accidental revolutionary. Sebastopol, CA: O'Reilly & Associates, Inc.

Schneidewind, N.F. (1987). The state of software maintenance. IEEE Transactions on Software Engineering 13(3), pp.303-310.


Stallman, R. (2007). The gnu manifesto. Boston, MA: Free Software Foundation, Inc.

Tran, J.B., Godfrey, M.W., Lee, E.H.S., Holt, R.C. (2000). Architectural repair of open source software. 8th International Workshop on Program Comprehension, pp.48-59.

Yu, L., Schach, S.R., Chen, K., Heller, G.Z., Offutt, J. (2005). Maintainability of the kernels of open-source operating systems: a comparison of linux with freebsd, netbsd, and openbsd. Journal of Systems and Software 79(6), pp.807-815.
