An Empirical Comparison of Modularity of Procedural and Object-oriented Software

Lisa K. Ferrett
AT&T Government Solutions, Inc
Gallows Road
Vienna, VA USA

Jeff Offutt
Information and Software Engineering
George Mason University
Fairfax VA USA
(+1)

Thirteenth International Conference on Engineering of Complex Computer Software, Annapolis, MD, November

Partially supported by the U.S. National Science Foundation under grant CCR

Abstract

A commonly held belief is that applications written in object-oriented languages are more modular than those written in procedural languages. This paper presents results from an experiment that examines this hypothesis. Open source and industrial program modules written in the procedural languages of Fortran and C were compared with open source program modules written in the object-oriented languages of C++ and Java. The metrics examined in this study were lines of code per module and number of parameters per module. The results of the investigation support the hypothesis. The modules of the object-oriented programs were found to be half the size of those of the procedural programs, and the average number of parameters per module for the object-oriented programs was approximately half that of the procedural programs. Thus the object-oriented programs were twice as modular as the procedural programs. An unexpected result was that the C++ programs were found to be no more modular than the C programs.

1. Introduction

Most current software development projects use some sort of object-oriented design and an object-oriented programming language. Object-oriented programming is based on the concept of an object, which is a data structure encapsulated with a set of routines, called methods, which operate on the data [1, 8]. Object-oriented programming evolved from structured programming, which emphasizes top-down development and modular structure, so that each module performs a specific function [5].
Object-oriented programming was intended, among other things, to be more modular and to help programmers develop modules that exhibit less coupling and more cohesion. It is widely believed that applications that are more modular are also more maintainable, extendable and reusable [1, 4, 5, 6]. This paper tries to establish a relation between modularity and object-oriented programming for industrial software modules.

This paper presents results from a study designed to compare the modularity of programs written in procedural languages to those written in object-oriented languages. The hypothesis investigated is that object-oriented programs are more modular than procedural programs.

Exactly what the property of modularity means is somewhat open to discussion. One would like to say that modularity is a property of the entire software application, but that is a difficult characterization to quantify. One informal characterization is that modularity is related to the ease of analyzing or modifying a piece of code in isolation from the rest of the application. This, of course, is related to the common metrics of coupling and cohesion [2]. Although methods to calculate these metrics have been discussed [7], no method is widely agreed on. Thus, this study examined three attributes associated with modularity.

For the purposes of this experiment, the term module (sometimes also called unit) refers to a subroutine or function in Fortran and a function or method in C, C++ and Java. The attributes studied are the size of the modules, the number of modules, and the number of parameters passed to a module. A program with a few large modules is considered to be less modular than a program with many small modules. In addition, a large number of parameters may indicate that a module is too complex and should be subdivided [5].

The remainder of this paper presents the design of our experiment, results from our experiments, analysis of the results, and some limitations of our results.
In conclusion, we found that object-oriented programs did indeed exhibit more modularity. However, a surprising result is that C++ programs do not exhibit more modularity than C programs. We theorize that this indicates that (1) the benefits of object-oriented programming are derived primarily from the design rather than the use of a particular language, and (2) many C++ programmers are not following a true object-oriented design.

2. Experimental Design

This experiment analyzed 38 applications. Ten applications were written in Fortran, ten in C, ten in C++ and eight in Java. Fortran and C are procedural languages, while C++ and Java are object-oriented languages. Most applications were obtained as open source code arbitrarily selected from those available on the Internet; however, finding open source Fortran applications was difficult, and only three were found. The first author had access to several Fortran applications that had been developed in-house, and seven of these were included to bring the count to ten. The appendix lists the programs used and where they were obtained.

The applications that were investigated are of a variety of different types. Combining the results from a number of different application types for each language should provide results that can be extrapolated to the general case. However, it would also be interesting to compare applications of the same type. Since web server applications were available that had been written in C, C++ and Java, an additional comparison was made between the three of them.

Several tools were used to analyze the programs selected. The author created a tool to extract the number of lines of code per module and the number of parameters per module for Fortran programs. Understand for C++ (Scientific Toolworks, Inc.) was used to count the number of lines of code per method in C and C++ programs.
Scientific Toolworks also provided a Perl script to extract the number of parameters per method from the database files that their tool created. JStyle (Man Machine Systems) was used to determine the number of lines of code per method in Java programs. JStyle provides an integrated scripting capability that the author used to extract the number of parameters passed per method.

This experiment investigated two hypotheses:

1. Object-oriented programs have more, smaller modules than procedural programs. Since the programs investigated are not all the same size, the number of modules per thousand lines of code was used as a measure of the number of modules. The size of modules was calculated as the number of lines of code per module.
2. The average number of parameters per module is smaller for object-oriented programs than for procedural programs.

3. Results

The tools described previously were used to measure the number of modules, number of lines of code per module and number of parameters per module. Descriptive statistics were calculated for the number of lines of code per module and the number of parameters per module.

3.1 Experiment 1: Module Size Comparison

The number of lines of code per module was determined for 38 applications, 10 in each of Fortran, C and C++, and 8 in Java. The number of modules and lines of code (LOC) per module were determined, as well as the minimum, maximum, mean, median, mode and standard deviation of the lines of code per module [3, 8, 9]. In addition, the number of modules per thousand lines of code (KLOC) was calculated. This metric was used to normalize the measure of number of modules per application for comparison between applications of different sizes. The tabulated results for the applications written in Fortran, C, C++, and Java are displayed in Tables 1 through 4, respectively.

3.2 Experiment 2: Number of Parameters

As noted in the Introduction, a module is defined to be a function, procedure, or method.
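The descriptive statistics used in both experiments can be sketched in a few lines of Python. This is only an illustration, not the tooling used in the study: the function name `describe` and the sample data are hypothetical, and the sample (rather than population) standard deviation is an assumption, since the text does not say which was used.

```python
import statistics


def describe(values):
    """Return the descriptive statistics reported per application.

    `values` holds one measurement per module, for example lines of
    code per module or parameters per module.
    """
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Assumption: sample standard deviation (n - 1 denominator).
        "std_dev": statistics.stdev(values),
    }


# Hypothetical lines-of-code counts for the five modules of one application.
loc_per_module = [4, 7, 7, 12, 30]
stats = describe(loc_per_module)
print(stats)
```

A single large module (30 lines here) pulls the mean well above the median and mode, which is the kind of skew that makes the mean alone a weak summary of module size.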
The number of parameters per module was determined for the same 38 applications. The minimum, maximum, mean, median, mode and standard deviation of the number of parameters were calculated. The tabulated results for applications written in Fortran, C, C++, and Java are displayed in Tables 5 through 8, respectively.

4. Analysis

Fenton and Pfleeger [3] point out that the mean, median and mode of the data will be the same in a normal distribution. An examination of the collected data clearly shows that they do not form normal distributions. Therefore, normal parametric statistical analyses such as linear regression and the t-test cannot be applied.

Table 1. Module size comparison for applications written in Fortran
Columns: Application | # Modules | Total LOC | LOC/module: Mean, Min, Max, Median, Mode, Std Dev | Mod/KLOC
Rows: advsrc, anypia, booster, fdwell, image, lowtran, mlsc, plume, quail, simfence, All applications

Table 2. Module size comparison for applications written in C
Columns: Application | # Modules | Total LOC | LOC/module: Mean, Min, Max, Median, Mode, Std Dev | Mod/KLOC
Rows: apache, chemtool, gcompris, gdpc, gperiodic, gstat, gsx, mutt, umfpack, xpaint, All applications

Table 3. Module size comparison for applications written in C++
Columns: Application | # Modules | Total LOC | LOC/module: Mean, Min, Max, Median, Mode, Std Dev | Mod/KLOC
Rows: amaya, benson, btl, celestia, gperf, guarddog, knetfilter, ktouch, pi3web, skysight, All applications

Table 4. Module size comparison for applications written in Java
Columns: Application | # Modules | Total LOC | LOC/module: Mean, Min, Max, Median, Mode, Std Dev | Mod/KLOC
Rows: ArtOfIllusion, bugbase, ConsultComm, jakarta-tomcat, jcharts, jext, jigsaw, mercator, All applications

Table 5. Number of parameters per module in Fortran programs
Columns: Application | # Modules | Parameters/module: Mean, Min, Max, Median, Mode, Std Dev
Rows: booster, fdwell, image, lowtran, mlsc, plume, simfence, advsrc, quail, anypia, All applications

Table 6. Number of parameters per module in C programs
Columns: Application | # Modules | Parameters/module: Mean, Min, Max, Median, Mode, Std Dev
Rows: apache, chemtool, gcompris, gperiodic, gstat, gsx, mutt, xpaint, gdpc, umfpack, All applications

Table 7. Number of parameters per module in C++ programs
Columns: Application | # Modules | Parameters/module: Mean, Min, Max, Median, Mode, Std Dev
Rows: btl, celestia, gperf, guarddog, knetfilter, ktouch, amaya, benson, pi3web, skysight, All applications

Table 8. Number of parameters per module in Java programs
Columns: Application | # Modules | Parameters/module: Mean, Min, Max, Median, Mode, Std Dev
Rows: ArtOfIllusion, bugbase, ConsultComm, jakarta-tomcat, jcharts, jext, jigsaw, mercator, All applications

4.1 Analysis of Module Size Comparison

An examination of the data in Tables 1 through 4 shows noticeable variation among the individual applications and between those written in the different programming languages. It is likely that there is a significant variation in lines of code per module across the entire population of programs. Combining the results from the object-oriented applications and from the procedural applications may provide results that are representative of the general case. Table 9 summarizes the aggregated data, while Figure 1 provides a graphical comparison of the module sizes for the object-oriented, procedural, Java, C++, C, and Fortran applications.

A major purpose of this experiment is to investigate whether object-oriented programs have more modules that are smaller than those in similarly sized procedural programs. Comparing the size of the modules is straightforward. The average size of the object-oriented modules (16.6 lines of code) is slightly less than half (47%) the size of the procedural modules (34.8 lines of code). Comparing the number of modules cannot be done directly because of the differences in sizes of the applications. To normalize the data, the number of modules per thousand lines of code is instead used as the metric for comparison.
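The normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `modules_per_kloc` and the two example applications are hypothetical and are not taken from the study's data.

```python
def modules_per_kloc(num_modules, total_loc):
    """Return the number of modules per thousand lines of code (KLOC)."""
    return num_modules / (total_loc / 1000)


# Two hypothetical applications of very different sizes.
small_app = modules_per_kloc(num_modules=50, total_loc=2_000)
large_app = modules_per_kloc(num_modules=400, total_loc=20_000)

print(small_app)  # modules per KLOC for the small application
print(large_app)  # modules per KLOC for the large application
```

Although the larger application has eight times as many modules, the normalized metric shows that the smaller one is divided more finely, which is why raw module counts are not compared directly.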
The number of modules per thousand lines of code for the object-oriented applications is just over twice (210%) the number for the procedural applications. These results support the idea that object-oriented applications are more modular than procedural applications.

Table 9. Summary of module size comparison
Columns: Language | # Modules | Total LOC | LOC/module: Mean, Min, Max, Median, Mode, Std Dev | Mod/KLOC
Rows: Procedural, Object-oriented

Figure 1. Graphical comparison of module size (LOC/module for object-oriented, procedural, Java, C++, C, and Fortran applications)

Since three of the applications examined are web servers written in three different languages, it might be instructive to compare their results. As they were all written for the same purpose, they might be more similar to each other than to the other programs examined. Table 10 compares the results for these three applications. As can be seen, there is more difference between the two object-oriented applications (C++ and Java) than there is between the C application and the C++ application. It is therefore not obvious that comparing similar applications is any more useful than comparing dissimilar applications.

An examination of the overall results for the four languages, given in Table 11, shows that the average module size of the C++ applications investigated is very close to that of the C applications examined (33.4 LOC/module vs LOC/module). Correspondingly, the number of modules per thousand lines of code, which is used as a normalized measure of the number of modules per application, is also quite similar (30.1 vs. 30.7). This suggests that the C++ programs that were examined were no more modular than the C programs. Two complementary theories may explain this. Although C++ supports object-oriented programming, it does not strongly encourage the programmer to use an object-oriented approach, as Java does.
Additionally, many C++ programmers are re-trained C programmers who have had little or no exposure to OO concepts and thus cannot be expected to use the OO language features effectively. These two related observations may indicate that these C++ applications were not written using an object-oriented approach.

Table 10. Comparison of three web-server applications
Columns: Language | Application | # Modules | Total LOC | LOC/module: Mean, Min, Max, Median, Mode, Std Dev | Mod/KLOC
Rows: C (apache), C++ (pi3web), Java (jigsaw)

Table 11. Module size summary
Columns: Language | # Modules | Total LOC | LOC/module: Mean, Min, Max, Median, Mode, Std Dev | Mod/KLOC
Rows: Fortran, C, C++, Java

4.2 Analysis of Number of Parameters

Examination of Tables 5 through 8 indicates significant variability in the number of parameters per module. Table 12 summarizes the data for the applications written in the four languages, and also shows the aggregated summary for the procedural and object-oriented applications. Not surprisingly, the minimum number of parameters per module is zero, regardless of the programming language. The mean number of parameters ranges from a low of 0.9 for Java programs to a high of 3.3 for Fortran programs.

As in Experiment 1, the results for the C and C++ programs are quite close. The C programs had an average of 2.3 parameters per module, while the C++ programs had an average of 2.2. For the applications analyzed in this experiment there seems to be little difference in modularity between the C and C++ programs.

Comparison of the aggregated results for the procedural and object-oriented applications shows that there is a noticeable difference. The maximum number of parameters was 40 for the object-oriented programs and 34 for the procedural programs. However, the standard deviations of the measurements indicate that there was more variability in number of parameters for the procedural programs than for the object-oriented programs. For the object-oriented programs the mean number of parameters was 1.3, but the mode was 0.
For the procedural programs, the mean was 2.5 with a mode of 1. Thus, the procedural programs averaged approximately twice as many parameters per module as the object-oriented programs. These results are compared graphically in Figure 2.

Table 12. Parameters/module summary
Columns: Language | # Modules | Parameters/module: Mean, Min, Max, Median, Mode, Std Dev
Rows: Fortran, C, C++, Java, procedural, object-oriented

Figure 2. Graphical comparison of parameters per module (parameters/module for object-oriented, procedural, Java, C++, C, and Fortran applications)

Another way to look at the results is to examine the distribution of the number of parameters. That is, we counted the number, or percent, of modules that have a given number of parameters. The percentage data is presented in Figure 3 for the Fortran, C, C++, and Java programs, for up to ten parameters. Figure 3 is read as follows. About 43% of the Java modules had zero parameters, and about 36% had one parameter. It appears that the Fortran programs have the largest number of parameters per module and the Java programs the smallest. However, the distribution of parameters is very different for the applications written in the four different languages, making it difficult to directly compare the results.
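The percentage distribution described above can be computed as in the following sketch. The function name `parameter_distribution` and the sample parameter counts are hypothetical; they illustrate the calculation and do not reproduce the study's data.

```python
from collections import Counter


def parameter_distribution(param_counts, max_params=10):
    """Return the percentage of modules having 0..max_params parameters.

    `param_counts` holds one entry per module: its number of parameters.
    """
    counts = Counter(param_counts)
    total = len(param_counts)
    return {n: 100.0 * counts[n] / total for n in range(max_params + 1)}


# Hypothetical parameter counts for the ten modules of one program.
params = [0, 0, 0, 0, 1, 1, 1, 2, 2, 5]
dist = parameter_distribution(params)

for n, percent in dist.items():
    print(f"{n} parameters: {percent:.1f}% of modules")
```

Expressing the counts as percentages rather than raw module counts allows programs with very different numbers of modules to be plotted on the same axis.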
Figure 3. Distribution of number of parameters (% of modules vs. number of parameters, for Fortran, C, C++, and Java)

Another approach is to look at cumulative percentages, or percentiles. This information is presented in Figure 4. Approximately 28% of the Fortran modules, 35% of the C modules, 46% of the C++ modules, and 81% of the Java modules have at most one parameter. Perhaps a more useful way to compare these numbers is to look at them from the inverse, that is, to examine the percentage of modules that have more than one parameter. For the Fortran programs this is 72%, for the C programs it is 65%, for the C++ programs it is 54%, and for the Java programs it is 19%. This is a striking difference. There is a nearly four-fold difference between Fortran and Java.

Figure 4. Percentiles for number of parameters (cumulative % of modules vs. number of parameters, for Fortran, C, C++, and Java)

Figure 5 compares the distribution of parameters for the aggregated data for the procedural and object-oriented modules. It is clear from this figure that the object-oriented modules have fewer parameters.
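The cumulative view can be sketched in the same style: for each parameter count, the share of modules with at most that many parameters, together with its complement. The function name `cumulative_percentages` and the sample data are hypothetical and serve only to illustrate the calculation.

```python
from collections import Counter


def cumulative_percentages(param_counts, max_params=10):
    """Return a list whose entry n is the percentage of modules
    with at most n parameters, for n = 0..max_params."""
    counts = Counter(param_counts)
    total = len(param_counts)
    running = 0
    result = []
    for n in range(max_params + 1):
        running += counts[n]
        result.append(100.0 * running / total)
    return result


# Hypothetical parameter counts for the ten modules of one program.
params = [0, 0, 0, 0, 1, 1, 1, 2, 2, 5]
cumulative = cumulative_percentages(params)

at_most_one = cumulative[1]
more_than_one = 100.0 - at_most_one
print(f"at most one parameter: {at_most_one:.1f}%")
print(f"more than one parameter: {more_than_one:.1f}%")
```

The complement is the figure the text uses for comparison: the share of modules that take more than one parameter.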
Figure 6 shows the cumulative percentages, or percentiles, for the aggregated data for the procedural and object-oriented modules. Approximately 35.5% of the procedural modules have at most one parameter while approximately 70% of the object-oriented modules have at most one parameter. This means that approximately 64.5% of the procedural modules have more than one parameter and only about 30% of the object-oriented modules have more than one parameter. By this measure, the object-oriented modules were more than twice as modular as the procedural modules. However, if the results for the object-oriented programs were skewed to the low side as a result of being o