
 

An Empirical Analysis of Function Point Adjustment Factors

C. J. Lokan

School of Computer Science

Australian Defence Force Academy

University of New South Wales

Canberra, Australia

+61 2 6268 8060

c-lokan@adfa.edu.au

 

Abstract

In function point analysis, fourteen "general systems characteristics" (GSC’s) are used to construct a "value adjustment factor" (VAF), with which a basic function point count is adjusted. Although the GSC’s and VAF have been criticized on both theoretical and practical grounds, they are used by many practitioners. This paper reports on an empirical investigation into their use and practical value.

We conclude that recording the GSC’s may be useful for understanding project cost drivers and for comparing similar projects, but the VAF should not be used: doubts about its construction are not balanced by any practical benefit. A new formulation is needed for using the GSC’s to explain effort; factors identified here could guide further research.

 

1. Introduction

In function point analysis (FPA), fourteen "general systems characteristics" (GSC’s) are used to construct a "value adjustment factor" (VAF), with which a basic function point count is adjusted. This stage of function point analysis is controversial.

The general systems characteristics are intended to measure general aspects of size, as opposed to application-specific size [1]. Intuitively, they are used to adjust a function point count so as to better predict effort.

Criticisms of the GSC’s and VAF are both theoretical and practical. Theoretical criticisms are that the construction of the VAF involves operations that are inadmissible according to measurement theory [2]; since "complexity" appears in computing unadjusted function points and again in the GSC’s, there is some double counting [3,4]; and the GSC’s are inter-related [3,5].

Practical criticisms are that not all of the right things are counted as GSC’s [4]; when computing the VAF it is not appropriate to give all of the GSC’s the same weight [4]; the VAF does not provide enough variation [4]; and using the VAF does not improve effort estimation [3,6,7,8].

The adjustment phase of FPA is widely used despite these concerns. It makes intuitive sense to practitioners, whose experience is that the GSC’s are things which influence project difficulty and effort.

An informed decision is needed on whether the GSC’s and VAF are really worth using. If they bring no practical benefits, they should be abandoned. If there are practical benefits, the decision arises of whether or not to use them in spite of theoretical objections. Empirical research is needed to inform this decision.

A study by Desharnais [5] found that only some of the GSC’s were relevant in the adjustment process. He proposed a modified adjustment method, in which different GSC’s have different weights. Several authors have investigated the VAF, finding consistently that it is not effective for improving effort estimation [3,6,7,8].

This paper describes a new empirical study of the FPA adjustment process. A collection of 235 projects is analyzed. The initial focus is on the GSC’s alone, without regard to their use in the VAF. Questions investigated are: how the GSC’s are used overall; how their use varies in different types of projects; and what relationships exist between them. Attention then turns to the adjustment process, to see whether the findings of previous researchers are confirmed in this much larger data set.

Section 2 describes the data set analyzed here. Section 3 looks at the GSC’s individually, describing the distribution of each GSC and noting how the distributions vary for different types of projects. Section 4 investigates relationships between the GSC’s. Section 5 evaluates the VAF. Section 6 presents conclusions.

 

2. Data sample

The projects analyzed here come from the International Software Benchmarking Standards Group ("ISBSG") Repository [9]. This is a public repository of data about completed software projects. The projects cover a wide range of applications, development techniques and tools, implementation languages, and platforms. ISBSG believes that they are representative of better software development projects worldwide.

At the time of this analysis, the repository contained data on 421 projects. After those which were unsuitable for study were excluded (for example, some are measured with Mark II function points [10], which do not use the GSC’s), a sample of 235 projects remained.

The following data was available for each project in the sample: size, measured using the IFPUG standard [1] for function points; the effort expended by the development team; all fourteen GSC values; and project attributes such as application type, development platform, programming language, and date of implementation.

Table 1 summarizes the ranges of function points, effort, and delivery rate.

 

Statistic        Size (FPs)   Effort (Hours)   Project Delivery Rate (Hours/FP)
Minimum                  11               97                                0.7
1st Quartile            140              793                                4.1
Median                  276             2216                                6.8
3rd Quartile            725             4787                               12.8
Maximum               17520           106500                               68.9
Mean                    695             5129                                9.5
St.Dev.                1512            10136                                8.8

Table 1: Characteristics of projects analyzed

 

3. Usage of individual GSCs

Each characteristic is scored on a six-point ordinal scale. A score of zero means "not relevant", and scores of 1 through 5 (5 is high) indicate the importance of the characteristic. IFPUG provides guidelines for determining the scores [1].

Table 2 shows the distribution of scores observed for each characteristic. These distributions give some impression of the nature of the projects. A range of profiles can be seen: mostly high values (eg on-line data entry, data communication); mostly low values (eg distributed data processing); mostly average values (eg complex processing); and a fairly even spread (eg transaction rate).

      Characteristic                   0    1    2    3    4    5
G01   Data communication               3    0    7   34   27   29
G02   Distributed data processing     33    9   24   12   19    3
G03   Performance                      6   15   11   30   26   12
G04   Heavily used configuration       9   23   34   20    5    9
G05   Transaction rate                20   11   13   29   23    5
G06   Online data entry                3    2    8   26    1   60
G07   End user efficiency              6    7   21   23   27   15
G08   Online update                    3    5   10   32   37   13
G09   Complex processing               9   15   20   30   23    2
G10   Reusability                     15   17   27   26   10    4
G11   Ease of installation            20   11   23   24    7   15
G12   Ease of operation               17   11   34   23    8    6
G13   Multiple sites                  33   11   24   16   11    5
G14   Facilitate change               29    7   17   27   10   10

Table 2: Percent of projects with each GSC value

One might expect different characteristics to matter more for some types of software than others. The distributions of GSC values for different types of projects are presented in Tables 5-10 (in the appendix).

Projects are classified by type of organization, business area, application type, project type, programming language level, and development platform. These particular classifications were chosen for two reasons. First, they cover the full range of project attributes, including context, nature, and implementation. Second, in our view the GSC’s are an attempt to explain variations in productivity, and these are the classifications that have been found to be related to productivity in the ISBSG data set [9]. The projects were also classified by date of implementation; the few trends that emerged are noted below.

The tables can be interpreted in two ways. For each characteristic, one can see the types of projects for which it is more or less important (Section 3.1). For each group of projects, one can see which GSC’s matter more and which matter less (Section 3.2). Some interesting trends can be seen from each perspective. These are summarized in Section 3.3.

 

3.1 Variations in the use of each GSC

Each section below characterizes the overall distribution of a single GSC (see Table 2).

The distributions of GSC values shown in Tables 5-10 are then analyzed. The chi-squared test was used to identify statistically significant differences between distributions (at the 0.05 significance level). Those differences are described.
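As an illustration of this comparison, the chi-squared statistic for a contingency table of score counts can be computed directly. The counts below are invented (they are not drawn from the ISBSG data); rows stand for two hypothetical project groups, columns for below-average / average / above-average scores on one GSC.

```python
def chi_squared_statistic(table):
    """Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count under the independence hypothesis
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed_count - expected) ** 2 / expected
    return stat

# Hypothetical counts: projects scoring one GSC below average / average /
# above average, in two groups (e.g. new developments vs enhancements).
observed = [[30, 20, 10],
            [10, 20, 30]]
stat = chi_squared_statistic(observed)
# With (2-1) x (3-1) = 2 degrees of freedom, the 0.05 critical value is 5.99,
# so a statistic this large means the two distributions differ significantly.
print(round(stat, 1), stat > 5.99)  # 20.0 True
```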

G01: Data Communication

Teleprocessing is now almost universal. Only 10% of projects report a "below average" score of 2 or less; 56% have an "above average" score of 4 or 5.

The scores tend to be lower for banking projects, and for projects developed on personal computers. There has been a gradual decline, from above average towards average, from 1991 to 1996.

 

G02: Distributed Data Processing

Of all the GSC’s, this has one of the greatest percentages of "below average" values. The distribution is bimodal: systems tend to be either monolithic, or to have distributed processing as a characteristic of above average importance.

There tends to be more distributed processing in engineering systems. Distributed processing is more common on midrange platforms than on other platforms. It is more common in transaction/production systems and office information systems than in management information systems and decision support systems. It is more important in new developments than in enhancement projects.

G03: Performance

This characteristic has a broad spread: scores are below average for 32% of projects, average for 30%, and above average for 38%.

Performance is more important for transaction/production systems than for management information systems. It is more important for new developments than for enhancement projects.

G04: Heavily Used Configuration

Scores for this characteristic are generally low: 66% below average, 20% average, and 14% above average.

Scores are lower for transaction/production systems and office information systems than for management information systems and decision support systems. They are lower for new developments than for enhancement projects; higher for midrange projects than for other platforms; and higher for engineering systems. Scores increase from 3GL projects to 4GL projects to application generator projects.

G05: Transaction Rate

Scores for transaction rate are spread across the range from 0 to 4; there are few scores of 5.

Transaction rates are more important than usual for banking systems, and less for engineering systems. They are more important on mainframes than on other platforms. Although one might expect them to be more important for transaction/production systems, there are no significant differences between application types. Scores for this characteristic rise steadily from 1991 through 1996.

G06: On-line Data Entry

This characteristic has by far the highest scores of all, and the least variation. Fully 60% of projects score this characteristic at 5, the maximum value possible.

According to IFPUG’s guidelines [1], a score of 5 means that more than 30% of transactions involve interactive data entry. Perhaps 30% is too low a threshold nowadays; a higher value might provide more useful discrimination.

The scores are lower (generally 3’s) for a block of Cobol/mainframe/banking projects from one organization. The score is 5 for just about everything else.

G07: End User Efficiency

This characteristic has a broad spread, with a slight tendency towards higher scores: 34% are below average, and 43% are above average.

End user efficiency is more important for management information systems than for transaction/production systems. The scores are lower, and have a flatter distribution, for new developments than for enhancement projects. Again, scores increase from 3GL projects to 4GL projects to application generator projects.

G08: On-line Update

Scores for on-line update tend to be high (half are above average), but they are mostly 3’s and 4’s rather than 5’s.

Scores tend to be higher for transaction/production systems. They are lower on personal computers than on other platforms. Once again, scores increase from 3GL projects to 4GL projects to application generator projects.

G09: Complex Processing

This characteristic has a normal (bell-shaped) distribution, centered about the average score of 3. There are few scores of 0 or 5.

Scores for complex processing are highest on mainframes and lowest on microcomputers; highest in 3GL projects and lowest in 4GL projects. The scores are higher, and have a flatter distribution, for new developments than for enhancement projects. Processing complexity increases steadily from 1991 through to 1996.

G10: Reusability

The importance of reusability is generally low, with 59% below average and only 14% above average, but it is a very mixed bag.

Reusability is slightly more of a concern for decision support systems than for other types of software. It is less of a concern on personal computers than on mainframes.

G11: Ease of Installation

This characteristic has one of the broadest distributions. Scores are low overall (54% below average, 22% above average), but both extremes are well represented. Ease of installation is of no concern in 20% of projects, and extremely important in 15%.

Scores are higher for enhancement projects than for new developments; higher for mainframe projects than for other platforms; and higher for engineering systems than for other business areas.

G12: Ease of Operation

Ease of operation is rarely much of an issue. Scores for this characteristic are nearly the lowest overall: 62% are below average, and only 14% are above average. Scores are spread from 0-3, with 2 the most common score.

The only significant difference emerges when projects are grouped by application type: scores are higher for management information systems and decision support systems than for transaction/production systems and office information systems.

G13: Multiple Sites

This characteristic has the lowest scores of all: 68% are below average, and 33% have the minimum possible score of 0.

Scores are very low for legal systems, and high for engineering systems. They are higher for new developments than for enhancements or redevelopments; higher for 3GL projects than for others; and higher on midrange computers than on mainframes. Once again, scores are higher for management information systems and decision support systems than for transaction/production systems and office information systems.

 

G14: Facilitate Change

Each score for this characteristic is well represented, but generally scores are low: 53% are below average, and 20% above average. The distribution is bimodal, with the two common scores being 0 (of no concern) and 3 (average concern).

Scores are low for 3GL projects, and high for 4GL projects. They are low for new developments; low for mainframe projects; and high for engineering projects. Not surprisingly, this characteristic is more important for management information systems and decision support systems, and less important for transaction/production systems.

 

3.2 GSC usage for different types of software

In this section the projects are broken down into different groups, on various criteria. For each group of projects, we can see which GSC’s matter more and which matter less.

Type of organization (see Table 5)

Manufacturing projects score low for complexity, distribution, and user focus, and high for data communication. Projects developed by utilities (electricity, gas, water) are lower in complexity, higher in performance constraints and distribution, and place more focus on the user (end user efficiency, ease of installation).

The banking projects are dominated by one organization. Almost all projects from this organization had the same GSC values. They score lower than usual in data communication, on-line data entry, and user focus; and higher in processing complexity.

Business area (see Table 6)

Banking systems score lower than usual for on-line data entry; and higher for transaction rate, processing complexity, and reuse (although reuse is still not very important). Engineering projects feature more distributed processing, and focus more than usual on user efficiency; ease of operation is less important. Legal systems focus on the end user; software developed for multiple sites is rare, and ease of installation is rarely a concern.

Application type (see Table 7)

Management information systems and decision support systems often vary together, as do transaction/production systems and office information systems. Management information systems usually vary in opposition to transaction/production systems.

Management information systems and decision support systems tend to be low in complexity, and high in facilitating change by the user. Transaction/production systems and office information systems have more constraints on performance, and are less concerned with facilitating change by the user. Decision support systems also score higher than usual (although still low overall) on distributed processing.

Project type (see Table 8)

New developments score higher than enhancement projects for complex processing, performance constraints, distributed processing (although this is still low overall), and multiple sites. They have less concern for the user (facilitating change, end user efficiency).

The GSC distributions for new developments tend to be flatter. Those for enhancement projects have sharper peaks.

Programming language (see Table 9)

Third generation languages tend to be used more for software intended for multiple sites. Processing complexity scores are also higher for 3GLs, while scores for on-line update are lower. 4GL projects score highly on facilitating change, and low on processing complexity. Application generator projects score highly on data communication, heavily used configuration, end user efficiency, and on-line update.

Development platform (see Table 10)

Mainframe projects score high for processing complexity and transaction rate, and low for user efficiency, data communication, and on-line update. Midrange projects score high for distribution, multiple sites, heavily used configuration, and facilitating change by the user. Microcomputer projects have low scores for on-line update, processing complexity, and reuse.

Project completion date

From 1991 to 1996, there is a trend down (from more 5’s to more 3’s) for on-line data entry and data communication. There is a trend towards increasing importance for transaction rate, complex processing, and multiple sites.

 

3.3 Discussion

The characteristics with the highest scores relate to data communication and on-line entry. These are almost universal. Redefinition of these characteristics may be advisable if they are not to lose their discriminative value.

Redefinition of data communication and on-line entry may already be happening informally. A downward trend in the scores for these characteristics might reflect changing perceptions, as what was once seen as extreme is now seen as normal. An alternative explanation is that this trend is due to the increasing use of personal computers, where communication appears to be less of an issue.

The characteristics with the lowest scores relate to distributed processing, multiple sites, ease of installation and operation, and reusability. The first two might be expected to increase, as intranets and the Internet become more important.

More demands are being placed on software. Processing complexity and transaction rates are each increasing with time.

Some GSC’s vary in clearly recognizable ways for different types of software. Most trends make sense. For example, performance is important for transaction/production systems; systems developed on personal computers are more likely to be stand-alone systems with little concern for communication and on-line entry; facilitating change by the user is most important for decision support systems.

The fewest patterns are observed for heavily used configuration, end user efficiency, ease of installation, ease of operation, and multiple sites. The lack of pattern could mean that these have the greatest chance of giving useful variation between projects, and hence helping to explain variations in productivity. On the other hand, these tend to have the lowest scores, and so are the least important issues in a project.

Where patterns are observable across different types of project, they are generally most strongly associated with the application type. Application type is associated with consistent variations in the importance of performance, distribution, multiple sites, on-line update, and (especially) facilitating change.

Several patterns can also be seen when projects are grouped by project type. Fewer patterns emerge when projects are grouped by platform, language, or business area.

Dramatic variations can be seen when projects are grouped by type of organization, but there are few consistent patterns or easy explanations. In this data set, most categories are dominated by a few organizations. Differences between organizations, rather than between types of organization, probably cause the variation. An example is the apparent link between ease of installation and organization type. There is no obvious reason for the variations that were observed (for example, why ease of installation should be important in engineering but not in banking). This is likely to be a matter of organizational attitude rather than anything else.

 

3.4 Limitations

All of the differences described above, between GSC distributions for different groups of projects, are statistically significant in this data set. Care is needed nevertheless in interpreting them and drawing conclusions from them, for several reasons.

The projects are classified above on one attribute at a time. Sometimes the effect of several attributes may be confounded. For example, there are two blocks of projects which tend to vary together: one of Cobol/mainframe/banking projects, and one of legal/public administration projects. This can mean that a single underlying pattern is reported several times.

A second concern is that some groups are small. For example, when projects are grouped by business area most groups contain less than 20 projects. Small samples limit the strength of conclusions that can be drawn.

When attempting to draw general conclusions, how representative the projects are is a concern. For example, the group of banking projects appears quite large, with 66 projects. Many come from a single organization though, so the sample may not be representative of banking projects generally. The same comment applies to the legal projects.

 

 

4. Relationships between GSC’s

Examination of Tables 5-10 reveals some relationships between GSC’s. For example, transaction rate and complex processing tend to vary together. To a lesser extent, performance varies with them. These relationships suggest that a smaller number of underlying dimensions might be involved, with some being captured several times over by different GSC’s. To investigate this, factor analysis was used to study the common variation within the GSC’s.

Principal components analysis suggested six factors, as did factor analysis using the maximum likelihood method for estimating factor loadings. When six factors were used, however, no GSC loaded significantly on the sixth (loadings of less than 0.5 were not considered significant). The analysis was therefore performed using five factors.
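The component-extraction step can be illustrated with power iteration, which finds the leading eigenvalue of a correlation matrix. The 3x3 matrix below is hypothetical, standing in for the 14x14 GSC correlation matrix, and the eigenvalue-greater-than-one retention rule shown in the comment is a common convention, not necessarily the criterion used in this study.

```python
import math

def leading_eigenvalue(matrix, iterations=200):
    """Leading eigenvalue of a symmetric matrix, by power iteration."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged vector gives the eigenvalue
    return sum(v[i] * sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n))

# Hypothetical correlations between three characteristics
corr = [[1.0, 0.6, 0.5],
        [0.6, 1.0, 0.4],
        [0.5, 0.4, 1.0]]
ev = leading_eigenvalue(corr)
# Under the common "eigenvalue > 1" rule, this first component is retained.
print(ev > 1.0)  # True
```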

Table 3 shows the factor loadings for each GSC. The highest loading for each GSC is marked with an asterisk, provided its value is at least 0.5.

Between them the five factors account for just over half the total variation in the GSC’s. The first three factors each account for about 13% of the variation. The fourth and fifth factors explain less variation.

Characteristic                Factor 1   Factor 2   Factor 3   Factor 4   Factor 5
Data communication                                     *.54
Distributed data processing     *.73
Performance                      .30        .46        .12        .37
Heavily used configuration      *.60        .22        .48        .16
Transaction rate                 .19       *.70        .21
Online data entry               -.14                  *.82
End user efficiency              .13                  *.58                   .45
Online update                               .49       *.50        .17
Complex processing               .22       *.77       -.19
Reusability                      .47        .17        .15        .17        .24
Ease of installation             .39        .20                  *.58        .11
Ease of operation                .26                              .49        .24
Multiple sites                  *.60        .23        .14
Facilitate change                .12        .25        .19                  *.68
Variation explained            13.1 %     12.5 %     12.5 %      7.9 %      6.1 %

Table 3: Loadings of GSC’s on five factors

 

Several GSC’s load on each of the first three factors:

  1. distributed data processing, multiple sites, heavily used configuration.
  2. transaction rate, complex processing.
  3. data communication, on-line entry, on-line update, end user efficiency.

Each of these factors has a common theme: distribution, the difficulty of the problem to be solved, and interactivity respectively.

Three characteristics do not load significantly on any factor. Performance loads most strongly on factor two, strengthening that factor’s interpretation as one of problem difficulty. Ease of operation loads most strongly on factor four (ease of installation), suggesting ease of use as an interpretation for that factor. Reusability loads most strongly on factor one; this is difficult to interpret.

On-line update loads almost equally on factors two (problem difficulty) and three (interactivity). This makes sense, since high scores for this factor represent not just that most updates are performed on-line, but that complex processing is involved (eg for backups).

Separate factor analyses were performed for each of the different project groupings reported in Section 3. Details are not given here, because the results are unchanged. Although the order of importance of the factors varies from group to group, the same GSC’s always load on the same factors.

 

 

5. The Value Adjustment Factor

Sections 3 and 4 looked at the GSC’s in isolation, without regard to their purpose within function point analysis. Now we turn our attention to that purpose.

The GSC’s are used to construct a value adjustment factor ("VAF"). The result is a number from 0.65 to 1.35. It is used as a multiplier to adjust the basic function point count.

VAF = 0.65 + 0.01 x (sum of GSC’s)
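The formula above can be sketched directly. This is a minimal rendering of the IFPUG computation; the example score profiles are invented, not taken from the data set.

```python
def value_adjustment_factor(gsc_scores):
    """Compute the VAF from fourteen GSC scores (each on the 0-5 scale)."""
    assert len(gsc_scores) == 14
    assert all(0 <= s <= 5 for s in gsc_scores)
    total_degree_of_influence = sum(gsc_scores)   # ranges from 0 to 70
    return 0.65 + 0.01 * total_degree_of_influence

# All zeros gives the minimum VAF of 0.65; all fives gives the maximum 1.35,
# so the adjustment is limited to plus or minus 35%.
print(round(value_adjustment_factor([0] * 14), 2))  # 0.65
print(round(value_adjustment_factor([5] * 14), 2))  # 1.35
print(round(value_adjustment_factor([3] * 14), 2))  # 1.07
```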

In our view, the adjustment process in FPA is an attempt to explain variations in productivity - in other words, to improve the relationship between function points and project effort. In this section we investigate the relationships between function points and project effort, before and after adjustment with the VAF, to see whether the VAF serves this purpose.

 

5.1 Distribution of VAF

Table 4 summarizes the range of value adjustment factors, and Figure 1 shows their distribution.

Statistic       VAF
Minimum        0.73
1st Quartile   0.96
Median         1.00
3rd Quartile   1.07
Maximum        1.29
Mean           1.01
St.Dev.        0.10

Table 4: Summary of VAF

 

Figure 1: Distribution of VAF

The distribution is skewed slightly to the right, and peaks sharply at 1.00. Tests for normality indicate that the distribution is normal.

The median is exactly 1.00, and the mean 1.01.

The maximum VAF represents an increase of 29% in the function point count; the minimum represents a decrease of 27%. Since the variation is limited to plus or minus 35%, the extremes observed here approach the limits of possibility. Few adjustments are this great; half involve an adjustment of only about 5%.

Although the overall range of variation is quite large, for most projects the VAF does not result in much change to the function point value.

 

 

5.2 Value of VAF for effort prediction

To see if the adjustment process is effective for effort prediction, the strength of the relationship between function points and effort was investigated, using both adjusted and unadjusted function points as the independent variable.

Regression models were derived that relate each function point count to effort. The sample was divided randomly into a set of 157 projects (two thirds of them) from which the models were derived, and a set of 78 projects with which the models were evaluated. The accuracy of the models was assessed using R2 (which indicates the amount of variation in effort explained by function points), mean magnitude of relative error ("MMRE"), and the proportion of estimates that were within 25% of the actual effort ("Pred(.25)").
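These accuracy criteria can be sketched as follows; the effort figures here are invented, purely to show how MMRE and Pred(.25) are computed.

```python
def mmre(actuals, estimates):
    """Mean magnitude of relative error: mean of |actual - estimate| / actual."""
    return sum(abs(a - e) / a for a, e in zip(actuals, estimates)) / len(actuals)

def pred(actuals, estimates, level=0.25):
    """Fraction of estimates whose relative error is within the given level."""
    within = sum(1 for a, e in zip(actuals, estimates) if abs(a - e) / a <= level)
    return within / len(actuals)

actual_effort    = [1000, 2000, 4000, 800]   # hours (hypothetical)
estimated_effort = [1100, 3000, 3500, 2000]
print(round(mmre(actual_effort, estimated_effort), 3))   # 0.556
print(pred(actual_effort, estimated_effort))             # 0.5
```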

When adjusted or unadjusted function points were plotted against effort, the graphs were fan-shaped and dominated by small values for function points and effort. They were not suitable for linear regression. Taking the logarithms of both dependent and independent variable gave values much better suited to linear regression. The regressions were accordingly done on the transformed values. The residuals were normally distributed, and showed no pattern and no heteroscedasticity.
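The fitting procedure described above amounts to ordinary least squares on log-transformed values, which back-transforms to a power law Effort = a x FP^b. The sketch below uses invented data points, so its fitted coefficients are illustrative only and are not the paper's.

```python
import math

def fit_power_law(sizes, efforts):
    """OLS on log(effort) vs log(size); returns (a, b) for Effort = a * FP**b."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return math.exp(intercept), slope   # back-transform the intercept

fp_sizes = [100, 200, 400, 800]          # hypothetical function point counts
efforts  = [900, 1600, 3000, 5200]       # hypothetical effort in hours
a, b = fit_power_law(fp_sizes, efforts)
print(round(a, 1), round(b, 2))
```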

The resulting equation was: Effort = 17.2 × UFP^0.863

With this equation, the correlation between estimated and actual effort was 0.76. R2 was thus 0.58, meaning that unadjusted function points explained 58% of the variation in effort. MMRE was 1.28, meaning an average relative error of 128%. The largest relative error of all was 920%. 21% of estimates were within 25% of the actual values.

The equation derived using adjusted function points was: Effort = 21.0 × AFP^0.826

With this equation, the correlation between estimated and actual effort was 0.77, giving a value of 0.59 for R2. MMRE was 1.25, the maximum relative error was again 920%, and 17% of estimates were within 25% of the actual values.

The R2 values of 58-59% are comparable to Kitchenham’s 50-54% [3], Kemerer’s 55% [6], and Jeffery’s 58% [8].

Using adjusted function points instead of unadjusted function points as a predictor of effort makes almost no difference in any of R2, MMRE, or Pred(.25). Pred(.25) even gets slightly worse.

In 108 of the 235 projects, VAF points in the right direction for correcting estimation errors (eg effort is underestimated from unadjusted function points, but VAF is greater than 1.00 and so indicates that an upwards adjustment is needed). But in 73 projects VAF points in the wrong direction, and in 54 projects VAF is 1.00 and provides no clue. Thus one cannot even use VAF as a simple indicator of whether effort will be over-estimated or under-estimated from unadjusted function points.
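The direction check described above can be expressed as a small predicate. This is an illustrative sketch, with invented effort and VAF values rather than the paper's projects.

```python
def vaf_direction_ok(actual_effort, estimated_effort, vaf):
    """Does the VAF point the same way as the estimation error from
    unadjusted function points? Returns None when VAF is exactly 1.00,
    which provides no clue either way."""
    if vaf == 1.00:
        return None
    underestimated = estimated_effort < actual_effort
    # An underestimate needs an upward adjustment, i.e. VAF > 1.00
    return underestimated == (vaf > 1.00)

print(vaf_direction_ok(2000, 1500, 1.10))  # under-estimate, VAF points up: True
print(vaf_direction_ok(2000, 2500, 1.10))  # over-estimate, VAF points up: False
print(vaf_direction_ok(2000, 1500, 1.00))  # VAF of exactly 1.00: None
```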

In this data set, the VAF provides no assistance for estimating effort. This confirms the findings of other researchers [3,6,7,8]. We have studied a larger sample of projects, and used a wider set of criteria to assess the relationship, and still obtained the same results. This strengthens the statistical evidence that the adjustment process in function point analysis is not effective for effort prediction.

 

 

 

6. Conclusions

A collection of 235 projects has been analyzed, to see how the function point GSC’s are used in practice.

The first part of the analysis was largely descriptive. The GSC values were studied in isolation, without regard to their purpose within function point analysis.

The individual GSC’s may be viewed as categorical information about projects. Interesting trends can be seen when GSC values are compared for different collections of projects. Some GSC’s are more important than others for different types of projects.

Some GSC’s are defined in terms more suited to 1979-1984 than to today’s on-line world. They have less discriminative value now, and should perhaps be redefined.

Factor analysis indicates that the GSC’s reflect a smaller number of underlying dimensions: distribution, problem difficulty, interactivity, ease of use, and focus on the end user. Most of the fourteen GSC’s can be assigned to these factors in obvious ways. One characteristic (on-line update) appears to combine two issues, and should perhaps be redefined to separate the two concerns.

The last part of the analysis evaluated the VAF. The VAF was found not to improve the relationship between function points and effort. This confirms the findings of other researchers.

It seems clear that the value adjustment factor should not be used. Theoretical objections to how it is constructed might be overlooked if it had some practical benefit, but the evidence is that it does not.

Recording the values of the individual GSC’s may still be useful, both for historical comparisons and for understanding differences between projects.

Two avenues for future work are clear. A new formulation is needed if the GSC values are to be used in some way to improve the explanation of project effort. Some of the GSC’s themselves could profitably be redefined; the five underlying factors identified here (distribution, problem difficulty, interactivity, ease of use, user focus) could provide a starting point for doing so.

 

Acknowledgements

I am grateful to Alain Abran for several valuable discussions early in this research, and for his comments on this paper.

 

References

[1] IFPUG. Function Point Counting Practices Manual, Release 4.0, International Function Point Users Group, Westerville Ohio (1994).

[2] N.E. Fenton, Software Metrics: a Rigorous Approach, Chapman and Hall (1991).

[3] B.A. Kitchenham, Empirical studies of assumptions that underlie software cost-estimation models, Information and Software Technology, Vol 34 no 4, 211-218 (1992).

[4] C.R. Symons, Function Point Analysis: Difficulties and improvements, IEEE Trans. Soft. Eng., Vol 14 no 1, 2-11 (1988).

[5] J.-M. Desharnais, Adjustment model for Function Points scope factors - a statistical study, Proc. IFPUG Spring Conference (1990).

[6] C.F. Kemerer, An empirical validation of software cost estimation models, Communications of the ACM, Vol 30 no 5, 416-429 (1987).

[7] A. Abran, Analysis of the Measurement Process of Function Point Analysis, PhD thesis, École Polytechnique de Montréal (1994).

[8] D.R. Jeffery and J. Stathis, Function point sizing: Structure, validity and applicability, Journal of Empirical Software Engineering, Vol 1 no 1, 11-30 (1996).

[9] International Software Benchmarking Standards Group, Worldwide Software Development: The Benchmark, Release 4 (1997).

[10] C.R. Symons, Software Sizing and Estimating: Mk II FPA, Wiley (1991).

 

 

Appendix: Tables

                            Manufacturing (17 projects)  Utilities (17 projects)

Characteristic                0   1   2   3   4   5        0   1   2   3   4   5
Data communication            6   0   0  18  18  59        6   0   6   0  76  12
Distributed data processing  53  18   0   0  12  18       47   6   0   6  41   0
Performance                   0  35  29  24   0  12       12  12   6   6  24  41
Heavily used configuration   35  18  35   6   0   6       18   0  29   0   0  53
Transaction rate             41  35   6  12   0   6       59  18   6   6  12   0
Online data entry             6   0   0  12   0  82        0   0   6   6   0  88
End user efficiency          18  12  12  24  12  24        0   0  24  12  59   6
Online update                 6  18   6  24  29  18        0   0   6  41  53   0
Complex processing           24  29  18  29   0   0       18   6  47  29   0   0
Reusability                  47  18  12  18   0   6        6  24  41  18   6   6
Ease of installation         35  24  29   0   6   6       29  18   6   0   0  47
Ease of operation            35  24  18   6  18   0       18  41   0  24  18   0
Multiple sites               41  35   6   0   6  12       18   6  12  12  53   0
Facilitate change            47  12  18  12   0  12       29   0  41   6  12  12

                            Public admin (36 projects)   Fin/Prop/Business (33 projects)

Characteristic                0   1   2   3   4   5        0   1   2   3   4   5
Data communication            6   0   0  33  25  36        3   3  12  18  45  18
Distributed data processing  47   6   8  28   8   3       24  12  12   9  42   0
Performance                   3  31   8  39  11   8        6  21   6  48  12   6
Heavily used configuration    6  17  36  36   3   3        6  21  27  27  18   0
Transaction rate             22   6   8  50   8   6       18  12  15  27  24   3
Online data entry             0   3   0  33   0  64        9   0   0   9   3  79
End user efficiency           3  19   8  28  33   8       18   3   9  12  52   6
Online update                 6   0  11  33  22  28        9  12   6  15  45  12
Complex processing            8  11  22  50   3   6        9  21  27  24  18   0
Reusability                   6  28  25  42   0   0        9  33   6  15  27   9
Ease of installation         50   3   6  33   6   3       24  12   3  33   3  24
Ease of operation            22  19  22  25   8   3       21   3  36  18   6  15
Multiple sites               50   8   6  25   6   6       64  15   9   0  12   0
Facilitate change            11  14   8  47   6  14       24  12  15  30  12   6

                            Banking (39 projects)        Other (65 projects)

Characteristic                0   1   2   3   4   5        0   1   2   3   4   5
Data communication            0   0  10  79   5   2        0  11  17  26  45   0
Distributed data processing   5   3  87   0   3   3       45  14  11   9  20   2
Performance                   5   3  10   3  79   0        8  11  14  28  26  14
Heavily used configuration    0  79  15   3   0   3       11  12  46  18   3   9
Transaction rate              5   0   5  10  79   0       18  15  23  26  12   5
Online data entry             0   0   8  79   0  13        3   5   8   6   2  77
End user efficiency           0   3  85   8   0   5        5   6   5  23  31  31
Online update                 0   3   5   8  85   0        0   6   9  48  25  12
Complex processing            0   0   5  13  79   3       11  26  25  17  20   2
Reusability                   5   3  79  13   0   0       28  12   8  34  15   3
Ease of installation          8   3  79  10   0   0        9  18  22  18  15  17
Ease of operation             5   0  85   8   3   0       15  12  25  32   8   8
Multiple sites                3   3  85   8   3   0       37  14  18  14  12   5
Facilitate change            85   0   3  13   0   0       14   6  17  29  20  14

Table 5: Percent of projects with each GSC value, by organization type

 

 

                            Financial (21 projects)      Personnel (11 projects)

Characteristic                0   1   2   3   4   5        0   1   2   3   4   5
Data communication            5   0   5  29  29  33        0   0   9  45   9  36
Distributed data processing  38  24  10  10  14   5       55  27   0   0   0  18
Performance                   0  10  24  29  24  14        9  27   9  27  18   9
Heavily used configuration   10  14  52  10  10   5        9   0  82   0   0   9
Transaction rate             14  24  10  19  24  10       45   9  27   0   9   9
Online data entry             0   0   5  19   0  76        0   0   9   0   0  91
End user efficiency           5   0   5  24  52  14        9   9   9  18  27  27
Online update                 0  10  10  24  52   5       18   9  18  18  18  18
Complex processing            5   5  19  48  19   5        9  18  27  36   9   0
Reusability                  29  19   5  33  10   5       18  18   9  36   9   9
Ease of installation         19  29  24  14  10   5       18  27  18   9   9  18
Ease of operation            14  10  24  24  24   5       18  36  27  18   0   0
Multiple sites               33  24  14   5  24   0       36  27   9   0   9  18
Facilitate change            24  10  10  29   5  24       18  18  36   9   0  18

                            Manufacturing (18 projects)  Banking (66 projects)

Characteristic                0   1   2   3   4   5        0   1   2   3   4   5
Data communication            6   0   0   6  28  61        0   2  12  50  24  12
Distributed data processing  50   6   0  17  17  11       12   3  56   5  23   2
Performance                   0  11  28  22  11  28        6   9   8  24  48   5
Heavily used configuration   22  17  28  17   0  17        3  52  23  15   6   2
Transaction rate             28  17  11  28  17   0        9   3  11  20  56   2
Online data entry            17   6   6   0   0  72        5   0   5  50   2  39
End user efficiency          28   6   6  17  22  22        9   2  53   9  21   6
Online update                 6  11   6  33  22  22        5   6   5  12  65   8
Complex processing           22  17  28  11  22   0        5   9  17  18  50   2
Reusability                  33  22  11  11  17   6        6  15  48  14  12   5
Ease of installation         33  11  17  11  11  17       12   5  47  23   2  12
Ease of operation            33   0  33  17   6  11       12   2  62  12   3   9
Multiple sites               56  11   0   6  17  11       30   6  52   6   6   0
Facilitate change            44  17  22  11   0   6       59   5   9  21   5   2

                            Legal (13 projects)          Engineering (16 projects)

Characteristic                0   1   2   3   4   5        0   1   2   3   4   5
Data communication            0   0   0  23  23  54       12   0   0  25  62   0
Distributed data processing  77   0   8  15   0   0       12   6  12   0  69   0
Performance                   0  15   0  62  23   0        6   0  12  25  12  44
Heavily used configuration    0  38  31  31   0   0        0  25   6  25   0  44
Transaction rate             23   0   0  69   8   0       56  25   6  12   0   0
Online data entry             0   0   0  23   0  77        0   0   0  12   0  88
End user efficiency           0   0   8  23  62   8        0   0   0  12  62  25
Online update                 0   0   8  23  46  23        0   0  12  38  50   0
Complex processing            0  23  15  54   8   0       12   6  62   6  12   0
Reusability                   0  15  15  69   0   0        6  19  44  19  12   0
Ease of installation         77   0   8  15   0   0       19   6   6  19   6  44
Ease of operation             8   0  46  38   0   8       38  44   6  12   0   0
Multiple sites               77   0   8  15   0   0        0  19  12  12  56   0
Facilitate change             8  23   8  23   0  38       12   0  62   6  19   0

Table 6: Percent of projects with each GSC value, by business area

 

 

 

 

                            Decision Support             Management information
                            (11 projects)                (47 projects)

Characteristic                0   1   2   3   4   5        0   1   2   3   4   5
Data communication            0   0  18  64   9   9        2   2   9  38  28  21
Distributed data processing  27   0   9  45  18   0       40  17   6  26   6   4
Performance                   9   0   9  64  18   0        6  34   9  17  23  11
Heavily used configuration    9   0  27  55   0   9       11  13  28  34   4  11
Transaction rate             18  18   9  36  18   0       28  13  15  30   9   6
Online data entry             0   0  18  27   0  55        2   0   2  32   2  62
End user efficiency           9   0   9  45   9  27        2  21   9  19  23  26
Online update                18   0  18  64   0   0        2   4   6  57  19  11
Complex processing            0   9  36  45   9   0        6   9  23  43  17   2
Reusability                  18   9   0  45   9  18       15  17  21  34   6   6
Ease of installation          9   9  27  36   9   9       13  19  15  36   6  11
Ease of operation            27   0   9  45   9   9       11  30  15  26  15   4
Multiple sites               18   0   0  55  18   9       32   4  17  30   6  11
Facilitate change             0   9   9  45  27   9       11   0  11  45  15  19

 

                            Office Information           Transaction/production
                            (23 projects)                (74 projects)

Characteristic                0   1   2   3   4   5        0   1   2   3   4   5
Data communication            0   0   0  39  35  26        1   0   7  39  31  22
Distributed data processing  17   4  39   9  22   9       22   5  39   7  26   1
Performance                   4  13   0  52  26   4        5   4   9  22  38  22
Heavily used configuration    9  39  26  22   0   4        3  34  35   9   4  15
Transaction rate             13   9   4  30  35   9       20   4  12  20  41   3
Online data entry             0   0  17  26   0  57        3   3  11  32   1  50
End user efficiency           4   4  26  43   9  13        4   3  36  18  30   9
Online update                 4   4  17  39  30   4        0   3  11  12  59  15
Complex processing           17  13  13  26  26   4        4   7  27  20  39   3
Reusability                  17   4  43  17  17   0        8  14  45  22  11   1
Ease of installation         26   0  30  35   9   0       15   7  35  19   4  20
Ease of operation            22   9  61   4   4   0       11  11  50  12   9   7
Multiple sites               13  22  35  22   4   4       26   9  34   8  19   4
Facilitate change            35   0  26  35   4   0       46   9  22  16   4   3

Table 7: Percent of projects with each GSC value, by application type

 

 

New development (142 projects)

Characteristic                0   1   2   3   4   5
Data communication            2   0   8  39  24  27
Distributed data processing  29   8  33   6  20   4
Performance                   7  15   5  27  33  13
Heavily used configuration    9  31  32  14   6   8
Transaction rate             21  13  13  21  28   4
Online data entry             4   2   8  28   2  56
End user efficiency           8   3  28  21  20  20
Online update                 2   7  11  31  37  13
Complex processing            9  15  20  22  33   1
Reusability                  17  15  32  25   8   4
Ease of installation         15  13  33  16   8  15
Ease of operation            17  10  37  20  10   7
Multiple sites               25  11  35  13  12   4
Facilitate change            37   7  19  18  11   7

Enhancement (80 projects)

Characteristic                0   1   2   3   4   5
Data communication            4   1   6  29  26  34
Distributed data processing  41   9  11  21  16   1
Performance                   2  15  21  38  16   8
Heavily used configuration    9  12  38  32   5   4
Transaction rate             12   6  15  44  18   5
Online data entry             2   0  10  24   0  64
End user efficiency           2  15  11  26  38   8
Online update                 4   2  10  34  39  11
Complex processing            8  15  20  46   9   2
Reusability                  14  19  18  30  15   5
Ease of installation         28   9   6  41   4  12
Ease of operation            15  11  32  31   4   6
Multiple sites               45  10  10  24   6   5
Facilitate change            18   6  12  42   6  15

Redevelopment (13 projects)

Characteristic                0   1   2   3   4   5
Data communication            8   0   0   8  62  23
Distributed data processing  31   8   8  15  31   8
Performance                   8  15   8  15  15  38
Heavily used configuration   15   8  31   8   0  38
Transaction rate             46  15   0  15   8  15
Online data entry             0   8   0  15   0  77
End user efficiency           8   0   8  31  38  15
Online update                 8   0   8  31  31  23
Complex processing           15  15  31  23   0  15
Reusability                   8  31  31  23   0   8
Ease of installation         38   8  15   0   8  31
Ease of operation            31  31  15  15   8   0
Multiple sites               38  15   0   0  31  15
Facilitate change            15   8  31  23  15   8

Table 8: Percent of projects with each GSC value, by development type

 

 

3GL (94 projects)

Characteristic                0   1   2   3   4   5
Data communication            5   0   6  37  27  24
Distributed data processing  32   6  36   7  15   3
Performance                   6  13  10  29  33  10
Heavily used configuration   11  31  35  11  10   3
Transaction rate             19  10  12  24  30   5
Online data entry             4   2  12  31   2  49
End user efficiency          11   5  33  22  18  11
Online update                 3  10  14  26  39   9
Complex processing           14  12  13  32  29   1
Reusability                  17  21  31  16  10   5
Ease of installation         24  11  27  18   7  13
Ease of operation            21   4  39  22   6   6
Multiple sites               27  15  36  11   9   3
Facilitate change            49   7  17  13   9   5

4GL (95 projects)

Characteristic                0   1   2   3   4   5
Data communication            2   1   3  33  33  28
Distributed data processing  34  12  14  15  24   2
Performance                   7  20   8  31  19  15
Heavily used configuration    6  18  33  33   1   9
Transaction rate             24  11   9  32  21   3
Online data entry             2   2   3  24   1  67
End user efficiency           4   7  13  21  42  13
Online update                 3   1   5  37  43  11
Complex processing            7  15  31  28  17   2
Reusability                   8  16  25  37  11   3
Ease of installation         18  11  18  32   5  17
Ease of operation            15  21  31  21   6   6
Multiple sites               42   6  16  18  14   4
Facilitate change            14   6  17  39   9  15

Application generator (37 projects)

Characteristic                0   1   2   3   4   5
Data communication            0   0  19  22  19  41
Distributed data processing  35   8  22   8  22   5
Performance                   0   5  19  27  32  16
Heavily used configuration   16  16  35   5   5  22
Transaction rate             14  11  24  27  16   8
Online data entry             3   0  11  11   0  76
End user efficiency           0   8  14  24  19  35
Online update                 3   5  11  30  19  32
Complex processing            3  24  16  24  27   5
Reusability                  24  14  22  22  14   5
Ease of installation         16  14  27  14  11  19
Ease of operation            16   8  27  27  14   8
Multiple sites               32   5  16  19  14  14
Facilitate change            24   8  22  22  16   8

Table 9: Percent of projects with each GSC value, by language type

 

Mainframe (164 projects)

Characteristic                0   1   2   3   4   5
Data communication            1   1   5  41  21  31
Distributed data processing  36   7  29  13  13   2
Performance                   4  16   9  32  29   9
Heavily used configuration    9  24  36  22   4   4
Transaction rate             14  10  10  34  28   4
Online data entry             4   1   9  32   1  53
End user efficiency           8   8  25  23  24  12
Online update                 4   5   9  29  41  12
Complex processing            7  15  16  35  26   1
Reusability                  12  18  30  37  10   2
Ease of installation         21   9  24  29   5  12
Ease of operation            16  10  39  21   6   7
Multiple sites               35   9  26  19   6   5
Facilitate change            36   8  15  28   4  10

Midrange (39 projects)

Characteristic                0   1   2   3   4   5
Data communication            3   0   3  13  51  31
Distributed data processing  21  13  10   5  46   5
Performance                   5   8  13  26  21  28
Heavily used configuration    8   8  33  13  10  28
Transaction rate             38   5  21  13  15   8
Online data entry             0   5   5   8   3  79
End user efficiency           0   5  15  23  38  18
Online update                 0   0   8  36  44  13
Complex processing            3  18  44  18  13   5
Reusability                  18  10  26  26  13   8
Ease of installation         10  15  18  18  10  28
Ease of operation            13  23  18  36  10   0
Multiple sites               28   8  21  13  28   3
Facilitate change             5   5  28  23  28  10

Microcomputer (30 projects)

Characteristic                0   1   2   3   4   5
Data communication           17   0  20  23  27  13
Distributed data processing  37  13  17  17  13   3
Performance                  13  17  13  27  23   7
Heavily used configuration   13  40  20  20   3   3
Transaction rate             27  27  13  20  10   3
Online data entry             3   0   3  17   3  73
End user efficiency           3   3   7  27  30  30
Online update                 0  10  20  43  10  17
Complex processing           27  17  10  20  23   3
Reusability                  33  20   7  23  10   7
Ease of installation         30  17  20   3  13  17
Ease of operation            27   7  23  20  13  10
Multiple sites               27  20  23   7  13  10
Facilitate change            20   3  20  27  20  10

Table 10: Percent of projects with each GSC value, by platform type