A site devoted mostly to everything related to Information Technology under the sun - among other things.

Thursday, June 11, 2026

Aggregate Code Density Metric - A Proposal

There are many existing metrics for quantifying a body of computer code, such as Cyclomatic Complexity and Lines of Executable Code (LOC).

Here I am suggesting that it could be useful to define a new Code Metric, the Aggregate Code Density - defined as:

        Aggregate Code Density = log(Total Lines of Code) / Total Number of Files

This quantity is to be computed for a component, package, JAR/WAR file, system, or project with multiple subfolders and source code files. The logarithm is used to keep the metric from growing very large, since a file can contain thousands, perhaps several tens of thousands, of lines of code.

By the phrase "Lines of Code" I mean to include the declarative statements in configuration files such as .xml, .yml, .yaml, .json, .properties, .env, etc. In this manner, changes to the system's configuration could also be measured and tracked.
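As a rough illustration of the definition above, here is a minimal Python sketch that walks a project folder and computes the metric. Several things in it are my assumptions and not part of the proposal: the function names, the list of source extensions (.py, .java), counting only non-blank lines as "lines of code", using the natural logarithm, and returning 0.0 for a project with nothing to count.

```python
import math
import os

# Configuration extensions come from the text above; the source extensions
# (.py, .java) are only illustrative assumptions.
COUNTED_EXTENSIONS = {
    ".py", ".java",
    ".xml", ".yml", ".yaml", ".json", ".properties", ".env",
}


def is_counted(filename):
    """Return True if the file should contribute to the metric."""
    # os.path.splitext(".env") yields no extension, so a bare ".env" file
    # has to be matched by name.
    if filename == ".env":
        return True
    extension = os.path.splitext(filename)[1].lower()
    return extension in COUNTED_EXTENSIONS


def count_lines(path):
    """Count non-blank lines in a file (an assumed definition of a line of code)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


def aggregate_code_density(root):
    """Compute log(total lines of code) / total number of files under root."""
    total_lines = 0
    total_files = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if is_counted(filename):
                total_lines += count_lines(os.path.join(dirpath, filename))
                total_files += 1
    if total_files == 0 or total_lines == 0:
        # Assumption: the metric is undefined here, so report 0.0.
        return 0.0
    # Assumption: natural logarithm; the proposal does not fix the base.
    return math.log(total_lines) / total_files
```

Calling `aggregate_code_density(".")` from the top of a checkout would give one number for the whole project. Recording that number at each release is one possible way to track it over time.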

If more code is developed but the number of files is not increased, then Aggregate Code Density increases - and both Cyclomatic Complexity and LOC will increase with it.

If the code added to the system is placed in new files, the increase in Aggregate Code Density could be smaller, and its slope as a function of time would be gentler.
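To make the two cases above concrete, here is a small worked comparison in Python. The figures (10,000 lines in 50 files, then 2,000 more lines) are invented purely for illustration. The function name and the choice of natural logarithm are my assumptions, and the formula is taken literally as written.

```python
import math


def density(total_lines, total_files):
    """Literal reading of the formula: log(total lines) / total files."""
    # Assumption: natural logarithm; the proposal does not fix the base.
    return math.log(total_lines) / total_files


# Hypothetical starting point: 10,000 lines spread over 50 files.
baseline = density(10_000, 50)

# Case 1: 2,000 new lines are added to the existing 50 files.
same_files = density(12_000, 50)

# Case 2: the same 2,000 lines are added in 10 new files.
new_files = density(12_000, 60)

print(f"baseline:            {baseline:.4f}")
print(f"added to old files:  {same_files:.4f}")
print(f"added in new files:  {new_files:.4f}")
```

With these made-up numbers the metric rises when the code goes into existing files. When the same code is spread over new files, the value ends up below the baseline, so under this literal reading the file count has a strong influence on the result.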

If the system is refactored, Aggregate Code Density, by necessity, must go down, as both the number of lines of code and the number of files should have decreased due to the Refactoring process. If the Aggregate Code Density has remained the same or has increased, then I imagine that the Refactoring effort was not successful!

If procedural code in Python (let us say) is Refactored into Object-Oriented code, then Aggregate Code Density would go down: even though the number of files could increase, the number of repetitive sections of code would decrease.
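A before-and-after check of this kind could be sketched as follows. The `Snapshot` class, the `density_change` helper, and the line and file counts are all hypothetical, chosen only to mirror the procedural-to-Object-Oriented case described above. The natural logarithm is again an assumption.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Line and file totals for a system at one point in time."""
    total_lines: int
    total_files: int

    def density(self) -> float:
        # Assumption: natural logarithm; the proposal does not fix the base.
        return math.log(self.total_lines) / self.total_files


def density_change(before: Snapshot, after: Snapshot) -> float:
    """Return the change in the metric; a negative value means it went down."""
    return after.density() - before.density()


# Hypothetical figures: repetitive procedural code in 5 files is refactored
# into 8 class files with fewer lines overall.
procedural = Snapshot(total_lines=2_000, total_files=5)
object_oriented = Snapshot(total_lines=1_500, total_files=8)

change = density_change(procedural, object_oriented)
print("density went down" if change < 0 else "density did not go down")
```

For these sample figures the change is negative, which matches the expectation stated above. Other combinations of line and file counts may behave differently, so the check is worth running on real measurements.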

LOC gives us some of the same insights, both at the file level and in the aggregate. This new metric is slightly different in that it looks only at the entire system, and it covers the system's configurations as well as its code. This is an advantage, since deploying a system into Cloud environments such as Kubernetes or Pivotal Cloud Foundry adds significant complexity to the system and to the cost of its development and maintenance.



About Me

I had been a senior software developer working for HP and GM. I am interested in intelligent and scientific computing. I am passionate about computers as enablers for human imagination. The contents of this site are not in any way, shape, or form endorsed, approved, or otherwise authorized by HP, its subsidiaries, or its officers and shareholders.
