
Big Code: The Ultimate Challenge of Software Engineering (Part 1)

Big Code was born of Big Data, big applications, and big communities. Both Big Data and Big Code were born of complexity, and complexity can't be fully eliminated by any technology.


How do you imagine the future of software engineering? Some dream of a future in which code is written directly from your mind, in which there is no programming at all because code is written by AI, or in which everyone is a software engineer and programming is a routine activity. But full-fledged mind scanning, artificial intelligence, and universal programming literacy (if these are even possible) won't become reality anytime soon. On the other hand, we cannot say everyone is an accountant just because they have money and keep a budget. Accounting is more than that, and such a statement depends on the level of complexity we consider.

Complexity is decisive for everything we do, especially today, when it is rising in all areas of computing. This is both a challenge and an imperative. Data and code became big because of Internet-connected sites, data stores, and applications. Applications became big because they must anticipate every possible use case for every possible user. Communication became big because of the white noise generated every minute worldwide. The development community became big, too, because development teams are constantly changing and are dispersed worldwide.

Code became big because of all of these factors, and a number of typical questions arise about it.

Can these questions be addressed with extremely efficient programming paradigms? With exciting programming languages? With good architecture? With appropriate code conventions? With a respected development culture? Only partially: Big Code problems are only partly a matter of how we write code and of imperfect tooling. Moreover, most paradigms, languages, technologies, and techniques focus on code consistency, whereas Big Code problems are about code completeness (the volume and complexity of code logic). It does not matter which paradigm or language you choose: code can always grow beyond intelligible limits, and at some point you simply won't be able to hold the complete mechanics of your application in your head.

Big Code problems also partly stem from the nature of cognition. They arise from our inability to grasp more than we can, because at a certain level of volume and complexity, information becomes vast, vague, uncertain, inconsistent, and fallible. Big Data and Big Code were both born of the numerous aspects and use cases that a specific algorithm must consider. Both are born of complexity, and complexity cannot be fully eliminated by any technology. We can't just ignore some factors or use cases without losing accuracy. We can break code into smaller parts, but they must be integrated somewhere. We can push complexity to other layers, but we can't avoid it.

Can the above-mentioned questions be handled in principle? Evidently, yes, because we already address all of them somehow at some point. Do we need an enhanced (automated) solution for them? Yes, because our minds cannot cover information beyond a certain volume. Why are these questions raised at all? Because code can be quite cryptic (with hard-to-decode abbreviations and "obvious" parts). Because information may be scattered across tools and emails, or not be written down at all. Because good conventions and development culture are never followed fully (for reasons ranging from lack of time to negligence). Because what should stay synchronized becomes more and more desynchronized. Is this the problem of one "bad guy"? No — this is the problem of the entire team. Blaming a "bad guy" may bring you satisfaction, but it does not resolve the problem. Just compare this with projects that have automated some aspect of software engineering (such as unit testing or continuous integration): there, you don't have to preach good coding practices, because they are enforced explicitly by the corresponding technologies.
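The automation point can be sketched with a minimal example: a hypothetical business rule whose convention ("validate inputs") is enforced by a test suite rather than by reminders in code review. The function and test names here are invented for illustration, not taken from the article.

```python
import unittest

def apply_discount(price, pct):
    """Apply a percentage discount; reject out-of-range percentages."""
    if not 0 <= pct <= 100:
        raise ValueError("discount percentage out of range")
    return price * (1 - pct / 100)

class DiscountTest(unittest.TestCase):
    # The team convention ("validate inputs") is checked by the machine,
    # not policed by a human reviewer.
    def test_rejects_invalid_percentage(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 150)

    def test_applies_discount(self):
        self.assertEqual(apply_discount(100.0, 25), 75.0)

# Run the suite programmatically so the snippet is self-contained.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DiscountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Once such a check is wired into continuous integration, a violation fails the build automatically — the practice no longer depends on anyone's discipline or memory.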

Is Big Code challenging enough? All mainstream paradigms were born long ago, and there is a reason for that: abstraction, as an activity that imitates reality, is limited. It is quite improbable that we will see many new paradigms in the coming years, whereas old ones continue to merge. The long story of the object-oriented vs. functional opposition came to an end and resulted in a multi-paradigm approach (as both imitate the inseparable space-time dualism of the universe). Could a mythical new paradigm propose a kind of abstraction better than objects? First, remember Occam's razor. Second, remember that objects replaced structs because the former combine both data and code (methods), whereas the latter contain only data. Therefore, a mythical new abstraction would have to rest on some principle that was unnoticed, unknown, and insignificant before. In the outer world, we can't find anything, as space-time dualism is the current representation. Quantum mechanics, maybe? But it does not replace Newtonian laws (which apply far more often in our lives) and is used in parallel with them; most probably, quantum computing will likewise operate in parallel with traditional software and paradigms. In the abstract world, the only candidate that comes to mind is aspects, but the aspect-oriented paradigm is not widespread (though perhaps it simply lacks a good enough implementation). Could some technology be more challenging? The most challenging for now is AI. But the problem is that it prefers a black-box approach, which prevents widespread use in software engineering (which is built on freely distributed paradigms that millions of developers can reproduce in home labs).
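The structs-vs-objects distinction above can be illustrated with a minimal sketch (the `Point` types and the `translate` operation are invented here for illustration): a struct-like record carries data only, with behavior living elsewhere, while an object bundles the same data with its methods.

```python
from collections import namedtuple

# Struct-like record: data only; behavior lives elsewhere as free functions.
Point = namedtuple("Point", ["x", "y"])

def translate(p, dx, dy):
    # The caller must know which free function applies to this data.
    return Point(p.x + dx, p.y + dy)

# Object: the same data bundled with its behavior (a method).
class PointObj:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def translate(self, dx, dy):
        # The behavior travels with the data.
        return PointObj(self.x + dx, self.y + dy)

p = translate(Point(1, 2), 3, 0)
q = PointObj(1, 2).translate(3, 0)
```

Both calls produce the point (4, 2); the difference is only in where the logic lives — which is exactly the combination of data and code that, in the argument above, let objects displace structs.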

That's it for Part 1. Stay tuned for Part 2, where we'll discuss how Big Code can be addressed and imagined today.


Opinions expressed by DZone contributors are their own.