Difference between revisions of "Software Development"

From "A B C"

Revision as of 22:47, 15 January 2015

Software Development
(In a small-scale research context)


It is not hard to argue that the creation of software is the greatest human cultural achievement to date. But writing software well is not easy, and much sophisticated methodology has been proposed for software development, primarily addressing the needs of large software companies and enterprise-scale systems. Certainly, once software development becomes the task of teams, and systems become larger than what one person can confidently remember, failure is virtually guaranteed unless the task is organized in a structured way.

But our work often does not fit this paradigm, because in the bioinformatics lab the requirements change quickly. The reason is obvious: most of what we produce in science is a one-off solution. Once an analysis runs, we publish the results and move on. There is limited value in doing an analysis over and over again. However, this does not mean we can't profit from applying the basic principles of good software development. Fortunately that is easy. There actually is only one principle.

Make implicit knowledge explicit.

Everything else follows.




 

Introductory reading



 

Principles

Make implicit knowledge explicit. This maxim has many implications. It means to list out the requirements. It means to write down your stakeholder objectives. It means to specify the versions of your dependencies. It means to comment, and to document and to draw and to plan. [TBC...]
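As one small illustration of specifying the versions of your dependencies, the following is a minimal sketch that records the interpreter version and the installed version of each named package alongside an analysis. The function name `describe_environment` and the package list are hypothetical choices made for this example; only the Python standard library is assumed.

```python
import json
import platform
from importlib import metadata


def describe_environment(packages):
    """Return a dict recording the Python version and the installed
    version of each named package (or a note that it is missing)."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Make the absence explicit instead of failing silently.
            versions[name] = "not installed"
    return {
        "python": platform.python_version(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical dependency list for an analysis script.
    print(json.dumps(describe_environment(["numpy", "pandas"]), indent=2))
```

Storing this output next to the results turns an implicit assumption ("it ran on my machine") into an explicit record that a later reader can check.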


  • Notation: compact and intuitive; consider information design (less ink!)
  • What things to model: structural (data model, components) and behavioural (state changes, data flow)
  • Consistency between model and implementation! "Seamlessness" (a single, continuous process) and reversibility (between analysis, design and implementation); see Waldén and Nerson (1995). Only then can the model be part of the documentation. Implies: update, versioning, audit ...?
  • Functional design vs. object oriented design (cf. Meyer 97)
  • Correctness (Design by contract? Assertions?)
  • Robust
  • Easy to extend
  • Minimized coupling (cf. shy code and Law of Demeter)
  • Reusable (standardization of interface conventions, option-operand separation, command-query separation and so on)
  • Efficient
  • Validated (Testing!)
  • Well structured (DRY: Don't Repeat Yourself).
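The correctness item in the list above mentions assertions and design by contract. A minimal sketch of how assertions can make a function's implicit assumptions explicit follows; the function `gc_content` and its specific checks are assumptions chosen for illustration, not a prescribed method.

```python
def gc_content(sequence):
    """Return the fraction of G and C among the bases of a DNA sequence."""
    # Preconditions: state explicitly what the caller must provide.
    assert isinstance(sequence, str), "sequence must be a string"
    assert len(sequence) > 0, "sequence must not be empty"
    seq = sequence.upper()
    assert set(seq) <= set("ACGT"), "sequence may contain only A, C, G, T"

    fraction = (seq.count("G") + seq.count("C")) / len(seq)

    # Postcondition: state explicitly what the function guarantees.
    assert 0.0 <= fraction <= 1.0, "fraction must lie between 0 and 1"
    return fraction
```

The function is also a pure query in the sense of command-query separation: it returns a value and changes nothing. Note that Python skips `assert` statements when run with the `-O` option, so checks on untrusted input in production code are usually written as explicit exceptions instead.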


A standard in the discussion of software development is the waterfall model, in which the process cascades from step to step, each one dependent on the results of the former. A reasonably good overview is given in Software development process, where other models such as Agile Development are also discussed. In terms of the tasks to fulfil, the waterfall model is probably a good and comprehensive approach, if not followed dogmatically.

Methodology


Management

Even though the development focus is usually on the artefacts to be created, management of the process is no less important. A good illustration of what this may entail is the Scrum management method, based on ideas of Agile development.

Analysis

Design

  • Architecture
  • Modelling
    • Structural modelling
    • Behaviour modelling
    • Data modelling
Concepts


Implementation

Constructing the system

I hesitate to call implementation "coding", because coding is only one part of the task.

Testing
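A minimal sketch of what a test case for a single function might look like, using Python's standard `unittest` module. The function `reverse_complement` and the test class are hypothetical examples written for this illustration.

```python
import unittest


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(sequence.upper()))


class ReverseComplementTest(unittest.TestCase):
    def test_known_sequence(self):
        # A hand-checked example documents the expected behaviour.
        self.assertEqual(reverse_complement("ATGC"), "GCAT")

    def test_applying_twice_restores_input(self):
        # A property that must hold for any valid sequence.
        self.assertEqual(reverse_complement(reverse_complement("GATTACA")), "GATTACA")

    def test_invalid_base_is_rejected(self):
        # Unknown characters fail loudly instead of being passed through.
        with self.assertRaises(KeyError):
            reverse_complement("ATGN")
```

Saved to a file, the tests can be run with `python -m unittest <filename>`. Each test writes down an expectation that would otherwise exist only in the developer's head.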
Documenting

Validation, Deployment, and Maintenance

These may not be distinct in the scenario we are considering here: validation may comprise the one run of discovery we are aiming for, deployment may not apply, and maintenance may be forgone as the research agenda moves on.

But this does not mean we can afford to ignore best practice in scientific software development: simple but essential aspects such as using version control for your code, using IDEs, and writing test cases for all code functions. These aspects are nowhere better explained than in Greg Wilson's excellent Software Carpentry initiative. Free, online, accessible and to the point. Go there and learn:


Sandve et al. (2013) Ten simple rules for reproducible computational research. PLoS Comput Biol 9:e1003285. (pmid: 24204232)



   

Further reading and resources

  • Kim Waldén and Jean-Marc Nerson: Seamless Object-Oriented Software Architecture: Analysis and Design of Reliable Systems, Prentice Hall, 1995.
  • An article in Nature Biotechnology; note that "successful" here is meant to imply widely used. David Baker's Rosetta package is not mentioned, for example. Nevertheless, it offers good insights.
Altschul et al. (2013) The anatomy of successful computational biology software. Nat Biotechnol 31:894-7. (pmid: 24104757)
