Research Software Engineering

White Paper by Bernhard Rumpe, Software Engineering, RWTH Aachen, Update June 17, 2024

Research Software Engineering (RSE) is a term that was created around 2010 and has started to become prominent in the UK, the US, and now also Germany. Wikipedia (in June 2024) defines it this way:

Definition: Research Software Engineering (RSE)

Research Software Engineering is the use of Software Engineering practices, methods and techniques for research software, i.e. software that was made for and is mainly used within research projects.

Even though the definition also sketches what research software is, this definition is quite fine. It embeds the software development skills of the RSE domain as a special subfield of Software Engineering, while also stressing that specific domain knowledge, namely from the research domain, is needed as well.

In this white paper, we (1) define the most relevant terms, (2) discuss the similarities and differences between Software Engineering and RSE, and (3) at the end provide recommendations for researchers and universities on how to address the emerging challenges. We approach these topics from our personal Software Engineering perspective, intentionally omitting domain-specific considerations, which are undoubtedly just as crucial.

Software Engineering (SE)

Similar to the original term “software engineering”, the creation of “RSE” was felt necessary because there was and is a crisis in software development, especially a “research software crisis”. Software Engineering gained prominence when F. L. Bauer hosted the first conference on Software Engineering in 1968 to address the software crisis. Since then, the Software Engineering discipline has gained a significant understanding of software development, culminating in:

several introductory and expert books on software engineering,
many more books on dedicated sub-fields of software engineering, namely requirements engineering, architecture, design, modeling, testing, development processes, and of course programming,
and the Software Engineering Body of Knowledge (SWEBOK) that structures and aggregates what software developers have learned in the last 50 years.

Software Engineering Challenges

Software is rather heterogeneous, ranging from embedded to desktop, to autonomous, to games, to business, and also to research software. The problems are always the same, namely:

How to ensure the quality of the software?
How to efficiently develop the software (i.e., preserve developer resources)?
How to meet timing deadlines?

The answers, and in particular the development techniques, often differ between the various sub-areas of software development, because the starting situation, the kind of software, the complexity drivers, the needed quality characteristics, the context in which the software operates, and the skills and preferences of the developers are different.

Software Engineering: 50 years, culminating in the SWEBOK

The Software Engineering Body of Knowledge (SWEBOK) defines 15 knowledge areas:

Software Requirements
Software Design
Software Construction
Software Testing
Software Maintenance
Software Configuration Management
Software Engineering Management
Software Engineering Process
Software Engineering Models and Methods
Software Quality
Software Engineering Professional Practice
Software Engineering Economics
Computing Foundations
Mathematical Foundations
Engineering Foundations

For business software, we know that the programming activities account for only about 15% of the overall development time. For skilled and educated people, constructing the actual software is a relatively well-understood activity, but many very costly errors are made due to misunderstandings about the requirements or an incorrect definition of the architecture.

Software Engineering Areas

Software Engineering doesn’t only comprise the subdisciplines mentioned above, which cover the different activities within a software engineering project. Due to the various application areas, software engineering is also organized in domain-specific subdisciplines or partners with related domain-specific topics, so that a holistic, integrative development approach becomes feasible. These areas include:

Automotive Software Engineering,
Information Systems,
Avionics Systems and Software Engineering,
Rail Software Engineering,
Embedded Software,
Cloud Software Engineering,
Data Science and Engineering,
Medical Software Engineering,
Government Software Engineering,
Gaming Software Engineering,
and now also Research Software Engineering.

Other domains, like Quantum Software Engineering, are also emerging.

In the Automotive Software Engineering (ASE) branch, many developers historically transitioned from their domain (i.e. mechanics and electronics) to learning coding and later also Software Engineering techniques, while still using and evolving their knowledge of the original domain. ASE nowadays encompasses a wide range of people with diverse skill sets and roles, ranging from software architects and coders to quality assurance, safety, or security specialists, all knowing about engineering cars as well. Some even take on the role of product owners. These professionals use a variety of “Automotive Software Engineering” techniques and methods, though their individual job titles vary widely. This approach could potentially also serve as a framework for Research Software Engineering (RSE), where more software-oriented developers might identify themselves as “RSEs”, others are “Scientists Who Code”, and yet others may be specialized “Research Software Tool Engineers”. No one needs to master all existing SE techniques, but only the relevant ones. To select the relevant ones, it helps if an RSE knows which SE techniques exist.

Research Software

Since 2010 communities have been created, conferences have been organized, and people exchange their findings on how to overcome the challenges of research software development, i.e., how to overcome the research software crisis.

What differentiates the challenges of RSE from other forms of SE?

Kinds of Research Software

First, we recognize that there is no uniform kind of research software. Instead, research software mainly falls into one of the following categories (and sometimes combinations thereof):

Embedded control software for complex physical or chemical experiments, including many forms of sensor-based data collections
Simulation of physical, chemical, social, or biological processes in geometrically distributed spaces
Data processing and aggregation
Symbolic manipulation systems, such as computer algebra systems or theorem provers
Demonstrators and prototypes of various forms and with a large variety of goals

While the embedded control software also has its challenges, the main challenges seem to lie in the simulation and data processing codes, which are so heavily needed by researchers.

But software consists of parts from different technical domains. While the mathematical research part obviously receives the main focus, in practically useful programs it often amounts to only 20-30% of the code (this number is an estimate based on personal insights into some projects and has not been thoroughly validated). The majority of code has to deal with:

App infrastructure
User interactions
Visualization
Storage and transaction management
Communication between computing and storage nodes
Interacting with neighboring systems
Providing web services
Rights and roles management for access and prevention of undetected changes
Technical monitoring
Testing
Installation, deployment and orchestration

And this is very similar to many other kinds of software. The amount of technical code tends to grow more quickly than the functional research part. Addressing the technical aspects is practically the more difficult task for non-software-engineers, especially the integration of technologies.

Characteristics of Research Software

Several common characteristics of research software are:

It is used for research in the original research domain and is thus originally only a by-product, and is also treated as such.
Focus is on the publication of the results.
Requirements of the software under development are initially unclear and the software development process is heavily intertwined with the research and innovation process, both on the scientific domain and its mapping into software.
It is developed by researchers of various domains, but mainly not by computer scientists (nor by software engineers)
- An often seen scenario: The complete software is developed by a single Ph.D. candidate and erodes after the Ph.D. is finished.
- A similar scenario: The overall software is complex and has been developed over a long period of time by many developers, but the newly added packages also only have a single Ph.D. author and erode after they leave.
Reuse is difficult, because the software is not designed for reuse.
Applicability of the software is limited, because it has been designed for a single use case or a small set of use cases and it wasn’t designed for extensibility and flexibility.
The software is frequently rewritten:
- Within a Ph.D. project, when the desired outcomes (i.e. requirements) change, rewriting is applied.
- If a new researcher takes over the existing software, then a long-lasting rewrite takes place to accommodate the new developer’s preferences. (In the sense of “Don’t trust foreign code”)
Goals of the research institution, respectively their professorial leaders, significantly differ from the goals of the developing researchers, namely: long-lasting, sustainable software programs that can be used as infrastructure for research vs. obtaining quickly publishable insights.

However, the world is changing and the demand for reproducibility of results enforces additional publication of the data and of the underlying software in a form that is permanently available and executable. Zenodo, for example, stores software permanently, but doesn’t ensure executability (yet). Code Ocean, however, already does.

Research Software Engineering

As a consequence of the sustainability and reproducibility requirements on research software, the software turns from a by-product into a long-living, sustainable asset, if not into a core research infrastructure, where the demands for code quality increase heavily. As a consequence of the software crisis in other development domains, software engineering already discussed these and other typical quality attributes, namely understandability, documentation, reuse, and the ability to evolve, a while ago.

It is therefore time to put the focus on the question of how to transfer the body of software engineering knowledge to at least the sustainability of research software. We recall:

“Research Software Engineering is the use of Software Engineering practices in research applications.”

But which practices are actually useful in RSE? Which can be ignored? Which are omitted or forgotten, but would be useful?

We observe:

Focus: Processing Efficiency

SE: The focus of many SE techniques is the efficiency of the developers during the development project, because complexity of software functions is more demanding than the size of data and the computational effort.
RSE: Traditionally, High-Performance Computing (HPC, which covers a large part of RSE) focuses mainly on execution times. Simulations and various other forms of number crunching and data processing force RSE to put focus on the efficiency of programs.
RSE tomorrow: The complexity of programs steadily increases, and both the efficiency of the developers and the efficiency of the program will probably receive focus in the future.

Reuse

SE: SE traditionally focuses on reuse and provides a large set of mechanisms to increase the reusability of software. This starts with language constructs, such as inheritance, the import of other modules/classes, and the explicit definition of interfaces that allow (but do not enforce!) reuse. This continues on the methodological level, where development processes for frameworks, for features in software product lines, for components, microservices, independently deployable and versioned subsystems, etc. are established. Architectural and design patterns are of great help. Software design and architecture must anticipate reuse.
RSE: To some extent RSE uses these techniques, but too often these high-level techniques are not applied. That is natural, as the focus is not on reuse and explicit architectural design activities are not in place. In order to increase reusability, rewriting code is fostered.
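As a minimal sketch of how an explicitly defined interface allows (but does not enforce) reuse, consider the following Python example. All names in it (`Integrator`, `ExplicitEuler`, `Midpoint`, `simulate`, `decay`) are hypothetical and chosen only for illustration; they are not taken from any particular research code.

```python
from abc import ABC, abstractmethod
from typing import Callable

# Right-hand side of an ODE y' = f(t, y); hypothetical type alias for this sketch.
Rhs = Callable[[float, float], float]


class Integrator(ABC):
    """Explicit interface: every time-stepping scheme must provide step()."""

    @abstractmethod
    def step(self, f: Rhs, t: float, y: float, dt: float) -> float:
        """Advance the solution y from time t to time t + dt."""


class ExplicitEuler(Integrator):
    def step(self, f: Rhs, t: float, y: float, dt: float) -> float:
        return y + dt * f(t, y)


class Midpoint(Integrator):
    def step(self, f: Rhs, t: float, y: float, dt: float) -> float:
        y_half = y + 0.5 * dt * f(t, y)
        return y + dt * f(t + 0.5 * dt, y_half)


def simulate(integrator: Integrator, f: Rhs, y0: float, t_end: float, steps: int) -> float:
    """Reusable driver: it depends only on the interface, not on a concrete scheme."""
    dt = t_end / steps
    t, y = 0.0, y0
    for _ in range(steps):
        y = integrator.step(f, t, y, dt)
        t += dt
    return y


def decay(t: float, y: float) -> float:
    """Example model: exponential decay y' = -y."""
    return -y
```

A call such as `simulate(Midpoint(), decay, 1.0, 1.0, 1000)` reuses the same driver with a different scheme. Nothing forces a later developer to go through `Integrator`, though, which is exactly why reuse also has to be anticipated in design and architecture.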

Modularity

SE: Modularity is a core technique to assist in fostering reusability. It is applied on various levels, starting with the building of classes, and the design and architecture of components, subsystems, etc. Modularity starts with the requirements, but demands much focus during architectural design, and affects the organization of distributed development and modular quality assurance. Achieving modularity is a professional skill that has to be learned and trained.
RSE: Modularity is applied mainly in class design.

Modeling

SE: Explicit use of software models using UML (or SysML) is common in SE. Domain-specific languages (DSLs) are used in various domains with specific characteristics, as well as for configuration and for increasing the modular use of components. These DSLs include, e.g., building information modeling (BIM), control automata, etc.
RSE: Sometimes modeling is applied in RSE as well. The core models are the mathematical constraints between physical quantities, timing, and geometric topics. Many of them are differential equations.
RSE tomorrow: The mathematical constraints, typically defined in scientific papers, and their software implementation need more focus. RSE tools need to better assist in explicitly specifying limitations, especially for ensuring the consistency between efficient implementations and the mathematical formulae (seen as requirements), and for tracing changes. This may include generators, more domain-specific languages, but also more math compilers. The software itself will also need design techniques, and variants of UML-like techniques need to be integrated.

Automation With Smart Tools

SE: Continuous integration is of great help for developers to get immediate feedback and for project managers to keep track of and maintain an overview of the project’s progress. Tools are commonly used that help with refactoring and detect malicious or dead code, security vulnerabilities, architectural deficiencies, etc.
RSE: In RSE these techniques are applied, e.g., using Git, GitHub, or GitLab capabilities.
RSE tomorrow: Assistance seems to be helpful for purposeful use. RSE-specific solutions could help even more. Existing continuous integration approaches that aim for short turn-around times are challenged by those kinds of research software that require many computational resources to train models.

Versioning

SE & RSE: The Git version control system, with openly accessible tools such as GitHub as well as locally deployable tooling and infrastructures such as GitLab, is generally applicable in SE. Such tools provide helpful assistance, e.g., a ticket system, continuous integration, versioning assistance, branching, variability management, etc. Skillfully applied, these improve the development process considerably.
SE & RSE tomorrow: Currently many tools are being developed around Git to improve their assistance even further. For example, detection of security breaches, automatic deployment, artifact management, and documentation generation help developers in their tasks. Increasingly, projects, companies, and sometimes also specific communities create their own domain-specific tools, which know the specifics of the domain and as such can help to develop and maintain the code even better. The Linux open source project with its kernel, master and lieutenant system may act as a good blueprint to build a chain of trust and quality.

Long-Lasting / Sustainable Software

RSE: It is true that in RSE some larger (but by far not all) pieces of programs should be long-lasting and sustainable. This allows these programs to become reusable for various research questions to be addressed.
SE: This is true for other software as well. Billions of Euros, US Dollars, etc. have been invested in developing software, e.g., in insurance companies, banks, government, and back-offices of companies, and those assets must be usable as long as possible.
SE & RSE: General SE and RSE face the same problems. Even though the underlying development goals of RSE are special, the methods to achieve sustainability for software are potentially similar. This does not only include the software, but also its accessibility, e.g., in the form of documentation and the people that have knowledge about the software in its complexity. Open source development alone is not enough to ensure sustainability.

Longevity requirements have different foundations for scientific and technical code: Scientific code handles the scientific questions and solutions, e.g., its core is the implementation of formulae. These aspects should be subject to improvement, extension, or replacement by researchers, i.e. scientists who code. Technical code, e.g., for parallelization, database connectivity, user interfaces, etc. is less influenced by scientific progress than by technological progress. These parts should be subject to change by software developers or software engineers, e.g. by tailoring existing frameworks and libraries towards RSE.

Reproducibility

SE: This challenge occurs, e.g., when legal aspects come into play. Banks and insurance companies have to provide access to the software to tax authorities even 10 years after the software was taken out of operation in hot standby and be able to reconstruct the software service for 20 years while the underlying computing infrastructure and contextual systems evolve rapidly, and the correctness of the software has to be ensured at all times.
RSE: In RSE the same problem occurs, even though the reason is different: the timing is not constrained by legislation, but by scientific standards.

Framework Development Strategies

SE: SE has created various forms of modularity and reusability, including the concept of frameworks with hot spots, various forms of configuration techniques and tooling. Frameworks differ from pure libraries, because they are tightly integrated, establish the control flow, and allow developers to plug in individual adaptations of functionality.
RSE: RSE mainly creates monolithic codes, which are openly accessible, but difficult to understand. Documentation is sometimes available, often as reference to (static, unchangeable) published papers. Libraries of core functions are established.

The development of research software that directly contributes to a publishable research subject tends to be preferred by scientists over the development of libraries, frameworks, reference models (such as reference architectures or reference domain models), and other utilities that support the development of future research software.

RSE tomorrow: The framework and the product line concepts must be applied more often to achieve the benefits stated above. Incentives must be established for researchers of different career levels to develop and maintain libraries, frameworks, and reference models for research software.
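The difference between a framework and a pure library can be illustrated with a small sketch: the framework below owns the control flow (the time loop) and calls back into user-supplied hot spots. The class and method names (`SimulationFramework`, `initial_state`, `update`, `observe`, `CoolingCup`) are hypothetical and serve only as an illustration of the concept, not as a recommended design for any specific research code.

```python
from abc import ABC, abstractmethod
from typing import List


class SimulationFramework(ABC):
    """Tiny framework: run() fixes the control flow, subclasses fill the hot spots."""

    def run(self, steps: int, dt: float) -> List[float]:
        # Template method: the framework, not the user code, drives the loop.
        state = self.initial_state()
        observations = [self.observe(state)]
        for _ in range(steps):
            state = self.update(state, dt)
            observations.append(self.observe(state))
        return observations

    @abstractmethod
    def initial_state(self) -> float:
        """Hot spot: provide the starting state."""

    @abstractmethod
    def update(self, state: float, dt: float) -> float:
        """Hot spot: advance the state by one time step."""

    def observe(self, state: float) -> float:
        """Optional hot spot with a default: what is recorded per step."""
        return state


class CoolingCup(SimulationFramework):
    """Hypothetical scientific plug-in: Newton's law of cooling, explicit Euler."""

    def __init__(self, start_temp: float, ambient: float, rate: float) -> None:
        self.start_temp = start_temp
        self.ambient = ambient
        self.rate = rate

    def initial_state(self) -> float:
        return self.start_temp

    def update(self, state: float, dt: float) -> float:
        return state - self.rate * (state - self.ambient) * dt


if __name__ == "__main__":
    temps = CoolingCup(start_temp=90.0, ambient=20.0, rate=0.1).run(steps=5, dt=1.0)
    print(temps)
```

In such a split, the scientific part (`CoolingCup`) could be maintained by researchers, while the technical part (`SimulationFramework`) could be evolved separately, for example to add logging or parallel execution, without touching the scientific code.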

Testing

SE: A tremendous amount of theory and practical advice regarding testing is available, allowing developers to create tests for various purposes: from small unit tests, through component and integration tests, up to end-to-end user tests covering requirements, models, code, and input data spaces in appropriate form. Testing frameworks make it easy to automate these tests and thus repeat them at each evolutionary step.
RSE: Testing is not common in RSE. Test automation is widely ignored. To some extent this is because manual tests have been sufficient in a setting with reduced execution variability. Sometimes there is only one data set and testing is equal to execution and getting the result immediately. So far plausibility checks often replace testing. Due to the nature of research, often no oracle exists for expected results, making end-to-end tests difficult to develop.
RSE tomorrow: RSE will need more automatic, repeatable tests. It may be that the form of end-to-end tests is rather specific, because e.g. in simulations the timing aspect plays a role. Testing massively distributed software also has its own challenges. Adequate tooling is needed, because otherwise evolution and reproducibility are not achievable.
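When no oracle exists for the expected result, one option known from the general SE literature is to test relations between several runs instead of a single expected value (often called metamorphic testing), complemented by invariants known from the domain. The sketch below illustrates this idea under stated assumptions and is not a method proposed in this paper: `simulate_decay` is a hypothetical stand-in for a research code, and the checked relations follow from the linearity of the underlying model.

```python
def simulate_decay(n0: float, rate: float, t_end: float, steps: int = 1000) -> float:
    """Hypothetical stand-in for a research code: explicit Euler for dn/dt = -rate * n."""
    dt = t_end / steps
    n = n0
    for _ in range(steps):
        n -= rate * n * dt
    return n


def test_scaling_relation() -> None:
    # The model is linear in the initial amount: doubling the input must double the output.
    single = simulate_decay(n0=1.0, rate=0.3, t_end=2.0)
    double = simulate_decay(n0=2.0, rate=0.3, t_end=2.0)
    assert abs(double - 2.0 * single) <= 1e-12


def test_composition_relation() -> None:
    # One run up to t = 2 must match two consecutive runs of length 1 with the same step size.
    direct = simulate_decay(n0=1.0, rate=0.3, t_end=2.0, steps=1000)
    first = simulate_decay(n0=1.0, rate=0.3, t_end=1.0, steps=500)
    chained = simulate_decay(n0=first, rate=0.3, t_end=1.0, steps=500)
    assert abs(direct - chained) <= 1e-12


def test_plausibility_invariant() -> None:
    # Domain invariant: the amount stays positive and never exceeds the initial amount.
    result = simulate_decay(n0=5.0, rate=0.3, t_end=10.0)
    assert 0.0 < result <= 5.0
```

None of these tests needs to know the correct numerical result, yet each of them would fail for typical programming errors such as a wrong sign or a step size that is not passed on correctly. The functions follow the naming convention that a test runner such as pytest discovers automatically, so they can be repeated at each evolutionary step.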

Development Methods

SE: Various forms of development methods have been established. These include V-model, Rational Unified Process, Extreme Programming, Scrum, and many other agile or more document oriented development processes.
RSE: Hacking is still applied too often. In particular, the requirements engineering process differs very much from traditional SE. In many Ph.D. projects the candidate is both the developer and their own sole customer.
RSE tomorrow: Depending on the form of software, the number of people, their skills, the expected quality and outcomes, timelines, etc. several different methods will be needed. The core need will be that someone in the project knows about the activities and their interactions in development projects, selects the right process and ensures that the participants live it in the project. Taking responsibility is important for these projects.

Exploration of new research ideas plays a major role in RSE and must be well integrated into an RSE method. Again, explicit innovation enabling techniques are already well established in core SE and should be applicable in adapted form.

Consulting

SE: In the industrial practice of SE, consulting has become a major driving force. There is a large set of companies that employ consultants of various specific skills, who are working in sometimes large projects to develop various forms of software.
- Skills may be technical, for example dealing with specific software stacks, or they may be applicable for specific activities, such as architectural design, requirements engineering, testing, database connection, installing software as a service, or tool building and adaptation.
- Consulting may also consist of various different activities, such as providing templates, giving courses, training on the job, code reviews, tool and software stack assistance, or other kinds of trouble-shooting and problem-solving.
- Even co-development is doable when domain-specific know-how and thus pi-shaped skills (software engineering + domain knowledge) are present.
RSE: First research organizations are starting to apply in-house consulting.
RSE tomorrow: In-house consulting is a possibility and maybe also a necessity to temporarily bring experienced developers with specific skills to the application projects that need these skills. As a consequence, groups of skilled software engineers (including developers) will be established at least at larger research organizations, to assist in various development stages and potentially also maintenance and curation in the later phases of sustainable software.

An overall impression remains that guidelines for RSE are necessary, alongside tailored processes and methods for the special needs of RSE.

RSE Research

For a precise understanding, we add parentheses: (((research software) engineering) research) is a foundational field of research that focuses on the engineering techniques, tools, methods, frameworks, etc. to develop research software. It will take inspiration from (generic) software engineering research and find solutions specific to other research domains.

RSE research has to tackle a number of questions that address the software as a produced result, but also the assistance for the domain-specific researcher as a human being with limited time and always optimizable skills for software development. Some of these questions are rather fundamental research; others should be accompanied by empirical software engineering or organisational and economic research:

How does RSE differ from SE and what specific activities have to be addressed by different methods and tools?
How can classic SE approaches, like the V-Model, RUP, Scrum, or Extreme Programming, be adapted to RSE needs?
Or is an entirely new development process needed?
How to run empirical validation in such a context? (At least researchers could understand that empirical validation is needed, and would hopefully assist?)
How to adapt research software requirements elicitation for RSE? This activity is typically deeply connected with the domain-specific research process itself, which is why traditional requirements engineering techniques don’t fit.
How to define robust architectures that are extensible, maintainable, and allow reuse for research software, where the forms of extensions and the connections to neighboring functionalities are initially very unpredictable?
How to assist RSE with domain-specific frameworks and reference models (reference architectures and reference domain data models)?
How to gain a precise understanding of optimal forms of reuse: From (1) simple copy-paste, through (2) branching and (3) black-box reuse up to (4) pre-deployed service reuse? This is needed across heterogeneous languages, software stacks, and hardware infrastructures, and it must be seen in the context of long-term evolution.
How to document and model the desired software with the purpose of (a) effective development, (b) possible code generation, (c) sustainable reproducibility, and (d) comprehension in the sense of FAIR research data management? This is a multidimensional problem, because software needs to evolve due to (1) bug fixing and (2) evolution of the software stack it is embedded in, such as underlying external data sets and their service APIs, security patches of the operating system or libraries, etc. Docker is only of local help here. Software also comes in technical and functional variants and is a living object that differs very much from passive data. Currently, research data management initiatives, such as the German FDMI, are only starting to address this.
How to manage long-term evolution with fluctuating developer groups?
How to manage, govern, and analyze variant-aware and variant-rich research software in long-term projects?
What tools are needed to easily automate various tedious activities that researchers would like to avoid?
What kinds of software engineering models are of use in RSE?
How to integrate mathematical models (e.g. based on calculus), Software Engineering models (e.g. UML, SysML) and leverage their effective use in development?
What are useful domain-specific languages (DSLs) for RSE, either to raise developer efficiency or to decrease quality issues?
Which RSE activities can be assisted by DSLs?
How can DSLs bridge the published mathematical, physical, biological, medical, etc. laws and findings on the one hand and the executable code on the other hand?
Can mathematical / physical laws themselves be codified as DSL models and thus used for validation, testing or even code generation?
What is appropriate meta-information about research software artifacts, their development states and forms of review, certifications etc. to build trust, enable reusability, and document quality?
How to refactor the architecture of research software to fit future needs while retaining current capabilities?
How can software analysis tools contribute to, e.g., checking consistency between mathematical models in published papers and the implemented code? Might this even lead to a change in the forms of our publications?
How can traditional program analysis tools, e.g. for dead code, inefficient code, architecture leaks, or unnecessary complexity, effectively help researchers to improve software quality?
How to test and what to test in which intensity? What are sufficient tests in the context of big data sets or unclear desired outcomes? How to overcome the unknown test oracle problem?
Is there a notion of test coverage for research software?
What is high-quality assurance for research software in order to achieve high-quality research results? We need adequate and efficient testing, reasoning, reviewing, and certification procedures.
How can an intelligent research co-pilot based on large language models help to address coding, testing, design and quality issues?
How to quantitatively assess and optimize development skills for researchers?
How much does the human developer play a role in RSE developments?
How to design software for user studies in social and economic experiments that interact with study participants?
How to predict development efforts and costs of a scientific development process? (“COCOMO for research software?”)
How to deal with legacy software or legacy hardware in a long-lasting software infrastructure setting? How to migrate scientific software?
How to predict the needed computational, storage, and networking resources for a software experiment/simulation/study upfront?
How to measure the scientific quality of research software?

Many of those questions are not specific to RSE, but apply to classic SE as well. While some of these questions are very general, the answers must typically be very domain-specific to gain improvements for the domain-specific challenges. Classic SE has many (and still optimizable) answers to these questions, but it is unclear how they transfer to RSE.

Some Research Topic Examples

The following is a very incomplete set of research topics, just to give examples of what is needed (and what we would like to build research about):

Consistency between academic papers and code, through techniques like:
- Code generation using a LaTeX formula as DSL, or something very similar
- Code generation from new DSLs that generate both code and LaTeX
- Deriving test cases from scientific papers (in case no code can be generated)
- Verification of the relationship of the code to the mathematical model
Code analysis as retrospective: Not only with the usual code analysis results such as deadlock freedom, but to answer the question of whether the code covers the right scientific models (up to numerical analysis of rounding problems).
“Research ChatGPT” as the co-pilot for scientists:
- for creating both code and tests,
- and if models are relevant, also for finding scientific relations in the form of mathematical models.
Energetic analysis: CO₂ footprint of the software.
How do we combine mathematical models (essentially differential equations) with the theory of digital software models (mainly automata, Petri nets, temporal logic, etc. and also structural models such as class diagrams)? There is still no reasonable solution today, probably also because there is no single optimal solution: depending on the scale of the processes, these different kinds of models interact differently.
Domain Specific Languages (DSL) for research (like we do in MontiCore): Sometimes it may be worthwhile to provide researchers with a conceptually reduced, problem-adapted language and allow researchers to model in their own vocabulary instead of writing code.

One bottom line here is: The connections between

scientific publication,
domain theory and its models,
code, and
tests

are to be addressed even more clearly. Ideally, these are treated with automated solutions, which obviously also include a variety of tools specifically dedicated to research software.
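To make the link between publication and code more tangible, the following sketch shows the single-source idea in miniature: a formula is written once in a tiny embedded DSL and is then both rendered to LaTeX and evaluated numerically. The DSL classes (`Const`, `Var`, `Mul`, `Frac`, `Pow`) are hypothetical and purely illustrative. They handle only products, fractions, and integer powers of atoms, without any operator precedence handling, so they are far simpler than what a real language workbench would provide.

```python
from dataclasses import dataclass
from typing import Dict


class Expr:
    """Base class of a tiny, hypothetical formula DSL."""

    def evaluate(self, env: Dict[str, float]) -> float:
        raise NotImplementedError

    def to_latex(self) -> str:
        raise NotImplementedError


@dataclass(frozen=True)
class Const(Expr):
    value: float

    def evaluate(self, env: Dict[str, float]) -> float:
        return float(self.value)

    def to_latex(self) -> str:
        return f"{self.value:g}"


@dataclass(frozen=True)
class Var(Expr):
    name: str

    def evaluate(self, env: Dict[str, float]) -> float:
        return env[self.name]

    def to_latex(self) -> str:
        return self.name


@dataclass(frozen=True)
class Mul(Expr):
    left: Expr
    right: Expr

    def evaluate(self, env: Dict[str, float]) -> float:
        return self.left.evaluate(env) * self.right.evaluate(env)

    def to_latex(self) -> str:
        return f"{self.left.to_latex()} \\, {self.right.to_latex()}"


@dataclass(frozen=True)
class Frac(Expr):
    num: Expr
    den: Expr

    def evaluate(self, env: Dict[str, float]) -> float:
        return self.num.evaluate(env) / self.den.evaluate(env)

    def to_latex(self) -> str:
        return f"\\frac{{{self.num.to_latex()}}}{{{self.den.to_latex()}}}"


@dataclass(frozen=True)
class Pow(Expr):
    base: Expr  # assumed to be an atom (Var or Const) in this sketch
    exponent: int

    def evaluate(self, env: Dict[str, float]) -> float:
        return self.base.evaluate(env) ** self.exponent

    def to_latex(self) -> str:
        return f"{self.base.to_latex()}^{{{self.exponent}}}"


# Single source of truth for the kinetic energy E = 1/2 m v^2.
kinetic_energy = Mul(Mul(Frac(Const(1), Const(2)), Var("m")), Pow(Var("v"), 2))

if __name__ == "__main__":
    print(kinetic_energy.to_latex())
    print(kinetic_energy.evaluate({"m": 2.0, "v": 3.0}))
```

Because the text for the paper and the executable evaluation are derived from the same object, they cannot silently drift apart. The same object could also serve as a reference against which an optimized implementation is tested.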

Recommendations and Concluding Observations

The development of research software faces the same challenges as developing any other kind of software. The software engineering body of knowledge addresses these challenges and is of considerable help. Unfortunately, there is no silver bullet, and software engineers are not firefighters who can easily rescue software that has degraded over several years. However, software engineering sometimes provides witchcraft-like techniques, which help most when applied early.

Research software engineering correctly pushes these techniques into development projects for researchers, but domain-specific adaptations are necessary.

Software development tools for the automation of various activities and wizard-like assistance have enormously increased productivity and lowered the hurdles for newcomers to create pieces of software. Moreover, these tools nowadays enable non-experts to develop significant pieces of software and leverage the knowledge of core software structures, such as persistent storage, communication, compilation, computation orchestration, etc.

Computer science can be proud of having understood the core technologies, turning scientific knowledge into automatically executable algorithms and frameworks, and embedding these into tools usable by non-experts without having a deep understanding of the internal mechanisms of these tools. This also includes integrated development environments (IDEs) with highly assistive editors, Low Code approaches using scientific or other explicit modeling techniques, automated continuous integration, code analysis, etc.

Computer science can be proud of enabling non-experts to write expert-like software. No one would believe that even something as simple as a single-family house could be built by non-architects, but in software development we have achieved exactly this. This is probably why the hard part for an RSE is to understand the research domain, while the easier part is (and should be) to understand software development as well.

But the tooling infrastructure is still under intense development, and much more can be done to simplify the jobs of RSEs and also “Scientists who Code”.

Recommendation 1: It is necessary to build better domain-specific tooling to address the domain-specific challenges of research software. Wizard-like smart tools help development amateurs and, to some extent, prevent them from having to focus too much on SE skills themselves.

Smart, possibly domain-specific tooling is part of the research infrastructure and needs high quality. It cannot only be a research prototype and thus should be professionally planned, engineered, and managed.

However, there is complexity in research software that does not go away. If the software is not meant to be throw-away software, there are additional methodical topics to address that currently cannot be completely automated.

Recommendation 2: Researchers who create software for a sustainable, long-lasting infrastructure need to be trained in software engineering skills, which drastically differ from mere programming skills.

Not all developed software needs to be sustained. Sometimes simple throw-away experiments are fine, but developers should be aware of the expected outcomes and the life expectancies of these outcomes. Their development processes should contain the appropriate measures and mechanisms to achieve their goals.

Researchers need to be aware that software engineering is not only about getting the code right but also involves architectural, design, quality assurance, and management soft-skills to be adopted and lived during a development process.

The development process for sustainable, widely used software is different from that of one-shot software. Managers (i.e. typically institute leaders) need to be aware of that and manage development accordingly from the beginning.

Because we know that practically applicable software engineering skills are not easily obtained, there is a third relevant possibility:

Recommendation 3: Have one or more software engineers be part of your project to get the software and the technical architecture right, adopt the appropriate tools and quality mechanisms, etc.

This can, for example, be done by permanent employment in a project but also by consulting offers, which seems to be an increasingly feasible approach. Various British institutes seem to have started this approach by installing research software engineering coaching with primary software engineering skills and some secondary knowledge about the research domain. This seems promising because software engineers are trained to collaborate, and software engineering methods support collaborative approaches very well.

Such a coaching group could be established centrally at the university and provide individual coaching and methodical upskilling, or manage specific tasks such as architectural refactoring, migration in the software stack, retrofitting security, or quality enhancements in the respective projects.

Upskilling is also needed:

Recommendation 4: Start teaching RSE skills to researchers that go beyond mere coding capabilities.

Currently, it seems that many researchers know how to code but are not even aware that software development also provides methodological support for other development activities. Teaching these seems to be necessary. These researchers need not be experts in software development, but some upskilling seems helpful.

Recommendation 5: Incentives to use RSE best practices must be developed for all stakeholders involved.

Currently, developing sustainable software yields long-term benefits for the scientific community. The software engineers developing and maintaining the research software, however, do not necessarily benefit from enhanced quality, but only from improved efficiency. Combined with the risk of non-permanent budget assignments for RSE, it is not per se attractive for young researchers (with time-limited contracts) to establish good RSE practices.

There should be long-lasting contracts for developers and for maintenance tasks. Developing and maintaining research software should be able to pay off for Ph.D. theses. Postdocs should benefit from developing, maintaining, and evolving research software for their next career steps. E.g., in Germany, evolving and maintaining research software should count significantly in favor of a habilitation or a junior professorship. Finally, assigning budget and time for good RSE should support the career of professors, e.g., by valuing the continuous development of research software in appointment negotiations and project proposal evaluations.

Furthermore, it needs to be considered that not every group can have its own individually developed software codes; instead, codes should be developed and maintained across several groups or within a whole community. This can only be achieved if (a) technically, the architecture of the overall system is modular and decomposable, so that individual subgroups do not interfere too much, and (b) organizationally, a plugin structure allows different individual research groups to provide visible contributions (under their own names). Such software then becomes a platform with plugins, variability, and extensibility options. Software engineering techniques provide such options, but RSE-specific adaptations are definitely needed.
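
To illustrate the idea of a platform with plugins, here is a minimal Python sketch under stated assumptions: the class `PluginRegistry`, the plugin signature `Analysis`, and the group names are hypothetical and only show how a core system can stay independent of contributions while still crediting each contributing group by name.

```python
from typing import Callable, Dict, List

# Hypothetical plugin interface: a plugin maps raw samples to one result value.
Analysis = Callable[[List[float]], float]


class PluginRegistry:
    """Core platform: knows only the plugin interface, not the plugins."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Analysis] = {}
        self._authors: Dict[str, str] = {}

    def register(self, name: str, author: str) -> Callable[[Analysis], Analysis]:
        """Return a decorator that registers a plugin and its contributing group."""
        def decorator(func: Analysis) -> Analysis:
            if name in self._plugins:
                raise ValueError(f"plugin '{name}' is already registered")
            self._plugins[name] = func
            self._authors[name] = author
            return func
        return decorator

    def run(self, name: str, samples: List[float]) -> float:
        """Execute a registered plugin by name."""
        return self._plugins[name](samples)

    def credits(self) -> Dict[str, str]:
        """Expose which group contributed which plugin."""
        return dict(self._authors)


registry = PluginRegistry()


# Each (hypothetical) research group contributes independently of the others.
@registry.register("mean", author="Group A (hypothetical)")
def mean(samples: List[float]) -> float:
    return sum(samples) / len(samples)


@registry.register("peak", author="Group B (hypothetical)")
def peak(samples: List[float]) -> float:
    return max(samples)


print(registry.run("mean", [1.0, 2.0, 3.0]))
print(registry.credits())
```

The design choice here is that the core only depends on the interface, so groups can add or replace plugins without touching each other's code, and the registry keeps the attribution that makes individual contributions visible.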

Software engineering is a holistic approach, and many strategic decisions have to be made, therefore:

Recommendation 6: Establishing RSE principles is a core topic especially for the management, i.e. the professors and the research institutions, that needs to be addressed adequately.

And to give the various stakeholders concrete assistance on what needs to be done:

Recommendation 7: Establish a Software Development Guideline for the projects or even for the whole institution that covers (a) organizational issues, (b) legal issues, e.g. copyright, and especially (c) a methodological process framework.

Such a methodological process framework must take size, criticality, relevance for the university, expected duration of usefulness, heterogeneity of users, expected TRL (technology readiness level), and potentially other key metrics into account. It should help to select the appropriate methods, activities, and tooling, balancing efficiency, agility, and predictability of the quality of the results.

Finally, we have learned that there are generic software engineering techniques that can be applied in many domains, but due to the domain-specific differences in characteristics, it is also useful to adapt, enhance and possibly create domain-specific techniques, tools, methods, frameworks, etc. This brings us to:

Recommendation 8: Establish RSE research, i.e. research about research software engineering, as a research field.

RSE research will probably keep us busy for a number of years as a foundational field of research, which obviously is to be executed by SE researchers and not so much by the domain researchers themselves.

Thanks to Marco Konersmann, Florian Rademacher, and Lucas Wollenhaupt for commenting on a draft, and to several SEs and RSEs for commenting on an earlier version of this white paper.

Further links: