Laiv

Inheriting legacy code is one of the most common situations in the software industry, so common that several books have been written about it. Working Effectively with Legacy Code by Michael C. Feathers is worth a mention as one of the most renowned.

However, these publications are strongly focused on refactoring code. If I understood correctly, that's not your goal; your goal is to document.

With this goal in mind, here are some tips that might help you.

1. Get a copy of the code

Get a copy of the code in production. Fork the code in the SCM or make a branch and check it out.

2. Contextualize the code

Decontextualized code doesn't tell much about its intentions. It tells how things are done, but it doesn't tell much about the why (requirements) and the when (use cases).

Any piece of code serves a purpose. The sum of many pieces serves a higher purpose, and so on. The first task is to find out the highest purpose: the overall idea (what problem it is solving). Then we have to contextualize it within the whole system in order to answer the why and the when.

The answers to these two questions are important for the future tests. Without them, we could end up testing irrelevant use cases or misinterpreting the results.

This inevitably leads us to ask people who have very little time to talk to us about it. However, functional knowledge is as important as technical knowledge (or more). One strategy that works for me is inviting these people for a coffee.

3. Identify levels of abstraction (responsibilities)

If we have found out the global idea, we have already identified the highest abstraction. As we delve into the code, we find more and lower levels of abstraction. And by abstractions, I mean responsibilities.¹

The goal is to identify the components involved in each responsibility and how they contribute to the global solution.

4. Document the overall picture

Add a Readme-like file to the project and put in it all the information gathered during the contextualization. Introduce the code and the global picture. Also introduce the most relevant abstractions and why they are relevant. The aim is to give future developers a short introduction to the code.

5. Document the code

If you have identified the responsibilities of each component, now is the time to document them in the code. Take a look at the documentation conventions of the language first. Then write the code documentation. Be clear and concise.
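
As a minimal sketch of what this could look like in Java, assuming Javadoc is the convention in use. The class OrderPriceCalculator and its rounding rule are hypothetical, invented only for illustration; the comments state the responsibility, what the component does not do, and the behavior as observed.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/**
 * Computes the gross price of an order line (hypothetical example).
 *
 * <p>Responsibility: apply a VAT rate to a net amount.
 * It does not apply discounts; in this sketch those are assumed
 * to be handled by the caller.
 */
final class OrderPriceCalculator {

    /**
     * Returns the net amount plus VAT, rounded half-up to two decimals.
     *
     * @param net     net amount, must not be null
     * @param vatRate VAT rate as a fraction, e.g. 0.21 for 21%
     * @return gross amount with a scale of 2
     */
    static BigDecimal grossPrice(BigDecimal net, BigDecimal vatRate) {
        BigDecimal vat = net.multiply(vatRate);
        return net.add(vat).setScale(2, RoundingMode.HALF_UP);
    }
}
```

Note that the comment describes what the component is responsible for, not how each line works. That is the information a future developer cannot easily recover from the code itself.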

6. Document traps and misleading names

Needless to say, most of us are terrible when it comes to naming things. Not everybody crafts self-documenting or self-descriptive code.

Don't blindly trust what you read. Get into the methods, functions or classes. If the components' names are not aligned with what they actually do, document the mismatch, but don't change the actual names.
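
For example, a sketch of how such a trap could be documented in place. CustomerRegistry and validateCustomer are hypothetical names; the assumption is a method whose name suggests a pure check while it also writes data.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical legacy class, used only to illustrate the comment. */
final class CustomerRegistry {
    private final List<String> customers = new ArrayList<>();

    /**
     * Misleading name (documented, deliberately not renamed).
     *
     * <p>Despite its name, this method does more than validate: when the
     * name is valid it also stores the customer. Callers that only want
     * a check will trigger a write.
     *
     * @param name customer name, may be null
     * @return true if the name was valid and has been stored
     */
    boolean validateCustomer(String name) {
        if (name == null || name.trim().isEmpty()) {
            return false;
        }
        customers.add(name.trim()); // side effect hidden by the method name
        return true;
    }

    /** Number of customers stored so far. */
    int size() {
        return customers.size();
    }
}
```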

7. Document dependencies

If the code uses 3rd party libs, go back to the Readme file and introduce the dependencies (where and when they are used).

8. Document integrations

I usually document integrations with diagrams. Diagrams provide us with the global picture of the different systems involved.

For each integration, it could be useful to document request/response messages, ER data models, data sources, URLs, endpoints, protocols, etc. The more info you gather, the better.

Pay special attention to the error handling strategies. They provide valuable information about how the business reacts to errors.

9. Test the code

This is easier said than done. Most of the legacy code we inherit was not written to be testable. Considering the future migration and the actual task (documenting), I'm unsure about the value provided by unit tests.

For this particular case, integration and end-to-end tests could be more valuable than unit tests because these components may or may not be reused later. The whole implementation could change², so I'm not sure whether unit tests would provide developers with meaningful insights. In any case, the other two (integration and end-to-end) are a must.

Ask the manager how much time you can spend on this task. Implementing tests for code that was not written to be tested takes time, so the tests should aim to provide valuable insights about the business (the functionality) rather than the implementation details.
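
A minimal sketch of such a test, written in plain Java so that it does not assume any test framework. LegacyShipping is a hypothetical stand-in for the inherited code and its pricing rules are invented; the point is that each check is named after a business use case rather than an implementation detail.

```java
/** Hypothetical stand-in for inherited code; the pricing rules are invented. */
final class LegacyShipping {
    static int costInCents(int weightGrams, boolean express) {
        int base = 500; // flat rate up to 1 kg
        if (weightGrams > 1000) {
            // every started block of 500 g above 1 kg adds a surcharge
            int extraBlocks = (weightGrams - 1000 + 499) / 500;
            base += extraBlocks * 150;
        }
        return express ? base * 2 : base;
    }
}

/** Pins down the observed behavior, one business use case per check. */
final class ShippingBehaviourCheck {

    static void expect(String useCase, int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError(
                useCase + ": expected " + expected + " but got " + actual);
        }
    }

    static void run() {
        expect("parcels up to 1 kg ship at the flat rate",
               500, LegacyShipping.costInCents(1000, false));
        expect("each started 500 g above 1 kg adds a surcharge",
               650, LegacyShipping.costInCents(1001, false));
        expect("express delivery doubles the price",
               1000, LegacyShipping.costInCents(800, true));
    }
}
```

If the implementation is later rewritten, checks like these should still hold, because they describe what the business expects and not how the legacy code computes it.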

In case of doubts about the unit tests, I suggest taking a look at the following questions

10. Document inconsistencies

During the test phase, we might find discrepancies between the results we got and the results we expected. It would be dangerous to assume that these discrepancies are bugs. We could have found bugs, or features that were never documented anywhere, or we could have misunderstood some of the functional requirements. In any case, we should document these cases, and especially how to reproduce them.
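
One possible way to capture these cases while testing, sketched under the assumption that we want to record a mismatch instead of failing on it. DiscrepancyLog and its entry format are hypothetical; each entry keeps the use case, the expected and observed values, and the steps to reproduce.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Hypothetical helper that collects mismatches for later review. */
final class DiscrepancyLog {
    private final List<String> entries = new ArrayList<>();

    /**
     * Records a mismatch without judging it: it may be a bug, an
     * undocumented feature, or a misunderstood requirement.
     */
    void check(String useCase, String howToReproduce,
               Object expected, Object observed) {
        if (!Objects.equals(expected, observed)) {
            entries.add(String.format(
                "%s | expected=%s | observed=%s | reproduce: %s",
                useCase, expected, observed, howToReproduce));
        }
    }

    /** Read-only view of everything recorded so far. */
    List<String> entries() {
        return Collections.unmodifiableList(entries);
    }
}
```

The collected entries can then be pasted into the Readme or the issue tracker and reviewed with the people who know the business.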

11. Commit the changes frequently

As the documentation progresses, save the changes. Commit frequently.

Related questions


Note that I have not mentioned anything about executing and debugging the code. I think I could not say anything that hasn't been said already in the following questions:

Finally, remember that your main goal is not refactoring. That does not mean you can't make changes. You can add logs and comments. In order to differentiate your logs and comments from the existing ones, use markers. For example, I use XXX or the code of the issue (the task).

public void myMethod() {
  ...
  // XXX - 08/2017 - Laiv : Dead code. This code is never executed due to ...
  ...

  LOG.debug("XXX - my_trace - Param :{} ", param);
}

1: Remember that any responsibility can in turn be composed of one or more secondary responsibilities.

2: If the actual code is going to be reused (adapted) as is, then unit tests are a must.
