Decontextualized code doesn't say much about its intentions. It tells how things are done, but it doesn't say much about the why (requirements) and the when (use cases).
Any piece of code serves a purpose. The sum of many pieces serves a higher purpose. And so on. The first task is to find out the highest purpose: the overall idea (what problem it is solving). Then we have to contextualize it within the whole system in order to answer the why and the when.
The answers to these two questions are important for the future tests. Without them, we could end up testing irrelevant use cases or misinterpreting the results.
This inevitably leads us to ask people who have very little time to talk to us about it. But functional knowledge is as important as the technical one (or more), so keep asking.3
If we have found out the global idea, we have already identified the highest abstraction. As we delve into the code, we find more and lower levels of abstraction. And by abstractions, I mean responsibilities.1
The goal here is to identify the components involved in each responsibility and how they contribute to the global solution.
Provide future developers with a short introduction to the code.
Add a README-like file to the project and put in it all the information gathered during the contextualization. Introduce the code and the global picture. Also introduce the most relevant abstractions and why they are relevant.
Hints for running the application and reproducing the use cases are highly appreciated.
Having identified the responsibilities of the components, now is the time to document them in the code. Take a look at the documentation conventions of the language first (e.g. Python docstrings). Then write the code documentation according to these conventions. Be clear and concise.
Needless to say, most of us are terrible when it comes to naming things. Not everybody crafts self-documenting or self-descriptive code.
So, don't blindly trust everything you read. Get into the methods, functions or classes. If the component names are not aligned with their respective functionalities, document them, but don't change the actual names.
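As a minimal sketch of what such documentation could look like in Java (using Javadoc conventions), consider the snippet below. The class `PriceValidator`, its method and the discount rule are hypothetical and only serve to illustrate a name that doesn't match what the code does. The name stays untouched; the comment tells the truth.

```java
/**
 * Hypothetical legacy component, invented for illustration only.
 *
 * Responsibility: computes the final price of an order line.
 */
class PriceValidator {

    /**
     * Despite its name, this method does not validate anything.
     * It applies the customer discount and returns the resulting price.
     * The name is kept as-is because other legacy code may depend on it.
     *
     * @param basePriceCents  price before discount, in cents
     * @param discountPercent discount to apply, from 0 to 100
     * @return the discounted price, in cents (integer division truncates)
     */
    static long validate(long basePriceCents, int discountPercent) {
        long discount = basePriceCents * discountPercent / 100;
        return basePriceCents - discount;
    }
}
```

A reader who only sees `validate(...)` at a call site would assume a boolean-like check. The Javadoc corrects that expectation without touching the code.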
If the code uses 3rd party libraries, go back to the README file and introduce the dependencies: where and when they are used, versions, etc. Add links (if possible) to the official pages of the libraries.
I usually introduce the integrations with diagrams. These are much more expressive than textual documents and provide us with the global picture of the different systems involved.
Additionally, for each integration it could be useful to document request/response messages, ER data models, data sources, URLs, endpoints, protocols, etc. The more info you gather, the better.
This is easier said than done. Most of the legacy code we inherit was not written to be testable. Considering the future migration and the actual task (documenting), I'm unsure about the value provided by unit tests.
For this particular case, integration and end-to-end tests are more valuable than unit tests because the actual components might not be reused during the migration (normally they aren't). The migration could completely change the implementation 2, so I'm not sure if unit tests provide developers with meaningful insights. In any case, the other two (integration and end-to-end) are a must.
Ask your project manager how much time you have for completing this task. Implementing tests for legacy code that was not written to be tested tends to take longer than usual. So, the tests should be addressed to provide valuable insights into the business (the functionality) rather than the implementation details.
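To illustrate the difference, here is a rough sketch of checks that target a business rule through the public entry point instead of the internals. Everything in it is an assumption made up for the example: `LegacyOrderService`, the free-shipping threshold and the flat shipping fee. In a real project these checks would probably live in JUnit or a similar framework. Plain Java is used here only to keep the sketch self-contained.

```java
// Hypothetical legacy entry point; stands in for the real system under test.
class LegacyOrderService {
    // Assumed rule: orders of 5000 cents or more ship for free,
    // smaller orders pay a flat fee of 499 cents.
    static long checkout(int quantity, long unitPriceCents) {
        long subtotal = quantity * unitPriceCents;
        long shipping = subtotal >= 5000 ? 0 : 499;
        return subtotal + shipping;
    }
}

// Each check is named after a use case, not after an implementation detail.
class CheckoutBusinessRulesCheck {

    static void ordersAtOrAboveThresholdShipForFree() {
        long total = LegacyOrderService.checkout(2, 2500);
        if (total != 5000) {
            throw new AssertionError("Expected free shipping, got total " + total);
        }
    }

    static void smallOrdersPayFlatShipping() {
        long total = LegacyOrderService.checkout(1, 1000);
        if (total != 1499) {
            throw new AssertionError("Expected flat shipping fee, got total " + total);
        }
    }

    static void runAll() {
        ordersAtOrAboveThresholdShipForFree();
        smallOrdersPayFlatShipping();
    }
}
```

If the migration replaces `LegacyOrderService` entirely, these checks still describe what the business expects, which is the part worth keeping.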
During the test phase, we might find inconsistencies between the results we got and the results we expected. It would be dangerous to assume that these inconsistencies are bugs. We could have found bugs or missing features not documented anywhere. Or we could have misinterpreted some of the functional requirements. In any case, we should document these cases, especially how to reproduce them.
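One possible way to document such a case, sketched under assumptions: pin the observed behaviour in a small check and write the expectation and the reproduction steps next to it. `LegacyRounding` and the "expected" value are hypothetical. The point is only the shape of the note.

```java
// Hypothetical legacy behaviour: the cast truncates instead of rounding.
class LegacyRounding {
    static long toCents(double amount) {
        return (long) (amount * 100);
    }
}

class RoundingInconsistencyCheck {
    // XXX - <initials> : Inconsistency, NOT confirmed as a bug.
    // Expected (my reading of the requirement): 1.999 is stored as 200 cents.
    // Observed: 199 cents.
    // How to reproduce: call LegacyRounding.toCents(1.999).
    // Pending: confirm the rounding rule with the functional owner.
    static long observedCents() {
        return LegacyRounding.toCents(1.999);
    }
}
```

The check records what the system does today, so whoever picks up the migration can decide whether to preserve or fix it once the requirement is clarified.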
Note that I have not mentioned anything about executing and debugging the code. I think I could not say anything that hasn't already been said in the following questions:
Finally, remember that although your main goal is not refactoring, it doesn't mean you can't make changes. You can add logs and comments. In order to differentiate your logs and comments from the existing ones, use marks. For example, I use XXX and the initials of my name:
    public void myMethod() {
        ...
        // XXX - Laiv : Dead code. This code is never executed due to ...
        ...
        LOG.debug("XXX - my_trace - Param :{} ", param);
    }
2: If the actual code is going to be reused (adapted) as it is, then unit tests are a must.
3: One strategy that works for me is inviting these people for a coffee.