
Writing a code review for a large project was something that I had never done before and was quite a scary task to take on for the first time. They are a great thing to have done for a project, and help to maintain high standards and well developed solutions.
Having just left a development team and not having had any input into the codebase for the project I was the ideal candidate to write the code review. But.. I’d never written one before and didn’t really know where to start. The first thing I did as anyone would do is Google it. What did I find? Honestly, not very much for what I wanted and needed to review. Trying to follow the idea of open-learning I decided it was time to write up what you could call my opinion on how to write a code review. It isn’t going to be a perfect fit for every requirement but hopefully it helps fill a gap that I found when trying to find out how to do it.
TL;DR
If you need to write a code review for a large project and don’t know where to start:
- Understand the project and tech stack first.
- Structure your report clearly: Introduction, Methodology, Findings, and Recommendations.
- Review repositories for structure, secrets, pipelines, and documentation.
- Check code quality: readability, maintainability, error handling, and standards.
- Pay attention to security for both frontend and backend.
- Review testing strategy and coverage.
- Make actionable recommendations, prioritised by severity.
Understand the Project
Firstly you need to understand the project, what is it and why? What is it for, what requirements are there for this project to need to be built. Why is it being built? Understanding this context will help you understand some of the decisions which have been made.
You need to understand the technology that has been used, how has it been architected and are there are design patterns that have been followed. Knowing this will help you review the code and project against the right best practices and standards.
The project that I was reviewing was a healthcare web portal that consisted of a frontend built in Angular, a dotnet backed API and a SQL Server database. I can’t really share the detail about what it was as I don’t want to give anything away! But essentially it would allow a certain set of health care professionals see records relating to a set of patients.
How to Structure the Report
After carrying out the review you need to be able to present you findings in a concise way, otherwise what is the point of doing the review? If you can’t correctly communicate what you have found then they will never be fixed! This is the structure that I followed for the report I wrote:
- Introduction: Give a brief description of the project and the scope of the review you are writing. The introduction should also include a summary of the recommendations and findings. This provides a quick and easy context into the report.
- Methodology: Describe the approach taken to review the code, including any tools or techniques used. I broke this down into each repository that I was reviewing and what each one would be reviewed for.
- Findings: Present the findings of the code review, including any issues or concerns identified. This is the main body of the report and should be structured in a way that makes it easy to follow. I broke this down into sections for each repository and then further into sections for each area of the codebase. This was further broken down into each repository as per section 2.
- Recommendations: Provide recommendations for addressing the issues identified in the findings. This should be presented in a way that would allow the audience of the report to understand what you found, why it’s a problem and what to do about it.
How to do the Review
Now we know how to structure the report, how do we actually carry out the review? This is the part that I found some of the least information about for the overall part of it. Of course there are some very clear areas that are well documented, but some can be so opinion based that what you find online doesn’t really help.
As this project had three repositories, some things that I did were the same for all three, some however are specific to a given repository.
Repository
The first thing I did which was the same for all repositories was to review the repository itself for a few elements. During this review I made sure to look at the frameworks being used before anything else. Are they the latest version? For example, in this instance the API was using .Net 8.0, which whilst was still in long-term support at the time, .Net 9.0 was released and the latest version.
Files and Folders
- Structure: Does the repository have a clear and logic layout? If you can’t find files in a repository it’s not much use. Equally are files well-named with descriptive names which don’t lead to you having to open each file to find out what it is.
- For example, I found a few files for Azure DevOps pipelines that were named
azure-pipelines.yml
which is OK, however if you have multiple named the same with-1
,-2
, you have no idea what they are!
- For example, I found a few files for Azure DevOps pipelines that were named
- Documentation: Is there a README file that provides an overview of the project and how to get started? This file can also serve as a guide to the repository, especially if it is following a specific template.
- For example, every repository in this project had the default README for either the framework or in one case the standard Azure DevOps template.
Sensitive Data
- Secrets: Have any secrets, keys or passwords been added to the repository? If they have been checked in, they should be removed and they then need to be changed at source.
- For example, I found that some passwords had been checked in to one of the repositories within a script.
- Configuration: Are there any configuration files that contain sensitive information? They shouldn’t be stored in this way and should be stored in a secure location.
- Has an
appsettings.json
file been checked in with the real values rather than just the template? This is a common thing throughout dev teams which can easily lead to compromises.
- Has an
Pipelines
- Correct Pipelines: Does the repository have the right pipelines for the project to cover both build and deployment? Are they correctly configured with approval checks?
- Pipeline Tasks: Do the pipelines complete the right tasks for their purpose?
- It is common to run tests in a build pipeline that may run say on a pull request. However, are they running on the deployment pipeline? Personally I prefer that they do to have a final check for the publish.
Code quality
The next step is to review the code itself. How well is it written? Can it be easily understood? Is it maintainable? Are there any issues with the code that need to be addressed? Is it documented?
Readability
- Clarity: Is the code easy to read and understand? Are variable and function names descriptive and meaningful?
- Have you got
var a = 1
or are they more descriptive likevar age = 1
?
- Have you got
- Comments: Are there comments in the code that explain what the code is doing? Or are there too many comments?
- Arguably comments should be required sparingly. Are you reading the code wishing there were more comments to explain it? Or are you seeing too many comments having to describe specific variables that could be renamed?
- For example, some of the classes in one repository did not have all of the right summary tags. On the flip side, another had a class full of comments for each variable which in my opinion needed looking at as to why it was needed.
- Formatting: Is the code formatted consistently and cleanly?
- Are there some standards that this could be reviewed against? The way code is laid out can be extremely personal, but ideally a team should have a standard approach and there should be no incorrectly formatted files.
Maintainability
-
Modularity: Is the code modular and easy to maintain? Are there any areas of the code that are tightly coupled and difficult to change?
- Have you got a giant monolith of a class to review and understand or is it nicely broken down into cleaner chunks? The appropriate length is subjective and should be reviewed with an understanding of what it is doing and why.
-
Reusability: Are there any areas of the code that can be reused? Follow DRY! (Don’t Repeat Yourself)
- Are there lots of areas of repetition? Could a shared library be created and used throughout the project?
- Always make code as DRY as reasonably possible. It’s cleaner, easier to read and easier to maintain.
-
Commit Messages: Are the commit messages clear and describe what has been changed?
- Just having a message of
update
is not very helpful when reviewing any potential issues.
- Just having a message of
-
Pull Requests: Are the pull requests well written and describe the changes that are being proposed?
- Automatic policies can be set to help with this such as setting a template on the request, or requiring a work item to be linked.
Adherence to Standards
- Coding Standards: Does the project follow the coding standards that have been set by the team or the best practice for that technology?
- If the team have templates and standards they use, check this project against them. If they have deviated flag it up and discuss why.
- Design Patterns: In some industries there are certain patterns and designs that have to be followed. Have they been followed? If not, why?
- Within the public sector in the UK there is a gov.uk design system that is to be followed for any public gov website. This helps with consistency across all public websites.
Error Handling
- Error Handling: Is error handling implemented correctly? Are the right errors being caught and thrown? Do they get translated properly for both applications to understand but if you are presenting them to a user, is it in a user friendly way?
- Something that I found on the API side was that the application layer would throw a not found if it couldn’t find an entity by ID, however no other correct error types were thrown for other paths.
- Is the API returning the correct code and status for the thrown error? A mapper can be used to handle this for you.
- Logging: Does the application correctly log what is happening and why? Are logs appropriately sanitised?
- Something that is great to include in logging is a correlation ID. This can be used to trace a whole transaction and see what happened when and why.
- Retry logic: Is retry logic implemented when relying on another system such as a backend API?
- APIs can sometimes fail, or have a cold start. Retrying them after a short delay can help to ensure that the user still gets the information required rather than being met with an error or no data.
Security
Security is an important aspect of any code review. Are there any security issues that need to be addressed? Are there any areas of the code that could be improved from a security perspective? At this point some the areas to be checked vary depending on the technology stack being used, and you must be more specific depending on the repository. Some areas are of course required regardless and should be treated as such.
Authentication and Authorisation
- Authentication: Has authentication been implemented and is it secure? Does every point of the application that needs authentication have it?
- There are lots of different authentication methods available. Most nowadays use OAuth2 or OpenID Connect. Using one of these helps to follow good standards and best practices.
- Have the team rolled their own authentication? Yes - I have seen this, and honestly some templates still have it in. Why? It’s 2025, there are so many great options for this.
- Authorisation: Have the right roles and permissions been created in the application and are the right levels of authorisation required for each stage?
- Is the data returned trimmed to just the organisation a user belongs to?
- Has the principle of least privilege been followed? A user should have the bare minimum of access to be able to do what they need to do.
Backend API
These are the specific areas that I looked at for the backend API repository.
- Data Validation: Is the data that is passed to the API validated and checked to be in the correct format?
- The given API I was reviewing had little input due to the nature of the application but even the simple logging endpoint still had validation.
- SQL Injection: Are there any areas of the code that are vulnerable to SQL injection attacks?
- The project I reviewed used Entity Framework, so this was not an issue, but it is still worth checking.
- Cross-Site Request Forgery (CSRF): Are there any areas of the code that are vulnerable to CSRF attacks? Are anti-CSRF tokens being used where required? These are required on any endpoints which update in the backend such as POST, PUT and DELETE.
- The application I reviewed didn’t have this in place. There were a small number of endpoints that could be affected by this, just logs, but still should be implemented.
Frontend
These are the specific areas that I looked at for the frontend repository.
- Input validation: Are all inputs validated to ensure that they are in the correct format and exist if required? Are input fields set to the correct types?
- Whilst this application had minimal data input to the backend, there were still search fields and other inputs on the UI that should be validated.
- Cross-Site Scripting (XSS): The project I reviewed used Angular, so this was not as much of an issue as there are built in protections against XSS attacks. It should always be reviewed that the correct protections are in place.
- Cross-Site Request Forgery (CSRF): Yes, this is a duplicate from the backend API side, it should be checked here too. Has one side implemented it but not the other?
- Content Security Policy (CSP): Is a CSP implemented? This is a security feature that helps prevent XSS attacks by restricting the sources of content that can be loaded by the browser. A good tool for check this is CSP Evaluator.
- Just because there is a CSP setup in the code, doesn’t mean it works properly or is even published. Always use a tool to check this.
- Robots.txt accessibility: Is the robots.txt file accessible? This is a file that tells search engines which pages should not be indexed. It should not contain any sensitive information and should be accessible to search engines.
- In this instance, there was a robots.txt file in the repository. It wasn’t however accessible on the web server.
Testing
The next key element that I reviewed was the testing of the code in each repository. Are there tests in place? Are they well written and easy to understand? Do they cover all areas of the code? Are they run as part of the pipeline?
- For the backed API I was looking for both unit tests and integration tests. Unit tests are used to test individual components of the code, while integration tests are used to test how the components work together.
- For the frontend I was looking for unit tests and end-to-end tests. Unit tests are used to test individual components of the code, while end-to-end tests are used to test the entire application from the user’s perspective.
Test coverage is also something that I looked for. The project was utilising Azure DevOps for the build pipelines, with an additional tool called Coverlet to provide code coverage reports. This tool is fantastic for both developers and reviewers. It generates easy to read reports that show you test coverage broken down by both file and the class. Back when I ran a dev team I would always push for a 80% test coverage. Can this always be achieved? No, but having a target helps to ensure that tests are written. Sometimes it just doesn’t happen though, not every path gets tested.
Recommendations
The final section of the report was the recommendations. This is where I provided recommendations to resolve each of the findings. I broke this down into a section for each repository and made sure that each recommendation was numbered so it could easily be reference.
To help the team prioritise the recommendations to resolve I scored them against a simple scale:
- Critical: These should be dealt with as soon as possible. They could pose a serious risk to the project.
- For example, sensitive data in a repository. Someone who shouldn’t have a password, now has it. The password needs to be changed and deleted in the repository. Make sure gitignores are in place to help prevent it happening again.
- High: These should be dealt with next, the could cause you a problem.
- For example, a lack of documentation for the codebase, making it harder to maintain. Does it matter instantly? Probably not, but the moment you have a new developer or one leave the team it will.
- Medium: These are issues that should be addressed, but are not critical to the project.
- For example, poor git standards such as commit messages or pull request descriptions that do not follow best practices.
- Low: These are issues that could be addressed, but have low impact to the project.
- For example, minor updates to pipelines to properly name tasks, a nice to have but not essential.
Conclusion
Writing a code review was not a quick job, it took me a few days once I got into it, and honestly it took me a while to get started. I think the key issue was just not knowing where to start and I hope I’ve helped you with that if you’ve read this far!
Realistically you are going to be reading code that you have never read or written, and that’s the point. Don’t be worried if it’s taking time, there is a reason you have been chosen to do it. Take your time and break it into reasonable chunks. As a friend always said to me, “Take small bites of the elephant, you can’t do it all at once”.
My last opinion is, be consistent and thorough. Take your time and remember the point of doing it is to help build great software in a great way.