Configuring codespell in GitLab Repositories
This guide explains how to integrate codespell, a tool that detects common misspellings in source code, into a GitLab repository.
Two main usage scenarios are covered:
Local pre-commit hooks: codespell runs automatically when you attempt a Git commit, scanning staged changes for spelling errors.
CI/CD pipeline enforcement: codespell runs in GitLab CI as a job. If spelling errors are found, the job fails and contributors must fix the issues locally before resubmitting.
In both cases, a shared configuration file .codespellrc ensures consistent behavior.
Example:
[codespell]
skip = build,*.drawio,*.svg,*.pdf
ignore-words-list = laf,OT
write-changes = true
skip: Specifies directories or file patterns to ignore (e.g., build outputs, diagrams).
ignore-words-list: Defines project-specific terms that should not be flagged (e.g., laf, OT).
write-changes: Lets codespell automatically apply suggested fixes. If omitted, codespell runs in dry-run mode and only reports issues.
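To make the layout of the file concrete, the following minimal sketch reads the example settings with Python's standard configparser module. It only illustrates the INI structure (one [codespell] section, comma-separated lists, a boolean flag). The helper name load_settings is hypothetical and is not part of codespell.

```python
import configparser

# The example .codespellrc from this guide, inlined so the sketch is self-contained.
CODESPELLRC = """\
[codespell]
skip = build,*.drawio,*.svg,*.pdf
ignore-words-list = laf,OT
write-changes = true
"""


def split_list(value: str) -> list:
    """Split a comma-separated option value into clean items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_settings(text: str) -> dict:
    """Hypothetical helper: parse .codespellrc-style text into Python values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["codespell"]
    return {
        "skip": split_list(section.get("skip", "")),
        "ignore_words": split_list(section.get("ignore-words-list", "")),
        # Treated as False when the option is absent (report-only mode).
        "write_changes": section.getboolean("write-changes", fallback=False),
    }


settings = load_settings(CODESPELLRC)
print(settings)
```

Reading the file this way is also a quick check that a hand-edited .codespellrc is still valid INI before it is committed.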
Using codespell as a Git Pre-Commit Hook (Local)
This method prevents spelling errors from being committed.
Steps
Install pre-commit.
pre-commit is a framework that manages Git hooks. Install it with:
pip install pre-commit
Activate the pre-commit hook in your repository.
Go to your local repository and enable hooks with:
pre-commit install
Modify .pre-commit-config.yaml.
Add the codespell configuration:
repos:
  - repo: https://github.com/codespell-project/codespell
    rev: v2.4.1  # latest release version
    hooks:
      - id: codespell
        args: [--config=.codespellrc]
repo: Location of the codespell hook repository.
rev: The version of codespell to use. It is recommended to use v2.4.1 to remain consistent with other repositories.
id: Selects the codespell hook.
args: Passes additional arguments, here explicitly pointing to .codespellrc.
Expected Behavior of the Local pre-commit Hook for codespell
When the local pre-commit hook for codespell runs, it scans only the files that have been modified. During the check, codespell automatically fixes typos when it is confident about the correct replacement.
Automatic Correction by codespell
If there are multiple possible corrections, codespell does not modify the files directly. Instead, it prints suggestions in the terminal output for you to review. For example, the messages in the following screenshot show that codespell has found two potential typos, “editers” and “repoted”, in lines 4 and 12 respectively. Because it cannot tell which correction to apply, it displays the possible alternatives highlighted in green for you to review and adopt.
Suggested Corrections from codespell
Implement codespell in GitLab CI/CD
To ensure spelling checks are enforced for all contributions, you can integrate codespell directly into the GitLab CI/CD pipeline. When configured, each Merge Request pipeline can trigger a codespell job that scans only the modified files. If spelling errors are found, the job fails, and the contributor must fix the typos locally before pushing the corrected commit.
You can add codespell to your CI/CD process by either modifying .gitlab-ci.yml or, if pre-commit is already integrated (usually in .gitlab/ci/pre_commit.yml), simply updating .pre-commit-config.yaml without changing any CI files.
If you use .gitlab-ci.yml to manage all your jobs, take the following code as an example:
codespell_check:
  stage: check
  image: $ESP_DOCS_ENV_IMAGE
  extends:
    - .before_script_minimal
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  before_script:
    - pip install codespell
    - git fetch origin $CI_MERGE_REQUEST_TARGET_BRANCH_NAME --depth=1
    - git fetch origin $CI_COMMIT_REF_NAME --depth=1
    - export MODIFIED_FILES="$(git diff --name-only origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME..origin/$CI_COMMIT_REF_NAME)"
  script:
    - |
      if [ -n "$MODIFIED_FILES" ]; then
        echo "Running codespell on changed files:"
        echo "$MODIFIED_FILES"
        codespell --config .codespellrc $MODIFIED_FILES
      else
        echo "No modified files to check with codespell."
      fi
stage: Specifies the pipeline stage this job belongs to. The stage can be check or pre-check, as defined earlier in the yml file.
image: Specifies the Docker image used for running jobs. This should match the image already configured in your repository.
extends: Inherits configuration from a predefined template, e.g., .before_script_minimal. This reduces repetition and ensures consistent initialization across jobs.
rules: Defines conditions for when the job should run. For example, using if: $CI_PIPELINE_SOURCE == "merge_request_event" ensures the job only runs in pipelines triggered by a Merge Request, and not for regular branch pushes or scheduled pipelines.
before_script:
Installs codespell into the job environment.
Fetches the latest commit of the target branch and of the source branch so that they can be compared.
Collects the list of modified files for spell checking.
script: Runs codespell on modified files only. If spelling errors are detected, the job fails.
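The file-selection logic of this job can also be sketched in Python, which makes the two decisions explicit: skip the check when nothing changed, and otherwise pass each changed path to codespell as a separate argument. This is an illustrative sketch, not part of the job above. The function names are hypothetical, and changed_files assumes it runs inside a clone where both origin branches have already been fetched.

```python
import subprocess
from typing import List, Optional


def parse_name_only(diff_output: str) -> List[str]:
    """Turn `git diff --name-only` output into a list of paths, one per line."""
    return [line for line in diff_output.splitlines() if line.strip()]


def changed_files(target: str, source: str) -> List[str]:
    """Paths that differ between origin/<target> and origin/<source>.

    Assumes both branches were fetched beforehand, as in before_script.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", f"origin/{target}..origin/{source}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_name_only(result.stdout)


def build_codespell_command(
    files: List[str], config: str = ".codespellrc"
) -> Optional[List[str]]:
    """Return the codespell argv for the given files, or None if there are none."""
    if not files:
        return None
    return ["codespell", "--config", config, *files]


# Demo on literal diff output, so the sketch runs without a Git repository.
files = parse_name_only("docs/index.rst\n\ndocs/user guide.rst\n")
print(build_codespell_command(files))
```

Splitting on lines keeps a path such as "docs/user guide.rst" as a single argument, whereas an unquoted shell variable is split on whitespace.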
If your project runs unified pre-commit jobs in the CI pipeline, adding a separate codespell_check job in .gitlab-ci.yml is unnecessary. The CI pipeline simply uses the same .pre-commit-config.yaml file, ensuring that the checks performed in CI are identical to those on local machines.
How to Run Unified Pre-Commit Jobs in GitLab CI/CD
Running unified pre-commit jobs in GitLab CI/CD is recommended because it allows you to easily add or update hooks in the future with minimal changes. Follow these steps to set it up:
Step 1: Include the .gitlab/ci/pre_commit.yml file in your root .gitlab-ci.yml configuration.
Example:
include:
- '.gitlab/ci/pre_commit.yml'
Note
If your .gitlab-ci.yml file does not yet define any global workflow rules, please add the following configuration:
workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"  # Run pipelines for merge requests
    - when: always  # Fallback rule to always run if no other rules match
Step 2: Create the .gitlab/ci/pre_commit.yml file to include the following job definition:
.check_pre_commit_template:
  stage: check  # or "pre-check", depending on your pipeline structure
  image: $ESP_DOCS_ENV_IMAGE  # Use the same image as other jobs in your repository
  extends:
    - .before_script_minimal  # Optional: reuse an existing minimal setup if available
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"  # Run this job only for merge request pipelines.
  script:
    - pip install pre-commit
    - git fetch origin $CI_MERGE_REQUEST_TARGET_BRANCH_NAME --depth=1  # Fetch target branch latest commit
    - git fetch origin $CI_COMMIT_REF_NAME --depth=1  # Fetch source branch latest commit
    - |
      echo "Target branch: $CI_MERGE_REQUEST_TARGET_BRANCH_NAME"
      echo "Source branch: $CI_COMMIT_REF_NAME"
      MODIFIED_FILES=$(git diff --name-only origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME..origin/$CI_COMMIT_REF_NAME)
      echo "Modified files to check:"
      echo "$MODIFIED_FILES"
      if [ -n "$MODIFIED_FILES" ]; then
        CI=true pre-commit run --files $MODIFIED_FILES
      else
        echo "No modified files to check."
      fi

check_pre_commit:
  extends:
    - .check_pre_commit_template  # All configured pre-commit hooks will run under this job
Note
If your repository includes submodules, make sure to fetch them in the pre-commit job to ensure all files are available for scanning.
In this case, add the following configuration under .check_pre_commit_template:
variables:
  GIT_STRATEGY: fetch
  SUBMODULES_TO_FETCH: "all"
script:
  - fetch_submodules  # Usually defined in utils.sh to fetch submodules
  # If 'fetch_submodules' is not available, you can alternatively use:
  # - git submodule update --init --recursive
Step 3: Modify the .pre-commit-config.yaml file to add the expected hook.
Example:
repos:
  - repo: https://github.com/codespell-project/codespell
    rev: v2.4.1
    hooks:
      - id: codespell
        args: [--config=.codespellrc]
After completing these steps, your CI pipeline will automatically run all pre-commit hooks on the files modified in each Merge Request.
For a more detailed example, refer to .gitlab-ci.yml in the ESP-IDF repository.
Summary
Local pre-commit hook: Prevents contributors from committing misspelled files.
CI/CD pipeline check: Enforces centralized spelling checks in CI pipelines.
Unified config: .codespellrc guarantees consistent behavior across environments.
Combining both approaches ensures consistent spelling and improves documentation quality.
Potential Risks of codespell
codespell works from a dictionary of common misspellings and their correct forms. During a check, it scans files for words that appear in this dictionary and suggests corrections. As a result, codespell is not a comprehensive spell checker: it only detects typos that exist in its dictionary and may miss other spelling mistakes.
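The dictionary-based approach can be illustrated with a toy sketch. It is not codespell's implementation: the dictionary entries below are made up for demonstration, and check_line is a hypothetical helper. It shows why a typo that is missing from the dictionary goes unnoticed, and why an entry with several candidate corrections is only reported and never rewritten automatically.

```python
import re

# Made-up stand-in for a misspelling dictionary: typo -> candidate corrections.
# The typos are intentional and the entries are hypothetical.
MISSPELLINGS = {
    "teh": ["the"],
    "recieve": ["receive"],
    "editers": ["editors", "edits"],
}

WORD_RE = re.compile(r"[A-Za-z]+")


def check_line(line: str, fix: bool = False):
    """Return (line, reports). With fix=True, unambiguous typos are replaced."""
    reports = []

    def handle(match):
        word = match.group(0)
        candidates = MISSPELLINGS.get(word.lower())
        if candidates is None:
            return word  # Not in the dictionary, so it is never flagged.
        reports.append(f"{word} ==> {', '.join(candidates)}")
        if fix and len(candidates) == 1:
            return candidates[0]  # Exactly one candidate: safe to apply.
        return word  # Several candidates: report only, leave the text alone.

    return WORD_RE.sub(handle, line), reports


fixed, reports = check_line("teh editers recieve mesages", fix=True)
print(fixed)
print(reports)
```

In this sketch "mesages" survives untouched because the toy dictionary has no entry for it, which is the same limitation described above.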
To catch more typos, it is strongly recommended to also enable a spell checker in your IDE or editor. For instance, see Use Spellcheckers in VS Code for a setup that flags spelling mistakes as you type.