Module 6: Advanced features, deployment, and CI/CD integration with DeepSeek and Flask

Welcome to Module 6 of the tutorial series: Build an AI-Powered Documentation Assistant with DeepSeek and Flask. In this module, you’ll implement advanced features like version control, documentation drift detection, and automated deployment. You’ll also learn how to generate documentation for an entire GitHub repository, including subdirectories. By the end, your documentation assistant will be a fully automated, production-ready web service.

Prerequisites

Before starting Module 6, ensure you’ve completed the following steps:

  1. Module 5: You should have a working Flask app that generates and hosts API documentation on GitHub Pages.
  2. GitHub Repository: Your Flask app and documentation should be hosted on GitHub.
  3. DeepSeek API Key: Ensure your DeepSeek account has sufficient balance.
  4. Cloud Platform Account: Sign up for a cloud platform like Heroku, AWS, or Google Cloud for deployment.
  5. GitHub Access Token: Generate a GitHub access token with the repo scope and add it to your .env file:
    GITHUB_ACCESS_TOKEN=your_github_access_token

     

Lesson 11: Version control & documentation drift detection

In this lesson, you’ll implement version control for your documentation and detect when documentation becomes outdated due to code changes.

Objective

  • Compare old and new versions of documentation.
  • Send update suggestions via GitHub PR comments.
  • Add unit and integration tests for the Flask API.

Step 1: Implement Version Control for Documentation

You’ll add functionality to tag documentation versions and compare them.

  1. Open docstring_generator.py: Ensure that the tag_documentation_version function exists and includes the following code:
    def tag_documentation_version(version):
        """
        Tag the current documentation with a version number.
        """
        import subprocess
        tag = f"v{version}"
        # check=True raises CalledProcessError if git fails (for example, when the
        # tag already exists), so the calling route can report the failure.
        subprocess.run(["git", "tag", tag], check=True)
        subprocess.run(["git", "push", "origin", tag], check=True)

    The tag_documentation_version function tags the current documentation with a version number. It imports the subprocess module to run system commands. First, it uses the git tag command to create a version tag, prefixed with “v” and followed by the specified version number. Then, it pushes the newly created tag to the remote repository using the git push command. Passing check=True makes subprocess.run raise an exception when either git command exits with an error, instead of failing silently.

     

  2. Update routes.py: Add a new route to handle version tagging.
    @main_bp.route("/tag-version", methods=["POST"])
    def tag_version_route():
        """
        Tag the current documentation with a version number.
        """
        # silent=True returns None instead of raising on a missing or invalid JSON body.
        data = request.get_json(silent=True) or {}
        version = data.get("version")
        if not version:
            return jsonify({"error": "Missing 'version' in request body."}), 400
        try:
            tag_documentation_version(version)
            return jsonify({"message": f"Documentation tagged as v{version} successfully!"})
        except Exception as e:
            return jsonify({"error": str(e)}), 500

    The tag_version_route function handles the tagging of documentation with a version number. The @main_bp.route("/tag-version", methods=["POST"]) decorator maps the function to the /tag-version endpoint and allows only POST requests.

    1. The function reads the JSON request body and extracts the version value. If the body is missing or contains no version, it returns a 400 (Bad Request) response.
    2. It calls the tag_documentation_version function, passing the version as an argument to create the tag.
    3. If the tagging succeeds, the function responds with a JSON message confirming that the documentation was successfully tagged.
    4. If an error occurs during the process, the function catches the exception and returns a JSON response with the error message and a 500 (Internal Server Error) status code.
  3. Test version tagging: Use the following curl command to test the new route:
    curl -X POST http://127.0.0.1:5000/tag-version -H "Content-Type: application/json" -d '{"version": "1.0.0"}'
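
Because the version string from the request ends up as an argument to git, it can help to validate it before tagging. The following is a minimal sketch and not part of the tutorial's code: the helper name is_valid_version and the MAJOR.MINOR.PATCH pattern are assumptions chosen for illustration.

```python
import re

# Hypothetical helper: accepts only MAJOR.MINOR.PATCH versions such as "1.0.0".
_VERSION_PATTERN = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def is_valid_version(version):
    """Return True if version is a string like '1.0.0'."""
    return isinstance(version, str) and _VERSION_PATTERN.fullmatch(version) is not None
```

A check like this could run in tag_version_route before tag_documentation_version is called, returning a 400 response when it fails.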

Step 2: Detect documentation drift

You’ll compare old and new versions of documentation to detect changes.

  1. Update docstring_generator.py: Ensure that the detect_outdated_docs function exists and includes the following code:
    def detect_outdated_docs(code, docs):
        """
        Detect outdated documentation by comparing code and docs.
        """
        current_docs = generate_markdown_docs(code)
        if current_docs != docs:
            return "Documentation is outdated. Please regenerate."
        return "Documentation is up-to-date."

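    The detect_outdated_docs function identifies outdated documentation by regenerating documentation from the current code and comparing it with the provided documentation. The comparison is an exact string match, so any difference in wording or whitespace counts as drift.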

    1. The function generates the current documentation from the code by calling the generate_markdown_docs function and assigns the result to the current_docs variable.
    2. It compares current_docs with the provided docs.
    3. If the two do not match, the function returns a message indicating that the documentation is outdated and needs regeneration.
    4. If the documentation matches the code, the function returns a message confirming that the documentation is up-to-date.

     

  2. Update routes.py: Add a new route to check for outdated documentation.
    @main_bp.route("/check-docs", methods=["POST"])
    def check_docs_route():
        """
        Check if documentation is outdated.
        """
        data = request.json
        code = data.get("code")
        docs = data.get("docs")
        try:
            result = detect_outdated_docs(code, docs)
            return jsonify({"message": result})
        except Exception as e:
            return jsonify({"error": str(e)}), 500

    The check_docs_route function handles a request to check whether the documentation is outdated.

    1. The function listens for a POST request at the /check-docs endpoint.
    2. It extracts the JSON data from the request and retrieves the values for "code" and "docs".
    3. It calls the detect_outdated_docs function to compare the code with the provided documentation and assigns the result to the result variable.
    4. It responds with a JSON object containing the resulting message, whether the documentation is outdated or up-to-date.
    5. If an error occurs during execution, the function catches the exception and responds with an error message and a status code of 500.
  3. Test documentation drift detection: Use the following curl command to test the new route:
    curl -X POST http://127.0.0.1:5000/check-docs -H "Content-Type: application/json" -d '{"code": "def add(a, b): return a + b", "docs": "Old documentation"}'
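
The route above only reports whether the two versions differ. To show what changed between the stored and the regenerated documentation, a unified diff is one option. This is a minimal sketch using the standard library's difflib; the function name diff_docs and the file labels are assumptions, not part of the tutorial's code.

```python
import difflib


def diff_docs(old_docs, new_docs):
    """Return a unified diff between two documentation strings ('' if identical)."""
    old_lines = old_docs.splitlines(keepends=True)
    new_lines = new_docs.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile="docs (stored)",
        tofile="docs (regenerated)",
    )
    return "".join(diff)


# Example: one line was added to the regenerated documentation.
print(diff_docs("# add\n", "# add\nReturns the sum of a and b.\n"))
```

An empty result means the two versions match; a non-empty result could be included in the JSON response or in a pull request comment.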

     

     

Step 3: Add unit and integration tests

You’ll write tests to ensure your Flask API works as expected.

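  1. Create tests/test_routes.py: Add unit and integration tests for your Flask API.
    import os
    import sys

    import pytest

    # Add the project root to the import path before importing the app package;
    # placing this after the import would be too late to have any effect.
    sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))

    from app import create_app  # noqa: E402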
    
    @pytest.fixture
    def client():
        app = create_app()
        app.config['TESTING'] = True
        with app.test_client() as client:
            yield client
    
    def test_home_route(client):
        response = client.get('/')
        assert response.status_code == 200
        assert b"Welcome" in response.data
    
    def test_generate_docstring_route(client):
        response = client.post('/generate-docstring', json={"code": "def example(): pass"})
        assert response.status_code in [200, 500]  # 500 if DeepSeek API key is missing
    
    def test_check_docs_route(client):
        response = client.post("/check-docs", json={"code": "def add(a, b): return a + b", "docs": "Old documentation"})
        assert response.status_code == 200
        assert "Documentation is outdated" in response.json["message"]

    The code defines a set of tests for a Flask application using the pytest framework.

    1. The script imports the necessary modules: os, sys, pytest, and the create_app function from the app module.
    2. It adds the parent directory of the current script to the Python path. This step allows the script to access the app module when running tests.
    3. The client fixture creates a Flask test client. It sets the app’s configuration to TESTING mode and provides a client for making requests within a with context.
    4. The test_home_route function sends a GET request to the root endpoint ('/'). It verifies that the response status code is 200 and that the response body contains the word "Welcome".
    5. The test_generate_docstring_route function sends a POST request to the /generate-docstring endpoint with a simple Python function as input. It allows either a 200 status code for successful generation or a 500 status code if the DeepSeek API key is missing.
    6. The test_check_docs_route function sends a POST request to the /check-docs endpoint with code and outdated documentation as input. It asserts that the response status code is 200 and verifies that the message indicates outdated documentation.

     

  2. Run Tests:
    Install pytest and run the tests:

    pip install pytest
    pytest tests/test_routes.py

Lesson 12: Deployment & CI/CD integration

In this lesson, you’ll deploy your Flask app and integrate it with a CI/CD pipeline.

Objective

  • Deploy your Flask app as a web service.
  • Integrate your documentation assistant with CI/CD pipelines for seamless updates.

Step 1: Deploy Your Flask App

You’ll deploy your Flask app to a cloud platform like Heroku.

  1. Install Heroku CLI:
    Install the Heroku CLI if you haven’t already. On macOS with Homebrew:

    brew install heroku/brew/heroku
  2. Create a Procfile: Add a Procfile to your project root:
    web: python run.py
  3. Deploy to Heroku: Follow these steps to deploy your app:
    heroku login
    heroku create
    git push heroku main
    heroku open
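
Heroku assigns the port for the web process through the PORT environment variable, so a run.py that always listens on port 5000 will not receive traffic. The following is a minimal sketch, assuming run.py starts the Flask server with app.run; the helper name get_bind_address is hypothetical.

```python
import os


def get_bind_address(default_port=5000):
    """Return (host, port), preferring the PORT environment variable when set."""
    port = int(os.environ.get("PORT", default_port))
    # 0.0.0.0 listens on all interfaces, which a hosted web process needs.
    return "0.0.0.0", port


# In run.py this might be used as follows (assuming `app` is the Flask application):
#   host, port = get_bind_address()
#   app.run(host=host, port=port)
```

Locally, where PORT is usually unset, the helper falls back to port 5000, so the curl commands in this module keep working.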

Step 2: Set Up CI/CD with GitHub Actions

You’ll automate testing and deployment using GitHub Actions.

  1. Create .github/workflows/ci-cd.yml: Add a GitHub Actions workflow file:
    name: CI/CD Pipeline
    
    on:
      push:
        branches:
          - main
      pull_request:
        branches:
          - main
    
    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: Set up Python
            uses: actions/setup-python@v5
            with:
              python-version: '3.9'
          - name: Install dependencies
            # pytest is listed explicitly in case requirements.txt does not include it
            run: pip install -r requirements.txt pytest
          - name: Run tests
            run: pytest
    
      deploy:
        runs-on: ubuntu-latest
        needs: test
        # Deploy only on pushes to main, not on pull requests
        if: github.event_name == 'push'
        steps:
          - uses: actions/checkout@v4
            with:
              fetch-depth: 0  # full history; Heroku rejects pushes from shallow clones
          - name: Deploy to Heroku
            env:
              HEROKU_API_KEY: ${{ secrets.HEROKU_API_KEY }}
            run: |
              git remote add heroku https://heroku:$HEROKU_API_KEY@git.heroku.com/your-app-name.git
              git push heroku main

    The code defines a CI/CD pipeline using GitHub Actions. It runs the tests when a developer pushes code or opens a pull request against the main branch, and it deploys only after a push to main passes the tests.

    1. Trigger Events: The pipeline activates when someone pushes code to the main branch or opens a pull request targeting the main branch.
    2. Job: test:
      • The job runs on an Ubuntu virtual machine.
      • It checks out the code from the repository.
      • It sets up Python version 3.9.
      • It installs the project’s dependencies from the requirements.txt file, plus pytest.
      • It runs tests using the pytest testing framework.
    3. Job: deploy:
      • This job depends on the successful completion of the test job and is skipped for pull requests.
      • It runs on an Ubuntu virtual machine.
      • It checks out the code from the repository with its full history.
      • It deploys the code to Heroku using the Heroku API key stored as a secret in the repository.
      • It adds Heroku as a remote Git repository and pushes the code to the main branch of the Heroku app.

    This pipeline ensures that the code passes tests before deploying it to production.

Step 3: Automate Documentation Updates

You’ll update your CI/CD pipeline to regenerate and push documentation when the codebase changes.

  1. Update .github/workflows/ci-cd.yml: Add a step to regenerate and push documentation:
    - name: Regenerate and push documentation
      run: |
        python -c "from app.utils.docstring_generator import generate_markdown_docs, save_docs, push_to_github; code = open('app/main.py').read(); docs = generate_markdown_docs(code); save_docs(docs, 'index.html'); push_to_github(docs, 'your-repo-name')"

    The code defines a step in a CI/CD pipeline that regenerates and pushes documentation to a GitHub repository. Here’s a breakdown:

    1. Run the Python script inline: The code executes a Python script directly from the command line using the -c flag.
    2. Import functions from the docstring_generator module: It imports the generate_markdown_docs, save_docs, and push_to_github functions from the app.utils.docstring_generator module.
    3. Read the code from app/main.py: The script opens and reads the contents of the app/main.py file and stores it in the code variable.
    4. Generate documentation: The script calls generate_markdown_docs(code) to generate documentation from the code’s docstrings and assigns the output to the docs variable.
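    5. Save the documentation to a file: The script invokes save_docs(docs, 'index.html'), which saves the generated documentation to a file named index.html. Note that docs holds Markdown at this point; the /generate-repo-docs route later in this module converts Markdown with generate_html_docs before saving an .html file.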
    6. Push the documentation to GitHub: The script calls push_to_github(docs, 'your-repo-name'), which pushes the documentation to the specified GitHub repository.

    This step automates the documentation generation, saving, and deployment process.
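
Regenerating on every run can rewrite the file even when nothing changed. The sketch below writes the output only when its content differs, so a pipeline could skip the push step otherwise. The function name write_docs_if_changed is an assumption and is independent of the tutorial's save_docs helper.

```python
from pathlib import Path


def write_docs_if_changed(docs, output_path):
    """Write docs to output_path and return True only if the content changed."""
    path = Path(output_path)
    if path.exists() and path.read_text(encoding="utf-8") == docs:
        return False  # nothing to update
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(docs, encoding="utf-8")
    return True
```

A CI script could call this with the regenerated documentation and invoke the push step only when it returns True.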

New feature: Generate documentation for an entire repository

In this section, you’ll learn how to generate documentation for an entire GitHub repository, including subdirectories.

Step 1: Update github_api.py

Modify the fetch_repo_contents function to recursively fetch contents from directories.

def fetch_repo_contents(owner, repo, path=""):
    """
    Fetch the contents of a GitHub repository recursively.
    """
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/{path}"
    headers = {"Authorization": f"token {GITHUB_ACCESS_TOKEN}"}
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        contents = response.json()
        all_contents = []
        for item in contents:
            if item["type"] == "dir":
                # Recursively fetch contents of the directory
                all_contents.extend(fetch_repo_contents(owner, repo, item["path"]))
            else:
                all_contents.append(item)
        return all_contents
    else:
        raise Exception(f"Failed to fetch repository contents: {response.status_code}")

The fetch_repo_contents function retrieves the contents of a GitHub repository, including its files and directories, by making requests to the GitHub API. Here’s how the code operates:

  1. Construct the URL: The function builds a URL that targets the contents of the specified repository. It dynamically inserts the owner, repo, and path parameters into the URL.
  2. Set the authorization header: The function includes an authorization header that uses a GitHub access token stored in the variable GITHUB_ACCESS_TOKEN to authenticate the request.
  3. Send a GET request to the GitHub API: The function makes a request to the GitHub API using the requests.get method and stores the server’s response in the response variable.
  4. Check the response status code: If the response status code is 200 (indicating success), the function proceeds to process the response data. Otherwise, it raises an exception with an error message.
  5. Parse the JSON response: The function converts the response data from JSON format into a Python object and stores it in the contents variable.
  6. Initialize an empty list to store all contents: The function creates an empty list called all_contents to store the fetched file entries.
  7. Iterate through the contents: The function loops through each item in the contents list. If the item is a directory ("type": "dir"), the function recursively calls itself to fetch the contents of that directory and adds them to the all_contents list. If the item is not a directory, the function adds the item to the list directly.
  8. Return the complete list of contents: After processing all items, the function returns the all_contents list, a flat list of the file entries found under the specified path, including those in subdirectories. Directory entries themselves are not included.
  9. Handle errors with an exception: If the response status code is not 200, the function raises an exception with an error message indicating the failure and the status code.
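
The recursive approach above makes one API request per directory. GitHub also provides a Git Trees endpoint that can return a repository's whole tree in one response when called with recursive=1. The sketch below shows only the URL and the filtering of a response, using a hand-written sample payload instead of a live request; the function names and the default ref of "main" are assumptions.

```python
def build_tree_url(owner, repo, ref="main"):
    """Build the Git Trees API URL for a branch, tag, or commit SHA."""
    return f"https://api.github.com/repos/{owner}/{repo}/git/trees/{ref}?recursive=1"


def python_paths_from_tree(payload):
    """Return the paths of .py files from a Git Trees API response body."""
    return [
        entry["path"]
        for entry in payload.get("tree", [])
        if entry.get("type") == "blob" and entry["path"].endswith(".py")
    ]


# Shortened, hand-written example of the response shape (not real data).
sample_payload = {
    "truncated": False,
    "tree": [
        {"path": "app", "type": "tree"},
        {"path": "app/routes.py", "type": "blob"},
        {"path": "README.md", "type": "blob"},
        {"path": "app/utils/github_api.py", "type": "blob"},
    ],
}

print(python_paths_from_tree(sample_payload))
```

Tree entries do not carry a download_url, so file contents would still need to be fetched separately, and very large repositories can come back with truncated set to true.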

Step 2: Update github_api.py

Modify the fetch_and_process_repo function to process all .py files, including those in subdirectories.

        
def fetch_and_process_repo(owner, repo):
    """
    Fetch and process all Python files in a repository.
    """
    print(f"Fetching repository contents for {owner}/{repo}...")
    contents = fetch_repo_contents(owner, repo)
    python_files = [file for file in contents if file["name"].endswith(".py")]
    all_code = ""
    for file in python_files:
        try:
            print(f"Processing file: {file['path']}")
            code = download_file_contents(file["download_url"])
            print(f"Code from {file['path']}:\n{code}\n")  # Debug: Print the fetched code
            all_code += f"# File: {file['path']}\n\n{code}\n\n"
        except Exception as e:
            print(f"Failed to process {file['path']}: {str(e)}")
    print("Repository processing complete.")
    return all_code

The fetch_and_process_repo function retrieves and processes all Python files in a GitHub repository. Here’s how the code works:

  1. Print the fetching status: The function displays a message indicating that it is fetching the contents of the specified repository.
  2. Fetch the repository contents: The function calls fetch_repo_contents(owner, repo) to retrieve all the files and directories within the repository. It stores the result in the contents variable.
  3. Filter for Python files: The function iterates through the contents list and selects only the files with names ending in .py. It stores these files in the python_files list.
  4. Initialize an empty string to store code: The function creates an empty string all_code to store the code from all the Python files.
  5. Process each Python file: The function loops through each file in python_files. For each file:
    • It prints a message showing the file being processed.
    • It calls download_file_contents(file["download_url"]) to fetch the file’s code.
    • It prints the fetched code for debugging purposes.
    • It appends the code to all_code, including a comment indicating the file’s path.
  6. Handle errors gracefully: If an error occurs while processing a file, the function catches the exception and prints an error message with the file path and the error details.
  7. Print the completion message: After processing all files, the function displays a message indicating that the repository processing is complete.
  8. Return the combined code: The function returns the all_code string, which contains the code from all the processed Python files, separated by file path comments.
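
For a large repository, the combined string can become too long to send to a model in one request. The sketch below builds the same "# File: path" layout but stops adding files once a size budget is reached. The function name combine_sources, the dictionary input, and the 20,000-character default are assumptions for illustration, not limits taken from DeepSeek.

```python
def combine_sources(sources, max_chars=20000):
    """Join {path: code} into one string, skipping files that exceed the budget.

    Returns (combined_code, skipped_paths).
    """
    parts = []
    skipped = []
    used = 0
    for path, code in sources.items():
        block = f"# File: {path}\n\n{code}\n\n"
        if used + len(block) > max_chars:
            skipped.append(path)
            continue
        parts.append(block)
        used += len(block)
    return "".join(parts), skipped


combined, skipped = combine_sources(
    {"a.py": "x = 1", "b.py": "y = 2" * 50},
    max_chars=50,
)
print(combined)
print("Skipped:", skipped)
```

Reporting the skipped paths makes it visible which files were left out of the generated documentation.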

Step 3: Update routes.py

Add a new route that generates documentation for an entire repository.

@main_bp.route("/generate-repo-docs", methods=["POST"])
def generate_repo_docs_route():
    """
    Generate documentation for an entire repository.
    """
    data = request.json
    owner = data.get("owner")
    repo = data.get("repo")
    try:
        markdown_docs = generate_repo_docs(owner, repo)
        save_docs(markdown_docs, "repo_docs.md")  # Save Markdown to /docs/repo_docs.md
        html_docs = generate_html_docs(markdown_docs)
        save_docs(html_docs, "repo_index.html")  # Save HTML to /docs/repo_index.html
        return jsonify({
            "message": "Repository documentation generated successfully!",
            "markdown_docs": markdown_docs,
            "html_docs": html_docs
        })
    except Exception as e:
        return jsonify({"error": str(e)}), 500

The generate_repo_docs_route function defines an API endpoint that generates documentation for an entire GitHub repository. Here’s how the code operates:

  1. Receive the request data: The function accepts a POST request with JSON data containing the repository owner and name.
  2. Extract the owner and repository name: It accesses the "owner" and "repo" keys from the JSON data and assigns them to the owner and repo variables.
  3. Generate the repository documentation: It calls the generate_repo_docs(owner, repo) function to create the documentation in Markdown format and stores the result in markdown_docs.
  4. Save the Markdown documentation: The function saves the Markdown content to a file named repo_docs.md using the save_docs function.
  5. Convert the Markdown to HTML: The function calls generate_html_docs(markdown_docs) to convert the Markdown documentation into HTML format and stores the result in html_docs.
  6. Save the HTML documentation: The function saves the HTML content to a file named repo_index.html using the save_docs function.
  7. Return a success response: The function responds with a JSON object that includes a success message, the generated Markdown documentation, and the HTML documentation.
  8. Handle errors gracefully: If an exception occurs during the process, the function catches the error and returns a JSON object with the error message and a 500 status code.
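
The route calls generate_repo_docs, which this module does not show. One plausible shape, offered only as a sketch, is a thin function that fetches the combined code and passes it to the documentation generator. To keep the example runnable on its own, the two dependencies are passed in as parameters and replaced by stand-ins; in the app they would presumably be fetch_and_process_repo and generate_markdown_docs.

```python
def generate_repo_docs(owner, repo, fetch_code, generate_docs):
    """Fetch all Python code for a repository and generate documentation from it."""
    all_code = fetch_code(owner, repo)
    if not all_code.strip():
        raise ValueError(f"No Python code found in {owner}/{repo}")
    return generate_docs(all_code)


# Stand-ins so the sketch runs without network access or an API key.
def fake_fetch(owner, repo):
    return "# File: app.py\n\ndef add(a, b):\n    return a + b\n\n"


def fake_generate(code):
    return f"# Documentation\n\nCovers {code.count('# File:')} file(s)."


print(generate_repo_docs("octocat", "demo", fake_fetch, fake_generate))
```

Raising an error for an empty result lets the route's except block return a clear message instead of saving empty documentation.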

Run the Flask app and test the /generate-repo-docs endpoint:

curl -X POST http://127.0.0.1:5000/generate-repo-docs -H "Content-Type: application/json" -d '{"owner": "pallets", "repo": "flask"}'

The curl command sends a POST request to the Flask server running on http://127.0.0.1:5000, specifically to the /generate-repo-docs endpoint. Here’s a breakdown:

  1. -X POST: This flag specifies the HTTP method as POST.
  2. http://127.0.0.1:5000/generate-repo-docs: This URL points to the local Flask server and the /generate-repo-docs endpoint.
  3. -H "Content-Type: application/json": This header tells the server that the request body contains JSON data.
  4. -d '{"owner": "pallets", "repo": "flask"}': This option provides the data payload in JSON format. It specifies that the owner of the repository is "pallets" and the repository name is "flask".

The server will receive this request and attempt to generate documentation for the specified repository.
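
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl command; the function names and the 120-second timeout are assumptions.

```python
import json
import urllib.request


def build_repo_docs_request(owner, repo, base_url="http://127.0.0.1:5000"):
    """Build the POST request that the curl command above sends."""
    body = json.dumps({"owner": owner, "repo": repo}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate-repo-docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=120):
    """Send the request and decode the JSON response (requires the app to be running)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the Flask app running, send(build_repo_docs_request("pallets", "flask")) returns the decoded JSON body. Note that urlopen raises urllib.error.HTTPError for error statuses such as 500.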

What You’ve Achieved

  • Implemented version control and documentation drift detection.
  • Deployed your Flask app as a web service.
  • Set up a CI/CD pipeline for automated testing and deployment.
  • Automated documentation updates with GitHub Actions.
  • Added support for generating documentation for an entire repository, including subdirectories.

Full code for module 6

You can find the complete code for this tutorial in the GitHub repository.


Next Steps

  • Experiment with deploying to other cloud platforms (e.g., AWS, Google Cloud).
  • Add support for multiple programming languages.
  • Build a user-friendly dashboard for managing documentation.
