CHAPTER 14

Workflow Artifacts and Caching

Updated: May 15, 2026

25 min read

1. Introduction

GitHub-hosted runners are ephemeral. Every job in a workflow starts on a fresh virtual machine that contains only the standard preinstalled tooling. When the job ends, that machine is destroyed, and everything written to its disk is gone. While this is fantastic for security and reproducibility, it creates two massive problems:

1. If Job 1 compiles the code, how does Job 2 get the compiled files if the machine was destroyed?

2. If it takes 3 minutes to download 500MB of npm or composer packages, why do we have to download them again for every single commit?

In this chapter, we will solve these problems using two distinct GitHub features: Artifacts (for passing data between jobs) and Caching (for speeding up pipelines).

2. Learning Objectives

By the end of this chapter, you will be able to:

Differentiate between the operational purpose of Artifacts and Caching.

Use actions/upload-artifact to persist data after a job finishes.

Use actions/download-artifact to pass data to sequential jobs.

Use actions/cache to speed up dependency installation times.

Understand how cache keys are dynamically generated based on lock files.

3. Beginner-Friendly Explanation

Imagine a multi-stage factory.

The Artifact (Passing the Baton): Station 1 bakes a cake. Station 2 frosts the cake. Because Station 1 and Station 2 are in different buildings, Station 1 must put the cake in a delivery box (Upload Artifact) and send it. Station 2 receives the box, opens it (Download Artifact), and applies the frosting. Artifacts are the physical handoff of the final product.

The Cache (The Tool Shed): Station 1 needs a very specific wrench to fix the oven. The first time, they drive to the hardware store to buy it (takes 30 mins). Before they leave for the day, they put the wrench in a locked shed. The next day, instead of going to the store, they just grab the wrench from the shed (takes 10 seconds). Caching is saving the heavy tools so you don't have to re-download them.

4. Passing Data with Artifacts

Because jobs run on completely separate servers, they cannot share files directly. If the build job creates app.zip, the deploy job will not be able to find app.zip. You must explicitly upload the file to GitHub's temporary storage, and the next job must explicitly download it.

Uploading:

```yaml
      - name: Save compiled code
        uses: actions/upload-artifact@v4
        with:
          name: my-compiled-app # The label for the box
          path: ./build-output/ # What to put in the box
```

*Note: Artifacts uploaded during a workflow are visible in the GitHub UI and can be manually downloaded by developers as ZIP files!*
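Downloading: the receiving side of the handoff can be sketched as a two-job workflow. This is a minimal illustration rather than a production pipeline: the workflow name, the placeholder build command, and the app.txt file are assumptions made for the example, while my-compiled-app and ./build-output/ are carried over from the upload step above.

```yaml
name: Build and Deploy (artifact handoff sketch)
on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Placeholder build command (assumption); replace with your real build.
      - name: Build
        run: |
          mkdir -p build-output
          echo "compiled output" > build-output/app.txt

      - name: Save compiled code
        uses: actions/upload-artifact@v4
        with:
          name: my-compiled-app
          path: ./build-output/

  deploy:
    # Wait for the build job; without "needs", both jobs start in parallel
    # and the artifact may not exist yet.
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Fetch compiled code
        uses: actions/download-artifact@v4
        with:
          name: my-compiled-app # Must match the name used at upload
          path: ./build-output/ # Where to unpack it on this runner

      - name: Inspect the files
        run: ls -R ./build-output/
```

The needs keyword is what makes the two jobs sequential. The artifact name is the only link between the upload and the download, so a typo in either place makes the download step fail.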

5. Speeding Up Builds with Caching

If your composer install command takes 2 minutes, you are wasting valuable cloud minutes. Dependencies rarely change between commits. We can cache the vendor/ or node_modules/ directories.

How Caching Works: You provide a "Key" (usually built from a hash of your composer.lock file). The cache step asks GitHub's storage: "Do I have a saved folder matching this key?"

Cache Hit: The saved folder is downloaded and unpacked onto your runner. composer install finds the packages already in place and has almost nothing left to do.

Cache Miss: Nothing is restored, so composer install downloads everything as usual. When the job finishes successfully, the cache step saves the resulting folder under that key for the next run.
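To see which of the two cases occurred, the cache step exposes a cache-hit output. The fragment below is a sketch that belongs under a job's steps list. The step id composer-cache is a name chosen for this example, and skipping the install on an exact hit is an optional shortcut, not a requirement.

```yaml
      - name: Cache Composer Dependencies
        id: composer-cache # Hypothetical id, used to read the output below
        uses: actions/cache@v4
        with:
          path: vendor
          key: ${{ runner.os }}-composer-${{ hashFiles('**/composer.lock') }}

      # cache-hit is the string 'true' only when the key matched exactly.
      - name: Report cache result
        run: echo "Exact cache hit: ${{ steps.composer-cache.outputs.cache-hit }}"

      # Optional: skip the install entirely when vendor/ was restored
      # from an exact match.
      - name: Install Dependencies
        if: steps.composer-cache.outputs.cache-hit != 'true'
        run: composer install --prefer-dist --no-progress
```

Running composer install unconditionally is the safer default, because on an exact hit it has very little work to do anyway.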

6. Mini Project: Optimize CI Workflow Speed

Let's optimize a PHP workflow by implementing a robust dependency cache.

Step-by-Step Walkthrough:

1. Create .github/workflows/caching-demo.yml.

2. Paste the following declarative code:

```yaml
name: Cached PHP Build
on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: shivammathur/setup-php@v2
        with:
          php-version: '8.2'

      # Step 1: Attempt to restore the cache BEFORE running composer
      - name: Cache Composer Dependencies
        uses: actions/cache@v4
        with:
          # The physical folder to save/restore
          path: vendor
          # The unique key based on the OS and the hash of the lock file
          key: ${{ runner.os }}-composer-${{ hashFiles('**/composer.lock') }}
          # Fallback key if an exact match isn't found
          restore-keys: |
            ${{ runner.os }}-composer-

      # Step 2: Install dependencies (Will be instantly fast if Cache Hit!)
      - name: Install Dependencies
        run: composer install --prefer-dist --no-progress
```

*Run this workflow twice. The first run will take normal time. The second run will be significantly faster because the cache was utilized!*
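A common variation caches Composer's download cache instead of the vendor folder, so that composer install still builds vendor/ on every run but rarely needs the network. The fragment below is a sketch of that variation and would replace the two steps at the end of the mini-project. The step id composer-cache-dir and the output name dir are names chosen for this example.

```yaml
      # Ask Composer where it keeps downloaded package archives.
      - name: Get Composer cache directory
        id: composer-cache-dir
        run: echo "dir=$(composer config cache-files-dir)" >> "$GITHUB_OUTPUT"

      - name: Cache Composer downloads
        uses: actions/cache@v4
        with:
          path: ${{ steps.composer-cache-dir.outputs.dir }}
          key: ${{ runner.os }}-composer-${{ hashFiles('**/composer.lock') }}
          restore-keys: |
            ${{ runner.os }}-composer-

      - name: Install Dependencies
        run: composer install --prefer-dist --no-progress
```

The trade-off is a slightly slower install on a cache hit in exchange for a vendor/ folder that is always rebuilt from the lock file.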

7. Real-World Scenarios

A data science team had a Python CI pipeline that required downloading 5 Gigabytes of Machine Learning libraries (like TensorFlow and PyTorch) during the pip install phase. Every time a developer pushed a single line of code, the pipeline took 25 minutes just to download the libraries. The developers stopped running tests because it was too slow. A DevOps engineer implemented the actions/cache step. Because the dependencies were massive but rarely changed, the cache hit successfully 99% of the time, dropping the pipeline execution time from 25 minutes down to 2 minutes.

8. Best Practices

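Cache Invalidation (The Lock File): In the mini-project, the key was ${{ runner.os }}-composer-${{ hashFiles('**/composer.lock') }}. Tying the key to the lock file makes invalidation automatic. If nobody adds or updates packages, the hash stays the same and the cache is used. If a developer runs composer require new-package, the composer.lock file changes, so the hash and the key change too. GitHub finds no exact match for the new key (a "Cache Miss"). Because the mini-project also sets restore-keys, the most recent cache whose key starts with that prefix is restored as a starting point; composer install then brings vendor/ in line with the new lock file, and the result is saved under the new key. Without restore-keys, the old saved folder would be ignored entirely and every package would be downloaded fresh.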
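The same lock-file strategy carries over to other ecosystems. As a sketch for a Node.js project, the fragment below keys the cache on package-lock.json. It caches npm's download cache at ~/.npm rather than node_modules, on the assumption that the project installs with npm ci, which deletes any existing node_modules before installing.

```yaml
      - name: Cache npm downloads
        uses: actions/cache@v4
        with:
          path: ~/.npm
          # A new hash of package-lock.json produces a new key.
          key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
          # Fall back to the most recent cache for this OS.
          restore-keys: |
            ${{ runner.os }}-npm-

      - name: Install Dependencies
        run: npm ci
```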

9. Security Recommendations

Artifact Exposure: Artifacts are tied to the repository. If you upload an artifact containing a compiled app with hardcoded API keys, anyone with read access to the GitHub repository can download that ZIP file from the Actions tab and extract the keys. Never include sensitive configuration files in uploaded artifacts.
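One way to keep a known sensitive file out of an artifact is an exclusion pattern in the path input. In the sketch below, config/secrets.php is a hypothetical file name used only for illustration. Excluding a file does nothing about keys that are hardcoded into the compiled output itself.

```yaml
      - name: Save compiled code
        uses: actions/upload-artifact@v4
        with:
          name: my-compiled-app
          # Lines starting with "!" are excluded from the upload.
          path: |
            ./build-output/
            !./build-output/config/secrets.php
```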

10. Troubleshooting Tips

Artifact Size Limits: GitHub imposes storage limits on Artifacts. They are meant for passing code, not for storing 100GB database backups. If your pipeline fails during the upload step, check if you are attempting to upload the entire Linux filesystem instead of just a specific ./build/ directory.
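A few inputs on the upload step help keep artifacts small and failures obvious. The values below are illustrative assumptions, not recommendations: the 5-day retention period in particular is arbitrary.

```yaml
      - name: Save compiled code
        uses: actions/upload-artifact@v4
        with:
          name: my-compiled-app
          # Upload one specific directory, never the whole workspace.
          path: ./build/
          # Fail the step if the path matched nothing, instead of only warning.
          if-no-files-found: error
          # Delete the artifact after 5 days to limit storage use.
          retention-days: 5
```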

11. Exercises

1. Explain the functional difference between an Artifact and a Cache in a CI/CD pipeline.

2. Why do we use the hash of a lock file (like package-lock.json) to generate the Cache Key, rather than just naming the key my-cache?

12. FAQs

Q: Do I need to explicitly "save" the cache at the end of the workflow?

A: No. On a "Cache Miss", the actions/cache step automatically runs a post-job step that saves the folder for next time, provided the job completes successfully. You don't need to write any extra YAML.

13. Interview Questions

Q: Describe the mechanics of state persistence across parallel and sequential jobs in GitHub Actions. How do you pass a compiled binary from a 'Build' job to a 'Deploy' job?

Q: A CI pipeline takes 15 minutes to run npm install. Architect the YAML step required to cache the node_modules directory, and explain the cache invalidation strategy ensuring outdated packages are not restored.

14. Summary

In Chapter 14, we mastered the manipulation of state across ephemeral environments. We solved the problem of isolated job execution by utilizing actions/upload-artifact to physically pass compiled deliverables down the assembly line. More importantly, we radically optimized our pipeline performance by implementing dependency caching. By utilizing intelligent cache keys based on cryptographic file hashes, we eliminated redundant downloads, saving immense amounts of cloud compute time and providing developers with the rapid feedback loops essential for true CI/CD.

15. Next Chapter Recommendation

Our pipelines are fast, complex, and highly integrated. But are they secure? Could a malicious Pull Request hijack our runners? Proceed to Chapter 15: GitHub Actions Security Best Practices.
