Managing complex tasks with gulp v4

Developing frontend projects consists of many tasks, like starting the development server, testing, bundling modules for production, end-to-end testing, type-checking, fetching translations… you name it.

In smaller projects it might be manageable to implement these tasks using only npm scripts (the scripts field in package.json), but as the project grows the number of scripts grows rapidly, and once we start defining which scripts should run in parallel, for example with npm-run-all, we start running into problems.

Too many npm scripts

Scripts become hard to follow, and it’s often unclear which ones are meant to be run by people and which only by other scripts. As a workaround we could document this, but since comments aren’t allowed in package.json we have to do it elsewhere, where the documentation is prone to becoming outdated and untrustworthy.

Sometimes we need functionality which cannot be performed through a CLI, so we might write a Node.js script and create an npm script for it containing a command like node scripts/complex.js. If it’s asynchronous, we have to remember to call process.exit(1) when the script is supposed to fail, because an unhandled promise rejection doesn’t make the process exit with a non-zero code. Also, scripts which execute immediately when loaded are harder to test.
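
To illustrate, such a script might look roughly like this (a sketch; performComplexTask is a made-up placeholder for whatever asynchronous work the script does):

// scripts/complex.js
// performComplexTask is a hypothetical module exporting an async function
const performComplexTask = require('./perform-complex-task')

performComplexTask()
  .then(() => {
    console.log('Done!')
  })
  .catch((err) => {
    console.error(err)
    // without this the script would exit with code 0 even though it failed
    process.exit(1)
  })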

How to make managing tasks easier?

Good news, everyone! Remember gulp?

The version you most likely remember is v3, but in April 2019 v4 was officially announced (see what’s new). It now has an incredibly simple API and a great way to compose tasks!

A real-world example

Let’s say that we need to create a build script that consists of:

  • fetching some data
  • transpiling server code
  • optimizing images
  • running webpack

Let’s also say that the server code needs the fetched data in order to compile, while image optimization and webpack can run independently.

Solved with npm scripts

Performing these tasks with npm scripts would look something like this, assuming that we’re using npm-run-all:

{
  "scripts": {
    "build": "run-p compile-server-code optimize-images bundle",
    "fetch-data": "node fetchData.js",
    "compile-server-code": "yarn fetch-data && babel src/server --out-dir dist/server",
    "optimize-images": "imagemin src/images/* --out-dir dist/images",
    "bundle": "webpack --mode production"
  }
}

The benefit of this approach is that it’s centralized: I can figure out the workflow from package.json alone, unless I need to take a look at some tool’s documentation. However, even with just these few tasks I’m already running into problems, like having to run fetch-data as part of compile-server-code. Another problem is the run-p command, which is a shortcut for npm-run-all --parallel at the cost of being cryptic to people who don’t know about npm-run-all. And this is only a handful of scripts; imagine adding ten more.

But the biggest problem I see here is that all of these scripts are equally exposed, even though most of them are only meant to be run as part of build. A workaround might be to prefix the names of these scripts with build: (e.g. build:optimize-images), but what if a script is run by multiple other scripts?
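
For illustration, the prefixed version might look something like this (just a sketch of the workaround, not a recommendation):

{
  "scripts": {
    "build": "run-p build:compile-server-code build:optimize-images build:bundle",
    "build:fetch-data": "node fetchData.js",
    "build:compile-server-code": "yarn build:fetch-data && babel src/server --out-dir dist/server",
    "build:optimize-images": "imagemin src/images/* --out-dir dist/images",
    "build:bundle": "webpack --mode production"
  }
}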

Solved with gulp

This is how the same problem could be solved with gulp:

const gulp = require('gulp')
const { exec } = require('child_process')
// fetch-data.js needs to export an async function
const fetchData = require('./scripts/fetch-data')
 
const compileServerCode = () => {
  return exec('babel src/server --out-dir dist/server')
}
 
const optimizeImages = () => {
  return exec('imagemin src/images/**/* --out-dir dist/images')
}
 
const bundle = () => {
  return exec('webpack --mode production')
}
 
const build = gulp.parallel(
  gulp.series(fetchData, compileServerCode),
  optimizeImages,
  bundle,
)
 
module.exports = {
  build,
}

Comprehending the build task is now effortless!

As you can see, a gulp task can return a promise (fetchData) or a child process (compileServerCode, optimizeImages and bundle). It can also return a stream (which gulp is best known for), an event emitter or an observable, or signal completion by calling an error-first callback. See examples for each.
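
For instance, a stream-returning task and a callback-based task could look roughly like this (a sketch; the paths and task names are made up):

const gulp = require('gulp')

// returning a stream, e.g. copying static files to the output directory
const copyAssets = () => {
  return gulp.src('src/assets/**/*').pipe(gulp.dest('dist/assets'))
}

// signaling completion by calling the error-first callback
const cleanUp = (done) => {
  // ...do some synchronous work here
  done() // or done(new Error('...')) to fail the task
}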

By exporting the build task we’re exposing it, so we can run it with gulp build. We can also create an npm script for it as an additional layer of abstraction:

{
  "scripts": {
    "build": "gulp build"
  }
}

This is optional, but I prefer running gulp tasks through npm scripts because then package.json is a single source of truth for all available scripts.

This approach is less centralized, but I think of gulp as a tool which embraces decentralization, and by separating internal tasks from exposed ones we can provide a clearer API for our team. Keeping all tasks in package.json can eventually make us the only experts on them, and they become harder to refactor because we’re never sure whether someone is relying on the scripts we want to change.

When we run build, we can see another benefit of using gulp:

yarn build
# [00:14:08] Using gulpfile ~/Code/test/gulpfile.js
# [00:14:08] Starting 'build'...
# [00:14:08] Starting 'fetchData'...
# [00:14:08] Starting 'optimizeImages'...
# [00:14:08] Starting 'bundle'...
# [00:14:12] Finished 'optimizeImages' after 4.38 s
# [00:14:13] Finished 'fetchData' after 5.39 s
# [00:14:13] Starting 'compileServerCode'...
# [00:14:25] Finished 'compileServerCode' after 12.36 s
# [00:14:36] Finished 'bundle' after 28.55 s
# [00:14:36] Finished 'build' after 28.56 s

Conclusion

As retro as gulp might seem, I highly recommend using it for managing tasks in your project. It allows you to split up, comment and even test your build code, so that the scripts field in your package.json can relax from all that CLI gymnastics.
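
For instance, because gulp tasks are just exported functions, a smoke test could be as simple as this sketch (assuming the fetch-data module from the earlier example resolves with the fetched data):

const assert = require('assert')
const fetchData = require('./scripts/fetch-data')

// a minimal smoke test: the task should resolve with some data
fetchData()
  .then((data) => {
    assert.ok(data, 'expected fetchData to resolve with data')
    console.log('fetchData ok')
  })
  .catch((err) => {
    console.error(err)
    process.exit(1)
  })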