An analysis of the complete webpack packaging process

Foreword

webpack plays a central role in front-end engineering. Understanding its internal mechanics is a great help when building projects, whether you are customizing features or optimizing the build.

Following the structure of the webpack 5 source code, we will walk through and implement a simplified version of the entire packaging process, to understand what happens at each stage and to lay a foundation for extending and customizing build tooling later.

1. Preparation

While analyzing the process we will implement a simplified subset of webpack's functionality, relying on a few third-party tools for the rest:

  • tapable provides the hooks mechanism that plugins tap into to do their work;
  • the Babel packages parse source code into an AST, which we use to collect module dependencies and rewrite code.

# Create the repository
mkdir webpack-demo && cd webpack-demo && npm init -y

# Install the Babel-related dependencies
npm install @babel/parser @babel/traverse @babel/types @babel/generator -D

# Install tapable (register/trigger event streams) and fs-extra (file manipulation)
npm install tapable fs-extra -D

Next, we create two new entry files and a common module file in the src directory:

mkdir src && cd src && touch entry1.js && touch entry2.js && touch module.js

and add some content to each file:

// src/entry1.js
const mod = require('./module'); // use `mod` rather than `module`, which would clash with the CommonJS module object
const start = () => 'start';
start();
console.log('entry1 module: ', mod);

// src/entry2.js
const mod = require('./module');
const end = () => 'end';
end();
console.log('entry2 module: ', mod);

// src/module.js
const name = 'cegz';
module.exports = {
  name,
};

With the entry files in place, let's create a webpack.config.js configuration file with some basic options:

// ./webpack.config.js
const path = require('path');
const CustomWebpackPlugin = require('./plugins/custom-webpack-plugin.js');

module.exports = {
  entry: {
    entry1: path.resolve(__dirname, './src/entry1.js'),
    entry2: path.resolve(__dirname, './src/entry2.js'),
  },
  context: process.cwd(),
  output: {
    path: path.resolve(__dirname, './build'),
    filename: '[name].js',
  },
  plugins: [new CustomWebpackPlugin()],
  resolve: {
    extensions: ['.js', '.ts'],
  },
  module: {
    rules: [
      {
        test: /\.js$/,
        use: [
          path.resolve(__dirname, './loaders/transformArrowFnLoader.js'), // Convert Arrow Functions
        ],
      },
    ],
  },
};

The configuration above specifies two entry files, an output directory (build), as well as a plugin and a loader.

Next, we write webpack's core entry file to implement the packaging logic. First, create the files needed for the core implementation:

# cd webpack-demo
mkdir lib && cd lib
touch webpack.js      # webpack entry file
touch compiler.js     # webpack core compiler
touch compilation.js  # webpack core compilation object
touch utils.js        # utility functions
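
For reference, once the loader and plugin files from the later sections are in place, the project layout should look roughly like this:

webpack-demo/
├── build.js
├── webpack.config.js
├── lib/
│   ├── webpack.js
│   ├── compiler.js
│   ├── compilation.js
│   └── utils.js
├── loaders/
│   └── transformArrowFnLoader.js
├── plugins/
│   └── custom-webpack-plugin.js
└── src/
    ├── entry1.js
    ├── entry2.js
    └── module.js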

We have created two similarly named files, compiler and compilation; briefly:

  • compiler: webpack's compiler; its run method creates a compilation object that handles the code build;
  • compilation: created by compiler.run, it does all of the packaging and compilation work and hands the resulting products back to the compiler to be written out.

The entry file lib/webpack.js has roughly the following structure:

// lib/webpack.js
function webpack(options) {
  ...
}

module.exports = webpack;

The test case that runs this entry file looks like this:

// Test case webpack-demo/build.js
const webpack = require('./lib/webpack');
const config = require('./webpack.config');

const compiler = webpack(config);

// Call the run method for packaging
compiler.run((err, stats) => {
  if (err) {
    console.log(err, 'err');
  }
  // console.log('Build complete!', stats.toJSON());
});

Starting from the lib/webpack.js entry file, we will analyze the packaging process in the following steps.

1. Initialization phase - webpack

  • Merge configuration options
  • Create the compiler
  • Register plugins

2. Compilation phase - build

  • Read the entry files
  • Compile starting from each entry file
  • Invoke loaders to transform the source code
  • Parse the code into an AST with Babel to collect dependent modules
  • Recursively compile dependent modules

3. Generation phase - seal

  • Create chunk objects
  • Generate the assets object

4. Writing phase - emit

2. Initialization phase

The logic of the initialization phase is concentrated in the call to webpack(config). Let's look at what the webpack() function does.

2.1. Read and merge configuration information

Usually there is a webpack.config.js in the project root that serves as webpack's configuration source.

Packaging can also be started through the webpack CLI, in which case the parameters passed on the command line are used as configuration as well.

The configuration describes the entry modules, the output location, and the various loaders, plugins, and so on that we want webpack to use.

Options specified on the command line take precedence over those in the configuration file. (Below, we simulate how webpack merges CLI parameters.)

So the first thing the webpack entry file does is merge the configuration file with the command-line configuration.

// lib/webpack.js
function webpack(options) {
  // 1. Merge configuration items
  const mergeOptions = _mergeOptions(options);
  ...
}

function _mergeOptions(options) {
  const shellOptions = process.argv.slice(2).reduce((option, argv) => {
    // argv -> --mode=production
    const [key, value] = argv.split('=');
    if (key && value) {
      const parseKey = key.slice(2);
      option[parseKey] = value;
    }
    return option;
  }, {});
  return { ...options, ...shellOptions };
}

module.exports = webpack;
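
As a quick sanity check of the merge logic, a flag passed on the command line should override the value from the configuration file (an illustrative trace, assuming the flag below):

// node build.js --mode=production
// process.argv.slice(2)            -> ['--mode=production']
// shellOptions                     -> { mode: 'production' }
// { ...options, ...shellOptions }  -> config file options, with mode overridden to 'production'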

2.2. Create a compiler object

A well-structured program usually revolves around an instance object, and webpack is no exception: its compilation work is driven by an instance object called compiler.

The configuration parameters we pass in are recorded on the compiler instance, along with the hooks API that lets plugins hook into the process.

The compiler also provides the run method to start the build, and emitAssets to write the packaged products to disk; both are described later.

// lib/webpack.js
const Compiler = require('./compiler');

function webpack(options) {
  // 1. Merge configuration items
  const mergeOptions = _mergeOptions(options);
  // 2. Create a compiler
  const compiler = new Compiler(mergeOptions);
  ...
  return compiler;
}

module.exports = webpack;

The basic structure of the Compiler class looks like this:

// lib/compiler.js
const fs = require('fs');
const path = require('path');
const { SyncHook } = require('tapable'); // hooks that let plugins subscribe to stages of the packaging process
const Compilation = require('./compilation'); // the compilation constructor

class Compiler {
  constructor(options) {
    this.options = options;
    this.context = this.options.context || process.cwd().replace(/\\/g, '/');
    this.hooks = {
      // Triggered when compilation starts
      run: new SyncHook(),
      // Triggered after modules are compiled, right before the output files are written to disk
      emit: new SyncHook(),
      // Triggered after the output files have been written
      done: new SyncHook(),
    };
  }

  run(callback) {
    ...
  }

  emitAssets(compilation, callback) {
    ...
  }
}

module.exports = Compiler;
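
If you have not used tapable before, the SyncHook used above works like a small synchronous publish/subscribe channel: plugins register callbacks with tap, and the compiler fires them with call. A minimal standalone sketch (not part of our build code):

// Minimal tapable sketch, just to show how the hooks above behave
const { SyncHook } = require('tapable');

const hook = new SyncHook();
hook.tap('LoggerPlugin', () => console.log('run hook fired'));
hook.call(); // synchronously runs every tapped callback in registration order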

When you need to compile, call the compiler.run method:

compiler.run((err, stats) => { ... });

2.3. Plugin registration

Once we have the compiler instance, the plugins registered in the configuration file can hook in at the right moments to influence the build.

A plugin receives the compiler object as a parameter, which it uses to produce side effects on the packaging process and its products.

A plugin can be either a function or an object; if it is an object, it must provide an apply method. The common plugin structure looks like this:

class WebpackPlugin {
  apply(compiler) {
    ...
  }
}
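
A function-form plugin, by contrast, is just a function that receives the compiler directly; a hypothetical example:

// A hypothetical function-form plugin: receives the compiler and taps its hooks
function customFunctionPlugin(compiler) {
  compiler.hooks.done.tap('custom-function-plugin', () => {
    console.log('build finished');
  });
}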

The logic for registering plugins is as follows:

// lib/webpack.js
function webpack(options) {
  // 1. Merge configuration items
  const mergeOptions = _mergeOptions(options);
  // 2. Create a compiler
  const compiler = new Compiler(mergeOptions);
  // 3. Register the plug-in and let the plug-in affect the packaging result
  if (Array.isArray(options.plugins)) {
    for (const plugin of options.plugins) {
      if (typeof plugin === "function") {
        plugin.call(compiler, compiler); // When the plugin is a function
      } else {
        plugin.apply(compiler); // If the plugin is an object, the apply method needs to be provided.
      }
    }
  }
  return compiler;
}

At this point webpack's initialization work is complete; next, compiler.run() is called to enter the compilation phase.

3. Compilation phase

The starting point of the compilation work is compiler.run, which will:

  1. Trigger hooks.run to notify the relevant plugins that a build has started;
  2. Create a compilation object;
  3. Read the entry files;
  4. Compile each entry file.

3.1. Create a compilation object

Module packaging (build) and code generation (seal) are implemented by compilation.

// lib/compiler.js
class Compiler {
  ...
  run(callback) {
    // trigger run hook
    this.hooks.run.call();
    // Create a compilation compilation object
    const compilation = new Compilation(this);
    ...
  }
}

The compilation instance records build information such as entries, modules, chunks, and assets during the build, and provides the build and seal methods for module construction and code generation.

// lib/compilation.js
const fs = require('fs');
const path = require('path');
const parser = require('@babel/parser');
const traverse = require('@babel/traverse').default;
const generator = require('@babel/generator').default;
const t = require('@babel/types');
const { tryExtensions, getSourceCode } = require('./utils');

class Compilation {
  constructor(compiler) {
    this.compiler = compiler;
    this.context = compiler.context;
    this.options = compiler.options;
    // Record the current module code
    this.moduleCode = null;
    // Save all dependent module objects
    this.modules = new Set();
    // Save all entry module objects
    this.entries = new Map();
    // All code block objects
    this.chunks = new Set();
    // Store the file object produced this time (one-to-one correspondence with chunks)
    this.assets = {};
  }
  build() {}
  seal() {}
}

Once we have the compilation object, module building starts by calling compilation.build.

// lib/compiler.js
class Compiler {
  ...
  run(callback) {
    // trigger run hook
    this.hooks.run.call();
    // Create a compilation compilation object
    const compilation = new Compilation(this);
    // compile module
    compilation.build();
  }
}

3.2. Read the entry files

Building modules starts from the entry modules, so the first job is to read the entry information from the configuration file.

The entry can be configured in several ways: it can be omitted (there is a default value), it can be a string, or it can be an object specifying multiple entries.

Reading the entry therefore needs to handle all of these forms.

// lib/compilation.js
class Compilation {
  ...
  build() {
    // 1. Read the configuration entry
    const entry = this.getEntry();
    ...
  }

  getEntry() {
    let entry = Object.create(null);
    const { entry: optionsEntry } = this.options;
    if (!optionsEntry) {
      entry['main'] = 'src/index.js'; // By default, the src directory is searched for packaging
    } else if (typeof optionsEntry === 'string') {
      entry['main'] = optionsEntry;
    } else {
      entry = optionsEntry; // Treated as an object, such as a multi-entry configuration
    }
    // Calculate the relative path relative to the project startup root directory
    Object.keys(entry).forEach((key) => {
      entry[key] = './' + path.posix.relative(this.context, entry[key]);
    });
    return entry;
  }
}
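
With the multi-entry configuration from webpack.config.js, getEntry should return entry paths relative to the project root, roughly:

// Expected result of getEntry() for our config (paths relative to this.context)
// {
//   entry1: './src/entry1.js',
//   entry2: './src/entry2.js'
// }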

3.3. Compile the entry files

After getting the entry file, build each entry in turn.

// lib/compilation.js
class Compilation {
  ...
  build() {
    // 1. Read the configuration entry
    const entry = this.getEntry();
    // 2. Build the entry module
    Object.keys(entry).forEach((entryName) => {
      const entryPath = entry[entryName];
      const entryData = this.buildModule(entryName, entryPath);
      this.entries.set(entryName, entryData);
    });
  }
}

The build phase does the following:

  1. Read the entry file's content with the fs module;
  2. Call the loaders to transform the file content;
  3. Create a module object for the file, parse the transformed code into an AST to collect dependent modules, and rewrite the paths of those dependencies;
  4. If there are dependent modules, recursively apply the above three steps to them.

Read file content:

// lib/compilation.js
class Compilation {
  ...
  buildModule(moduleName, modulePath) {
    // 1. Read the original code of the file
    const originSourceCode = fs.readFileSync(modulePath, 'utf-8');
    this.moduleCode = originSourceCode;
    ...
  }
}

Call the loader to convert the source code:

// lib/compilation.js
class Compilation {
  ...
  buildModule(moduleName, modulePath) {
    // 1. Read the original code of the file
    const originSourceCode = fs.readFileSync(modulePath, 'utf-8');
    this.moduleCode = originSourceCode;
    // 2. Call the loader for processing
    this.runLoaders(modulePath);
    ...
  }
}

A loader is just a JS function: it receives a module's source code as a parameter and returns the transformed code.

// lib/compilation.js
class Compilation {
  ...
  runLoaders(modulePath) {
    const matchLoaders = [];
    // 1. Find the loader that matches the module
    const rules = this.options.module.rules;
    rules.forEach((loader) => {
      const testRule = loader.test;
      if (testRule.test(modulePath)) {
        // For example: { test:/\.js$/g, use:['babel-loader'] }, { test:/\.js$/, loader:'babel-loader' }
        loader.loader ? matchLoaders.push(loader.loader) : matchLoaders.push(...loader.use);
      }
    });
    // 2. Execute the loader in reverse order
    for (let i = matchLoaders.length - 1; i >= 0; i--) {
      const loaderFn = require(matchLoaders[i]);
      // Call the loader to process the source code
      this.moduleCode = loaderFn(this.moduleCode);
    }
  }
}
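
Note the reverse loop: as in real webpack, the loaders in a use array are applied from right to left. A hypothetical illustration:

// Hypothetical rule: { test: /\.js$/, use: ['./loaders/a-loader.js', './loaders/b-loader.js'] }
// For a matching file, the source flows right to left:
//   originSourceCode -> b-loader -> a-loader -> this.moduleCode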

Execute the webpack module compilation logic:

// lib/compilation.js
class Compilation {
  ...
  buildModule(moduleName, modulePath) {
    // 1. Read the original code of the file
    const originSourceCode = fs.readFileSync(modulePath, 'utf-8');
    this.moduleCode = originSourceCode;
    // 2. Call the loader for processing
    this.runLoaders(modulePath);
    // 3. Call webpack to compile the module and create a module object for the module
    const module = this.handleWebpackCompiler(moduleName, modulePath);
    return module; // Return the built module object
  }
}

The handleWebpackCompiler method does the following:

  1. Create a module object;
  2. Parse the module code into an AST;
  3. Traverse the AST to find require calls, record the required modules in module.dependencies, and rewrite require to __webpack_require__;
  4. Convert the modified AST back into source code;
  5. If there are dependent modules, build them recursively.

// lib/compilation.js
class Compilation {
  ...
  handleWebpackCompiler(moduleName, modulePath) {
    // 1. Create a module
    const moduleId = './' + path.posix.relative(this.context, modulePath);
    const module = {
      id: moduleId, // Calculate the relative path of the current module relative to the project startup root directory as the module ID
      dependencies: new Set(), // Store the submodules this module depends on
      entryPoint: [moduleName], // The entry file to which the module belongs
    };

    // 2. Parse the module content into AST, collect dependent modules, and rewrite the module import syntax to __webpack_require__
    const ast = parser.parse(this.moduleCode, {
      sourceType: 'module',
    });

    // Traverse ast, identify require syntax
    traverse(ast, {
      CallExpression: (nodePath) => {
        const node = nodePath.node;
        if (node.callee.name === 'require') {
          const requirePath = node.arguments[0].value;
          // Find module absolute path
          const moduleDirName = path.posix.dirname(modulePath);
          const absolutePath = tryExtensions(
            path.posix.join(moduleDirName, requirePath),
            this.options.resolve.extensions,
            requirePath,
            moduleDirName
          );
          // Create moduleId
          const moduleId = './' + path.posix.relative(this.context, absolutePath);
          // Turn require into a __webpack_require__ statement
          node.callee = t.identifier('__webpack_require__');
          // Modify the module path (refer to the relative path of this.context )
          node.arguments = [t.stringLiteral(moduleId)];

          if (!Array.from(this.modules).find(module => module.id === moduleId)) {
            // Record sub-dependencies in the module's dependencies collection
            module.dependencies.add(moduleId);
          } else {
            // The module has already been compiled, so don't compile it again; just record that this entry also depends on it
            this.modules.forEach((module) => {
              if (module.id === moduleId) {
                module.entryPoint.push(moduleName);
              }
            });
          }
        }
      },
    });

    // 3. Generate new code from ast
    const { code } = generator(ast);
    module._source = code;

    // 4. Deep recursive build dependent modules
    module.dependencies.forEach((dependency) => {
      const depModule = this.buildModule(moduleName, dependency);
      // Add any compiled dependent module objects to the modules object
      this.modules.add(depModule);
    });

    return module;
  }
}
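
To make this concrete, the module object built for src/entry1.js should look roughly like the sketch below (illustrative; _source is abbreviated):

// Roughly what handleWebpackCompiler produces for src/entry1.js (illustrative)
// {
//   id: './src/entry1.js',
//   dependencies: Set { './src/module.js' },
//   entryPoint: ['entry1'],
//   _source: "const mod = __webpack_require__('./src/module.js'); ..."
// }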

Usually, when we require a module we omit the file extension and expect a .js file to be found by default.

This behavior comes from the resolve.extensions option in the configuration file: for any path without an extension, the tryExtensions method tries each configured extension in turn:

// lib/utils.js
const fs = require('fs');

function tryExtensions(modulePath, extensions, originModulePath, moduleContext) {
  // Try the path as-is first (if the user already wrote the extension, no need to append one).
  // Copy the array so we don't mutate the shared resolve.extensions config on every call.
  const extensionsToTry = ['', ...extensions];
  for (const extension of extensionsToTry) {
    if (fs.existsSync(modulePath + extension)) {
      return modulePath + extension;
    }
  }
  // No matching file was found
  throw new Error(
    `No module, Error: Can't resolve ${originModulePath} in ${moduleContext}`
  );
}

module.exports = {
  tryExtensions,
  ...
}
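
For example, resolving the extension-less require('./module') from src/entry1.js calls tryExtensions roughly like this (illustrative values):

// Illustrative: resolving require('./module') from src/entry1.js
// tryExtensions('src/module', ['.js', '.ts'], './module', './src')
//   -> tries 'src/module' (missing), then 'src/module.js' (exists)
//   -> returns 'src/module.js'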

At this point, the "compile phase" ends here, and the next step is the "generation phase" seal.

4. Generation phase

In the "compile phase", files will be built into modules and stored in this.modules.

In the "generating phase", the corresponding chunk is created according to the entry and the module set that the entry depends on is looked up from this.modules.

Finally, combine the runtime webpack module mechanism to run the code, and generate the final assets product through splicing.

// lib/compiler.js
class Compiler {
  ...
  run(callback) {
    // trigger run hook
    this.hooks.run.call();
    // Create a compilation compilation object
    const compilation = new Compilation(this);
    // compile module
    compilation.build();
    // generate product
    compilation.seal();
    ...
  }
}

The entry + modules --> chunk --> assets flow looks like this:

// lib/compilation.js
class Compilation {
  ...
  seal() {
    // 1. Create chunk according to entry
    this.entries.forEach((entryData, entryName) => {
      // Assemble a chunk containing the current entry and all of the modules it depends on
      this.createChunk(entryName, entryData);
    });
    // 2. Create assets based on chunk s
    this.createAssets();
  }

  // Assemble chunks from entry files and dependent modules
  createChunk(entryName, entryData) {
    const chunk = {
      // Each entry file as a chunk
      name: entryName,
      // Data information after entry build
      entryModule: entryData,
      // entry's dependent modules
      modules: Array.from(this.modules).filter((i) =>
        i.entryPoint.includes(entryName)
      ),
    };
    // add chunk
    this.chunks.add(chunk);
  }

  createAssets() {
    const output = this.options.output;
    // Generate assets based on chunks
    this.chunks.forEach((chunk) => {
      const parseFileName = output.filename.replace('[name]', chunk.name);
      // Concatenate runtime syntax for each chunk file code
      this.assets[parseFileName] = getSourceCode(chunk);
    });
  }
}
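
For our demo, seal should produce two chunks, one per entry, each shaped roughly like this (illustrative):

// Roughly what one chunk looks like for our demo (illustrative)
// {
//   name: 'entry1',
//   entryModule: { id: './src/entry1.js', ... },  // the built entry module
//   modules: [ { id: './src/module.js', ... } ]   // modules reachable from this entry
// }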

getSourceCode combines a chunk's entry module and dependent modules into the runtime code template:

// lib/utils.js
function getSourceCode(chunk) {
  const { entryModule, modules } = chunk;
  return `
  (() => {
    var __webpack_modules__ = {
      ${modules
        .map((module) => {
          return `
          '${module.id}': (module) => {
            ${module._source}
          }
        `;
        })
        .join(',')}
    };
    var __webpack_module_cache__ = {};
    function __webpack_require__(moduleId) {
      var cachedModule = __webpack_module_cache__[moduleId];
      if (cachedModule !== undefined) {
        return cachedModule.exports;
      }
      var module = (__webpack_module_cache__[moduleId] = {
        exports: {},
      });
      __webpack_modules__[moduleId](module, module.exports, __webpack_require__);
      return module.exports;
    }
    (() => {
      ${entryModule._source}
    })();
  })();
  `;
}
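
Putting it together, the generated build/entry1.js should look roughly like the sketch below (whitespace tidied; the exact formatting depends on the Babel generator):

(() => {
  var __webpack_modules__ = {
    './src/module.js': (module) => {
      const name = 'cegz';
      module.exports = { name };
    }
  };
  var __webpack_module_cache__ = {};
  function __webpack_require__(moduleId) { /* cache lookup + module execution, as above */ }
  (() => {
    const mod = __webpack_require__("./src/module.js");
    const start = function () { return 'start'; };
    start();
    console.log('entry1 module: ', mod);
  })();
})();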

At this point, the "generation phase" processing is completed, which also means the completion of the compilation work, and then we return to the compiler for the final "product output".

5. Writing phase

The "writing phase" is relatively easy to understand. The assets already have the final packaged code content, and the last thing to do is to write the code content to the local disk.

// lib/compiler.js
class Compiler {
  ...
  run(callback) {
    // trigger run hook
    this.hooks.run.call();
    // Create a compilation compilation object
    const compilation = new Compilation(this);
    // compile module
    compilation.build();
    // generate product
    compilation.seal();
    // output product
    this.emitAssets(compilation, callback);
  }

  emitAssets(compilation, callback) {
    const { entries, modules, chunks, assets } = compilation;
    const output = this.options.output;

    // Call the Plugin emit hook
    this.hooks.emit.call();

    // If output.path does not exist, create it
    if (!fs.existsSync(output.path)) {
      fs.mkdirSync(output.path);
    }

    // Write the contents of assets to the file system
    Object.keys(assets).forEach((fileName) => {
      const filePath = path.join(output.path, fileName);
      fs.writeFileSync(filePath, assets[fileName]);
    });

    // Trigger the hook after the end
    this.hooks.done.call();

    callback(null, {
      toJSON: () => {
        return {
          entries,
          modules,
          chunks,
          assets,
        };
      },
    });
  }
}

At this point, the packaging process of webpack is complete.

Next, we implement the loader and plugin referenced in the configuration file, and then run the test case to verify the implementation above.

6. Writing a loader

In webpack.config.js we configure a custom loader for the .js file type to transform the file content:

// webpack.config.js
module: {
  rules: [
    {
      test: /\.js$/,
      use: [
        path.resolve(__dirname, './loaders/transformArrowFnLoader.js'),
      ],
    },
  ],
},

A loader is a function that receives a file's content as a parameter and returns the new content after transformation.

To understand the role of a webpack loader, loaders/transformArrowFnLoader.js converts the arrow functions in a file into ordinary function expressions.

// loaders/transformArrowFnLoader.js
const parser = require('@babel/parser');
const traverse = require('@babel/traverse').default;
const generator = require('@babel/generator').default;
const t = require('@babel/types');

function transformArrowLoader(sourceCode) {
  const ast = parser.parse(sourceCode, {
    sourceType: 'module'
  });
  traverse(ast, {
    ArrowFunctionExpression(path, state) {
      const node = path.node;
      const body = path.get('body');
      const bodyNode = body.node;
      if (bodyNode.type !== 'BlockStatement') {
        const statements = [];
        statements.push(t.returnStatement(bodyNode));
        node.body = t.blockStatement(statements);
      }
      node.type = "FunctionExpression";
    }
  });
  const { code } = generator(ast);

  return code;
}

module.exports = transformArrowLoader;

After processing, the arrow function is transformed as follows:

const start = () => 'start';
    ||
    ||
const start = function () {
  return 'start';
};

7. Writing a plugin

From the introduction above, we know that each object-form plugin needs to provide an apply method, which receives the compiler as a parameter.

Through the compiler, a plugin can subscribe to hooks at different stages of webpack's work and thereby influence the packaging result or perform custom operations.

Let's write a custom plugin that taps two compiler.hooks at different times to extend webpack's packaging capabilities:

  • hooks.emit.tap registers a function that runs after webpack has compiled the resources and before the output is written to disk (a good place to clear the output.path directory);
  • hooks.done.tap registers a function that runs after webpack has written the output to disk (a good place to copy static resources).

// plugins/custom-webpack-plugin.js
const fs = require('fs-extra');
const path = require('path');

class CustomWebpackPlugin {
  apply(compiler) {
    const outputPath = compiler.options.output.path;
    const hooks = compiler.hooks;

    // clear build directory
    hooks.emit.tap('custom-webpack-plugin', (compilation) => {
      fs.removeSync(outputPath);
    });

    // copy static resources
    const otherFilesPath = path.resolve(__dirname, '../src/otherfiles');
    hooks.done.tap('custom-webpack-plugin', (compilation) => {
      fs.copySync(otherFilesPath, path.resolve(outputPath, 'otherfiles'));
    });
  }
}

module.exports = CustomWebpackPlugin;

Now we run node build.js, which generates a build directory under webpack-demo containing the packaged entry bundles.
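
Assuming a src/otherfiles directory exists for the plugin's done hook to copy, the output should look roughly like this:

build/
├── entry1.js     # entry1 bundle: runtime + './src/module.js' + the entry code
├── entry2.js     # entry2 bundle
└── otherfiles/   # copied by custom-webpack-plugin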

Conclusion

I believe that after reading this article you will have a clearer picture of how webpack approaches packaging.
