Skip to content

Latest commit

 

History

History
807 lines (544 loc) · 31.8 KB

ch04.2-writing-logs.md

File metadata and controls

807 lines (544 loc) · 31.8 KB

Read Prev

Writing logs

The code for the entire chapter can be found at the src/chapter_04.2 directory

We've covered how to build the core utility helpers that will help us construct our logging library in a more modular way. However, we haven't gotten to the fun part of actually writing logs yet.

You might be wondering how we can call this a logging library if we haven't even learned how to write logs with it. Don't worry, we're about to get to that!

Before we begin writing logs to files, let's first introduce some concepts related to file-based logging and how we can make it more efficient and faster.

1. Re-using the File Handle

When logging to a file, it's important to manage the file handle efficiently. The file handle is the connection between our code and the log file on disk. Opening and closing the file for every log entry can be slow and use up resources. To avoid this, we want to reuse the file handle throughout the logging process.

To do this, we open the file handle once when we start the logging library and keep it open until the logging is done. This means we don't need to keep opening and closing the file, which can speed up our logging and save resources.

2. Log Rotation

Log rotation is a critical strategy in file-based logging to manage log files over time. As your application generates more log data, log files can become large and unwieldy. Log rotation involves creating new log files periodically or based on certain conditions, and optionally archiving or deleting older log files. This helps keep the log files at a manageable size, ensures easier log analysis, and prevents running out of disk space.

3. Asynchronous Logging

In network based apps, especially those with high levels of concurrency or those that involve asynchronous operations, implementing asynchronous logging can be beneficial, and that's the only reason we're using node:fs/promises instead of node:fs. Asynchronous logging ensures that the act of writing logs doesn't block or slow down the main execution thread of your application. This can prevent performance bottlenecks and maintain the responsiveness of your application.

When logging is performed synchronously, each log entry is written immediately to the log file or output, which can introduce delays and impact the overall performance of your application. Asynchronous logging, on the other hand, involves buffering log messages and writing them to the log file in batches or on a separate thread or process.

We'll need to do more optimization than just using asynchronous file writing. Specifically, we should store the entire log contents in memory and write them periodically. This will make the process extremely fast and ensure that it doesn't consume too much memory.

By decoupling the logging process from the main application logic, we can achieve several advantages:

  • Improved Performance: Asynchronous logging allows the main application thread to continue executing without waiting for log writes to complete. This can be crucial for applications that require high responsiveness and throughput.

  • Reduced I/O Overhead: Writing log messages to disk can be an I/O-intensive operation. By batching multiple log messages together and writing them in one go, you reduce the frequency of disk I/O operations, which can improve efficiency.

  • Better Resource Utilization: Asynchronous logging allows you to optimize resource utilization, such as managing concurrent log writes, handling errors without disrupting the main flow, and efficiently managing file handles.

  • Enhanced Scalability: Applications with multiple threads or processes benefit from asynchronous logging because it minimizes contention for resources like the log file. This is particularly valuable in scenarios where multiple components are concurrently generating log messages

4. Getting Caller Information (Module and Line Number)

Including caller information, such as the file name and line number from which a log message originated, can significantly enhance the effectiveness of our logging library. This feature provides contextual insight into where specific events occurred in the codebase, making it easier to identify the source of issues and troubleshoot them.

When an application encounters an error or unexpected behavior, having access to the module and line number associated with the log message allows developers to:

  • Quickly locate the exact location in the code where the event occurred.
  • Understand the sequence of events leading up to the issue.
  • Make precise code adjustments to fix problems.

Implementing this feature might involve using techniques from the programming language's stack trace or introspection mechanisms. Here's a high-level overview of how you could achieve this:

  1. Capture Caller Information: When a log message is generated, our logging library will retrieve the caller information, including the module and line number. We'll learn it in a bit that how can we do that.

  2. Format the Log Message: Combine the captured caller information with the log message and other relevant details like timestamp and log level.

  3. Output the Log Message: Write the formatted log message to the desired output destinations, ensuring that the caller information is included.

Enough of the theory, let's start writing logs.

Testing our current API

Let's start with building small features that we need to have on our Logger class. Before that, we're going to do some testing on whatever we've built till now.

To do the testing, we'll create a new file test.js and a config.json. test.js will hold our code that we write while making use of our logtar library. The config.json will used to store our config for LogConfig as a json object.

// test.js

const { Logger, LogConfig } = require("./index");

const logger = Logger.with_config(LogConfig.from_file("./config.json"));

The config.json file has the following contents. You may try to tweak the values as well. Try putting in the values that aren't supported by us, and see whether the assert methods that we created actually crash the program or not?

// config.json

{
    "level": 3,
    "file_prefix": "LogTar_",
    "rolling_config": {
        "size_threshold": 1024000,
        "time_threshold": 86400
    }
}

Try executing the test.js file.

$ node test.js

Nothing happens. But this proves that the whole setup for our Logger and other classes is working perfectly fine.

To check whether the config is being loaded properly, we can create a couple of getter methods on our Logger class.

// file: lib/logger.js

class Logger {
    ...
    get level() {
        return this.#config.level;
    }

    get file_prefix() {
        return this.#config.file_prefix;
    }

    get time_threshold() {
        return this.#config.rolling_config.time_threshold;
    }

    get size_threshold() {
        return this.#config.rolling_config.size_threshold;
    }
    ...
}

Now, in test.js print those out to the standard output.

$node test.js

### outputs
LogTar_
1024000
86400
3

Perfect! Everything works fine. But what happens if I try to put a value that's not supported by our Logger API? Let's change the time_threshold to 400 in config.json and re-run the app.

$node test.js

### outputs
Error: time_option must be an instance of RollingConfig. Unsupported param 400
    at Function.assert (/Users/ishtmeet/Code/logtard/lib/utils/rolling-options.js:36:19)

Why? Let's take a quick look at our RollingTimeOptions utility class

// file: lib/util/rolling-options.js

class RollingTimeOptions {
    static Minutely = 60; // Every 60 seconds
    ... // Other options

    // Throw an error when the user sets the value which isn't one of those
    static assert(time_option) {
        if (![this.Minutely, this.Hourly, this.Daily, this.Weekly, this.Monthly, this.Yearly].includes(time_option)) {
            throw new Error(
                `time_option must be an instance of RollingConfig. Unsupported param ${JSON.stringify(time_option)}`
            );
        }
    }
}

You might argue that this isn't the best dev experience. However, I think it is. We should always constrain over how configurable our library is. You would hardly need any other duration other than we specified i.e Minutely, Hourly, Daily etc. and anything more than Yearly

Let's change the time_threshold in config.json back to 86400 which means 1 day.

Implementing logging methods

Since our logger supports 5 types of log_levels, let's write a public method for each one of those on our Logger class.

// file lib/logger.js

class Logger {
    ...

    debug(message) { console.log('Debug: %s', message) }

    info(message) { console.log('Info: %s', message) }

    warn(message) { console.log('Warn: %s', message) }

    error(message) { console.log('Error: %s', message) }

    critical(message) { console.log('Critical: %s', message) }

    ...
}

And in test.js call these methods

// file: test.js
const { Logger, LogConfig } = require('./index')

const logger = Logger.with_config(LogConfig.from_file('./config.json'));

console.log(logger.file_prefix);
console.log(logger.size_threshold);
console.log(logger.time_threshold);
console.log(logger.level);

logger.debug('Hello debug');
logger.info('Hello info');
logger.warn('Hello warning');
logger.error('Hello error');
logger.critical('Hello critical');


// outputs
LogTar_
1024000
86400
3
Debug: Hello debug
Info: Hello info
Warn: Hello warning
Error: Hello error
Critical: Hello critical

DRY (Don't Repeat Yourself)

The "Don't Repeat Yourself" (DRY) principle is a basic concept in software development that promotes code reusability and maintainability. The idea behind DRY is to avoid duplicating code or logic in multiple places within your codebase. Instead, you aim to create a single source for a particular piece of functionality, and whenever you need that functionality, you refer to that source.

DRY encourages developers to write clean, efficient, and modular code by:

  • Reducing the chances of errors: Duplicated code increases the chances of mistakes or bugs when changes are made in one place but not in others.
  • Simplifying maintenance: When a change is required, you only need to update the code in one place, making it easier to keep your codebase up-to-date and consistent.
  • Enhancing readability: Code that is free from unnecessary duplication is easier to understand and follow, making it more accessible to other developers.

Although following the DRY principle is generally beneficial, there can be situations where duplication might not necessarily be a bad thing. Not every instance of code repetition needs to be eliminated, and striving for absolute DRYness in all cases might lead to overcomplicated solutions or premature abstraction.

A guideline often mentioned is the "Rule of Three": If you find yourself repeating the same code or logic more than three times, it's a strong indication that you should consider refactoring that code into a reusable function, class, or module. This threshold helps you strike a balance between reusability and pragmatic simplicity.

Trying too hard to avoid repeating code can make it so complicated that it's hard to understand. It's better to find a balance where your code can be used again and again, but it's still easy to read and work with.

The log method

We'll introduce a private member method called log. The clients don't need to know about what's going inside the library, so making the crucial/core methods private is a nice thing.

Update the code of your Logger class to include the log method. Also, update the code so that all other helper methods call the log method, thus avoiding the code duplication.

// lib/logger.js

const { LogLevel } = require("./utils/log-level");

class Logger {
    ...

    #log(message, log_level) {
        console.log('%s: %s', message, log_level)
    }

    debug(message) {
        this.#log(message, LogLevel.Debug);
    }

    info(message) {
        this.#log(message, LogLevel.Info);
    }

    warn(message) {
        this.#log(message, LogLevel.Warn);
    }

    error(message) {
        this.#log(message, LogLevel.Error);
    }

    critical(message) {
        this.#log(message, LogLevel.Critical);
    }
    ...
}

Executing the code gives us the desired output. The output includes the LogLevel as an integer. However, that's not helpful at all, we should be showing what the LogLevel is with a string representation of the level.

Let's introduce a new static method to_string inside the LogLevel class

// file: lib/utils/log-level.js

class LogLevel {
    ...
    static to_string(log_level) {
        const levelMap = {
            [this.Debug]: "DEBUG",
            [this.Info]: "INFO",
            [this.Warn]: "WARN",
            [this.Error]: "ERROR",
            [this.Critical]: "CRITICAL"
        };

        if (levelMap.hasOwnProperty(log_level)) {
            return levelMap[log_level];
        }

        throw new Error(`Unsupported log level ${log_level}`);
    }
    ...
}

Change the code inside the log() method of Logger class.

// lib/logger.js

const { LogLevel } = require("./utils/log-level");

class Logger {
    ...

    #log(message, log_level) {
        console.log('%s: %s', message, LogLevel.to_string(log_level))
    }
    ...
}

This outputs

Hello debug: DEBUG
Hello info: INFO
Hello warning: WARN
Hello error: ERROR
Hello critical: CRITICAL

Everything looks good.

Considering the log_level member variable

Notice that in our config.json we specified that the log level should be 3 that is LogLevel.Error. Specifying the log level means that we should only write logs that are equal to or above the specified level.

Imagine a production application, which is usually under a very heavy load. We'd like to specify the level as LogLevel.Warn or even LogLevel.Info. We don't care about the LogLevel.Debug logs at all. They might pollute the log files as the debug logs are usually very verbose.

We are going to add a small check to ignore the logs which are below the current log_level.

// file: lib/logger.js

class Logger {
    ...
    #log(message, log_level) {
        if (log_level < this.#config.level) {
            return;
        }

        console.log('%s: %s', message, LogLevel.to_string(log_level))
    }
    ...
}

On running, we get the following output.

Hello error: ERROR
Hello critical: CRITICAL

We only write logs based off the current logger's log_level

Writing to a file

We've been using console.log() to print the log messages to the standard output, or your terminal. However, there are many drawbacks to this approach.

  1. Lack of Persistence: Console logs are ephemeral and disappear once the application session ends (when you close the terminal or exit out of an ssh connection). This makes it challenging to review logs for past sessions.
  2. Performance Impact: Continuous console output can impact application performance, especially when generating a high volume of logs. It can slow down the application and interfere with its responsiveness. There are certain ways to get mitigate this, we'll talk about this in a later chapter.

Create a private log_file_handle member

// file: lib/logger.js

const fs = require('node:fs/promises')

class Logger {
    ...

    /**
     * @type {fs.FileHandle}
     */
     #log_file_handle;

    ...
}

The log_file_handle member variable will be our opened file handle that we'll reference again and again whenever we wish to write logs to a file. It will hold the opened log file, which will be created whenever the client initiates our logging library.

We'll expose a public method on the Logger called init and make that async so that it waits for the init() to finish initializing the file handle before moving further.

// file: lib/logger.js
const fs = require('node:fs/promises')

class Logger {
    ...

    #log_file_handle;

    async init() {
         const file_name = this.#config.file_prefix + new Date().toISOString().replace(/[\.:]+/g, "-") + ".log";
        this.#log_file_handle = await fs.open(file_name, "a+");
    }
}

There's a lot going on in this method. Let's break it down.

A small primer on Regular Expressions

  1. The init method is responsible for initializing the logger, which includes creating or opening a log file with a dynamically generated name based on the current date and time.

  2. this.#config.file_prefix is used to prefix the name of the log file, which is actually set in the config.json file or can be passed as a json object or utilizing the Builder pattern.

  3. new Date().toISOString() generates a string representation of the current date and time in the ISO 8601 format, such as "2023-08-18T15:30:00.000Z".

  4. .replace(/[\.:]+/g, "-") is a regular expression operation. Let's break down the regex:

    • the square brackets [] are used to define a character class. A character class is a way to specify a set of characters from which a single character can match. For example:

      • [abc] matches any single character that is either 'a', 'b', or 'c'.

      • [0-9] matches any single digit from 0 to 9.

      • [a-zA-Z] matches any single alphabetical character, regardless of case.

        You can also use special characters within character classes to match certain character types, like \d for digits, \w for word characters, \s for whitespace, etc.

        In this case we are looking for all the dot (.) characters and colon (:) characters in the string.

    • \. matches a literal dot character. Since the dot (.) is a special character in regular expression, known as a wildcard. It matches any single character except for a newline character (\n). This is useful when you want to match any character, which is often used in pattern matching.

      However, in this case, we want to match a literal dot character in the string (the dot in the date-time format "00.000Z"). To achieve this, we need to escape the dot character by preceding it with a backslash (\). When you place a backslash before a special character, it indicates that you want to match the literal character itself, rather than its special meaning.

    • + matches one or more of any character except newline. We match for all the characters following the dot (.) and the (:) character.

    • /g is the global flag, indicating that the replacement should be applied to all occurrences of the pattern.

    • So, the regex [\.:]+ matches the dot or colon and all the repeated occurences of these 2 characters.

    • The replacement "-" removes the dot and all characters after it and replaces it with a hyphen (-).

  5. The result of the replace operation is a modified version of the ISO string, which now includes only the date and time portion, without the fractional seconds.

  6. .log is appended to the modified date and time string to form the final log file name.

  7. await fs.open(file_name, "a+") opens or creates the log file using the fs.open function with "a+" flag. We talked about the modes in a previous chapter

    • If the file doesn't exist, it will be created.
    • If the file exists, data can be appended to it.
    • The "a+" mode allows both reading and appending. Appending means, we begin writing to the end of the file instead of from the 1st line. However, if the file is empty, it starts from the beginning.

This code initializes the logger by creating or opening a log file with a name based on the current date. The regular expression is used to remove the fractional seconds from the ISO date string, and the resulting string is used as part of the log file name. This allows for creating a new log file every time the init method is called, typically representing logs for a specific time period.

Since regular expressions are one of the most important concepts which are usually neglected, we'll have an entire chapter dedicated to mastering regular expressions in case you'd like to get comfortable using them.

Testing the log file creation

Let us try to test the init method, and see if it creates a log file like we desire?

Before that, we'll look at a small nuance of calling an async function inside a non-async scope. Add the following log statement in the init method

// file: lib/logger.js

class Logger {
    ...

    async init() {
        const file_name = this.#config.file_prefix + new Date().toISOString().replace(/[\.:]+/g, "-") + ".log";
        this.#log_file_handle = await fs.open(file_name, "a+");
        console.log("File created.")
    }
    ...
}

And then call the init method from the test.js

// file: test.js

const { Logger, LogConfig } = require("./index");

const logger = Logger.with_config(LogConfig.from_file("./config.json"));
logger.init();
console.log("End of the file.");

However the program exits, without creating the log file.

$ node test.js

### Outputs
End of the file.

Why is it so? It is because we're using the node:fs/promises module, where all the functions are asynchronous. That means, they do not block any code and continue execution without waiting. This is the key to creating performant concurrent applications.

How do we fix it? Simply by awaiting the init() method.

// file: test.js

const { Logger, LogConfig } = require("./index");

async function main() {
  const logger = Logger.with_config(LogConfig.from_file("./config.json"));
  await logger.init();
  console.log("End of the file");
}

main();

If you try to run now, the output is what you would've expected.

### Outputs
End of the file

You'll also notice that the log file has also been created! For me the log file was created in the root directory. The name of the log file is

LogTar_2023-08-18T17-20-23-132Z.log

LogTar_ is the prefix that I specified in the config.json file. Following it is the timestamp of when the file was created. We also add a .log extension at the end. All working as expected.

Another gotcha

If you're paying attention, you'll have already figured out that the way we're providing path to a file is not a nice way. If we wanted to run the test.js file from a different directory, say lib/config we'll get an error.

$ cd lib/config && node ../../test.js

// outputs
Error: ENOENT: no such file or directory, open './config.json'

Let's fix this by using the __dirname global variable that we used earlier.

// file: test.js

const path = require("node:path");
const { Logger, LogConfig } = require("./index");

async function main() {
  const logger = Logger.with_config(LogConfig.from_file(path.join(__dirname, "config.json")));
  await logger.init();
  console.log("End of the file");
}

main();

Now if we try to run it, it works irrespective of the directory you're running the code from.

$ node ../../test.js
### Outputs
End of the file

Logs directory configuration

When we run our code using the node ../../test.js command, the log files are being created in the same place where we typed that command, which is the lib/config folder. This is not what we want. Our library is a tool that other people will use. We should have complete say over where the log files go. This is because we want to make sure our library works well no matter how others use it.

Normally, logs (like records of what our code is doing) are put in a special folder called "log." This folder sits at the main spot of the entire project.

By centralizing the log files within the log directory at the root of the project that includes our library, we achieve a more organized and modularized approach to logging. This ensures that the log files don't clutter the main project's structure and are conveniently located in one designated place.

Our first script

Inside the package.json file, add a script start with the value of node test.js and get rid of the test script.

// file: package.json

{
    ...
    "scripts": {
        "start": "node test.js"
    },
    ...
}

If you run npm start from any folder of the project, it runs the node test.js command from the directory the package.json file is located.

The logs output in the root of the directory. Let's create them inside the log directory, instead of the root directory.

Update the code inside Logger.init method:

// file: lib/logger.js

const path = require('node:path')

async init() {
    const file_name = this.#config.file_prefix + new Date().toISOString().replace(/[\.:]+/, "-") + ".log";
    this.#log_file_handle = await fs.open(path.join('logs', file_name), "a+");
}

If you do run npm start, it fails.

### Outputs
[Error: ENOENT: no such file or directory, open 'logs/LogTar_2023-08-18T19-14-46-162Z.log']

It's failing because it could not find the log directory. If we mention something like this in our library's readme:

Hey please create a `log` folder in order for this library to work

They'll simply ignore the library and use something better. Let's make the whole process streamlined. Irrespective of where our library's code live, whether it's inside the node_modules folder or in the same directory.

The require object

We've been using require throughout our code, which just imports a module, and exposes all it's exported functionality - whether it's a variable, a function or a class. You may think of it as a function since require('node:path') indeed looks like a function call.

But, require is not just a function!

require also acts like an object that has several useful properties and methods associated with it. When you use require in your code, you are utilizing both its function-like behavior to load modules and its object-like properties and methods to interact with modules and the module system.

For example:

  • require.resolve('module-name'): Returns the path of the module without actually loading it.
  • require.cache: Provides access to the cache of loaded modules.
  • require.main: Provides access to the Module object representing the entry script loaded when the Node.js process launched. This is exactly what we need.

The reason require might feel like both an object and a function is because JavaScript allows functions to have properties. You can test this yourself

function my_fun() {
  console.log("hi");
}
my_fun.hey = "there";

console.log(my_fun.hey); // there
my_fun(); // 'hi'

Adding a new helper to create log directory

Create a new file called helpers.js inside the lib/utils folder.

// file: lib/utils/helpers.js

const fs_sync = require("node:fs");
const path = require("path");

/**
 * @returns {fs_sync.PathLike} The path to the directory.
 */
function check_and_create_dir(path_to_dir) {
  const log_dir = path.resolve(require.main.path, path_to_dir);
  if (!fs_sync.existsSync(log_dir)) {
    fs_sync.mkdirSync(log_dir, { recursive: true });
  }

  return log_dir;
}

module.exports = {
  check_and_create_dir,
};

Let's go through the check_and_create_dir function's code line by line.

  1. The path.resolve() function creates an absolute path by combining a sequence of paths or path segments.

    It processes the sequence from right to left, with each subsequent path added to the beginning until an absolute path is created. For example, if the path segments are /foo, /bar, and baz, calling path.resolve('/foo', '/bar', 'baz') would return /bar/baz because '/bar' + '/' + 'baz' creates an absolute path, but 'baz' does not.

    If, after processing all the segments in the given path, an absolute path hasn't been created yet, then the current working directory is used instead.

  2. The require.main.path holds the absolute path to the directory of the entry point. The entry point is the test.js in our case. Or whatever file you specify while running node file_name.js command.

  3. The fs_sync.existsSync function is checking if a given directory path exists in the file system. It's using the log_dir path (the result of path resolution) to see if the directory is already there.

  4. If the directory doesn't exist, the fs_sync.mkdirSync function comes into play. It's used to create directories. The { recursive: true } option ensures that not only the final directory is created, but also any necessary intermediate directories along the path. In case we wish to specify a log directory as logs/new/today

  5. At the end, function returns the log_dir, which is the full path to the directory. Whether the directory existed before or was just created, this path is what you get as a result from the function.

Updating the init method

    // file: lib/logger.js

    class Logger {
        ...
        async init() {
            const log_dir_path = check_and_create_dir("logs")

            const file_name = this.#config.file_prefix + new Date().toISOString().replace(/[\.:]+/, "-") + ".log";

            // Open/Create the file at the specified log directory and
            // save the file handle to the `#log_file_handle` for further
            // use
            this.#log_file_handle = await fs.open(path.join(log_dir_path, file_name), "a+");
        }
        ...
    }

Completing the log method

The current version of our Logger.log method looks like this:

#log(message, log_level) {
    if (log_level < this.#config.level) {
        return;
    }

    console.log('%s: %s', message, LogLevel.to_string(log_level))
}

Instead of logging to the standard output, let's write it to the log file that is created when the Logger is initialized.

// file: lib/logger.js

class Logger {
    ...
    async #log(message, log_level) {
        // dont' write to the file if
        //     1. The `log_level` argument is less than the `#config.level` value
        //     2. If the `fd` (file descriptor) is either 0 or -1, which means the file
        //        descriptor is closed or not opened yet.
        if (log_level < this.#config.level || !this.#log_file_handle.fd) {
            return;
        }

        // write logs to the opened file
        await this.#log_file_handle.write(log_message);
    }
    ...
}

In the test.js file

// file: test.js

const path = require("node:path");
const { Logger, LogConfig } = require("./index");

async function initialize_logger() {
  let logger = Logger.with_config(LogConfig.from_file(path.join(__dirname, "config.json")));
  await logger.init();

  return logger;
}
async function main() {
  let logger = await initialize_logger();
  logger.error("This is an error message");
}

main();

In test.js we create a new async function initialize_logger that creates a logger, initializes it and returns it.

We call the logger.error() method to print the log to the file.

Let's run the code

$ npm start

A new log file will be created inside the logs directory. Open and see the contents -

This is an error message

Perfect! Everything seems to be working now. The logs are being saved, and we can be proud of it! But wait, the logs aren't useful. We don't know what are those logs, when did we write them and what function called the logger.error() method?

We'll take care of all this in the next chapter.

The code for the entire chapter can be found at the src/chapter_04.2 directory

Read Next