Web Log of Ross Chapman

Well-known ways that JavaScript coerces objects to strings

It’s a proper cliche of commercial computer programs to bind audit reporters alongside code at important relay nexuses. There is a panoply of reasons to extract information this way: producing audit trails for legal compliance, gathering product insights, collecting debug traces for quality monitoring, etc. The trick is wresting useful emissions from these reporters when they are working within a dynamically typed language that might unabashedly take liberties with ambiguous data types – like JavaScript. Typically you design your loggers to capture a Good-Enough Facsimile™ of the current state of the world for later analysis [1]. This likely means grabbing the subject of the current operation as well as felicitous supporting actors like application and session context, and user info, in Object form. (Already a funky transmigration of Subject/Object.) Oftentimes the subject is the user – we love to know you, and know to love you. But, if we may clasp hands and walk/skip through a hallucination together into a fantasy e-commerce example, we can explore logging and value coercion of a familiar Order class. What would a blog about building websites be without a bit of retail market magic?

Let’s start with the following modest invention which terraforms [1] a theoretical backend piece of an online shop. And let’s imagine that from time to time, orders that come into this shop become stuck during the fulfillment process – perhaps for a variety of reasons because we are dealing with, say, medication items. Insurance claims may be rejected, or the doctor or pharmacist may discover a conflict between the patient’s health attributes and the meds. In such a scenario, this code attempts a fully-automated retry of the stuck order, and in due course, at important nexuses, passes relevant contextual data to log handlers. From there other bots or real-life persons can take over.

const oldestStuckOrder = await query(orders, { filter: 'oldest_stuck' });
const logData = {
    order: oldestStuckOrder, 
    userInfo: { ... },
    sessionInfo: { ... },
};
logger.info(`Start resolving stuck order: ${logData}`);
const reason = determineStuckReason(oldestStuckOrder);

if (reason) {
    const result = resolveStuckOrder(oldestStuckOrder, reason);
    const logDataWithResult = {
        ...logData,
        reason,
        result,
    };
    logger.info(`Finished resolving stuck order: ${logDataWithResult}`);
} else {
    const logDataWithReason = {
        ...logData,
        reason,
    };
    logger.error(`Unable to resolve stuck order: ${logDataWithReason}`);
}

There are three points in this code where I’m sending a rich object into a logger service which only accepts a single string argument. In this fantasy, this particular team of software developers just wants the logger service API as simple and straightforward as can be: unary and stringy. I hope this code reads naturally, that it’s similar to something you’ve seen before. And…hopefully your gears are turning already and you are starting to see with me 𝓿; and you are beginning to feel awkward 🙇🏻‍♂️ about what representation for oldestStuckOrder or logDataWithReason this gussied-up console function will display. Won’t interpolating complex objects inside a template string force the engine to implicitly coerce the object into the obstinately churlish [object Object]?

Scene opens, your PM marches up to your desk with a bemused frown:

PM: What happened with Order #555-5555?
Me: The problem is that this order got into an illegal state!
PM: Which order?
Me: Oh, the object object order.
PM: 😒

JavaScript is funky-beautiful because the dynamic typing nature of the lang means you can smush values of mismatched types into the same context and your program won’t catastrophically fail when you run it. Loose assumptions loosely held, I guess. When we write JavaScript, we often take this for granted, and we exploit it for good. You probably know and understand intuitively or consciously the bunch of common “contexts” where the language affords this failsafe. Here’s a general breakdown:

  • expressions using arithmetic operators
  • comparison operators
  • test expressions (if statements, the second clause of a for loop header, the first clause of a ternary operator)
  • interpolated strings.
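To make those four contexts concrete, here is a minimal sketch of what happens to a plain object in each of them. The order object and its id are made up for illustration; nothing here is specific to any logging library.

```javascript
// A plain object with no custom conversion methods.
const order = { id: 42 };

// Arithmetic operators: the object is converted to a primitive first.
const sum = order + 1;      // "[object Object]1" (string concatenation)
const product = order * 2;  // NaN ("[object Object]" is not numeric)

// Comparison operators: loose equality coerces the object to a string.
const looseEqual = order == '[object Object]'; // true

// Test expressions: every object is truthy, even an empty one.
const label = order ? 'truthy' : 'falsy'; // "truthy"

// Interpolated strings: the object is converted to a string.
const message = `Order: ${order}`; // "Order: [object Object]"

console.log({ sum, product, looseEqual, label, message });
```

None of these lines throws, which is exactly the failsafe (and the trap) in question.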

That last one is case in point; we can send an object into a string context – Finished resolving stuck order: ${logDataWithResult} – and get something workable out the other end:

const logDataWithResult = {
    prop: 'prop',
    anotherProp: 'anotherProp',
};
console.log(`Finished resolving stuck order: ${logDataWithResult}`);
// Finished resolving stuck order: [object Object]

And there it is. Workable (quite generously). The famed proterozoic, bracketed notation of familiar churlish conceit and “bad parts” motifs. Obviously this is not the best guess we hope the engine to make when executing our logging function – we have lost all that rich order and user data! Our compliance trail is meaningless. But we shouldn’t despair quite yet. I’m happy to share that JavaScript exposes many an API for developers to control the return value of type conversions. If JavaScript is anything it’s a fairly open language. Not necessarily open for expansion to the extent of, say, Clojure’s macros. But all things being mutable objects (with a few untouchable properties) and a handful of scalar values, you have a good deal of freedom. For coercing objects to string, the most famous method is probably toString(). In fact, JavaScript dogfoods its own toString() for object -> string conversion automatically in all those common contexts listed above. Whenever

the object is to be represented as a text value or when an object is referred to in a manner in which a string is expected – MDN contributors

Like between backticks.

Now, if a fellow developer in our dreamy medication retail shop codebase has not already come along and monkey-patched the Order object’s toString() method, the default conversion algorithm rules defined for Object.prototype.toString() from the ECMAScript® 2021 spec section 19.1.3.6 will kick in. Yep, we are going there. The algorithm is actually pretty easy to understand (though here’s to hoping your browser of choice plays along!). I invite you for a glance:

Can you see where [object Object] comes from? If the unfamiliar dialect is a bit intimidating for this casual Sunday morning read, here’s what the above algorithm would look like if we implemented it in JavaScript [4]:

// Imaginary module standing in for the engine's internal operations.
import { SecretInternals } from 'secret-internals';

const internalSlotsToTagMap = {
  ParameterMap: "Arguments",
  Call: "Function",
  ErrorData: "Error",
  BooleanData: "Boolean",
  NumberData: "Number",
  StringData: "String",
  DateValue: "Date",
  RegExpMatcher: "RegExp",
};

function toString(value) {
  if (value === undefined) return '[object Undefined]';
  if (value === null) return '[object Null]';

  let builtinTag;
  const innerValue = SecretInternals.Object.box(value);

  if (Array.isArray(innerValue)) {
    builtinTag = 'Array';
  } else {
    // The first matching internal slot wins, like the spec's else-if chain.
    for (const [slot, tagName] of Object.entries(internalSlotsToTagMap)) {
      if (SecretInternals.Object.hasInternalSlot(innerValue, slot)) {
        builtinTag = tagName;
        break;
      }
    }
  }

  // Step 14: nothing matched, so this is a plain object.
  if (!builtinTag) {
    builtinTag = 'Object';
  }

  // Step 15: a string-valued Symbol.toStringTag overrides the builtin tag.
  let tag = SecretInternals.Object.get(innerValue, Symbol.toStringTag);

  if (typeof tag !== 'string') {
    tag = builtinTag;
  }

  return `[object ${tag}]`;
}

For actual objects, not object-like things (Arrays), the algorithm falls through to step 14 where a temporary referent called builtinTag receives the value Object. This builtin tag is later used as the second part of the converted argument value.
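You don’t need secret internals to watch the real algorithm pick its tags. A small sketch: borrowing Object.prototype.toString with call() sidesteps any overrides that Array, Date and friends define on their own prototypes. The describe helper name is mine, not part of any API.

```javascript
// Borrow the default algorithm so per-type toString() overrides are bypassed.
const describe = (value) => Object.prototype.toString.call(value);

const results = {
  undefinedValue: describe(undefined), // "[object Undefined]"
  nullValue: describe(null),           // "[object Null]"
  array: describe([1, 2, 3]),          // "[object Array]"
  fn: describe(function () {}),        // "[object Function]"
  error: describe(new Error('boom')),  // "[object Error]"
  date: describe(new Date(0)),         // "[object Date]"
  regexp: describe(/stuck/),           // "[object RegExp]"
  boxedNumber: describe(7),            // "[object Number]"
  plainObject: describe({}),           // "[object Object]"
};

console.log(results);
```

Only the last entry falls all the way through to the Object default.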

Despite the sarcastic jabs from the peanut gallery, what else would we expect the language to do? JavaScript’s unintentional emissions were designed for a platform that projects them through UIs for consumption by masses of retinas of human people – it’s Good Enough. The language keeps your program running with a type guess and leaves the contents of your value alone. It doesn’t, like, radically unpack and serialize your contents to JSON (what compute or privacy costs might lurk!); or try to set the builtin tag to the left-hand side of the last variable assignment statement: what a drastic move, the language doesn’t work this way: variable names are not synonymous with types! Variable assignment is void of any tautological binding! JavaScript just lets you take matters into your own hands.

Until very recently I wasn’t aware of any techniques beyond an Object’s built-in toString() property to futz around with type conversions. But apparently there are a few modern well-known Symbols [5] that have entered the language to help. I’ve compiled an example with four different APIs I could track down. Any of these would play nicely for logging libraries, though the last only works in node, and there are nuances to each that you must consider for your use case.

// 1. Override toString() (js, node)
class Chicken {
  toString() {
    return '🐓';
  }
}

var chicken = new Chicken();
console.log(`${chicken}`); // 🐓

// 2. Symbol.toStringTag (js, node)
class Chicken {
  get [Symbol.toStringTag]() {
    return '🐓';
  }
}

let chicken = new Chicken();
console.log(`${chicken}`); // [object 🐓]

// 3. Symbol.toPrimitive (js, node)
class Chicken {
  [Symbol.toPrimitive](hint) {
    switch (hint) {
      case 'number':
      case 'string':
      case 'default':
        return '🐓';
      default:
        return null;
    }
  }
}

let chicken = new Chicken();
console.log(`${chicken}`); // 🐓

// 4. util.inspect.custom (node)
const inspect = Symbol.for('nodejs.util.inspect.custom');
const util = require('util');

class Chicken {
  [inspect]() {
    return '🐓';
  }
}

const chicken = new Chicken();
console.log(chicken); // 🐓
console.log(`${chicken}`); // [object Object]
util.inspect(chicken) // 🐓

I haven’t formally surveyed this, but I can report anecdotally that of my current team of ~25 engineers who work on a universal JavaScript application, overriding toString() is the most commonly preferred strategy. My sneaking suspicion is that many JavaScript developers are not aware of the more contemporary Symbol methods/properties, even though these methods have been available in major browsers/node for ~4-5 years. Or maybe it’s simply a matter of many backend devs coming to backend JS from other languages and server environs. From what I understand, node has just started to finally emerge in the past few years as fit-enough for prod. JavaScript is vast territory, quickly expanding, in multiple runtimes – it takes years.

As for nodejs.util.inspect.custom, I haven’t been around node land long enough to know if its usage is idiomatic.

Still, preference for toString() may not simply be an issue of keeping up with the JS Joneses. As shown above, the outcomes of these different strategies are not identical. What’s more, to layer on the complexity, these options aren’t wholly independent. In fact, JavaScript combines these strategies together under the hood. Did you notice what was going on in step 15 of the conversion algorithm above? The spec requires that Object.prototype.toString looks up the @@toStringTag symbol property on the object – these symbol members are in the DNA sequence now. When we take control back, understanding the spec is quite key: we can avoid mistakes like combining these two options, since an overridden toString() always takes precedence. For example:

class Chicken {
  get [Symbol.toStringTag]() {
    return 'Base 🐓';
  }

  toString() {
    return 'This is a 🐓';
  }
}

class JungleChicken extends Chicken {
  get [Symbol.toStringTag]() {
    return 'Jungle 🐓';
  }
}

const chicky = new Chicken();
const jungleChicky = new JungleChicken();

console.log(`${chicky}`); // This is a 🐓
console.log(`${jungleChicky}`); // This is a 🐓

However, say I were interested in simply tagging the string representation of my object to avoid exposing its value contents, while still presenting a semantically rich identifier to consumers. I might also want to keep the default bracket output – [object ...] – with the prepended “object” type, to stay consistent with how objects are stringified elsewhere in our code. In that case, leveraging the well-known Symbol.toStringTag property would be the way to go. For example, the following logger from our e-commerce imaginary might obscure private user data like this:

// .../jsonapi/resources/user.js
class User {
  get [Symbol.toStringTag]() {
    return `User:${this.id}`;
  }
}

// Somewhere else...
logger.error(`Unable to resolve stuck order: ${logDataWithReason}`);
// Unable to resolve stuck order:
// {
//   order: {...},
//   userInfo: [object User:123456],
//   sessionInfo: {...},
// };
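For something you can paste into a repl, here is a self-contained sketch of the same tagging idea. The constructor, the email field and the sample values are assumptions for illustration only. It also interpolates the user directly, because a plain wrapper object has no tag of its own.

```javascript
// Hypothetical User resource; only the id is assumed to be safe to log.
class User {
  constructor(id, email) {
    this.id = id;
    this.email = email; // private detail that should stay out of logs
  }

  get [Symbol.toStringTag]() {
    return `User:${this.id}`;
  }
}

const user = new User(123456, 'patient@example.com');

// Interpolate field by field; interpolating a wrapper like { user }
// would yield "[object Object]" because the wrapper is a plain object.
const line = `Unable to resolve stuck order: userInfo=${user}`;
console.log(line);
// Unable to resolve stuck order: userInfo=[object User:123456]
```

The email never reaches the log line, yet the reader still learns which user was involved.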

Your next option, empowering even more fine-grained control, is adding a Symbol.toPrimitive method to your object. Symbol.toPrimitive is a main line into the runtime’s coercion processing. After playing around a bit in a browser and a node repl, I’ve noticed that this Symbol will take precedence over a provided toString() override.

class Chicken {
  toString() {
    return 'This is a 🐓';
  }

  get [Symbol.toStringTag]() {
    return '🐓';
  }

  [Symbol.toPrimitive](hint) {
    switch (hint) {
      case 'number':
      case 'string':
      case 'default':
        return 'This is a 🐓 primitive';
      default:
        return null;
    }
  }
}

const chicky = new Chicken();
console.log(`${chicky}`); // This is a 🐓 primitive
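If you’re wondering where those three hint values come from, a little probe object can record which hint the runtime hands over in different contexts. This is just a sketch; the probe and seenHints names are mine.

```javascript
// Records the hint the runtime passes for each kind of coercion.
const seenHints = [];
const probe = {
  [Symbol.toPrimitive](hint) {
    seenHints.push(hint);
    return hint === 'number' ? 0 : 'probe';
  },
};

`${probe}`;     // template literal     -> "string"
String(probe);  // explicit string cast -> "string"
+probe;         // unary plus           -> "number"
probe * 2;      // multiplication       -> "number"
probe + '';     // binary plus          -> "default"
probe == 'x';   // loose equality       -> "default"

console.log(seenHints.join(', '));
// string, string, number, number, default, default
```

For a logger that interpolates values into template strings, the "string" case is the one that matters.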

By using Symbol.toPrimitive, you’re basically instructing the runtime to LOOK HERE for all its object coercion preferences. What’s more, beyond Symbol.toStringTag’s mere label augmentation, you get a powerful indirection [2] to handle all primitive type coercion scenarios [1]. You’re also overriding internal language behavior which – I was surprised to learn – effectively resolves how to order the calls for Object.prototype.toString() and Object.prototype.valueOf(). Flip to section 7.1.1 of the spec to see how the ToPrimitive abstraction is designed to call a further nested OrdinaryToPrimitive abstraction for decision-making:

Translated for comfort:

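// Imaginary module standing in for the engine's internal operations.
import { SecretInternals } from 'secret-internals';

function ordinaryToPrimitive(object, hint) {
  // Assert: the input must be an object.
  if (!SecretInternals.isObject(object)) {
    throw new TypeError('Expected an object');
  }

  // Assert: the hint must be either 'string' or 'number'.
  if (hint !== 'string' && hint !== 'number') {
    throw new TypeError(`Unexpected hint: ${hint}`);
  }

  // The hint only decides which method is tried first.
  const methodNames = hint === 'string'
    ? ['toString', 'valueOf']
    : ['valueOf', 'toString'];

  for (const methodName of methodNames) {
    const method = SecretInternals.get(object, methodName);

    if (SecretInternals.isCallable(method)) {
      const result = SecretInternals.call(method, object);

      // The first primitive result wins.
      if (SecretInternals.isNotObject(result)) {
        return result;
      }
    }
  }

  throw new TypeError('Cannot convert object to primitive value');
}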
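The ordering is easy to observe from userland, too. In this sketch (the dose object is made up), there is no Symbol.toPrimitive, so the runtime falls back to the toString()/valueOf() ordering described above.

```javascript
// Records which conversion method the runtime reaches for.
const calls = [];
const dose = {
  toString() {
    calls.push('toString');
    return '10mg';
  },
  valueOf() {
    calls.push('valueOf');
    return 10;
  },
};

const asText = `${dose}`;     // "string" hint: toString() is tried first
const asNumber = dose * 2;    // "number" hint: valueOf() is tried first
const asDefault = dose + '';  // "default" hint is treated like "number"

console.log(asText, asNumber, asDefault); // 10mg 20 10
console.log(calls.join(', '));            // toString, valueOf, valueOf
```

Note the last case: concatenating with an empty string does not call toString() when valueOf() already returns a primitive.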

I think I like the idea of using these well-known symbols for custom object -> string representations, if for the collision protection alone. [3] What would it be like to reach for the powerhouse of Symbol.toPrimitive to hijack the runtime from eventually calling through to Object.prototype.toString()? Furtive, conspiratorial whispers to the interpreter 🤫. Even a partially implemented reducer will do, as I demonstrate in my chicken example above: the switch statement can gracefully sidestep other type hint cases and only target the string hint case. But is grabbing for Symbol.toPrimitive overkill? toString() is tried and true and a pristine greenfield function block without arrogant “hints” and a naive switch statement without pattern matching 🙄 (are we there yet?). Could there be a non-trivial DX cost of confusing other developers if the other primitive case statements are fall-throughs?
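To weigh the two side by side, here is a minimal sketch for the Order from the opening example. The id and status fields, and the HintedOrder subclass, are assumptions for illustration; the real class could look quite different.

```javascript
// Hypothetical Order; the id and status fields are assumptions.
class Order {
  constructor(id, status) {
    this.id = id;
    this.status = status;
  }

  // Option A: the familiar override.
  toString() {
    return `Order #${this.id} (${this.status})`;
  }
}

class HintedOrder extends Order {
  // Option B: only the "string" hint is customized. Other hints try
  // valueOf() and then toString(), roughly mirroring the ordinary ordering.
  [Symbol.toPrimitive](hint) {
    if (hint === 'string') {
      return `HintedOrder #${this.id}`;
    }
    const value = this.valueOf();
    return typeof value === 'object' ? this.toString() : value;
  }
}

const plain = new Order('555-5555', 'stuck');
const hinted = new HintedOrder('555-5555', 'stuck');

console.log(`${plain}`);   // Order #555-5555 (stuck)
console.log(`${hinted}`);  // HintedOrder #555-5555
console.log(hinted + '');  // Order #555-5555 (stuck)
```

Either option only helps when the Order itself is interpolated. Wrapping it in a plain object literal, the way logData does, still yields [object Object] unless the wrapper gets a conversion of its own.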


1: Whenever I think about how software tries to capture its own understanding of the world it creates, I’m brought back to systems thinkers like jessitron:

“We don’t expect the world to have perfect consistency. Yet we wish it did, so we create facsimiles of certainty in our software.
It’s impossible to model the entire world. Completeness and consistency are in conflict, sadly. Still, if we limit “complete” to a business domain, and to the boundaries of our company, this is possible. Theoretically.”

2: By “indirection” I really mean to invoke Zachary Tellman’s exegesis of semantic drifts in software industry lexicons from his book Elements of Clojure. His work is a really nice refinement:

Indirection provides separation between what and how. It exists wherever “how does this work?” is best answered, “it depends.” This separation is useful when the underlying implementation is complicated or subject to change. It gives us the freedom to change incidental details in our software while maintaining its essential qualities. It also defines the layers of our software; indirection invites the reader to stop and explore no further. It tells us when we’re allowed to be incurious.
He goes on to discuss that conditionals are one of two primary devices to achieve successful indirections (the other being references). Conditionals are effective because they contain ordered, “closed” decision-making mechanisms that avoid conflicts, in contrast to tables with individuated keys.
Conditionals solve conflicts by making an explicit, fixed decision. Where conflicts are possible, we use conditionals because they are closed.

3: That’s the primary purpose of these Symbols. For a deeper look, see “Support util.inspect.custom as a public symbol” (#20821) on GitHub.

4: Try not to think too hard about the implications of a JS interpreter written in JS. But, ya know, Atwood’s law; you will, not surprisingly, find JS interpreters for JS out there, many built on Acorn, which is itself a parser written in JS. Man alive!

5: I’m bemused and betwixed by this use of “well-known.” Does anyone know the origin of this qualifier?