Web Log of Ross Chapman

Web Log of Ross Chapman

A Recursive Validation Function with User-Defined Exceptions

Every time I use a recursive function for something practical in commercial software my notional machine of it’s behavior is refined. This amounts to a small list of heuristics amassing in my mental pocket:

  1. “It’s a function that calls itself.”
  2. “You make your list of tasks first then start from the last and work your way back up” or “It’s like unpacking a box with other, smaller boxes, with other smaller boxes, with other…and then only start looking at the boxes from the smallest to the largest, one at a time” (🎩👌 Aditya Bhargava’s grokking algorithms)
  3. “It’s good for building up a list or value, kind of like reduce”
  4. “It can be less code but less performant.”

After working on another problem recently that involved deep-diving a nested JavaScript object and executing validations against each level I’m adding:

“Recursion is awkward if you need to break early.”

In my particular case I needed to validate a recursive data structure representing an org chart of Employees and return invalid immediately if the code traversed an Employee with bad data – extra fields, missing required fields, fields of the wrong type, etc….

Breaking out from a recursive function is not quite as straightforward as you’d think. Also, historically I was used to seeing recursive code employed for tasks that wanted the call stack to build up all the way through the data.

Like, (deep) flattening an array:

function deepFlatten(nestedArray, result = []) { for (let element of nestedArray) { if (Array.isArray(element)) { deepFlatten(element, result); } else { result.push(element); } } return result; }

Or, fetching a complete set of data from a remote source in chunks:

async function fetchAll(params, all = []) { let chunk = await fetch(params); let nextPage = chunk.nextPage; all = all.concat(chunk.data); if (nextPage) { let nextParams = { ...params, page: nextPage }; return await fetchAll(nextParams, all); } return all; }

What I quickly discovered is that just trying to capture and emit an error from a recursive call stack is already a bit funky. Simply returning false in your function doesn’t work because calls lower on the stack may return true; and since we’re (kind of) “building a value” it only matters what the final call returns. This approach won’t work:

// Will only return false if the last call in the stack returns false function validate(data, schema) { for (let item of data) { for (let rule of schema) { let field = item[rule.name]; let required = rule.required; if (required && !field) return false; // Recurse if (Array.isArray(field)) { validate(field, schema); } } } return true; }

Using recursion is more like a leap of faith - you are handing over control to the JS engine over an unbounded data set; it’s quite reminiscent to the manner in which higher order functions operate with Array and Object collections. For example, forEach is a powerful and declarative alternative to for and for..of/in loops until you find yourself needing to skip over an iteration or break out of the loop. Keywords like continue and break are unavailable in Array and Object collection methods – these are closed iterators.

Your only recourse in a recursive function is relying on outer calls – since the call stack is LIFO - to set that flag and pass it through each stack layer. So capturing and emitting an error from your recursive function might look like this:

function validate(data, schema, errors = []) { for (let item of data) { for (let rule of schema) { let field = item[rule.name]; let required = rule.required; if (required && !field) { errors.push(error); } // Recurse if (Array.isArray(field)) { validate(field, schema, errors); } } } return errors; }

If our program requirements suggest we want to parse the entire org chart for bad data, this function will give us a result array we can further process to report errors. But for my purpose, there’s too big a potential cost of unnecessary runs while a large call stack is cleared for a large org chart.

In order to stop processing the org chart and return an invalid result early, we need a solution that stops execution entirely when the invalid check is entered in the outermost call. Alas, the solution ends up being rather elegant and simple, though counter-intuitive. Rather than returning (false, an error list, etc…), you can throw and thereby forcibly halt the engine’s execution of the code. Here’s an example with throw:

function validate(data, schema) { for (let item of data) { for (let rule of schema) { let field = item[rule.name]; let required = rule.required; // It's even one less character to write! 🤣 // Also now we have total control over the exception content if (required && !field) throw new MissingFieldError(item, rule); // Recurse if (Array.isArray(field)) { validate(field, schema); } } } return true; }

Day in, day out we work constantly with client applications that only trhow as a result of unintended bugs in the program. But we can take advantage of this standard JavaScript behavior and erect an appropriate error boundary. Remember:

Execution of the current function will stop (the statements after throw won’t be executed), and control will be passed to the first catch block in the call stack. 🔗

Therefore we can rename and wrap our recursive function that throws, and put it inside an error boundary to achieve that early break we want. This approach even comes with the added advantage of declaring the content of our user-defined exception at throw site; eg, utilizing meaningful error constructors or factories like missingFieldError().

function validate(data, schema) { try { validateInner(data, schema); } catch (error) { // returns new MissingFieldError()! return error; } return true; }

Even more, the elegance of this design with an outer try/catch allows for separate testing of our validation business logic – the rules against which bad data throw – and error handling – what errors we emit for certain cases.