Unconditional Code for Mapping Protobuf in Golang

Leverage Golang's nil receivers for cleaner and more testable data-centric backend services.

May 16, 2021


Unconditional code is code that just works, without errors or exceptions, without funny edge cases represented with tedious IF statements. Golang + gRPC offer a robust way of mapping incoming protobuf messages to internal or external (i.e. outgoing) structures unconditionally.

A word about Golang

Golang is particularly prone to "if-heavy" code because of the existence of nil and the decision to treat errors as values returned by functions:

func Foo() (*Result, error) {
  ...
}

func UseFoo() error {
  res, err := Foo()
  if err != nil {
    // handle, wrap or at least return `err`
    return err
  }
  if res == nil {
    // handle nil
  }
  // Do something with *res
  return nil
}

Here callers of Foo are forced to handle both an error and a nil case.

Use case example

In data-centric applications you often need to map upstream (incoming) data to an internal representation for further processing, before mapping it again to an outgoing event structure.

Imagine an e-commerce system composed of several gRPC services:

Checkout composes the purchase and calls Invoicing and Delivery in order to finish it.
Invoicing needs to map the order data to its internal model to render it and send it to the customer over email using configurable templates.
Delivery also needs to map the order data and send the shipment.

Here is code that deals with the user → customer mapping in Checkout:

func FromUser(u *user.User) (*model.Customer, error) {
  if u == nil {
    return nil, nil
  }
  var customer model.Customer
  customer.Email = u.Email
  if u.FirstName == "" && u.LastName == "" {
    return nil, fmt.Errorf("customer must have first or last name")
  }
  customer.Name = strings.TrimSpace(strings.Join([]string{u.FirstName, u.LastName}, " "))
  if u.BillingAddress == nil {
    return nil, fmt.Errorf("customer must have billing address")
  }
  customer.Country = u.BillingAddress.Country
  customer.City = u.BillingAddress.City
  return &customer, nil
}

There are several problems with this code, e.g. it is difficult to read. It is tempting to blame the language - after all, Golang is known to be verbose. However, in my experience the programming language is rarely to blame.

Issues of the code are:

It is hard to read.
It does mapping and validation at the same time.
The caller has to deal with both the error and the nil case.
It is really hard to read.
It is difficult to test. We always need to provide a complete User object and assert the entire mapped structure, instead of writing a table test case that asserts the mapping of the name separately.

The testing problem can explode easily. Consider a function that maps the deep structure of order.Order to an internal Invoice one:

func FromOrder(order *order.Order) (*model.Invoice, error) {
  if order == nil {
    return nil, fmt.Errorf("order is nil")
  }
  if order.User == nil {
    return nil, fmt.Errorf("user is missing")
  }
  customer, err := FromUser(order.User)
  if err != nil {
    // ...
  }
  // ...
}

func TestFromOrder(t *testing.T) {
  testCases := []struct {
    desc string
    in   *order.Order
    want *model.Invoice
  }{
    {
      desc: "should create document",
      in:   &order.Order{
        // ...
      },
      want: &model.Invoice{
        // ...
      },
    },
  }
  for _, tC := range testCases {
    t.Run(tC.desc, func(t *testing.T) {
      // ...
    })
  }
}

We have three options to implement the unit tests:

Initialize the complete order.Order and user.User objects to pass the mapper validations. Our tests will be long and hard to maintain, since any change to FromUser can break TestFromOrder.
Use indirection. We can hide the mapping of different parts behind interfaces and use mapper structs instead of functions. This creates clutter in our mapper package and adds a performance hit from the extra indirection.
Do not test the FromOrder function at all.

Luckily, there is a fourth option - unconditional code.

Unconditional Go

We will take advantage of two very nice features of Golang and protobuf:

nil receivers. Since methods in Golang are functions with receivers (the first parameter, like good old C), nothing prevents us from passing a nil receiver argument. Protobuf code generation takes advantage of this:

func (s *MyStruct) GetData() string {
  if s == nil {
    return ""
  }
  return s.Data
}

var ptr *MyStruct // nil
fmt.Println(ptr.GetData()) // this works fine and prints an empty line

Range over nil. Iterating over an empty or nil slice works the same - the body of the loop is skipped. In Golang, you don't need to explicitly check for nil in those cases. You can also append to a nil slice.

func get() []string {
  return nil
}

func main() {
  for _, s := range get() {
    fmt.Println("never", s)
  }
  var slice []string // nil
  slice = append(slice, "first element") // ["first element"]
}

Therefore, we can map unconditionally:

func FromOrder(order *order.Order) model.Invoice {
  return model.Invoice{
    Customer:  FromUser(order.GetUser()),
    LineItems: FromItems(order.GetItems()),
  }
}
func FromItems(items []*order.Item) []model.LineItem {
  var lineItems []model.LineItem
  for _, item := range items {
    lineItems = append(lineItems, FromItem(item))
  }
  return lineItems
}
func FromItem(item *order.Item) model.LineItem {
  price := decimal.New(item.GetProduct().GetPriceE2(), -2)
  qty := decimal.New(item.GetQuantity(), 0)
  return model.LineItem{
    Description: item.GetProduct().GetDescription(),
    Total:       price.Mul(qty),
  }
}

func FromUser(u *user.User) model.Customer {
  return model.Customer{
    Name:    CombineNames(u.GetFirstName(), u.GetLastName()),
    Email:   u.GetEmail(),
    Country: u.GetBillingAddress().GetCountry(),
    City:    u.GetBillingAddress().GetCity(),
  }
}
func CombineNames(names ...string) string {
  return strings.TrimSpace(
    strings.Join(names, " "))
}

This code is cleaner and shorter, and it does one thing - mapping. It is easier to write, read and test.
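To illustrate the testing claim, here is a minimal, self-contained sketch. The User and Address types below are hand-written stand-ins for the generated protobuf messages (an assumption made so the example runs without protoc), and Customer stands in for model.Customer. Each table case fills only the fields it cares about, including a nil input.

```go
package main

import (
	"fmt"
	"strings"
)

// Address and User are hypothetical stand-ins for generated protobuf
// messages; only the nil-safe getter pattern is reproduced here.
type Address struct{ Country, City string }

func (a *Address) GetCountry() string {
	if a == nil {
		return ""
	}
	return a.Country
}

func (a *Address) GetCity() string {
	if a == nil {
		return ""
	}
	return a.City
}

type User struct {
	FirstName, LastName, Email string
	BillingAddress             *Address
}

func (u *User) GetFirstName() string {
	if u == nil {
		return ""
	}
	return u.FirstName
}

func (u *User) GetLastName() string {
	if u == nil {
		return ""
	}
	return u.LastName
}

func (u *User) GetEmail() string {
	if u == nil {
		return ""
	}
	return u.Email
}

func (u *User) GetBillingAddress() *Address {
	if u == nil {
		return nil
	}
	return u.BillingAddress
}

// Customer stands in for model.Customer.
type Customer struct{ Name, Email, Country, City string }

func CombineNames(names ...string) string {
	return strings.TrimSpace(strings.Join(names, " "))
}

// FromUser maps unconditionally: no nil checks, no error return.
func FromUser(u *User) Customer {
	return Customer{
		Name:    CombineNames(u.GetFirstName(), u.GetLastName()),
		Email:   u.GetEmail(),
		Country: u.GetBillingAddress().GetCountry(),
		City:    u.GetBillingAddress().GetCity(),
	}
}

func main() {
	cases := []struct {
		desc string
		in   *User
		want Customer
	}{
		{"nil user maps to zero value", nil, Customer{}},
		{"first name only", &User{FirstName: "Ada"}, Customer{Name: "Ada"}},
		{"address only", &User{BillingAddress: &Address{City: "Sofia"}}, Customer{City: "Sofia"}},
	}
	for _, c := range cases {
		// Customer contains only strings, so == compares all fields.
		fmt.Printf("%s: ok=%t\n", c.desc, FromUser(c.in) == c.want)
	}
}
```

In a real project the same table would live in a `_test.go` file and use `t.Run`; the point is that a case such as "first name only" no longer needs a billing address just to get past validation.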

The lack of an error return value also liberates us from all the clutter of Golang's idiomatic error handling when calling the function.

Validation

So far so good, but what about the validation rules that we had embedded in the mapping code? We can't just delete them.

The mapper is a bridge between two models and usually both models would have some invariants (e.g. a line item must always have a description).

Putting the validation logic in that bridge might seem like a good idea, but it is not.

Let's look at a segment of the original FromUser example:

if u.BillingAddress == nil {
  return nil, fmt.Errorf("customer must have billing address")
}
customer.Country = u.BillingAddress.Country
customer.City = u.BillingAddress.City

What are we validating here - user or customer? Does the code mean that a User always has a billing address or a Customer always has country and city?

The only advantage of having the validation logic in the mapper is that we can see why we cannot satisfy the output model's invariants, e.g. "Customer city and country are empty because billing address information is missing". If this is complex in English, it cannot be simple in Golang or any programming language.

A better option is to use pre-checks and/or post-checks.

Pre-Checks

We can keep the mapping code clean by moving all assumptions about the input to the beginning of the processing. This is particularly useful for upstream data produced by another team.

In our example, the Invoicing service accepts an order.Order over gRPC, so we can add a validator to the handler accepting the gRPC request:

func (h *OrderHandler) handle(ctx context.Context, order *order.Order) error {
  err := h.validator.Validate(order)
  if err != nil {
    return fmt.Errorf("cannot handle invalid order: %w", err)
  }

  err = h.controller.CreateInvoice(order)
  if err != nil {
    return fmt.Errorf("cannot create invoice from order: %w", err)
  }

  return nil
}

This works well when the Invoicing service has specific, non-universal requirements about "orders that can have a purchase document", while Delivery has another set of requirements for order data.
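The handler above leaves the validator itself unspecified. As a rough sketch of what an Invoicing-specific pre-check might look like, the function below uses the nil-safe getters to state its assumptions about the input in one place. The types are hand-written stand-ins for the generated protobuf messages, the two rules are purely illustrative, and `errors.Join` assumes Go 1.20 or newer.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for generated protobuf messages.
type Address struct{ Country string }

func (a *Address) GetCountry() string {
	if a == nil {
		return ""
	}
	return a.Country
}

type User struct{ BillingAddress *Address }

func (u *User) GetBillingAddress() *Address {
	if u == nil {
		return nil
	}
	return u.BillingAddress
}

type Order struct{ User *User }

func (o *Order) GetUser() *User {
	if o == nil {
		return nil
	}
	return o.User
}

// ValidateForInvoicing collects every violated assumption instead of
// stopping at the first one. The rules are examples, not the real
// Invoicing requirements.
func ValidateForInvoicing(o *Order) error {
	var errs []error
	if o.GetUser() == nil {
		errs = append(errs, errors.New("order has no user"))
	}
	if o.GetUser().GetBillingAddress().GetCountry() == "" {
		errs = append(errs, errors.New("billing country is missing"))
	}
	// errors.Join returns nil when there is nothing to join.
	return errors.Join(errs...)
}

func main() {
	fmt.Println(ValidateForInvoicing(nil))
	valid := &Order{User: &User{BillingAddress: &Address{Country: "BG"}}}
	fmt.Println(ValidateForInvoicing(valid))
}
```

Because the checks read through the same getters as the mapper, the validator itself needs no defensive nil handling either, and the mapper can stay free of error returns.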

However, in many cases the order.Order invariants are universal, and the Checkout team is best placed to enforce them before it sends orders downstream to Invoicing or Delivery.

Which brings us to the next topic - post-checks.

Post-Checks

When each service is responsible for the invariants of its own data, outgoing model validation should happen at the end of the execution path of the producer.

This means that the Checkout service should guarantee that all line items have a description, as this is a requirement for invoices.

func (c *controller) CreateOrder(cart *shopping.Cart) error {
  order := mapper.FromCart(cart)
  err := c.validator.Validate(order)
  if err != nil {
    return fmt.Errorf("cannot issue invalid invoice: %w", err)
  }
  invoice, err := c.invoicing.IssueInvoice(order)
  ...
}

A more robust approach is to implement post-checks at the edge of the application code, either by enriching the data contract (e.g. with protovalidate) or via decorators. What I shared here is simpler and still works well.
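As a sketch of the decorator variant, the outgoing client can be wrapped so that no order leaves the service without passing validation. Everything here is an assumption made for illustration: the InvoiceIssuer interface, the WithValidation helper, the fake client and the description rule are hypothetical names, and the types stand in for the generated protobuf messages.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the outgoing order and the returned invoice.
type Item struct{ Description string }
type Order struct{ Items []*Item }
type Invoice struct{ Lines int }

// InvoiceIssuer is an assumed interface for the Invoicing client.
type InvoiceIssuer interface {
	IssueInvoice(o *Order) (*Invoice, error)
}

// validatingIssuer decorates another InvoiceIssuer with a post-check.
type validatingIssuer struct {
	next     InvoiceIssuer
	validate func(*Order) error
}

func (v validatingIssuer) IssueInvoice(o *Order) (*Invoice, error) {
	if err := v.validate(o); err != nil {
		return nil, fmt.Errorf("refusing to send invalid order: %w", err)
	}
	return v.next.IssueInvoice(o)
}

// WithValidation wraps next so every outgoing order is validated first.
func WithValidation(next InvoiceIssuer, validate func(*Order) error) InvoiceIssuer {
	return validatingIssuer{next: next, validate: validate}
}

// fakeIssuer stands in for the real gRPC client and counts calls.
type fakeIssuer struct{ calls int }

func (f *fakeIssuer) IssueInvoice(o *Order) (*Invoice, error) {
	f.calls++
	return &Invoice{Lines: len(o.Items)}, nil
}

// itemsHaveDescriptions is an example invariant of the outgoing model.
func itemsHaveDescriptions(o *Order) error {
	if o == nil {
		return errors.New("order is nil")
	}
	for i, it := range o.Items {
		if it == nil || it.Description == "" {
			return fmt.Errorf("item %d has no description", i)
		}
	}
	return nil
}

func main() {
	remote := &fakeIssuer{}
	issuer := WithValidation(remote, itemsHaveDescriptions)

	// Invalid order: the wrapped client is never called.
	_, err := issuer.IssueInvoice(&Order{Items: []*Item{{}}})
	fmt.Println(err, remote.calls)

	// Valid order: the call is forwarded.
	inv, err := issuer.IssueInvoice(&Order{Items: []*Item{{Description: "Book"}}})
	fmt.Println(inv.Lines, err, remote.calls)
}
```

With this wiring the controller no longer calls the validator explicitly, so a new code path that sends orders cannot forget the post-check.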

Conclusion

Unconditional code and pushing data validation to the edge of processing benefit both the organisation and the code:

Code and tests are easier to maintain
Outages are mitigated faster, since it is the right (upstream) team's service that “breaks”

This style of coding is also applicable outside of Golang & Protobuf. Check out GOTO 2018 • Unconditional Code • Michael Feathers for more inspiration.

References

GOTO 2018 • Unconditional Code • Michael Feathers

Kiril's Software Engineering