The Hermit Shrimp

2020.12.17

Nobody Cares About Security

At least, nobody who can expense it.

Once again, I have been blown off by managers and other assorted budget handlers about the potential for major security issues throughout our ecosystem.

"Why do you care? You're just a developer." -t. Managers Everywhere

Well, aside from my job and livelihood, and the jobs and livelihoods of everyone in this department, riding on the security of our data? Not much, aside from our customers' own personal information. But, hey, Yahoo doesn't mind letting that data bleed out every few years, so why should I care? I know that's a pretty tall order to protect what we've been tasked to protect. I know it's even harder to spend a budget to benefit our job security. Especially when there's a new Keurig model coming out. But is security really where we should be skimping?

I'll give a little background here. I am a cheap developer. I'm talking really REALLY cheap. I've built internal products and libraries off the clock just so that we don't have to pay for external products and services. I've made vendors bend over backwards and give 50%-75% discounts just to get our business. I don't bill gas when I have to travel all over for meetings, trainings, and other whatnot. My office chair is practically falling apart from sheer age and wear while most of my other coworkers replace their workspaces almost bi-yearly. Hell, I do the most high-end computer work in the office yet have the cheapest computer. Sure, I'm going full clown by trying to make my workplace better than when I joined, but I also don't want to be the one whose name shows up when the boss man starts asking where expenses are coming from. But security is the one thing that I just can't get anybody with a GL code to even acknowledge.

But you're a developer. Shouldn't you know all this security stuff?

I'm not saying I don't know security. I know quite a bit, to be honest. Sure, I can't quote OWASP's security recommendation of the hour, but I'm not completely out of the loop. Most of what I know is simply because I have to touch pretty much every inch of our software architecture (and some of the hardware depending on who feels like working that day). But I know what I know and I know that I'm not an expert in this field. I'm constantly learning and implementing new security procedures, patches, and policies to further secure my environment. I'd say 75% of the security work I do should be falling under our network engineers (yes, I'm even managing network and server infrastructure), but they're far too busy letting SSL certificates expire and not letting anyone else be on the reminders for them.

For the love of God, don't let SSL certs expire. Put everyone in IT on the reminder if you have to.
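If the reminder emails keep dying in somebody's inbox, a small script on a schedule can do the nagging instead. What follows is only a minimal sketch using Python's standard library. The function names and the idea of looping over your own host list are my assumptions, not anything our network team actually runs.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's 'notAfter' timestamp.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Connect to host:port and return the certificate's 'notAfter' field.

    With the default context, an already expired certificate fails the
    handshake with an ssl.SSLError, which is its own kind of alert.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Hypothetical usage (needs network access, host name is a placeholder):
#   remaining = days_until_expiry(fetch_not_after("intranet.example.com"))
#   if remaining < 30:
#       print(f"Certificate expires in {remaining:.0f} days")
```

Run something like this daily against every host you care about and send the warning to a list with more than one person on it.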

But even with all this effort being put in, I still couldn't do 10% of what someone who's truly an expert in the realm of security could.

"Shouldn't you bring this up to the network engineers? A large amount of this definitely falls under their responsibilities." Is this the part where I start laughing? No? Okay sure. I'll go talk to them. "Yeah, we're real busy with this project that, you know how it is. Send me an email and I'll get around to it." Ah, yes. The email inbox which you recently bragged about hitting 50,000 unread messages. The one where system-critical messages are squirreled away so that we can find out about them two weeks later. Cool. Yeah. I'll send an email there.

The fun fact that I've learned about network engineers in my own personal experience is that they don't care unless they are told to care. Maybe others have much better interactions. I hope so for the sanity of everyone everywhere, but I've yet to see it and I would classify them with bigfoot and unicorns at this point.

"Have you tried bringing this up with higher ups?" The issue with managers is that they are terrified that something horrible will be exposed and that they will be sent packing. In their mind, this purposeful ignorance is a survival tactic. From the point of view of a specialist, it's oftentimes hard to understand that 95% of managers fall into the "Yep, I've managed a thing at one time" category and HR departments everywhere can plug the first one that meets the bare minimum requirements in.

I've learned over time that managers aren't as worried about getting executed, but rather, how long they can smoke that final cigarette before the trigger is pulled.

I don't even feel as if I'm asking for that much, to be honest. I'm simply pushing for us to have a security audit. "Wait. Just a security audit? Are you implying that you've never had a security audit?" most may ask, and all I can reply with is a sigh and a nod at this point. I'm managing a spaghetti of questionable project decisions and archaic vendor applications from long before I started, with a nice layer of swiss cheese security on top of it, while I wait for the day of the guillotine. Honestly, the entire department should be bringing up the same concerns. But we all know how it goes.

If executives don't understand, management doesn't care, and if management doesn't care, then that money is spent on Zoom business licenses. (Even though we've never used features outside of the Basic plan.)

At this point I'm not sure if it's the ignorance or obsolescence of those in charge. Honestly, I don't even care anymore. We live in a world where organizations are getting hit by attacks every day. We get cryptolocker'd on a regular basis and luckily it's only been very limited accounts from low-level users where we can just purge and move on, but one day, I'll be writing this from an unemployment line even though I'm the loudest canary that any coal miner has ever heard. I guess the point that I want to make here is,

if you're not auditing, you're not living

That might seem a little extreme, but knowing you can come into work any day and be fired for what was probably an incredibly simple-to-remedy security issue is quite terrifying.

2020.12.16

Building a Better API

It's more than just shoving out data.

Everyone has them. You know who. That vendor that replies to your support tickets on four-week intervals, treats you like a bacterium, and generally seems to despise you as a customer even though you are giving them money for their service. You're probably already gritting your teeth just thinking about them.

Then your managers come to you and ask you to "integrate" with these vendors to get some data. The managers that have neither care nor need to know how atrocious this vendor is nor any idea what a "data" or "api" even is. They'll come in and say, "Alright, this vendor says they have an API for their customer database we have with them. They can't make the reports we need so you'll just need to figure it out." At this point they'll walk out of the room, feeling proud of their accomplishment of using acronyms and go back to pretending that replying to an email is a difficult and technical job.

Now you're sitting here with a one page printout of a marketing sheet that the vendor provided for their API with no reference to any documentation. You'll head back to your desk and go to the vendor's "next-gen" site, filled with fancy buzzwords like "HTML 4.0" and "Java" and probably built when Netscape was still relevant, to search for the mythical API documentation. After much searching and many dead ends, you eventually find a PDF on Google that someone uploaded several years ago on their personal blog after they had battled with the vendor's first four levels of support technicians to get it.

You open up this closely guarded book of secrets expecting some spark of brilliance or order, but alas, you are immediately kicked in the shins by one of the foulest creatures you have ever laid eyes upon. You sigh, then immediately scroll down to the customer endpoints because that's what your boss needs to show the executives that "number go up", so that they can all nod and give themselves raises.

As a side note, let's say this vendor is holding a database of about five million customers.

You finally scroll through the endlessly detailed endpoints for things that doubtfully anyone would ever use and you lay eyes upon the customer endpoint.

/customers/

parameters:

  • deceased
    • true/false

{
    "customers": [
        1,
        2,
        3,
        ...
    ]
}

You dry-heave in your mouth knowing the evil that has been given to you. A thoughtless design by a vendor attempting the absolute bare minimum. An interaction we all know too well. No pagination, no details, no filters of value, not even an ounce of operational value. A raw skeleton of apathy towards the consumer. You query it out of desperation, hoping that the documentation is out of date. Your Postman spins and spins and spins. You spend a bit working on some tickets sent by an aggravated user that is mad because they pressed "delete" and the thing was deleted. Eventually you come back to this task.

"Status: 200 OK Time: 21 minutes, 32 seconds Size: 3.13 MB"

"Alright this is doable" you try to reassure yourself. You then look for the individual customer endpoint.

/customers/{id}

{
    "id": 1,
    "links": {
        "demographics": "/customers/1/demographics",
        "billing_profile": "/customers/1/billing_profile",
        "orders": "/customers/1/orders"
    }
}

Here is where you know everything is going to go sideways. One "pull" of the customer database would be a minimum of 15,000,001 queries: one for the list, plus three linked sub-resources for each of the five million customers. Admittedly these endpoints would be able to crawl in a blistering 2 seconds each. But that's okay, we live in a modern era with fancy technology such as multi-threading, don't we? But this is where the pain sets in. API request limits.

"Maximum of 1 request per 5 seconds."

The math immediately hits you like a brick. That's a maximum of 17,280 calls in a day. We're talking about nearly two and a half years to pull all the customer data. About this time your boss comes by and asks, "What do you think? We'll have it by next week, riiigghhhtt? I already went ahead and told my boss that this will be an easy integration." You try to explain basic mathematical calculations to him, forgetting that numbers above one hundred are big and scary. Your manager mentally clocks out, smacks the back of your chair, chuckles, makes a bad joke about college football, then heads out on his four hour "working" lunch.
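For anyone who wants to check the brick, here is the arithmetic as a short sketch. It assumes the full pull is one list call plus the three linked sub-resources per customer, skipping /customers/{id} itself since the links are predictable. The constant names are mine.

```python
CUSTOMERS = 5_000_000          # vendor's customer count from the side note
SUBRESOURCES_PER_CUSTOMER = 3  # demographics, billing_profile, orders
SECONDS_PER_REQUEST = 5        # rate limit: 1 request per 5 seconds
SECONDS_PER_DAY = 24 * 60 * 60

# One call for the id list, then three calls for every customer.
total_calls = 1 + CUSTOMERS * SUBRESOURCES_PER_CUSTOMER
calls_per_day = SECONDS_PER_DAY // SECONDS_PER_REQUEST
days_for_full_sync = total_calls / calls_per_day

print(f"{total_calls:,} calls")                 # 15,000,001 calls
print(f"{calls_per_day:,} calls per day")       # 17,280 calls per day
print(f"{days_for_full_sync:.0f} days")         # 868 days
print(f"{days_for_full_sync / 365:.2f} years")  # 2.38 years
```

That is with a single API key running around the clock and never hitting an error or a retry.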

You know that there is no hope in contacting the vendor as they still haven't replied to the last six tickets and twelve phone calls that you have entered in the past three months for system breaking issues. Your next hope is to contact the report writers and ask what data they actually need. Maybe we can shave down some API calls? Maybe we don't need to know that they have a Visa versus a Mastercard set up?

"Yes, we need absolutely all the data that you can get."

You're reminded that you're little more than a magical puppy that can puke out anything that anyone could ever wish for as long as you receive enough kicks.

But now the most painful dawning realization hits you. The one that hurts. Most of you have probably already seen it. Once you sync the entire customer database, you'll have to start all over again. You're unable to get a list of customers that have changed since the last sync. So every time you do a sync, you have to do a FULL sync. No partials. You can already hear the voice of the vendor right now:

"Doing a date modified filter would be incredibly difficult under our database architecture. I'm not even sure how high the bill would be for such an enhancement." - Every Vendor Ever

Now you're peeved. You're going to make this vendor pay for their sins. You create a second user account with a second API key. You run both at the same time. No conflict. Thank God that they're just as terrible at programming everything else as they are at programming their APIs. Here's the shining star in your dark night. You find the API endpoint for creating users, wire it up, and spit out 10,000 new users as fast as you can. At this point you have sheer horsepower on your side. Time to put those enterprise self-hosted virtual servers to work like never before. With sheer determination and a prayer that the vendor doesn't block this workaround, you're able to pull 172,800,000 calls a day, which on paper turns that two-and-a-half-year sync into roughly two hours.

You kick back, call yourself a magician, and pretend that anyone in your office knows your first name while your multi-threading monster is causing havoc in some network engineer's office.

But I digress. The problem identified here is something that really should never have existed in the first place. Just a few simple ounces of effort could have made this entire process a far more efficient expense of time for everyone involved. The biggest question to ask yourself when writing an API is always, "what are people going to do with this?"

Another fun one to ask is, "Will this make someone want to strangle me in a dark alley?"

With many vendors it often seems that this task is delegated to the greenest college graduate they can find, whose only familiarity with an API is a definition on Wikipedia.

For example, the original customer pull could be greatly improved with only a handful of incredibly simple changes. Just adding a date modified filter and pagination would dramatically increase the productivity of this endpoint. And don't you dare set the max page size to 100 for an endpoint that hits millions. You know who you are. Then the customer by id endpoint could be simplified by moving the most common data from the subqueries up to the parent, such as the name, date of birth, and maybe the most recent order. Things that generally everyone would be looking for.
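To show how little effort the first two changes take, here is a rough sketch of a list endpoint with a date modified filter and pagination. Everything in it is hypothetical: the in-memory CUSTOMERS list stands in for the vendor's table, and the parameter names (modified_since, page, page_size) and response fields are my own guesses at a sane shape, not anything the vendor offers.

```python
from datetime import datetime
from typing import Optional

# Hypothetical in-memory stand-in for the vendor's customer table.
CUSTOMERS = [
    {"id": 1, "modified": datetime(2020, 11, 1)},
    {"id": 2, "modified": datetime(2020, 12, 1)},
    {"id": 3, "modified": datetime(2020, 12, 10)},
    {"id": 4, "modified": datetime(2020, 12, 15)},
]


def list_customers(modified_since: Optional[datetime] = None,
                   page: int = 1, page_size: int = 1000) -> dict:
    """Return one page of customer ids, optionally only those changed since a date."""
    matches = [c["id"] for c in CUSTOMERS
               if modified_since is None or c["modified"] >= modified_since]
    start = (page - 1) * page_size
    chunk = matches[start:start + page_size]
    has_more = start + page_size < len(matches)
    return {
        "customers": chunk,
        "page": page,
        "total": len(matches),
        "next_page": page + 1 if has_more else None,
    }


print(list_customers(modified_since=datetime(2020, 12, 1), page_size=2))
# {'customers': [2, 3], 'page': 1, 'total': 3, 'next_page': 2}
```

With a filter like this, a nightly sync only touches the customers that changed since the last run instead of all five million.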

And these are just changes that can be done from a very structured point of view, but we can get so much more creative and flexible than this. Imagine this endpoint:

/customers/{id}

parameters

  • demographics: true/false
  • billing_profile: true/false
  • recent_orders: true/false
    • limit_recent_orders: 10

{
    "id": 1,
    "demographics": {
        "first_name": "",
        "last_name": ""
    },
    "billing_profile": {
        "type": ""
    },
    "recent_orders": [1,2,3,4,5]
}

This is an incredibly flexible API that can work well with an entire spectrum of use cases and needs. And odds are, from an infrastructure point of view, this will almost always place less stress on both systems involved than running four separate queries. Don't need the billing_profile section? Set it to false. Need only the most recent order? Awesome. limit_recent_orders=1.
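On the server side, the handler behind an endpoint like this could be as small as the sketch below. The dictionaries are made-up placeholder data standing in for the vendor's tables, and get_customer is a hypothetical name. The parameters mirror the ones listed above.

```python
# Hypothetical placeholder records standing in for the vendor's tables.
DEMOGRAPHICS = {1: {"first_name": "Ada", "last_name": "Lovelace"}}
BILLING = {1: {"type": "visa"}}
ORDERS = {1: [105, 104, 103, 102, 101]}  # order ids, newest first


def get_customer(customer_id: int, demographics: bool = True,
                 billing_profile: bool = True, recent_orders: bool = True,
                 limit_recent_orders: int = 10) -> dict:
    """Build one response containing only the sections the caller asked for."""
    response = {"id": customer_id}
    if demographics:
        response["demographics"] = DEMOGRAPHICS.get(customer_id, {})
    if billing_profile:
        response["billing_profile"] = BILLING.get(customer_id, {})
    if recent_orders:
        orders = ORDERS.get(customer_id, [])
        response["recent_orders"] = orders[:limit_recent_orders]
    return response


print(get_customer(1, billing_profile=False, limit_recent_orders=1))
# {'id': 1, 'demographics': {'first_name': 'Ada', 'last_name': 'Lovelace'}, 'recent_orders': [105]}
```

One request, one response, and the caller decides how much of the customer they actually want to pay for.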

Being an API developer doesn't have to be an abstract witchcraft of smoke and mirrors. Sometimes it's just as simple as asking yourself, "would I want to query this?" If the answer is no, then you should probably take a good look at what you have and make a better version. And if you're able to ask this before the API hits production, you can go ahead and prevent the pain now.