Degraded

MachShip Performance / Access Issues

Sep 11 at 09:04am AEST
Affected services
MachShip Live API
MachShip Carriers
FusedShip

Resolved
Sep 11 at 09:19pm AEST

Hi All,

This is Mike McKay, the CEO at MachShip.

Firstly, can I thank you all, our valuable customers, for your patience today as we worked to resolve the issues with the "slowness" of the system.

As you know, MachShip is generally a very stable and performant platform and we constantly strive to have our uptime and reliability exceed our customers' expectations. Unfortunately, today we have fallen short with the issues we faced this morning.

Over the weekend, we moved our hosting from one provider to another, much larger provider, to get us ready for future scalability and security requirements.

When moving systems between hosting providers, often there are small differences in the way things work that are not easily identified, and often do not present themselves until there is a significant load on the platform.

As MachShip is B2B software servicing the Australian market, our significant loads are seen primarily in the mornings, between 6am and midday, Melbourne time, as that is when most of the bookings and such are being added by you, our customers.

We moved the platform over to the new hosting on Friday night, and over the weekend we fixed the small issues we could identify and did as much testing as we could to ensure there would be no issues this morning when the load picked up. Unfortunately, one issue remained that we were not aware of, and it caused the slowness this morning and into this evening.

Long story short, the issues we faced today were related to our database and, in particular, its ability to handle the number of requests it needs to during our busy periods. Our new hosting required that we tweak some fairly obscure settings to allow our database to handle that load. Unfortunately, computers (and in particular, virtualised servers) being the extremely technical and at times challenging beasts that they are, even for experts, it took us many hours of trial and error to finally find a fix.

Once we found that fix and rolled it out, we saw the performance of the platform return to its usual speedy nature, and at that point we considered the issue fixed.

We sincerely apologise to you, our customers, and to our extended user base for the inconvenience that this no doubt caused to you today.

Please rest assured that we will continue to closely monitor our systems, and whilst we don't anticipate further issues, we will be right on top of them should they pop up.

I would like to take the opportunity to provide some further background as to why we're making these changes and what we've changed; here are some of the major ones:

  • We have moved our hosting to Equinix, one of the largest datacenter providers in the world, to ensure that we can scale our services effectively without limit into the future, from both a volume and geographic perspective.
  • We have moved to more advanced storage that will allow us to speed up our systems going forward, and to allow us to have more control and visibility over our underlying infrastructure.
  • We have moved to using BGP for routing traffic into our network, moving us closer to the core internet infrastructure and allowing us to host our services in different places without the need to change our IP addresses going forward.
  • We have placed Cloudflare's enterprise protection services in front of our systems to provide more security and protection from attacks such as DDoS.
  • We moved our services from a Melbourne location to Equinix's Sydney datacenter, as step one of our move.
  • We will be, in the near future, having a full copy of our systems and data running in Equinix's Melbourne datacenter, to provide physical redundancy.
  • We have different internet providers in both sites to ensure that no problems with a single internet provider can bring MachShip down, and so that we can route around those issues if that ever does happen.

There are many more changes coming in the near future, especially focussed on redundancy and security, to ensure that we can provide you, our customers, with a product that you can continue to trust.

Should you have any further questions relating to the outage or what we're planning to do in the future, please feel free to reach out to our support team.

Once again, please accept our sincerest apologies for the inconvenience caused to you and your customers this morning.

Michael McKay

CEO

MachShip

Updated
Sep 11 at 09:13pm AEST

The team wishes to advise that the issues affecting the performance and stability of the platform have been resolved.

Clients should now be seeing regular performance on the platform.

A post-mortem for this issue will be posted shortly.

Updated
Sep 11 at 08:41pm AEST

Update.

Work is continuing in the effort to resolve the issues that have caused performance and stability problems on the MachShip platform since early Monday, September 11th, 2023.

We will continue to provide updates as we work towards a permanent fix.

Updated
Sep 11 at 04:47pm AEST

Update:

The issue causing performance and stability problems within the MachShip platform still exists within the system.
The teams working on a solution will continue their efforts to find a fix and restore stability.

We will provide further updates when available.

Updated
Sep 11 at 02:32pm AEST

As a further update - 2:32pm

Work is continuing to apply a fix to the issue.

At this time, the MachShip team is unable to provide an ETA on when clients will begin to see restored performance and stability.

When information is made available to our support team, they will be sure to advise.

Updated
Sep 11 at 01:23pm AEST

The work to fix the database performance issue is continuing.

Users will still be experiencing poor performance and stability from the MachShip platform.

As advised previously, this issue is being worked on by the entirety of the MachShip backend teams, and we hope to have a resolution for the customer base soon.

Updated
Sep 11 at 11:53am AEST

As a further update, the issue has been identified and isolated to MachShip's database and how it is performing.

Currently, our senior team, as well as core members of the infrastructure and database teams, are working to further isolate and fix the issue.

Updated
Sep 11 at 11:23am AEST

As an update, the issues affecting performance on the MachShip platform, which have been felt since early this morning, are still being seen.

The senior team is in the process of working through a resolution plan to restore the stability of the system.

We will continue to provide relevant updates as they become available.

MachShip understands the impact this incident is having on users and appreciates your patience as we work to return to normal performance.

Updated
Sep 11 at 09:26am AEST

As an update, this P1 incident is being investigated by all senior members of the MachShip team.

We will continue to provide updates as they are available.

Created
Sep 11 at 09:04am AEST

There are reports of performance and access issues with the MachShip application.

The team is investigating this, and we will provide advice as it becomes available.