So it’s been a while since I posted here, but my wife and I have been settling into Austin and have been crazy busy getting things moving with Giftnix. Last week, however, I had some major hiccups with my servers running on Digital Ocean: I started seeing timeouts on ALL external API calls. That was a serious problem, because communicating with external APIs is an integral part of what my web server does. The problem would start like this.
All external calls would time out within seconds of being sent, yet I could SSH into the server just fine. Only outgoing connections from my server were affected. To fix the issue I tried rebooting the server and restarting the Node JS process, but time and again I hit the same timeouts. Finally, seemingly at random, the server began working again. Nothing on my side had changed, so I continued to watch the issue.
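When inbound SSH works but outbound calls time out, it helps to see which outbound step is failing. Here is a minimal sketch of one way to check that, written in Python rather than in the Node JS app itself. The function name `probe_outbound` and the result fields are illustrative and not part of the setup described here.

```python
import socket
import time


def probe_outbound(host, port=443, timeout=5.0):
    """Check DNS resolution and TCP connect separately for one external host."""
    result = {"host": host, "dns_ok": False, "tcp_ok": False,
              "seconds": None, "error": None}
    start = time.perf_counter()
    try:
        # Step 1: name lookup. A failure here points at the resolver.
        # Note: getaddrinfo takes no timeout argument, so a stuck resolver
        # can block for longer than `timeout`.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["dns_ok"] = True

        # Step 2: TCP connect to the first resolved address, with a hard timeout.
        family, socktype, proto, _canonname, sockaddr = infos[0]
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        result["tcp_ok"] = True
    except OSError as exc:
        # socket.gaierror, timeouts and refused connections are all OSError subclasses.
        result["error"] = f"{type(exc).__name__}: {exc}"
    result["seconds"] = time.perf_counter() - start
    return result
```

Running something like `probe_outbound("example.com")` (a placeholder host) on both the affected server and a healthy machine at the same time would show whether the failure is in name resolution or in opening the connection. It cannot say why the network path is broken.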
While all this was going on, I was running the same Node JS process on my local workstation without issue. I finally found that the only thing that solved the problem was spinning up a new server, and only if it was in a different data center. (I have a hunch that Digital Ocean was putting me on the same hardware as my other VPS, because when I spin up new servers in the same data center I still get the same IP for the newly created servers.) I would spin up a new Digital Ocean server in a different data center and the issues would stop.
I still have no idea what caused these issues, but after multiple days of timeouts starting up in the middle of the night I was ready for a change. My local servers don’t have issues, and Digital Ocean works great as long as I move data centers when the problem pops up. The server then magically works until the problem appears in the new data center, and then I’d move to another. That is quite frustrating, especially as our web server would effectively be down for hours at a time.
The server has also had no major changes or upgrades in 6 months, so it didn’t seem to be anything related to the server itself. It might be fine for a few days and then have issues, or it might break and recover repeatedly every 10 minutes. One time the issue cropped up and then fixed itself within 15 minutes without me touching the process (I was monitoring it, and the system did NOT restart), and then the same issue started again 3 days later.
After moving over to Linode and testing out another data center, though, I am convinced that something is wrong with Digital Ocean’s networking, and this graph really was the icing on the cake for me. After migrating our system to a similar machine on Linode, I went to New Relic a while later and noticed a major change in my app’s performance:
Can you tell what time I switched over? 5:30pm. This is embarrassing to me and to Digital Ocean. I had known my server was slower than I’d like, but it relies on third-party APIs that had always been slow, and I excused that as a problem I couldn’t solve. Just by migrating to Linode, though, my external API calls improved tremendously, and so my own users’ calls were now 2 to 10 times faster for almost all requests. At first I thought that perhaps Linode has a better connection to the outside world than Digital Ocean, but that doesn’t explain it: when I ran my server locally again, my Mac was making external API calls at speeds similar to Linode’s, and nowhere near as slow as Digital Ocean. I wish I could pull the graphs to show whether it has always been this slow or to find when this networking issue began, but I don’t pay for that level of service at New Relic.
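A comparison like this can also be made without a monitoring service by timing the same external call from each host. The following is only a rough sketch in Python, not the measurement behind the graph above. `time_request` and `summarize` are illustrative names, and the URL passed in would be whichever third-party endpoint the app depends on.

```python
import statistics
import time
import urllib.request


def time_request(url, timeout=10.0):
    """Return elapsed seconds for one GET, or None on a network-level failure."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None
    return time.perf_counter() - start


def summarize(samples):
    """Summarize a list of timings where None marks a failed request."""
    ok = sorted(s for s in samples if s is not None)
    return {
        "requests": len(samples),
        "failures": len(samples) - len(ok),
        "median_s": statistics.median(ok) if ok else None,
        "max_s": ok[-1] if ok else None,
    }
```

Collecting, say, 20 samples with `time_request` on each machine and comparing the `summarize` output side by side gives a crude version of the before-and-after picture: a much higher median or a pile of failures on one host points at that host’s outbound path rather than at the third-party API.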
So what’s the problem? I don’t know for certain, but I believe that Digital Ocean has a networking problem. If you have any ideas, please hit me up. I love Digital Ocean’s offering, but something is wrong with how their servers communicate with the outside world. I still need to spin up another brand new server and configure it from scratch again, but I’ve seen a few other blogs touting Linode’s performance in other areas as well. Digital Ocean may be better than Heroku, but until they get their external network connection issues solved, I’ll be found on Linode.