The memory test results came back from Synology. The results were both conclusive and inconclusive.
They proved that the environment is stable and not under any strain. But they were unable to demonstrate why CPU utilisation was close to 100%.
There were no rogue (or non-rogue) services or processes spinning (and thus eating up CPU).
There were no jobs ticking along in either TSR or background.
And there were no previously unknown index-based utilities doing indexy-related things.
So it was a bit of a mystery as to why CPU should be almost maxed out, yet delivery performance was unaffected by utilisation in the high 90%s (because, remember, RAM utilisation was in the normal band).
I spent a couple of days wondering what could eat up CPU yet leave RAM relatively untroubled.
I was travelling to work, a few days ago, when I remembered a conversation I had with my friend Avril, many years ago – about processing architecture and IP packets.
Avril was a network genius. She holds a number of networking patents, and has worked at BT's R&D division in Suffolk in a senior capacity. She also lectured, part time, at UWE on Real Time Information Systems.
We had a conversation, years ago, about processor architecture and how some types of processors are designed to dump processes into RAM, whilst others will handle simple positive/negative steps themselves.
It’s as clear as crystal, if you remember the big RISC v Intel debate.
I looked up the architecture specs for the processors in the NAS and discovered that they are designed on the RISC SPARC model.
Light dawned.
So it was entirely possible that the processors were performing simple RAM-type processes, without actually dumping them into RAM to be dealt with there.
On the way home I decided to check the router logs.
[later]
A look at the router logs confirmed what I’d thought: I was under a DDoS attack.
The router was stopping 95% of the probes, but the 5% that were getting through were enough to be eating up almost all of the CPU. The NAS was rejecting the 5%, but that yes/no process was where the CPU utilisation was going.
At the peak of the attack the router logs indicated that my infrastructure was receiving c. 500 probes per minute – so many probes that the router log couldn’t record them all; it was filling up and over-writing itself every few minutes.
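To put rough numbers on that, here is a back-of-envelope sketch. It assumes the c. 500 probes per minute is what arrives at the router, and that the 95% figure applies evenly across that traffic; both are simplifications, and the function name is just for illustration.

```python
def probes_reaching_nas(probes_per_minute, router_block_percent):
    """Estimate how many probes per minute get past the router.

    Both inputs are rough figures read off the router logs, so the
    result is an estimate rather than a measurement.
    """
    passed_percent = 100 - router_block_percent
    return probes_per_minute * passed_percent / 100


# Figures from the peak of the attack: c. 500 probes/min, 95% stopped.
reaching_nas = probes_reaching_nas(500, 95)
print(f"Roughly {reaching_nas:.0f} probes per minute reach the NAS")
```

On those assumptions, roughly 25 probes a minute are left for the NAS to reject.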
Unfortunately, effective though the router’s firewall is, it isn’t configurable, so I have to rely on its defaults, and beef up the second layer of security on the NAS (which I’m using as a secondary firewall and DHCP server/router).
So I figure that I just have to sit tight and wait out the DDoS attack.
But here’s a thing.
Do you know how much traffic your internet router is receiving?
I doubt many people bother to check.
But crack open the logs and have a look. I’m betting you’ll be surprised at what’s going on behind the scenes.
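If your router lets you export its log as text, a few lines of Python will tally what it has been blocking. This is only a sketch: the log line format below is made up for illustration (every router writes its logs differently), so the regular expression would need adjusting to match whatever your router actually produces. The sample addresses are from the reserved documentation ranges.

```python
import re
from collections import Counter

# Hypothetical log format, assumed for this example only:
#   2024-01-01 10:00:01 DROP SRC=203.0.113.7 DPT=22
# Captures the timestamp down to the minute, and the source address.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2} DROP SRC=(\S+)")


def summarise(lines):
    """Count dropped probes per minute and per source address."""
    per_minute = Counter()
    per_source = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # skip anything that isn't a DROP entry
            per_minute[match.group(1)] += 1
            per_source[match.group(2)] += 1
    return per_minute, per_source


sample = [
    "2024-01-01 10:00:01 DROP SRC=203.0.113.7 DPT=22",
    "2024-01-01 10:00:30 DROP SRC=203.0.113.7 DPT=23",
    "2024-01-01 10:01:05 DROP SRC=198.51.100.9 DPT=80",
    "2024-01-01 10:01:09 ACCEPT SRC=192.0.2.1 DPT=443",
]

per_minute, per_source = summarise(sample)
for minute, count in sorted(per_minute.items()):
    print(minute, count)
for source, count in per_source.most_common(5):
    print(source, count)
```

To run it against a real export, you would pass an open file to `summarise` in place of `sample`. Bear in mind that if the log is over-writing itself every few minutes, the counts only cover whatever survived in the file.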