1

Looking at the output of Munin's diskstats plugin (which is reading from /proc/diskstats) I'm noticing something that seems peculiar to me. The disk is a SSD, and I would assume, magnetic or solid state, that latency would increase during periods of heavy writes. Why does it instead decrease?

Disk latency per device

Throughput per device
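For reference, here is a minimal sketch of roughly how a per-request latency figure like the one graphed above can be derived from the /proc/diskstats counters (the device name and sampling interval are just placeholders for this example):

    # Sketch: approximate per-request latency from /proc/diskstats deltas.
    # Assumes the classic 14-field layout (Documentation/admin-guide/iostats.rst).
    import time

    DEVICE = "sda"  # illustrative device name

    def read_counters(device):
        with open("/proc/diskstats") as f:
            for line in f:
                fields = line.split()
                if fields[2] == device:
                    reads, read_ms = int(fields[3]), int(fields[6])
                    writes, write_ms = int(fields[7]), int(fields[10])
                    return reads, read_ms, writes, write_ms
        raise ValueError(f"device {device!r} not found")

    r1, rms1, w1, wms1 = read_counters(DEVICE)
    time.sleep(10)  # sampling interval, illustrative
    r2, rms2, w2, wms2 = read_counters(DEVICE)

    # Average time per completed request over the interval, in ms.
    read_lat = (rms2 - rms1) / max(r2 - r1, 1)
    write_lat = (wms2 - wms1) / max(w2 - w1, 1)
    print(f"avg read latency:  {read_lat:.2f} ms")
    print(f"avg write latency: {write_lat:.2f} ms")

Since this is total time spent divided by requests completed, it is a per-request average, not a measure of total load.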

1
  • Some SSDs with TLC chips use faster MLC chips as a cache; if this cache is full, performance drops. Check whether your SSD has such a configuration. Jul 9, 2018 at 14:56

4 Answers

2

Are the writes random or streaming?

I assume they're contiguous, streaming "sequential" writes, because I/O latency is always lower with sequential I/O. Latency is higher for random I/O because SSDs still have some latency under bursts of I/O, caches aren't as effective, there may be background garbage collection on the SSD that has to be interrupted, etc.

Read more: https://www.seagate.com/tech-insights/lies-damn-lies-and-ssd-benchmark-master-ti/

2
  • No idea; it was a script inserting into the DB that ran away. It should have only been a few hundred rows but ended up being tens of thousands.
    – user476819
    Jul 6, 2018 at 4:59
  • Which is likely mostly linear: log writes, new allocations, and data going into those files.
    – TomTom
    Jul 6, 2018 at 6:14
1

that latency would increase during periods of heavy writes. Why does it instead decrease?

Because you do not have heavy writes. It looks like your heavy writes are LARGE, while in the other periods you were doing a lot of small things.

See, what is missing here is an IOPS counter. Storage is generally more IOPS limited than throughput limited. If you do a few large I/Os (copying large files), this is more efficient, even for an SSD (replacing the whole content of a cell is faster than reading the cell, changing some bytes, and writing the whole cell back).
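To see whether a given period was a few large requests or many small ones, the same /proc/diskstats counters can be turned into an IOPS figure and an average request size; a rough sketch (device name and interval are placeholders; these counters report sectors in 512-byte units):

    # Sketch: IOPS and average request size from /proc/diskstats deltas.
    # Reuses the 14-field layout; sector counts are 512-byte units here.
    import time

    DEVICE = "sda"  # illustrative

    def snapshot(device):
        with open("/proc/diskstats") as f:
            for line in f:
                fields = line.split()
                if fields[2] == device:
                    ops = int(fields[3]) + int(fields[7])      # reads + writes completed
                    sectors = int(fields[5]) + int(fields[9])  # sectors read + written
                    return ops, sectors
        raise ValueError(f"device {device!r} not found")

    ops1, sec1 = snapshot(DEVICE)
    time.sleep(10)
    ops2, sec2 = snapshot(DEVICE)

    iops = (ops2 - ops1) / 10
    avg_kb = (sec2 - sec1) * 512 / 1024 / max(ops2 - ops1, 1)
    print(f"IOPS: {iops:.0f}, average request size: {avg_kb:.1f} KiB")

A large average request size at modest IOPS points to streaming writes; tiny requests at high IOPS point to the kind of scattered I/O that pushes per-request latency up.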

We would need to analyze what you did, but it sort of looks like one large I/O operation was overshadowing the rest. Did you possibly have VMs on the server that you shut down? No one can really know without knowing what happened there at that time.

Btw., for real hardware with an SSD that latency is normally HORRIFIC. With that throughput and an SSD I would expect WAY lower than 1 ms latency.

I am just looking at a smaller storage unit doing around 1000 IOPS at the moment - less than 1 megabyte, the ton of small I/O you get when you run 40 or so idle VMs... write is at 1.85 ms OVER THE NETWORK (!), read at 5.25 us (not ms, us). Even your min values are extremely high. Your writes generally are in the area I get when accessing an SSD over a 1 gigabit network.

4
  • > Given that SSD means Solid State Disc, please explain what you would assume with magnetic here. With a magnetic disk, you have to wait for the platter to "come around" to what you're reading or writing, plus the overhead of moving the head from one area of the platters to another. Not so with an SSD, where every sector is available in O(1) time.
    – user476819
    Jul 6, 2018 at 4:58
  • Again, explain what you mean in the sentence (!): "The disk is a SSD, and I would assume, magnetic or solid state". If the disc is SSD, it is SSD and not magnetic. Simple. Either it is SSD or not.
    – TomTom
    Jul 6, 2018 at 6:14
  • @TomTom I think you misunderstood the sentence structure of the question. It happened to me the first time I read it, too. "This disk is a[n] SSD" is separate information from what follows: "I would assume, magnetic or solid state, that latency would increase during periods of heavy writes." Rephrased, the sentence would be along the lines of "This disk is an SSD. I would assume, no matter whether a drive is magnetic or solid state, that latency [...]."
    – Maz
    Jul 6, 2018 at 13:10
  • Hm, possibly. I edited out that part of my answer.
    – TomTom
    Jul 6, 2018 at 13:26
1

SSDs have a strange behavior with regard to writes that may cause the drop in latency. They keep the first few MB of data in an internal write cache before committing it to flash. This means the first few writes see a very short latency, since each write is acknowledged once the data hits the internal SRAM, before it has reached the flash media. The data is then written to the media quickly (usually in parallel to multiple flash dies over multiple channels), and the SSD is ready for another batch like that. If your writes build up a large queue at the SSD, you will see the latency rise; if your writes come in short bursts, you will see a very large reduction in latency, since each burst is absorbed by the SRAM.
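A toy model of that effect, with made-up numbers (a 4 MB buffer draining to flash at 500 MB/s, 0.05 ms acknowledgement while the buffer has room, 2 ms once it is full):

    # Toy model: write acknowledgement latency with a small internal buffer.
    # All numbers are illustrative, not taken from any real SSD.
    BUFFER_MB = 4.0      # assumed cache size
    DRAIN_MBPS = 500.0   # assumed sustained flash write speed
    FAST_MS, SLOW_MS = 0.05, 2.0  # ack latency with / without room in the buffer

    def simulate(writes):
        """writes: list of (arrival_time_s, size_mb); returns average ack latency in ms."""
        buffered = 0.0
        last_t = 0.0
        latencies = []
        for t, size in writes:
            buffered = max(0.0, buffered - (t - last_t) * DRAIN_MBPS)  # background drain
            last_t = t
            latencies.append(FAST_MS if buffered + size <= BUFFER_MB else SLOW_MS)
            buffered += size
        return sum(latencies) / len(latencies)

    # Short bursts with idle gaps vs. a sustained stream of the same total volume.
    burst = [(i * 0.5, 1.0) for i in range(100)]        # 1 MB every 500 ms
    sustained = [(i * 0.001, 1.0) for i in range(100)]  # 1 MB every 1 ms
    print(f"bursty:    {simulate(burst):.2f} ms")
    print(f"sustained: {simulate(sustained):.2f} ms")

With these illustrative numbers the bursty workload is acknowledged almost entirely from the buffer, while the sustained stream fills it within a few writes and then sees flash-speed latency.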

0

The average moved on your graph. But beware of What the Mean Really Means. The distribution of latency almost certainly has outliers, is not normal, and has multiple modes.

Try getting raw response times and graphing them as a heat map. There will probably be a small number of outliers, and a cluster around 2 ms where the average stabilized. The sample size - the number of I/Os - changes how much each of those affects the mean.
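As a quick illustration of how the mix of requests can move the mean without any single request getting faster (both latency modes here are invented):

    # Sketch: a bimodal latency mix where only the *proportion* of slow requests changes.
    import random

    random.seed(0)

    def mean_latency(n_fast, n_slow, fast_ms=0.2, slow_ms=8.0):
        samples = [random.gauss(fast_ms, 0.05) for _ in range(n_fast)] + \
                  [random.gauss(slow_ms, 1.0) for _ in range(n_slow)]
        return sum(samples) / len(samples)

    # Quiet period: mostly small random I/O, with a slow mode mixed in.
    print(f"quiet: {mean_latency(2_000, 1_000):.2f} ms")
    # Heavy streaming writes: the same slow requests are still there,
    # but they are swamped by a flood of fast sequential ones.
    print(f"heavy: {mean_latency(50_000, 1_000):.2f} ms")

The slow requests are identical in both cases; only the number of fast ones changes, yet the mean drops sharply.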

Also note that, depending on the implementation, writes can be faster than reads. A storage array or disk drive may acknowledge a write in non-volatile cache faster than it takes to read from the media.

