Performance testing for PDM servers

How are you testing the performance of your PDM servers?

Has anyone come up with any methods or scripts that can simulate some typical PDM workloads and output some numbers?

My use case is to see whether we could reduce RAM, core count, disk speed, etc. on some servers and still get the same or similar outcome, since I assume we are limited by network performance anyway. I know I could just check in some large assembly and use a stopwatch to time it, but that seems a bit cumbersome and not very scientific :nerd_face:

I’ve never had a good way to do load testing.

RAM: I’m fuzzy on SQL for this; it seems to use as much RAM as you throw at it. We have 64 GB, and I’m not sure what the impact of dropping to 16 GB would be. This is where a dedicated DBA would help, but our company won’t spring for one… so I’m going to overspec to cover my @$$.
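One way to test the "drop to 16 GB" scenario without touching hardware is to cap SQL Server's max server memory and watch the buffer pool under a normal workload. A minimal sketch using Invoke-Sqlcmd from the SqlServer module; the instance name is a placeholder, and the counter path assumes a default instance:

```powershell
# Cap SQL Server at 16 GB to simulate a smaller box.
# 'PDMSQL01' is a hypothetical instance name; adjust to yours.
Import-Module SqlServer
$instance = 'PDMSQL01'

Invoke-Sqlcmd -ServerInstance $instance -Query @"
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 16384; RECONFIGURE;
"@

# Watch Page Life Expectancy under load afterwards; a sustained drop
# means SQL actually misses the RAM. (Path is for a default instance;
# named instances use \MSSQL$<name>:Buffer Manager\... instead.)
Get-Counter '\SQLServer:Buffer Manager\Page life expectancy' -SampleInterval 15 -MaxSamples 20
```

The nice part is that it is reversible: raise max server memory again and SQL will grow back into it.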

Cores: You could monitor core usage; there are tools (perfmon, for one) to log it over time. We have other activities and other DBs running on the SQL server, so we need more cores.
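For the monitoring part, plain Get-Counter is often enough; a sketch that samples total CPU every 5 seconds for an hour and logs to CSV (adjust interval and sample count to taste):

```powershell
# Log overall CPU usage over time to a CSV for later graphing.
# 720 samples x 5 s = 1 hour of data.
Get-Counter '\Processor(_Total)\% Processor Time' -SampleInterval 5 -MaxSamples 720 |
    ForEach-Object {
        [pscustomobject]@{
            Time = $_.Timestamp
            Cpu  = [math]::Round($_.CounterSamples[0].CookedValue, 1)
        }
    } | Export-Csv cpu_log.csv -NoTypeInformation
```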

Disk Speed: Since there is a lot of file copying, I’d prioritize this. Users will feel network latency plus slow disk speed.

My limited experience with SQL Server is that it will take as much memory as you feed it before touching the disks. We have 96 GB of RAM on a dual-socket CPU at 2.1 GHz; I keep around 10 GB for the OS, and the rest is taken by the SQL process.
Our DB is a little over 10 GB, so it fits entirely in RAM; the rest is basically tempdb, database tables, and queries. 70~80GB…

For disk performance I made a script that creates and reads dummy files of various sizes. There is a lot to unpack, but since the server works mainly from memory, caching heavily affects physical access to the disks or SSDs: you have data in RAM, then in the RAID cache, and finally it hits the disks. I graphed some of the tests, and RAID 10 is faster than RAID 5/6 (no big difference rebuilding the 5 or 6 on SSD).
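A simplified sketch of the idea (a placeholder path, and not the original script): write dummy files of increasing size through a small buffer and time the write and the read-back. Note the read-back will largely come from the OS cache unless you use unbuffered I/O, which is exactly the RAM → RAID cache → disk layering described above.

```powershell
# Time writing and reading back dummy files of a few sizes.
$target = 'D:\disktest'   # hypothetical folder on the array under test
New-Item -ItemType Directory -Path $target -Force | Out-Null

foreach ($mb in 10, 100, 1000) {
    $file  = Join-Path $target "dummy_${mb}MB.bin"
    $bytes = New-Object byte[] (1MB)   # 1 MB buffer of zeros

    $write = Measure-Command {
        $fs = [System.IO.File]::OpenWrite($file)
        for ($i = 0; $i -lt $mb; $i++) { $fs.Write($bytes, 0, $bytes.Length) }
        $fs.Close()
    }
    $read = Measure-Command {
        [void][System.IO.File]::ReadAllBytes($file)   # served from OS cache if hot
    }
    '{0,5} MB  write {1,8:N0} ms  read {2,8:N0} ms' -f $mb, $write.TotalMilliseconds, $read.TotalMilliseconds
}
```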

At some point I will have to merge the PDM archive and PDM database onto one machine, maybe with a higher clock speed and the same core count (16), but be aware that SQL is licensed by core count. I would keep RAID 10 for the DB partitions and RAID 6 for the archive files, split across two separate dedicated controller/disk arrays.
The archive writes after the database operations are performed, so you have to account for that lag.
Reading a version can peak at 1 Gbit for a single user.
You can see SSD speed if you restore a server backup or if you hash all your archive files.
I did it during a server move with crc32, and it took about 3 hours on HDD (RAID 5, 10-15k rpm over 7 disks) vs 30 minutes on SSD (RAID 6, normal base server model, 6 disks) for a TB-class archive.
Same server model, similar CPU clock, same memory, different OS (MS Windows Server 2016 vs 2022).
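If you want to reproduce the hashing test in PowerShell: Get-FileHash has no CRC32 option, so MD5 stands in below; for a pure read-throughput test the algorithm barely matters. The archive path is a placeholder.

```powershell
# Hash every file in the archive and time the whole run; the total wall
# time is effectively a read-throughput benchmark of the array.
$archive = 'E:\PDMArchive'   # hypothetical archive root

$elapsed = Measure-Command {
    Get-ChildItem $archive -Recurse -File |
        ForEach-Object { Get-FileHash -Algorithm MD5 -Path $_.FullName } |
        Export-Csv archive_hashes.csv -NoTypeInformation
}
'Hashed archive in {0:N1} minutes' -f $elapsed.TotalMinutes
```

As a bonus, the CSV doubles as an integrity baseline you can diff after the next server move.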

Could you share the script?

For network testing I started making some PowerShell scripts, hoping to add more performance metrics like throughput and maybe latency over time. I see GoEngineer suggests testing TCP connections with a packet size of 1500 bytes, but I’m not sure what that is based on, other than 1500 bytes being the standard Ethernet MTU.
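As a starting point for latency over time, here is a small sketch using .NET's Ping class directly, which behaves the same on Windows PowerShell 5.1 and PowerShell 7 (Test-Connection's output properties differ between the two). The hostname is a placeholder.

```powershell
# Sample round-trip latency to the PDM server once per second for a
# minute and export to CSV for graphing.
$server = 'pdm-archive'   # hypothetical PDM server hostname
$ping   = New-Object System.Net.NetworkInformation.Ping

1..60 | ForEach-Object {
    $reply = $ping.Send($server, 1000)   # 1000 ms timeout
    [pscustomobject]@{
        Time      = Get-Date -Format 'HH:mm:ss'
        Status    = $reply.Status
        LatencyMs = $reply.RoundtripTime
    }
    Start-Sleep -Seconds 1
} | Export-Csv latency.csv -NoTypeInformation
```

Test-NetConnection with -Port is handy on top of this, to confirm the PDM ports actually answer and not just ICMP.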

An all-in-one PDM server benchmarking tool would be nice. For now I’ll just look for existing file server and SQL Server tests and use what seems relevant for PDM. The DS knowledge base mentions various third-party tools as well, so I may have to dig a bit more there.

Unfortunately I cannot share what I make at work, and I do not have access to those scripts at home.

The first thing I would like to ask you is: what do you want to know or learn with the benchmark?

I did it mainly to see some numbers behind “SSD is fast” and “RAID X is slow”, but at the end of the day, if you have enough cache memory, the difference can be minimal in a theoretical benchmark run. It may or may not give you a hint about a real scenario with 40 users accessing the PDM vault at the same time to read and write a 1 GB assembly whose versions are shared (where-used) with other assemblies.

I wrote a post with lessons learned here. There is not much in it, so keep your hopes low.
I made some drastic changes on our server, but the issues were mainly:

  • server misconfiguration (wrong maintenance plan due to a completely wrong KB article for our language) → fixed by rebuilding statistics and indexes at least every month (see the sketch after this list)
  • domain related (authentication lag up to 1 minute) → moved all users from Windows login to PDM login; now it is less than 10 seconds, including a get-latest on some folders
  • network misconfiguration in one of the racks the server was wired to → asked IT to fix it
  • PDM and SW client options were anarchy → consolidated the options and rolled out a standard deployment
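For the statistics and indexes bullet, here is a minimal sketch of that kind of monthly job (not our exact maintenance plan; it assumes the SqlServer PowerShell module, and the instance and database names are placeholders):

```powershell
# Monthly-ish maintenance: refresh statistics, then rebuild indexes
# with heavy fragmentation.
Import-Module SqlServer
$instance = 'PDMSQL01'   # hypothetical instance name
$database = 'PDMVault'   # hypothetical vault DB name

# Refresh all statistics in the vault database.
Invoke-Sqlcmd -ServerInstance $instance -Database $database -Query 'EXEC sp_updatestats;'

# Rebuild every index with more than 30% fragmentation.
$rebuild = @"
DECLARE @sql nvarchar(max) = N'';
SELECT @sql += N'ALTER INDEX ' + QUOTENAME(i.name)
            + N' ON ' + QUOTENAME(s.name) + N'.' + QUOTENAME(o.name)
            + N' REBUILD;'
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes i ON ps.object_id = i.object_id AND ps.index_id = i.index_id
JOIN sys.objects o ON o.object_id = i.object_id
JOIN sys.schemas s ON s.schema_id = o.schema_id
WHERE ps.avg_fragmentation_in_percent > 30 AND i.name IS NOT NULL;
EXEC sp_executesql @sql;
"@
Invoke-Sqlcmd -ServerInstance $instance -Database $database -Query $rebuild
```

Schedule it off-hours; index rebuilds take locks the users will notice.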

That aside, PDM performance is highly environment dependent: how many variables and files reside in the vault, and their parent-child relations.
We have around 2M file versions in the vault, too many variables and properties inside and outside our 3D files, and too many flawed workflows (let me call them workflaws).

That said, the main DB is under 15 GB on a dedicated SQL server with 96 GB available and 2 CPUs (Xeon Silver, 2 GHz-ish).

IMHO our memory is enough, but the CPU should be a little faster per core; since licensing cost is per core, increasing the core count is probably going to hurt a lot on the money side.
Unfortunately we have a two-server (DB, archive) setup for legacy reasons, but the more I see how the servers operate, the more I want to move the archive, as a secondary RAID array, onto the DB server to limit the lag between the SOLIDWORKS PDM services talking to each other for every vault-related operation. That said, a faster machine helps a lot with other operations like backups and file integrity checks.

Anyway, to better answer your request, you may take a look here.
I use the same tools: diskspd.exe and winsat.exe

You can benchmark with different sizes, play with RAID settings (R/W cache on/off and size), and have the results exported to a txt file so you can import them into Excel and make a graph.
Since those are PDM independent, this is more about learning the hardware side and its response.
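For example, a sweep over block sizes with diskspd, with Windows caching disabled (-Sh) so you measure the array rather than RAM, appending everything to one text file for Excel. The paths and parameters are just a starting point:

```powershell
# diskspd sweep: 4 GiB test file, 30 s per run, random I/O, 30% writes,
# 4 threads, 8 outstanding I/Os, caching off (-Sh), latency stats (-L).
$diskspd = 'C:\tools\diskspd.exe'   # hypothetical install path
$target  = 'D:\disktest\io.dat'     # test file on the array under test
New-Item -ItemType Directory -Path (Split-Path $target) -Force | Out-Null

foreach ($block in '8K', '64K', '512K') {
    "### block=$block ###" | Add-Content diskspd_results.txt
    & $diskspd -c4G -d30 -r -w30 -t4 -o8 "-b$block" -Sh -L $target |
        Add-Content diskspd_results.txt
}
```

Rerun the same sweep with the controller's write cache on and off, and you can compare the latency percentiles directly.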