Micro tracer - I think that's it. Actually, I had an old Tektronix tube-based scope that would curve trace tubes. It weighed about 200 pounds as I remember.
Curve tracers are hard to use well. You have to do a lot of mental interpolation and curve fitting to do the more useful things right. They're GREAT for finding voltage breakdowns and such. The actual plate or grid curves of a tube are not all that helpful in themselves. You'd be looking for the grid-cathode voltage that gave X plate current and Y screen current. It's simpler for the user, if not the instrument, to be presented with a value of Vgs that gives a value of Ip, along with the identifying tube label number. The info is there on the tracer curves, of course, but you have to learn to squint it out.
I like the idea of running a bunch of data points and storing that info in a file. Today, using a megabyte of storage per tube is pretty trivial, and that's probably way more than you really need. I guess I ought to do some boundary calculations for how much data is really needed. These days, you can get a terabyte for $35-$50, so keeping a lot of data is cheaper than a few hours of human time.
After scanning (is there a better word?) a box of 50 6L6s (and labeling each one with a number so you can later tell them apart), you can simply tell your machine, through perhaps a C or Basic program, to find you the best pairs, quads, hexes, and so on, and list them out for you. You could use some advanced criteria, like saying that Vgs @ 50mA (or Ip at -35V) is within X%, gm is within Y%, [something else you compute] is within Z%, and have it spit out the likeliest candidates.
I can't imagine how a tube "matching" service would not do something like that, unless they just never think of it.
The truly bad problem is the spread of characteristics. The things that are good about a tube - emission current per heater voltage, gm, Ip vs Vgs, etc. - all have some statistical spread. The spread is almost certainly close to a normal distribution for each run of tubes, and for each manufacturer over many runs. Keeping a lot of data on the tubes, you rapidly get to the point where you can tell the machine (through that C or Basic program again) to report the actual distribution, including things like the standard deviation. That tells you how far apart you can expect the NEXT tube to be from the ones you've already measured, and can be used to tell you what % of tubes will be within X% of each other for matching, and which are simply too widely distributed.
In the case of an old-school kind of shop, someone with an emission tube tester and a bias fixture can test a tube every ten seconds or so, as long as the human plugging them in, running the tester, and writing down the results doesn't get tired. I bet that out of a box of tubes, they would settle on a measurement criterion that made 80% or more of them count as "matched" close enough, and sell all the outliers if matching was not requested. Human shopkeepers are fairly predictable in some ways.
As for solid state devices to test tubes - actually, it's a version of there being justice in the universe, with billions (literally, if you count the CPUs and memory) of tiny silicon devices paying homage to their thermionic ancestors.