PowerShell: great but inconsistent
1. Context
PowerShell (at least the latest version, 7.4) is great in that most things are objects, so there is very little need to parse the output of CLIs to automate something.
Plus, it works the same on Windows, Linux and macOS.
Yet PowerShell almost always offers many ways to do the exact same thing, for reasons that are not apparent to a newcomer. And this is confusing.
Let’s see an example.
2. Get-Process
Get-Process will retrieve the list of running processes on the machine, but you can ask it for a specific process with:
Get-Process -Id 1
NPM(K) PM(M) WS(M) CPU(s) Id SI ProcessName
------ ----- ----- ------ -- -- -----------
0 0.00 10.05 0.72 1 1 systemd
So far, so good.
If you want to retrieve more than 1 process, then you can do so like this:
Get-Process -Id 1,2,3
NPM(K) PM(M) WS(M) CPU(s) Id SI ProcessName
------ ----- ----- ------ -- -- -----------
0 0.00 0.00 2.19 2 0 kthreadd
0 0.00 0.00 0.00 3 0 rcu_gp
0 0.00 10.05 0.72 1 1 systemd
3. Where the problems start
The first issue is that not all cmdlets (the name for PowerShell commands) accept a list as input like that, so you have to look at the manual of each command.
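Besides reading the manual, the declared type of a parameter can be queried directly. The following is a minimal sketch, assuming the standard Get-Command and Get-Help cmdlets; the variable name $idParam is only illustrative.

```powershell
# Look up the metadata of the -Id parameter of Get-Process.
$idParam = (Get-Command Get-Process).Parameters['Id']

# A trailing [] in the type name means the parameter takes an array.
$idParam.ParameterType.FullName

# The help system reports the same information in prose form.
Get-Help Get-Process -Parameter Id
```

If the type name does not end in [], the cmdlet expects a single value and a list has to be iterated over by hand.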
When a cmdlet does not accept a list, you can always iterate over an array with the pipeline and retrieve the process for each id:
@(1, 2, 3) | ForEach-Object {Get-Process -Id $_}
NPM(K) PM(M) WS(M) CPU(s) Id SI ProcessName
------ ----- ----- ------ -- -- -----------
0 0.00 0.00 2.19 2 0 kthreadd
0 0.00 0.00 0.00 3 0 rcu_gp
0 0.00 10.05 0.72 1 1 systemd
Works pretty well, except you could also have written:
@(1, 2, 3).ForEach({Get-Process -Id $_})
NPM(K) PM(M) WS(M) CPU(s) Id SI ProcessName
------ ----- ----- ------ -- -- -----------
0 0.00 0.00 2.19 2 0 kthreadd
0 0.00 0.00 0.00 3 0 rcu_gp
0 0.00 10.05 0.72 1 1 systemd
So there are at least 2 ways to iterate over an array, but why? (In reality, there is a third way…)
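For reference, here are the two forms above side by side with what is presumably the third way, the foreach statement (an assumption, since the text does not name it). The sketch uses $PID, the id of the current PowerShell process, so that it runs on any machine; the variable names are illustrative.

```powershell
$ids = @($PID)   # $PID always exists, unlike ids 1, 2, 3 on some systems

# 1. ForEach-Object cmdlet: items stream through the pipeline one at a time.
$viaCmdlet = $ids | ForEach-Object { Get-Process -Id $_ }

# 2. .ForEach() method: called on a collection that is already in memory.
$viaMethod = $ids.ForEach({ Get-Process -Id $_ })

# 3. foreach statement: a language keyword rather than a cmdlet or a method.
$viaStatement = foreach ($id in $ids) { Get-Process -Id $id }
```

All three produce the same process objects; they differ in whether the input is streamed or held in memory, which is where the differing performance comes from.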
4. Using the pipeline
No matter: you accept that you can iterate over an array in 2 (or 3) ways, with different performance characteristics.
You write a script where you first gather the list of ids, and then you send those in a pipeline.
But since Get-Process accepts a list of ints as -Id, then maybe you can feed it the list of ints directly:
@(1, 2, 3) | Get-Process
Get-Process: The input object cannot be bound to any parameters for the command either because the command does not take pipeline input or the input and its properties do not match any of the parameters that take pipeline input.
Get-Process: The input object cannot be bound to any parameters for the command either because the command does not take pipeline input or the input and its properties do not match any of the parameters that take pipeline input.
Get-Process: The input object cannot be bound to any parameters for the command either because the command does not take pipeline input or the input and its properties do not match any of the parameters that take pipeline input.
Well, ok, maybe it doesn’t know it’s for -Id, so let’s try:
@(1, 2, 3) | Get-Process -Id
Get-Process: Missing an argument for parameter 'Id'.
Specify a parameter of type 'System.Int32[]' and try again.
That doesn’t work either.
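The thing is, in PowerShell, everything is an object, right?
You look online, and you realize that a command can bind the properties of a piped object to its parameters by name. -Id accepts pipeline input by property name only, not by value, so a bare integer is rejected while an object with an Id property is accepted. Such an object can be built as a PSObject or PSCustomObject.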
Fair enough:
@(1, 2, 3) |
ForEach-Object {New-Object PSObject -property @{ Id = $_ }} |
Get-Process
NPM(K) PM(M) WS(M) CPU(s) Id SI ProcessName
------ ----- ----- ------ -- -- -----------
0 0.00 10.05 0.75 1 1 systemd
0 0.00 0.00 3.17 2 0 kthreadd
0 0.00 0.00 0.00 3 0 rcu_gp
It works, but this ForEach-Object is complicated, and you could have stuck Get-Process in there directly.
Fiddling around, you find another way:
@(@{Id=1}, @{Id=2}, @{Id=3}) |
ForEach-Object {[PSCustomObject]$_} |
Get-Process
NPM(K) PM(M) WS(M) CPU(s) Id SI ProcessName
------ ----- ----- ------ -- -- -----------
0 0.00 10.05 0.75 1 1 systemd
0 0.00 0.00 3.17 2 0 kthreadd
0 0.00 0.00 0.00 3 0 rcu_gp
Better, but still not great.
After a while longer, you eventually reach this:
@([PSCustomObject]@{Id=1}, [PSCustomObject]@{Id=2}, [PSCustomObject]@{Id=3}) |
Get-Process
NPM(K) PM(M) WS(M) CPU(s) Id SI ProcessName
------ ----- ----- ------ -- -- -----------
0 0.00 10.05 0.75 1 1 systemd
0 0.00 0.00 3.17 2 0 kthreadd
0 0.00 0.00 0.00 3 0 rcu_gp
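The behaviour above follows from how -Id is declared, and that declaration can be inspected. The following is a sketch, assuming the standard ParameterAttribute metadata exposed by Get-Command; the names $attrs and $proc are illustrative, and $PID (the current process) stands in for ids 1, 2, 3 so the example runs anywhere.

```powershell
# Collect the [Parameter()] attributes declared on -Id, one per parameter set.
$attrs = (Get-Command Get-Process).Parameters['Id'].Attributes |
    Where-Object { $_ -is [System.Management.Automation.ParameterAttribute] }

# Show whether -Id binds from the pipeline by value or by property name.
$attrs | Select-Object ParameterSetName, ValueFromPipeline, ValueFromPipelineByPropertyName

# Since binding is by property name, a calculated property is enough
# to wrap a bare integer into an object that has an Id property.
$proc = $PID | Select-Object @{ Name = 'Id'; Expression = { $_ } } | Get-Process
$proc.ProcessName
```

The calculated property is one more way to build the wrapper objects, somewhat shorter than casting each hashtable to PSCustomObject.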
5. How about some JSON?
Another topic: how about parsing some JSON and using some PowerShell commands to change it?
Sounds easy enough when you stumble upon ConvertFrom-Json!
You try:
ConvertFrom-Json ./Dev/java_statics/airports.json
ConvertFrom-Json: Conversion from JSON failed with error: Input string '.' is not a valid number. Path '', line 1, position 1.
Turns out, ConvertFrom-Json does not read files at all, but wants its input to be a string!
So, you’re forced to Get-Content first, which is a bit annoying, but it works:
Get-Content ./Dev/java_statics/airports.json |
ConvertFrom-Json |
Where-Object {$_.continent -eq 'EU'} |
Select-Object -Last 10
And it’s a bit slow! airports.json is 1GiB, true, but it takes 233 seconds to finish, while using jq and bash is much faster at 64 seconds:
jq -c 'map(select(.continent == "EU")) | .[]' ./Dev/java_statics/airports.json | tail -n 10
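One likely contributor to the PowerShell timing, not measured here, is that Get-Content emits the file one line at a time by default. The -Raw switch returns the whole file as a single string instead. The sketch below assumes PowerShell 7 (where ConvertFrom-Json enumerates a top-level array) and uses a small hypothetical sample file rather than the real airports.json.

```powershell
# Hypothetical sample data written to a temporary file so the sketch is self-contained.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'airports-sample.json'
@(
    @{ name = 'CDG'; continent = 'EU' }
    @{ name = 'JFK'; continent = 'NA' }
) | ConvertTo-Json | Set-Content -Path $path

# -Raw hands ConvertFrom-Json one string instead of one string per line.
$eu = Get-Content -Path $path -Raw |
    ConvertFrom-Json |
    Where-Object { $_.continent -eq 'EU' }

$eu.name

# Clean up the temporary file.
Remove-Item -Path $path
```

Whether this closes the gap with jq on a 1GiB file is untested; it only removes the per-line overhead of Get-Content.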
How about dumping some JSON then? For example, dumping info about process id 1:
Get-Process -Id 1 | ConvertTo-Json
WARNING: Resulting JSON is truncated as serialization has exceeded the set depth of 2.
# a json follows
Wait, what? By default, it won’t output JSON with a depth higher than 2?
And because you can’t always know how deep your JSON is, you basically have to always run ConvertTo-Json -Depth 100 (the highest value the parameter accepts) and hope it’s enough most of the time.
Seeing that makes you wonder if ConvertFrom-Json suffers from the same issue, so you look up Get-Help ConvertFrom-Json -Detailed, and you see this:
-Depth <System.Int32>
Gets or sets the maximum depth the JSON input is allowed to have. The
default is 1024.
So, both ConvertFrom-Json and ConvertTo-Json have default depth limits, and they aren’t even the same (2 vs 1024).
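The effect of the ConvertTo-Json default can be reproduced without a process object. The following is a minimal sketch using a hand-made nested hashtable; the variable names and the chosen depth of 10 are illustrative.

```powershell
# Four levels of nesting: a -> b -> c -> d.
$nested = @{ a = @{ b = @{ c = @{ d = 'deep' } } } }

# Default depth (2): levels beyond the limit are not serialized as nested objects.
$shallow = $nested | ConvertTo-Json -Compress -WarningAction SilentlyContinue

# Explicit depth: the whole structure survives.
$full = $nested | ConvertTo-Json -Compress -Depth 10

# Round trip back to objects; the default ConvertFrom-Json depth is ample here.
$roundTrip = $full | ConvertFrom-Json
$roundTrip.a.b.c.d
```

Comparing $shallow and $full shows what the truncation warning means in practice: the innermost value is missing from the first string.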
6. The End
PowerShell can be more pleasant for quick scripts, thanks to passing objects in the pipeline, and good manuals and auto-completion.
All commands follow the Verb-Noun system, which tends to make much more sense than ls or ps aux or pwd.
However, finding the recommended way to do things isn’t easy, mainly because there are many ways to do the same thing, and it’s not always clear why (like iterating).
On top of that, many commands have awful defaults (like ConvertFrom-Json and ConvertTo-Json) that just trip you up, and some commands are just too difficult to use (such as Select-Xml and its XML namespace handling).
I can see why alternatives such as nushell are starting to pop up, even though it’s not that stable today.