soostibi (11)
#1
First of all, congratulations on this book! I really enjoyed reading it and have started writing scripts, which I had never done before!
Actually, there are only two books worth buying about PS: Bruce's and the Cookbook by Lee Holmes. They complement each other perfectly.

I have one question about comparing strings. Here is my problem: I use Windows with the Hungarian language setting and the Hungarian sort order. The Hungarian alphabet: a, á, b, c...
When comparing strings:

[1] PS I:>"ab" -lt "ac" # that's OK
True

[2] PS I:>"a" -lt "á" # that's OK
True

[3] PS I:>"ab" -lt "áb" # that's OK
True

[4] PS I:>"ac" -lt "áb" # that's the problem!
False

So the last result is not correct. Why? Is that a bug?
Kiron (38)
#2
Re: Question about comparison of strings
When both operands are strings, PowerShell compares them as strings, alphabetically:

# true because á's position in the alphabet is greater than a's
'á'-gt'a'

# true, same as before; the next element in both operands is the same
'áb'-gt'ab'

# true, á's position in the alphabet is greater than a's
# and so is c's position regarding b's
'ác'-gt'ab'

# false, even though á's position is greater than a's
# b's position is lower than c's; the string 'áb' comes
# before -or is less than- 'ac' when sorted alphabetically
'áb'-gt'ac'

# sorting the strings demonstrates it
'á','a' | sort
'áb','ab' | sort
'ác','ab'| sort
'áb','ac'| sort

# compare numbers as strings
# true, obviously
'13'-gt'12'

# sort numbers as strings
'13','12'|sort

# false, but 123 is greater than 13
# the thing is that they are strings not integers
'123'-gt'13'

# sort numbers as strings
'123','13'|sort
soostibi (11)
#3
Re: Question about comparison of strings
I don't really get your point. Let's substitute b for á, c for b, and d for c (shifting each letter by one, starting from á). Then:
'áb'-gt'ab' --> 'bc' -gt 'ac' true
'ác'-gt'ab' --> 'bd' -gt 'ac' true
'áb'-gt'ac' --> 'bc' -gt 'ad' according to your explanation should be false, but it is true.

Why? Why does á behave differently than b?
soostibi (11)
#4
Re: Question about comparison of strings
Look at this:
[39] PS I:>[string]::compareordinal("áb", "ac")
128
[40] PS I:>"áb" -gt "ac"
False

Strange...
Kiron (38)
#5
Re: Question about comparison of strings
Sorry, I'm not good at explaining. I'm trying to describe what I understand is happening, especially since you had not received a satisfactory response in the newsgroup for a while; even though many MVPs must have seen your post, none tried.

I'm not certain which specific sorting algorithm PowerShell implements, but when sorting two strings it compares the first character of each; whichever is lower puts its string first in the sort order. If the characters are equal, the algorithm moves on to the next character and compares those. Once a difference is found, the sort order is established and the algorithm stops checking. If both strings are equal, either one could be placed first.

Let's take two strings, 'aábza' and 'aábca' (I hope they don't mean anything offensive in Hungarian). The first three characters of both strings are equal, but the fourth characters are not. Compare the position in the alphabet of the first string's fourth character against that of the second string's fourth character: z's position is greater than c's, i.e. 'c' comes before 'z'. This means that 'aábca' -lt 'aábza', or equivalently 'aábza' -gt 'aábca'.

'aábca'-lt'aábza'
'aábza'-gt'aábca'
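As far as I can tell, the reason 'áb' ends up before 'ac' is that the accent only acts as a tie-breaker: the letters are compared without accents first, and accents are considered only if everything else is equal. Here is a minimal sketch of that idea. It assumes .NET's invariant culture and uses CompareInfo.Compare with CompareOptions.IgnoreNonSpace to mimic the accent-blind pass; the variable names are just for illustration, and other cultures may order things differently.

```powershell
# Sketch only: assumes the invariant culture; other cultures may sort differently.
$ci = [System.Globalization.CultureInfo]::InvariantCulture.CompareInfo
$ignoreAccents = [System.Globalization.CompareOptions]::IgnoreNonSpace
$default       = [System.Globalization.CompareOptions]::None

# Accent-blind pass: 'áb' and 'ab' are a tie (result is 0)
$tie = $ci.Compare('áb', 'ab', $ignoreAccents)

# Accent-blind pass: 'áb' vs 'ac' is decided by b vs c (result is negative)
$before = $ci.Compare('áb', 'ac', $ignoreAccents)

# Full comparison: the accent breaks the tie, 'ab' sorts before 'áb' (negative)
$accent = $ci.Compare('ab', 'áb', $default)

"tie: $tie; before: $before; accent: $accent"
```

So with this kind of comparison á never gets a chance to outrank the b-versus-c difference in the second position.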

The [String]::CompareOrdinal Method 'Compares two specified String objects by evaluating the _numeric_ values of the corresponding Char objects in each string.' This is not the same as String or Word sorting, because 'á' comes before 'b' alphabetically but á's char value is greater than b's.

# 'á' comes before 'b' alphabetically
'á'-lt'b'

# this is equivalent to 225 -gt 98
[int][char]'á'-gt[int][char]'b'

# shifting the letters one position...
# true, compare each string operand's first character's position in the
# alphabet; b's position is greater than a's; since b's position is
# greater -comes after- a's no need to check the next element
'bc'-gt'ad'

# verify by sorting, a comes before b always
'bc','ad'|sort

String comparison is done alphabetically, not according to the characters' corresponding _numerical_ values.
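To put the two kinds of comparison side by side, here is a small sketch using [string]::Compare and [string]::CompareOrdinal on the same pair of strings. I'm assuming the invariant culture for the alphabetic comparison; the culture actually used on your system may differ, so treat the sign of the result, not the exact number, as the point.

```powershell
# Alphabetic (culture-aware) comparison, invariant culture assumed:
# negative result, i.e. 'áb' sorts before 'ac'
$cultural = [string]::Compare('áb', 'ac', $false, [cultureinfo]::InvariantCulture)

# Ordinal comparison: positive result, because the char value of 'á' (225)
# is greater than the char value of 'a' (97)
$ordinal = [string]::CompareOrdinal('áb', 'ac')

"cultural: $cultural; ordinal: $ordinal"
```

The two results have opposite signs, which is why CompareOrdinal and -gt disagree for these strings.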

Run these commands to see the differences in numerical value order and 'alphabetic' order:

# get a [string[]] of some chars
[string[]]$chars=32..255|%{"$([char]$_)"}

# sorted by numerical value, not alphabetically
"$chars"

# alphabetically sorted case insensitive
"$($chars|sort)"

# alphabetically sorted case sensitive
"$($chars|sort -ca)"
soostibi (11)
#6
Re: Question about comparison of strings
I found the solution! The default Hungarian sort order is not the official one. If I switch to the technical sort order, I get the correct ordering:
[1] PS C:> 'a','b','c','á','ab','áb','ac','ác'|sort
a
ab
ac
á
áb
ác
b
c

Anyway, thank you!