To display the character associated with an ASCII/UTF-8 code, do the following:
PS> [char]97
a
Thanks to this PowerTip.
Now how about the reverse? Can I get the ASCII/UTF-8 code of a character?
PS> [int]'a'
Cannot convert value "a" to type "System.Int32". Error: "Input string was not in a correct format."
At line:1 char:1
+ [int]'a'
+ ~~~~~~~~
    + CategoryInfo : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvalidCastFromStringToInteger

# Bummer! How about if I set it in a variable and then try?
PS> $a="a"
PS> [int]$a
Cannot convert value "a" to type "System.Int32". Error: "Input string was not in a correct format."
At line:1 char:1
+ [int]$a
+ ~~~~~~~
    + CategoryInfo : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvalidCastFromStringToInteger

# Maybe the double quotes were causing the letter to be interpreted as a string rather than a character?
PS> $a='a'
PS> [int]$a
Cannot convert value "a" to type "System.Int32". Error: "Input string was not in a correct format."
At line:1 char:1
+ [int]$a
+ ~~~~~~~
    + CategoryInfo : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvalidCastFromStringToInteger
No luck. But I think I am on the right track.
The error InvalidCastFromStringToInteger is what tells me PowerShell is trying to cast from a [string] to [int], which is not what I want and will obviously fail. I want to cast from a [char] to [int], so let’s be explicit about that.
PS> [int][char]'a'
97
Good, that works!
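One caveat worth a quick sketch: a .NET [char] holds a UTF-16 code unit, so the number from [int][char] matches the ASCII (and single-byte UTF-8) code only for characters below 128. The example below is illustrative; the variable names are mine, and the accented character is built from its code point so the result does not depend on how the script file is encoded.

```powershell
# For plain ASCII the cast gives the familiar ASCII code.
$asciiCode = [int][char]'a'

# U+00E9 (e with acute accent), created from its code point.
$eAcute = [char]0xE9

# The cast returns the UTF-16 code unit, which here equals the code point.
$utf16Code = [int]$eAcute

# The UTF-8 encoding of the same character is two bytes, not one number.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes([string]$eAcute)

$asciiCode    # 97
$utf16Code    # 233
$utf8Bytes    # 195, 169
```

For everything in this post the strings are plain ASCII, so the distinction does not change any of the results.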
Now how about getting the ASCII/UTF-8 codes of a string? Can I do that?
As expected, you can’t just pass two characters and hope it works!
PS> [int][char]'as'
Cannot convert value "as" to type "System.Char". Error: "String must be exactly one character long."
At line:1 char:1
+ [int][char]'as'
+ ~~~~~~~~~~~~~~~
    + CategoryInfo : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvalidCastParseTargetInvocation
What I need is an array of [char] elements, which I can then type cast to an array of [int] elements.
First let’s see whether there’s a method available to convert a string to an array of characters.
PS> "abc" | gm

   TypeName: System.String

Name        MemberType Definition
----        ---------- ----------
...
ToChar      Method     char IConvertible.ToChar(System.IFormatProvider provider)
ToCharArray Method     char[] ToCharArray(), char[] ToCharArray(int startIndex, int length)
...
Looks like there is. Does the following work?
PS> "abc".ToCharArray()
a
b
c
PS> [char]"abc".ToCharArray()
Cannot convert the "System.Char[]" value of type "System.Char[]" to type "System.Char".
At line:1 char:1
+ [char]"abc".ToCharArray()
+ ~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : ConvertToFinalInvalidCastException
No, but that gives me a hint on the solution. The output of the ToCharArray() method is of the data type System.Char[] whereas [char] is shorthand for the System.Char data type.
PS> [char] | gm -Static

   TypeName: System.Char

Name             MemberType Definition
----             ---------- ----------
ConvertFromUtf32 Method     static string ConvertFromUtf32(int utf32)
...
So maybe [char[]] is what I need? Does such a data type exist?
PS> [char[]] | gm -Static

   TypeName: System.Char[]

Name       MemberType Definition
----       ---------- ----------
AsReadOnly Method     static System.Collections.ObjectModel.ReadOnlyCollection[T] AsReadOnly[T](T[] array)
...
Sure enough it does!
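A quicker way to confirm the difference between the two types, without reading through Get-Member output, is to ask the values themselves. This is just a small sketch; $single and $many are names I made up for illustration.

```powershell
# A single character versus the array that ToCharArray() returns.
$single = [char]'a'
$many   = "abc".ToCharArray()

$single.GetType().FullName   # System.Char
$many.GetType().FullName     # System.Char[]

# -is tests a value against a type literal.
$many -is [char[]]           # True
$many.Length                 # 3
```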
So let’s try the following:
PS> [char[]]"abc".ToCharArray()
a
b
c
PS> [char[]]"abc"
a
b
c
I don’t need the ToCharArray() method either: if I just type cast a string to an array of characters, the conversion happens implicitly. Sweet!
Armed with this info I try type casting the string to an array of integers to get the ASCII/UTF-8 values of its characters:
PS> [int[]][char[]]"abc"
97
98
99
Nice!
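The same pair of casts should also work in the other direction, which ties back to the [char]97 example at the top. A minimal sketch, assuming the codes are already in an integer array; $codes and $roundTrip are illustrative names.

```powershell
# String -> array of character codes.
$codes = [int[]][char[]]"abc"

# Codes -> characters -> string. The unary -join operator concatenates
# the array elements; the parentheses keep the cast grouped with its operand.
$roundTrip = -join ([char[]]$codes)

$codes       # 97, 98, 99
$roundTrip   # abc
```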
Can I make this better? Say I had a longish string; the above snippet just gives a bunch of codes, and that’s not very helpful if I want to see the code of a particular letter. Can I get the output to show each character followed by its ASCII/UTF-8 code? Something like this:
PS> [char[]]"abc" | %{ "$_ -> [int]$_" }
a -> [int]a
b -> [int]b
c -> [int]c
D’oh! Doesn’t help. But I am on the right track, and I know what to do. You see, within double quotes the [int] cast is not evaluated (thanks to this Hey, Scripting Guy! post), so I have to force evaluation through any one of the methods mentioned in that post. I prefer the VBScript approach (plain string concatenation), so here goes:
PS> [char[]]"abc" | %{ "$_ -> " + [int]$_ }
a -> 97
b -> 98
c -> 99
PS> [char[]]"i am a longish string" | %{ "$_ -> " + [int]$_ }
i -> 105
  -> 32
a -> 97
m -> 109
  -> 32
a -> 97
  -> 32
l -> 108
o -> 111
n -> 110
g -> 103
i -> 105
s -> 115
h -> 104
  -> 32
s -> 115
t -> 116
r -> 114
i -> 105
n -> 110
g -> 103
Bingo!
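For comparison, here is a sketch of two other standard PowerShell ways to force the cast to be evaluated inside the output string: the subexpression operator $( ) and the format operator -f. I am not claiming these are exactly the ones listed in the linked post; the variable names are only for illustration.

```powershell
$chars = [char[]]"abc"

# Subexpression operator: everything inside $( ) is evaluated
# before the double-quoted string is expanded.
$viaSubexpression = $chars | ForEach-Object { "$_ -> $([int]$_)" }

# Format operator: placeholders {0} and {1} are filled from the
# values on the right-hand side.
$viaFormat = $chars | ForEach-Object { '{0} -> {1}' -f $_, ([int]$_) }

$viaSubexpression   # a -> 97, b -> 98, c -> 99 (one per line)
$viaFormat          # same three lines
```

All three approaches give the same output; concatenation is arguably the easiest to read for a one-liner, while -f scales better once more values or padding are involved.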