Type punning

To comply with the strict aliasing rules of C99/C++, a union can be used:^[1]

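#include <stdbool.h>  /* bool in C99 */

/* Assumes float and unsigned int are both 32 bits wide. */
bool is_negative(float x) {
    union {
        unsigned int ui;
        float d;
    } my_union = { .d = x };
    /* Bit 31 is the sign bit of a 32-bit float. */
    return (my_union.ui & 0x80000000u) != 0;
}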

The GCC compiler supports this as a language extension.^[2]
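The union technique above can be sketched a little more defensively. The version below is a minimal sketch, not the article's code: the helper names `float_bits` and `sign_bit_set` are made up for illustration, and it assumes `float` is a 32-bit IEEE 754 value, which the static assertion only partly checks.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>
#include <stdint.h>

/* Assumption: float occupies exactly 32 bits (IEEE 754 binary32 on most platforms). */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: returns the raw bit pattern of x by writing one
   union member and reading the other. */
static uint32_t float_bits(float x) {
    union {
        uint32_t u;
        float f;
    } pun = { .f = x };
    return pun.u;
}

/* Hypothetical helper: true when the sign bit (bit 31) is set.
   Unlike the comparison x < 0, this also reports true for -0.0f. */
static bool sign_bit_set(float x) {
    return (float_bits(x) >> 31) != 0;
}
```

Using `uint32_t` instead of `unsigned int` makes the width assumption explicit, so a platform with a different `float` size fails at compile time rather than silently testing the wrong bit.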

For other kinds of type punning, see stride of an array.


Pascal

A variant record permits treating the same storage as different kinds of data, depending on which variant is referenced. In the following example, integer is presumed to be 16 bits, longint and real are presumed to be 32 bits, and a character is presumed to be 8 bits:

  type variant_record = record
     case rec_type : longint of
         1: ( I : array [1..2] of integer );
         2: ( L : longint );
         3: ( R : real );
         4: ( C : array [1..4] of char );
     end;
   Var V: Variant_record;
      K: Integer;
      LA: Longint;
      RA: Real;
      Ch: char;
  ...
   V.I[1] := 1;
   Ch := V.C[1];   (* This would extract the first binary byte of V.I[1] *)
   V.R := 8.3;
   LA := V.L;     (* This would store the bit pattern of a real into an integer *)

In Pascal, converting a real to an integer (for example with trunc) yields the truncated value. Reading V.L after writing V.R instead reinterprets the binary representation of the floating-point number as a 32-bit long integer. The result is not the truncated value, and it may differ between systems.
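The difference between converting a value and reinterpreting its bits can be sketched in C, which has the same distinction. This is only an illustration under stated assumptions: the function names are hypothetical, `float` is assumed to be IEEE 754 binary32, and `int32_t` stands in for the 32-bit longint of the Pascal example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Assumption: float and int32_t have the same size, as real and longint
   are presumed to in the Pascal example. */
static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits wide");

/* Value conversion: the fractional part is discarded, so 8.3 becomes 8. */
static int32_t convert_value(float r) {
    return (int32_t)r;
}

/* Reinterpretation: the same four bytes are read back as an integer,
   like reading V.L after writing V.R in the variant record. */
static int32_t reinterpret_bits(float r) {
    union {
        float r;
        int32_t l;
    } v = { .r = r };
    return v.l;
}
```

On an IEEE 754 platform, `convert_value(8.3f)` is 8, while `reinterpret_bits(8.3f)` is 0x4104CCCD, the encoding of the nearest representable float to 8.3.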

These examples could be used to create strange conversions, although, in some cases, there may be legitimate uses for these types of constructs, such as for determining locations of particular pieces of data. In the following example a pointer and a longint are both presumed to be 32 bit:

 Type PA = ^Arec;
 
    Arec = record
      case rt : longint of
         1: (P: PA);
         2: (L: Longint);
    end;
 
  Var PP: PA;
   K: Longint;
  ...
   New(PP);
   PP^.P := PP;
   Writeln('Variable PP is located at address ', hex(PP^.L));

Here "new" is the standard Pascal routine for allocating memory for a pointer, and "hex" is presumably a routine that returns the hexadecimal string for an integer value. This displays the address held in a pointer, something which is not normally permitted (pointers cannot be read or written, only assigned). Assigning a value to the integer variant of a pointer would allow examining or writing to any location in system memory:

 PP^.L := 0;
 PP := PP^.P;  (*PP now points to address 0 *)
 K := PP^.L;   (*K contains the value of word 0 *)
 Writeln('Word 0 of this machine contains ',K);

This construct may cause a program check or protection violation if address 0 is protected against reading by the machine or operating system the program is running on.
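The pointer and integer overlay above has a close analogue in C. The sketch below is a hedged illustration rather than a translation of the Pascal program: the names `pointer_pun`, `address_bits`, `pointer_from_bits` and `survives_round_trip` are invented here, and it assumes pointers and `uintptr_t` share the same size and representation, which holds on mainstream flat-memory platforms but is not guaranteed by the C standard. It deliberately does not dereference address 0.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a pointer and uintptr_t occupy the same number of bytes. */
static_assert(sizeof(void *) == sizeof(uintptr_t), "pointer and integer sizes differ");

/* Overlay of a pointer and an integer, like the P and L variants of Arec. */
union pointer_pun {
    void *p;
    uintptr_t l;
};

/* Reads the bits of a pointer as an integer (like reading PP^.L). */
static uintptr_t address_bits(void *p) {
    union pointer_pun u = { .p = p };
    return u.l;
}

/* Builds a pointer from raw integer bits (like assigning PP^.L). */
static void *pointer_from_bits(uintptr_t bits) {
    union pointer_pun u = { .l = bits };
    return u.p;
}

/* Checks that punning to an integer and back yields the same pointer. */
static bool survives_round_trip(void *p) {
    return pointer_from_bits(address_bits(p)) == p;
}
```

A pointer produced by `pointer_from_bits` from an arbitrary integer is only safe to dereference if it happens to refer to a live object; otherwise the access may fault, as described above.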

C#

In C# (and other .NET languages), type punning is a bit harder to achieve because of the type system, but it can be done nonetheless, using pointers or struct unions.

Pointers

C# only allows pointers to so-called unmanaged types: any primitive numeric type, bool, char, enum, pointer type, or struct composed only of other unmanaged types. Reference types such as string and arrays do not qualify. Note that pointers are only allowed in code blocks marked 'unsafe'.

 float pi = 3.14159f; // the 'f' suffix is required: a double literal does not convert implicitly to float
 uint piAsRawData = *(uint*)&pi;

Struct unions

Struct unions are allowed without any notion of 'unsafe' code, but they do require the definition of a new type.

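 using System.Runtime.InteropServices; // StructLayout, LayoutKind, FieldOffset

 [StructLayout(LayoutKind.Explicit)]
 struct FloatAndUIntUnion
 {
     // Both fields start at offset 0, so they share the same four bytes.
     [FieldOffset(0)]
     public float DataAsFloat;
     [FieldOffset(0)]
     public uint DataAsUInt;
 }

 // ...

 // Initialize the whole struct so that reading DataAsUInt passes definite assignment.
 FloatAndUIntUnion union = default;
 union.DataAsFloat = 3.14159f;
 uint piAsRawData = union.DataAsUInt;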

Raw CIL code

Raw CIL can be used instead of C#, because it doesn't have most of the type limitations. This allows one to, for example, combine two enum values of a generic type:

 TEnum a = ...;
 TEnum b = ...;
 TEnum combined = a | b; // illegal

This can be circumvented by the following CIL code:

 .method public static hidebysig
     !!TEnum CombineEnums<valuetype .ctor ([mscorlib]System.ValueType) TEnum>(
         !!TEnum a,
         !!TEnum b
     ) cil managed
 {
     .maxstack 2

     ldarg.0 
     ldarg.1
     or  // this will not cause an overflow, because a and b have the same type, and therefore the same size.
     ret
 }

The cpblk CIL opcode allows for some other tricks, such as converting a struct to a byte array:

 .method public static hidebysig
     uint8[] ToByteArray<valuetype .ctor ([mscorlib]System.ValueType) T>(
         !!T& v // 'ref T' in C#
     ) cil managed
 {
     .locals init (
         [0] uint8[]
     )

     .maxstack 3

     // create a new byte array with length sizeof(T) and store it in local 0
     sizeof !!T
     newarr uint8
     dup           // keep a copy on the stack for later (1)
     stloc.0

     ldc.i4.0
     ldelema uint8

     // memcpy(local 0, &v, sizeof(T));
     // <the array is still on the stack, see (1)>
     ldarg.0 // this is the *address* of 'v', because its type is '!!T&'
     sizeof !!T
     cpblk

     ldloc.0
     ret
 }
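The comment in the CIL above describes the copy as a memcpy, and the same idea can be written directly in C. This is a minimal sketch under assumptions: `to_byte_array` is a hypothetical name, the caller supplies the destination buffer instead of the routine allocating a new array, and the byte order of the result depends on the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the object representation of *v (size bytes)
   into out and returns out. The caller must provide at least size bytes. */
static uint8_t *to_byte_array(const void *v, size_t size, uint8_t *out) {
    memcpy(out, v, size);  /* plays the role of cpblk in the CIL version */
    return out;
}
```

Copying through memcpy is allowed for any object type, so this form of punning does not run into the strict aliasing rules.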

