
Before getting to bit fields, the first question is why data needs to be arranged at all. The right order is to understand why [byte padding] exists first, and then look at bit fields as a way to adjust it.

 

Data Alignment

 

Data structure alignment - Wikipedia

en.wikipedia.org

Data structure alignment refers to the way data is arranged and accessed in computer memory. It consists of three separate but related issues: data alignment, data structure padding, and packing.

The CPU in modern computer hardware performs reads and writes to memory most efficiently when the data is naturally aligned, which generally means that the data address is a multiple of the data size. Data alignment refers to aligning elements according to their natural alignment. To ensure natural alignment, it may be necessary to insert some padding between structure elements or after the last element of a structure.

Although data structure alignment is a fundamental issue for all modern computers, many computer languages and computer language implementations handle data alignment automatically. Ada,[1][2] PL/I,[3] Pascal,[4] certain C and C++ implementations, D,[5] Rust,[6] C#,[7] and assembly language allow at least partial control of data structure padding, which may be useful in certain special circumstances.

Definitions

A memory address a is said to be n-byte aligned when a is a multiple of n bytes (where n is a power of 2). In this context a byte is the smallest unit of memory access, i.e. each memory address specifies a different byte. An n-byte aligned address would have a minimum of log2(n) least-significant zeros when expressed in binary.

The alternate wording b-bit aligned designates a b/8 byte aligned address (ex. 64-bit aligned is 8 bytes aligned).

A memory access is said to be aligned when the data being accessed is n bytes long and the datum address is n-byte aligned. When a memory access is not aligned, it is said to be misaligned. Note that by definition byte memory accesses are always aligned.
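The definition above maps directly onto a bit test. The following is a minimal sketch, assuming addresses are held in `std::uintptr_t` and that `n` is a power of two. The names `is_aligned` and `trailing_zero_bits` are made up for illustration and do not come from the quoted article.

```cpp
#include <cstdint>

// True when address `a` is a multiple of `n` (n must be a power of two).
constexpr bool is_aligned(std::uintptr_t a, std::uintptr_t n) {
    return (a & (n - 1)) == 0;
}

// Counts the zero bits at the least-significant end of a non-zero address.
constexpr unsigned trailing_zero_bits(std::uintptr_t a) {
    return (a & 1u) ? 0u : 1u + trailing_zero_bits(a >> 1);
}

static_assert(is_aligned(0x5a0, 4), "0x5a0 is a multiple of 4");
static_assert(!is_aligned(0x59d, 4), "0x59d is odd, so it is not 4-byte aligned");
// An 8-byte aligned address has at least log2(8) = 3 trailing zero bits.
static_assert(trailing_zero_bits(0x5a8) >= 3, "0x5a8 is 8-byte aligned");
```

Because everything is `constexpr`, the checks run at compile time and cost nothing at run time.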

A memory pointer that refers to primitive data that is n bytes long is said to be aligned if it is only allowed to contain addresses that are n-byte aligned, otherwise it is said to be unaligned. A memory pointer that refers to a data aggregate (a data structure or array) is aligned if (and only if) each primitive datum in the aggregate is aligned.

Note that the definitions above assume that each primitive datum is a power of two bytes long. When this is not the case (as with 80-bit floating-point on x86) the context influences the conditions where the datum is considered aligned or not.

Data structures can be stored in memory on the stack with a static size known as bounded or on the heap with a dynamic size known as unbounded.

Problems

A computer accesses memory by a single memory word at a time. As long as the memory word size is at least as large as the largest primitive data type supported by the computer, aligned accesses will always access a single memory word. This may not be true for misaligned data accesses.

If the highest and lowest bytes in a datum are not within the same memory word the computer must split the datum access into multiple memory accesses. This requires a lot of complex circuitry to generate the memory accesses and coordinate them. To handle the case where the memory words are in different memory pages the processor must either verify that both pages are present before executing the instruction or be able to handle a TLB miss or a page fault on any memory access during the instruction execution.

When a single memory word is accessed the operation is atomic, i.e. the whole memory word is read or written at once and other devices must wait until the read or write operation completes before they can access it. This may not be true for unaligned accesses to multiple memory words, e.g. the first word might be read by one device, both words written by another device and then the second word read by the first device so that the value read is neither the original value nor the updated value. Although such failures are rare, they can be very difficult to identify.

Data structure padding

Although the compiler (or interpreter) normally allocates individual data items on aligned boundaries, data structures often have members with different alignment requirements. To maintain proper alignment the translator normally inserts additional unnamed data members so that each member is properly aligned. In addition the data structure as a whole may be padded with a final unnamed member. This allows each member of an array of structures to be properly aligned.

Padding is only inserted when a structure member is followed by a member with a larger alignment requirement or at the end of the structure. By changing the ordering of members in a structure, it is possible to change the amount of padding required to maintain alignment. For example, if members are sorted by descending alignment requirements a minimal amount of padding is required. The minimal amount of padding required is always less than the largest alignment in the structure. Computing the maximum amount of padding required is more complicated, but is always less than the sum of the alignment requirements for all members minus twice the sum of the alignment requirements for the least aligned half of the structure members.

Although C and C++ do not allow the compiler to reorder structure members to save space, other languages might. It is also possible to tell most C and C++ compilers to "pack" the members of a structure to a certain level of alignment, e.g. "pack(2)" means align data members larger than a byte to a two-byte boundary so that any padding members are at most one byte long.

One use for such "packed" structures is to conserve memory. For example, a structure containing a single byte and a four-byte integer would require three additional bytes of padding. A large array of such structures would use 37.5% less memory if they are packed, although accessing each structure might take longer. This compromise may be considered a form of space–time tradeoff.
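The 37.5% figure can be checked with a little arithmetic: 8 padded bytes versus 5 packed bytes gives (8 - 5) / 8 = 0.375. Here is a rough sketch, assuming a 4-byte-aligned `std::uint32_t` (typical on x86) and a compiler that accepts `#pragma pack` (MSVC, GCC, and Clang do). The struct names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

struct Padded {            // natural alignment
    std::uint8_t  tag;     // 1 byte, followed by 3 bytes of padding
    std::uint32_t value;   // 4 bytes
};

#pragma pack(push, 1)      // pack members on 1-byte boundaries
struct Packed {
    std::uint8_t  tag;     // 1 byte, no padding after it
    std::uint32_t value;   // 4 bytes, possibly at a misaligned address
};
#pragma pack(pop)

static_assert(sizeof(Padded) == 8, "1 + 3 padding + 4");
static_assert(sizeof(Packed) == 5, "1 + 4, no padding");

// Share of memory saved by packing: (8 - 5) / 8 = 37.5 %.
constexpr double saved_percent =
    100.0 * (sizeof(Padded) - sizeof(Packed)) / sizeof(Padded);
static_assert(saved_percent == 37.5, "matches the figure in the text");
```

The saving applies to every element of an array, which is why it matters for large arrays in particular.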

Although use of "packed" structures is most frequently used to conserve memory space, it may also be used to format a data structure for transmission using a standard protocol. However, in this usage, care must also be taken to ensure that the values of the struct members are stored with the endianness required by the protocol (often network byte order), which may be different from the endianness used natively by the host machine.

Computing padding

The following formulas provide the number of padding bytes required to align the start of a data structure (where mod is the modulo operator):

padding = (align - (offset mod align)) mod align
aligned = offset + padding
        = offset + ((align - (offset mod align)) mod align)
 

For example, the padding to add to offset 0x59d for a 4-byte aligned structure is 3. The structure will then start at 0x5a0, which is a multiple of 4. However, when the alignment of offset is already equal to that of align, the second modulo in (align - (offset mod align)) mod align will return zero, therefore the original value is left unchanged.

Since the alignment is by definition a power of two, the modulo operation can be reduced to a bitwise boolean AND operation.

The following formulas produce the aligned offset (where & is a bitwise AND and ~ a bitwise NOT):

padding = (align - (offset & (align - 1))) & (align - 1)
        = (-offset & (align - 1))
aligned = (offset + (align - 1)) & ~(align - 1)
        = (offset + (align - 1)) & -align
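The formulas above can be written as small functions and checked against the 0x59d example. This is a sketch only; the function names are hypothetical, and the masking variants assume `align` is a power of two.

```cpp
#include <cstddef>

// Padding needed to bring `offset` up to a multiple of `align` (modulo form).
constexpr std::size_t padding_for(std::size_t offset, std::size_t align) {
    return (align - (offset % align)) % align;
}

// Same result via bit masking; valid only when `align` is a power of two.
constexpr std::size_t padding_for_pow2(std::size_t offset, std::size_t align) {
    return (0 - offset) & (align - 1);   // unsigned negation wraps around
}

// Rounds `offset` up to the next multiple of `align` (power of two).
constexpr std::size_t align_up(std::size_t offset, std::size_t align) {
    return (offset + (align - 1)) & ~(align - 1);
}

static_assert(padding_for(0x59d, 4) == 3, "0x59d needs 3 bytes of padding");
static_assert(padding_for_pow2(0x59d, 4) == 3, "mask form agrees");
static_assert(align_up(0x59d, 4) == 0x5a0, "aligned start is 0x5a0");
static_assert(padding_for(0x5a0, 4) == 0, "already aligned, no padding");
```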
 

 

Typical alignment of C structs on x86

Data structure members are stored sequentially in memory so that, in the structure below, the member Data1 will always precede Data2; and Data2 will always precede Data3:

struct MyData
{
    short Data1;
    short Data2;
    short Data3;
};
 

If the type "short" is stored in two bytes of memory then each member of the data structure depicted above would be 2-byte aligned. Data1 would be at offset 0, Data2 at offset 2, and Data3 at offset 4. The size of this structure would be 6 bytes.

The type of each member of the structure usually has a default alignment, meaning that it will, unless otherwise requested by the programmer, be aligned on a pre-determined boundary. The following typical alignments are valid for compilers from Microsoft (Visual C++), Borland/CodeGear(C++Builder), Digital Mars (DMC), and GNU (GCC) when compiling for 32-bit x86:

  • A char (one byte) will be 1-byte aligned.
  • A short (two bytes) will be 2-byte aligned.
  • An int (four bytes) will be 4-byte aligned.
  • A long (four bytes) will be 4-byte aligned.
  • A float (four bytes) will be 4-byte aligned.
  • A double (eight bytes) will be 8-byte aligned on Windows and 4-byte aligned on Linux (8-byte with -malign-double compile time option).
  • A long long (eight bytes) will be 4-byte aligned.
  • A long double (ten bytes with C++Builder and DMC, eight bytes with Visual C++, twelve bytes with GCC) will be 8-byte aligned with C++Builder, 2-byte aligned with DMC, 8-byte aligned with Visual C++, and 4-byte aligned with GCC.
  • Any pointer (four bytes) will be 4-byte aligned. (e.g.: char*, int*)

The only notable differences in alignment for an LP64 64-bit system when compared to a 32-bit system are:

  • A long (eight bytes) will be 8-byte aligned.
  • A double (eight bytes) will be 8-byte aligned.
  • A long long (eight bytes) will be 8-byte aligned.
  • A long double (eight bytes with Visual C++, sixteen bytes with GCC) will be 8-byte aligned with Visual C++ and 16-byte aligned with GCC.
  • Any pointer (eight bytes) will be 8-byte aligned.

Some data types are dependent on the implementation.

Here is a structure with members of various types, totaling 8 bytes before compilation:

struct MixedData
{
    char Data1;
    short Data2;
    int Data3;
    char Data4;
};
 

After compilation the data structure will be supplemented with padding bytes to ensure a proper alignment for each of its members:

struct MixedData  /* After compilation in 32-bit x86 machine */
{
    char Data1; /* 1 byte */
    char Padding1[1]; /* 1 byte so the following 'short' is aligned on a 2-byte boundary,
                         assuming the structure begins at an even address */
    short Data2; /* 2 bytes */
    int Data3;  /* 4 bytes - largest structure member */
    char Data4; /* 1 byte */
    char Padding2[3]; /* 3 bytes to make total size of the structure 12 bytes */
};
 
 

The compiled size of the structure is now 12 bytes. It is important to note that the last member is padded with the number of bytes required so that the total size of the structure is a multiple of the largest alignment of any structure member (alignment(int) in this case, which is 4 on linux-32bit/gcc).

In this case 3 bytes are added after the last member to pad the structure to 12 bytes (alignment(int) × 3).

struct FinalPad {
  float x;
  char n[1];
};
 

In this example the total size of the structure sizeof(FinalPad) == 8, not 5 (so that the size is a multiple of 4 (alignment of float)).

struct FinalPadShort {
  short s;
  char n[3];
};
 

In this example the total size of the structure sizeof(FinalPadShort) == 6, not 5 (not 8 either) (so that the size is a multiple of 2 (alignment(short) = 2 on linux-32bit/gcc)).
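Both trailing-padding examples can be verified at compile time. A minimal sketch, assuming a 4-byte `float` and a 2-byte `short` as on typical x86 ABIs; other platforms may give different numbers.

```cpp
#include <cstddef>

struct FinalPad      { float x; char n[1]; };
struct FinalPadShort { short s; char n[3]; };

// sizeof is always a multiple of alignof, so array elements stay aligned.
static_assert(sizeof(FinalPad) % alignof(FinalPad) == 0, "size is a multiple of alignment");
static_assert(sizeof(FinalPadShort) % alignof(FinalPadShort) == 0, "size is a multiple of alignment");

// 5 bytes of data rounded up to a multiple of 4 (alignment of float).
static_assert(sizeof(FinalPad) == 8, "3 trailing padding bytes");
// 5 bytes of data rounded up to a multiple of 2 (alignment of short).
static_assert(sizeof(FinalPadShort) == 6, "1 trailing padding byte");
static_assert(offsetof(FinalPadShort, n) == 2, "n starts right after s");
```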

It is possible to change the alignment of structures to reduce the memory they require (or to conform to an existing format) by reordering structure members or changing the compiler’s alignment (or “packing”) of structure members.

 
struct MixedData  /* after reordering */
{
    char Data1;
    char Data4;   /* reordered */
    short Data2;
    int Data3;
};
 
 

The compiled size of the structure now matches the pre-compiled size of 8 bytes. Note that Padding1[1] has been replaced (and thus eliminated) by Data4 and Padding2[3] is no longer necessary as the structure is already aligned to the size of a long word.
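The effect of reordering can be confirmed with `offsetof`. The sketch below assumes the typical x86 sizes listed earlier (1-byte char, 2-byte short, 4-byte int). The name `MixedDataReordered` is introduced here only so both layouts can live in one file.

```cpp
#include <cstddef>

struct MixedData          { char Data1; short Data2; int Data3; char Data4; };
struct MixedDataReordered { char Data1; char Data4; short Data2; int Data3; };

// Original order: padding after Data1 and after Data4.
static_assert(offsetof(MixedData, Data2) == 2, "1 padding byte after Data1");
static_assert(offsetof(MixedData, Data3) == 4, "int on a 4-byte boundary");
static_assert(offsetof(MixedData, Data4) == 8, "char right after the int");
static_assert(sizeof(MixedData) == 12, "3 trailing padding bytes");

// Reordered: Data4 takes the slot that used to be padding.
static_assert(offsetof(MixedDataReordered, Data4) == 1, "fills the old padding byte");
static_assert(offsetof(MixedDataReordered, Data2) == 2, "short on a 2-byte boundary");
static_assert(offsetof(MixedDataReordered, Data3) == 4, "int on a 4-byte boundary");
static_assert(sizeof(MixedDataReordered) == 8, "no padding left");
```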

The alternative method of enforcing the MixedData structure to be aligned to a one byte boundary will cause the compiler to discard the pre-determined alignment of the structure members and thus no padding bytes would be inserted.

While there is no standard way of defining the alignment of structure members, some compilers use #pragma directives to specify packing inside source files. Here is an example:

#pragma pack(push)  /* push current alignment to stack */
#pragma pack(1)     /* set alignment to 1 byte boundary */
 
struct MyPackedData
{
    char Data1;
    long Data2;
    char Data3;
};
 
#pragma pack(pop)   /* restore original alignment from stack */
 

This structure would have a compiled size of 6 bytes on a 32-bit system. The above directives are available in compilers from Microsoft,[8] Borland, GNU,[9] and many others.

Another example:

struct MyPackedData
{
    char Data1;
    long Data2 __attribute__((packed));
    char Data3;
};
 
 

Default packing and #pragma pack

On some Microsoft compilers, particularly for the RISC processor, there is an unexpected relationship between project default packing (the /Zp directive) and the #pragma pack directive. The #pragma pack directive can only be used to reduce the packing size of a structure from the project default packing.[10] This leads to interoperability problems with library headers which use, for example, #pragma pack(8), if the project packing is smaller than this. For this reason, setting the project packing to any value other than the default of 8 bytes would break the #pragma pack directives used in library headers and result in binary incompatibilities between structures. This limitation is not present when compiling for x86.

Allocating memory aligned to cache lines

It would be beneficial to allocate memory aligned to cache lines. If an array is partitioned for more than one thread to operate on, having the sub-array boundaries unaligned to cache lines could lead to performance degradation. Here is an example to allocate memory (double array of size 10) aligned to cache of 64 bytes.

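#include <stdlib.h>

/* Returns a 64-byte-aligned array of 10 doubles, or NULL on failure. */
double *foo(void) {
   double *var; /* will point to the array of size 10 */
   int     ok;

   /* arguments: out-pointer, alignment (64 bytes), size in bytes */
   ok = posix_memalign((void**)&var, 64, 10*sizeof(double));

   if(ok != 0)
     return NULL;

   return var;
}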
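Aligning the start of the array is only half of the story; the per-thread sub-array boundaries should also fall on cache-line boundaries. The following is a rough sketch of one way to do that, assuming 64-byte cache lines and 8-byte doubles. The names `chunk_len` and `chunk_begin` are hypothetical and not part of any library.

```cpp
#include <cstddef>

constexpr std::size_t kCacheLine = 64;                          // assumed line size in bytes
constexpr std::size_t kPerLine   = kCacheLine / sizeof(double); // doubles per cache line

// Elements handed to each thread, rounded up to whole cache lines.
constexpr std::size_t chunk_len(std::size_t total, std::size_t threads) {
    return ((total + threads - 1) / threads + kPerLine - 1) / kPerLine * kPerLine;
}

// Index where thread `t` starts; always a multiple of kPerLine.
// The caller still has to clamp the end of the last chunk to `total`.
constexpr std::size_t chunk_begin(std::size_t total, std::size_t threads, std::size_t t) {
    return t * chunk_len(total, threads);
}

// 100 doubles over 4 threads: 25 each, rounded up to 32 (four cache lines).
static_assert(chunk_len(100, 4) == 32, "chunks are whole cache lines");
static_assert(chunk_begin(100, 4, 3) % kPerLine == 0, "every chunk starts on a line boundary");
```

If the array itself starts on a 64-byte boundary, as in the allocation above, no two threads share a cache line at a chunk boundary.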
 
 

Hardware significance of alignment requirements

Alignment concerns can affect areas much larger than a C structure when the purpose is the efficient mapping of that area through a hardware address translation mechanism (PCI remapping, operation of a MMU).

For instance, on a 32-bit operating system, a 4 KiB (4096 Bytes) page is not just an arbitrary 4 KiB chunk of data. Instead, it is usually a region of memory that's aligned on a 4 KiB boundary. This is because aligning a page on a page-sized boundary lets the hardware map a virtual address to a physical address by substituting the higher bits in the address, rather than doing complex arithmetic.

Example: Assume that we have a TLB mapping of virtual address 0x2CFC7000 to physical address 0x12345000. (Note that both these addresses are aligned at 4 KiB boundaries.) Accessing data located at virtual address va=0x2CFC7ABC causes a TLB resolution of 0x2CFC7 to 0x12345 to issue a physical access to pa=0x12345ABC. Here, the 20/12-bit split luckily matches the hexadecimal representation split at 5/3 digits. The hardware can implement this translation by simply combining the first 20 bits of the physical address (0x12345) and the last 12 bits of the virtual address (0xABC). This is also referred to as virtually indexed (ABC) physically tagged (12345).
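The substitution described in the example is just a mask and an OR. A minimal sketch, assuming 32-bit addresses and 4 KiB pages as in the text; the function name `translate` is illustrative only and does not model a real TLB lookup.

```cpp
#include <cstdint>

constexpr unsigned      kPageBits   = 12;                     // 4 KiB pages
constexpr std::uint32_t kOffsetMask = (1u << kPageBits) - 1;  // low 12 bits

// Combines the physical page base with the page offset of the virtual address.
constexpr std::uint32_t translate(std::uint32_t va, std::uint32_t phys_page_base) {
    return (phys_page_base & ~kOffsetMask) | (va & kOffsetMask);
}

// Mapping 0x2CFC7000 -> 0x12345000, access at va = 0x2CFC7ABC.
static_assert(translate(0x2CFC7ABC, 0x12345000) == 0x12345ABC,
              "upper 20 bits replaced, lower 12 bits kept");
```

This only works because both page bases are aligned to 4 KiB, so their low 12 bits are zero.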

A block of data of size 2^(n+1) - 1 always has one sub-block of size 2^n aligned on 2^n bytes.

This is how a dynamic allocator that has no knowledge of alignment, can be used to provide aligned buffers, at the price of a factor two in space loss.

 

// Example: get 4096 bytes aligned on a 4096 byte buffer with malloc()

// unaligned pointer to large area
void *up = malloc((1 << 13) - 1);
// well-aligned pointer to 4 KiB
void *ap = aligntonext(up, 12);
 
 

where aligntonext(p, r) works by adding an aligned increment, then clearing the r least significant bits of p. A possible implementation is

// Assume `uint32_t p, bits;` for readability
#define alignto(p, bits)      (((p) >> bits) << bits)
#define aligntonext(p, bits)  alignto(((p) + (1 << bits) - 1), bits)
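The same idea can be written as typed functions instead of macros. This is only a sketch, assuming the pointer has already been converted to `std::uintptr_t`; the names mirror the macros above but are otherwise hypothetical.

```cpp
#include <cstdint>

// Clears the low `bits` bits of p (rounds down to a 2^bits boundary).
constexpr std::uintptr_t align_to(std::uintptr_t p, unsigned bits) {
    return (p >> bits) << bits;
}

// Rounds p up to the next 2^bits boundary (p itself if already aligned).
constexpr std::uintptr_t align_to_next(std::uintptr_t p, unsigned bits) {
    return align_to(p + ((std::uintptr_t{1} << bits) - 1), bits);
}

static_assert(align_to_next(0x1001, 12) == 0x2000, "rounded up to the next 4 KiB");
static_assert(align_to_next(0x1000, 12) == 0x1000, "aligned input is unchanged");
// Worst case: the aligned 4096-byte block still ends inside the 2^13 - 1 byte area.
static_assert(align_to_next(0x1001, 12) + 4096 <= 0x1001 + ((1u << 13) - 1),
              "aligned block fits in the over-allocated area");
```

Note that the original pointer returned by malloc() has to be kept around, since that is the one that must be passed to free().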
 
Many CPUs, such as those based on Alpha, IA-64, MIPS, and SuperH architectures, refuse to read misaligned data. When a program requests that one of these CPUs access data that is not aligned, the CPU enters an exception state and notifies the software that it cannot continue. On ARM, MIPS, and SH device platforms, for example, the operating system default is to give the application an exception notification when a misaligned access is requested.

Misaligned memory accesses can incur enormous performance losses on targets that do not support them in hardware.

Alignment

Alignment is a property of a memory address, expressed as the numeric address modulo a power of 2. For example, the address 0x0001103F modulo 4 is 3; that address is said to be aligned to 4n+3, where 4 indicates the chosen power of 2. The alignment of an address depends on the chosen power of two. The same address modulo 8 is 7.

An address is said to be aligned to X if its alignment is Xn+0.

CPUs execute instructions that operate on data stored in memory, and the data are identified by their addresses in memory. In addition to its address, a single datum also has a size. A datum is called naturally aligned if its address is aligned to its size, and misaligned otherwise. For example, an 8-byte floating-point datum is naturally aligned if the address used to identify it is aligned to 8.

Compiler handling of data alignment

Device compilers attempt to allocate data in a way that prevents data misalignment.

For simple data types, the compiler assigns addresses that are multiples of the size in bytes of the data type. Thus, the compiler assigns addresses to variables of type long that are multiples of four, setting the bottom two bits of the address to zero.

In addition, the compiler pads structures in a way that naturally aligns each element of the structure. Consider the structure struct x_ in the following code example:

 

struct x_
{
   char a;     // 1 byte
   int b;      // 4 bytes
   short c;    // 2 bytes
   char d;     // 1 byte
} MyStruct;
 

The compiler pads this structure to enforce alignment naturally.

Example

The following code example shows how the compiler places the padded structure in memory:

// Shows the actual memory layout
struct x_
{
   char a;            // 1 byte
   char _pad0[3];     // padding to put 'b' on 4-byte boundary
   int b;            // 4 bytes
   short c;          // 2 bytes
   char d;           // 1 byte
   char _pad1[1];    // padding to make sizeof(x_) multiple of 4
};
 
 

Both declarations return sizeof(struct x_) as 12 bytes.

The second declaration includes two padding elements:

  • char _pad0[3] to align the int b member on a four-byte boundary
  • char _pad1[1] to align the array elements of the structure struct x_ bar[3];

The padding aligns the elements of bar[3] in a way that allows natural access.

The following code example shows the bar[3] array layout:

adr
offset   element
------   -------
0x0000   char a;         // bar[0]
0x0001   char _pad0[3];
0x0004   int b;
0x0008   short c;
0x000a   char d;
0x000b   char _pad1[1];
 
0x000c   char a;         // bar[1]
0x000d   char _pad0[3];
0x0010   int b;
0x0014   short c;
0x0016   char d;
0x0017   char _pad1[1];
 
0x0018   char a;         // bar[2]
0x0019   char _pad0[3];
0x001c   int b;
0x0020   short c;
0x0022   char d;
0x0023   char _pad1[1];
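The offsets in the table can be reproduced at compile time. A minimal sketch, assuming a 4-byte `int` and 2-byte `short`; the helper `offset_of_b` is made up for illustration.

```cpp
#include <cstddef>

struct x_ { char a; int b; short c; char d; };

static_assert(offsetof(x_, b) == 4, "3 padding bytes after a");
static_assert(offsetof(x_, c) == 8, "short right after the int");
static_assert(offsetof(x_, d) == 10, "char right after the short");
static_assert(sizeof(x_) == 12, "1 trailing padding byte");

// Byte offset of member b of element i inside `x_ bar[3]`.
constexpr std::size_t offset_of_b(std::size_t i) {
    return i * sizeof(x_) + offsetof(x_, b);
}

static_assert(offset_of_b(1) == 0x10, "bar[1].b, as in the table");
static_assert(offset_of_b(2) == 0x1c, "bar[2].b, as in the table");
```

Because sizeof(x_) is a multiple of 4, every bar[i].b lands on a 4-byte boundary.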
 

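Many CPUs, such as those based on the Alpha, IA-64, MIPS, and SuperH architectures, refuse to read misaligned data. Where the hardware does not support misaligned access, handling it can cause a severe performance loss.

To lay data out consistently, the compiler needs a reference size. That reference is the member with the largest alignment requirement in the struct or class (see the fundamental type sizes for MSVC). Think of memory as being handed out in slots of that size: members are placed into the current slot in declaration order, each at an offset that is a multiple of its own alignment. When the next member no longer fits in the current slot, the leftover bytes are left as padding and the member goes into a new slot. This repeats until every member is placed.

For example: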

struct MixedData  /* After compilation in 32-bit x86 machine */
{
    char Data1; /* 1 byte */
    char Padding1[1]; /* 1 byte so the following 'short' is aligned on a 2-byte boundary,
                         assuming the structure begins at an even address */
    short Data2; /* 2 bytes */
    int Data3;  /* 4 bytes - largest structure member */
    char Data4; /* 1 byte */
    char Padding2[3]; /* 3 bytes to make total size of the structure 12 bytes */
};
 
 

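Given this struct, the largest type is the 4-byte int. So Data1 (1 byte), one padding byte, and Data2 (2 bytes) together fill the first 4-byte slot, Data3 takes the second, and Data4 plus three padding bytes take the third. The padding bytes are wasted space, and at first glance that waste looks like pure overhead.

Seen at the level of the whole system, though, the padding is what keeps every member naturally aligned and fast to access. The cost is memory: in the quoted example of a 1-byte member followed by a 4-byte int, the padded struct takes 8 bytes, and packing it down to 5 bytes would use 37.5% less memory (see the quoted text above). For a project that is sensitive to data size, this can be critical. Depending on the project, checking every class for it can get quite tedious.

+ On x86, pointers are 4 bytes.

+ On x64, pointers are 8 bytes.

If a class or struct contains a pointer, it will be padded to that size.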
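The pointer case can be checked the same way without hard-coding 4 or 8. A small sketch, assuming a platform where a pointer's alignment equals its size (true for x86 and x64); the struct `Node` is hypothetical.

```cpp
#include <cstddef>

struct Node {
    char  tag;   // 1 byte, followed by padding up to pointer alignment
    Node* next;  // 4 bytes on x86, 8 bytes on x64
};

// The pointer member starts on a pointer-aligned offset...
static_assert(offsetof(Node, next) == alignof(Node*), "padding follows tag");
// ...so the struct is 8 bytes on x86 and 16 bytes on x64.
static_assert(sizeof(Node) == 2 * sizeof(Node*), "one padded slot plus the pointer");
```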

 

 

 

 

/Zp (Struct Member Alignment)

docs.microsoft.com

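Controls how the members of a structure are packed into memory and specifies the same packing for all structures in a module.

Syntax

/Zp[1|2|4|8|16]

Remarks

The /Zpn option tells the compiler where to store each structure member. The compiler stores members after the first on a boundary that is the smaller of either the size of the member type or an n-byte boundary.

The available packing values are described in the following table.

/Zp argument   Effect
------------   ------
1              Packs structures on 1-byte boundaries. Same as /Zp.
2              Packs structures on 2-byte boundaries.
4              Packs structures on 4-byte boundaries.
8              Packs structures on 8-byte boundaries (default for x86, ARM, and ARM64).
16             Packs structures on 16-byte boundaries (default for x64).

Don't use this option unless you have specific alignment requirements.

Warning

The C++ headers in the Windows SDK assume /Zp8 packing internally. Memory corruption may occur if the /Zp setting is changed inside the Windows SDK headers. The headers are not affected by any /Zp option you set on the command line.

You can also use #pragma pack to control structure packing.

To set this compiler option in the Visual Studio development environment

  1. Open the project's Property Pages dialog box. For details, see Set C++ compiler and build properties in Visual Studio.

  2. Select the Configuration Properties > C/C++ > Code Generation property page.

  3. Modify the Struct Member Alignment property.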

This option sets one explicit upper bound on the byte-padding (data alignment) boundary for every struct in the module.

 

 

Example

#include <iostream>
 
 
 
struct MixedData  /* After compilation in 32-bit x86 machine */
{
    char Data1; /* 1 byte */
    char Padding1[1]; /* 1 byte so the following 'short' is aligned on a 2-byte boundary,
                         assuming the structure begins at an even address */
    short Data2; /* 2 bytes */
    int Data3;  /* 4 bytes - largest structure member */
    char Data4; /* 1 byte */
    char Padding2[3]; /* 3 bytes to make total size of the structure 12 bytes */
};
struct FinalPadShort {
    short s;
    char n[3];
};
int main()
{
    std::cout << sizeof(FinalPadShort) << std::endl;
    std::cout << sizeof(MixedData) << std::endl;
}
 
 

MixedData is the struct from the byte-padding example above, so I'll skip its size. For FinalPadShort with the default setting, the largest type is short, so the layout unit is 2 bytes. One short plus three chars is 5 bytes of data, which rounds up to three shorts (6 bytes), so the struct ends up needing 6 bytes.

But what happens if we force the unit to 1 byte?

 short * 1, char * 3 => (5 bytes)

so five 1-byte units are enough to hold everything. The screenshots of the run are as follows.

Struct member alignment: 1 byte
Struct member alignment: default
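The two settings can also be compared inside a single build by packing individual structs. This is a sketch under assumptions: `#pragma pack(push, 1)` stands in for compiling the whole module with /Zp1, a 2-byte `short` and 4-byte `int` are assumed, and the `...Packed` names are invented for the comparison.

```cpp
struct FinalPadShort { short s; char n[3]; };   // default alignment

#pragma pack(push, 1)                           // per-struct stand-in for /Zp1
struct FinalPadShortPacked { short s; char n[3]; };
struct MixedDataPacked {
    char  Data1;
    char  Padding1[1];   // explicit padding member, still counted
    short Data2;
    int   Data3;
    char  Data4;
    char  Padding2[3];   // explicit padding member, still counted
};
#pragma pack(pop)

static_assert(sizeof(FinalPadShort) == 6, "5 data bytes + 1 trailing padding byte");
static_assert(sizeof(FinalPadShortPacked) == 5, "no trailing padding at 1-byte packing");
static_assert(alignof(FinalPadShortPacked) == 1, "packed struct is 1-byte aligned");
// The padding in MixedData is written out as real members, so it stays 12 bytes.
static_assert(sizeof(MixedDataPacked) == 12, "explicit padding members remain");
```

So with 1-byte packing the program above would be expected to print 5 and 12, and with the default setting 6 and 12.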

Bit Fields

 

C++ Bit Fields

docs.microsoft.com

Classes and structures can contain members that occupy less storage than an integral type. These members are specified as bit fields. The syntax for bit-field member-declarator specification follows:

Syntax

declarator : constant-expression

Remarks

The (optional) declarator is the name by which the member is accessed in the program. It must be an integral type (including enumerated types). The constant-expression specifies the number of bits the member occupies in the structure. Anonymous bit fields, that is, bit-field members with no identifier, can be used for padding.

Note

An unnamed bit field of width 0 forces alignment of the next bit field to the next type boundary, where type is the type of the member.

The following example declares a structure that contains bit fields:

 

The example is explained below.

 

The conceptual memory layout of an object of type Date is shown in the following figure.

 
Memory Layout of Date Object

Note that nYear is 8 bits long and would overflow the word boundary of the declared type, unsigned short. Therefore, it is begun at the beginning of a new unsigned short. It is not necessary that all bit fields fit in one object of the underlying type; new units of storage are allocated, according to the number of bits requested in the declaration.

Microsoft Specific

The ordering of data declared as bit fields is from low to high bit, as shown in the figure above.

END Microsoft Specific

If the declaration of a structure includes an unnamed field of length 0, as shown in the following example,

 

That example is explained below as well.

 

then the memory layout is as shown in the following figure:

 
Layout of Date Object with Zero-Length Bit Field

The underlying type of a bit field must be an integral type, as described in Fundamental Types.

If the initializer for a reference of type const T& is an lvalue that refers to a bit field of type T, the reference is not bound to the bit field directly. Instead, the reference is bound to a temporary initialized to hold the value of the bit field.

Restrictions on bit fields

The following list details erroneous operations on bit fields:

  • Taking the address of a bit field.

  • Initializing a non-const reference with a bit field.

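Byte padding, as described above, sizes one memory unit to the largest type present. Bit fields work the same way: the rule that one memory unit matches the largest type in the struct or class still applies. What changes is that, by controlling how many bits each member actually uses, the final size can shift.

The bit-field syntax is

[type] [member name] [: n]

in that order.

A bit field says: whatever the declared type, store this member in n bits.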
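As a quick illustration of the syntax, here is a small sketch with a made-up struct `Flags`. It assumes the common layout used by MSVC, GCC, and Clang, where several `unsigned` bit fields share one `unsigned`-sized unit; the standard leaves the exact layout to the implementation.

```cpp
struct Flags {
    unsigned ready : 1;  // 1 bit:  0..1
    unsigned mode  : 3;  // 3 bits: 0..7
    unsigned count : 4;  // 4 bits: 0..15
};

// Largest value an unsigned bit field of `bits` bits can hold (bits < 32).
constexpr unsigned max_for_bits(unsigned bits) {
    return (1u << bits) - 1u;
}

static_assert(max_for_bits(1) == 1, "1 bit holds 0..1");
static_assert(max_for_bits(3) == 7, "3 bits hold 0..7");
static_assert(max_for_bits(4) == 15, "4 bits hold 0..15");
// 1 + 3 + 4 = 8 bits, which fits in a single unsigned.
static_assert(sizeof(Flags) == sizeof(unsigned), "all three fields share one unit");
```

Three plain `unsigned` members would take 12 bytes on the same platform, so the bit-field version is a third of the size.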

 

 

 

 

Example 1

// compile with: /LD
struct Date {
   unsigned short nWeekDay  : 3;    // 0..7   (3 bits)
   unsigned short nMonthDay : 6;    // 0..31  (6 bits)
   unsigned short nMonth    : 5;    // 0..12  (5 bits)
   unsigned short nYear     : 8;    // 0..100 (8 bits)
};
 
 

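Example 1 - bit field

Bits are filled in declaration order, n bits at a time. In the example figure, the yellow area is the [Padding] that evens out the layout, and as noted earlier the unit for that padding is the largest type in the struct. In this figure, one unit is 16 bits, the size of a short.

While filling bits into the unit, [nYear] comes up and would cross the unit boundary. So a new unit is opened and the remaining bits of the old one are left empty.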
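The size difference can be checked directly. A sketch, assuming a 16-bit `unsigned short` and the layout used by MSVC, GCC, and Clang on x86, where a bit field is not allowed to straddle a unit of its declared type. `DatePlain` is a hypothetical counterpart without bit fields, added only for comparison.

```cpp
struct DatePlain {                 // no bit fields: four full shorts
    unsigned short nWeekDay;
    unsigned short nMonthDay;
    unsigned short nMonth;
    unsigned short nYear;
};

struct Date {
    unsigned short nWeekDay  : 3;
    unsigned short nMonthDay : 6;
    unsigned short nMonth    : 5;  // 3 + 6 + 5 = 14 bits used in the first unit
    unsigned short nYear     : 8;  // does not fit in the 2 bits left, new unit
};

static_assert(sizeof(DatePlain) == 8, "4 members * 2 bytes");
static_assert(sizeof(Date) == 4, "two 16-bit units");
```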

 

 

Struct size difference between byte padding and bit fields

 

 

 

 

Example 2

// compile with: /LD
struct Date {
   unsigned nWeekDay  : 3;    // 0..7   (3 bits)
   unsigned nMonthDay : 6;    // 0..31  (6 bits)
   unsigned           : 0;    // Force alignment to next boundary.
   unsigned nMonth    : 5;    // 0..12  (5 bits)
   unsigned nYear     : 8;    // 0..100 (8 bits)
};
 
 

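Skipping byte padding

Before explaining the figure: the [unsigned] used in the example is simply [unsigned int]. In other words, the struct is made of four members declared with a 4-byte type.

So the storage unit is the size of an int, and the members to fit into it are split into 3, 6, 5, and 8 bits. The catch is the unnamed unsigned : 0 in the middle. It is not a variable; it just tells the compiler to place the members that follow at the next unit boundary.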

 

 

 

 

Example 2, extended

#include <iostream>

struct Date0 {
    unsigned nWeekDay;  
    unsigned nMonthDay; 
    // unsigned short : 0;
    unsigned nMonth;  
    unsigned nYear;   
};
struct Date1 {
    unsigned nWeekDay : 3;    // 0..7   (3 bits)
    unsigned nMonthDay : 6;    // 0..31  (6 bits)
    // unsigned  : 0;        // Force alignment to next boundary.
    unsigned nMonth : 5;    // 0..12  (5 bits)
    unsigned nYear : 8;    // 0..100 (8 bits)
};
struct Date2 {
    unsigned nWeekDay : 3;    // 0..7   (3 bits)
    unsigned nMonthDay : 6;    // 0..31  (6 bits)
    unsigned  : 0;        // Force alignment to next boundary.
    unsigned nMonth : 5;    // 0..12  (5 bits)
    unsigned nYear : 8;    // 0..100 (8 bits)
};
int main()
{
    std::cout << "Byte Padding Size: " << sizeof(Date0) << std::endl;
    std::cout << "Bit Field Size: " << sizeof(Date1) << std::endl;
    std::cout << "Zero Length Bit Field Size: " << sizeof(Date2) << std::endl;
}
 
 

Struct size difference between byte padding, bit fields, and zero-length alignment

  • Date0: one memory unit is int-sized [32 bits], and each member keeps the full int size with no bit field.
    => [int size (4) * 4 = 16 bytes]
  • Date1: one memory unit is int-sized [32 bits], and the members are bit fields totaling [3 + 6 + 5 + 8 = 22] bits. That fits inside a single unit, so one int holds everything without opening a new unit. => [int size (4) * 1 = 4 bytes]
  • Date2: one memory unit is int-sized [32 bits]. The first unit holds [3 + 6 = 9] bits, the unnamed unsigned : 0 skips its remaining 23 bits, and the second unit holds [5 + 8 = 13] bits. Two units are needed, so memory is extended once. => [int size (4) * 2 = 8 bytes]

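What if more bits are requested than the type has? For example, with [short N : 28], can a 16-bit type hold 28 bits?

Conclusion: it blows up. MSVC rejects it with compiler error C2034. The fix is to reduce the width to at most the number of bits in the declared type; the /c in the sample below only means "compile without linking" and does not suppress the error.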
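One way to catch this early, whatever the compiler, is to assert the width before using it. A minimal sketch; `bits_in`, `kWidth`, and `Sample` are hypothetical names introduced only for this example.

```cpp
#include <climits>

// Number of bits in the object representation of type T.
template <typename T>
constexpr unsigned bits_in() {
    return sizeof(T) * CHAR_BIT;
}

constexpr unsigned kWidth = 12;   // hypothetical width wanted for the field

// Fails to compile with a readable message if the width is too large.
static_assert(kWidth <= bits_in<short>(), "bit-field width exceeds its declared type");

struct Sample {
    short value : kWidth;         // safe: 12 <= bits in a short
};
```

Changing `kWidth` to 28 would trip the `static_assert` instead of relying on a compiler-specific diagnostic.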

 

 

 

Compiler Error C2034

 

docs.microsoft.com

 

'identifier' : type of bit field too small for number of bits

The number of bits in the bit-field declaration exceeds the size of the base type.

The following sample generates C2034:

struct A {
   char test : 9;   // C2034, char has 8 bits
};
 
 

Possible resolution:

// compile with: /c
struct A {
   char test : 8;
};
 
Posted by Lotus1031